refactor: standardize Observation class

There are 9 observation classes that implement the same behavior with minor differences. The shared parts should be standardized in the base class.


## Common Patterns Identified

### 1. Error Handling (7/9 observations)
**Pattern variants:**
```python
# Variant A: String error field (4 observations)
error: str | None = None              # FileEditor, Grep, Glob, Browser

# Variant B: Boolean error flag (2 observations)
error: bool = False                   # ExecuteBash
is_error: bool = False                # MCPTool

# No error field (2 observations): Think, Finish
```

**Usage Pattern in `to_llm_content`:**
```python
if self.error:
    return [TextContent(text=f"Error: {self.error}")]
# ... normal content
```
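To make the type mismatch concrete, here is a minimal sketch of what happens if the error-first pattern above is applied verbatim to both variants. `StrErrorObs`, `BoolErrorObs`, and `render` are hypothetical stand-ins built on stdlib dataclasses, not the real observation classes, and the sketch does not claim that any existing class formats its errors this way.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class StrErrorObs:
    """Stand-in for Variant A: the field carries the error message."""
    error: str | None = None


@dataclass
class BoolErrorObs:
    """Stand-in for Variant B: the field only says that something failed."""
    error: bool = False


def render(obs) -> str:
    # Same error-first check as the usage pattern above.
    if obs.error:
        return f"Error: {obs.error}"
    return "ok"


print(render(StrErrorObs(error="file not found")))  # Error: file not found
print(render(BoolErrorObs(error=True)))             # Error: True
```

With the boolean variant the message itself has to come from some other field, which is one reason a single `str | None` field is easier to handle in the base class.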

---

### 2. Output/Content Field (7/9 observations)
**Pattern variants:**
```python
output: str                           # ExecuteBash, FileEditor, Browser
content: str                          # TaskTracker, Think
message: str                          # Finish
content: list[TextContent | ImageContent]  # MCPTool

# No direct output field: Grep, Glob (use structured data instead)
```

## Refactoring Opportunities

### 1. Standardized Error Handling

**Current Problem:**
- Inconsistent error field naming (`error` vs `is_error`)
- Inconsistent types (`str | None` vs `bool`)
- Repeated error-first pattern in `to_llm_content`

**Proposed Base Class Enhancement:**
```python
from abc import ABC

from pydantic import Field  # assumes Schema is a pydantic model

# Schema and TextContent are the project's existing types.


class Observation(Schema, ABC):
    # Optional error field that can be overridden
    error: str | None = Field(
        default=None,
        description="Error message if operation failed"
    )

    @property
    def has_error(self) -> bool:
        """Check if observation represents an error."""
        return self.error is not None

    def _format_error(self) -> TextContent:
        """Standard error formatting."""
        return TextContent(text=f"Error: {self.error}")
```
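A minimal sketch of how an observation that currently uses a boolean flag might move onto the proposed field. It uses stdlib dataclasses as stand-ins for `Schema` and `TextContent` so it runs without the SDK. `BashLikeObservation`, `from_legacy`, and the fallback message `"command failed"` are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TextContent:
    """Stand-in for the SDK's TextContent (assumed to carry a `text` string)."""
    text: str


@dataclass
class Observation:
    """Stand-in for the proposed base class, without Schema/ABC."""
    error: str | None = None

    @property
    def has_error(self) -> bool:
        return self.error is not None

    def _format_error(self) -> TextContent:
        return TextContent(text=f"Error: {self.error}")


@dataclass
class BashLikeObservation(Observation):
    """Hypothetical subclass that used to carry `error: bool`."""
    output: str = ""

    @classmethod
    def from_legacy(cls, output: str, error: bool) -> BashLikeObservation:
        # Old shape: the flag said that the call failed, the text lived in `output`.
        # New shape: the failure text moves into `error`.
        if error:
            return cls(error=output or "command failed")
        return cls(output=output)


ok = BashLikeObservation.from_legacy("hello\n", error=False)
bad = BashLikeObservation.from_legacy("ls: no such file", error=True)
print(ok.has_error, bad.has_error)   # False True
print(bad._format_error().text)      # Error: ls: no such file
```

Whether the failure text should move out of `output` entirely or be kept in both places is a design choice this sketch does not settle.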

### 2. Success/Status Indication

**Current Problem:**
- No standardized way to indicate success/failure
- Some use exit codes, some use error flags, some have no indicator

**Proposed Base Class Enhancement:**
```python
from enum import Enum

class ObservationStatus(str, Enum):
    SUCCESS = "success"
    ERROR = "error"
    
class Observation(Schema, ABC):
    @property
    def status(self) -> ObservationStatus:
        """Compute observation status."""
        if self.has_error:
            return ObservationStatus.ERROR
        return ObservationStatus.SUCCESS
```
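As a rough illustration of what a computed `status` buys callers, the sketch below tallies a batch of observations without knowing their concrete types. `Observation` here is a dataclass stand-in for the proposed base class, and `count_by_status` is a hypothetical helper, not part of the proposal.

```python
from __future__ import annotations

from dataclasses import dataclass
from enum import Enum


class ObservationStatus(str, Enum):
    SUCCESS = "success"
    ERROR = "error"


@dataclass
class Observation:
    """Stand-in for the proposed base class, without Schema/ABC."""
    error: str | None = None

    @property
    def has_error(self) -> bool:
        return self.error is not None

    @property
    def status(self) -> ObservationStatus:
        if self.has_error:
            return ObservationStatus.ERROR
        return ObservationStatus.SUCCESS


def count_by_status(observations: list[Observation]) -> dict[str, int]:
    # Start every status at zero so missing statuses still show up.
    counts = {status.value: 0 for status in ObservationStatus}
    for obs in observations:
        counts[obs.status.value] += 1
    return counts


batch = [Observation(), Observation(error="timeout"), Observation()]
print(count_by_status(batch))  # {'success': 2, 'error': 1}
```

Because `ObservationStatus` mixes in `str`, a comparison such as `obs.status == "error"` also works, which keeps the value easy to serialize and log.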

### 3. Standardized Output Field

**Current Problem:**
- Inconsistent field names for primary output: `output`, `content`, `message`
- 7 observations use 3 different field names

**Current Field Usage:**
```python
# Variant A: "output" field (3 observations)
output: str                           # ExecuteBash, FileEditor, Browser

# Variant B: "content" field (3 observations)  
content: str                          # TaskTracker, Think
content: list[TextContent | ImageContent]  # MCPTool

# Variant C: "message" field (1 observation)
message: str                          # Finish

# Variant D: No direct output field (2 observations)
# Use structured data only: matches, files  # Grep, Glob
```

**Proposed Base Class Enhancement:**
```python
class Observation(Schema, ABC):
    # Standardized primary output field
    output: str = Field(
        default="",
        description="Primary text output from the tool operation"
    )
```

Also add a base implementation of `to_llm_content` to the base class:

```python
@property
def to_llm_content(self) -> Sequence[TextContent | ImageContent]:
    # Error-first: report the failure instead of the normal output.
    if self.has_error:
        return [self._format_error()]
    return [TextContent(text=self.output)]
```

Subclasses can override the base implementation of `to_llm_content` to provide more context to the agent if needed.
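A minimal sketch of such an override for an observation that carries structured data rather than plain text. The classes are dataclass stand-ins for the SDK types, and `GrepLikeObservation`, its `matches` field type, and the output wording are assumptions made for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class TextContent:
    """Stand-in for the SDK's TextContent."""
    text: str


@dataclass
class Observation:
    """Stand-in for the proposed base class, without Schema/ABC."""
    error: str | None = None
    output: str = ""

    @property
    def has_error(self) -> bool:
        return self.error is not None

    @property
    def to_llm_content(self) -> list[TextContent]:
        if self.has_error:
            return [TextContent(text=f"Error: {self.error}")]
        return [TextContent(text=self.output)]


@dataclass
class GrepLikeObservation(Observation):
    """Hypothetical structured observation; `matches` is an assumed field."""
    matches: list[str] = field(default_factory=list)

    @property
    def to_llm_content(self) -> list[TextContent]:
        # Reuse the shared error-first behavior from the base class.
        if self.has_error:
            return super().to_llm_content
        header = f"Found {len(self.matches)} match(es):"
        return [TextContent(text="\n".join([header, *self.matches]))]


hit = GrepLikeObservation(matches=["a.py:3", "b.py:7"])
miss = GrepLikeObservation(error="invalid regex")
print(hit.to_llm_content[0].text)
print(miss.to_llm_content[0].text)  # Error: invalid regex
```

Delegating the error branch to `super()` keeps error formatting in one place, so subclasses only describe their success output.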




