Make Stuck Detection Thresholds Configurable #762

@neubig

Summary

Loop/stuck detection is implemented in the StuckDetector class, but its detection thresholds are hardcoded. This produces false positives in legitimate use cases where agents must repeat the same operation or wait for long-running tasks to complete.

Problem Statement

The stuck detector currently uses hardcoded thresholds for detecting various loop patterns:

  1. Repeating action-observation cycles: 4 identical action-observation pairs trigger stuck detection (line 107)
  2. Repeating action-error cycles: 3 identical actions with errors trigger stuck detection (line 142)
  3. Agent monologue: 3 consecutive agent messages trigger stuck detection (line 175)
  4. Alternating patterns: 6 actions/observations in a ping-pong pattern trigger stuck detection (lines 194-195)
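
To make the role of a threshold concrete, here is a minimal sketch of the first pattern (repeating action-observation cycles). It is not the actual StuckDetector code: the function name `is_repeating` and the use of plain string tuples in place of real events are assumptions made for illustration only.

```python
from collections.abc import Sequence


def is_repeating(pairs: Sequence[tuple[str, str]], threshold: int) -> bool:
    """Return True if the last `threshold` (action, observation) pairs are identical.

    Hypothetical helper: the real detector compares event objects, not strings.
    """
    if threshold < 1 or len(pairs) < threshold:
        return False
    tail = pairs[-threshold:]
    return all(pair == tail[0] for pair in tail)


# Four identical polls, as in a "poll every 5 minutes for 15 minutes" task.
history = [("curl /status", "pending")] * 4

print(is_repeating(history, threshold=4))   # True: the hardcoded value fires
print(is_repeating(history, threshold=10))  # False: a raised threshold tolerates it
```

The only thing that changes between the two calls is the threshold, which is the value this issue proposes to expose.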

Real-World Use Case

When agents interact with long-running API endpoints (e.g., model training, data processing), they need to poll repeatedly over extended periods (e.g., 15 minutes). The current thresholds cause false positives:

  • Agent is instructed to "poll every 5 minutes for up to 15 minutes"
  • Agent executes the same polling command 4 times (0, 5, 10, 15 minutes)
  • Stuck detector triggers after 4 identical action-observation pairs
  • Agent stops with "Agent stuck in loop" error, even though the behavior is intentional and instructed
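
The arithmetic behind this false positive can be sketched as follows. The helper names are hypothetical, and the sketch assumes the worst case where every poll returns an identical observation and the detector fires as soon as the count reaches the threshold.

```python
def polls_needed(total_minutes: int, interval_minutes: int) -> int:
    """Count polls at t = 0, interval, 2 * interval, ... up to and including total."""
    return total_minutes // interval_minutes + 1


def min_safe_threshold(total_minutes: int, interval_minutes: int) -> int:
    """Smallest threshold that an intended polling run never reaches."""
    return polls_needed(total_minutes, interval_minutes) + 1


polls = polls_needed(15, 5)
print(polls)                      # 4 polls: at 0, 5, 10, and 15 minutes
print(min_safe_threshold(15, 5))  # 5: one more than the number of intended polls
```

With the hardcoded threshold of 4, the fourth intended poll is exactly the one that trips the detector.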

Current State

Is loop detection implemented? YES

  • Implemented in openhands/sdk/conversation/stuck_detector.py
  • Can be enabled/disabled via stuck_detection=True parameter in LocalConversation.__init__()

Is it configurable? NO

  • Thresholds are hardcoded in the StuckDetector class
  • No way to adjust sensitivity for different use cases
  • Binary on/off is insufficient for legitimate repetitive operations

Proposed Solution

Make the stuck detection thresholds configurable through the StuckDetector constructor:

class StuckDetector:
    def __init__(
        self,
        state: ConversationState,
        # Configurable thresholds with sensible defaults
        action_observation_threshold: int = 4,
        action_error_threshold: int = 3,
        monologue_threshold: int = 3,
        alternating_pattern_threshold: int = 6,
    ):
        self.state = state
        self.action_observation_threshold = action_observation_threshold
        self.action_error_threshold = action_error_threshold
        self.monologue_threshold = monologue_threshold
        self.alternating_pattern_threshold = alternating_pattern_threshold

Then expose these parameters through LocalConversation.__init__():

class LocalConversation(BaseConversation):
    def __init__(
        self,
        agent: AgentBase,
        workspace: str | LocalWorkspace,
        persistence_dir: str | None = None,
        conversation_id: ConversationID | None = None,
        callbacks: list[ConversationCallbackType] | None = None,
        max_iteration_per_run: int = 500,
        stuck_detection: bool = True,
        # New parameters
        stuck_detection_thresholds: dict[str, int] | None = None,
        visualize: bool = True,
        secrets: Mapping[str, SecretValue] | None = None,
        **_: object,
    ):
        # ...
        if stuck_detection:
            thresholds = stuck_detection_thresholds or {}
            self._stuck_detector = StuckDetector(
                self._state,
                action_observation_threshold=thresholds.get('action_observation', 4),
                action_error_threshold=thresholds.get('action_error', 3),
                monologue_threshold=thresholds.get('monologue', 3),
                alternating_pattern_threshold=thresholds.get('alternating_pattern', 6),
            )
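
One design consideration with a plain dict is that a misspelled key (for example 'action_obs') would be silently ignored, and the defaults would live in two places. A possible variation, sketched below, keeps the defaults in a single small dataclass and rejects unknown keys. The name `StuckThresholds` and its `from_overrides` method are hypothetical and not part of the SDK.

```python
from __future__ import annotations

from dataclasses import dataclass, fields


@dataclass(frozen=True)
class StuckThresholds:
    """Hypothetical container; defaults mirror the currently hardcoded values."""

    action_observation: int = 4
    action_error: int = 3
    monologue: int = 3
    alternating_pattern: int = 6

    def __post_init__(self) -> None:
        # Reject values that would disable detection or make no sense.
        for f in fields(self):
            value = getattr(self, f.name)
            if isinstance(value, bool) or not isinstance(value, int) or value < 1:
                raise ValueError(f"{f.name} must be a positive integer, got {value!r}")

    @classmethod
    def from_overrides(cls, overrides: dict[str, int] | None) -> StuckThresholds:
        """Merge user overrides onto the defaults, failing on unknown keys."""
        overrides = overrides or {}
        unknown = set(overrides) - {f.name for f in fields(cls)}
        if unknown:
            raise ValueError(f"Unknown threshold keys: {sorted(unknown)}")
        return cls(**overrides)


thresholds = StuckThresholds.from_overrides({"action_observation": 10})
print(thresholds.action_observation, thresholds.action_error)  # 10 3
```

Whether the public parameter stays a dict or becomes a typed object is a judgment call; the dict form proposed above is simpler, while a typed object catches typos earlier.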

Usage Example

import os

# For long-running tasks that need more tolerance
conversation = Conversation(
    agent=agent,
    workspace=os.getcwd(),
    stuck_detection=True,
    stuck_detection_thresholds={
        'action_observation': 10,  # Allow 10 repetitions for polling
        'action_error': 5,         # More retries on errors
    }
)

Benefits

  1. Backward compatible: Default values maintain current behavior
  2. Flexible: Users can adjust thresholds based on their use case
  3. Still safe: Stuck detection remains enabled by default with sensible defaults
  4. Explicit: Users must consciously increase thresholds, preventing accidental infinite loops

Alternative Considered

The alternative is to disable stuck detection entirely (stuck_detection=False), which removes all protection. Configurable thresholds are preferable because they provide:

  • Protection against actual infinite loops
  • Support for legitimate repetitive operations
  • Fine-grained control per detection pattern

Implementation Checklist

  • Add threshold parameters to StuckDetector.__init__()
  • Update all detection methods to use configurable thresholds
  • Add stuck_detection_thresholds parameter to LocalConversation.__init__()
  • Add tests for custom thresholds
  • Update documentation and examples
  • Consider adding to RemoteConversation for consistency

Labels: enhancement (New feature or request)
