Commits (45)
f4f7aa5  Fix typo in pipeline source node description (Tejeshyewale, Apr 26, 2026)
b22e844  docs: improve README clarity and deployment instructions (Tejeshyewale, Apr 26, 2026)
319fca1  Update README.md (Tejeshyewale, Apr 26, 2026)
7ce10e3  Create README.md (Tejeshyewale, Apr 28, 2026)
90fac4d  Create code.py (Tejeshyewale, Apr 28, 2026)
b0598a2  Create IGlobal.py (Tejeshyewale, Apr 28, 2026)
ca2b870  Create IInstance.py (Tejeshyewale, Apr 28, 2026)
5862bd8  Create services.json (Tejeshyewale, Apr 28, 2026)
b33eff4  Create requirements.txt (Tejeshyewale, Apr 28, 2026)
be13299  Update code.py (Tejeshyewale, Apr 28, 2026)
4d88f21  Add files via upload (Tejeshyewale, Apr 28, 2026)
48a784f  Rename IInstance.py to Instance.py (Tejeshyewale, Apr 28, 2026)
7e6963d  Update services.json (Tejeshyewale, Apr 28, 2026)
5952843  Update requirements.txt (Tejeshyewale, Apr 28, 2026)
c667348  Update README.md (Tejeshyewale, Apr 28, 2026)
12c00e2  Update requirements.txt (Tejeshyewale, Apr 28, 2026)
c587ada  Update code.py (Tejeshyewale, Apr 28, 2026)
bb482f0  Merge branch 'rocketride-org:develop' into develop (Tejeshyewale, Apr 30, 2026)
8e676e5  Fix PreProcessor structure and improve error handling (Tejeshyewale, Apr 30, 2026)
50a0bd4  Fix PreProcessor node structure and improve error handling (Tejeshyewale, Apr 30, 2026)
96cf781  Update IGlobal.py (Tejeshyewale, Apr 30, 2026)
92593be  Update Instance.py (Tejeshyewale, Apr 30, 2026)
fe1f418  Update code.py (Tejeshyewale, Apr 30, 2026)
89504f9  Update code.py (Tejeshyewale, Apr 30, 2026)
dc7f679  Update IGlobal.py (Tejeshyewale, Apr 30, 2026)
4c9d406  Update Instance.py (Tejeshyewale, Apr 30, 2026)
c39113d  Update services.json (Tejeshyewale, Apr 30, 2026)
47ad4f9  Update code.py (Tejeshyewale, Apr 30, 2026)
3e8e46b  Update IGlobal.py (Tejeshyewale, Apr 30, 2026)
5399865  Update Instance.py (Tejeshyewale, Apr 30, 2026)
43c725d  Update README.md (Tejeshyewale, Apr 30, 2026)
5602c71  Merge branch 'rocketride-org:develop' into develop (Tejeshyewale, Apr 30, 2026)
9eaba7e  Update IGlobal.py (Tejeshyewale, May 1, 2026)
efa13f6  Update Instance.py (Tejeshyewale, May 1, 2026)
66e54c3  Update code.py (Tejeshyewale, May 1, 2026)
cc8840e  Create __init__.py (Tejeshyewale, May 1, 2026)
1f33c2f  Update requirements.txt (Tejeshyewale, May 1, 2026)
4d29752  Update services.json (Tejeshyewale, May 1, 2026)
9ad5458  Rename Instance.py to IInstance.py (Tejeshyewale, May 1, 2026)
282a102  Delete nodes/src/nodes/ml_sklearn/model.pkl (Tejeshyewale, May 1, 2026)
718b5ce  Merge branch 'rocketride-org:develop' into develop (Tejeshyewale, May 1, 2026)
46d6428  Merge branch 'develop' into develop (kwit75, May 1, 2026)
83abe32  Merge branch 'develop' into develop (kwit75, May 1, 2026)
3d91916  Merge branch 'rocketride-org:develop' into develop (Tejeshyewale, May 2, 2026)
4fe5fe9  feat(ai): add structured logging to task engine lifecycle (Tejeshyewale, May 3, 2026)
50 changes: 50 additions & 0 deletions nodes/src/nodes/ml_sklearn/IGlobal.py
@@ -0,0 +1,50 @@
# =============================================================================
# MIT License
# Copyright (c) 2026 RocketRide Contributors
# =============================================================================

# ------------------------------------------------------------------------------
# This class controls the data shared between all threads for the task
# ------------------------------------------------------------------------------
import os

from rocketlib import IGlobalBase, OPEN_MODE, warning
from ai.common.config import Config


class IGlobal(IGlobalBase):
    """Global state for the ml_sklearn node — holds the loaded sklearn model."""

    preprocessor: object = None  # The sklearn model/pipeline instance

    def validateConfig(self):
        """Validate that scikit-learn and numpy are available."""
        try:
            from depends import depends

            requirements = os.path.dirname(os.path.realpath(__file__)) + '/requirements.txt'
            depends(requirements)
        except Exception as e:  # noqa: BLE001
            warning(str(e))
Review comment on lines +22 to +28 (Contributor):

🧹 Nitpick | 🔵 Trivial | ⚡ Quick win

Requirements path is duplicated across validateConfig and beginGlobal.

The expression os.path.dirname(os.path.realpath(__file__)) + '/requirements.txt' appears verbatim in both methods. Extract it to a class-level constant to avoid divergence on future renames.

♻️ Proposed refactor
 class IGlobal(IGlobalBase):
     """Global state for the ml_sklearn node — holds the loaded sklearn model."""

     preprocessor: object = None  # The sklearn model/pipeline instance
+    _REQUIREMENTS = os.path.join(os.path.dirname(os.path.realpath(__file__)), 'requirements.txt')

     def validateConfig(self):
         """Validate that scikit-learn and numpy are available."""
         try:
             from depends import depends
-            requirements = os.path.dirname(os.path.realpath(__file__)) + '/requirements.txt'
-            depends(requirements)
+            depends(self._REQUIREMENTS)
         except Exception as e:  # noqa: BLE001
             warning(str(e))

     def beginGlobal(self):
         """Load the sklearn model at runtime startup."""
         if self.IEndpoint.endpoint.openMode == OPEN_MODE.CONFIG:
             pass
         else:
             from depends import depends
-            requirements = os.path.dirname(os.path.realpath(__file__)) + '/requirements.txt'
-            depends(requirements)
+            depends(self._REQUIREMENTS)

Also applies to: 39-40

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/ml_sklearn/IGlobal.py` around lines 22-28: the duplicated
requirements path expression used in beginGlobal and validateConfig should be
extracted to a single class-level constant (e.g. REQUIREMENTS_PATH) so both
methods reference that constant; update the class (in IGlobal) to define
REQUIREMENTS_PATH = os.path.join(os.path.dirname(os.path.realpath(__file__)),
'requirements.txt') and replace the inline expressions in beginGlobal and
validateConfig with that constant, keeping the existing try/except and
depends(requirements) call semantics.


    def beginGlobal(self):
        """Load the sklearn model at runtime startup."""
        if self.IEndpoint.endpoint.openMode == OPEN_MODE.CONFIG:
            # Config mode: don't load the model, we'll only be called
            # to configure the service definition.
            pass
        else:
            from depends import depends

            requirements = os.path.dirname(os.path.realpath(__file__)) + '/requirements.txt'
            depends(requirements)

            # Deferred import — only after deps are installed
            from .code import PreProcessor

            config = Config.getNodeConfig(self.glb.logicalType, self.glb.connConfig)
            self.preprocessor = PreProcessor(config)

    def endGlobal(self):
        """Release the sklearn model."""
        self.preprocessor = None
45 changes: 45 additions & 0 deletions nodes/src/nodes/ml_sklearn/IInstance.py
@@ -0,0 +1,45 @@
# =============================================================================
# MIT License
# Copyright (c) 2026 RocketRide Contributors
# =============================================================================

# ------------------------------------------------------------------------------
# This class controls the data for each thread of the task
# ------------------------------------------------------------------------------
import copy

from rocketlib import IInstanceBase, Entry

from .IGlobal import IGlobal


class IInstance(IInstanceBase):
    """Per-thread instance for the ml_sklearn node."""

    IGlobal: IGlobal

    def open(self, obj: Entry):
        """Called before each new pipeline object — nothing to reset for this node."""
        pass

    def writeAnswers(self, question):
        """
        Receive a question from upstream, run sklearn inference on its text,
        and forward the result to the answers output lane.

        The question is deep-copied to prevent mutation in fan-out pipelines.
        """
        if self.IGlobal.preprocessor is None:
            raise RuntimeError('sklearn PreProcessor not initialized')

        question = copy.deepcopy(question)

        # Get the text to process
        text = question.text if hasattr(question, 'text') else str(question)

        # Run inference
        result = self.IGlobal.preprocessor.process(text)

        # Write result back to the question object and forward downstream
        question.text = result
        self.instance.writeAnswers(question)
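
For context on the deepcopy above: in a fan-out pipeline the same object may be handed to several downstream nodes, so mutating question.text in place would leak this node's output into sibling branches. A minimal sketch of the hazard, using a hypothetical stand-in for Entry (not the real rocketlib class):

import copy


class FakeEntry:  # hypothetical stand-in for rocketlib's Entry
    def __init__(self, text: str) -> None:
        self.text = text


shared = FakeEntry('250')
branch_a = shared                  # fan-out without a copy: same object
branch_b = copy.deepcopy(shared)   # what writeAnswers does before mutating

branch_a.text = '3.5'              # one branch mutates in place...
print(shared.text)                 # '3.5': the sibling reference sees it
print(branch_b.text)               # '250': the deep copy stays isolated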
19 changes: 19 additions & 0 deletions nodes/src/nodes/ml_sklearn/README.md
@@ -0,0 +1,19 @@
# ML Sklearn Prediction Node

This node performs predictions using a trained scikit-learn model.

## Input

- text (number as string)
Review comment on line 7 (Contributor):

⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Test fixture in services.json contradicts the documented input format.

The README specifies the input as text (number as string), but the test fixture in services.json supplies "hello world" — a non-numeric string. These should be consistent so the test fixture exercises the documented contract. Consider aligning the fixture to a numeric string (e.g., "250") to match the stated input type.

Also applies to: 44-49

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/ml_sklearn/README.md` at line 7: the README entry that
documents the node input as "text (number as string)" conflicts with the test
fixture in services.json, which currently supplies "hello world"; update the
test fixture(s) in services.json to provide a numeric string (for example
"250") so the fixture matches the documented input contract described in
README.md (also fix the other instances noted around lines 44-49).
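
For illustration, a sketch of the aligned fixture in services.json (the value "250" follows the reviewer's example and is not a required constant):

"test": {
    "answers": [
        {
            "text": "250"
        }
    ]
}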


## Output

- text (predicted value as string)

## Example

Input:
250

Output:
3.5
4 changes: 4 additions & 0 deletions nodes/src/nodes/ml_sklearn/__init__.py
@@ -0,0 +1,4 @@
from .IGlobal import IGlobal
from .IInstance import IInstance

__all__ = ['IGlobal', 'IInstance']
47 changes: 47 additions & 0 deletions nodes/src/nodes/ml_sklearn/code.py
@@ -0,0 +1,47 @@
# =============================================================================
# MIT License
# Copyright (c) 2026 RocketRide Contributors
# =============================================================================

# ------------------------------------------------------------------------------
# PreProcessor: sklearn-based text inference class
# All heavy imports are deferred — this file is imported only after
# depends() has installed requirements.txt in beginGlobal().
# ------------------------------------------------------------------------------


class PreProcessor:
    """Wraps a scikit-learn model/pipeline for text inference."""

    def __init__(self, config: dict):
        """
        Initialize the sklearn model.

        In a real deployment, you'd load a pickled model from a path
        specified in config. This stub returns text unchanged so the
        node is CI-safe without a pre-trained model artifact.
        """
        # Example: load a real model like this:
        #   import joblib
        #   model_path = config.get('model_path', '')
        #   self._model = joblib.load(model_path)
        self._model = None  # Replace with actual model loading
Review comment on lines +16 to +28 (Contributor):

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

config is not stored — silently drops runtime configuration; also missing -> None annotation.

Two related problems:

  1. Unused config (ARG002): The parameter is received and ignored — self._config = config is never set. The commented-out loading example works because config is in scope within __init__, but if any future method (e.g., a reload helper) needs it, it won't be accessible. At minimum, rename the parameter to _config to signal intentional non-use in the stub, or store it.
  2. Missing -> None annotation (ANN204): Ruff flags this; per the project's ruff lint requirement for nodes/**/*.py, __init__ should be annotated with -> None.
🛠️ Proposed fix
-    def __init__(self, config: dict):
+    def __init__(self, config: dict) -> None:
         """
         Initialize the sklearn model.
         ...
         """
+        self._config = config  # Retained for real model loading
         # Example: load a real model like this:
         # import joblib
-        # model_path = config.get('model_path', '')
+        # model_path = self._config.get('model_path', '')
         # self._model = joblib.load(model_path)
         self._model = None  # Replace with actual model loading
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 16-16: Missing return type annotation for special method __init__

Add return type annotation: None

(ANN204)


[warning] 16-16: Unused method argument: config

(ARG002)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/ml_sklearn/code.py` around lines 16-28: the __init__
currently accepts config but doesn't store it and lacks a return type; update
the constructor signature to include the return annotation (def __init__(self,
config: dict) -> None:) and either store the config on the instance
(self._config = config) so future methods (e.g., reload helpers) can access it,
or rename the parameter to _config to signal intentional non-use; ensure
existing usage of self._model remains unchanged (self._model = None) after
adding the annotation and storing/renaming the config.


    def process(self, text: str) -> str:
        """
        Run sklearn inference on input text and return processed text.

        Args:
            text: The input string to process.

        Returns:
            The processed string. Currently passes through unchanged.
        """
        if self._model is None:
            # Pass-through when no model is loaded (safe for CI)
            return text

        # Example with a real model:
        #   prediction = self._model.predict([text])
        #   return str(prediction[0])
        return text
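
For reference, a hedged sketch of what this class could look like once a pickled artifact ships, expanding the commented joblib example above (the model_path config key and the single-item predict batch are assumptions, not part of this PR):

import joblib


class PreProcessor:
    """Sketch only: assumes config carries a valid 'model_path'."""

    def __init__(self, config: dict) -> None:
        self._config = config  # retained, per the review suggestion above
        # Load the pickled sklearn pipeline from the configured path.
        self._model = joblib.load(config['model_path'])

    def process(self, text: str) -> str:
        # sklearn estimators expect a batch, so wrap the single input.
        prediction = self._model.predict([text])
        return str(prediction[0])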
2 changes: 2 additions & 0 deletions nodes/src/nodes/ml_sklearn/requirements.txt
@@ -0,0 +1,2 @@
scikit-learn>=1.0.0,<2.0.0
numpy>=1.21.0,<3.0.0
52 changes: 52 additions & 0 deletions nodes/src/nodes/ml_sklearn/services.json
@@ -0,0 +1,52 @@
{
    "ml_sklearn": {
        "name": "ML sklearn",
        "description": "Applies a trained scikit-learn model to process text through the pipeline.",
        "icon": "python.svg",
        "group": "preprocessor",
        "color": "#f97316",
        "runtime": "python",
        "pipe": {
            "lanes": {
                "answers": {
                    "in": true,
                    "out": true
                }
            }
        },
Review comment on lines +9 to +16 (Contributor):

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Check what lane schema is used by other existing nodes to confirm
# whether ml_sklearn's pipe.lanes schema matches or deviates from the pattern.

rg -l '"pipe"' nodes/src/nodes --include='services.json' -n

echo "--- agent_langchain services.json ---"
fd 'services.json' nodes/src/nodes/agent_langchain --exec cat {}

echo "--- search_exa services.json ---"
fd 'services.json' nodes/src/nodes/search_exa --exec cat {}

echo "--- ml_sklearn services.json ---"
fd 'services.json' nodes/src/nodes/ml_sklearn --exec cat {}



🏁 Script executed:

# Check if any other nodes use the "pipe" pattern
echo "=== Checking for 'pipe' usage in all services.json files ==="
find nodes/src/nodes -name 'services.json' -type f | while read file; do
  if grep -q '"pipe"' "$file"; then
    echo "Found 'pipe' in: $file"
  fi
done

echo ""
echo "=== Total services.json files in codebase ==="
find nodes/src/nodes -name 'services.json' -type f | wc -l

echo ""
echo "=== List all services.json file locations ==="
find nodes/src/nodes -name 'services.json' -type f



🏁 Script executed:

# Search for service registry or parser code that validates/processes lanes
echo "=== Searching for service registry or parser related code ==="
fd --type f '(registry|parser|service)' | grep -E '\.(ts|js|py)$' | head -20

echo ""
echo "=== Searching in src directory for registry/parser patterns ==="
find src -type f \( -name '*registry*' -o -name '*parser*' -o -name '*service*' \) 2>/dev/null | head -20



🏁 Script executed:

# Examine pipelineParser.ts to understand how it processes lanes
echo "=== Checking pipelineParser.ts ==="
wc -l apps/vscode/src/shared/util/pipelineParser.ts

# Check first portion to understand structure
head -100 apps/vscode/src/shared/util/pipelineParser.ts



🏁 Script executed:

# Check registry.js for schema handling
echo "=== Checking registry.js ==="
cat scripts/lib/registry.js | head -200



🏁 Script executed:

# Sample a standard node to confirm pattern
echo "=== Standard node: accessibility_describe ==="
cat nodes/src/nodes/accessibility_describe/services.json | head -50



🏁 Script executed:

# Search for Python code that loads/parses services.json
echo "=== Searching for services.json loading in Python ==="
rg 'services\.json' packages/client-python packages/common --type py -B 2 -A 2 | head -100

echo ""
echo "=== Searching for lane/pipe parsing logic ==="
rg '(lane|pipe)' packages/ai/src/ai/modules/services/ --type py -B 2 -A 2 | head -150



🏁 Script executed:

# Broader search for services.json loading
echo "=== Searching for services.json file operations ==="
rg 'services\.json' --type py --type ts --type js -B 3 -A 3 | head -200

echo ""
echo "=== Searching for 'lanes' key handling ==="
rg '"lanes"' --type py --type ts --type js -B 2 -A 2 | head -150



🏁 Script executed:

# Check the actual service loader in packages/ai
echo "=== Checking services.py in packages/ai ==="
find packages/ai -name 'services.py' -o -name '*service*.py' | head -10



🏁 Script executed:

# Examine the actual services module that loads and processes lanes
echo "=== Checking packages/ai/src/ai/modules/services/services.py ==="
wc -l packages/ai/src/ai/modules/services/services.py

# Check first 150 lines for import and lane handling
head -150 packages/ai/src/ai/modules/services/services.py

echo ""
echo "=== Look for 'lanes' key access ==="
grep -n 'lanes' packages/ai/src/ai/modules/services/services.py | head -20



🏁 Script executed:

# Search for ml_sklearn usage in tests or code
echo "=== Searching for ml_sklearn references ==="
rg 'ml_sklearn' --type py --type ts --type js -B 2 -A 2 | head -100

echo ""
echo "=== Checking if there's lane schema validation ==="
rg '(lane.*schema|schema.*lane|validate.*lane)' --type py --type ts --type js -i | head -50

echo ""
echo "=== Look for any documentation on lane format ==="
find . -name '*.md' -type f -exec grep -l 'lane' {} \; | head -10



🏁 Script executed:

# Check if ml_sklearn has any test or implementation files
echo "=== ml_sklearn implementation files ==="
find nodes/src/nodes/ml_sklearn -type f

echo ""
echo "=== Check test files for ml_sklearn ==="
find . -name '*test*' -type f | xargs grep -l 'ml_sklearn' 2>/dev/null | head -5



🏁 Script executed:

# Examine ml_sklearn's implementation to understand how it expects to receive input
echo "=== ml_sklearn code.py ==="
cat nodes/src/nodes/ml_sklearn/code.py

echo ""
echo "=== ml_sklearn README ==="
cat nodes/src/nodes/ml_sklearn/README.md



🏁 Script executed:

# Check documentation on lane format and expected schema
echo "=== Checking ROCKETRIDE_COMPONENT_REFERENCE for lane schema ==="
grep -A 20 -B 5 'lane' docs/agents/ROCKETRIDE_COMPONENT_REFERENCE.md | head -100



Lanes schema must follow the established pattern — replace nested pipe structure with top-level lanes object.

The ml_sklearn node is the only one (out of 71) that nests lanes under "pipe" and uses boolean in/out properties. All other nodes, including agent_langchain and search_exa, declare lanes at the top level with array values:

"lanes": {
  "answers": ["output_lane"]
}

Additionally, all other nodes include an "input" array for routing configuration, which ml_sklearn lacks entirely. According to RocketRide documentation, the lanes object is "the definitive reference for data flow" and is expected at the top level with keys mapping to output lane arrays. The current structure will not be parsed correctly by the service registry, causing pipeline wiring failures at runtime.

Diff showing required changes
"pipe": {
    "lanes": {
        "answers": {
            "in": true,
            "out": true
        }
    }
},

Should be:

"lanes": {
    "answers": ["answers"]
},
"input": [
    {
        "lane": "answers",
        "output": [{ "lane": "answers" }]
    }
],
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@nodes/src/nodes/ml_sklearn/services.json` around lines 9-16: the JSON uses
a nested "pipe" -> "lanes" object with boolean "in"/"out" for the "answers"
lane; replace that with a top-level "lanes" object where "answers" maps to an
array (e.g., "answers": ["answers"]) and add a top-level "input" array that
routes from lane "answers" to output lane "answers" per the other services'
schema; specifically, remove the "pipe" block and add the top-level "lanes" and
"input" keys for the ml_sklearn service so the registry can parse "answers"
correctly.

"preconfig": {
"default": {
"object": "default",
"properties": []
}
},
"profiles": {
"ml_sklearn.default": {
"object": "default",
"properties": []
}
},
"fields": [],
"shape": {
"inputs": [
{
"name": "answers",
"type": "answers"
}
],
"outputs": [
{
"name": "answers",
"type": "answers"
}
]
},
"test": {
"answers": [
{
"text": "hello world"
}
]
}
}
}
31 changes: 31 additions & 0 deletions packages/ai/src/ai/modules/task/task_engine.py
@@ -61,6 +61,10 @@
from .types import LAUNCH_TYPE
from .task_conn import TaskConn
from .task_metrics import TaskMetrics
from .task_logger import get_task_logger

# Module-level structured logger for task lifecycle events
_logger = get_task_logger(__name__)


if TYPE_CHECKING:
@@ -843,6 +847,15 @@ async def _terminated(self) -> None:
            apikey=task_apikey,
        )

        _logger.info(
            'Task terminated',
            extra={
                'task_id': self.id,
                'step': 'termination',
                'exit_code': self._status.exitCode,
                'final_state': self._status.state,
            },
        )
        self.debug_message('Resource cleanup completed successfully')

    def _on_metrics_updated(self) -> None:
@@ -1470,6 +1483,10 @@ async def start_task(self) -> None:

        # Set our current state
        self._status.state = TASK_STATE.STARTING.value
        _logger.info(
            'Task starting',
            extra={'task_id': self.id, 'step': 'start'},
        )

        # Resolve any ${...} in the pipeline
        self._pipeline = self._resolve_pipeline(self._pipeline)
@@ -1586,6 +1603,15 @@ async def start_task(self) -> None:
            env=subprocess_env,
        )

        _logger.info(
            'Subprocess created',
            extra={
                'task_id': self.id,
                'step': 'subprocess',
                'pid': self._engine_process.pid,
            },
        )

        # Initialize stdio interface
        try:
            self._debug_stdio = Task.TaskDbgStdio(
@@ -1661,6 +1687,11 @@ async def start_task(self) -> None:

        except Exception as e:
            await self._terminated()
            _logger.error(
                'Task startup failed',
                extra={'task_id': self.id, 'step': 'error', 'error': str(e)},
                exc_info=True,
            )
            self.debug_message(f'Task startup failed: {e}')
            raise
Review comment on lines 1688 to 1696 (Contributor):

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Error log fires after _terminated() — "Task terminated" will always precede "Task startup failed" in the log stream, inverting the causal order.

Inside _terminated() (line 850), the termination event is logged unconditionally. Because await self._terminated() runs first (line 1689), every startup failure will produce log entries in this order:

{"step": "termination", "message": "Task terminated", ...}   ← logged inside _terminated()
{"step": "error",       "message": "Task startup failed", ...} ← logged after _terminated() returns

A developer querying for the root cause of a startup failure will see the termination message first and have to work backwards. Moving the error log before the _terminated() call restores the expected causal sequence.

Additionally, Ruff G201 flags the exc_info=True form — _logger.exception() is the idiomatic spelling and drops the redundant kwarg.

🐛 Proposed fix — log the error first, then clean up, use exception()
         except Exception as e:
-            await self._terminated()
-            _logger.error(
+            _logger.exception(
                 'Task startup failed',
-                extra={'task_id': self.id, 'step': 'error', 'error': str(e)},
-                exc_info=True,
+                extra={'task_id': self.id, 'step': 'error', 'error': str(e)},
             )
+            await self._terminated()
             self.debug_message(f'Task startup failed: {e}')
             raise
🧰 Tools
🪛 Ruff (0.15.12)

[warning] 1690-1690: Logging .exception(...) should be used instead of .error(..., exc_info=True)

(G201)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ai/src/ai/modules/task/task_engine.py` around lines 1688-1696:
the exception handling in the task startup path logs termination before the
error (the call to await self._terminated() happens before logging) and uses
_logger.error(..., exc_info=True) instead of the idiomatic _logger.exception;
update the except block in the task startup code so it first logs the failure
with _logger.exception('Task startup failed', extra={...}) and calls
self.debug_message(...) (or includes the same context), and only after logging
awaits self._terminated(); keep the final raise to re-raise the exception.


45 changes: 45 additions & 0 deletions packages/ai/src/ai/modules/task/task_logger.py
@@ -0,0 +1,45 @@
from __future__ import annotations

import json
import logging
import time
from typing import Any


class _StructuredFormatter(logging.Formatter):
    _RESERVED = frozenset(logging.LogRecord(
        '', 0, '', 0, '', (), None
    ).__dict__.keys()) | {'message', 'asctime'}

    def format(self, record: logging.LogRecord) -> str:
        record.message = record.getMessage()

        payload = {
            'timestamp': time.strftime(
                '%Y-%m-%dT%H:%M:%SZ', time.gmtime(record.created)
            ),
            'level': record.levelname,
            'logger': record.name,
            'message': record.message,
        }

        for key, value in record.__dict__.items():
            if key not in self._RESERVED:
                payload[key] = value

        if record.exc_info:
            payload['exception'] = self.formatException(record.exc_info)

        return json.dumps(payload, default=str)


def get_task_logger(name: str) -> logging.Logger:
    logger = logging.getLogger(name)

    if not logger.handlers:
        handler = logging.StreamHandler()
        handler.setFormatter(_StructuredFormatter())
        logger.addHandler(handler)
        logger.propagate = False

    return logger
Review comment on lines +36 to +45 (Contributor):

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

get_task_logger never sets a log level — all _logger.info() calls will be silently dropped in default configurations.

Python's effective-level resolution always traverses the logger hierarchy regardless of propagate. With propagate = False, records don't bubble up to parent handlers, but level filtering still uses getEffectiveLevel(), which walks up to the root until it finds a non-NOTSET level. The root logger's default level is WARNING, so every _logger.info(...) call added by this PR will be filtered out before it ever reaches the StreamHandler.

🐛 Proposed fix — set level so records reach the handler
 def get_task_logger(name: str) -> logging.Logger:
     logger = logging.getLogger(name)

     if not logger.handlers:
         handler = logging.StreamHandler()
         handler.setFormatter(_StructuredFormatter())
         logger.addHandler(handler)
-        logger.propagate = False

+    logger.setLevel(logging.DEBUG)   # pass everything through; let the handler/app filter
+    logger.propagate = False         # always disable propagation, not just on first setup

     return logger

Moving propagate = False outside the guard is also necessary: if the logger already has handlers (e.g. attached by a test framework or earlier setup code), the guard body is skipped and propagate stays True, causing duplicate output to parent handlers.
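
A quick interpreter check of the level-resolution claim, using only the standard library:

import logging

root = logging.getLogger()
root.setLevel(logging.WARNING)           # the stdlib default for the root logger

child = logging.getLogger('task.engine')
child.propagate = False                  # blocks handler dispatch, not level lookup

print(child.getEffectiveLevel())         # 30 (WARNING), inherited from root
print(child.isEnabledFor(logging.INFO))  # False: INFO records never reach handlers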

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/ai/src/ai/modules/task/task_logger.py` around lines 36-45: the
get_task_logger function currently never sets a logger level and only sets
propagate=False inside the "if not logger.handlers" guard, causing INFO
messages to be dropped and duplicate output in some setups; fix by explicitly
setting the logger level (e.g. logger.setLevel(logging.INFO) or
logger.setLevel(logging.DEBUG) as appropriate) so records pass level filtering,
move logger.propagate = False outside the handlers-creation guard so it is
always applied, and optionally ensure the StreamHandler has an appropriate
level (handler.setLevel, or leave NOTSET) so the handler receives the records;
make these changes in get_task_logger.
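
For a concrete picture of what the formatter emits, a small usage sketch (the logger name, task_id value, and import path are illustrative assumptions):

import logging

from task_logger import get_task_logger  # assumes the module is importable by this name

logger = get_task_logger('ai.task')
logger.setLevel(logging.INFO)  # needed today, per the level-filtering finding above
logger.info('Task starting', extra={'task_id': 'abc123', 'step': 'start'})

# One JSON line on stderr, roughly:
# {"timestamp": "2026-05-03T12:00:00Z", "level": "INFO", "logger": "ai.task",
#  "message": "Task starting", "task_id": "abc123", "step": "start"}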
