@swheaton swheaton commented Oct 17, 2025

What changes are proposed in this pull request?

Model changes to DelegatedOperationDocument to support always-run stages and other future functionality.
This separates the static definition of the pipeline (types.Pipeline) from the mutable run state (types.PipelineRunInfo).
As a result, pipeline run information is contained within one subdocument instead of being spread around the document, which is more organized.

The two fields we need added are:

  • active: if True, the pipeline has not failed and is ongoing. If False, the pipeline has failed at some point, and we are proceeding through it to run always-run stages only.
  • expected_children: None if active; otherwise a list of length len(pipeline.stages) that holds the number of children we should expect to see for each stage. Stages that have been skipped have a count of 0.
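To make the two fields concrete, here is a minimal sketch of the run state and of one way it might be updated on failure. The class name RunInfoSketch, the helper mark_failed, the field defaults, and the rule for which stages keep their child count are all assumptions for illustration; they are not taken from the PR's implementation.

```python
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class RunInfoSketch:
    """Hypothetical stand-in for PipelineRunInfo (defaults are assumed)."""

    active: bool = True
    stage_index: int = 0
    expected_children: Optional[List[int]] = None

    @classmethod
    def from_json(cls, json_dict):
        # A missing sub document deserializes to None
        if json_dict is None:
            return None
        return cls(**json_dict)

    def to_json(self):
        return asdict(self)


def mark_failed(run_info, always_run_flags, children_per_stage, failed_index):
    """Flip the run to inactive and record expected children per stage.

    Assumption: stages up to and including the failed one keep their
    count, later stages keep it only if they are always-run, and every
    skipped stage is recorded as 0.
    """
    run_info.active = False
    run_info.expected_children = [
        count if (index <= failed_index or always_run) else 0
        for index, (always_run, count) in enumerate(
            zip(always_run_flags, children_per_stage)
        )
    ]
    return run_info
```

With three stages where only the last is always-run and the first one fails, the middle stage would be recorded as skipped (0) while the final cleanup stage keeps its count.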

How is this patch tested? If it is not, please explain why.

Added/edited unit tests
End-to-end testing was done with the whole system and will be outlined in the voxel hub PR.

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

Added a new PipelineRunInfo type within DelegatedOperationDocument that keeps track of pipeline run state as it goes. Currently useful for internal purposes only.

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Summary by CodeRabbit

  • New Features

    • Pipeline stages can be marked to always run.
    • Pipeline run state tracking added (active flag, expected children, current stage index).
  • Improvements

    • Serialization/deserialization preserves stage always-run flags and pipeline run state.
    • Tests updated to validate always-run propagation and pipeline run state round-trip.

@swheaton swheaton requested review from a team as code owners October 17, 2025 16:39
coderabbitai bot commented Oct 17, 2025

Walkthrough

This PR adds a PipelineRunInfo dataclass, adds an always_run field to PipelineStage, replaces pipeline_index with pipeline_run_info in DelegatedOperationDocument, and updates serialization/deserialization to handle the new fields and list-based pipeline input.

Changes

Cohort / File(s) Summary
Pipeline types & exports
fiftyone/operators/_types/pipeline.py, fiftyone/operators/types.py
Add PipelineRunInfo dataclass (fields: active, expected_children, stage_index) with from_json/to_json. Add always_run: bool to PipelineStage, extend constructors to accept/ignore **kwargs, make Pipeline tolerant of list input in from_json. Expose PipelineRunInfo in public imports.
Delegated operation document
fiftyone/factory/repos/delegated_operation_doc.py
Replace pipeline_index with pipeline_run_info attribute. from_pymongo populates pipeline_run_info = PipelineRunInfo.from_json(doc.get("pipeline_run_info")); to_pymongo includes serialized pipeline_run_info when present. Pipeline still read from doc.get("pipeline").
Unit tests
tests/unittests/factory/delegated_operation_doc_tests.py, tests/unittests/operators/delegated_tests.py, tests/unittests/operators/types_tests.py
Update tests to construct PipelineStage with new kwargs (num_distributed_tasks, params, always_run) and to assert PipelineRunInfo serialization/deserialization and presence after round-trip.

Sequence Diagram

sequenceDiagram
    actor Caller
    participant DOC as DelegatedOperationDocument
    participant P as Pipeline
    participant PRI as PipelineRunInfo

    Caller->>DOC: from_pymongo(doc)
    activate DOC
    DOC->>P: Pipeline.from_json(doc.get("pipeline"))
    DOC->>PRI: PipelineRunInfo.from_json(doc.get("pipeline_run_info"))
    PRI-->>DOC: pipeline_run_info instance
    DOC-->>Caller: DelegatedOperationDocument (with pipeline & pipeline_run_info)
    deactivate DOC

    Caller->>DOC: to_pymongo()
    activate DOC
    DOC->>P: pipeline.to_json()
    DOC->>PRI: pipeline_run_info.to_json() (if present)
    DOC-->>Caller: dict including pipeline and pipeline_run_info
    deactivate DOC
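The round trip in the diagram can be sketched with a toy document class. DocSketch, RunInfoSketch, and the "operator" field are hypothetical stand-ins, and the deep copy of the remaining attributes is an assumption based on the review notes; only the pipeline_run_info handling follows the walkthrough above.

```python
import copy
from dataclasses import asdict, dataclass
from typing import List, Optional


@dataclass
class RunInfoSketch:
    """Hypothetical stand-in for PipelineRunInfo."""

    active: bool = True
    stage_index: int = 0
    expected_children: Optional[List[int]] = None

    @classmethod
    def from_json(cls, json_dict):
        return None if json_dict is None else cls(**json_dict)

    def to_json(self):
        return asdict(self)


class DocSketch:
    """Toy document with one plain field and one custom-serialized field."""

    def __init__(self):
        self.operator = None
        self.pipeline_run_info = None

    def from_pymongo(self, doc):
        self.operator = doc.get("operator")
        # doc.get() returns None when the key is absent, which
        # from_json passes through unchanged
        self.pipeline_run_info = RunInfoSketch.from_json(
            doc.get("pipeline_run_info")
        )
        return self

    def to_pymongo(self):
        # Keep the custom-serialized field out of the generic copy
        ignore_keys = {"pipeline_run_info"}
        result = {
            key: copy.deepcopy(value)
            for key, value in self.__dict__.items()
            if key not in ignore_keys
        }
        if self.pipeline_run_info is not None:
            result["pipeline_run_info"] = self.pipeline_run_info.to_json()
        return result
```

Excluding the field from the generic copy and then adding its to_json() output keeps the stored value a plain dict rather than a Python object.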

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 Hops through stages, new flags unfurled,

Always_run set, and run-info whirled.
Old index rests, a clearer line,
Pipelines saved in JSON fine.
🥕 A rabbit cheers — small code, big shine.

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name | Status | Explanation | Resolution
Docstring Coverage | ⚠️ Warning | Docstring coverage is 23.53% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name | Status | Explanation
Title Check | ✅ Passed | The PR title "FOEPD-2109 PipelineRunInfo type to support always-run pipeline stage" accurately and specifically summarizes the main changes in the pull request. The title directly references the new PipelineRunInfo type being introduced and mentions always-run pipeline stages, which aligns with the core purpose of this PR: introducing a new type to support pipeline run state tracking and always-run functionality. The title is concise, clear, and avoids vague terminology, making it easy for teammates scanning the commit history to understand the primary change.
Description Check | ✅ Passed | The PR description comprehensively addresses all required sections of the template. The "What changes are proposed" section provides clear details about the model changes, the separation of Pipeline from PipelineRunInfo, and the two new fields with their purposes. The "How is this patch tested" section confirms unit tests were added/edited and notes that system-level testing was performed. The Release Notes section is complete, with the user-facing change checkbox marked as "Yes" and a concise description stating the new PipelineRunInfo type is available for tracking pipeline run state. The affected areas checkbox for "Core" is properly marked, meeting all template requirements.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/refactor-pipeline-types-again

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between f2e0eba and 4e7acbd.

📒 Files selected for processing (3)
  • fiftyone/factory/repos/delegated_operation_doc.py (5 hunks)
  • fiftyone/operators/_types/pipeline.py (6 hunks)
  • tests/unittests/operators/types_tests.py (5 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/unittests/operators/types_tests.py
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-01-30T16:40:26.003Z
Learnt from: swheaton
PR: voxel51/fiftyone#5448
File: fiftyone/factory/repos/delegated_operation_doc.py:0-0
Timestamp: 2025-01-30T16:40:26.003Z
Learning: In FiftyOne's DelegatedOperationDocument class, all instance attributes are automatically serialized through `__dict__` in the `to_pymongo()` method, eliminating the need for explicit field inclusion in serialization.

Applied to files:

  • fiftyone/factory/repos/delegated_operation_doc.py
📚 Learning: 2025-10-07T01:19:39.063Z
Learnt from: swheaton
PR: voxel51/fiftyone#6385
File: fiftyone/operators/_types/pipeline.py:64-92
Timestamp: 2025-10-07T01:19:39.063Z
Learning: In fiftyone/operators/_types/pipeline.py, the Pipeline.stage() method intentionally accepts unused **kwargs for forward compatibility. This design pattern should not be flagged as an issue.

Applied to files:

  • fiftyone/operators/_types/pipeline.py
🧬 Code graph analysis (1)
fiftyone/factory/repos/delegated_operation_doc.py (1)
fiftyone/operators/_types/pipeline.py (7)
  • Pipeline (77-169)
  • PipelineRunInfo (173-190)
  • from_json (133-154)
  • from_json (184-187)
  • to_json (67-73)
  • to_json (156-169)
  • to_json (189-190)
🪛 Pylint (4.0.0)
fiftyone/operators/_types/pipeline.py

[refactor] 35-35: Too many positional arguments (6/5)

(R0917)


[refactor] 94-94: Too many positional arguments (6/5)

(R0917)

🪛 Ruff (0.14.0)
fiftyone/operators/_types/pipeline.py

54-54: Avoid specifying long messages outside the exception class

(TRY003)


89-89: Unused method argument: kwargs

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: test-windows / test-python (windows-latest, 3.9)
  • GitHub Check: test-windows / test-python (windows-latest, 3.12)
  • GitHub Check: test-windows / test-python (windows-latest, 3.10)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.12)
  • GitHub Check: test-windows / test-python (windows-latest, 3.11)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
  • GitHub Check: e2e / test-e2e
  • GitHub Check: build
🔇 Additional comments (13)
fiftyone/factory/repos/delegated_operation_doc.py (5)

18-18: LGTM! Import addition is correct.

The import of PipelineRunInfo aligns with the new pipeline run state tracking introduced in this PR.


67-67: LGTM! Attribute addition aligns with PR objectives.

The pipeline_run_info attribute correctly replaces pipeline_index to centralize pipeline run state tracking.


133-136: LGTM! Deserialization is correct and symmetric.

Both pipeline and pipeline_run_info are properly deserialized using their respective from_json() methods, which handle None inputs gracefully.


147-153: LGTM! Correct pattern for custom serialization.

Excluding pipeline and pipeline_run_info from automatic __dict__ serialization is appropriate since they require custom serialization via to_json() in lines 163-166.

Based on learnings.


165-166: LGTM! Serialization is symmetric with deserialization.

The pipeline_run_info serialization via to_json() correctly mirrors the deserialization in line 134-136.

fiftyone/operators/_types/pipeline.py (8)

9-9: LGTM! Import addition is necessary.

The List import is correctly added for type hinting in the new PipelineRunInfo class.


29-29: LGTM! New field supports always-run pipeline stages.

The always_run field correctly enables cleanup/finalization stages that execute even after pipeline failures, as described in the PR objectives.
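As a rough illustration of that behavior, the loop below keeps walking the stage list after a failure but executes only stages flagged always_run. The function run_stages, the dict-based stage shape, and the execute callable are assumptions for this sketch and do not reflect the actual executor.

```python
def run_stages(stages, execute):
    """Run stages in order; after a failure, run only always-run stages.

    ``stages`` is a list of dicts with a "name" and an optional
    "always_run" flag; ``execute`` is any callable that raises on failure.
    Both shapes are assumptions made for this sketch.
    """
    active = True
    ran = []
    for stage in stages:
        if not active and not stage.get("always_run", False):
            continue  # pipeline already failed: skip ordinary stages
        try:
            execute(stage)
        except Exception:
            active = False  # keep going, but only for always-run stages
        else:
            ran.append(stage["name"])
    return active, ran


def fail_on_train(stage):
    """Simulated executor that fails on the stage named "train"."""
    if stage["name"] == "train":
        raise RuntimeError("simulated stage failure")


demo_stages = [
    {"name": "load"},
    {"name": "train"},
    {"name": "evaluate"},
    {"name": "cleanup", "always_run": True},
]
result = run_stages(demo_stages, fail_on_train)
```

Here the "evaluate" stage is skipped once "train" fails, while "cleanup" still runs and the returned flag reports the pipeline as inactive.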


34-50: LGTM! Custom init correctly handles forward compatibility.

The custom __init__ with **_ appropriately discards unused kwargs while explicitly initializing defined fields. Calling __post_init__() ensures validation logic executes.

Based on learnings.


56-60: LGTM! Type normalization improves robustness.

Normalizing num_distributed_tasks to int when present ensures type consistency, especially when deserializing from JSON where numeric types can vary.


88-92: LGTM! Fixed implicit Optional from previous review.

The Optional type annotation is now explicit, resolving the RUF013 warning from the previous review. The **kwargs pattern is intentional for forward compatibility.

Based on learnings.


94-130: LGTM! stage() method correctly extended for always_run.

The always_run parameter is properly added, documented, and passed through to PipelineStage construction. The **kwargs forwarding maintains forward compatibility.


146-150: LGTM! Enhanced from_json with backward compatibility.

The None handling and list-to-dict conversion make from_json more flexible and user-friendly while maintaining backward compatibility with existing dict inputs.


172-190: LGTM! PipelineRunInfo correctly implements mutable run state.

The new PipelineRunInfo dataclass properly separates mutable run-time state from the static pipeline definition, aligning with PR objectives. The from_json and to_json methods follow established patterns and handle edge cases correctly.


@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
fiftyone/operators/_types/pipeline.py (2)

89-119: Forward kwargs in Pipeline.stage and silence unused-kwargs lint

Currently, always_run (and future fields) cannot be set via stage(); kwargs are accepted but dropped. Forward them to PipelineStage.

Apply:

     def stage(
         self,
         operator_uri,
         name=None,
         num_distributed_tasks=None,
         params=None,
         # kwargs accepted for forward compatibility
-        **kwargs,  # pylint: disable=unused-argument
+        **kwargs,  # noqa: ARG002  # pylint: disable=unused-argument
     ):
@@
-        stage = PipelineStage(
+        stage = PipelineStage(
             operator_uri=operator_uri,
             name=name,
             num_distributed_tasks=num_distributed_tasks,
             params=params,
-        )
+            **kwargs,
+        )
         self.stages.append(stage)
         return stage

This preserves forward-compatibility and lets callers do pipeline.stage(..., always_run=True). Based on learnings.
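A minimal sketch of the forwarding idea, using hypothetical StageSketch and PipelineSketch classes rather than the real types. Note that in this simplified version the generated dataclass constructor rejects unknown keyword arguments, so forwarding only works for names the stage class actually defines.

```python
from dataclasses import dataclass, field
from typing import Any, List, Mapping, Optional


@dataclass
class StageSketch:
    """Hypothetical stand-in for PipelineStage."""

    operator_uri: str
    always_run: bool = False
    name: Optional[str] = None
    num_distributed_tasks: Optional[int] = None
    params: Optional[Mapping[str, Any]] = None


@dataclass
class PipelineSketch:
    """Hypothetical stand-in for Pipeline."""

    stages: List[StageSketch] = field(default_factory=list)

    def stage(
        self,
        operator_uri,
        name=None,
        num_distributed_tasks=None,
        params=None,
        **kwargs,
    ):
        # Forward extra keyword arguments (e.g. always_run) to the stage
        new_stage = StageSketch(
            operator_uri=operator_uri,
            name=name,
            num_distributed_tasks=num_distributed_tasks,
            params=params,
            **kwargs,
        )
        self.stages.append(new_stage)
        return new_stage
```

A caller could then write `PipelineSketch().stage("@voxel51/test/blah", always_run=True)` and get a stage with the flag set.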


121-141: Handle None in Pipeline.from_json to match PipelineRunInfo pattern and avoid AttributeError

The callsite at fiftyone/factory/repos/delegated_operation_doc.py:133 passes doc.get("pipeline") which can be None when the "pipeline" key is absent. PipelineRunInfo.from_json already implements this None check (returns None), so Pipeline.from_json should follow the same pattern for consistency.

Apply:

 @classmethod
 def from_json(cls, json_dict):
     """Loads the pipeline from a JSON/python dict.

     Ex., {
         "stages": [
             {"operator_uri": "@voxel51/test/blah", "name": "my_stage"},
             ...,
         ]
     }

     Args:
         json_dict: a JSON / python dict representation of the pipeline
     """
+    if json_dict is None:
+        return None
     if isinstance(json_dict, list):
         json_dict = {"stages": json_dict}
     stages = [
         PipelineStage(**stage) for stage in json_dict.get("stages") or []
     ]
     return cls(stages=stages)
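The input handling suggested above can be sketched on plain dicts. The function normalize_pipeline_json is a hypothetical helper written for illustration; it shows only how None, a bare list, and a dict are treated, not how stages are constructed.

```python
def normalize_pipeline_json(json_dict):
    """Return a {"stages": [...]} dict, or None when no pipeline is stored."""
    if json_dict is None:
        # Without this guard, json_dict.get(...) below would raise
        # AttributeError on None
        return None
    if isinstance(json_dict, list):
        # A bare list is treated as shorthand for the list of stages
        json_dict = {"stages": json_dict}
    # `or []` also covers an explicit {"stages": None}
    return {"stages": [dict(stage) for stage in json_dict.get("stages") or []]}
```

All three accepted shapes end up in the same form, so downstream code only has to deal with a dict or None.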
fiftyone/factory/repos/delegated_operation_doc.py (1)

147-147: Add pipeline_run_info to ignore_keys to prevent double serialization.

The pipeline_run_info field is manually serialized at lines 159-160 using .to_json(), but it's not included in ignore_keys at line 147. This causes the PipelineRunInfo object to be deep-copied in the dict comprehension (lines 148-152) and then manually serialized again, which is inconsistent with how pipeline is handled and could lead to serialization issues.

Apply this diff to add pipeline_run_info to the ignore set:

-        ignore_keys = {"_doc", "id", "context", "pipeline"}
+        ignore_keys = {"_doc", "id", "context", "pipeline", "pipeline_run_info"}

Based on learnings

Also applies to: 159-160

🧹 Nitpick comments (5)
fiftyone/operators/_types/pipeline.py (3)

34-51: Silence unused-kwargs and too-many-arguments on PipelineStage.__init__

Keep the forward‑compat behavior but address lints.

Apply:

-    def __init__(
+    def __init__(  # pylint: disable=too-many-arguments
         self,
         operator_uri: str,
         always_run: bool = False,
         name: Optional[str] = None,
         num_distributed_tasks: Optional[int] = None,
         params: Optional[Mapping[str, Any]] = None,
-        **kwargs,  # Accepts and ignores unused kwargs
+        **_,  # Accepts and ignores unused kwargs  # noqa: ARG002
     ):

52-61: Guard against type errors for num_distributed_tasks

Casting was removed; passing a non‑int (e.g., "5") will now raise TypeError at runtime when compared to 1. Add a clear type check.

Apply:

-        if (
-            self.num_distributed_tasks is not None
-            and self.num_distributed_tasks < 1
-        ):
-            raise ValueError("num_distributed_tasks must be >= 1")
+        if self.num_distributed_tasks is not None:
+            if not isinstance(self.num_distributed_tasks, int):
+                raise TypeError("num_distributed_tasks must be an int")
+            if self.num_distributed_tasks < 1:
+                raise ValueError("num_distributed_tasks must be >= 1")

Please confirm no callers supply strings via request params before we enforce this.
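A small sketch of the suggested validation, isolated in a hypothetical StageTasksSketch dataclass so the two failure modes are easy to see. The error messages match the suggestion above; everything else is illustrative.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StageTasksSketch:
    """Hypothetical class holding only the field under discussion."""

    num_distributed_tasks: Optional[int] = None

    def __post_init__(self):
        if self.num_distributed_tasks is None:
            return
        # Check the type first so a string such as "5" produces a clear
        # message instead of failing inside the `< 1` comparison
        if not isinstance(self.num_distributed_tasks, int):
            raise TypeError("num_distributed_tasks must be an int")
        if self.num_distributed_tasks < 1:
            raise ValueError("num_distributed_tasks must be >= 1")
```

The alternative mentioned elsewhere in this review, normalizing the value to int on input, trades this strictness for tolerance of values such as "5".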


16-23: Update docstring to include always_run

Document the new field to avoid confusion.

Apply:

     Args:
         operator_uri: the URI of the operator to use for the stage
         name: the name of the stage
         num_distributed_tasks: the number of distributed tasks to use
             for the stage, optional
         params: optional parameters to pass to the operator, overwriting
             any existing parameters
+        always_run: if True, this stage runs even when the pipeline is inactive
+            (e.g., after a failure), enabling cleanup/finalization stages
tests/unittests/operators/types_tests.py (1)

46-83: LGTM; consider adding a stage()-based always_run test

Serialization looks correct. Optionally, add a test that sets always_run via Pipeline.stage(..., always_run=True) so the convenience API is covered once kwargs are forwarded.

tests/unittests/factory/delegated_operation_doc_tests.py (1)

65-86: LGTM; round‑trip coverage for pipeline and run_info

Solid serialization assertions. Optional: import PipelineRunInfo from fiftyone.operators.types for consistency with production imports.

Also applies to: 85-85, 89-89

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1a712cf and a38372a.

📒 Files selected for processing (6)
  • fiftyone/factory/repos/delegated_operation_doc.py (4 hunks)
  • fiftyone/operators/_types/pipeline.py (5 hunks)
  • fiftyone/operators/types.py (1 hunks)
  • tests/unittests/factory/delegated_operation_doc_tests.py (1 hunks)
  • tests/unittests/operators/delegated_tests.py (1 hunks)
  • tests/unittests/operators/types_tests.py (3 hunks)
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-10-07T01:19:39.063Z
Learnt from: swheaton
PR: voxel51/fiftyone#6385
File: fiftyone/operators/_types/pipeline.py:64-92
Timestamp: 2025-10-07T01:19:39.063Z
Learning: In fiftyone/operators/_types/pipeline.py, the Pipeline.stage() method intentionally accepts unused **kwargs for forward compatibility. This design pattern should not be flagged as an issue.

Applied to files:

  • fiftyone/operators/_types/pipeline.py
📚 Learning: 2025-01-30T16:40:26.003Z
Learnt from: swheaton
PR: voxel51/fiftyone#5448
File: fiftyone/factory/repos/delegated_operation_doc.py:0-0
Timestamp: 2025-01-30T16:40:26.003Z
Learning: In FiftyOne's DelegatedOperationDocument class, all instance attributes are automatically serialized through `__dict__` in the `to_pymongo()` method, eliminating the need for explicit field inclusion in serialization.

Applied to files:

  • fiftyone/factory/repos/delegated_operation_doc.py
🧬 Code graph analysis (5)
tests/unittests/operators/delegated_tests.py (3)
fiftyone/operators/_types/pipeline.py (1)
  • PipelineStage (13-68)
fiftyone/operators/executor.py (2)
  • operator_uri (828-830)
  • num_distributed_tasks (810-813)
fiftyone/factory/repos/delegated_operation_doc.py (1)
  • num_distributed_tasks (70-73)
tests/unittests/factory/delegated_operation_doc_tests.py (2)
fiftyone/operators/_types/pipeline.py (5)
  • PipelineStage (13-68)
  • PipelineRunInfo (159-176)
  • to_json (62-68)
  • to_json (142-155)
  • to_json (175-176)
fiftyone/factory/repos/delegated_operation_doc.py (4)
  • num_distributed_tasks (70-73)
  • to_pymongo (140-162)
  • DelegatedOperationDocument (23-162)
  • from_pymongo (75-138)
fiftyone/factory/repos/delegated_operation_doc.py (1)
fiftyone/operators/_types/pipeline.py (7)
  • Pipeline (72-155)
  • PipelineRunInfo (159-176)
  • from_json (122-140)
  • from_json (170-173)
  • to_json (62-68)
  • to_json (142-155)
  • to_json (175-176)
fiftyone/operators/types.py (1)
fiftyone/operators/_types/pipeline.py (3)
  • Pipeline (72-155)
  • PipelineRunInfo (159-176)
  • PipelineStage (13-68)
tests/unittests/operators/types_tests.py (1)
fiftyone/operators/_types/pipeline.py (6)
  • PipelineRunInfo (159-176)
  • to_json (62-68)
  • to_json (142-155)
  • to_json (175-176)
  • from_json (122-140)
  • from_json (170-173)
🪛 Pylint (4.0.0)
fiftyone/operators/_types/pipeline.py

[refactor] 35-35: Too many positional arguments (6/5)

(R0917)

🪛 Ruff (0.14.0)
fiftyone/operators/_types/pipeline.py

42-42: Unused method argument: kwargs

(ARG002)


84-84: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


84-84: Unused method argument: kwargs

(ARG002)


96-96: Unused method argument: kwargs

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
  • GitHub Check: test-windows / test-python (windows-latest, 3.11)
  • GitHub Check: test-windows / test-python (windows-latest, 3.9)
  • GitHub Check: test-windows / test-python (windows-latest, 3.12)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: test-windows / test-python (windows-latest, 3.10)
  • GitHub Check: test / test-app
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.12)
  • GitHub Check: lint / eslint
  • GitHub Check: e2e / test-e2e
  • GitHub Check: build / build
  • GitHub Check: build
🔇 Additional comments (6)
fiftyone/operators/types.py (1)

9-9: Re-export looks good

Exposing PipelineRunInfo alongside Pipeline and PipelineStage is appropriate and aligns dependents.

tests/unittests/operators/types_tests.py (1)

100-114: LGTM on PipelineRunInfo round‑trip

Covers to_json/from_json and equality; nice.

tests/unittests/operators/delegated_tests.py (1)

207-218: LGTM on updated stage construction

Good coverage for extended fields; pairs well with serialization tests.

fiftyone/factory/repos/delegated_operation_doc.py (3)

18-18: LGTM! Import is correct.

The PipelineRunInfo import is properly added alongside Pipeline to support the new pipeline run state tracking functionality.


67-67: LGTM! Initialization follows existing patterns.

The pipeline_run_info attribute is correctly initialized to None, consistent with the pipeline attribute pattern.


133-136: LGTM! Deserialization logic is correct.

The deserialization properly handles both pipeline and pipeline_run_info fields, and PipelineRunInfo.from_json correctly handles None values.

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
fiftyone/operators/_types/pipeline.py (1)

84-84: Fix implicit Optional type on stages parameter (RUF013).

Avoid implicit Optional by explicitly annotating the type.

Apply this diff:

-    def __init__(self, stages: list[PipelineStage] = None, **kwargs):
+    def __init__(self, stages: Optional[list[PipelineStage]] = None, **kwargs):
         # Call the default dataclass initialization for the defined fields
         self.stages = stages if stages is not None else []
         # kwargs are implicitly discarded
🧹 Nitpick comments (1)
fiftyone/operators/_types/pipeline.py (1)

34-50: Consider using **kwargs unpacking for better maintainability.

The custom __init__ is necessary for forward compatibility, but manually assigning each field is verbose and error-prone. Note also that the comment "Call the default dataclass initialization" is misleading—you're manually assigning fields, not calling the dataclass __init__.

Consider this more maintainable pattern:

-    # ADD A CUSTOM __init__ METHOD TO ACCEPT AND DISCARD UNUSED KWARGS
     def __init__(
         self,
         operator_uri: str,
         always_run: bool = False,
         name: Optional[str] = None,
         num_distributed_tasks: Optional[int] = None,
         params: Optional[Mapping[str, Any]] = None,
-        **kwargs,  # Accepts and ignores unused kwargs
+        **kwargs,  # Accept and filter unused kwargs for forward compatibility
     ):
-        # Call the default dataclass initialization for the defined fields
-        self.operator_uri = operator_uri
-        self.always_run = always_run
-        self.name = name
-        self.num_distributed_tasks = num_distributed_tasks
-        self.params = params
+        # Extract known fields, ignore unknown ones
+        known_params = {
+            'operator_uri': operator_uri,
+            'always_run': always_run,
+            'name': name,
+            'num_distributed_tasks': num_distributed_tasks,
+            'params': params,
+        }
+        for key, value in known_params.items():
+            setattr(self, key, value)
         self.__post_init__()

This approach still supports forward compatibility while being more maintainable. Based on learnings.
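For context, the underlying pattern is a dataclass with a hand-written __init__ that tolerates unknown keys. The sketch below uses a hypothetical TolerantStage with fewer fields, and its __post_init__ check is invented for illustration. It relies on the fact that the dataclass decorator does not overwrite an __init__ defined in the class body, which also means __post_init__ has to be called explicitly.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TolerantStage:
    """Hypothetical stage type that ignores unknown constructor keys."""

    operator_uri: str
    always_run: bool = False
    name: Optional[str] = None

    def __init__(self, operator_uri, always_run=False, name=None, **_):
        # Unknown keyword arguments land in `_` and are dropped, so
        # documents written by a newer version can still be loaded
        self.operator_uri = operator_uri
        self.always_run = always_run
        self.name = name
        # The generated __init__ is not used, so call validation by hand
        self.__post_init__()

    def __post_init__(self):
        # Illustrative validation only; the real checks may differ
        if not self.operator_uri:
            raise ValueError("operator_uri must not be empty")
```

Loading a stored dict that carries an extra key, for example `TolerantStage(**{"operator_uri": "@voxel51/test/blah", "future_field": 1})`, succeeds and simply drops the unknown key.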

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a38372a and f2e0eba.

📒 Files selected for processing (2)
  • fiftyone/operators/_types/pipeline.py (5 hunks)
  • tests/unittests/operators/types_tests.py (3 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-07T01:19:39.063Z
Learnt from: swheaton
PR: voxel51/fiftyone#6385
File: fiftyone/operators/_types/pipeline.py:64-92
Timestamp: 2025-10-07T01:19:39.063Z
Learning: In fiftyone/operators/_types/pipeline.py, the Pipeline.stage() method intentionally accepts unused **kwargs for forward compatibility. This design pattern should not be flagged as an issue.

Applied to files:

  • fiftyone/operators/_types/pipeline.py
🧬 Code graph analysis (1)
tests/unittests/operators/types_tests.py (1)
fiftyone/operators/_types/pipeline.py (7)
  • Pipeline (72-158)
  • from_json (122-143)
  • from_json (173-176)
  • PipelineRunInfo (162-179)
  • to_json (62-68)
  • to_json (145-158)
  • to_json (178-179)
🪛 Pylint (4.0.0)
fiftyone/operators/_types/pipeline.py

[refactor] 35-35: Too many positional arguments (6/5)

(R0917)

🪛 Ruff (0.14.0)
fiftyone/operators/_types/pipeline.py

42-42: Unused method argument: kwargs

(ARG002)


84-84: PEP 484 prohibits implicit Optional

Convert to T | None

(RUF013)


84-84: Unused method argument: kwargs

(ARG002)


96-96: Unused method argument: kwargs

(ARG002)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (12)
  • GitHub Check: test-windows / test-python (windows-latest, 3.11)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.12)
  • GitHub Check: test-windows / test-python (windows-latest, 3.12)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
  • GitHub Check: test-windows / test-python (windows-latest, 3.10)
  • GitHub Check: test-windows / test-python (windows-latest, 3.9)
  • GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
  • GitHub Check: lint / eslint
  • GitHub Check: e2e / test-e2e
  • GitHub Check: build / build
  • GitHub Check: build
🔇 Additional comments (7)
tests/unittests/operators/types_tests.py (3)

55-76: Excellent test coverage for the always_run field.

The test properly validates that always_run is serialized correctly in both the False (default) and True cases, and that round-trip serialization preserves the field value.


100-102: Good edge case coverage for None handling.

Testing that from_json(None) returns None for both Pipeline and PipelineRunInfo ensures robustness when deserializing optional fields.


104-118: Comprehensive test for PipelineRunInfo serialization.

The test validates all three fields (active, stage_index, expected_children) and confirms that round-trip serialization works correctly with non-default values.

fiftyone/operators/_types/pipeline.py (4)

9-9: LGTM!

The List import is necessary for the expected_children: Optional[List[int]] type annotation in PipelineRunInfo.


29-29: LGTM!

The always_run field is properly added with a sensible default value of False.


135-139: Excellent enhancements to from_json.

The additions improve robustness:

  • Explicit None handling prevents errors when deserializing optional fields
  • List input support provides a convenient shorthand for creating pipelines

161-179: Well-designed PipelineRunInfo implementation.

The dataclass cleanly encapsulates pipeline run state with:

  • Sensible defaults for all fields
  • Consistent from_json/to_json API matching other types
  • Proper None handling in from_json
