Skip to content

Conversation

@j053y
Copy link
Contributor

@j053y j053y commented Oct 14, 2025

What changes are proposed in this pull request?

Add If-Match header to sample patch endpoints to avoid race conditions and other multi-user update problems.

Successful sample PATCH endpoints now return the ETag header.

Sample PATCH endpoints now accept the If-Match header which can be one of the following:

  • The ETag header from a previous request (this is preferred as it is more flexible than the following options)
  • The sample's last_modified_at value as an ISO 8601 date
  • The sample's last_modified_at value as a UNIX timestamp

The first example will compare against the current calculated ETag, while the latter two will compare directly against the sample's last_modified_at and raise a 412 if the values do not match.

  • Add If-Match header support to the sample routes (with tests!)
  • Refactored fiftyone.server.utils into sub-modules to group similar areas.
  • Refactored fiftyone.server.decorators to use refactored fiftyone.server.utils

How is this patch tested? If it is not, please explain why.

Run the automated tests

Release Notes

Is this a user-facing change that should be mentioned in the release notes?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release
    notes for FiftyOne users.

(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)

What areas of FiftyOne does this PR affect?

  • App: FiftyOne application changes
  • Build: Build and test infrastructure changes
  • Core: Core fiftyone Python library changes
  • Documentation: FiftyOne documentation changes
  • Other

Summary by CodeRabbit

  • New Features
    • ETag utilities and If-Match enforcement for sample and sample-field PATCH operations (responses include ETag, 412 on mismatches).
  • Refactor
    • Centralized JSON (de)serialization and standardized JSON responses; removed legacy auto-registration transform behavior.
  • Bug Fixes
    • More robust patch parsing, value transformation, conditional save behavior, and clearer deserialization errors.
  • Tests
    • Migrated tests to pytest with fixtures; expanded coverage for header-driven and error scenarios.
  • Documentation
    • Endpoint docs updated to reflect ETag/If-Match semantics.

@j053y j053y self-assigned this Oct 14, 2025
@j053y j053y requested review from a team as code owners October 14, 2025 18:52
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Oct 14, 2025

Walkthrough

Centralizes JSON (de)serialization into a new utils.json package, adds utils.http.ETag utilities, implements If-Match/ETag conditional PATCH logic in sample and sample-field routes, relocates jsonpatch imports under utils.json, removes the json_transform registry/auto-registration, and migrates tests to pytest with updated import/ETag semantics.

Changes

Cohort / File(s) Summary of changes
Sample routes: ETag / If-Match
fiftyone/server/routes/sample.py
Adds get_if_last_modified_at, get_sample, generate_sample_etag, save_sample; enforces If-Match preconditions (412); uses utils.json for patch parsing/deserialization; returns ETag header on successful PATCH; updates docstrings and endpoint registration.
Server decorators JSON refactor
fiftyone/server/decorators.py
Replaces bespoke JSON handling and custom Encoder usage with fiftyone.server.utils.json.dumps/loads for response creation and route wrapper parsing.
HTTP utilities (ETag)
fiftyone/server/utils/http.py
New ETag helper class with create(value, is_weak=False) and parse(etag) to format and parse strong/weak ETags.
Utils package exports
fiftyone/server/utils/__init__.py
Adjusts imports to surface http and json; removes prior json_transform auto-registration side-effect import.
JSON utilities & encoder
fiftyone/server/utils/json/__init__.py, fiftyone/server/utils/json/encoder.py, fiftyone/server/utils/json/serialization.py
New JSON utilities: dumps/loads, JSONResponse that uses a custom Encoder for NumPy/BSON types; serialize/deserialize for FiftyOne label objects; integrates jsonpatch parse helpers.
jsonpatch import path & typing
fiftyone/server/utils/json/jsonpatch/__init__.py, fiftyone/server/utils/json/jsonpatch/patch.py
Moves jsonpatch imports under utils.json.jsonpatch; adjusts typing/generic ordering and forwards **kwargs in __init_subclass__ (typing changes, no API behavioral change).
Remove json_transform registry
fiftyone/server/utils/json_transform/__init__.py, .../transform.py, .../types.py
Removes registry, transform API, register decorator, and type-specific transform functions (auto-registration removed).
Tests: routes and jsonpatch
tests/unittests/sample_route_tests.py, tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py, .../test_jsonpatch.py, .../test_jsonpatch_methods.py
Migrates sample-route tests to pytest with fixtures (dataset, request, if-match); adds/updates tests for ETag/If-Match success and failure; updates jsonpatch import paths and expected error wording.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  participant C as Client
  participant R as SampleRoute (PATCH)
  participant J as utils.json / jsonpatch
  participant D as DatasetStore
  participant H as utils.http.ETag

  C->>R: PATCH /datasets/{id}/samples/{sid}\nHeaders: If-Match: "<etag>"
  R->>R: get_if_last_modified_at(request)
  R->>D: load dataset & sample
  R->>H: parse(if-match)
  alt If-Match present and mismatch
    R-->>C: 412 Precondition Failed
  else OK
    R->>J: parse_jsonpatch + deserialize values
    J-->>R: patch ops
    R->>D: apply patch and save (conditional on If-Match)
    D-->>R: updated sample (reloaded)
    R->>H: create(new_last_modified_at)
    H-->>R: "new-etag"
    R-->>C: 200 OK\nHeaders: ETag: "new-etag"\nBody: JSONResponse (utils.json.dumps)
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~75 minutes

Possibly related PRs

Suggested reviewers

  • tom-vx51
  • benjaminpkane
  • kevin-dimichel

Poem

A hop and a patch in moonlit code,
I tag my carrots with an ETag ode.
If-Match hums, "only if true,"
I nibble JSON, crisp and new.
Precondition passed — thump! — we save. 🥕✨

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Title Check ✅ Passed The pull request title clearly and concisely summarizes the primary change by stating that an If-Match header is added to sample patch endpoints, directly reflecting the main functionality in the diff. It is specific and avoids unnecessary detail.
Description Check ✅ Passed The pull request description follows the repository template by providing a clear summary of the proposed changes, detailing the testing approach, and marking the release notes and affected areas sections appropriately. Each required section is present and fully filled out, and the description clearly instructs that automated tests were run. The release notes and checkboxes correctly reflect that this is not a user-facing change and identify the impact area.
Docstring Coverage ✅ Passed Docstring coverage is 90.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch josey/sample-endpoint-race-conditions

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 705ae54 and dd348e7.

📒 Files selected for processing (2)
  • fiftyone/server/routes/sample.py (5 hunks)
  • fiftyone/server/utils/http.py (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • fiftyone/server/utils/http.py
🔇 Additional comments (6)
fiftyone/server/routes/sample.py (6)

9-21: LGTM! Imports are appropriate for the new functionality.

The added imports (base64, datetime, Union) and updated import structure support the If-Match/ETag implementation.


77-120: LGTM! Early validation correctly implements fail-fast semantics.

The function appropriately validates the If-Match condition early, raising 412 if the sample's last_modified_at doesn't match, which helps catch conflicts before processing begins.


123-135: LGTM! ETag generation is correct and addresses previous concerns.

The function now returns a properly formatted string ETag by base64-encoding the last_modified_at timestamp, which addresses previous review feedback about type correctness.


214-256: LGTM! Sample.patch correctly implements the If-Match/ETag flow.

The updated flow properly integrates the new helper functions:

  • Parses If-Match header
  • Conditionally fetches the sample with early validation
  • Performs atomic conditional save
  • Returns response with ETag header

The implementation is clean and follows the expected pattern for optimistic concurrency control.


278-343: LGTM! SampleField.patch mirrors the Sample.patch implementation correctly.

The field-specific PATCH endpoint follows the same If-Match/ETag pattern as the sample endpoint, ensuring consistency across the API. The logic correctly:

  • Validates the sample's state before applying field changes
  • Saves atomically with conditional update
  • Returns the updated field with an ETag header

151-163: Verify manual last_modified_at assignment aligns with FiftyOne's conventions.

Line 153 manually sets last_modified_at to datetime.datetime.now(datetime.timezone.utc). This raises a few questions:

  1. Does FiftyOne typically auto-update last_modified_at on save operations?
  2. Should this use fou.utcnow() instead for consistency with FiftyOne's timestamp utilities?
  3. Could manual assignment bypass any hooks, validators, or timezone handling that FiftyOne provides?

Please confirm this approach is consistent with FiftyOne's timestamp management and won't cause issues with auto-timestamps or other framework features.

Run the following script to check how FiftyOne handles last_modified_at:

Warning

Review ran into problems

🔥 Problems

Git: Failed to clone repository. Please run the @coderabbitai full review command to re-trigger a full review. If the issue persists, set path_filters to include or exclude specific files.


Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (8)
fiftyone/server/decorators.py (3)

34-46: Return 400 for invalid JSON payloads

Currently, JSON decode failures will fall into the broad exception handler and yield 500. Map JSON parsing errors to 400.

Apply:

   async def wrapper(
       endpoint: HTTPEndpoint, request: Request, *args
   ) -> t.Union[dict, Response]:
       try:
           body = await request.body()
           payload = body.decode("utf-8")
-          data = utils.json.loads(payload)
+          try:
+              data = utils.json.loads(payload)
+          except Exception as je:
+              raise HTTPException(status_code=400, detail=f"Invalid JSON: {je}") from je
           response = await func(endpoint, request, data, *args)
           if isinstance(response, Response):
               return response

22-27: Optional: use utils.json.JSONResponse for consistency

Using the custom JSONResponse keeps encoding consistent with dumps/Encoder.

-async def create_response(response: dict):
-    """Creates a JSON response from the given dictionary."""
-    return Response(
-        await run_sync_task(lambda: utils.json.dumps(response)),
-        headers={"Content-Type": "application/json"},
-    )
+async def create_response(response: dict):
+    """Creates a JSON response from the given dictionary."""
+    return utils.json.JSONResponse(response)

47-60: Optional: return error JSON via utils.json.JSONResponse

The current payload is string fields, but using the same JSON layer avoids future surprises with non-primitive values.

-            return JSONResponse(
+            return utils.json.JSONResponse(
                 {
                     "kind": "Server Error",
                     "stack": traceback.format_exc(),
                 },
                 status_code=500,
             )
fiftyone/server/routes/sample.py (3)

148-153: Reload before computing ETag to ensure it reflects persisted state

Depending on how save() updates last_modified_at, reloading guarantees the ETag corresponds to the stored document.

-        result.save()
-
-        return utils.json.JSONResponse(
-            utils.json.serialize(result),
-            headers={"ETag": f'"{generate_sample_etag(result)}"'},
-        )
+        result.save()
+        # Ensure last_modified_at is the persisted value used for ETag
+        try:
+            result.reload(hard=True)
+        except Exception:
+            pass
+
+        return utils.json.JSONResponse(
+            utils.json.serialize(result),
+            headers={"ETag": f'"{generate_sample_etag(result)}"'},
+        )

225-231: Also reload sample before ETag on nested field patches

Same rationale as for the sample PATCH: ensure ETag represents the persisted state after save.

-        result = handle_json_patch(field, data)
-        sample.save()
-
-        return utils.json.JSONResponse(
-            utils.json.serialize(result),
-            headers={"ETag": f'"{generate_sample_etag(sample)}"'},
-        )
+        result = handle_json_patch(field, data)
+        sample.save()
+        try:
+            sample.reload(hard=True)
+        except Exception:
+            pass
+
+        return utils.json.JSONResponse(
+            utils.json.serialize(result),
+            headers={"ETag": f'"{generate_sample_etag(sample)}"'},
+        )

134-141: Note: TOCTOU window remains despite If-Match precondition

The If-Match check happens before mutation; a concurrent write between check and save can still race. Consider optimistic concurrency at the persistence layer (filter on previous last_modified_at or version) to atomically enforce the precondition.

Also applies to: 148-153

tests/unittests/sample_route_tests.py (2)

60-62: Avoid default mutable dict for headers

Use None default to prevent accidental shared state across tests.

-    def _create_mock_request(
-        self, payload, content_type="application/json", headers={}
-    ):
+    def _create_mock_request(
+        self, payload, content_type="application/json", headers=None
+    ):
@@
-        mock_request.headers = headers | {"Content-Type": content_type}
+        headers = headers or {}
+        mock_request.headers = headers | {"Content-Type": content_type}

Also applies to: 69-69


498-508: Avoid default mutable dict for headers (SampleField helper)

Same fix for the SampleField helper.

-    def _create_mock_request(self, payload, field_path, field_id, headers={}):
+    def _create_mock_request(self, payload, field_path, field_id, headers=None):
@@
-        mock_request.headers = headers | {"Content-Type": "application/json"}
+        headers = headers or {}
+        mock_request.headers = headers | {"Content-Type": "application/json"}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c58a1d8 and dc3c96a.

📒 Files selected for processing (16)
  • fiftyone/server/decorators.py (2 hunks)
  • fiftyone/server/routes/sample.py (8 hunks)
  • fiftyone/server/utils/__init__.py (1 hunks)
  • fiftyone/server/utils/http.py (1 hunks)
  • fiftyone/server/utils/json/__init__.py (1 hunks)
  • fiftyone/server/utils/json/encoder.py (1 hunks)
  • fiftyone/server/utils/json/jsonpatch/__init__.py (2 hunks)
  • fiftyone/server/utils/json/jsonpatch/patch.py (6 hunks)
  • fiftyone/server/utils/json/serialization.py (1 hunks)
  • fiftyone/server/utils/json_transform/__init__.py (0 hunks)
  • fiftyone/server/utils/json_transform/transform.py (0 hunks)
  • fiftyone/server/utils/json_transform/types.py (0 hunks)
  • tests/unittests/sample_route_tests.py (7 hunks)
  • tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py (1 hunks)
  • tests/unittests/server/utils/jsonpatch/test_jsonpatch.py (1 hunks)
  • tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py (1 hunks)
💤 Files with no reviewable changes (3)
  • fiftyone/server/utils/json_transform/init.py
  • fiftyone/server/utils/json_transform/transform.py
  • fiftyone/server/utils/json_transform/types.py
🔇 Additional comments (6)
tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py (1)

12-12: Import path refactor looks good

Path now aligns with centralized utils.json.jsonpatch. No behavior change.

tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py (1)

14-23: Consistent import path update

Imports now point to the centralized json.jsonpatch module. Looks correct.

fiftyone/server/utils/json/__init__.py (1)

18-33: Centralized JSON layer LGTM

dumps/loads + JSONResponse unify encoding with bson.json_util and custom Encoder. Re-export of parse_jsonpatch/serialize/deserialize is convenient.

fiftyone/server/utils/json/jsonpatch/patch.py (1)

40-47: Typing and subclassing improvements LGTM

  • init_subclass correctly calls super and validates op
  • Generic[V] for value-carrying patches improves type safety
  • Import path aligned with utils.json.jsonpatch

Also applies to: 72-79, 93-101, 129-136, 138-145

fiftyone/server/utils/__init__.py (1)

18-20: Re-exporting utils.json and utils.http is fine; confirm no import cycles

This enables a clean from fiftyone.server.utils import json, http. Please verify:

  • No circular imports are introduced.
  • Any intended registration side effects occur only once.

Run this to locate ETag and check for cycles/usage:

fiftyone/server/routes/sample.py (1)

59-72: Confirm If-Match parsing covers multiple tags and weak tags

If-Match can contain multiple entity-tags or “*”. You currently parse a single value and ignore weakness.

Please verify utils.http.ETag.parse:

  • Handles lists (e.g., If-Match: "etag1", "etag2") and wildcard “*”
  • Rejects weak tags (RFC specifies If-Match must not use weak validators), or at least treats them safely

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (12)
fiftyone/server/utils/http.py (1)

20-33: Consider adding validation for malformed ETags.

The parse method assumes the ETag has surrounding quotes (lines 30-31) but doesn't validate this assumption. If an ETag lacks quotes, the slicing operation will silently remove the first and last characters, potentially corrupting the value.

While this may be acceptable if callers guarantee well-formed ETags, consider adding validation or documenting the expected input format.

Example validation:

 @staticmethod
 def parse(etag: str) -> tuple[str, bool]:
     """Parses an ETag string into its value and whether it is weak."""
 
     is_weak = False
     if etag.startswith("W/"):
         is_weak = True
         etag = etag[2:]  # Remove "W/" prefix
 
     # Remove surrounding quotes (ETags are typically quoted)
     if etag.startswith('"') and etag.endswith('"'):
         etag = etag[1:-1]
+    elif etag:
+        # ETag without quotes - may be malformed
+        pass  # or raise ValueError("Malformed ETag: missing quotes")
 
     return etag, is_weak
fiftyone/server/utils/json/__init__.py (1)

23-26: Confirm silent fallback in loads()

Returning {} for falsy input can mask empty/invalid JSON bodies. If this is intentional, OK; otherwise consider raising 400 upstream when the body is empty for content types that require JSON.

fiftyone/server/utils/json/encoder.py (1)

22-29: Enhance numpy coverage in Encoder

Also handle numpy booleans to avoid serialization surprises.

-        if isinstance(o, np.floating):
+        if isinstance(o, np.floating):
             return float(o)
 
-        if isinstance(o, np.integer):
+        if isinstance(o, np.integer):
             return int(o)
 
+        if isinstance(o, np.bool_):
+            return bool(o)
tests/unittests/sample_route_tests.py (3)

60-70: Avoid mutable default for headers

Use None default to prevent shared mutable state across calls (even though you don’t mutate it here).

-    def _create_mock_request(
-        self, payload, content_type="application/json", headers={}
-    ):
+    def _create_mock_request(
+        self, payload, content_type="application/json", headers=None
+    ):
@@
-        mock_request.headers = headers | {"Content-Type": content_type}
+        headers = headers or {}
+        mock_request.headers = headers | {"Content-Type": content_type}

246-271: Optionally assert no mutation on 412

You can add a quick sample.reload() and assert the field did not change to strengthen the failure-path guarantee.


498-508: Avoid mutable default for headers (SampleField tests)

Same recommendation as above.

-    def _create_mock_request(self, payload, field_path, field_id, headers={}):
+    def _create_mock_request(self, payload, field_path, field_id, headers=None):
@@
-        mock_request.headers = headers | {"Content-Type": "application/json"}
+        headers = headers or {}
+        mock_request.headers = headers | {"Content-Type": "application/json"}
fiftyone/server/routes/sample.py (6)

89-96: Apply-return is ignored; assign to capture root replacements

If a patch replaces the root or returns a new container, ignoring the return can drop changes. Assign the result each iteration.

-        try:
-            p.apply(target)
+        try:
+            target = p.apply(target)

104-110: ETag construction: consider using a stable string token

Returning an int from an md5 digest is fine, but you could return the hex digest (or the iso timestamp directly) to avoid very large integer strings and align with typical opaque ETag formats.


112-126: Type hints: patch() data shape and return type

  • data can be dict (merge) or list[dict] (JSON Patch).
  • The method returns a Response, not dict.
-class Sample(HTTPEndpoint):
+class Sample(HTTPEndpoint):
@@
-    async def patch(self, request: Request, data: dict) -> dict:
+    async def patch(self, request: Request, data: Any) -> utils.json.JSONResponse:

You might also update the docstring to reflect both payload shapes.


155-173: Type annotation: _handle_patch return

The method returns the mutated Sample, not a dict.

-    def _handle_patch(self, sample: fo.Sample, data: dict) -> dict:
+    def _handle_patch(self, sample: fo.Sample, data: dict) -> fo.Sample:

189-201: Type hints and return type (SampleField.patch)

Same as Sample.patch: data is a JSON Patch list, and return is a Response.

-    async def patch(self, request: Request, data: dict) -> dict:
+    async def patch(self, request: Request, data: List[dict]) -> utils.json.JSONResponse:

9-13: Remove unused import

datetime is unused.

-import datetime
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c58a1d8 and 71af200.

📒 Files selected for processing (16)
  • fiftyone/server/decorators.py (2 hunks)
  • fiftyone/server/routes/sample.py (8 hunks)
  • fiftyone/server/utils/__init__.py (1 hunks)
  • fiftyone/server/utils/http.py (1 hunks)
  • fiftyone/server/utils/json/__init__.py (1 hunks)
  • fiftyone/server/utils/json/encoder.py (1 hunks)
  • fiftyone/server/utils/json/jsonpatch/__init__.py (2 hunks)
  • fiftyone/server/utils/json/jsonpatch/patch.py (6 hunks)
  • fiftyone/server/utils/json/serialization.py (1 hunks)
  • fiftyone/server/utils/json_transform/__init__.py (0 hunks)
  • fiftyone/server/utils/json_transform/transform.py (0 hunks)
  • fiftyone/server/utils/json_transform/types.py (0 hunks)
  • tests/unittests/sample_route_tests.py (7 hunks)
  • tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py (1 hunks)
  • tests/unittests/server/utils/jsonpatch/test_jsonpatch.py (1 hunks)
  • tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py (1 hunks)
💤 Files with no reviewable changes (3)
  • fiftyone/server/utils/json_transform/transform.py
  • fiftyone/server/utils/json_transform/init.py
  • fiftyone/server/utils/json_transform/types.py
🔇 Additional comments (15)
fiftyone/server/utils/json/serialization.py (2)

53-70: Document the include_private=True rationale.

The special handling of fos.Sample with include_private=True is noteworthy. This exposes internal fields that are typically hidden from the public API.

Ensure this is intentional and documented, especially in the context of ETag generation where consistency is critical.


14-50: Verify extensibility and error handling.

The deserialize function has a hardcoded list of 6 label classes. Consider:

  1. If additional label types need deserialization support in the future, this list must be manually updated
  2. If cls.from_dict(value) raises an exception, it will propagate unhandled

While the current implementation is functional, verify that this approach aligns with the project's deserialization strategy.

Run the following script to verify if there are other label types that might need deserialization support:

tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py (1)

12-12: LGTM! Import path updated correctly.

The import path has been updated to align with the new module structure (fiftyone.server.utils.json.jsonpatch). Test behavior remains unchanged.

tests/unittests/server/utils/jsonpatch/test_jsonpatch.py (1)

12-12: LGTM! Import path updated correctly.

The import path has been updated to align with the new module structure. Test behavior remains unchanged.

tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py (1)

14-23: LGTM! Import paths updated correctly.

The import paths have been updated to align with the new module structure (fiftyone.server.utils.json.jsonpatch). Test behavior remains unchanged.

fiftyone/server/utils/__init__.py (1)

18-19: LGTM! Submodule imports added correctly.

The imports of json and http submodules enable the new utilities introduced in this PR (ETag support and centralized JSON handling). This aligns with the broader refactoring effort.

fiftyone/server/decorators.py (1)

19-40: LGTM! Integration with new JSON utils is clean.

The decorator functions now use utils.json.dumps() and utils.json.loads() from the refactored utilities. The integration is straightforward and maintains the existing behavior.

fiftyone/server/utils/json/jsonpatch/__init__.py (1)

11-32: LGTM! Import paths updated correctly.

The import paths have been updated to reflect the new module structure (fiftyone.server.utils.json.jsonpatch.*), and a formatting improvement was added. The public API remains unchanged.

fiftyone/server/utils/json/__init__.py (1)

28-33: LGTM: JSONResponse uses centralized dumps

Consistent encoding via the custom Encoder looks good.

tests/unittests/sample_route_tests.py (2)

224-245: ETag/If-Match success tests look solid

Covers ETag, isoformat, and timestamp variants and asserts ETag presence in responses.


579-608: Field If-Match tests: good coverage

Mirrors sample-level tests; ETag assertions included.

fiftyone/server/utils/json/jsonpatch/patch.py (3)

15-19: Import path fix looks right

Updating to the nested utils.json.jsonpatch path matches the new structure.


40-47: Subclass guard + super call: good

Validation of op presence and forwarding kwargs to super is appropriate.


72-79: Generic typing cleanup is fine

Switch to Generic[V] on PatchWithValue and updated value typing improves clarity.

fiftyone/server/utils/json/encoder.py (1)

31-46: Duplicate dumps/loads/JSONResponse definitions — centralize in utils.json

This file redefines dumps(), loads(), and JSONResponse, which are also defined/exported in fiftyone.server.utils.json.init.py. Keeping two sources risks drift and accidental imports from the wrong module.

Apply this diff to keep Encoder here and centralize the public API in utils.json:

-def dumps(obj: Any) -> str:
-    """Serializes an object to a JSON-formatted string."""
-    return json_util.dumps(obj, cls=Encoder)
-
-
-def loads(s: Union[str, bytes, bytearray, None]) -> Any:
-    """Deserializes a JSON-formatted string to a Python object."""
-    return json_util.loads(s) if s else {}
-
-
-class JSONResponse(StarletteJSONResponse):
-    """Custom JSON response that uses the custom Encoder."""
-
-    def render(self, content: Any) -> bytes:
-        return dumps(content).encode("utf-8")
+# NOTE: dumps/loads/JSONResponse are defined in
+# fiftyone.server.utils.json.__init__.py. This module should only expose Encoder.

To help verify unintended imports, you can scan the repo:

@j053y j053y requested a review from a team as a code owner October 14, 2025 20:28
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (2)
fiftyone/server/routes/sample.py (2)

50-67: Add debug logging for ETag format parsing failures.

The multiple try-except blocks with bare ellipsis (...) silently swallow all exceptions when parsing If-Match formats. This makes debugging difficult if clients send malformed values. Consider adding debug-level logging to trace which formats were attempted.

Apply this diff to add debug logging:

         # As ETag - Currently this is just a based64 encode string of
         # last_modified_at
         try:
             if_not_modified_since = datetime.datetime.fromisoformat(
                 base64.b64decode(if_match.encode("utf-8")).decode("utf-8")
             )
         except Exception:
-            ...
+            logger.debug("If-Match value not in base64-encoded ISO format")

         # As ISO date
         try:
             if_not_modified_since = datetime.datetime.fromisoformat(if_match)
         except Exception:
-            ...
+            logger.debug("If-Match value not in ISO format")

         # As Unix timestamp
         try:
             if_not_modified_since = datetime.datetime.fromtimestamp(float(if_match))
         except ValueError:
-            ...
+            logger.debug("If-Match value not in Unix timestamp format")

139-143: Log warnings when sample reload fails in ETag generation.

The try-except block catches all exceptions during reload and continues silently. If the reload fails (e.g., sample was deleted between save and ETag generation), the returned ETag may be based on stale in-memory state rather than the persisted state. Consider logging a warning to aid debugging.

Apply this diff to add warning logging:

         # Ensure last_modified_at reflects persisted state before computing
         # ETag
         try:
             sample.reload(hard=True)
         except Exception:
             # best-effort; still return response
-            ...
+            logger.warning(
+                "Failed to reload sample %s after save; ETag may reflect "
+                "pre-save state",
+                sample.id,
+                exc_info=True
+            )
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 71af200 and 5254a94.

📒 Files selected for processing (2)
  • fiftyone/core/sample.py (2 hunks)
  • fiftyone/server/routes/sample.py (6 hunks)
🔇 Additional comments (3)
fiftyone/server/routes/sample.py (3)

28-101: LGTM! Atomic conditional update addresses TOCTOU concerns.

The context manager implementation successfully addresses the previous TOCTOU (Time-of-Check to Time-of-Use) concerns by performing the optimistic lock check atomically at the database level via sample.save(if_not_modified_since=...). This ensures that:

  1. The If-Match validation is tied directly to the database operation
  2. The conditional update query last_modified_at <= if_not_modified_since is atomic
  3. Race conditions between validation and save are eliminated

The multi-format If-Match parsing (base64 ETag, ISO date, Unix timestamp) provides good backward compatibility and flexibility for clients.


177-193: LGTM! Context manager ensures consistent ETag semantics.

The refactored patch flow using the sample_manager context manager properly integrates If-Match validation, conditional saving, and ETag generation. The ETag header in the response reflects the saved state (after the reload in generate_sample_etag), ensuring clients receive accurate version information for subsequent requests.


248-278: LGTM! SampleField endpoint follows the same ETag pattern.

The SampleField.patch implementation correctly reuses the sample_manager context manager, ensuring consistent If-Match/ETag handling across both sample-level and field-level updates. The ETag header represents the entire sample's state, which is appropriate since field modifications update the sample's last_modified_at.

Copy link
Contributor

@CamronStaley CamronStaley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems reasonable just a couple nits

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/unittests/sample_route_tests.py (1)

336-340: Fix payload to be JSON-serializable while still invalid for the schema

Using a Python set will fail during JSON serialization. Use a dict (still invalid for a list field) so the request reaches the route and triggers the intended 400.

-            "ground_truth": {
-                "_cls": "Detections",
-                "detections": {"some messed up map"},
-            }
+            "ground_truth": {
+                "_cls": "Detections",
+                "detections": {"not": "a list"},
+            }
🧹 Nitpick comments (5)
fiftyone/server/utils/__init__.py (1)

18-18: Prefer relative import for package submodules

Use explicit relative import to avoid ambiguity and potential circular-resolve surprises.

-from fiftyone.server.utils import http, json
+from . import http, json
tests/unittests/sample_route_tests.py (1)

557-559: Remove duplicate MagicMock initialization

mock_request = MagicMock() is assigned twice; drop the first.

-        """Helper to create a mock request object."""
-        mock_request = MagicMock()
-        mock_request = MagicMock()
+        """Helper to create a mock request object."""
+        mock_request = MagicMock()
fiftyone/server/routes/sample.py (3)

44-47: Optionally reject weak ETags for If-Match

Per RFC 7232, weak validators must not be used with If-Match. ETag.parse() returns a weak flag; consider rejecting weak ETags with 412.

Would you like a patch that enforces this and updates tests accordingly?


132-136: Centralize ETag formatting via utils.http

Since you already use utils.http.ETag.parse, consider using the companion factory (if available) to create/quote ETags to keep formatting consistent.

If utils.http.ETag.create(value: str) (or similar) exists, prefer it over manual quoting here.


324-324: Remove stray print

Leftover debug print will pollute logs.

-        print(field_list)
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5254a94 and c3e76b2.

📒 Files selected for processing (3)
  • fiftyone/server/routes/sample.py (5 hunks)
  • fiftyone/server/utils/__init__.py (1 hunks)
  • tests/unittests/sample_route_tests.py (9 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (2)
fiftyone/server/routes/sample.py (2)

47-69: Make If-Match parsing UTC‑aware and harden error handling

  • fromtimestamp() uses local time; produces false 412s.
  • Normalize all parsed datetimes to UTC, and broaden exceptions.

Apply:

-        # As ETag - Currently this is just a based64 encode string of
+        # As ETag - Currently this is just a base64-encoded string of
         # last_modified_at
         try:
-            if_last_modified_at = datetime.datetime.fromisoformat(
-                base64.b64decode(if_match.encode("utf-8")).decode("utf-8")
-            )
+            _decoded = base64.b64decode(if_match.encode("utf-8")).decode("utf-8")
+            dt = datetime.datetime.fromisoformat(_decoded)
+            if_last_modified_at = (
+                dt.astimezone(datetime.timezone.utc)
+                if dt.tzinfo is not None
+                else dt.replace(tzinfo=datetime.timezone.utc)
+            )
         except Exception:
             ...
 
         # As ISO date
         try:
-            if_last_modified_at = datetime.datetime.fromisoformat(if_match)
+            dt = datetime.datetime.fromisoformat(if_match)
+            if_last_modified_at = (
+                dt.astimezone(datetime.timezone.utc)
+                if dt.tzinfo is not None
+                else dt.replace(tzinfo=datetime.timezone.utc)
+            )
         except Exception:
             ...
 
         # As Unix timestamp
         try:
-            if_last_modified_at = datetime.datetime.fromtimestamp(
-                float(if_match)
-            )
-        except ValueError:
+            if_last_modified_at = datetime.datetime.fromtimestamp(
+                float(if_match), tz=datetime.timezone.utc
+            )
+        except (ValueError, TypeError, OSError, OverflowError):
             ...

150-167: Atomic path doesn’t advance last_modified_at; ETag may not change

replace_one bypasses ODM save hooks and won’t bump last_modified_at, so validators can remain stale.

Apply:

     if if_last_modified_at is not None:
         d = sample.to_mongo_dict(include_id=True)
+        # ensure persisted doc gets a fresh last_modified_at
+        d["last_modified_at"] = fou.utcnow()
 
         # pylint:disable-next=protected-access
         update_result = sample.dataset._sample_collection.replace_one(
             {
                 # pylint:disable-next=protected-access
                 "_id": sample._id,
                 "last_modified_at": {"$eq": if_last_modified_at},
             },
             d,
         )

Optional improvement: use find_one_and_replace(..., return_document=ReturnDocument.AFTER) to get the updated doc directly and skip a separate reload.

🧹 Nitpick comments (2)
fiftyone/server/routes/sample.py (1)

323-324: Remove stray print

Debug printing in server routes is noisy; use logger if needed.

-        print(field_list)
tests/unittests/sample_route_tests.py (1)

592-593: Remove duplicate assignment

Redundant reinitialization of mock_request.

-        mock_request = MagicMock()
-        mock_request = MagicMock()
+        mock_request = MagicMock()
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between c3e76b2 and 0d1d1af.

📒 Files selected for processing (2)
  • fiftyone/server/routes/sample.py (5 hunks)
  • tests/unittests/sample_route_tests.py (9 hunks)

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
fiftyone/server/routes/sample.py (1)

62-68: Timestamp parsing still uses local time zone.

Past review comments flagged this issue, but it remains unaddressed. The code uses datetime.datetime.fromtimestamp(float(if_match)) which interprets the timestamp in the server's local timezone rather than UTC. This can cause false 412 responses when comparing with sample.last_modified_at (which is stored in UTC).

Apply this diff to parse as UTC:

-        # As Unix timestamp
+        # As Unix timestamp (UTC)
         try:
             if_last_modified_at = datetime.datetime.fromtimestamp(
-                float(if_match)
+                float(if_match), tz=datetime.timezone.utc
             )
         except Exception:
🧹 Nitpick comments (1)
fiftyone/server/routes/sample.py (1)

150-152: Prefer fou.utcnow() for consistency.

While datetime.datetime.now(datetime.timezone.utc) is correct, the codebase already imports fiftyone.core.odm.utils as fou at line 19. Using fou.utcnow() would be more consistent with FiftyOne's patterns.

Apply this diff:

     if if_last_modified_at is not None:
         d = sample.to_mongo_dict(include_id=True)
-        d["last_modified_at"] = datetime.datetime.now(datetime.timezone.utc)
+        d["last_modified_at"] = fou.utcnow()
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 0d1d1af and 13890b1.

📒 Files selected for processing (1)
  • fiftyone/server/routes/sample.py (5 hunks)
🔇 Additional comments (5)
fiftyone/server/routes/sample.py (5)

114-118: Early If-Match check is now complemented by atomic conditional update.

The early validation here serves as a fast-fail mechanism. The actual protection against race conditions is provided by the atomic conditional update in save_sample (lines 150-167), which re-validates at write time.


123-134: LGTM!

The ETag generation using base64-encoded last_modified_at with proper quoting is correct and follows HTTP ETag standards.


182-207: LGTM!

The JSON patch handling correctly uses the refactored utilities, accumulates errors from multiple patches, and provides clear error messages.


213-255: LGTM!

The PATCH endpoint correctly implements the If-Match/ETag flow:

  1. Parses If-Match header
  2. Loads sample with conditional check
  3. Applies patches based on content type
  4. Saves with atomic conditional update
  5. Returns response with fresh ETag header

This addresses the TOCTOU and ETag staleness concerns from previous reviews.


277-344: LGTM with one minor issue!

The SampleField PATCH endpoint correctly implements the If-Match/ETag flow similar to the Sample endpoint. The logic for finding and patching a specific field in a list is correct, and the atomic conditional update with ETag response is properly implemented.

Just ensure the debug print statement at line 324 is removed (see separate comment).

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (1)
fiftyone/server/routes/sample.py (1)

64-66: Fix Unix timestamp parsing to use UTC.

datetime.fromtimestamp() interprets the timestamp in local time, which can cause false 412 responses when the stored last_modified_at is in UTC. This critical issue was flagged in a previous review but remains unaddressed.

Apply this diff to parse the timestamp as UTC:

         # As Unix timestamp
         try:
-            if_last_modified_at = datetime.datetime.fromtimestamp(
-                float(if_match)
+            if_last_modified_at = datetime.datetime.fromtimestamp(
+                float(if_match), tz=datetime.timezone.utc
             )
🧹 Nitpick comments (1)
fiftyone/server/routes/sample.py (1)

114-118: Consider removing the early conditional check.

While the early fail-fast check provides quick feedback for stale requests, it can produce misleading errors if the sample changes between this check and the atomic update in save_sample(). The authoritative conditional check happens atomically at line 155-167.

Removing this early check would simplify the code and ensure all 412 responses come from the atomic operation, making debugging easier.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 13890b1 and 705ae54.

📒 Files selected for processing (1)
  • fiftyone/server/routes/sample.py (5 hunks)
🔇 Additional comments (5)
fiftyone/server/routes/sample.py (5)

123-134: LGTM!

The ETag generation using base64-encoded ISO timestamp is appropriate. The format is opaque per HTTP spec and properly quoted, with matching parse logic in get_if_last_modified_at().


185-192: LGTM!

The refactored JSON patch handling using utils.json.parse_jsonpatch and utils.json.deserialize is appropriate and maintains proper error handling.


214-255: LGTM!

The PATCH endpoint implementation correctly integrates If-Match validation and ETag generation. The flow properly handles conditional updates and returns appropriate headers.


278-342: LGTM!

The SampleField PATCH endpoint correctly implements If-Match/ETag semantics for nested field updates. The flow properly validates the field path, applies patches, and returns the updated field state with an appropriate ETag header.


150-170: Verify that sample.save() automatically updates last_modified_at.

The atomic conditional update at lines 155-162 manually sets last_modified_at (line 152), while the non-conditional path at line 169 relies on save() to update it. Ensure both paths consistently update the timestamp, otherwise the ETag computed at line 179 may reflect stale state in the non-conditional case.

Run the following script to verify the save() behavior:

"""Utility class for creating and parsing ETag strings."""

@staticmethod
def create(value: Any) -> str:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this used anywhere? I see in the sample method "generate_sample_etage" you're doing this same thing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope. It should be, I'll add it in.

@j053y j053y merged commit 4269038 into develop Oct 15, 2025
20 checks passed
@j053y j053y deleted the josey/sample-endpoint-race-conditions branch October 15, 2025 16:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants