-
Notifications
You must be signed in to change notification settings - Fork 680
Add If-Match header to sample patch endpoints
#6416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughCentralizes JSON (de)serialization into a new utils.json package, adds utils.http.ETag utilities, implements If-Match/ETag conditional PATCH logic in sample and sample-field routes, relocates jsonpatch imports under utils.json, removes the json_transform registry/auto-registration, and migrates tests to pytest with updated import/ETag semantics. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant C as Client
participant R as SampleRoute (PATCH)
participant J as utils.json / jsonpatch
participant D as DatasetStore
participant H as utils.http.ETag
C->>R: PATCH /datasets/{id}/samples/{sid}\nHeaders: If-Match: "<etag>"
R->>R: get_if_last_modified_at(request)
R->>D: load dataset & sample
R->>H: parse(if-match)
alt If-Match present and mismatch
R-->>C: 412 Precondition Failed
else OK
R->>J: parse_jsonpatch + deserialize values
J-->>R: patch ops
R->>D: apply patch and save (conditional on If-Match)
D-->>R: updated sample (reloaded)
R->>H: create(new_last_modified_at)
H-->>R: "new-etag"
R-->>C: 200 OK\nHeaders: ETag: "new-etag"\nBody: JSONResponse (utils.json.dumps)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
🚧 Files skipped from review as they are similar to previous changes (1)
🔇 Additional comments (6)
Warning Review ran into problems🔥 ProblemsGit: Failed to clone repository. Please run the Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (8)
fiftyone/server/decorators.py (3)
34-46: Return 400 for invalid JSON payloadsCurrently, JSON decode failures will fall into the broad exception handler and yield 500. Map JSON parsing errors to 400.
Apply:
async def wrapper( endpoint: HTTPEndpoint, request: Request, *args ) -> t.Union[dict, Response]: try: body = await request.body() payload = body.decode("utf-8") - data = utils.json.loads(payload) + try: + data = utils.json.loads(payload) + except Exception as je: + raise HTTPException(status_code=400, detail=f"Invalid JSON: {je}") from je response = await func(endpoint, request, data, *args) if isinstance(response, Response): return response
22-27: Optional: use utils.json.JSONResponse for consistencyUsing the custom JSONResponse keeps encoding consistent with dumps/Encoder.
-async def create_response(response: dict): - """Creates a JSON response from the given dictionary.""" - return Response( - await run_sync_task(lambda: utils.json.dumps(response)), - headers={"Content-Type": "application/json"}, - ) +async def create_response(response: dict): + """Creates a JSON response from the given dictionary.""" + return utils.json.JSONResponse(response)
47-60: Optional: return error JSON via utils.json.JSONResponseThe current payload is string fields, but using the same JSON layer avoids future surprises with non-primitive values.
- return JSONResponse( + return utils.json.JSONResponse( { "kind": "Server Error", "stack": traceback.format_exc(), }, status_code=500, )fiftyone/server/routes/sample.py (3)
148-153: Reload before computing ETag to ensure it reflects persisted stateDepending on how
save()updateslast_modified_at, reloading guarantees the ETag corresponds to the stored document.- result.save() - - return utils.json.JSONResponse( - utils.json.serialize(result), - headers={"ETag": f'"{generate_sample_etag(result)}"'}, - ) + result.save() + # Ensure last_modified_at is the persisted value used for ETag + try: + result.reload(hard=True) + except Exception: + pass + + return utils.json.JSONResponse( + utils.json.serialize(result), + headers={"ETag": f'"{generate_sample_etag(result)}"'}, + )
225-231: Also reload sample before ETag on nested field patchesSame rationale as for the sample PATCH: ensure ETag represents the persisted state after save.
- result = handle_json_patch(field, data) - sample.save() - - return utils.json.JSONResponse( - utils.json.serialize(result), - headers={"ETag": f'"{generate_sample_etag(sample)}"'}, - ) + result = handle_json_patch(field, data) + sample.save() + try: + sample.reload(hard=True) + except Exception: + pass + + return utils.json.JSONResponse( + utils.json.serialize(result), + headers={"ETag": f'"{generate_sample_etag(sample)}"'}, + )
134-141: Note: TOCTOU window remains despite If-Match preconditionThe If-Match check happens before mutation; a concurrent write between check and save can still race. Consider optimistic concurrency at the persistence layer (filter on previous last_modified_at or version) to atomically enforce the precondition.
Also applies to: 148-153
tests/unittests/sample_route_tests.py (2)
60-62: Avoid default mutable dict for headersUse None default to prevent accidental shared state across tests.
- def _create_mock_request( - self, payload, content_type="application/json", headers={} - ): + def _create_mock_request( + self, payload, content_type="application/json", headers=None + ): @@ - mock_request.headers = headers | {"Content-Type": content_type} + headers = headers or {} + mock_request.headers = headers | {"Content-Type": content_type}Also applies to: 69-69
498-508: Avoid default mutable dict for headers (SampleField helper)Same fix for the SampleField helper.
- def _create_mock_request(self, payload, field_path, field_id, headers={}): + def _create_mock_request(self, payload, field_path, field_id, headers=None): @@ - mock_request.headers = headers | {"Content-Type": "application/json"} + headers = headers or {} + mock_request.headers = headers | {"Content-Type": "application/json"}
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
fiftyone/server/decorators.py(2 hunks)fiftyone/server/routes/sample.py(8 hunks)fiftyone/server/utils/__init__.py(1 hunks)fiftyone/server/utils/http.py(1 hunks)fiftyone/server/utils/json/__init__.py(1 hunks)fiftyone/server/utils/json/encoder.py(1 hunks)fiftyone/server/utils/json/jsonpatch/__init__.py(2 hunks)fiftyone/server/utils/json/jsonpatch/patch.py(6 hunks)fiftyone/server/utils/json/serialization.py(1 hunks)fiftyone/server/utils/json_transform/__init__.py(0 hunks)fiftyone/server/utils/json_transform/transform.py(0 hunks)fiftyone/server/utils/json_transform/types.py(0 hunks)tests/unittests/sample_route_tests.py(7 hunks)tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py(1 hunks)tests/unittests/server/utils/jsonpatch/test_jsonpatch.py(1 hunks)tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py(1 hunks)
💤 Files with no reviewable changes (3)
- fiftyone/server/utils/json_transform/init.py
- fiftyone/server/utils/json_transform/transform.py
- fiftyone/server/utils/json_transform/types.py
🔇 Additional comments (6)
tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py (1)
12-12: Import path refactor looks goodPath now aligns with centralized utils.json.jsonpatch. No behavior change.
tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py (1)
14-23: Consistent import path updateImports now point to the centralized json.jsonpatch module. Looks correct.
fiftyone/server/utils/json/__init__.py (1)
18-33: Centralized JSON layer LGTMdumps/loads + JSONResponse unify encoding with bson.json_util and custom Encoder. Re-export of parse_jsonpatch/serialize/deserialize is convenient.
fiftyone/server/utils/json/jsonpatch/patch.py (1)
40-47: Typing and subclassing improvements LGTM
- init_subclass correctly calls super and validates op
- Generic[V] for value-carrying patches improves type safety
- Import path aligned with utils.json.jsonpatch
Also applies to: 72-79, 93-101, 129-136, 138-145
fiftyone/server/utils/__init__.py (1)
18-20: Re-exporting utils.json and utils.http is fine; confirm no import cyclesThis enables a clean
from fiftyone.server.utils import json, http. Please verify:
- No circular imports are introduced.
- Any intended registration side effects occur only once.
Run this to locate ETag and check for cycles/usage:
fiftyone/server/routes/sample.py (1)
59-72: Confirm If-Match parsing covers multiple tags and weak tagsIf-Match can contain multiple entity-tags or “*”. You currently parse a single value and ignore weakness.
Please verify
utils.http.ETag.parse:
- Handles lists (e.g., If-Match: "etag1", "etag2") and wildcard “*”
- Rejects weak tags (RFC specifies If-Match must not use weak validators), or at least treats them safely
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (12)
fiftyone/server/utils/http.py (1)
20-33: Consider adding validation for malformed ETags.The
parsemethod assumes the ETag has surrounding quotes (lines 30-31) but doesn't validate this assumption. If an ETag lacks quotes, the slicing operation will silently remove the first and last characters, potentially corrupting the value.While this may be acceptable if callers guarantee well-formed ETags, consider adding validation or documenting the expected input format.
Example validation:
@staticmethod def parse(etag: str) -> tuple[str, bool]: """Parses an ETag string into its value and whether it is weak.""" is_weak = False if etag.startswith("W/"): is_weak = True etag = etag[2:] # Remove "W/" prefix # Remove surrounding quotes (ETags are typically quoted) if etag.startswith('"') and etag.endswith('"'): etag = etag[1:-1] + elif etag: + # ETag without quotes - may be malformed + pass # or raise ValueError("Malformed ETag: missing quotes") return etag, is_weakfiftyone/server/utils/json/__init__.py (1)
23-26: Confirm silent fallback in loads()Returning {} for falsy input can mask empty/invalid JSON bodies. If this is intentional, OK; otherwise consider raising 400 upstream when the body is empty for content types that require JSON.
fiftyone/server/utils/json/encoder.py (1)
22-29: Enhance numpy coverage in EncoderAlso handle numpy booleans to avoid serialization surprises.
- if isinstance(o, np.floating): + if isinstance(o, np.floating): return float(o) - if isinstance(o, np.integer): + if isinstance(o, np.integer): return int(o) + if isinstance(o, np.bool_): + return bool(o)tests/unittests/sample_route_tests.py (3)
60-70: Avoid mutable default for headersUse None default to prevent shared mutable state across calls (even though you don’t mutate it here).
- def _create_mock_request( - self, payload, content_type="application/json", headers={} - ): + def _create_mock_request( + self, payload, content_type="application/json", headers=None + ): @@ - mock_request.headers = headers | {"Content-Type": content_type} + headers = headers or {} + mock_request.headers = headers | {"Content-Type": content_type}
246-271: Optionally assert no mutation on 412You can add a quick sample.reload() and assert the field did not change to strengthen the failure-path guarantee.
498-508: Avoid mutable default for headers (SampleField tests)Same recommendation as above.
- def _create_mock_request(self, payload, field_path, field_id, headers={}): + def _create_mock_request(self, payload, field_path, field_id, headers=None): @@ - mock_request.headers = headers | {"Content-Type": "application/json"} + headers = headers or {} + mock_request.headers = headers | {"Content-Type": "application/json"}fiftyone/server/routes/sample.py (6)
89-96: Apply-return is ignored; assign to capture root replacementsIf a patch replaces the root or returns a new container, ignoring the return can drop changes. Assign the result each iteration.
- try: - p.apply(target) + try: + target = p.apply(target)
104-110: ETag construction: consider using a stable string tokenReturning an int from an md5 digest is fine, but you could return the hex digest (or the iso timestamp directly) to avoid very large integer strings and align with typical opaque ETag formats.
112-126: Type hints: patch() data shape and return type
- data can be dict (merge) or list[dict] (JSON Patch).
- The method returns a Response, not dict.
-class Sample(HTTPEndpoint): +class Sample(HTTPEndpoint): @@ - async def patch(self, request: Request, data: dict) -> dict: + async def patch(self, request: Request, data: Any) -> utils.json.JSONResponse:You might also update the docstring to reflect both payload shapes.
155-173: Type annotation: _handle_patch returnThe method returns the mutated Sample, not a dict.
- def _handle_patch(self, sample: fo.Sample, data: dict) -> dict: + def _handle_patch(self, sample: fo.Sample, data: dict) -> fo.Sample:
189-201: Type hints and return type (SampleField.patch)Same as Sample.patch: data is a JSON Patch list, and return is a Response.
- async def patch(self, request: Request, data: dict) -> dict: + async def patch(self, request: Request, data: List[dict]) -> utils.json.JSONResponse:
9-13: Remove unused importdatetime is unused.
-import datetime
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (16)
fiftyone/server/decorators.py(2 hunks)fiftyone/server/routes/sample.py(8 hunks)fiftyone/server/utils/__init__.py(1 hunks)fiftyone/server/utils/http.py(1 hunks)fiftyone/server/utils/json/__init__.py(1 hunks)fiftyone/server/utils/json/encoder.py(1 hunks)fiftyone/server/utils/json/jsonpatch/__init__.py(2 hunks)fiftyone/server/utils/json/jsonpatch/patch.py(6 hunks)fiftyone/server/utils/json/serialization.py(1 hunks)fiftyone/server/utils/json_transform/__init__.py(0 hunks)fiftyone/server/utils/json_transform/transform.py(0 hunks)fiftyone/server/utils/json_transform/types.py(0 hunks)tests/unittests/sample_route_tests.py(7 hunks)tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py(1 hunks)tests/unittests/server/utils/jsonpatch/test_jsonpatch.py(1 hunks)tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py(1 hunks)
💤 Files with no reviewable changes (3)
- fiftyone/server/utils/json_transform/transform.py
- fiftyone/server/utils/json_transform/init.py
- fiftyone/server/utils/json_transform/types.py
🔇 Additional comments (15)
fiftyone/server/utils/json/serialization.py (2)
53-70: Document the include_private=True rationale.The special handling of
fos.Samplewithinclude_private=Trueis noteworthy. This exposes internal fields that are typically hidden from the public API.Ensure this is intentional and documented, especially in the context of ETag generation where consistency is critical.
14-50: Verify extensibility and error handling.The deserialize function has a hardcoded list of 6 label classes. Consider:
- If additional label types need deserialization support in the future, this list must be manually updated
- If
cls.from_dict(value)raises an exception, it will propagate unhandledWhile the current implementation is functional, verify that this approach aligns with the project's deserialization strategy.
Run the following script to verify if there are other label types that might need deserialization support:
tests/unittests/server/utils/jsonpatch/test_json_patch_patch.py (1)
12-12: LGTM! Import path updated correctly.The import path has been updated to align with the new module structure (
fiftyone.server.utils.json.jsonpatch). Test behavior remains unchanged.tests/unittests/server/utils/jsonpatch/test_jsonpatch.py (1)
12-12: LGTM! Import path updated correctly.The import path has been updated to align with the new module structure. Test behavior remains unchanged.
tests/unittests/server/utils/jsonpatch/test_jsonpatch_methods.py (1)
14-23: LGTM! Import paths updated correctly.The import paths have been updated to align with the new module structure (
fiftyone.server.utils.json.jsonpatch). Test behavior remains unchanged.fiftyone/server/utils/__init__.py (1)
18-19: LGTM! Submodule imports added correctly.The imports of
jsonandhttpsubmodules enable the new utilities introduced in this PR (ETag support and centralized JSON handling). This aligns with the broader refactoring effort.fiftyone/server/decorators.py (1)
19-40: LGTM! Integration with new JSON utils is clean.The decorator functions now use
utils.json.dumps()andutils.json.loads()from the refactored utilities. The integration is straightforward and maintains the existing behavior.fiftyone/server/utils/json/jsonpatch/__init__.py (1)
11-32: LGTM! Import paths updated correctly.The import paths have been updated to reflect the new module structure (
fiftyone.server.utils.json.jsonpatch.*), and a formatting improvement was added. The public API remains unchanged.fiftyone/server/utils/json/__init__.py (1)
28-33: LGTM: JSONResponse uses centralized dumpsConsistent encoding via the custom Encoder looks good.
tests/unittests/sample_route_tests.py (2)
224-245: ETag/If-Match success tests look solidCovers ETag, isoformat, and timestamp variants and asserts ETag presence in responses.
579-608: Field If-Match tests: good coverageMirrors sample-level tests; ETag assertions included.
fiftyone/server/utils/json/jsonpatch/patch.py (3)
15-19: Import path fix looks rightUpdating to the nested utils.json.jsonpatch path matches the new structure.
40-47: Subclass guard + super call: goodValidation of op presence and forwarding kwargs to super is appropriate.
72-79: Generic typing cleanup is fineSwitch to Generic[V] on PatchWithValue and updated value typing improves clarity.
fiftyone/server/utils/json/encoder.py (1)
31-46: Duplicate dumps/loads/JSONResponse definitions — centralize in utils.jsonThis file redefines dumps(), loads(), and JSONResponse, which are also defined/exported in fiftyone.server.utils.json.init.py. Keeping two sources risks drift and accidental imports from the wrong module.
Apply this diff to keep Encoder here and centralize the public API in utils.json:
-def dumps(obj: Any) -> str: - """Serializes an object to a JSON-formatted string.""" - return json_util.dumps(obj, cls=Encoder) - - -def loads(s: Union[str, bytes, bytearray, None]) -> Any: - """Deserializes a JSON-formatted string to a Python object.""" - return json_util.loads(s) if s else {} - - -class JSONResponse(StarletteJSONResponse): - """Custom JSON response that uses the custom Encoder.""" - - def render(self, content: Any) -> bytes: - return dumps(content).encode("utf-8") +# NOTE: dumps/loads/JSONResponse are defined in +# fiftyone.server.utils.json.__init__.py. This module should only expose Encoder.To help verify unintended imports, you can scan the repo:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
fiftyone/server/routes/sample.py (2)
50-67: Add debug logging for ETag format parsing failures.The multiple try-except blocks with bare ellipsis (
...) silently swallow all exceptions when parsing If-Match formats. This makes debugging difficult if clients send malformed values. Consider adding debug-level logging to trace which formats were attempted.Apply this diff to add debug logging:
# As ETag - Currently this is just a based64 encode string of # last_modified_at try: if_not_modified_since = datetime.datetime.fromisoformat( base64.b64decode(if_match.encode("utf-8")).decode("utf-8") ) except Exception: - ... + logger.debug("If-Match value not in base64-encoded ISO format") # As ISO date try: if_not_modified_since = datetime.datetime.fromisoformat(if_match) except Exception: - ... + logger.debug("If-Match value not in ISO format") # As Unix timestamp try: if_not_modified_since = datetime.datetime.fromtimestamp(float(if_match)) except ValueError: - ... + logger.debug("If-Match value not in Unix timestamp format")
139-143: Log warnings when sample reload fails in ETag generation.The try-except block catches all exceptions during reload and continues silently. If the reload fails (e.g., sample was deleted between save and ETag generation), the returned ETag may be based on stale in-memory state rather than the persisted state. Consider logging a warning to aid debugging.
Apply this diff to add warning logging:
# Ensure last_modified_at reflects persisted state before computing # ETag try: sample.reload(hard=True) except Exception: # best-effort; still return response - ... + logger.warning( + "Failed to reload sample %s after save; ETag may reflect " + "pre-save state", + sample.id, + exc_info=True + )
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
fiftyone/core/sample.py(2 hunks)fiftyone/server/routes/sample.py(6 hunks)
🔇 Additional comments (3)
fiftyone/server/routes/sample.py (3)
28-101: LGTM! Atomic conditional update addresses TOCTOU concerns.The context manager implementation successfully addresses the previous TOCTOU (Time-of-Check to Time-of-Use) concerns by performing the optimistic lock check atomically at the database level via
sample.save(if_not_modified_since=...). This ensures that:
- The If-Match validation is tied directly to the database operation
- The conditional update query
last_modified_at <= if_not_modified_sinceis atomic- Race conditions between validation and save are eliminated
The multi-format If-Match parsing (base64 ETag, ISO date, Unix timestamp) provides good backward compatibility and flexibility for clients.
177-193: LGTM! Context manager ensures consistent ETag semantics.The refactored patch flow using the
sample_managercontext manager properly integrates If-Match validation, conditional saving, and ETag generation. The ETag header in the response reflects the saved state (after the reload ingenerate_sample_etag), ensuring clients receive accurate version information for subsequent requests.
248-278: LGTM! SampleField endpoint follows the same ETag pattern.The SampleField.patch implementation correctly reuses the
sample_managercontext manager, ensuring consistent If-Match/ETag handling across both sample-level and field-level updates. The ETag header represents the entire sample's state, which is appropriate since field modifications update the sample'slast_modified_at.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems reasonable just a couple nits
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
tests/unittests/sample_route_tests.py (1)
336-340: Fix payload to be JSON-serializable while still invalid for the schemaUsing a Python set will fail during JSON serialization. Use a dict (still invalid for a list field) so the request reaches the route and triggers the intended 400.
- "ground_truth": { - "_cls": "Detections", - "detections": {"some messed up map"}, - } + "ground_truth": { + "_cls": "Detections", + "detections": {"not": "a list"}, + }
🧹 Nitpick comments (5)
fiftyone/server/utils/__init__.py (1)
18-18: Prefer relative import for package submodulesUse explicit relative import to avoid ambiguity and potential circular-resolve surprises.
-from fiftyone.server.utils import http, json +from . import http, jsontests/unittests/sample_route_tests.py (1)
557-559: Remove duplicate MagicMock initialization
mock_request = MagicMock()is assigned twice; drop the first.- """Helper to create a mock request object.""" - mock_request = MagicMock() - mock_request = MagicMock() + """Helper to create a mock request object.""" + mock_request = MagicMock()fiftyone/server/routes/sample.py (3)
44-47: Optionally reject weak ETags for If-MatchPer RFC 7232, weak validators must not be used with If-Match.
ETag.parse()returns a weak flag; consider rejecting weak ETags with 412.Would you like a patch that enforces this and updates tests accordingly?
132-136: Centralize ETag formatting via utils.httpSince you already use
utils.http.ETag.parse, consider using the companion factory (if available) to create/quote ETags to keep formatting consistent.If
utils.http.ETag.create(value: str)(or similar) exists, prefer it over manual quoting here.
324-324: Remove stray printLeftover debug print will pollute logs.
- print(field_list)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
fiftyone/server/routes/sample.py(5 hunks)fiftyone/server/utils/__init__.py(1 hunks)tests/unittests/sample_route_tests.py(9 hunks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
fiftyone/server/routes/sample.py (2)
47-69: Make If-Match parsing UTC‑aware and harden error handling
- fromtimestamp() uses local time; produces false 412s.
- Normalize all parsed datetimes to UTC, and broaden exceptions.
Apply:
- # As ETag - Currently this is just a based64 encode string of + # As ETag - Currently this is just a base64-encoded string of # last_modified_at try: - if_last_modified_at = datetime.datetime.fromisoformat( - base64.b64decode(if_match.encode("utf-8")).decode("utf-8") - ) + _decoded = base64.b64decode(if_match.encode("utf-8")).decode("utf-8") + dt = datetime.datetime.fromisoformat(_decoded) + if_last_modified_at = ( + dt.astimezone(datetime.timezone.utc) + if dt.tzinfo is not None + else dt.replace(tzinfo=datetime.timezone.utc) + ) except Exception: ... # As ISO date try: - if_last_modified_at = datetime.datetime.fromisoformat(if_match) + dt = datetime.datetime.fromisoformat(if_match) + if_last_modified_at = ( + dt.astimezone(datetime.timezone.utc) + if dt.tzinfo is not None + else dt.replace(tzinfo=datetime.timezone.utc) + ) except Exception: ... # As Unix timestamp try: - if_last_modified_at = datetime.datetime.fromtimestamp( - float(if_match) - ) - except ValueError: + if_last_modified_at = datetime.datetime.fromtimestamp( + float(if_match), tz=datetime.timezone.utc + ) + except (ValueError, TypeError, OSError, OverflowError): ...
150-167: Atomic path doesn’t advance last_modified_at; ETag may not changereplace_one bypasses ODM save hooks and won’t bump last_modified_at, so validators can remain stale.
Apply:
if if_last_modified_at is not None: d = sample.to_mongo_dict(include_id=True) + # ensure persisted doc gets a fresh last_modified_at + d["last_modified_at"] = fou.utcnow() # pylint:disable-next=protected-access update_result = sample.dataset._sample_collection.replace_one( { # pylint:disable-next=protected-access "_id": sample._id, "last_modified_at": {"$eq": if_last_modified_at}, }, d, )Optional improvement: use find_one_and_replace(..., return_document=ReturnDocument.AFTER) to get the updated doc directly and skip a separate reload.
🧹 Nitpick comments (2)
fiftyone/server/routes/sample.py (1)
323-324: Remove stray printDebug printing in server routes is noisy; use logger if needed.
- print(field_list)tests/unittests/sample_route_tests.py (1)
592-593: Remove duplicate assignmentRedundant reinitialization of mock_request.
- mock_request = MagicMock() - mock_request = MagicMock() + mock_request = MagicMock()
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
fiftyone/server/routes/sample.py(5 hunks)tests/unittests/sample_route_tests.py(9 hunks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
fiftyone/server/routes/sample.py (1)
62-68: Timestamp parsing still uses local time zone.Past review comments flagged this issue, but it remains unaddressed. The code uses
datetime.datetime.fromtimestamp(float(if_match))which interprets the timestamp in the server's local timezone rather than UTC. This can cause false 412 responses when comparing withsample.last_modified_at(which is stored in UTC).Apply this diff to parse as UTC:
- # As Unix timestamp + # As Unix timestamp (UTC) try: if_last_modified_at = datetime.datetime.fromtimestamp( - float(if_match) + float(if_match), tz=datetime.timezone.utc ) except Exception:
🧹 Nitpick comments (1)
fiftyone/server/routes/sample.py (1)
150-152: Preferfou.utcnow()for consistency.While
datetime.datetime.now(datetime.timezone.utc)is correct, the codebase already importsfiftyone.core.odm.utils as fouat line 19. Usingfou.utcnow()would be more consistent with FiftyOne's patterns.Apply this diff:
if if_last_modified_at is not None: d = sample.to_mongo_dict(include_id=True) - d["last_modified_at"] = datetime.datetime.now(datetime.timezone.utc) + d["last_modified_at"] = fou.utcnow()
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/server/routes/sample.py(5 hunks)
🔇 Additional comments (5)
fiftyone/server/routes/sample.py (5)
114-118: Early If-Match check is now complemented by atomic conditional update.The early validation here serves as a fast-fail mechanism. The actual protection against race conditions is provided by the atomic conditional update in
save_sample(lines 150-167), which re-validates at write time.
123-134: LGTM!The ETag generation using base64-encoded
last_modified_atwith proper quoting is correct and follows HTTP ETag standards.
182-207: LGTM!The JSON patch handling correctly uses the refactored utilities, accumulates errors from multiple patches, and provides clear error messages.
213-255: LGTM!The PATCH endpoint correctly implements the If-Match/ETag flow:
- Parses If-Match header
- Loads sample with conditional check
- Applies patches based on content type
- Saves with atomic conditional update
- Returns response with fresh ETag header
This addresses the TOCTOU and ETag staleness concerns from previous reviews.
277-344: LGTM with one minor issue!The SampleField PATCH endpoint correctly implements the If-Match/ETag flow similar to the Sample endpoint. The logic for finding and patching a specific field in a list is correct, and the atomic conditional update with ETag response is properly implemented.
Just ensure the debug print statement at line 324 is removed (see separate comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
fiftyone/server/routes/sample.py (1)
64-66: Fix Unix timestamp parsing to use UTC.
datetime.fromtimestamp()interprets the timestamp in local time, which can cause false 412 responses when the storedlast_modified_atis in UTC. This critical issue was flagged in a previous review but remains unaddressed.Apply this diff to parse the timestamp as UTC:
# As Unix timestamp try: - if_last_modified_at = datetime.datetime.fromtimestamp( - float(if_match) + if_last_modified_at = datetime.datetime.fromtimestamp( + float(if_match), tz=datetime.timezone.utc )
🧹 Nitpick comments (1)
fiftyone/server/routes/sample.py (1)
114-118: Consider removing the early conditional check.While the early fail-fast check provides quick feedback for stale requests, it can produce misleading errors if the sample changes between this check and the atomic update in
save_sample(). The authoritative conditional check happens atomically at line 155-167.Removing this early check would simplify the code and ensure all 412 responses come from the atomic operation, making debugging easier.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/server/routes/sample.py(5 hunks)
🔇 Additional comments (5)
fiftyone/server/routes/sample.py (5)
123-134: LGTM!The ETag generation using base64-encoded ISO timestamp is appropriate. The format is opaque per HTTP spec and properly quoted, with matching parse logic in
get_if_last_modified_at().
185-192: LGTM!The refactored JSON patch handling using
utils.json.parse_jsonpatchandutils.json.deserializeis appropriate and maintains proper error handling.
214-255: LGTM!The PATCH endpoint implementation correctly integrates If-Match validation and ETag generation. The flow properly handles conditional updates and returns appropriate headers.
278-342: LGTM!The SampleField PATCH endpoint correctly implements If-Match/ETag semantics for nested field updates. The flow properly validates the field path, applies patches, and returns the updated field state with an appropriate ETag header.
150-170: Verify thatsample.save()automatically updateslast_modified_at.The atomic conditional update at lines 155-162 manually sets
last_modified_at(line 152), while the non-conditional path at line 169 relies onsave()to update it. Ensure both paths consistently update the timestamp, otherwise the ETag computed at line 179 may reflect stale state in the non-conditional case.Run the following script to verify the
save()behavior:
fiftyone/server/utils/http.py
Outdated
| """Utility class for creating and parsing ETag strings.""" | ||
|
|
||
| @staticmethod | ||
| def create(value: Any) -> str: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this used anywhere? I see in the sample method "generate_sample_etage" you're doing this same thing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope. It should be, I'll add it in.
What changes are proposed in this pull request?
Add
If-Matchheader to sample patch endpoints to avoid race conditions and other multi-user update problems.Successful sample PATCH endpoints now return the
ETagheader.Sample PATCH endpoints now accept the
If-Matchheader which can be one of the following:ETagheader from a previous request (this is preferred as it is more flexible than the following options)last_modified_atvalue as an ISO 8601 datelast_modified_atvalue as a UNIX timestampThe first example will compare against the current calculated ETag, while the latter two will compare directly against the sample's
last_modified_atand raise a412if the values do not match.If-Matchheader support to the sample routes (with tests!)fiftyone.server.utilsinto sub-modules to group similar areas.fiftyone.server.decoratorsto use refactoredfiftyone.server.utilsHow is this patch tested? If it is not, please explain why.
Run the automated tests
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
fiftyonePython library changesSummary by CodeRabbit