-
Notifications
You must be signed in to change notification settings - Fork 676
Remove serverside JS usages #6334
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
Conversation
WalkthroughReplaces JS-based array sorting and counting with MongoDB aggregation expressions: adds a Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Client
participant App(ViewExpression / Server)
participant MongoDB
Client->>App: Request view update (sort or count)
alt sort -> $sortArray
App->>MongoDB: Build expression with $sortArray (input: expr, sortBy: {...})
MongoDB-->>App: Evaluated/represented sorted array
App-->>Client: Return ViewExpression containing $sortArray stage
else count -> aggregation expr
App->>MongoDB: Build aggregation expr ($map / $setUnion -> $filter/$size -> $arrayToObject)
MongoDB-->>App: Evaluated counts map (aggregation result or expression)
App->>App: set_field(path, expr, _allow_missing=True)
App-->>Client: Acknowledge field set / return expression
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Pre-merge checks (2 passed, 1 warning)❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Poem
Tip 👮 Agentic pre-merge checks are now available in preview!Pro plan users can now enable pre-merge checks in their settings to enforce checklists before merging PRs.
Please see the documentation for more information. Example: reviews:
pre_merge_checks:
custom_checks:
- name: "Undocumented Breaking Changes"
mode: "warning"
instructions: |
Pass/fail criteria: All breaking changes to public APIs, CLI flags, environment variables, configuration keys, database schemas, or HTTP/GraphQL endpoints must be documented in the "Breaking Change" section of the PR description and in CHANGELOG.md. Exclude purely internal or private changes (e.g., code not exported from package entry points or explicitly marked as internal). Please share your feedback with us on this Discord post. ✨ Finishing Touches
🧪 Generate unit tests
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
fiftyone/core/expressions.py (1)
18-18
: Remove unusedpymongo
import
pymongo
isn't referenced in this module. Drop it to avoid lint failures and unnecessary dependency coupling.- import pymongo
fiftyone/server/view.py (1)
949-982
: Optional: use$setField
to simplify and avoid$mergeObjects+$arrayToObject
MongoDB 5.0+ supports
$setField
, which directly sets/updates a key on an object. This is clearer and avoids building single-element objects each step.def _count_list_items(path, view): - expr = { - "$reduce": { - "input": F(path), # The field to be processed - "initialValue": {}, - "in": { - "$mergeObjects": [ - "$$value", - { - "$arrayToObject": [ - [ - { - "k": "$$this", - "v": { - "$add": [ - { - "$ifNull": [ - { - "$getField": { - "input": "$$value", - "field": "$$this", - } - }, - 0, - ] - }, - 1, - ] - }, - } - ] - ] - }, - ] - }, - } - } + expr = { + "$reduce": { + "input": F(path), + "initialValue": {}, + "in": { + "$setField": { + "input": "$$value", + "field": "$$this", + "value": { + "$add": [ + { + "$ifNull": [ + { + "$getField": { + "input": "$$value", + "field": "$$this", + } + }, + 0, + ] + }, + 1, + ] + }, + } + }, + } + } return view.set_field(path, expr, _allow_missing=True)
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
fiftyone/core/expressions.py
(1 hunks)fiftyone/server/view.py
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
fiftyone/server/view.py (2)
fiftyone/core/collections.py (2)
view
(1059-1066)set_field
(6332-6441)fiftyone/core/expressions.py (1)
set_field
(2061-2123)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (13)
- GitHub Check: test-windows / test-python (windows-latest, 3.12)
- GitHub Check: test-windows / test-python (windows-latest, 3.11)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
- GitHub Check: test-windows / test-python (windows-latest, 3.10)
- GitHub Check: test-windows / test-python (windows-latest, 3.9)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.12)
- GitHub Check: test / test-app
- GitHub Check: lint / eslint
- GitHub Check: build / build
- GitHub Check: e2e / test-e2e
- GitHub Check: build
🔇 Additional comments (1)
fiftyone/server/view.py (1)
949-982
: Confirm tag key validity for object field namesThis builds a document whose keys are tag strings. If tags can contain '.' or start with '$', MongoDB will error on invalid field names. If such tags are possible, either sanitize keys or store counts as an array of
{k, v}
pairs instead. Otherwise, please confirm upstream constraints disallow these characters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
doesn't seem like it gives the correct results.
added 8 label tags 'test' to sample then ran
view= _add_labels_tags_counts(ds)
pipeline2=view._pipeline()
pipeline2.extend([{
"$match": {
"_label_tags": {"$ne": {}}
}
}, {"$project":{"_label_tags":1}}])
# proposed
In [37]: list(ds._sample_collection.aggregate(pipeline))
Out[37]: [{'_id': ObjectId('6829068fb59076c98c88bdcc'), '_label_tags': {'test': 1}}]
# existing
list(ds._sample_collection.aggregate(pipeline2))
Out[34]: [{'_id': ObjectId('6829068fb59076c98c88bdcc'), '_label_tags': {'test': 8.0}}]
…deleting dead code" This reverts commit 9119a27.
Fixed. The only difference now is that the count is an integer instead of a float, idk if that matters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
fiftyone/core/expressions.py (1)
2667-2741
: Fix behavior/doc drift in ViewExpression.sort; deprecate/handlenumeric
and update docs
- The docstring still references JS
.sort()
and implies string-based sorting; the new impl uses MongoDB$sortArray
. Update docs accordingly.numeric
is now ignored. Either implement a cast workflow or at least emit a deprecation warning to avoid silent behavior changes.Apply:
@@ - If no ``key`` is provided, this array must contain elements whose - BSON representation can be sorted by JavaScript's ``.sort()`` method. + If no ``key`` is provided, the array elements are sorted by their + values using MongoDB's ``$sortArray`` (ascending by default). @@ - numeric (False): whether the array contains numeric values. By - default, the values will be sorted alphabetically by their - string representations + numeric (False): deprecated; ``$sortArray`` sorts numeric types + natively. If your data are numeric strings, cast to numbers + before sorting (e.g. via ``map(...).to_double()``). @@ - sort_order = pymongo.DESCENDING if reverse else pymongo.ASCENDING + if numeric: + warnings.warn( + "numeric is deprecated; $sortArray sorts numeric types natively. " + "For numeric strings, cast to numbers before sorting.", + DeprecationWarning, + ) + sort_order = -1 if reverse else 1
🧹 Nitpick comments (2)
fiftyone/core/expressions.py (2)
18-18
: Drop hard dependency on pymongo for sort constants
$sortArray
accepts 1/-1; importingpymongo
here adds an unnecessary import to core expressions.Apply:
- import pymongo + # keep this module pymongo-agnostic; $sortArray accepts 1 (asc) and -1 (desc)And (already covered above in sort()):
- sort_order = pymongo.DESCENDING if reverse else pymongo.ASCENDING + sort_order = -1 if reverse else 1Also applies to: 2727-2727
2729-2733
: Type expectations forkey
If callers pass a
ViewField
instead of a string, the resulting$sortArray.sortBy
would be incorrect (keys must be field names, not$
-prefixed paths). Guard or coerce to string, or raise with a clear error.Apply:
if key is not None: - sort_by = {key: sort_order} + if not isinstance(key, str): + raise TypeError("sort(key=...) must be a field path string") + sort_by = {key: sort_order}
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
fiftyone/core/expressions.py
(2 hunks)fiftyone/server/view.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- fiftyone/server/view.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
- GitHub Check: test-windows / test-python (windows-latest, 3.9)
- GitHub Check: test-windows / test-python (windows-latest, 3.12)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.12)
- GitHub Check: test-windows / test-python (windows-latest, 3.11)
- GitHub Check: test-windows / test-python (windows-latest, 3.10)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.11)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.9)
- GitHub Check: test / test-python (ubuntu-latest-m, 3.10)
- GitHub Check: e2e / test-e2e
- GitHub Check: build
idk why tests would pass in ubuntu but fail in windows. those tests shouldn't have anything to do with the OS? Unless their MongoDB installations differ in behavior which would be wild? |
Ok so confusingly this is because windows downloads v6 while ubuntu downloads v8 of mongod. So it's a version thing not a platform thing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW- the example code for sort()
still runs successfully for me on develop
. I'm running MongoDB 5.0.2 on an x86 mac.
I'm onboard with swapping out $function
for $sortArray
though. Although looks like I'll have to upgrade to at least Mongo 5.2 if I want to use this function going forward 😝
comp = "(a, b) => a - b" | ||
else: | ||
comp = "" | ||
sort_order = pymongo.DESCENDING if reverse else pymongo.ASCENDING |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A couple comments on the implementation:
- The
numeric
parameter is no longer used. Assuming that the new implementation sorts strings alphabetically and numbers numerically by default, can you delete thenumeric
parameter? - The docstring currently mentions "JavaScript's
.sort()
method". Can you update to say something like this?
If no ``key`` is provided, this array must contain sortable non-document values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright will clean up if we actually want to go forward with this. Sort of a prototype at the moment
return view.set_field( | ||
path, F(path)._function(function), _allow_missing=True | ||
) | ||
expr = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Defer to @benjaminpkane to evaluate the suitability of this new implementation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah it's not working in mongo6. Solution is getting really ugly ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think changing this to existence instead of counts could work. These counts are for the tag:count
grid bubble values but showing the distinct tags is sufficient imo if we are technically blocked (also a performance improvement for the grid, less computation)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I got a working version now I think. Check it out and lmk.
Of course if you want to change the grid behavior that would work too but that's up to you + your squad
well I have no idea what's going on then because I definitely tried Mongo 5. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
fiftyone/server/view.py (1)
950-970
: Verify tag values won’t produce illegal document keys
$arrayToObject
will fail if any tag contains.
or starts with$
. If such tags are possible, encode keys (e.g., replace.
with a safe char and track a reverse map) or return an array of{tag, count}
objects instead.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/server/view.py
(1 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
fiftyone/server/view.py (2)
fiftyone/core/collections.py (2)
view
(1059-1066)set_field
(6332-6441)fiftyone/core/expressions.py (1)
set_field
(2061-2123)
expr = { | ||
"$arrayToObject": { | ||
"$map": { | ||
"input": {"$setUnion": "$_label_tags"}, | ||
"as": "tag", | ||
"in": { | ||
"k": "$$tag", | ||
"v": { | ||
"$size": { | ||
"$filter": { | ||
"input": "$_label_tags", | ||
"as": "t", | ||
"cond": {"$eq": ["$$t", "$$tag"]}, | ||
} | ||
} | ||
}, | ||
}, | ||
} | ||
} | ||
} | ||
return view.set_field(path, expr, _allow_missing=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Critical: $setUnion invocation is invalid (and hard-codes the source path)
- $setUnion expects an array of arrays; passing a single array expression will error on MongoDB 6.x.
- The source is hard-coded to
$_label_tags
, ignoring thepath
param.
Fix both by referencing path
and unioning with []
:
def _count_list_items(path, view):
- expr = {
- "$arrayToObject": {
- "$map": {
- "input": {"$setUnion": "$_label_tags"},
- "as": "tag",
- "in": {
- "k": "$$tag",
- "v": {
- "$size": {
- "$filter": {
- "input": "$_label_tags",
- "as": "t",
- "cond": {"$eq": ["$$t", "$$tag"]},
- }
- }
- },
- },
- }
- }
- }
+ expr = {
+ "$arrayToObject": {
+ "$map": {
+ "input": {"$setUnion": [f"${path}", []]},
+ "as": "tag",
+ "in": {
+ "k": "$$tag",
+ "v": {
+ "$size": {
+ "$filter": {
+ "input": f"${path}",
+ "as": "t",
+ "cond": {"$eq": ["$$t", "$$tag"]},
+ }
+ }
+ },
+ },
+ }
+ }
+ }
return view.set_field(path, expr, _allow_missing=True)
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
expr = { | |
"$arrayToObject": { | |
"$map": { | |
"input": {"$setUnion": "$_label_tags"}, | |
"as": "tag", | |
"in": { | |
"k": "$$tag", | |
"v": { | |
"$size": { | |
"$filter": { | |
"input": "$_label_tags", | |
"as": "t", | |
"cond": {"$eq": ["$$t", "$$tag"]}, | |
} | |
} | |
}, | |
}, | |
} | |
} | |
} | |
return view.set_field(path, expr, _allow_missing=True) | |
expr = { | |
"$arrayToObject": { | |
"$map": { | |
"input": {"$setUnion": [f"${path}", []]}, | |
"as": "tag", | |
"in": { | |
"k": "$$tag", | |
"v": { | |
"$size": { | |
"$filter": { | |
"input": f"${path}", | |
"as": "t", | |
"cond": {"$eq": ["$$t", "$$tag"]}, | |
} | |
} | |
}, | |
}, | |
} | |
} | |
} | |
return view.set_field(path, expr, _allow_missing=True) |
🤖 Prompt for AI Agents
In fiftyone/server/view.py around lines 950 to 970, the aggregation expr
incorrectly calls $setUnion with a single array and hard-codes $_label_tags;
change it to reference the provided path and pass an array-of-arrays by unioning
the field with an empty array (e.g. use {"$setUnion": [f"${path}", []]}), and
ensure subsequent references to the tags use the same dynamic field expression;
keep returning view.set_field(path, expr, _allow_missing=True).
$function is deprecated. It's also slow and bad security (which is why it's deprecated).
What changes are proposed in this pull request?
ViewExpression.sort
to mongo native$sortArray
. This was added in 5.2 which wouldn't have been OK originally but is fine now since we require >=6.0.NOTE:
ViewExpression.sort
docstring examples do not work in any configuration I've tried, with code going back 4+ years. I've tried mongodb versions back to 4.2. Both local (Mac) and some deployed (Linux). Either the docstrings have been totally wrong forever, the function has been broken forever, or I'm doing something silly. Would love to know in any case.How is this patch tested? If it is not, please explain why.
App runs if you disable serverside JS 🤷🏼
Gemini says they're logically equivalent 🤷🏼 🤖
Letting unit tests run to see results of that.
I'm open to more tests to run? I don't really know what the server function populates (2)
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
Remove usage of deprecated $function MongoDB stage
What areas of FiftyOne does this PR affect?
fiftyone
Python library changesSummary by CodeRabbit
Refactor
Chores
Documentation