-
Notifications
You must be signed in to change notification settings - Fork 587
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adding label IDs in Segmentation Evaluation results #5323
Conversation
WalkthroughThe pull request modifies the segmentation evaluation functionality in the Changes
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Tip CodeRabbit's docstrings feature is now available as part of our Early Access Program! Simply use the command Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
Documentation and Community
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
fiftyone/utils/eval/segmentation.py (2)
352-352
: Potential memory overhead from accumulating matches
Storing all matches in a single global list (matches = []
) for the entire dataset could become memory-intensive for large collections. Consider storing matches on a per-sample or streaming basis if memory usage becomes an issue.
462-470
: Conditional parsing of matches
These lines correctly parse the custommatches
(with IDs) if provided, or fall back toself._parse_confusion_matrix(...)
otherwise. This design offers flexibility for capturing more granular information when needed. However, if your dataset is large, consider using a more memory-friendly approach (e.g., storing aggregated metrics or streaming).
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/utils/eval/segmentation.py
(5 hunks)
🔇 Additional comments (3)
fiftyone/utils/eval/segmentation.py (3)
431-431
: Passing matches to SegmentationResults
Keeping track of matches
via the constructor parameter is good for enabling more interactive confusion matrix features, especially if you plan to link confusion matrix cells to their respective segmentation IDs.
457-457
: New parameter for matches
Introducing matches=None
as a default parameter is a clean way to preserve backward compatibility and only store segmentation matches when available.
479-480
: Injecting IDs into the parent constructor
By passing ytrue_ids
and ypred_ids
, you ensure that the segmentation results can reference the original segmentation objects for more interactive features, such as clickable confusion matrices.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
fiftyone/utils/eval/segmentation.py (1)
459-459
: Add type hints and validation for matches parameterThe
matches
parameter should be documented and validated to ensure correct tuple structure.def __init__( self, samples, config, eval_key, pixel_confusion_matrix, classes, - matches=None, + matches: Optional[List[Tuple[str, str, int, str, str]]] = None, missing=None, backend=None, ): """ Args: ... matches: Optional list of tuples containing (true_class, pred_class, count, ground_truth_id, prediction_id) for each non-zero entry in the confusion matrix ... """ pixel_confusion_matrix = np.asarray(pixel_confusion_matrix) ytrue_ids = None ypred_ids = None if matches is not None: + # Validate matches structure + if not all(len(m) == 5 for m in matches): + raise ValueError("Each match must be a 5-tuple") ytrue, ypred, weights, ytrue_ids, ypred_ids = zip(*matches)Also applies to: 464-469
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/utils/eval/segmentation.py
(5 hunks)
🔇 Additional comments (2)
fiftyone/utils/eval/segmentation.py (2)
481-482
: LGTM: IDs properly passed to parent class
The implementation correctly passes the segmentation IDs to the parent class, enabling the click-through functionality in confusion matrices.
393-396
: 🛠️ Refactor suggestion
Consider optimizing matches collection for multi-frame samples
The current implementation appends matches for each image within the sample loop, which could lead to redundant entries for multi-frame samples. Consider aggregating matches at the sample level instead.
- non_zero_indexes = np.nonzero(image_conf_mat)
- for index in zip(*non_zero_indexes):
- matches.append((classes[index[0]], classes[index[1]], image_conf_mat[index[0], index[1]], gt_seg.id, pred_seg.id))
+ sample_conf_mat += image_conf_mat
+
+ # Collect matches after processing all frames in the sample
+ non_zero_indexes = np.nonzero(sample_conf_mat)
+ for index in zip(*non_zero_indexes):
+ matches.append((classes[index[0]], classes[index[1]], sample_conf_mat[index[0], index[1]], gt_seg.id, pred_seg.id))
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (1)
fiftyone/utils/eval/segmentation.py (1)
393-396
:⚠️ Potential issueMove matches collection outside the image loop to prevent duplicates
The matches are currently being collected for each image within a sample, which could lead to duplicate entries for multi-frame samples. This issue was previously identified and marked as fixed, but appears to still be present.
Apply this diff to fix the issue:
- non_zero_indexes = np.nonzero(image_conf_mat) - for index in zip(*non_zero_indexes): - matches.append((classes[index[0]], classes[index[1]], image_conf_mat[index[0], index[1]], gt_seg.id, pred_seg.id)) + # After processing all images in the sample + non_zero_indexes = np.nonzero(sample_conf_mat) + for index in zip(*non_zero_indexes): + matches.append((classes[index[0]], classes[index[1]], sample_conf_mat[index[0], index[1]], gt_seg.id, pred_seg.id))
🧹 Nitpick comments (2)
fiftyone/utils/eval/segmentation.py (2)
465-473
: Add type checking for matches unpackingThe matches unpacking logic could be more robust with type checking to prevent runtime errors if the matches data is malformed.
Apply this diff to add type checking:
if matches is not None: + if not isinstance(matches, list) or not all(isinstance(m, tuple) and len(m) == 5 for m in matches): + raise ValueError("Expected matches to be a list of 5-element tuples") ytrue, ypred, weights, ytrue_ids, ypred_ids = zip(*matches) else: ytrue, ypred, weights = self._parse_confusion_matrix( pixel_confusion_matrix, classes )
460-460
: Update docstring to document the new matches parameterThe
matches
parameter should be documented in the class docstring to maintain API documentation completeness.Add this to the docstring:
classes: a list of class labels corresponding to the confusion matrix + matches (None): optional list of tuples containing (true_label, pred_label, count, true_id, pred_id) missing (None): a missing (background) class backend (None): a :class:`SegmentationEvaluation` backend
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/utils/eval/segmentation.py
(5 hunks)
🔇 Additional comments (1)
fiftyone/utils/eval/segmentation.py (1)
426-427
: LGTM: Clean integration with existing functionality
The handling of empty matches and the integration with the parent class constructor is well implemented.
Also applies to: 482-483
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
fiftyone/utils/eval/segmentation.py (4)
353-353
: Consider memory optimization for large datasetsThe unbounded growth of the
matches
list could lead to memory issues when processing large datasets. Consider implementing batch processing or streaming the results.Also applies to: 394-397
465-473
: Add input validation for matches parameterConsider adding validation to ensure that when
matches
is provided, all tuples have the expected length and types.Add this validation before the unpacking:
if matches is not None: + if not all(len(m) == 5 for m in matches): + raise ValueError("All matches must be 5-tuples") ytrue, ypred, weights, ytrue_ids, ypred_ids = zip(*matches)
501-514
: Improve code readability with helper functionConsider extracting the ID handling logic into a helper function to improve readability and maintainability.
+ @staticmethod + def _get_ids_or_none(ids): + return ids if ids is not None else itertools.repeat(None) @classmethod def _from_dict(cls, d, samples, config, eval_key, **kwargs): ytrue = d["ytrue"] ypred = d["ypred"] weights = d["weights"] - ytrue_ids = d.get("ytrue_ids", None) - if ytrue_ids is None: - ytrue_ids = itertools.repeat(None) - - ypred_ids = d.get("ypred_ids", None) - if ypred_ids is None: - ypred_ids = itertools.repeat(None) + ytrue_ids = cls._get_ids_or_none(d.get("ytrue_ids")) + ypred_ids = cls._get_ids_or_none(d.get("ypred_ids")) matches = list(zip(ytrue, ypred, weights, ytrue_ids, ypred_ids))
434-434
: Add docstring for the matches parameterSince this is a user-facing change that enables click-through behavior, consider adding documentation for the
matches
parameter in the class docstring."""Class that stores the results of a segmentation evaluation. Args: samples: the :class:`fiftyone.core.collections.SampleCollection` used config: the :class:`SegmentationEvaluationConfig` used eval_key: the evaluation key pixel_confusion_matrix: a pixel value confusion matrix classes: a list of class labels corresponding to the confusion matrix + matches: a list of tuples containing (true_class, pred_class, count, gt_id, pred_id) + that enables click-through behavior in confusion matrices missing (None): a missing (background) class backend (None): a :class:`SegmentationEvaluation` backend """
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/utils/eval/segmentation.py
(7 hunks)
🔇 Additional comments (1)
fiftyone/utils/eval/segmentation.py (1)
394-400
:
Risk of duplicate entries in multi-frame samples
The matches are being appended inside the image loop, which could lead to duplicate entries when processing multi-frame samples. Consider aggregating matches at the sample level instead.
Apply this diff to fix the issue:
sample_conf_mat = np.zeros((nc, nc), dtype=int)
for image in images:
image_conf_mat = _compute_pixel_confusion_matrix(...)
sample_conf_mat += image_conf_mat
- non_zero_indexes = np.nonzero(image_conf_mat)
- for index in zip(*non_zero_indexes):
- matches.append((classes[index[0]], classes[index[1]], image_conf_mat[index[0], index[1]], gt_seg.id, pred_seg.id))
# After processing all frames
+non_zero_indexes = np.nonzero(sample_conf_mat)
+for index in zip(*non_zero_indexes):
+ matches.append((classes[index[0]], classes[index[1]], sample_conf_mat[index[0], index[1]], gt_seg.id, pred_seg.id))
Likely invalid or redundant comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
fiftyone/utils/eval/segmentation.py (3)
394-397
: Consider batch processing for large datasetsWhile the logic is correct, appending to the matches list for each non-zero element in every image's confusion matrix could lead to memory growth with large datasets. Consider batch processing or streaming the results.
- non_zero_indexes = np.nonzero(image_conf_mat) - for index in zip(*non_zero_indexes): - matches.append((classes[index[0]], classes[index[1]], image_conf_mat[index[0], index[1]], gt_seg.id, pred_seg.id)) + # Process matches in batches + non_zero_indexes = np.nonzero(image_conf_mat) + batch_matches = [(classes[i], classes[j], image_conf_mat[i, j], gt_seg.id, pred_seg.id) + for i, j in zip(*non_zero_indexes)] + matches.extend(batch_matches)
466-474
: Handle empty matches list edge caseThe code handles both cases (with and without matches) but could be more explicit about empty list handling. Consider adding a specific condition for empty matches.
- if matches is not None: + if matches and matches is not None: # Check for non-empty list ytrue, ypred, weights, ytrue_ids, ypred_ids = zip(*matches) else: ytrue, ypred, weights = self._parse_confusion_matrix(Also applies to: 483-484
502-514
: Consider memory-efficient deserialization for large datasetsThe deserialization logic is correct but could be memory-intensive for large datasets as it creates a list of all matches in memory. Consider implementing a generator-based approach for large datasets.
- matches = list(zip(ytrue, ypred, weights, ytrue_ids, ypred_ids)) + # For large datasets, consider using a generator + matches = zip(ytrue, ypred, weights, ytrue_ids, ypred_ids) + if not config.stream_matches: # Add a config option for streaming + matches = list(matches)
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/utils/eval/segmentation.py
(8 hunks)
🔇 Additional comments (2)
fiftyone/utils/eval/segmentation.py (2)
353-353
: LGTM: Matches list initialization
The initialization of the matches
list is clean and appropriate for storing the segmentation match details.
449-449
: LGTM: Constructor parameter addition
The new matches
parameter is well-documented with clear type hints in the docstring. The optional nature of the parameter maintains backward compatibility.
Also applies to: 461-463
7c0799b
to
56eb28c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
fiftyone/utils/eval/segmentation.py (1)
502-514
: Consider adding type validation for IDs.While the implementation handles missing IDs well, consider adding type validation for the cases where IDs are provided to ensure they match the expected format.
ytrue_ids = d.get("ytrue_ids", None) if ytrue_ids is None: ytrue_ids = itertools.repeat(None) +else: + if not isinstance(ytrue_ids, (list, tuple)): + raise ValueError("ytrue_ids must be a list or tuple") ypred_ids = d.get("ypred_ids", None) if ypred_ids is None: ypred_ids = itertools.repeat(None) +else: + if not isinstance(ypred_ids, (list, tuple)): + raise ValueError("ypred_ids must be a list or tuple")
📜 Review details
Configuration used: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
fiftyone/utils/eval/segmentation.py
(8 hunks)
🔇 Additional comments (2)
fiftyone/utils/eval/segmentation.py (2)
Line range hint 449-463
: LGTM! Constructor changes look good.
The changes to the constructor properly handle both cases (with and without matches) and maintain backward compatibility.
395-397
: Verify the match collection logic to prevent double-counting.
The match collection happens inside the image loop. Let's verify that this doesn't lead to duplicate entries when processing multiple frames.
✅ Verification successful
No double-counting issue in match collection logic
Based on the code analysis:
matches
list is initialized once at the beginning of the evaluation process- Matches are collected within the confusion matrix computation loop, where each match represents a unique intersection between ground truth and predicted segmentation masks
- The collection happens after computing the pixel confusion matrix for each image, ensuring each match is recorded exactly once
- The matches are appended with unique IDs from both ground truth and predicted segmentations
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for potential duplicate entries in match collection
# Look for patterns where matches are collected within frame loops
ast-grep --pattern 'for $frame in $frames:
$$$
matches.append($$$)
$$$'
Length of output: 79
Script:
#!/bin/bash
# Let's check the context around where matches are collected
rg -B 10 -A 5 "matches\.append.*segmentation" fiftyone/utils/eval/segmentation.py
Length of output: 83
Script:
#!/bin/bash
# Let's look at the broader context of matches collection
rg -B 10 -A 5 "matches\.append.*id" fiftyone/utils/eval/segmentation.py
Length of output: 764
Script:
#!/bin/bash
# Let's check how matches are initialized and where they're used
rg -B 5 -A 5 "matches\s*=\s*\[\]" fiftyone/utils/eval/segmentation.py
Length of output: 480
ytrue_ids = None | ||
ypred_ids = None | ||
|
||
if matches is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@prernadh mmm this an appealing way to store the data because it means _parse_confusion_matrix
doesn't need to be called when loading results 🧠
Prior to this refactor and before you created #5330, I had forked your initial implementation with dicts and implemented Model Evaluation panel callbacks in #5331.
As I mentioned on that PR, there's a few implementation details we probably aught to benchmark to make sure the panel is as performant as possible, but I could definitely see us using the data model here 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, what were you thinking in terms of benchmarking @brimoor ?
Subsumed by #5332 . Closing this PR |
What changes are proposed in this pull request?
Adding Segmentation Ids in Segementation Evaluation results fixes the click through behavior when you attach plots to sessions in FiftyOne Notebooks.
How is this patch tested? If it is not, please explain why.
Manually tested
Release Notes
Is this a user-facing change that should be mentioned in the release notes?
notes for FiftyOne users.
Adds in click-through capabilities for segmentation evaluation confusion matrices.
(Details in 1-2 sentences. You can just refer to another PR with a description
if this PR is part of a larger change.)
What areas of FiftyOne does this PR affect?
fiftyone
Python library changesCode to test-