22 changes: 16 additions & 6 deletions fiftyone/core/models.py
```diff
@@ -485,8 +485,17 @@ def _apply_image_model_data_loader(
                 warning="Async failure labeling batches",
             )
         )
+        output_processor = getattr(
+            model, "_output_processor", lambda output, *_, **__: output
+        )
+        context.enter_context(fou.SetAttributes(model, _output_processor=None))
```

Comment on lines +488 to 492
Contributor

🛠️ Refactor suggestion

Avoid mutating models that don’t define _output_processor; guard and fall back cleanly

Setting a private attribute on all models can change behavior inadvertently. Only disable _output_processor when it exists; otherwise use a no-op postprocessor.

```diff
-        output_processor = getattr(
-            model, "_output_processor", lambda output, *_, **__: output
-        )
-        context.enter_context(fou.SetAttributes(model, _output_processor=None))
+        output_processor = getattr(model, "_output_processor", None)
+        if output_processor is None:
+            def output_processor(output, *_, **__):
+                return output
+        else:
+            context.enter_context(
+                fou.SetAttributes(model, _output_processor=None)
+            )
```
🤖 Prompt for AI Agents
In fiftyone/core/models.py around lines 488 to 492, the code unconditionally
sets a private attribute _output_processor on every model which can mutate
models that don’t define it; change this so you only disable the processor when
it actually exists: check hasattr(model, "_output_processor") first, set
output_processor to model._output_processor if present else to a no-op lambda,
and only call context.enter_context(fou.SetAttributes(model,
_output_processor=None)) when the attribute exists so models without the
attribute are left untouched.
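
To make the hazard concrete: a `getattr` with a lambda default plus `SetAttributes(model, _output_processor=None)` creates `_output_processor` on instances that never defined it, and a naive save/restore leaves it behind. A minimal sketch of that failure mode, where `set_attributes` is a hypothetical stand-in for `fou.SetAttributes`, not FiftyOne's actual implementation:

```python
import contextlib

class BareModel:
    """A model class that never defines ``_output_processor``."""

@contextlib.contextmanager
def set_attributes(obj, **attrs):
    # Hypothetical stand-in for fou.SetAttributes: set attributes on
    # entry and restore the captured values on exit
    originals = {k: getattr(obj, k, None) for k in attrs}
    for k, v in attrs.items():
        setattr(obj, k, v)
    try:
        yield
    finally:
        for k, v in originals.items():
            setattr(obj, k, v)

model = BareModel()
with set_attributes(model, _output_processor=None):
    pass

# The attribute now exists on the instance even though the class never
# declared it; this is the inadvertent mutation the guard above avoids
print(hasattr(model, "_output_processor"))  # True
```

Guarding on the attribute's existence first leaves such models untouched.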

```diff
-        def save_batch(sample_batch, labels_batch):
+        def save_batch(sample_batch, output, image_sizes):
+            labels_batch = output_processor(
```
Contributor
Is it actually more efficient to do this in a thread? How does the size of the output vectors compare to the final predicted objects?

@exupero (Contributor Author), Sep 3, 2025
It's not more efficient; it's more concurrent. Moving this to a background thread allows one batch's output to be processed while the next batch's inputs are being preprocessed and run. I.e.

(not to scale)

```mermaid
gantt
	title Async postprocessing
	dateFormat X
	axisFormat %s

	section Main thread
	Preprocess :active, preprocess, 0, 1
	Predict :active, predict, 1, 2
	Preprocess :active, preprocess, 2, 3
	Predict :active, predict, 3, 4

	section Background thread
	Postprocess :active, postprocess, 2, 3
	Write :active, write, 3, 4
	Postprocess :active, postprocess, 4, 5
	Write :active, write, 5, 6
```

With the test script, output is 4-5x larger (in bytes) than labels_batch, but since this runs in a thread, memory is shared, so it shouldn't incur any greater memory usage than the previous version.
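
For readers following along, here is a minimal sketch of the overlap being described, using `concurrent.futures` directly; `run_inference` and `postprocess_and_write` are illustrative stand-ins, not FiftyOne's API:

```python
import time
from concurrent.futures import ThreadPoolExecutor

def run_inference(batch):
    time.sleep(0.1)  # stand-in for the GPU-bound predict_all() call
    return [x * 2 for x in batch]

def postprocess_and_write(batch, output):
    time.sleep(0.1)  # stand-in for output_processor() + label writes
    print(f"wrote {len(output)} labels for batch {batch}")

batches = [[1, 2], [3, 4], [5, 6]]

with ThreadPoolExecutor(max_workers=1) as executor:
    pending = None
    for batch in batches:
        output = run_inference(batch)  # main thread: preprocess + predict
        if pending is not None:
            pending.result()  # surface errors from the previous batch
        # overlap: batch N is postprocessed while batch N+1 runs inference
        pending = executor.submit(postprocess_and_write, batch, output)
    if pending is not None:
        pending.result()
```

With `max_workers=1` the writes stay ordered, and because threads share memory the larger output tensors are handed to the worker by reference rather than copied.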

```diff
+                output,
+                image_sizes,
+                confidence_thresh=model.config.confidence_thresh,
+            )
             with _handle_batch_error(skip_failures, sample_batch):
                 for sample, labels in zip(sample_batch, labels_batch):
                     if filename_maker is not None:
```
```diff
@@ -507,14 +516,15 @@ def save_batch(sample_batch, labels_batch):
                 if isinstance(imgs, Exception):
                     raise imgs

+                image_sizes = imgs.pop("fo_image_size", [(None, None)])
```
Contributor
fo_image_size is unique to FiftyOneTransformer, so it can't be referenced in fiftyone.core.models.
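
For illustration only (not what this PR does): if sizes had to be recovered without a transformer-specific key, a generic fallback could read them from the samples' media, e.g.:

```python
from PIL import Image

def get_image_sizes(sample_batch):
    # Generic fallback: read (width, height) from each sample's media
    # rather than relying on a FiftyOneTransformer-only "fo_image_size" key
    sizes = []
    for sample in sample_batch:
        # PIL reads the header lazily, so this does not decode full pixels
        with Image.open(sample.filepath) as img:
            sizes.append(img.size)
    return sizes
```

In practice the data loader already has these sizes in hand, so threading them through its output avoids a second read from disk.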

```diff
                 if needs_samples:
-                    labels_batch = model.predict_all(
-                        imgs, samples=sample_batch
-                    )
+                    output = model.predict_all(imgs, samples=sample_batch)
                 else:
-                    labels_batch = model.predict_all(imgs)
+                    output = model.predict_all(imgs)

-                submit(save_batch, sample_batch, labels_batch)
+                submit(
+                    save_batch, sample_batch, output, image_sizes=image_sizes
+                )

                 pb.update(len(sample_batch))
```