Skip to content

Reconsider the dispatching of trace_object: filling table cells #1325

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wks opened this issue May 21, 2025 · 2 comments
Open

Reconsider the dispatching of trace_object: filling table cells #1325

wks opened this issue May 21, 2025 · 2 comments

Comments

@wks
Copy link
Collaborator

wks commented May 21, 2025

On today's meeting, we discussed the need to refactor the work packets, ProcessEdgesWork, trace_object, and other derived issues. We also mentioned that all previous discussions about this topic usually ends with the proposal of implementing work packet specialization, i.e. creating one work packet for multiple objects in the same space.

Combined with the fact that ProcessEdgesWork comes with multiple flavors (which should not happen), including GenNurseryProcessEdges, PlanProcessEdges, SFTProcessEdges, or any custom plan-specific XxxxProcessEdges, I think there is another way to view this problem.

We make one table for each plan. Each row is a space, and each column is a kind of trace.

Let's take GenImmix as example.

Space Nursery GC Full GC (non-defrag) Full GC (defrag)
nursery (CopySpace) nursery.trace_object nursery.trace_object nursery.trace_object
immix space do nothing immix_space.trace_object::<FAST> immix_space.trace_object::<DEFRAG>
immortal space do nothing immortal.trace_object immortal.trace_object
LOS los.trace_object_nursery los.trace_object los.trace_object

For GenCopy, it will be

Space Nursery GC Full GC
nursery (CopySpace) nursery.trace_object nursery.trace_object
copyspace0 do nothing copyspace0.trace_object
copyspace1 do nothing copyspace1.trace_object
immortal space do nothing immortal.trace_object
LOS los.trace_object_nursery los.trace_object

Tracing an object inevitably involves dispatching the trace_object call into one of the table cells. That is fundamentally what GenNurseryProcessEdges, PlanProcessEdges, SFTProcessEdges actually implemented, i.e. methods of dispatching.

  • SFTProcessEdges work for plans (such as GenCopy) where no spaces have specialized trace_object methods. It simply does a SFT::sft_trace_object virtual call.
  • PlanProcessEdges is just an automated way to write chained if-else-if-else... to call the right trace_object::<KIND>(obj) of the right space. It is still there because some spaces have generic trace_object<KIND> methods.
  • GenNurseryProcessEdges fits in when neither of the above works. It selectively skips some spaces.

And the proposed work packet specialization is one alternative to all of the above. It will do the dispatch by selecting a queue before enqueuing, rather than selecting the trace_object method when tracing an object (or more precisely, an object graph edge pointing to that object).

What's missing?

There is currently no object (or type) in mmtk-core that represents a single cell above.

What does ProcessEdgesWork do?

Each ProcessEdgesWork instance represents one column, and that's why it needs a dispatch of some form.

  • SFTProcessEdges selects the row (space) by virtual dispatch in the SFTMap.
  • PlanProcessEdges selects the row (space) by if-else-if-else-if-else-if-else-if-else-... and space.in_space.
  • GenNurseryProcessEdges selects some rows (spaces) and ignores others, using if-else-if-else, but not that many (at most two spaces, the nursery and the LOS).

p.s. Actually SFTProcessEdges selects a cell because it dispatches to a concrete method. Consequently, SFTProcessEdges can be used for plans that either

  • has only one column in the table (e.g. SemiSpace, MarkSweep), or
  • the plan only uses SFTProcessEdges to cover one column, and uses something else (e.g. GenNurseryProcessEdges) to cover other columns. (e.g mature collection of GenCopy where nursery collection is handled by GenNurseryProcessEdges)

What does trace_object<KIND> do?

Each SomeSpace::trace_object<KIND> attempts to handle one row. For example, ImmixSpace::trace_object<KIND> where KIND can be FAST or DEFRAG corresponds to the full-heap collections without or with defrag. The current implementation of KIND doesn't cover nursery collection, however.

What will the thing that represents a cell look like?

It may be a trait, like:

trait TraceObject {
    /// Trace `object`, and return the new address that the (object graph) edge should be updated to point to.
    /// During execution, it maybe enqueue one object into the `queue`,
    /// which is either the old address (mark compact) or the new address (evacuating collector.
    fn trace_object<Q: FnMut(ObjectReference)>(&self, object: ObjectReference, queue: &mut Q) -> ObjectReference;
}

Given that <Q> makes the method generic, we may get rid of it by writting it slightly:

struct TraceObjectResult {
    forwarded: ObjectReference,
    enqueued: Option<ObjectReference>,
};

trait TraceObject {
    fn trace_object(&self, object: ObjectReference) -> TraceObjectResult;
}

This does not use generic parameters, making it eligible for &dyn. But according to a previous study (#559), this approach has a small but noticeable performance impact compared to the generic <Q: FnMut(ObjecctReference)>. Maybe that was due to the absence of PGO (that was 2022, and we were still strugging with #[inline] at that time).

Then each space will implement some TraceObject. For example,

  • CopySpaceTraceObject
  • ImmixSpaceFastTraceObject
  • ImmixSpaceDefragTraceObject
  • ImmixSpaceNurseryTraceObject (for StickyImmix)
  • LOSTraceObject<IS_NURSERY> (Concrete structs with generic args can implement TraceObject without type args, like impl<IS_NURSERY: bool> TraceObject for LOSTraceObject<IS_NURSERY> { ... })

Then each plan, as a "declarative description of a (spatial and temporal) composition of policies", just fills in the table. I don't know exactly how, but we may introduce some dispatchers (using if-else, or SFT, or manually written function, or some declarative way, or work-packet specialization) to select the TraceObject implementation according to:

  • what space the object is in, and
  • what kind of tracing is currently running.

What about node/slot enqueuing?

Note that the TraceObject trait has no knowledge of slot enqueuing or object enqueuing. We still need to compose that trait into object-enqueuing or slot-enqueuing work packets.

In the following example, to make things simple we assume there is a dispatching_tracer that implements the same interface as TraceObject but automagically dispatches it to the right space (using SFT, if-else, etc.). That's just what we are currently doing in SFTProcessEdges, PlanProcessEdges, etc. If we do work packet specialization, we just make it a non-dispatching tracer.

struct NodeEnqueuingTrace<T: TraceObject> {
    dispatching_tracer: T,
    objects: Vec<ObjectReference>,
}

impl<T: TraceObject> ObjectEnqueuingTrace<T> {
    fn process_nodes(&mut self) {
        while !self.objects.is_empty() {
            let src = self.objects.pop();
            Scanning::scan_object_and_trace_edges(src, |target| {
                let TraceObjectResult{ forwarded, enqueued } = self.dispatching_tracer.trace_object(object);
                if let Some(enq) = enqueued { self.objects.push(enq); }
                forwarded // the binding will assign this to the field.
            });
        }
    }
}

struct SlotEnqueuingTrace<T: TraceObject> {
    dispatching_tracer: T,
    slots: Vec<Slot>,
}

impl<T: TraceObject> SlotEnqueuingTrace<T> {
    fn process_slots(&mut self) {
        while !self.slots.is_empty() {
            let slot = self.slots.pop();
            if let Some(target) = slot.load() {
                let TraceObjectResult{ forwarded, enqueued } = self.dispatching_tracer.trace_object(object);
                slot.store(forwarded);
                if let Some(enq) = enqueued {
                    Scanning::scan_object(enq, |slot| {
                        self.slots.push(slot);
                    }
                }
            }
        }
    }
}

Then if we do work packet specialization, the plan can instantiate SlotEnqueuingTrace<ImmixSpaceNurseryTraceObject>, SlotEnqueuingTrace<ImmixSpaceNonDefragTraceObject>, SlotEnqueuingTrace<ImmixSpaceDefragTraceObject>, SlotEnqueuingTrace<CopySpaceNurseryTraceObject>, SlotEnqueuingTrace<DontTraceObject>, etc., and NodeEnqueuingTrace, NodeEnqueuingTrace`, etc., but needs to rewrite the code a little to select the right queue to enqueue into.

And if we don't want to do work packet specialization, we instantiate SlotEnqueuingTrace<GenCopySFTMatureTraceObject> which automatically dispatches to the right space in a mature collection in GenCopy.

The point is, the enqueuing mechanism is decoupled from the tracing and the dispatching. The plan simply fills in the <T> in NodeEnqueuingTrace<T> and SlotEnqueuingTrace<T>, and the VM binding may choose whether to use NodeEnqueuingTrace or SlotEnqueuingTrace (e.g. CRuby must use node enqueuing).

Related issus

There is an old PR: #1278 which introduced what I then called TracingDelegate, which is in principle like the TraceObject trait, but was intended to abstract over the dispatching mechanism in SFT and if-else-if-else...

Yi made #1314. It tried to solve the problem by standardizing TraceKind over all spaces and plans, which effectively makes every Space::trace_object<KIND> dispatch over a whole row regardless of plan. My proposal is a bit different. Each implementation of trait TraceObject fills one individual cells.

@k-sareen
Copy link
Collaborator

We make one table for each plan. Each row is a space, and each column is a kind of trace.

We need to be careful when we reason about this "table". Consider the SemiSpace GC that supports Android, specifically the ZygoteSpace. Since the ZygoteSpace is a wrapper around an ImmixSpace, we now need to be able to pick the correct ImmixSpace trace we need when a GC occurs before the Zygote is frozen. However, after the Zygote has been frozen, we turn into a standard SemiSpace collector.

I would like if we kept things like the above in mind when we discuss a refactor since it's important to be extensible enough to be able to implement support for various other VMs as well.

@wks
Copy link
Collaborator Author

wks commented May 23, 2025

... Since the ZygoteSpace is a wrapper around an ImmixSpace, we now need to be able to pick the correct ImmixSpace trace we need when a GC occurs before the Zygote is frozen. However, after the Zygote has been frozen, we turn into a standard SemiSpace collector.

I think this means the table can be mutable, at least when certain key events happen, such as freezing the zygote.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants