Description
- This issue is partially addressed in Custom tracing unit/packet #1137
- This issue is important for merging the lxr branch into master, but can be addressed separately before merging lxr.
An object graph consists of GC objects as nodes, and edges between nodes or from roots to nodes. But in some VMs, there may be nodes that are not objects. In #1137, we mentioned malloc buffers in CRuby. Malloc-allocated buffers are attached to (owned by) GC objects. They may contain references and therefore must be scanned, too. In the current mmtk-ruby binding, such malloc buffers are reclaimed using finalizers when their owner (GC) objects die.
But there are also native things that are not uniquely owned by GC objects, which complicates the matter.
"Claiming" in OpenJDK
In OpenJDK, there are off-heap native objects, such as ClassLoaderData. Those objects are not managed by GC, and are not uniquely owned by any GC objects. For ClassLoaderData, the pointer path is object -> klass -> class_loader_data. Each ClassLoaderData instance has a _claimed field so that during GC (or other activities), when multiple threads reach a ClassLoaderData instance via different paths, the thread that atomically "claims" the instance will scan it. The _claimed field is similar to the mark bit of GC objects, and is used to prevent re-scanning the same instance again and again.
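The claiming pattern described above can be sketched in Rust roughly as follows. This is only an illustration, not OpenJDK's or mmtk-core's code: OffHeapNode, try_claim, clear_claim and scan_if_claimed are hypothetical names, and a single boolean flag is a simplification of the real _claimed field.

```rust
use std::sync::atomic::{AtomicBool, Ordering};

/// Hypothetical stand-in for an off-heap native node such as ClassLoaderData.
pub struct OffHeapNode {
    /// Plays the role of the `_claimed` field (simplified to one boolean).
    claimed: AtomicBool,
    /// Illustrative slot addresses that hold references into the GC heap.
    pub slots: Vec<usize>,
}

impl OffHeapNode {
    pub fn new(slots: Vec<usize>) -> Self {
        Self { claimed: AtomicBool::new(false), slots }
    }

    /// Atomically try to claim the node. Returns true for exactly one caller
    /// until the flag is cleared again.
    pub fn try_claim(&self) -> bool {
        self.claimed
            .compare_exchange(false, true, Ordering::AcqRel, Ordering::Acquire)
            .is_ok()
    }

    /// Reset the flag, e.g. before the next transitive closure starts.
    pub fn clear_claim(&self) {
        self.claimed.store(false, Ordering::Release);
    }
}

/// Visit the node's slots only if this caller wins the claim.
/// Returns the number of slots visited (0 if another caller claimed it first).
pub fn scan_if_claimed(node: &OffHeapNode, mut visit: impl FnMut(usize)) -> usize {
    if !node.try_claim() {
        return 0;
    }
    for &slot in &node.slots {
        visit(slot);
    }
    node.slots.len()
}
```

Whichever thread wins the compare-exchange scans the node; every other thread that reaches it through a different path skips it, much like testing and setting a mark bit.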
In addition to ClassLoaderData, there are other types that use similar "claiming" patterns. For example, the nmethod::_oops_do_mark_link is linked when "claimed".
The problem
From mmtk-core's point of view, those instances are off-heap. Currently, when scanning an object, the lxr branch also claims and scans the ClassLoaderData instance reached via object->klass->class_loader_data. To mmtk-core, the object therefore appears to have many outgoing edges, including those that actually belong to the ClassLoaderData. This behavior is controlled by a property of the SlotVisitor in Scanning::scan_object(object, slot_visitor). This is confusing from an API design point of view because
- The ClassLoaderData is Java-specific, and cannot be extended to other bindings.
- The outgoing edges from the ClassLoaderData are not really part of the object to be scanned, and those edges can be reachable from other objects, too.
- "Claiming" a ClassLoaderData instance has side effects. It is only valid in a certain context (such as a transitive closure, before which we do initializations such as clearing the _claimed fields of all instances). It is inappropriate for a general Scanning::scan_object API function, which is supposed to have no side effects beyond returning the Slot instances for updating. It is also inappropriate for Scanning::scan_object_and_trace_edges, which is only supposed to have the side effect of updating outgoing slots.
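To make the side-effect concern concrete, here is a minimal single-threaded sketch. It is not mmtk-core's Scanning::scan_object; SharedNode, Object, scan_object_with_claiming and collect_slots are hypothetical names that only mimic the behavior described above, using a Cell in place of an atomic flag.

```rust
use std::cell::Cell;

/// Hypothetical off-heap node shared by many objects (think ClassLoaderData).
struct SharedNode {
    claimed: Cell<bool>,
    slots: Vec<usize>,
}

/// Hypothetical heap object: its own fields plus a pointer to shared metadata.
struct Object<'a> {
    fields: Vec<usize>,
    metadata: &'a SharedNode,
}

/// Scan function that folds the shared node's slots into the object's edges.
/// The claim makes the result depend on what has been scanned before.
fn scan_object_with_claiming(object: &Object, visit: &mut dyn FnMut(usize)) {
    for &slot in &object.fields {
        visit(slot);
    }
    // `replace` returns the previous value, so this branch runs only once
    // per shared node until the flag is cleared.
    if !object.metadata.claimed.replace(true) {
        for &slot in &object.metadata.slots {
            visit(slot);
        }
    }
}

/// Helper that collects every slot reported for one object.
fn collect_slots(object: &Object) -> Vec<usize> {
    let mut out = Vec::new();
    scan_object_with_claiming(object, &mut |slot: usize| out.push(slot));
    out
}
```

In this sketch, scanning the same object twice yields two different slot lists, and a second object sharing the same metadata never reports the shared slots at all. That is fine inside a single transitive closure, but surprising for a caller that simply wants to enumerate an object's slots.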
Related work
In #1137, we mentioned creating special work packets for off-heap non-object data structures. This thought may be applicable in this case.
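One way to picture that idea is to treat off-heap structures as first-class nodes with their own unit of work, instead of folding them into object scanning. The sketch below is an assumption-laden toy, not the design proposed in #1137: Node, NodeScanner, ToyGraph and transitive_closure are hypothetical names, and a plain worklist stands in for MMTk's work packets.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical node kinds: heap objects and off-heap structures, by id.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
pub enum Node {
    Object(usize),
    OffHeap(usize),
}

/// Hypothetical binding-side callback that reports the successors of a node.
pub trait NodeScanner {
    fn successors(&self, node: Node) -> Vec<Node>;
}

/// Toy graph used only to exercise the sketch.
pub struct ToyGraph(pub HashMap<Node, Vec<Node>>);

impl NodeScanner for ToyGraph {
    fn successors(&self, node: Node) -> Vec<Node> {
        self.0.get(&node).cloned().unwrap_or_default()
    }
}

/// Worklist-based transitive closure over both kinds of nodes.
/// The visited set plays the role of both mark bits and claim flags,
/// so scanning a node needs no side effects of its own.
pub fn transitive_closure(roots: &[Node], scanner: &dyn NodeScanner) -> Vec<Node> {
    let mut visited = HashSet::new();
    let mut order = Vec::new();
    let mut worklist: Vec<Node> = roots.to_vec();
    while let Some(node) = worklist.pop() {
        if !visited.insert(node) {
            continue; // already processed via another path
        }
        order.push(node);
        worklist.extend(scanner.successors(node));
    }
    order
}
```

Here an off-heap node reachable from several objects is processed once, and deduplication is owned by the closure rather than by the scanning function.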
The pull request #1437 adds a parameter RefScanPolicy to Scanning::scan_object{,_and_trace_edges} to control the behavior of object scanning, and also explicitly allows the side effect of "discovering references" in VM-specific ways. In the lxr branch, claiming and scanning of ClassLoaderData are also controlled by similar flags. This makes us wonder whether the parameter should be more general. For example, should we expose the intention that "this object scanning operation is done for transitive closure" to the VM binding? More precisely, with such a parameter, the operation is no longer merely "scanning an object", but actually "processing a node", which the old TransitiveClosure type was supposed to implement but did not (see this blog). Should we then have something like "process node" in our API? I hesitate to do so because that exposes too much of the MMTk internals to the VM binding, and blurs the boundary of the MMTk-Binding API.
The bottom line
The bottom line is, the API should be VM-neutral, but at least powerful enough to allow OpenJDK to implement class unloading.