Decouple reference processor from `ProcessEdgesWork`
TL;DR: The reference processor is tightly coupled with ProcessEdgesWork work packets, making it impossible to support runtimes that can only do object-enqueuing. We can make it general.
Problem
Currently, reference-processing work packets {Soft,Weak,Phantom}RefProcessing take ProcessEdgesWork as a parameter. For example:
    pub struct SoftRefProcessing<E: ProcessEdgesWork>(PhantomData<E>);

    impl<E: ProcessEdgesWork> GCWork<E::VM> for SoftRefProcessing<E> {
        fn do_work(&mut self, worker: &mut GCWorker<E::VM>, mmtk: &'static MMTK<E::VM>) {
            let mut w = E::new(vec![], false, mmtk);
            w.set_worker(worker);
            mmtk.reference_processors.scan_soft_refs(&mut w, mmtk);
            w.flush();
        }
    }
Judging from this usage pattern, it instantiates E, passes it to scan_soft_refs, and calls .flush() on it. It never calls w.do_work, which indicates that ProcessEdgesWork is not a sub-task of SoftRefProcessing. Instead, SoftRefProcessing is just borrowing some functionality from ProcessEdgesWork!
Analysis
It uses the E: ProcessEdgesWork type for three purposes:
- Use E::trace_object to trace objects. Traced soft/weak/phantom references will become softly/weakly/phantomly reachable.
- Use E as an object queue. E::trace_object takes a queue parameter (currently in the form of a TransitiveClosure, but will be refactored in https://github.com/mmtk/mmtk-core/issues/559).
- Use the object queue to create an object-scanning work packet. w.flush() does this. It will create a ScanObjects work packet to scan objects.
From the analysis, the SoftRefProcessing work packet only has two dependencies:
- A delegate for calling trace_object in the appropriate space, and
- The type of the object-scanning work packet to create. (More concretely, the post-scan hook which Immix needs.)
The queue is just a Vec<ObjectReference> (or whatever that wraps it) and can be created locally.
Solution
To move away from ProcessEdgesWork, we just need to parameterise {Soft,Weak,Phantom}RefProcessing with a trait that provides the above two operations, namely trace_object and post_scan_object.
I have a draft for this trait. I call it TracingDelegate.
    pub trait TracingDelegate<VM: VMBinding>: 'static + Copy + Send {
        fn trace_object<T: TransitiveClosure>(
            &self,
            trace: &mut T,
            object: ObjectReference,
            worker: &mut GCWorker<VM>,
        ) -> ObjectReference;

        fn may_move_objects() -> bool;

        fn post_scan_object(&self, object: ObjectReference);
    }
They are just the three methods provided by PlanTraceObject, except without the KIND which can be parameterised on concrete implementations.
There should be two implementations, one for SFT, and the other for PlanTraceObject, just like there are SFTProcessEdges and PlanProcessEdges.
Then SoftRefProcessing can just call trace_object from that trait. The ScanObjects work packet can also be refactored to use that trait.
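To make the shape of this concrete, here is a minimal, self-contained sketch of reference processing that depends only on a delegate. It is not the mmtk-core code: ObjectReference is a stand-in integer wrapper, the trait is a simplified form of the draft above (no VM parameter, no GCWorker, and a plain Vec instead of a TransitiveClosure), and MarkingDelegate and process_soft_refs are hypothetical names introduced only for illustration.

```rust
use std::cell::RefCell;
use std::collections::HashSet;

// Stand-in for mmtk's ObjectReference so that the sketch compiles on its own.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjectReference(pub usize);

/// Simplified form of the proposed trait (assumption: the queue is a plain Vec).
pub trait TracingDelegate: Copy {
    fn trace_object(
        &self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference;

    /// Would be called by the object-scanning packet after scanning `object`.
    fn post_scan_object(&self, object: ObjectReference);
}

/// Hypothetical non-moving delegate: enqueues an object the first time it is traced.
#[derive(Clone, Copy)]
pub struct MarkingDelegate<'a> {
    pub marked: &'a RefCell<HashSet<ObjectReference>>,
}

impl<'a> TracingDelegate for MarkingDelegate<'a> {
    fn trace_object(
        &self,
        queue: &mut Vec<ObjectReference>,
        object: ObjectReference,
    ) -> ObjectReference {
        // `insert` returns true only if the object was not marked before.
        if self.marked.borrow_mut().insert(object) {
            queue.push(object);
        }
        object // non-moving: the reference is unchanged
    }

    fn post_scan_object(&self, _object: ObjectReference) {}
}

/// Traces every referent through the delegate and returns the objects that
/// still need scanning. A real packet would wrap them in a scanning work packet.
pub fn process_soft_refs<D: TracingDelegate>(
    delegate: D,
    referents: &mut [ObjectReference],
) -> Vec<ObjectReference> {
    let mut queue = Vec::new(); // the queue is created locally
    for slot in referents.iter_mut() {
        *slot = delegate.trace_object(&mut queue, *slot);
    }
    queue
}
```

Note that process_soft_refs never needs a ProcessEdgesWork instance: the delegate supplies trace_object, and the queue is a local Vec.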
I don't think we need a special type for reference processor. Reference processor could use any type that we use for tracing (currently ProcessEdgesWork, or any type that will supersede ProcessEdgesWork).
TracingDelegate looks quite similar to PlanTraceObject. We could just rename PlanTraceObject to TracingDelegate. Reusing code is also an important part of software engineering.
TracingDelegate can also be implemented for SFT. For example, SFTTracingDelegate::trace_object could call SFT::sft_trace_object, like SFTProcessEdges does.
TracingDelegate is intended to extract the "good parts" of SFTProcessEdges and PlanProcessEdges into SFTTracingDelegate and PlanTracingDelegate, and make them reusable. My plan is to reuse TracingDelegate in other work packets as well, such as:
- struct TracingProcessEdges<D: TracingDelegate>: Replacing trait ProcessEdgesWork.
- struct TracingProcessEdges<D: TracingDelegate>: Replacing struct ScanObjects and struct PlanScanObjects. It scans objects like ScanObjects does, but can optionally process their fields (edges), too, if the VM (Ruby) does not support edge enqueuing for some objects.
Neither of them can have subclasses, but plans can customise them by providing different TracingDelegate instances.
However, if we plan to commit to PlanTraceObject and drop SFT, we won't need TracingDelegate. Instead, we just embed a plan: &'static P where P: Plan<VM = VM> + PlanTraceObject<VM>, and those work packets can call into the plan directly.
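A rough sketch of that alternative, under the same simplifications as before: PlanLike below is a hypothetical stand-in for the Plan + PlanTraceObject bounds (it is not the real trait and has no VM or KIND parameter), and IdentityPlan is a toy plan that never moves objects.

```rust
// Stand-in for mmtk's ObjectReference.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjRef(pub usize);

/// Hypothetical, heavily simplified stand-in for `Plan + PlanTraceObject`.
pub trait PlanLike: Sync {
    fn trace_object(&self, queue: &mut Vec<ObjRef>, object: ObjRef) -> ObjRef;
    fn post_scan_object(&self, object: ObjRef);
}

/// Toy plan: every traced object is enqueued and left in place.
pub struct IdentityPlan;

impl PlanLike for IdentityPlan {
    fn trace_object(&self, queue: &mut Vec<ObjRef>, object: ObjRef) -> ObjRef {
        queue.push(object);
        object
    }
    fn post_scan_object(&self, _object: ObjRef) {}
}

/// The work packet embeds the plan reference directly instead of a delegate.
pub struct SoftRefProcessing<P: PlanLike + 'static> {
    pub plan: &'static P,
}

impl<P: PlanLike + 'static> SoftRefProcessing<P> {
    /// Traces each referent by calling into the plan; returns objects to scan.
    pub fn do_work(&self, referents: &mut [ObjRef]) -> Vec<ObjRef> {
        let mut queue = Vec::new();
        for slot in referents.iter_mut() {
            *slot = self.plan.trace_object(&mut queue, *slot);
        }
        queue
    }
}
```

The trade-off is that the packet is now tied to the plan type, which is acceptable only if the SFT-based path is removed.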
Regarding the comment that we don't need a special type for the reference processor because it could use any type that we use for tracing: if the "special type" means SoftRefProcessing (and its weak/phantom counterparts), then we do need it. The logic of reference processing is still a bit different from ordinary edge processing. Edge processing does the following:
1. Make an object queue q.
2. Load objref from the edge.
3. Call trace_object(&mut q, objref).
4. Store the new objref back to the edge (if moved).
5. Repeat 2-4 until all edges are processed.
6. If q is not empty, create an object-scanning work packet with all elements in q.
7. Execute the object-scanning work packet, or submit the work packet to the scheduler.
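The steps above can be sketched as a single loop. This is an illustration under assumptions, not the mmtk-core implementation: edges are modelled as a mutable slice of slots, CopyingTracer is a toy tracer that "moves" each object by adding 1000 to its address, and ScanPacket is a hypothetical stand-in for the object-scanning work packet.

```rust
use std::collections::HashMap;

// Stand-in for mmtk's ObjectReference.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct ObjRef(pub usize);

/// Hypothetical stand-in for an object-scanning work packet.
#[derive(Debug, PartialEq)]
pub struct ScanPacket(pub Vec<ObjRef>);

/// Toy copying tracer: forwards each object to `address + 1000` on first visit.
pub struct CopyingTracer {
    pub forwarded: HashMap<ObjRef, ObjRef>,
}

impl CopyingTracer {
    pub fn trace_object(&mut self, q: &mut Vec<ObjRef>, obj: ObjRef) -> ObjRef {
        if let Some(&new) = self.forwarded.get(&obj) {
            return new; // already copied: just return the forwarding target
        }
        let new = ObjRef(obj.0 + 1000);
        self.forwarded.insert(obj, new);
        q.push(new); // the new copy still needs to be scanned
        new
    }
}

pub fn process_edges(tracer: &mut CopyingTracer, edges: &mut [ObjRef]) -> Option<ScanPacket> {
    let mut q = Vec::new(); // step 1: make an object queue
    for edge in edges.iter_mut() {
        // step 5: repeat for every edge
        let objref = *edge; // step 2: load from the edge
        let new = tracer.trace_object(&mut q, objref); // step 3
        if new != objref {
            *edge = new; // step 4: store back only if the object moved
        }
    }
    // step 6: create a scanning packet only if something was enqueued.
    // Step 7 (execute or schedule the packet) is left to the caller.
    if q.is_empty() {
        None
    } else {
        Some(ScanPacket(q))
    }
}
```

Only step 3 depends on the plan or space; everything else is bookkeeping around the queue.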
SoftRefProcessing is different in the following aspects:
- Steps 2 and 4 are different, because it accesses SoftReference objects via a special VM API.
- If it is weak reference processing, or it decides not to retain soft references, it will only do steps 2-4 for reachable Soft/WeakReference.
TracingDelegate supports step 3, so the reference processor can still call it.
TracingDelegate indirectly supports step 7. It provides post_scan_object which is currently the only difference between different XxxxxScanObjects work packets. And the code for step 7 (creating a work packet) is trivial (just a few lines of code). If code repetition is a problem, we can still abstract it out in a function.
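For comparison, here is a hedged sketch of how the reference-processing variant might differ. It rests on several assumptions that are not taken from mmtk-core: RefObject models a Soft/WeakReference whose referent is reached through get_referent and set_referent (standing in for the VM-specific API), is_live is a caller-supplied reachability check, and a referent that is neither retained nor already reachable is simply cleared. The real processor also checks the liveness of the reference object itself, which is omitted here.

```rust
// Stand-in for mmtk's ObjectReference.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct ObjRef(pub usize);

/// Hypothetical model of a Soft/WeakReference object.
pub struct RefObject {
    referent: Option<ObjRef>,
}

impl RefObject {
    pub fn new(referent: ObjRef) -> Self {
        RefObject { referent: Some(referent) }
    }
    /// Stands in for the VM API that reads the referent (step 2).
    pub fn get_referent(&self) -> Option<ObjRef> {
        self.referent
    }
    /// Stands in for the VM API that writes the referent (step 4).
    pub fn set_referent(&mut self, referent: Option<ObjRef>) {
        self.referent = referent;
    }
}

/// `retain` is true when soft references are being kept alive.
/// Returns the queue of objects that still need scanning.
pub fn scan_refs(
    refs: &mut [RefObject],
    retain: bool,
    is_live: impl Fn(ObjRef) -> bool,
    mut trace_object: impl FnMut(&mut Vec<ObjRef>, ObjRef) -> ObjRef,
) -> Vec<ObjRef> {
    let mut q = Vec::new();
    for r in refs.iter_mut() {
        if let Some(referent) = r.get_referent() {
            if retain || is_live(referent) {
                // Step 3 is the same call as in edge processing.
                let new = trace_object(&mut q, referent);
                r.set_referent(Some(new));
            } else {
                // Assumption: an unreachable referent is cleared, not traced.
                r.set_referent(None);
            }
        }
    }
    q
}
```

The only plan-specific piece is again the trace_object call, which is what TracingDelegate would provide.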
Here is in-progress work on TracingProcessEdges::gc_work, which does what ProcessEdgesWork::gc_work does, but with TracingDelegate: https://github.com/wks/mmtk-core/blob/8217c09480451e4ed7b43a8cea3b2aead8e1913e/src/scheduler/gc_work.rs#L872
We implemented object enqueuing by wrapping ProcessEdgesWork. https://github.com/mmtk/mmtk-core/pull/628
Ruby now uses the new VM-specific weak reference processing API: https://github.com/mmtk/mmtk-core/pull/700. This makes the changes to the built-in reference processors unnecessary. In fact, we shall replace the built-in reference and finalization processors with OpenJDK-specific and JikesRVM-specific implementations.
I am closing this issue. More discussions about migrating to the new weak reference processing API happen in: https://github.com/mmtk/mmtk-core/issues/694