Allow bindings to implement GC triggering policy
MMTk currently triggers GCs based on a fixed heap size. ~~We have plans to support dynamic heap size (e.g. https://github.com/mmtk/mmtk-core/issues/561), and we will have different GC triggering implementations in that case. Though we will implement different GC triggering policies in MMTk, it is possible that we expose this to the binding. For example, one of the GC triggering policies is to call into the binding and let the binding decide whether to trigger a GC with some information from MMTk (e.g. reserved pages, free pages, etc).~~ We should also allow VMs to decide whether to trigger GCs.
There are different motivations for this.
- This could be a replacement for https://github.com/mmtk/mmtk-core/pull/639, which helps account for off-heap bytes for the VM. Having an API for the binding to tell us the inc/dec of the bytes is very much not desired. Instead, the binding could implement its own GC triggering and count its allocated bytes/pages.
- Some runtimes have their own GC triggering. Julia, for example, triggers GC based on the allocation volume. To have a fair performance comparison with its stock GC, we would need to either implement a fixed heap size for Julia, or allow the Julia binding to implement its own GC triggering.
Ruby status quo
Note: This section is about vanilla Ruby, without the MMTk binding.
Ruby has its own malloc counter. Ruby wraps malloc and free into ruby_xmalloc and ruby_xfree, respectively. Inside those functions, it increases or decreases the counter and possibly triggers a GC.
The counter has two fields:
```c
typedef struct rb_objspace {
    // ...
    struct {
        size_t limit;
        size_t increase;
    } malloc_params;
    // ...
} rb_objspace_t;
```
where:
- increase is the number of bytes allocated by ruby_xmalloc minus the number of bytes freed by ruby_xfree (saturating at 0 on underflow) since the last GC.
- limit is a threshold.
If a call to ruby_xmalloc causes increase to exceed limit, it triggers a GC. After the GC, increase is reset to 0 and limit is adjusted.
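The counting scheme described above can be sketched in a few lines of C. This is a minimal illustration, not Ruby's gc.c: the names `malloc_counter`, `sketch_xmalloc`, `sketch_xfree` and `sketch_gc` are hypothetical, the "GC" only resets the counter, and the free wrapper takes an explicit size for simplicity (an assumption; the text does not say how ruby_xfree learns the size).

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the malloc_params fields described above. */
struct malloc_counter {
    size_t limit;      /* threshold in bytes */
    size_t increase;   /* net bytes allocated since the last GC */
    unsigned gc_count; /* number of GCs triggered, for illustration only */
};

/* Stand-in for the collector: records the GC and resets the counter. */
void sketch_gc(struct malloc_counter *c) {
    c->gc_count++;
    c->increase = 0;
}

/* Allocate, count the bytes, and trigger a GC once increase exceeds limit. */
void *sketch_xmalloc(struct malloc_counter *c, size_t size) {
    void *p = malloc(size);
    c->increase += size;
    if (c->increase > c->limit) {
        sketch_gc(c);
    }
    return p;
}

/* Free and subtract the bytes, saturating at 0 instead of underflowing. */
void sketch_xfree(struct malloc_counter *c, void *p, size_t size) {
    free(p);
    c->increase = (c->increase > size) ? c->increase - size : 0;
}
```

Adjusting limit after the GC is left out here; only the counting and the trigger condition are shown.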
| variable | description | default (macro name) | adjustable? (env var name) |
|---|---|---|---|
| limit | threshold to trigger GC | 16MiB (GC_MALLOC_LIMIT_MIN) | Yes (RUBY_GC_MALLOC_LIMIT) |
| gc_params.malloc_limit_min | minimum of limit | 16MiB (GC_MALLOC_LIMIT_MIN) | No |
| gc_params.malloc_limit_max | maximum of limit | 32MiB (GC_MALLOC_LIMIT_MAX) | Yes (RUBY_GC_MALLOC_LIMIT_MAX) |
| gc_params.malloc_limit_growth_factor | factor to grow limit | 1.4 (GC_MALLOC_LIMIT_GROWTH_FACTOR) | Yes (RUBY_GC_MALLOC_LIMIT_GROWTH_FACTOR) |
| 0.98 (a magic number) | factor to shrink limit | 0.98 | No |
limit starts at 16MiB. If increase > limit between two GCs, limit is multiplied by a factor of 1.4; if increase <= limit, limit shrinks to 0.98 of its current value. limit never grows beyond 32MiB or shrinks below 16MiB.
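The adjustment rule can be written as a small function. This is a sketch of the rule as described in this section, using the defaults and the 0.98 shrink factor from the table; `sketch_next_limit` and the `SKETCH_*` macros are hypothetical names, and the code does not claim to match gc.c line by line.

```c
#include <stddef.h>

/* Defaults taken from the table above; macro names are hypothetical. */
#define SKETCH_LIMIT_MIN ((size_t)16 * 1024 * 1024)
#define SKETCH_LIMIT_MAX ((size_t)32 * 1024 * 1024)
#define SKETCH_GROWTH_FACTOR 1.4
#define SKETCH_SHRINK_FACTOR 0.98

/* Compute the limit for the next GC cycle from the current limit and the
 * increase observed since the last GC. */
size_t sketch_next_limit(size_t limit, size_t increase) {
    size_t next;
    if (increase > limit) {
        /* Allocation outpaced the threshold: grow, capped at the maximum. */
        next = (size_t)(limit * SKETCH_GROWTH_FACTOR);
        if (next > SKETCH_LIMIT_MAX) next = SKETCH_LIMIT_MAX;
    } else {
        /* Allocation stayed under the threshold: shrink, floored at the minimum. */
        next = (size_t)(limit * SKETCH_SHRINK_FACTOR);
        if (next < SKETCH_LIMIT_MIN) next = SKETCH_LIMIT_MIN;
    }
    return next;
}
```

Starting from 16MiB, three consecutive cycles with increase > limit are enough to reach the 32MiB cap (16MiB, about 22.4MiB, about 31.4MiB, then 32MiB).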
As we can see, with the default settings, creating a string 33554432 characters long (32MiB) is guaranteed to trigger a GC.
Implication to MMTk API
It looks like Ruby doesn't need a special API to support its current behaviour. It can simply call mmtk_handle_user_collection_request every time it allocates 32MiB of memory.
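A binding-side version of this idea might look like the sketch below. Everything here is an assumption for illustration: `alloc_trigger`, `alloc_trigger_on_alloc` and the counting stub `sketch_request_gc` are hypothetical, and the callback merely stands in for mmtk_handle_user_collection_request, whose actual signature in a given binding may differ.

```c
#include <stddef.h>

/* Callback type standing in for the binding's GC request entry point.
 * The void* argument is a placeholder for whatever thread handle it takes. */
typedef void (*gc_request_fn)(void *tls);

/* Hypothetical per-VM state: request a GC every `threshold` allocated bytes. */
struct alloc_trigger {
    size_t threshold;         /* e.g. 32MiB */
    size_t allocated;         /* bytes allocated since the last request */
    gc_request_fn request_gc; /* stand-in for the MMTk call */
};

/* Called by the binding on each off-heap allocation of `size` bytes. */
void alloc_trigger_on_alloc(struct alloc_trigger *t, void *tls, size_t size) {
    t->allocated += size;
    if (t->allocated >= t->threshold) {
        t->allocated = 0;
        t->request_gc(tls);
    }
}

/* Counting stub used in place of a real collection request. */
int sketch_gc_requests = 0;
void sketch_request_gc(void *tls) {
    (void)tls;
    sketch_gc_requests++;
}
```

Unlike vanilla Ruby, this sketch uses a fixed threshold and ignores frees; it only shows that the counting can live entirely in the binding.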