Mimalloc Allocator
This pull request introduces an allocator based on Microsoft's mimalloc (https://www.microsoft.com/en-us/research/uploads/prod/2019/06/mimalloc-tr-v1.pdf) used in a MarkSweep GC. It is largely completed, but it needs some cleaning up and logical rearrangement.
Yes. It needs to be merged/rebased with the master version because the architecture and the API have changed a lot.
I also see #[cfg(feature="malloc")] macros everywhere. The code is significantly different if we use our own free-list implementation instead of external malloc libraries. I think it is wise to use trait to abstract out the difference and make them swappable components. This is an application of the "strategy" pattern.
I think we may choose one of two ways to structure the code.
- Keep one
MarkSweepplan and oneMarkSweepSpace, but define multiple allocator implementations which theMarkSweepSpacecan choose to use underneath. We may define something like atrait MallocAllocator, and have implementation for Paige's mimalloc allocator, and another for calling libc'smalloc. - Keep one
MarkSweepplan, but give each implementation a space, such asMiMallocMarkSweepSpaceandSystemMallocmarkSweepSpace. We can also make them different plans, too, such asMarkSweepandMallocMarkSweepwhere the latter uses the system'smalloc.
binding-refs OPENJDK_BINDING_REF=mimalloc-ms-support JIKESRVM_BINDING_REF=mimalloc-ms-support V8_BINDING_REF=update-pr-643
@wks @wenyuzhao I have requested reviews from you two. Before you do a comprehensive review, could you let me know if I should get this PR split into a few smaller PRs?
@wks I have addressed your comments. I will work on Wenyu's comments.
@wenyuzhao I have addressed most of your comments, except for those related to the object reference. I plan to fix them in a separate PR (which should get merged before this PR). We can come back to those comments later.
I believe I have addressed the issues you pointed out. Can you review the PR again and let me know if there are any further changes required? @wks @wenyuzhao
It looks good to me now.
I managed to get the mmtk-ruby binding working. However, if I switch from malloc MarkSweep to native MatkSweep, it will crash in post_alloc when allocating new objects after GC because the VO-bit is already set. (FYI, Ruby uses the "is_mmtk_object" feature which turns on the "global_alloc_bit" feature.)
I am still investigating.
I think the implementation of bzero_metadata is unsound when we use it to clear a bit range, not whole pages. Log shows that sometimes bzero_metadata will clear 0 bytes when clearing the VO-bits for a 56-byte cell.
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff1, end: 0xc0804015ff2, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fca8
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fcb0, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff2, end: 0xc0804015ff3, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fce0
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fce8, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff3, end: 0xc0804015ff4, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fd18
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fd20, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff4, end: 0xc0804015ff5, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fd50
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fd58, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff5, end: 0xc0804015ff6, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fd88
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fd90, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff6, end: 0xc0804015ff7, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fdc0
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fdc8, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff7, end: 0xc0804015ff8, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fdf8
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fe00, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff8, end: 0xc0804015ff8, len: 0
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fe30
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fe38, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff8, end: 0xc0804015ff9, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fe68
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fe70, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ff9, end: 0xc0804015ffa, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fea0
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fea8, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffa, end: 0xc0804015ffb, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057fed8
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057fee0, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffb, end: 0xc0804015ffc, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ff10
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ff18, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffc, end: 0xc0804015ffd, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ff48
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ff50, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffd, end: 0xc0804015ffe, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ff80
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ff88, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015ffe, end: 0xc0804015fff, len: 1
[2022-11-30T09:25:55Z INFO mmtk::policy::marksweepspace::native_ms::block] clearing alloc bits. cell: 0x2010057ffb8
[2022-11-30T09:25:55Z INFO mmtk::util::alloc_bit] bzero_alloc_bit start: 0x2010057ffc0, size: 56
[2022-11-30T09:25:55Z INFO mmtk::util::metadata::side_metadata::global] zeroing: start: 0xc0804015fff, end: 0xc0804015fff, len: 0
bzero_metadata uses memset to set byte ranges. The granularity of VO-bit metadata is one bit per 8 bytes. Then a 56-byte cell corresponds to 7 bits in the VO-bit metadata. As a result, if we use bzero_metadata to clear VO-bits when doing naive_brute_force_sweep, for every 8 consecutive cells, 7 of them will have the start of the VO-bits in one byte, and the end of the VO-bits in another, while 1 of them have both the start and the end of the VO-bits in the same byte. That explains the log.
(FYI, for Ruby, obj.to_raw_address() == ObjectModel::ref_to_address(obj) == ObjectModel::ref_to_object_start(obj) + 8)
I think it is unsound to clear 0 or 1 bytes. It should only clear the bit range the object occupies, which may span multiple bytes.
binding-refs OPENJDK_BINDING_REF=mimalloc-ms-support JIKESRVM_BINDING_REF=mimalloc-ms-support V8_BINDING_REF=update-pr-643