Add Zig solution for binarytrees that makes use of MemoryPool
About
- Wrap each tree creation in its own MemoryPool; the concept is borrowed from the C++ solution here. A minimal sketch of the approach follows this list.
- Remove the tree-walking Tree.deinit, as pool.deinit() already deallocates all of the tree's memory.
- Zig tests show that there are no memory leaks.
- Compared to 1.zig, removed the 'allocator' field from the Node struct: it doesn't really make sense to repeat the same value tens of thousands of times in memory.
- Compared to this suggestion, this solution differs slightly in that we deallocate memory on every tree release; the suggestion made by @zigster64 uses a single arena instance and releases all memory at once, so I assume it uses more memory.
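A minimal sketch of the approach (my own illustration, not the exact code from 2.zig; names like Node and bottomUpTree are hypothetical, and it assumes the std.heap.MemoryPool API of recent Zig releases):

```zig
const std = @import("std");

// Node no longer carries an allocator field; only the two child pointers.
const Node = struct {
    left: ?*Node = null,
    right: ?*Node = null,

    fn check(self: *const Node) usize {
        var sum: usize = 1;
        if (self.left) |l| sum += l.check();
        if (self.right) |r| sum += r.check();
        return sum;
    }
};

// Every node of one tree comes from the same pool, so releasing the tree
// is a single pool.deinit() instead of a tree-walking Tree.deinit.
fn bottomUpTree(pool: *std.heap.MemoryPool(Node), depth: usize) !*Node {
    const node = try pool.create();
    node.* = .{};
    if (depth > 0) {
        node.left = try bottomUpTree(pool, depth - 1);
        node.right = try bottomUpTree(pool, depth - 1);
    }
    return node;
}

test "pool-backed tree has no leaks" {
    var pool = std.heap.MemoryPool(Node).init(std.testing.allocator);
    defer pool.deinit(); // releases every node of this tree at once

    const root = try bottomUpTree(&pool, 4);
    try std.testing.expectEqual(@as(usize, 31), root.check()); // 2^5 - 1 nodes
}
```

In the benchmark loop each short-lived tree gets its own pool, so memory goes back to the backing allocator on every tree release instead of only once at the end.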
Comparing performance
Built with zig build-exe -O ReleaseFast --library c 1.zig.
Time drops from 1.42 s to 0.44 s, and maximum resident set size drops from 50,520 KB to 23,792 KB.
Original 1.zig output:
/usr/bin/time -v ./1 18
stretch tree of depth 19 check: 1048575
262144 trees of depth 4 check: 8126464
65536 trees of depth 6 check: 8323072
16384 trees of depth 8 check: 8372224
4096 trees of depth 10 check: 8384512
1024 trees of depth 12 check: 8387584
256 trees of depth 14 check: 8388352
64 trees of depth 16 check: 8388544
16 trees of depth 18 check: 8388592
long lived tree of depth 18 check: 524287
Command being timed: "./1 18"
User time (seconds): 1.42
System time (seconds): 0.01
Percent of CPU this job got: 99%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.43
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 50520
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 2
Minor (reclaiming a frame) page faults: 12357
Voluntary context switches: 45
Involuntary context switches: 4
Swaps: 0
File system inputs: 280
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Updated 2.zig output:
/usr/bin/time -v ./2 18
stretch tree of depth 19 check: 1048575
262144 trees of depth 4 check: 8126464
65536 trees of depth 6 check: 8323072
16384 trees of depth 8 check: 8372224
4096 trees of depth 10 check: 8384512
1024 trees of depth 12 check: 8387584
256 trees of depth 14 check: 8388352
64 trees of depth 16 check: 8388544
16 trees of depth 18 check: 8388592
long lived tree of depth 18 check: 524287
Command being timed: "./2 18"
User time (seconds): 0.44
System time (seconds): 0.00
Percent of CPU this job got: 98%
Elapsed (wall clock) time (h:mm:ss or m:ss): 0:00.46
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 23792
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 2
Minor (reclaiming a frame) page faults: 7475
Voluntary context switches: 45
Involuntary context switches: 0
Swaps: 0
File system inputs: 280
File system outputs: 0
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
Thanks for the PR but memory pool / arena solutions are not accepted for this problem
Why is a pool/arena solution not accepted? All JVM-based languages are doing something similar. In languages where we can control how memory is used, why not use it in the best way? The benchmark does not dictate how memory must be used. We can create two Zig tree solutions, one optimised for memory and the other for CPU. I believe this is the power of Zig: we can use allocators the way we want.
@hanabi1224 can you please provide the reason for not accepting this as a valid solution?
@peeyush18
we can create two Zig tree solutions, one optimised for memory and the other for CPU
This solution isn't really optimized for just memory or just CPU: it's better in both ways. MemoryPool can use less memory and less CPU than regular C malloc:
- malloc needs to track metadata about every allocation; a memory pool guarantees that all nodes in the pool have the same size, so no extra per-node metadata needs to be tracked. This saves memory (see the small test after this list).
- MemoryPool preallocates memory in chunks that fit multiple nodes, which reduces the number of C malloc calls. This saves CPU.
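To make the first point concrete, here is a small self-contained test (my own illustration, not code from the PR) showing how the pool recycles fixed-size slots through its free list, with no per-allocation header:

```zig
const std = @import("std");

const Node = struct { left: ?*Node = null, right: ?*Node = null };

test "destroyed slots are reused without a new allocation" {
    var pool = std.heap.MemoryPool(Node).init(std.testing.allocator);
    defer pool.deinit();

    const a = try pool.create();
    pool.destroy(a); // slot is pushed onto the pool's internal free list
    const b = try pool.create(); // popped straight off the free list

    // Same fixed-size slot handed back; no allocator round-trip, no header.
    try std.testing.expect(a == b);
}
```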
C malloc is really the wrong tool for tiny-object allocation: it's a tool for building further abstractions on top, and those abstractions typically allocate memory in larger chunks. If @hanabi1224 insists on not using MemoryPool, I could probably rewrite this to use an ArrayList and achieve a very similar result; a rough sketch of that variant follows.
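For reference, a rough sketch of what such an ArrayList variant could look like (my own sketch, assuming the managed std.ArrayList API; names like bottomUpTree are hypothetical): nodes live in one contiguous list and refer to each other by index, so a single deinit releases the whole tree and list growth never invalidates node references.

```zig
const std = @import("std");

const null_idx: u32 = std.math.maxInt(u32); // sentinel for "no child"

// Indices instead of pointers, so the list may reallocate as it grows
// without invalidating any links between nodes.
const Node = struct { left: u32 = null_idx, right: u32 = null_idx };

fn bottomUpTree(nodes: *std.ArrayList(Node), depth: u32) !u32 {
    var node = Node{};
    if (depth > 0) {
        node.left = try bottomUpTree(nodes, depth - 1);
        node.right = try bottomUpTree(nodes, depth - 1);
    }
    try nodes.append(node);
    const idx: u32 = @intCast(nodes.items.len - 1);
    return idx;
}

fn check(nodes: []const Node, idx: u32) usize {
    const n = nodes[idx];
    var sum: usize = 1;
    if (n.left != null_idx) sum += check(nodes, n.left);
    if (n.right != null_idx) sum += check(nodes, n.right);
    return sum;
}

test "arraylist-backed tree" {
    var nodes = std.ArrayList(Node).init(std.testing.allocator);
    defer nodes.deinit(); // one call frees every node of the tree

    const root = try bottomUpTree(&nodes, 4);
    try std.testing.expectEqual(@as(usize, 31), check(nodes.items, root));
}
```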