Incorporate Araq's suggestions
From the forum reply:
Your implementation should inline data elements that fit in the roughly 3 bytes of the payload. For example: Numbers like 0, 1, 2, 3 are common and do not have to take up space in the BiTable. Likewise for short key names like "abc". This is reasonably easy to implement and should improve benchmark results quite a bit. It also means that the typical JSON data uses even less space.
The sweet spot might even be to use 64 bits for a packed node so that more data fits the inline case.
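To make the suggestion concrete, here is a minimal sketch of the inlining idea in Python. The bit layout (3-bit kind, 1-bit inline flag, 28-bit operand), the names `PackedTree`, `K_INT`, `K_STR`, and the plain list standing in for the BiTable are all assumptions for illustration. This is not jsonpak's actual node format.

```python
# Hypothetical 32-bit node layout, chosen only for illustration:
#   bits 0..2  kind tag
#   bit  3     inline flag (1 = operand holds the value itself)
#   bits 4..31 28-bit operand (inline value, or index into a side table)
KIND_MASK = 0b111
INLINE_FLAG = 1 << 3
OPERAND_SHIFT = 4
OPERAND_MAX = (1 << 28) - 1

K_INT, K_STR = 1, 2  # made-up kind tags


class PackedTree:
    def __init__(self):
        self.nodes = []  # one 32-bit integer per node
        self.table = []  # stand-in for the BiTable: out-of-line literals

    def _push(self, kind, operand, inline):
        assert 0 <= operand <= OPERAND_MAX
        flag = INLINE_FLAG if inline else 0
        self.nodes.append(kind | flag | (operand << OPERAND_SHIFT))

    def _spill(self, kind, value):
        # Value does not fit: store it in the side table and keep only its index.
        self.table.append(value)
        self._push(kind, len(self.table) - 1, inline=False)

    def add_int(self, value):
        if 0 <= value <= OPERAND_MAX:
            self._push(K_INT, value, inline=True)
        else:
            self._spill(K_INT, value)

    def add_str(self, value):
        raw = value.encode("utf-8")
        # Up to 3 NUL-free bytes fit; trailing zero bytes then encode the length.
        if len(raw) <= 3 and 0 not in raw:
            self._push(K_STR, int.from_bytes(raw, "little"), inline=True)
        else:
            self._spill(K_STR, value)

    def get(self, i):
        node = self.nodes[i]
        kind, operand = node & KIND_MASK, node >> OPERAND_SHIFT
        if not node & INLINE_FLAG:
            return self.table[operand]
        if kind == K_INT:
            return operand
        return operand.to_bytes(3, "little").rstrip(b"\x00").decode("utf-8")


tree = PackedTree()
for item in (0, 1, 2, 3, 10**12):
    tree.add_int(item)
for item in ("abc", "id", "a longer key"):
    tree.add_str(item)
```

In this sketch only `10**12` and `"a longer key"` end up in the side table. The other six values live entirely inside their nodes. Under the same assumed layout, a 64-bit node would leave 60 operand bits, enough for strings of up to 7 bytes.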
sso branch benchmark results:
extract:
Collected 10000 samples in 3.5586 s
Average time: 0.3482 ms
Stddev time: 0.0409 ms
Min time: 0.3431 ms
Max time: 1.0929 ms
parse:
Collected 10000 samples in 12.2699 s
Average time: 1.2194 ms
Stddev time: 0.0040 ms
Min time: 1.2127 ms
Max time: 1.3753 ms
toString:
Collected 10000 samples in 9.3768 s
Average time: 0.5825 ms
Stddev time: 0.0028 ms
Min time: 0.5728 ms
Max time: 0.5958 ms
fromJson:
Collected 10000 samples in 3.5812 s
Average time: 0.0041 ms
Stddev time: 0.0001 ms
Min time: 0.0029 ms
Max time: 0.0085 ms
toJson:
Collected 10000 samples in 0.0129 s
Average time: 0.0010 ms
Stddev time: 0.0000 ms
Min time: 0.0005 ms
Max time: 0.0035 ms
test:
Collected 10000 samples in 3.5766 s
Average time: 0.0035 ms
Stddev time: 0.0001 ms
Min time: 0.0034 ms
Max time: 0.0058 ms
replace:
Collected 10000 samples in 3.5799 s
Average time: 0.0037 ms
Stddev time: 0.0001 ms
Min time: 0.0034 ms
Max time: 0.0065 ms
remove:
Collected 10000 samples in 3.6927 s
Average time: 0.0143 ms
Stddev time: 0.0003 ms
Min time: 0.0128 ms
Max time: 0.0199 ms
add:
Collected 10000 samples in 3.5934 s
Average time: 0.0036 ms
Stddev time: 0.0002 ms
Min time: 0.0029 ms
Max time: 0.0094 ms
copy:
Collected 10000 samples in 3.6806 s
Average time: 0.0122 ms
Stddev time: 0.0002 ms
Min time: 0.0113 ms
Max time: 0.0218 ms
move:
Collected 10000 samples in 3.7868 s
Average time: 0.0231 ms
Stddev time: 0.0004 ms
Min time: 0.0224 ms
Max time: 0.0276 ms
stdlib - extract:
Collected 10000 samples in 8.2239 s
Average time: 0.7242 ms
Stddev time: 0.0059 ms
Min time: 0.7103 ms
Max time: 1.1079 ms
stdlib - parse:
Collected 10000 samples in 15.8467 s
Average time: 1.4828 ms
Stddev time: 0.0093 ms
Min time: 1.4456 ms
Max time: 1.5896 ms
stdlib - toString:
Collected 10000 samples in 14.9889 s
Average time: 0.6627 ms
Stddev time: 0.0114 ms
Min time: 0.6409 ms
Max time: 1.1972 ms
stdlib - fromJson:
Collected 10000 samples in 8.4067 s
Average time: 0.0011 ms
Stddev time: 0.0004 ms
Min time: 0.0000 ms
Max time: 0.0103 ms
stdlib - toJson:
Collected 10000 samples in 0.0083 s
Average time: 0.0005 ms
Stddev time: 0.0001 ms
Min time: 0.0002 ms
Max time: 0.0048 ms
stdlib - test:
Collected 10000 samples in 8.3573 s
Average time: 0.0007 ms
Stddev time: 0.0001 ms
Min time: 0.0000 ms
Max time: 0.0043 ms
stdlib - replace:
Collected 10000 samples in 8.3806 s
Average time: 0.0007 ms
Stddev time: 0.0002 ms
Min time: 0.0000 ms
Max time: 0.0059 ms
stdlib - remove:
Collected 10000 samples in 8.3820 s
Average time: 0.0010 ms
Stddev time: 0.0002 ms
Min time: 0.0000 ms
Max time: 0.0071 ms
stdlib - add:
Collected 10000 samples in 8.3859 s
Average time: 0.0007 ms
Stddev time: 0.0002 ms
Min time: 0.0000 ms
Max time: 0.0055 ms
stdlib - copy:
Collected 10000 samples in 8.4408 s
Average time: 0.0008 ms
Stddev time: 0.0003 ms
Min time: 0.0000 ms
Max time: 0.0083 ms
stdlib - move:
Collected 10000 samples in 8.4005 s
Average time: 0.0011 ms
Stddev time: 0.0002 ms
Min time: 0.0000 ms
Max time: 0.0064 ms
Used mem:
jsonpak: 256KiB std/json: 1.359MiB
Using separate hash tables for strings and numbers (branch literals):
extract:
Collected 10000 samples in 3.1340 s
Average time: 0.3077 ms
Stddev time: 0.0186 ms
Min time: 0.2981 ms
Max time: 0.9689 ms
parse:
Collected 10000 samples in 10.3927 s
Average time: 1.0334 ms
Stddev time: 0.0142 ms
Min time: 1.0135 ms
Max time: 1.1465 ms
toString:
Collected 10000 samples in 4.5868 s
Average time: 0.1466 ms
Stddev time: 0.0038 ms
Min time: 0.1441 ms
Max time: 0.1696 ms
fromJson:
Collected 10000 samples in 3.1475 s
Average time: 0.0039 ms
Stddev time: 0.0005 ms
Min time: 0.0038 ms
Max time: 0.0229 ms
toJson:
Collected 10000 samples in 0.0095 s
Average time: 0.0006 ms
Stddev time: 0.0002 ms
Min time: 0.0006 ms
Max time: 0.0202 ms
test:
Collected 10000 samples in 3.1534 s
Average time: 0.0040 ms
Stddev time: 0.0008 ms
Min time: 0.0038 ms
Max time: 0.0255 ms
replace:
Collected 10000 samples in 3.1441 s
Average time: 0.0040 ms
Stddev time: 0.0006 ms
Min time: 0.0039 ms
Max time: 0.0227 ms
remove:
Collected 10000 samples in 3.2490 s
Average time: 0.0147 ms
Stddev time: 0.0013 ms
Min time: 0.0145 ms
Max time: 0.0338 ms
add:
Collected 10000 samples in 3.1630 s
Average time: 0.0040 ms
Stddev time: 0.0006 ms
Min time: 0.0019 ms
Max time: 0.0228 ms
copy:
Collected 10000 samples in 3.2011 s
Average time: 0.0085 ms
Stddev time: 0.0009 ms
Min time: 0.0082 ms
Max time: 0.0286 ms
move:
Collected 10000 samples in 3.3002 s
Average time: 0.0194 ms
Stddev time: 0.0014 ms
Min time: 0.0191 ms
Max time: 0.0406 ms
Used mem: 231.109KiB
For reference, these are the current timings:
extract:
Collected 10000 samples in 2.6649 s
Average time: 0.2564 ms
Stddev time: 0.0053 ms
Min time: 0.2513 ms
Max time: 0.3097 ms
parse:
Collected 10000 samples in 9.6022 s
Average time: 0.9501 ms
Stddev time: 0.0170 ms
Min time: 0.9104 ms
Max time: 1.0435 ms
toString:
Collected 10000 samples in 10.7450 s
Average time: 0.8031 ms
Stddev time: 0.0109 ms
Min time: 0.7795 ms
Max time: 1.1168 ms
fromJson:
Collected 10000 samples in 2.7236 s
Average time: 0.0040 ms
Stddev time: 0.0005 ms
Min time: 0.0039 ms
Max time: 0.0231 ms
toJson:
Collected 10000 samples in 0.0116 s
Average time: 0.0009 ms
Stddev time: 0.0003 ms
Min time: 0.0008 ms
Max time: 0.0195 ms
test:
Collected 10000 samples in 2.7419 s
Average time: 0.0043 ms
Stddev time: 0.0008 ms
Min time: 0.0040 ms
Max time: 0.0241 ms
replace:
Collected 10000 samples in 2.7292 s
Average time: 0.0043 ms
Stddev time: 0.0007 ms
Min time: 0.0041 ms
Max time: 0.0240 ms
remove:
Collected 10000 samples in 2.8196 s
Average time: 0.0132 ms
Stddev time: 0.0011 ms
Min time: 0.0129 ms
Max time: 0.0335 ms
add:
Collected 10000 samples in 2.7112 s
Average time: 0.0043 ms
Stddev time: 0.0006 ms
Min time: 0.0041 ms
Max time: 0.0243 ms
copy:
Collected 10000 samples in 2.7654 s
Average time: 0.0087 ms
Stddev time: 0.0010 ms
Min time: 0.0084 ms
Max time: 0.0311 ms
move:
Collected 10000 samples in 2.8696 s
Average time: 0.0179 ms
Stddev time: 0.0014 ms
Min time: 0.0175 ms
Max time: 0.0409 ms
stdlib - extract:
Collected 10000 samples in 7.5400 s
Average time: 0.6629 ms
Stddev time: 0.0088 ms
Min time: 0.6435 ms
Max time: 0.8394 ms
stdlib - parse:
Collected 10000 samples in 14.5678 s
Average time: 1.3665 ms
Stddev time: 0.0126 ms
Min time: 1.3298 ms
Max time: 1.4247 ms
stdlib - toString:
Collected 10000 samples in 14.2210 s
Average time: 0.6593 ms
Stddev time: 0.0087 ms
Min time: 0.6439 ms
Max time: 0.7004 ms
stdlib - fromJson:
Collected 10000 samples in 7.6211 s
Average time: 0.0008 ms
Stddev time: 0.0004 ms
Min time: 0.0006 ms
Max time: 0.0198 ms
stdlib - toJson:
Collected 10000 samples in 0.0081 s
Average time: 0.0005 ms
Stddev time: 0.0003 ms
Min time: 0.0005 ms
Max time: 0.0192 ms
stdlib - test:
Collected 10000 samples in 7.6917 s
Average time: 0.0006 ms
Stddev time: 0.0002 ms
Min time: 0.0004 ms
Max time: 0.0017 ms
stdlib - replace:
Collected 10000 samples in 7.6918 s
Average time: 0.0006 ms
Stddev time: 0.0002 ms
Min time: 0.0003 ms
Max time: 0.0026 ms
stdlib - remove:
Collected 10000 samples in 7.6839 s
Average time: 0.0008 ms
Stddev time: 0.0003 ms
Min time: 0.0006 ms
Max time: 0.0181 ms
stdlib - add:
Collected 10000 samples in 7.7003 s
Average time: 0.0006 ms
Stddev time: 0.0003 ms
Min time: 0.0004 ms
Max time: 0.0187 ms
stdlib - copy:
Collected 10000 samples in 7.6893 s
Average time: 0.0006 ms
Stddev time: 0.0002 ms
Min time: 0.0004 ms
Max time: 0.0181 ms
stdlib - move:
Collected 10000 samples in 7.6927 s
Average time: 0.0008 ms
Stddev time: 0.0003 ms
Min time: 0.0006 ms
Max time: 0.0186 ms
used Mem: 1.635MiB
jsonpak alone used Mem: 302.281KiB
And here's branch 'short' with the suggested changes:
extract:
Collected 10000 samples in 0.5627 s
Average time: 0.0534 ms
Stddev time: 0.0145 ms
Min time: 0.0502 ms
Max time: 0.1943 ms
parse:
Collected 10000 samples in 8.1286 s
Average time: 0.8097 ms
Stddev time: 0.0090 ms
Min time: 0.7939 ms
Max time: 0.9388 ms
toString:
Collected 10000 samples in 3.7397 s
Average time: 0.3187 ms
Stddev time: 0.0062 ms
Min time: 0.3136 ms
Max time: 0.4323 ms
fromJson:
Collected 10000 samples in 0.5844 s
Average time: 0.0042 ms
Stddev time: 0.0007 ms
Min time: 0.0041 ms
Max time: 0.0260 ms
toJson:
Collected 10000 samples in 0.0084 s
Average time: 0.0005 ms
Stddev time: 0.0002 ms
Min time: 0.0005 ms
Max time: 0.0180 ms
test:
Collected 10000 samples in 0.5803 s
Average time: 0.0039 ms
Stddev time: 0.0005 ms
Min time: 0.0038 ms
Max time: 0.0256 ms
replace:
Collected 10000 samples in 0.5810 s
Average time: 0.0040 ms
Stddev time: 0.0007 ms
Min time: 0.0038 ms
Max time: 0.0266 ms
remove:
Collected 10000 samples in 0.6900 s
Average time: 0.0148 ms
Stddev time: 0.0011 ms
Min time: 0.0146 ms
Max time: 0.0371 ms
add:
Collected 10000 samples in 0.5831 s
Average time: 0.0041 ms
Stddev time: 0.0006 ms
Min time: 0.0040 ms
Max time: 0.0235 ms
copy:
Collected 10000 samples in 0.6475 s
Average time: 0.0103 ms
Stddev time: 0.0013 ms
Min time: 0.0098 ms
Max time: 0.0824 ms
move:
Collected 10000 samples in 0.7580 s
Average time: 0.0215 ms
Stddev time: 0.0012 ms
Min time: 0.0207 ms
Max time: 0.0430 ms
used Mem: 223.312KiB
~~Clear winner the literals branch.~~ ~~Edit: The short implementation is still buggy and the results shouldn't be taken seriously~~
Bugs fixed, holy shit. Extract and parse have become stupidly cheap, while the rest remain about the same. Memory usage is the lowest of all branches.
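To put numbers on that, the averages reported above can be compared directly. The snippet below only restates figures from the "current timings" and 'short' reports. It takes the first block of the current timings to be extract, and the dictionary names are made up for this example.

```python
# Average times (ms) and memory (KiB) copied from the reports above.
main = {"extract": 0.2564, "parse": 0.9501, "mem_kib": 302.281}
short = {"extract": 0.0534, "parse": 0.8097, "mem_kib": 223.312}

# Ratio of main to short: values above 1 mean short is faster.
speedup = {op: main[op] / short[op] for op in ("extract", "parse")}
# Fraction of memory saved by short relative to main.
mem_saving = 1 - short["mem_kib"] / main["mem_kib"]

for op, ratio in speedup.items():
    print(f"{op}: {ratio:.2f}x faster")
print(f"memory: {mem_saving:.1%} smaller")
```

On these figures extract is roughly 4.8 times faster, parse roughly 1.17 times faster, and memory use is about 26% lower than on main.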
Summary of branches:
- sso: works and all tests pass. Running time gets worse. Shrinks in size but not substantially. Doesn't work at compile time.
- literals: Size has improved dramatically, while performance slightly suffers. Doesn't work at CT. Bug-free but tests need to be adjusted. Clean patch, with good code quality.
- short: Biggest improvements in both runtime and size. Code becomes spaghetti, with many if isShort-else branches. May not work at CT, but hard to tell since there are still many bugs to be fixed. New tests need to be written, and refactoring is needed to improve code quality.
Here's the bench report of the main branch where the only change is that the BiTable implementation no longer stores hashes.
extract:
Collected 10000 samples in 3.8511 s
Average time: 0.3759 ms
Stddev time: 0.0407 ms
Min time: 0.3672 ms
Max time: 1.1593 ms
parse:
Collected 10000 samples in 10.9879 s
Average time: 1.0874 ms
Stddev time: 0.0157 ms
Min time: 1.0521 ms
Max time: 1.1939 ms
toString:
Collected 10000 samples in 5.7015 s
Average time: 0.1715 ms
Stddev time: 0.0042 ms
Min time: 0.1693 ms
Max time: 0.1973 ms
fromJson:
Collected 10000 samples in 3.9908 s
Average time: 0.0040 ms
Stddev time: 0.0007 ms
Min time: 0.0039 ms
Max time: 0.0258 ms
toJson:
Collected 10000 samples in 0.0118 s
Average time: 0.0009 ms
Stddev time: 0.0003 ms
Min time: 0.0008 ms
Max time: 0.0198 ms
test:
Collected 10000 samples in 3.9942 s
Average time: 0.0041 ms
Stddev time: 0.0006 ms
Min time: 0.0039 ms
Max time: 0.0236 ms
replace:
Collected 10000 samples in 4.0076 s
Average time: 0.0042 ms
Stddev time: 0.0007 ms
Min time: 0.0041 ms
Max time: 0.0253 ms
remove:
Collected 10000 samples in 4.1095 s
Average time: 0.0131 ms
Stddev time: 0.0011 ms
Min time: 0.0130 ms
Max time: 0.0345 ms
add:
Collected 10000 samples in 4.0237 s
Average time: 0.0042 ms
Stddev time: 0.0007 ms
Min time: 0.0040 ms
Max time: 0.0239 ms
copy:
Collected 10000 samples in 4.0739 s
Average time: 0.0086 ms
Stddev time: 0.0009 ms
Min time: 0.0084 ms
Max time: 0.0283 ms
move:
Collected 10000 samples in 4.1703 s
Average time: 0.0180 ms
Stddev time: 0.0014 ms
Min time: 0.0176 ms
Max time: 0.0391 ms
used Mem: 270.281KiB
Observation that was missed: not caching hashes might be the reason that the literals branch doesn't perform as well for extract, but caching them would increase the size as well. The sso branch results are still unreliable and might be caused by errors.
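The trade-off can be sketched with a toy interning table. This is a simplified stand-in written for illustration, not the jsonpak BiTable. The class name `InternTable`, the method `get_or_incl`, and the `hash_calls` counter are all hypothetical. It shows where a cached hash saves work (rehashing on growth, a cheap pre-check while probing) and what it costs (one extra stored hash per entry).

```python
class InternTable:
    """Open-addressing table mapping strings to dense ids (toy model)."""

    def __init__(self, cache_hashes):
        self.cache_hashes = cache_hashes
        self.vals = []         # id -> string
        self.hashes = []       # id -> cached hash, filled only when caching
        self.slots = [-1] * 8  # slot -> id, -1 means empty
        self.hash_calls = 0    # number of times a string had to be hashed

    def _hash(self, s):
        self.hash_calls += 1
        return hash(s)

    def _hash_of_id(self, i):
        # With caching this is a lookup; without it the string is hashed again.
        return self.hashes[i] if self.cache_hashes else self._hash(self.vals[i])

    def _grow(self):
        old_ids = [i for i in self.slots if i >= 0]
        self.slots = [-1] * (len(self.slots) * 2)
        mask = len(self.slots) - 1
        for i in old_ids:
            pos = self._hash_of_id(i) & mask
            while self.slots[pos] >= 0:
                pos = (pos + 1) & mask
            self.slots[pos] = i

    def get_or_incl(self, s):
        h = self._hash(s)
        mask = len(self.slots) - 1
        pos = h & mask
        while self.slots[pos] >= 0:
            i = self.slots[pos]
            # A cached hash lets most mismatches be rejected without comparing strings.
            if (not self.cache_hashes or self.hashes[i] == h) and self.vals[i] == s:
                return i
            pos = (pos + 1) & mask
        new_id = len(self.vals)
        self.vals.append(s)
        if self.cache_hashes:
            self.hashes.append(h)
        self.slots[pos] = new_id
        if len(self.vals) * 3 > len(self.slots) * 2:  # keep load factor below 2/3
            self._grow()
        return new_id


cached, uncached = InternTable(True), InternTable(False)
keys = [f"key{n}" for n in range(1000)]
ids_cached = [cached.get_or_incl(k) for k in keys]
ids_uncached = [uncached.get_or_incl(k) for k in keys]
```

Both tables hand out the same ids, but the uncached one hashes every stored string again each time it grows, while the cached one keeps 1000 extra hash values in memory.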
Also worth noting is the VM performance of main versus literals: literals may take less time to execute but results in more instructions.
master:
prof: µs #instr location
391925 7504 /home/antonisg/Code/dumpster/dumpster/classical_ml/random_forest.nim(10, 6)
158651 967 /home/antonisg/Code/dumpster/dumpster/classical_ml/random_forest.nim(25, 7)
93285 125480 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/rawops.nim(19, 6)
90572 9 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(110, 6)
90542 2656 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/extra.nim(11, 6)
90433 28 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(96, 6)
69995 10 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(87, 6)
69763 42246 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(26, 6)
66527 38000 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/jsontree.nim(26, 6)
56415 176378 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/bitabs.nim(69, 6)
48564 92468 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/jsonptr.nim(104, 6)
40748 53276 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(354, 6)
30165 3660 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/builder.nim(96, 6)
20157 28 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(107, 6)
20119 32 /home/antonisg/Build/Nim/lib/pure/lexbase.nim(137, 6)
19979 44 /home/antonisg/Build/Nim/lib/pure/lexbase.nim(46, 6)
19931 11 /home/antonisg/Build/Nim/lib/pure/streams.nim(244, 6)
19882 46 /home/antonisg/Build/Nim/lib/pure/streams.nim(1209, 8)
18258 60 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/builder.nim(27, 6)
17519 31410 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/rawops.nim(3, 6)
14154 1162 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/extra.nim(19, 6)
13901 3699 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(520, 6)
12861 53580 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(178, 6)
12466 5886 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/jsontree.nim(106, 6)
10338 47184 /home/antonisg/Build/Nim/lib/system/indices.nim(88, 6)
9350 42469 /home/antonisg/Build/Nim/lib/system/indices.nim(76, 6)
9282 3738 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(4, 6)
9133 39512 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/jsonptr.nim(40, 6)
6872 4814 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/builder.nim(30, 6)
6544 26256 /home/antonisg/Build/Nim/lib/pure/hashes.nim(382, 6)
6035 23296 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(265, 6)
5990 20571 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/bitabs.nim(57, 6)
0.023049161999999956
literals:
prof: µs #instr location
395387 7504 /home/antonisg/Code/dumpster/dumpster/classical_ml/random_forest.nim(10, 6)
177519 959 /home/antonisg/Code/dumpster/dumpster/classical_ml/random_forest.nim(25, 7)
117820 129508 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/rawops.nim(19, 6)
109033 2656 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/extra.nim(11, 6)
102605 9 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(116, 6)
102480 28 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(102, 6)
84329 157283 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/bitabs.nim(66, 6)
82004 10 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(93, 6)
81787 42246 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(32, 6)
55803 13860 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/jsontree.nim(30, 6)
48168 92468 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/jsonptr.nim(104, 6)
40856 53276 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(354, 6)
36521 6340 /home/antonisg/Build/Nim/lib/pure/hashes.nim(213, 8)
34523 15850 /home/antonisg/Build/Nim/lib/pure/hashes.nim(165, 6)
34516 23360 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/jsontree.nim(27, 6)
29987 15850 /home/antonisg/Build/Nim/lib/pure/hashes.nim(117, 6)
25492 110146 /home/antonisg/Build/Nim/lib/pure/hashes.nim(100, 6)
24421 3660 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/builder.nim(111, 6)
24319 60 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/builder.nim(27, 6)
20720 5202 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/parser.nim(4, 6)
20171 28 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(107, 6)
20131 32 /home/antonisg/Build/Nim/lib/pure/lexbase.nim(137, 6)
19999 44 /home/antonisg/Build/Nim/lib/pure/lexbase.nim(46, 6)
19937 11 /home/antonisg/Build/Nim/lib/pure/streams.nim(244, 6)
19854 46 /home/antonisg/Build/Nim/lib/pure/streams.nim(1209, 8)
16666 31410 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/rawops.nim(3, 6)
14121 3699 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(520, 6)
14050 1162 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/extra.nim(19, 6)
12834 53580 /home/antonisg/Build/Nim/lib/pure/parsejson.nim(178, 6)
10795 2196 /home/antonisg/Code/dumpster/jsonpak/src/jsonpak/private/jsontree.nim(117, 6)
10380 47184 /home/antonisg/Build/Nim/lib/system/indices.nim(88, 6)
9279 42469 /home/antonisg/Build/Nim/lib/system/indices.nim(76, 6)
0.02215018999999996