columnar
Add lz4 compression to strings
Currently, when table compression is not applicable, strings are stored essentially uncompressed: only their lengths are compressed. It makes sense to try adding LZ4 compression to such strings to save disk space (the string hashes used for grouping and filters are stored as a separate attribute).
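To make the idea concrete, here is a minimal sketch of "compress a block of strings, but fall back to raw storage when compression does not help". It is not the columnar library's actual on-disk format: the one-byte flag, the `pack_block`/`unpack_block` names, and the length layout are all hypothetical, and `zlib` is used as a stand-in codec only because LZ4 is not in the Python standard library. Unlike the real storage, the lengths here are left uncompressed for simplicity.

```python
import struct
import zlib

# Hypothetical one-byte flags marking how the block body is stored.
RAW, PACKED = 0, 1


def pack_block(strings, compress=zlib.compress):
    """Pack a block of byte strings: count and lengths first, then the bodies.

    `compress` is a stand-in for an LZ4 compressor (zlib by default).
    """
    count = len(strings)
    lengths = struct.pack(f"<I{count}I", count, *(len(s) for s in strings))
    body = b"".join(strings)
    packed = compress(body)
    # Keep the raw body when compression does not actually save space.
    if len(packed) < len(body):
        return bytes([PACKED]) + lengths + packed
    return bytes([RAW]) + lengths + body


def unpack_block(blob, decompress=zlib.decompress):
    """Inverse of pack_block: returns the original list of byte strings."""
    flag = blob[0]
    (count,) = struct.unpack_from("<I", blob, 1)
    lengths = struct.unpack_from(f"<{count}I", blob, 5)
    body = blob[5 + 4 * count:]
    if flag == PACKED:
        body = decompress(body)
    out, pos = [], 0
    for n in lengths:
        out.append(body[pos:pos + n])
        pos += n
    return out
```

The fallback matters for short or high-entropy strings, where a general-purpose codec can produce output larger than its input.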
Also, float vectors could benefit from it
Here's how lz4 compresses float vectors:
~ mysql -v -P9306 -h0 -e "desc test; select count(*) from test"
--------------
desc test
--------------
+--------------+--------------+----------------+
| Field        | Type         | Properties     |
+--------------+--------------+----------------+
| id           | bigint       |                |
| title        | text         | indexed stored |
| image_vector | float_vector | knn            |
+--------------+--------------+----------------+
--------------
select count(*) from test
--------------
+----------+
| count(*) |
+----------+
|   100000 |
+----------+
~ ls -lah /opt/homebrew/var/manticore/test/test.6.spb*
-rw------- 1 sn admin 196M 26 Jun 15:54 /opt/homebrew/var/manticore/test/test.6.spb
-rw------- 1 sn admin 86M 26 Jun 15:54 /opt/homebrew/var/manticore/test/test.6.spb.lz4
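For reference, the two file sizes in the listing above work out as follows. This is just arithmetic on the rounded `ls -lah` figures (196M and 86M), so the result is approximate; the `compression_summary` helper is a hypothetical name used for illustration.

```python
def compression_summary(original_mb, compressed_mb):
    """Return (compression ratio, fraction of space saved)."""
    ratio = original_mb / compressed_mb
    saved = 1 - compressed_mb / original_mb
    return ratio, saved


# Sizes taken from the ls output: test.6.spb and test.6.spb.lz4.
ratio, saved = compression_summary(196, 86)
print(f"ratio {ratio:.2f}x, saved {saved:.0%}")  # ratio 2.28x, saved 56%
```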