Add lz4 compression to strings

Open githubmanticore opened this issue 3 years ago • 2 comments

Currently, strings are stored essentially uncompressed (only their lengths are compressed) when table compression is not applicable. It makes sense to try adding LZ4 compression to such strings to save disk space (the string hashes used for grouping and filters are stored as a separate attribute).

githubmanticore, Jul 06 '22 15:07
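
To illustrate the proposal in the issue description, here is a minimal sketch of a string block that keeps the lengths up front and compresses only the string bodies, falling back to raw storage when compression does not help. Everything in it is an assumption made for illustration: the block layout, the `RAW`/`COMPRESSED` flags, and the function names are hypothetical and are not taken from the columnar library. It also uses `zlib` as a stand-in codec, because LZ4 is not part of the Python standard library, and it stores the lengths as plain 32-bit integers for simplicity.

```python
import struct
import zlib  # stand-in codec (assumption); LZ4 is not in the Python standard library

RAW, COMPRESSED = 0, 1  # hypothetical one-byte flags describing the payload


def pack_strings(strings):
    """Pack strings as: count, lengths, flag, then raw or compressed bodies."""
    bodies = [s.encode("utf-8") for s in strings]
    header = struct.pack("<I", len(bodies))
    lengths = b"".join(struct.pack("<I", len(b)) for b in bodies)
    payload = b"".join(bodies)
    packed = zlib.compress(payload)
    # Keep the compressed form only when it actually saves space.
    if len(packed) < len(payload):
        return header + lengths + bytes([COMPRESSED]) + packed
    return header + lengths + bytes([RAW]) + payload


def unpack_strings(block):
    """Inverse of pack_strings."""
    (count,) = struct.unpack_from("<I", block, 0)
    lengths = struct.unpack_from(f"<{count}I", block, 4)
    pos = 4 + 4 * count
    flag, data = block[pos], block[pos + 1:]
    payload = zlib.decompress(data) if flag == COMPRESSED else data
    out, offset = [], 0
    for n in lengths:
        out.append(payload[offset:offset + n].decode("utf-8"))
        offset += n
    return out
```

The raw fallback matters because short or high-entropy strings can grow when compressed; the flag byte lets a reader tell the two cases apart without guessing.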

Also, float vectors could benefit from it

glookka, Jun 26 '25 12:06

> float vectors could benefit from it

Here's how lz4 compresses float vectors:

 ~  mysql -v -P9306 -h0 -e "desc test; select count(*) from test"
--------------
desc test
--------------

+--------------+--------------+----------------+
| Field        | Type         | Properties     |
+--------------+--------------+----------------+
| id           | bigint       |                |
| title        | text         | indexed stored |
| image_vector | float_vector | knn            |
+--------------+--------------+----------------+
--------------
select count(*) from test
--------------

+----------+
| count(*) |
+----------+
|   100000 |
+----------+
 ~  ls -lah /opt/homebrew/var/manticore/test/test.6.spb*
-rw-------  1 sn  admin   196M 26 Jun 15:54 /opt/homebrew/var/manticore/test/test.6.spb
-rw-------  1 sn  admin    86M 26 Jun 15:54 /opt/homebrew/var/manticore/test/test.6.spb.lz4

sanikolaev, Jun 26 '25 13:06
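
For a quick read of the listing above, the two file sizes can be turned into a ratio. This is only a rough calculation: the inputs are the human-readable values printed by `ls -lah` (196M and 86M), which are already rounded, so the result is approximate.

```python
# Sizes as printed by `ls -lah` in the listing (rounded by ls).
raw_mb = 196   # test.6.spb
lz4_mb = 86    # test.6.spb.lz4

ratio = lz4_mb / raw_mb   # fraction of the original size that remains
saved = 1 - ratio         # fraction of disk space saved

print(f"compressed/original: {ratio:.2f}, saved: {saved:.0%}")
# prints: compressed/original: 0.44, saved: 56%
```

So for this 100,000-row test table, LZ4 reduced the `.spb` file to a bit less than half of its original size.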