tinyknn Use separate PQs in each cluster

Currently the same product quantizer is used for every cluster in IVF. However, the PQ doesn't use a lot of space (it's just 16 center points), so we might as well train a separate one for the data in each cluster.

The main disadvantage is that queries would have to compute a distance table for each PQ. It's unclear how much of a bottleneck that currently is compared to the actual pass 1 and pass 2 filtering.
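To make the extra cost concrete, here is a rough NumPy sketch, not the library's actual code. It assumes 16 centers per sub-quantizer; the `distance_table` helper, the `(n_sub, 16, d_sub)` codebook layout, and the names `n_sub` and `n_probe` are all hypothetical. With a shared PQ one table serves every probed cluster, while per-cluster PQs need one table per probed cluster.

```python
import numpy as np

def distance_table(query, codebooks):
    """Squared distances from each query chunk to each center (illustrative helper).

    codebooks has shape (n_sub, 16, d_sub); the result has shape (n_sub, 16).
    """
    n_sub, n_centers, d_sub = codebooks.shape
    chunks = query.reshape(n_sub, 1, d_sub)          # split the query into sub-vectors
    return ((chunks - codebooks) ** 2).sum(axis=-1)  # broadcast over the 16 centers

rng = np.random.default_rng(0)
dim, n_sub, n_probe = 32, 8, 4                       # assumed sizes, for illustration only
query = rng.normal(size=dim)

# Shared PQ: a single codebook, so a single table per query.
shared = rng.normal(size=(n_sub, 16, dim // n_sub))
shared_table = distance_table(query, shared)

# Per-cluster PQs: one codebook per cluster, so one table per probed cluster.
per_cluster = rng.normal(size=(n_probe, n_sub, 16, dim // n_sub))
tables = np.stack([distance_table(query, cb) for cb in per_cluster])

print(shared_table.shape, tables.shape)
```

In this sketch the table work grows linearly with the number of probed clusters, which is why it matters how that compares to the filtering passes.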

An advantage is that we can quantize data[mask] - center instead of data[mask] as we do now. I believe this is what QuickADC actually does. Subtracting the "main component" of the points shrinks the range of the values, so we can scale them up further before mapping to [-128, 127], which gives higher precision.
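A small sketch of the precision argument, under loud assumptions: it applies a single linear int8 scale directly to the coordinates, which may not be where the library applies the [-128, 127] mapping, and `quantize_int8`, `center`, and `points` are made-up names and data standing in for one cluster.

```python
import numpy as np

def quantize_int8(x):
    """Map x linearly to int8 with one scale for the whole array (hypothetical helper)."""
    max_abs = np.abs(x).max()
    scale = 127.0 / max_abs if max_abs > 0 else 1.0
    q = np.clip(np.round(x * scale), -128, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
center = np.full(8, 10.0)                                 # cluster center far from the origin
points = center + rng.normal(scale=0.1, size=(1000, 8))   # stand-in for data[mask]

q_raw, s_raw = quantize_int8(points)            # quantize data[mask]
q_res, s_res = quantize_int8(points - center)   # quantize data[mask] - center

# Worst-case reconstruction error after undoing the scale (and the shift).
err_raw = np.abs(q_raw / s_raw - points).max()
err_res = np.abs(q_res / s_res + center - points).max()

print(f"scale raw={s_raw:.1f}, residual={s_res:.1f}")
print(f"max error raw={err_raw:.4f}, residual={err_res:.4f}")
```

Because the residuals span a much smaller range than the raw points, the same 8 bits cover them with a larger scale factor, and the rounding error shrinks accordingly.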

Apr 11 '23 19:04 thomasahle

Currently the distance table computation takes far too long for us to consider computing more than one per query, as is evident from this profiling screenshot.
[Image: Screenshot 2023-04-11 at 10 04 27 PM]

Apr 12 '23 05:04 thomasahle

I did some work trying to speed up distance_table(...) in 2c16ac4471974b30e54904da3070faadf58e352d, but unfortunately I wasn't able to improve it much. Maybe the whole thing needs to be moved to Cython, as disappointing as that would be.

Apr 19 '23 00:04 thomasahle