tinyknn
A tiny approximate K-Nearest Neighbour library in Python based on Fast Product Quantization and IVF
For `pip install .`:

```
tinyknn/_fast_pq.cpp:17539:320: error: can’t convert a value of type ‘int’ to vector type ‘__m128i’ {aka ‘__vector(2) long long int’} which has different size
17539 | __pyx_t_6...
```
Since 2df6a428cf6bcc4e4a08f15e3f7caef9ce5f4f61 it is possible to store every datapoint in `n` lists by building with `ivf.build(n_probes=n)`. This improves the recall/QPS trade-off quite a lot, but only when going from `n=1` to...
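The multi-list idea can be sketched in plain NumPy. The helper below is purely illustrative (it is not tinyknn's internals): each point is inserted into the inverted lists of its `n` nearest centroids instead of only the single nearest one.

```python
import numpy as np

# Illustrative sketch, not tinyknn code: insert each point into the
# inverted lists of its n nearest centroids.
def build_multi_lists(data, centroids, n=2):
    # Squared euclidean distance between every point and every centroid.
    dists = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    # Indices of the n closest centroids per point.
    nearest = np.argsort(dists, axis=1)[:, :n]
    lists = [[] for _ in range(len(centroids))]
    for point_id, cluster_ids in enumerate(nearest):
        for c in cluster_ids:
            lists[c].append(point_id)
    return lists

rng = np.random.default_rng(0)
data = rng.normal(size=(100, 8))
centroids = rng.normal(size=(10, 8))
lists = build_multi_lists(data, centroids, n=2)
# Every point now appears in exactly 2 lists, which is what trades
# index size for recall.
```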
AVX-512 has some nice features, such as support for fast float16 operations. This might allow us to do rescoring very quickly. The Quicker ADC paper also mentions some uses of...
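As a rough illustration of the rescoring idea (plain NumPy here, not AVX-512 intrinsics, and all names are made up): keeping a float16 copy of the dataset halves the memory traffic when rescoring the candidates returned by the PQ search, which is exactly the step hardware float16 support would accelerate.

```python
import numpy as np

# Illustrative sketch: rescore PQ candidates against a float16 copy
# of the data. AVX-512 FP16 would do this arithmetic natively.
rng = np.random.default_rng(0)
data = rng.normal(size=(1000, 32)).astype(np.float32)
data_f16 = data.astype(np.float16)  # half the memory traffic of float32

query = rng.normal(size=32).astype(np.float32)
candidates = np.array([3, 17, 256, 999])  # pretend these came from a PQ scan

# Exact-ish squared distances in float16, only for the few candidates.
d16 = ((data_f16[candidates] - query.astype(np.float16)) ** 2).sum(axis=1)
best = candidates[np.argmin(d16)]
```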
Often we use PQ to estimate the distance from a full-precision vector to a set of compressed points. However, we can also try to compute the distance between all...
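The compressed-to-compressed case is the classic "symmetric" PQ distance. A small NumPy sketch of the idea (illustrative, not tinyknn's API): precompute a centroid-to-centroid distance table per subspace, then estimate the distance between two codes by table lookups alone, never decompressing either point.

```python
import numpy as np

# Illustrative sketch of symmetric PQ distances.
rng = np.random.default_rng(0)
n_subspaces, n_centers, sub_dim = 4, 16, 2
codebooks = rng.normal(size=(n_subspaces, n_centers, sub_dim))

# tables[m, i, j] = squared distance between centers i and j of subspace m.
diff = codebooks[:, :, None, :] - codebooks[:, None, :, :]
tables = (diff ** 2).sum(axis=-1)

def symmetric_distance(code_a, code_b):
    # code_a, code_b: one centroid index per subspace.
    return sum(tables[m, code_a[m], code_b[m]] for m in range(n_subspaces))

a = rng.integers(0, n_centers, size=n_subspaces)
b = rng.integers(0, n_centers, size=n_subspaces)
est = symmetric_distance(a, b)
```

The estimate is exact for the *reconstructed* points, so the only error comes from quantization of both sides rather than one.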
Currently `IVF.fit(...)` uses brute-force nearest neighbours to find which clusters to insert the points into. Instead we could use the same `PQ.top(...)` method that we use for queries...
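A hypothetical sketch of what that would look like (`PQ.top(...)` is tinyknn's; the helper below is illustrative): PQ-encode the cluster centers once, then assign each point by an asymmetric top-1 scan over the quantized centers instead of exact distances.

```python
import numpy as np

# Illustrative sketch: assign points to IVF clusters via asymmetric
# PQ distances to *quantized* cluster centers.
rng = np.random.default_rng(0)
n_subspaces, n_centers, sub_dim = 4, 16, 2
dim = n_subspaces * sub_dim
codebooks = rng.normal(size=(n_subspaces, n_centers, sub_dim))

# Pretend the IVF cluster centers have already been PQ-encoded:
n_clusters = 32
cluster_codes = rng.integers(0, n_centers, size=(n_clusters, n_subspaces))

def assign_cluster(point):
    chunks = point.reshape(n_subspaces, sub_dim)
    # lut[m, i] = squared distance from chunk m to center i of subspace m.
    lut = ((codebooks - chunks[:, None, :]) ** 2).sum(axis=-1)
    # Distance to each cluster = sum of lookups along its code.
    dists = lut[np.arange(n_subspaces), cluster_codes].sum(axis=1)
    return int(np.argmin(dists))

point = rng.normal(size=dim)
cluster = assign_cluster(point)
```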
A classical way to make building the index faster, cheaper memory-wise, and potentially better (bigger, but lower quality) is to use a top-level product code. Instead of just...
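A minimal sketch of the top-level product-code idea (illustrative names, not tinyknn code): cluster the two halves of the vector separately, so a point's cell is the *pair* of half-assignments. This yields k*k cells while only storing and training 2k centroids.

```python
import numpy as np

# Illustrative sketch of a two-part top-level product code.
rng = np.random.default_rng(0)
k, half = 16, 4
cents_lo = rng.normal(size=(k, half))  # centroids for the first half
cents_hi = rng.normal(size=(k, half))  # centroids for the second half

def cell_id(point):
    lo, hi = point[:half], point[half:]
    i = np.argmin(((cents_lo - lo) ** 2).sum(axis=1))
    j = np.argmin(((cents_hi - hi) ** 2).sum(axis=1))
    return int(i * k + j)  # one of k*k = 256 cells

point = rng.normal(size=2 * half)
cid = cell_id(point)
```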
Currently the same product quantizer is used for every cluster in IVF. However, the PQ doesn't use a lot of space (it's just 16 center points), so we might as...
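A toy sketch of per-cluster codebooks (illustrative only, with a deliberately simplified "training" step that just samples centers from the cluster's own points): each IVF cluster stores its own small PQ codebook, so quantization adapts to the local distribution.

```python
import numpy as np

# Illustrative sketch: one PQ codebook per IVF cluster.
rng = np.random.default_rng(0)
n_subspaces, n_centers, sub_dim = 2, 16, 2

def train_codebook(points):
    # Toy "training": sample centers from the cluster's own points.
    cb = np.empty((n_subspaces, n_centers, sub_dim))
    for m in range(n_subspaces):
        chunk = points[:, m * sub_dim:(m + 1) * sub_dim]
        cb[m] = chunk[rng.integers(0, len(chunk), size=n_centers)]
    return cb

def encode(point, cb):
    chunks = point.reshape(n_subspaces, sub_dim)
    return ((cb - chunks[:, None, :]) ** 2).sum(-1).argmin(axis=1)

# Three well-separated clusters, each with its own codebook.
clusters = [rng.normal(loc=c, size=(50, 4)) for c in (-5.0, 0.0, 5.0)]
codebooks = [train_codebook(pts) for pts in clusters]
codes = [np.array([encode(p, cb) for p in pts])
         for pts, cb in zip(clusters, codebooks)]
```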
Currently, only SSE is supported. It would be nice to also support AMD chips.
[PyNNDescent](https://github.com/lmcinnes/pynndescent) gets a lot of its speed (presumably) by using [numba](https://numba.pydata.org/). This should be relatively easy to add to fast_pq as well.
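A sketch of the kind of hot loop numba could JIT (illustrative, not fast_pq's actual kernel). The fallback keeps the example runnable without numba installed; with numba present, the identical function compiles to machine code on first call.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # plain-Python fallback with the same semantics
    def njit(f):
        return f

@njit
def pq_scan(codes, lut):
    # codes: (n_points, n_subspaces) uint8; lut: (n_subspaces, 16) float32.
    # Sum one table lookup per subspace to get each point's distance.
    n, m = codes.shape
    out = np.empty(n, dtype=np.float32)
    for i in range(n):
        acc = np.float32(0.0)
        for j in range(m):
            acc += lut[j, codes[i, j]]
        out[i] = acc
    return out

rng = np.random.default_rng(0)
codes = rng.integers(0, 16, size=(1000, 8)).astype(np.uint8)
lut = rng.random((8, 16)).astype(np.float32)
dists = pq_scan(codes, lut)
```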