
kNN Query Limit Error: "k value in knn query too large, provided X and the limit is 4096"

Open LukasKriesch opened this issue 1 year ago • 5 comments

When performing k-Nearest Neighbors (kNN) queries using the sqlite-vec extension, queries with a LIMIT or k value greater than the configured maximum (vec_max_k) result in an OperationalError. The default maximum k value is 4096, which can cause issues when trying to retrieve a larger number of results. Is there a possibility to extend the limit?

LukasKriesch avatar Dec 18 '24 08:12 LukasKriesch

I also ran into this issue in the latest version (Python 0.1.6), and I'm surprised that a kNN query needs a hard limit at all. I think the maximum should be the number of rows in the table, so we can retrieve every candidate regardless of the score returned.

yusufsyaifudin avatar Dec 19 '24 10:12 yusufsyaifudin

Hello! So that 4096 limit was added to mitigate possible denial-of-service attacks, because the results of KNN queries are stored in memory.

I wanted to avoid an attacker adding a k = 99999999 clause to a query and exhausting all the memory of an application.

That being said, making this a configurable setting makes a lot of sense, so I'll take a look at including one. Probably one that can either 1) increase the limit to a new N value, or 2) remove the limit entirely. But this would be an opt-in per-table flag, since I want the default to always be safe.

asg017 avatar Dec 19 '24 19:12 asg017
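
For readers who pass a user-supplied k into a query, the concern above can also be handled in application code. The following is a minimal sketch only: `checked_k` and `VEC0_DEFAULT_K_MAX` are hypothetical names chosen for illustration and are not part of sqlite-vec. The value 4096 is the default quoted in the error message.

```python
# Hypothetical constant: mirrors the default limit quoted in the error message.
VEC0_DEFAULT_K_MAX = 4096


def checked_k(requested, k_max=VEC0_DEFAULT_K_MAX):
    """Validate an untrusted k before it is bound into a kNN query."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(requested, bool) or not isinstance(requested, int):
        raise TypeError("k must be an integer")
    if requested < 1:
        raise ValueError("k must be at least 1")
    if requested > k_max:
        raise ValueError(f"k={requested} exceeds the limit of {k_max}")
    return requested
```

Rejecting an oversized k up front gives the caller a clear application-level error instead of an OperationalError raised from inside the query.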

+1 for making this configurable. I am migrating a vector search PoC from usearch to sqlite-vec and was also caught by this limit 😭 Fortunately, it's not the size of any buffer, etc., so I can easily compile the lib with increased limits and try it out in practice.

vec0 works blazingly fast!

sergey-v9 avatar Jan 11 '25 04:01 sergey-v9

Hi, @asg017, excellent work with this extension!

Are there plans for making this configurable in the near future?

Best, Paweł

dudzicp avatar Apr 03 '25 15:04 dudzicp

Just curious, what k value are you trying to use? I'll likely bump it temporarily to 16,384 in the next release and add a proper configurable setting soon after.

Also, for folks who find this thread and need no limit: keep in mind that sqlite-vec internally uses an O(n^2) algorithm on the value of k, so even if you could use a larger number, it would likely be slow. Consider instead storing vectors "manually" outside of a vec0 virtual table, in which case you can use whatever LIMIT value you wish.

asg017 avatar Apr 03 '25 22:04 asg017
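
The "manual" storage suggestion above could look roughly like the sketch below. It assumes vectors are kept as packed float32 BLOBs in an ordinary table and ranked with a brute-force scan. To keep the example runnable with only the Python standard library, it registers its own distance function, `l2_distance`, which is a hypothetical stand-in. With sqlite-vec loaded, its `vec_distance_L2` scalar function would presumably take that role. The table name `items` and the helpers `pack_f32` and `nearest` are likewise illustrative.

```python
import math
import sqlite3
import struct


def pack_f32(vec):
    """Pack a list of floats into a float32 BLOB."""
    return struct.pack(f"{len(vec)}f", *vec)


def l2_distance(a, b):
    """Euclidean distance between two float32 BLOBs of equal length."""
    n = len(a) // 4
    va = struct.unpack(f"{n}f", a)
    vb = struct.unpack(f"{n}f", b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(va, vb)))


def nearest(db, query, limit):
    """Brute-force scan over a regular table, so LIMIT is not capped by vec0."""
    return db.execute(
        "SELECT id, l2_distance(embedding, ?) AS distance "
        "FROM items ORDER BY distance, id LIMIT ?",
        (pack_f32(query), limit),
    ).fetchall()


db = sqlite3.connect(":memory:")
# Stand-in for a distance function provided by an extension (assumption).
db.create_function("l2_distance", 2, l2_distance)
db.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, embedding BLOB NOT NULL)")
db.executemany(
    "INSERT INTO items (id, embedding) VALUES (?, ?)",
    [(i, pack_f32([float(i), 0.0])) for i in range(5000)],
)

# Ask for more rows than the 4096 default that applies to vec0 kNN queries.
rows = nearest(db, [0.0, 0.0], 5000)
```

The trade-off is that every query computes a distance for every row and sorts the full result, so this suits cases where returning a very large number of neighbors matters more than query speed.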