Query length limit

Open graceblackwell opened this issue 5 years ago • 6 comments

Would it be possible to increase the query length limit? I would like to query sequences of up to 300 kb, and it would be good to avoid having to split the sequences into chunks.

graceblackwell avatar Aug 03 '20 13:08 graceblackwell

Yes, this is possible by copying some of the query code. Will do.

bingmann avatar Aug 03 '20 13:08 bingmann

Oh great! Thanks

graceblackwell avatar Aug 03 '20 13:08 graceblackwell

Hi @bingmann, how about canceling the length limit?

shenwei356 avatar Aug 04 '20 00:08 shenwei356

What do you mean by cancel? The score counters can be 16-bit (max query length 65,535) or 32-bit (max query length about 4 billion). 64-bit would also be possible, but expensive memory-wise.
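To make the trade-off concrete, here is a minimal sketch that is not taken from the cobs source. It assumes one unsigned score counter per document; `max_score` and `score_bytes` are hypothetical helper names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: the largest score a counter of type Counter can hold.
// A query whose hit count exceeds this value would overflow the counter.
template <typename Counter>
constexpr std::uint64_t max_score() {
    return std::numeric_limits<Counter>::max();
}

// Hypothetical helper: bytes of score memory needed when every document
// gets its own counter (an assumption made for this sketch).
template <typename Counter>
constexpr std::uint64_t score_bytes(std::uint64_t num_documents) {
    return num_documents * sizeof(Counter);
}
```

For example, `max_score<std::uint16_t>()` is 65535 and `max_score<std::uint32_t>()` is 4294967295, while `score_bytes` doubles with each step up in counter width.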

bingmann avatar Aug 04 '20 06:08 bingmann

I see. I just figured out that 65535 is the maximum 16-bit unsigned integer, and that you use _mm_add_epi16 to update the k-mer counts for 8 documents in parallel. So replacing _mm_add_epi16 with _mm_add_epi64 would lift the limit, at the cost of a little more memory usage.
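The wrap-around behind the limit can be reproduced in isolation. The sketch below is not the cobs scoring loop; it assumes an x86 CPU with SSE2, and the function names are hypothetical. A 128-bit register holds eight 16-bit lanes, four 32-bit lanes, or two 64-bit lanes, so wider counters also mean fewer documents per instruction.

```cpp
#include <cassert>
#include <cstdint>
#include <emmintrin.h>  // SSE2 intrinsics

// Hypothetical demo: fill eight 16-bit lanes with `start`, add 1 to each
// lane with _mm_add_epi16, and return lane 0. The addition wraps modulo 2^16.
inline std::uint16_t add_one_epi16(std::uint16_t start) {
    __m128i counters = _mm_set1_epi16(static_cast<short>(start));
    counters = _mm_add_epi16(counters, _mm_set1_epi16(1));
    return static_cast<std::uint16_t>(_mm_extract_epi16(counters, 0));
}

// Same demo with four 32-bit lanes and _mm_add_epi32.
inline std::uint32_t add_one_epi32(std::uint32_t start) {
    __m128i counters = _mm_set1_epi32(static_cast<int>(start));
    counters = _mm_add_epi32(counters, _mm_set1_epi32(1));
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(counters));
}
```

With these definitions, `add_one_epi16(65535)` wraps to 0, whereas `add_one_epi32(65535)` yields 65536.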

shenwei356 avatar Aug 04 '20 06:08 shenwei356

This limitation has been removed in 05588df18fee9bfdd44f6954059600a399ac2258

Please tell me if the new master version works for you.

bingmann avatar Aug 13 '20 11:08 bingmann