Make ID 64 bits

Open sanikolaev opened this issue 4 years ago • 2 comments

Historically, since Manticore 3, a document ID has had to be a positive signed 64-bit integer, which in fact limits it to 63 bits. If you try to insert a document with id > 2^63 - 1, Manticore will convert it to exactly 2^63 - 1. This is confusing, and some users have confirmed in our public Slack that it is both confusing and inconvenient. For example, if you use a 64-bit MurmurHash (or any other 64-bit hash) as the basis for your ids, it won't work. You can, of course, remove one of the bits to comply with the limitation, but that doesn't seem to be a good option.

So it makes sense to consider making ID true 64 bits.
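As a rough illustration of the "remove one of the bits" workaround mentioned above, here is a minimal Python sketch. The `hash64` helper is a stand-in built on SHA-256 (not MurmurHash), and both function names are hypothetical; the only point is the masking step and what it costs.

```python
import hashlib

INT64_MAX = 2**63 - 1  # largest positive signed 64-bit value


def hash64(key: str) -> int:
    """Stand-in 64-bit hash: first 8 bytes of SHA-256 (assumption, not MurmurHash)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def to_63_bit_id(h: int) -> int:
    """Clear the top bit so the value fits into a non-negative signed 64-bit integer."""
    return h & INT64_MAX


h = hash64("https://example.com/page")
doc_id = to_63_bit_id(h)

# Two different 64-bit hashes that differ only in the top bit now collide.
collision = to_63_bit_id(2**63 + 42) == to_63_bit_id(42)
```

Dropping the bit halves the id space, so hashes that differ only in the top bit map to the same id. The sketch also does not handle a masked result of 0.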

sanikolaev avatar Mar 03 '22 11:03 sanikolaev

I ran into the same issue: I want to store URL hashes as ids and have to drop one bit. (Also, it took me a while to realize what was happening...)

puhoy avatar May 16 '22 18:05 puhoy

Comment from public Slack:

I was hoping to see this as part of the 5.x release. This is pretty much blocking my ability to deploy Manticore solutions in my infrastructure. For a couple of small sets, redundant IDs under 64 bits were possible, sure, but the bulk of my system has a 64-bit key for everything.

sanikolaev avatar Jun 23 '22 03:06 sanikolaev

I would have thought going from 63 to 64 bits would be a pretty straightforward type change. Unless, of course, that 64th bit is being used internally for some purpose. Is that the obstacle in the way of switching from an int64 to a uint64? I'm trying to gauge the likelihood of this change being implemented before I invest any more time in Manticore. I can't justify introducing an artificial secondary key into my systems just to work around this issue.

wayneseguin avatar Aug 16 '22 15:08 wayneseguin

We do not have a uint64 type, so with a straightforward type change queries such as these could produce a wrong result set:

select id as i, max(int64_attr, i) as s from idx order by s asc;
select id as i, int64_attr + i as cnd from idx where cnd < 0;

To implement this, a uint64 type should be added to:

  • index data storage
  • all kinds of data sources
  • RT indexing stage
  • expression calculation stage
  • filtering stage
  • sorting stage
  • grouping stage
  • the JSON part (check that it can handle this type well)
  • the APIs (fix them and check that they can handle the new data type)
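To see why the two example queries above can go wrong, here is a small Python sketch that emulates reading the same 64 bits as signed instead of unsigned. It is plain Python arithmetic, not Manticore's expression engine; `as_int64`, `wrap_add_int64`, and the sample values are hypothetical, and the sketch assumes a "straightforward type change" means the id ends up interpreted as int64 inside expressions.

```python
MASK64 = 2**64 - 1


def as_int64(u: int) -> int:
    """Reinterpret an unsigned 64-bit value as two's-complement signed."""
    u &= MASK64
    return u - 2**64 if u >= 2**63 else u


def wrap_add_int64(a: int, b: int) -> int:
    """Add with 64-bit wraparound, then read the result as signed."""
    return as_int64((a + b) & MASK64)


doc_id = 2**63 + 5   # a valid unsigned 64-bit id with the top bit set
int64_attr = 10      # hypothetical attribute value

# max(int64_attr, id): the unsigned answer is the id,
# but with the id read as signed the attribute wins.
unsigned_max = max(int64_attr, doc_id)
signed_max = max(int64_attr, as_int64(doc_id))

# int64_attr + id < 0: false for the true unsigned values,
# true once the sum is computed and read as int64.
unsigned_sum = int64_attr + doc_id
signed_sum = wrap_add_int64(int64_attr, doc_id)
```

With these values the sort key in the first query would be the attribute rather than the id, and the second query's filter would match a row whose true sum is positive.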

tomatolog avatar Aug 16 '22 21:08 tomatolog

Making ID 64 bits is on the roadmap (https://roadmap.manticoresearch.com/), which means it's planned, if not already in progress.

sanikolaev avatar Aug 17 '22 03:08 sanikolaev

➤ Ilya Kuznetsov commented:

Done in the docid64 branch (there might be some untested cases where signed docids are not supported yet).

githubmanticore avatar Sep 01 '22 13:09 githubmanticore

Merged in https://github.com/manticoresoftware/manticoresearch/commit/2c1629266c654151bb654c8c616349ee1158b868

sanikolaev avatar Sep 05 '22 03:09 sanikolaev

It's worth mentioning that what's implemented in https://github.com/manticoresoftware/manticoresearch/commit/2c1629266c654151bb654c8c616349ee1158b868 is limited support, in the sense that the id can now be a signed 64-bit integer. In the future we'd like to make it unsigned 64-bit.
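For readers holding unsigned 64-bit keys, one possible client-side approach with signed 64-bit ids is to reinterpret the bits and convert back on read. The sketch below is only an assumption about how a client might do this: the helper names are hypothetical, and the thread does not confirm that negative ids work in every code path.

```python
import struct


def u64_to_i64(u: int) -> int:
    """Pack as unsigned 64-bit, unpack the same bytes as signed 64-bit."""
    return struct.unpack("<q", struct.pack("<Q", u))[0]


def i64_to_u64(i: int) -> int:
    """Inverse mapping: signed 64-bit back to the original unsigned value."""
    return struct.unpack("<Q", struct.pack("<q", i))[0]


samples = [0, 1, 2**63 - 1, 2**63, 2**64 - 1]

# Every unsigned key maps to a distinct signed id and back again.
round_trip_ok = all(i64_to_u64(u64_to_i64(u)) == u for u in samples)
```

The mapping is one-to-one, so no hash bits are lost, but ids with the top bit set become negative and therefore sort before the others.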

sanikolaev avatar Nov 29 '22 06:11 sanikolaev

The related task is https://github.com/manticoresoftware/manticoresearch/issues/1030 - "Make ID unsigned bigint"

sanikolaev avatar Feb 09 '23 05:02 sanikolaev