postgresql-hll icon indicating copy to clipboard operation
postgresql-hll copied to clipboard

Q: Evaluation cost of hll_empty()

Open samkillin opened this issue 2 years ago • 1 comments

I am looking to "upsert" hll's in my database, using a traditional two stage query:

# Add an empty HLL if one doesn't exist
INSERT INTO counts(event_type, reference, count) 
VALUES(:event_type, :reference, hll_empty()) 
ON CONFLICT DO NOTHING; 

# Add to the HLL
UPDATE counts 
SET count = hll_add(count, hll_hash_text(:event_id)) 
WHERE event_type = :event_type AND reference = :reference;

Given hll_empty() will be invoked on every query, even in the update case (i.e. before a conflict is identified), how cheap/expensive is that function to execute? The README mentions use of a sentinel value – is this sentinel value recalculated every call?

Should I be concerned with the evaluation of that function being a potential bottleneck compared to hll_add and hll_hash_text?

samkillin avatar Jan 29 '24 01:01 samkillin

Looking at the code, hll_empty() in fact creates an empty set (internally represented as a byte array) at every call. So it does not use a precalculated sentinel.

emelsimsek avatar Jan 30 '24 12:01 emelsimsek