Should upsert() work with compound primary key?
Given this table:
create table main.summary (
Source TEXT,
Object TEXT,
Category TEXT,
Count INTEGER,
primary key (Source, Object, Category)
)
I'd like to have a simple helper to run rollups like so:
def summarize(db, table, source, object_name, category='All', criteria=None):
db['summary'].upsert({
'source': source,
'object': object_name,
'category': category,
'count': table.count_where(criteria)
})
To use like so:
utils.summarize(db, db["accounts_a"], 'Client A', 'Accounts')
utils.summarize(db, db["accounts_b"], 'Client B', 'Accounts', 'Customer', 'Type = "Customer"')
utils.summarize(db, db["accounts_b"], 'Client B', 'Accounts', 'Prospect', 'Type = "Prospect"')
But I'm getting an error that upsert requires a pk:
File "/Users/jclark/Documents/dev/clientname/merge/scripts/utils.py", line 13, in summarize
db['summary'].upsert({
File "/opt/homebrew/lib/python3.12/site-packages/sqlite_utils/db.py", line 3346, in upsert
return self.upsert_all(
^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/sqlite_utils/db.py", line 3383, in upsert_all
return self.insert_all(
^^^^^^^^^^^^^^^^
File "/opt/homebrew/lib/python3.12/site-packages/sqlite_utils/db.py", line 3237, in insert_all
raise PrimaryKeyRequired("upsert() requires a pk")
sqlite_utils.db.PrimaryKeyRequired: upsert() requires a pk
Per the upsert docs, the pk param to upsert() is only used when creating the target table, which I am not, so I'm assuming the error means it doesn't like my pk. Is that because it's a compound key? Or am I doing something else wrong?
This appears to be a documentation issue.,
Took a look in the code; looks like pk is required for upsert, despite the docs saying,
Note that the pk and column_order parameters here are optional if you are certain that the table has already been created. You should pass them if the table may not exist at the time the first upsert is performed.
So I modified my method to pass the pk as a tuple and it's working fine.
def summarize(db, table, source, object_name, category='All', criteria=None):
db['summary'].upsert({
'source': source,
'object': object_name,
'category': category,
'count': table.count_where(criteria)
}, pk=("source", "object", "category"))