SyncKit Memory usage

Even after fixing the leak(s) in #59, I am not able to sync my fairly large database. It contains ~150000 CKRecords in CloudKit. Memory usage keeps increasing slowly while SyncKit downloads changes. After the download completes, memory growth slows but still continues (while SyncKit is applying attributes/relationships). It finally reaches ~700 MB, and on one device (an iPad) the app receives a memory warning and the OS kills it. Do you have any suggestions for what can be done here? What can I try to improve SyncKit's performance?

Oct 16 '18 15:10 BlixLT

This is a tricky one, let me get back to you on it.

Oct 18 '18 14:10 mentrena

So, yeah, 150000 records sounds quite big 😅

I see you've already looked at the code, but basically the library does this:

Create child context from the main managed object context.
Query CloudKit with the stored database token and record zone tokens.
Download changed records (batch) from record zone and apply attribute changes to the objects in the child context.
Repeat previous step until no more changes in record zone.
Apply relationship changes to objects in child context.
Save the child context so the main context gets updated, then save the main context so changes are persisted to disk.
Persist the record zone and database tokens.

The reason relationships are updated in a second step is that CloudKit will batch the download of records, so it is possible that a record downloaded in a given batch will contain a CKReference for a record that hasn't been downloaded yet (because it's coming in the next batch).
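A minimal sketch of that two-pass idea, using plain Swift types rather than Core Data or CloudKit. `Record`, `Object`, `parentID` and `applyInTwoPasses` are hypothetical names made up for illustration, not SyncKit's actual code:

```swift
// Hypothetical stand-in for a downloaded record; not a CloudKit or SyncKit type.
struct Record {
    let id: String
    let name: String
    let parentID: String?   // plays the role of a CKReference
}

// Hypothetical stand-in for a managed object in the child context.
final class Object {
    let id: String
    var name: String
    var parent: Object?
    init(id: String, name: String) {
        self.id = id
        self.name = name
    }
}

/// Applies attributes batch by batch, then connects relationships in a second pass.
func applyInTwoPasses(_ batches: [[Record]]) -> [String: Object] {
    var objects: [String: Object] = [:]
    var deferred: [(childID: String, parentID: String)] = []

    // Pass 1: attributes only. References are remembered, not resolved,
    // because the target may arrive in a later batch.
    for batch in batches {
        for record in batch {
            let object = objects[record.id] ?? Object(id: record.id, name: record.name)
            object.name = record.name
            objects[record.id] = object
            if let parentID = record.parentID {
                deferred.append((childID: record.id, parentID: parentID))
            }
        }
    }

    // Pass 2: every record has been seen, so each target can be looked up.
    for link in deferred {
        objects[link.childID]?.parent = objects[link.parentID]
    }
    return objects
}

let batches = [
    [Record(id: "A", name: "child", parentID: "B")],   // references B before B arrives
    [Record(id: "B", name: "parent", parentID: nil)]
]
let objects = applyInTwoPasses(batches)
```

Here record A arrives one batch before the record it references, and the link is still set correctly because it is only resolved once both batches have been applied.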

Changes made in the child context have to stay in memory until they're saved and persisted to disk, so that becomes an issue with a database this big. I think the solution would be to save the child context more regularly (after each batch is downloaded?) to avoid memory usage growing too much, but I would need to reconsider the algorithm to avoid cases where the graph becomes inconsistent. I will think about it, but this might be a bit harder to fix.

Oct 19 '18 19:10 mentrena

Any new thoughts on how to implement this? 1.5m records isn't that much on modern devices. I'm thinking that because realm-swift already requires relationships to be optional (afaik), this library could just persist the incoming changes without waiting on dependency objects.

Sep 15 '22 17:09 aehlke

If you do that you will lose the relationships. Imagine you have A -> B, and your app downloads record A' in the first batch, and record B' in the second batch. You need to make sure that changes in record B' are applied before changes in record A', so that object B exists in Realm before you try to set the relationship A -> B.

Some kind of logic is always needed to process relationships that cannot be applied yet. What I do is save basic properties as soon as the record is downloaded, but create a PendingRelationship object for record references, so that the library can do a pass after all records have been downloaded and connect the objects.
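Roughly, the pending-relationship approach could look like the sketch below. It is only an illustration: the in-memory `Store` stands in for the Realm database, and `IncomingRecord`, `StoredObject` and the field names on `PendingRelationship` are assumptions, not the library's real schema.

```swift
// Hypothetical types for illustration; field names are assumptions.
struct IncomingRecord {
    let id: String
    let title: String
    let targetID: String?   // ID of the referenced record, if any
}

struct StoredObject {
    var title: String
    var targetID: String?   // set only once the relationship is connected
}

struct PendingRelationship {
    let sourceID: String
    let targetID: String
}

// In-memory stand-in for the persistent store.
final class Store {
    private(set) var objects: [String: StoredObject] = [:]
    private(set) var pending: [PendingRelationship] = []

    /// Saves one batch: basic properties right away, references as pending entries.
    func save(batch: [IncomingRecord]) {
        for record in batch {
            var object = objects[record.id] ?? StoredObject(title: record.title, targetID: nil)
            object.title = record.title
            objects[record.id] = object
            if let targetID = record.targetID {
                pending.append(PendingRelationship(sourceID: record.id, targetID: targetID))
            }
        }
    }

    /// Final pass: connects every relationship whose two ends exist, keeps the rest pending.
    func applyPendingRelationships() {
        var unresolved: [PendingRelationship] = []
        for relationship in pending {
            if objects[relationship.sourceID] != nil, objects[relationship.targetID] != nil {
                objects[relationship.sourceID]?.targetID = relationship.targetID
            } else {
                unresolved.append(relationship)
            }
        }
        pending = unresolved
    }
}

let store = Store()
store.save(batch: [IncomingRecord(id: "A", title: "a", targetID: "B")])   // B not downloaded yet
store.save(batch: [IncomingRecord(id: "B", title: "b", targetID: nil)])
store.applyPendingRelationships()
```

With a real database, each `save(batch:)` would be its own write, so only the small pending entries have to survive until the final pass, not the full set of downloaded objects.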

Oct 16 '22 18:10 mentrena

Thanks for elaborating!

Oct 17 '22 01:10 aehlke