WIP: [Improvement] make part of operationrepo initialization async
Description
One Line Summary
Make part of the initialization of OperationRepo asynchronous so that previously saved operations can be added asynchronously, preventing long-loading operations from blocking the main thread.
Details
Motivation
We have observed numerous ANRs during the initialization phase, with OperationRepo.init being the top cause. The issue does not occur consistently, and we suspect it is related to the device's state or to a problem accessing the device's disk. To address this, we plan to make part of the initialization in OperationRepo asynchronous. By moving the loading to a background thread, we aim to keep the main thread from being blocked when initialization unexpectedly takes a long time.
Scope
Saved operations from the previous session will not be executed until they have been loaded successfully. Their order relative to new operations may be incorrect, depending on when loading completes. This change tries to insert saved operations starting from the beginning of the queue, and any later operation is added to the end of the queue.
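The intended ordering can be illustrated with a minimal sketch. `SketchQueue` and its methods are hypothetical names used only for illustration, and operations are plain strings here; this is not the actual OperationRepo code.

```kotlin
// Hypothetical stand-in for the operation queue; names are illustrative only.
class SketchQueue {
    private val queue = mutableListOf<String>()

    // Operations created during the current session go to the end of the queue.
    fun enqueue(op: String) = synchronized(queue) { queue.add(op) }

    // Operations loaded from disk are inserted at the front, keeping their saved order.
    fun addLoaded(saved: List<String>) = synchronized(queue) { queue.addAll(0, saved) }

    fun snapshot(): List<String> = synchronized(queue) { queue.toList() }
}

fun main() {
    val q = SketchQueue()
    q.enqueue("new-1")                         // arrives before loading finishes
    q.addLoaded(listOf("saved-1", "saved-2"))  // background load completes
    q.enqueue("new-2")                         // arrives after loading
    println(q.snapshot()) // [saved-1, saved-2, new-1, new-2]
}
```

Note that if "new-1" had already been taken off the queue and executed before the load completed, it would still run ahead of the saved operations, which is the ordering caveat described above.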
Testing
Unit testing
OPTIONAL - Explain unit tests added, if not clear in the code.
Manual testing
RECOMMEND - OPTIONAL - Explain what scenarios were tested and the environment. Example: Tested opening a notification while the app was foregrounded, app build with Android Studio 2020.3 with a fresh install of the OneSignal example app on a Pixel 6 with Android 12.
Affected code checklist
- [ ] Notifications
- [ ] Display
- [ ] Open
- [ ] Push Processing
- [ ] Confirm Deliveries
- [ ] Outcomes
- [ ] Sessions
- [ ] In-App Messaging
- [ ] REST API requests
- [ ] Public API changes
Checklist
Overview
- [ ] I have filled out all REQUIRED sections above
- [ ] PR does one thing
- If it is hard to explain how any code changes are related to each other then it most likely needs to be more than one PR
- [ ] Any Public API changes are explained in the PR details and conform to existing APIs
Testing
- [ ] I have included test coverage for these changes, or explained why they are not needed
- [ ] All automated tests pass, or I explained why that is not possible
- [ ] I have personally tested this on my device, or explained why that is not possible
Final pass
- [ ] Code is as readable as possible.
- Simplify with less code, followed by splitting up code into well named functions and variables, followed by adding comments to the code.
- [ ] I have reviewed this PR myself, ensuring it meets each checklist item
- WIP (Work In Progress) is ok, but explain what is still in progress and what you would like feedback on. Start the PR title with "WIP" to indicate this.
We need to delay `OperationModelStore.load()` as well, as this is what does the disk read. See this ANR stack trace:

    at com.onesignal.common.modeling.Model.initializeFromJson(Model.kt:98)
    at com.onesignal.core.internal.operations.impl.OperationModelStore.create(OperationModelStore.kt:68)
    at com.onesignal.core.internal.operations.impl.OperationModelStore.create(OperationModelStore.kt:30)
    at com.onesignal.common.modeling.ModelStore.load(ModelStore.kt:162)
    at com.onesignal.core.internal.operations.impl.OperationModelStore.<init>(OperationModelStore.kt:32)
    at java.lang.reflect.Constructor.newInstance0(Native method)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:343)
    at com.onesignal.common.services.ServiceRegistrationReflection.resolve(ServiceRegistration.kt:89)
    at com.onesignal.common.services.ServiceProvider.getServiceOrNull(ServiceProvider.kt:79)
    at com.onesignal.common.services.ServiceProvider.getService(ServiceProvider.kt:67)
    at com.onesignal.common.services.ServiceRegistrationReflection.resolve(ServiceRegistration.kt:82)
    at com.onesignal.common.services.ServiceProvider.getServiceOrNull(ServiceProvider.kt:79)
    at com.onesignal.common.services.ServiceProvider.getService(ServiceProvider.kt:67)
    at com.onesignal.internal.OneSignalImp.initWithContext(OneSignalImp.kt:510)
    at com.onesignal.OneSignal.initWithContext(OneSignal.kt:135)

So the order in which `ServiceProvider` creates instances of classes is that it goes deep first and works its way back up. So in this case, since `OperationRepo` requires an instance of `ConfigModelStore` as part of its constructor, an instance of `ConfigModelStore` is created before `OperationRepo`.
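To make the depth-first creation order concrete, here is a minimal sketch of a container that builds constructor dependencies before the class that needs them. `SketchProvider`, `Store`, and `Repo` are hypothetical; the real ServiceProvider resolves constructors through reflection and is not reproduced here.

```kotlin
// Hypothetical miniature container; not the real ServiceProvider.
class SketchProvider {
    val creationOrder = mutableListOf<String>()
    private val factories = mutableMapOf<String, (SketchProvider) -> Any>()
    private val instances = mutableMapOf<String, Any>()

    fun register(name: String, factory: (SketchProvider) -> Any) {
        factories[name] = factory
    }

    fun resolve(name: String): Any {
        instances[name]?.let { return it }
        // The factory resolves its own dependencies first, so they finish constructing
        // before this instance does.
        val instance = factories.getValue(name)(this)
        instances[name] = instance
        creationOrder.add(name)
        return instance
    }
}

class Store              // imagine a constructor that reads from disk
class Repo(val store: Store)

fun main() {
    val provider = SketchProvider()
    provider.register("Repo") { Repo(it.resolve("Store") as Store) }
    provider.register("Store") { Store() }
    provider.resolve("Repo")
    println(provider.creationOrder) // [Store, Repo]
}
```

Any work done in the dependency's constructor therefore runs on whichever thread first asks for the dependent class, which in the stack trace above is the thread calling `initWithContext`.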
Since load() is a generic function from ModelStore, should we delay all model stores or limit the change to OperationModelStore only?
Also, both load() and persist() may be locking the models for longer than needed, especially since they access the preference service inside the synchronized block. Do you think we can also introduce a small optimization along with this issue?
> Since load() is a generic function from ModelStore, should we delay all model stores or limit the change to OperationModelStore only?

Longer term we probably want to change ModelStore so that none of the models read from disk in the constructor, or ensure we never create these instances on the main thread. In the short term, to get a quick fix out, scoping it to only OperationModelStore is probably what we should do for now.

> Also, both load() and persist() may be locking the models for longer than needed, especially since they access the preference service inside the synchronized block. Do you think we can also introduce a small optimization along with this issue?
Ya we could make those changes in this PR as well.
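One way the lock could be narrowed is sketched below, under stated assumptions: `SketchModelStore` is hypothetical, models are plain strings, and a `MutableMap` stands in for the preference service. The real ModelStore serializes models to JSON and is not shown.

```kotlin
import java.util.concurrent.ConcurrentHashMap

// Hypothetical store; a map stands in for the preference service.
class SketchModelStore(private val prefs: MutableMap<String, String>) {
    private val models = mutableListOf<String>()

    fun add(model: String) = synchronized(models) { models.add(model) }

    fun list(): List<String> = synchronized(models) { models.toList() }

    fun persist() {
        // Hold the lock only long enough to copy the in-memory state.
        val snapshot = synchronized(models) { models.toList() }
        // The slow preference write happens outside the lock.
        prefs["models"] = snapshot.joinToString(",")
    }

    fun load() {
        // The slow preference read happens outside the lock.
        val saved = prefs["models"]?.split(",")?.filter { it.isNotEmpty() } ?: emptyList()
        // Saved models go in front of anything added before loading finished.
        synchronized(models) { models.addAll(0, saved) }
    }
}

fun main() {
    val prefs = ConcurrentHashMap<String, String>()
    val first = SketchModelStore(prefs)
    first.add("a")
    first.add("b")
    first.persist()

    val second = SketchModelStore(prefs)
    second.add("c")   // added before the load completes
    second.load()
    println(second.list()) // [a, b, c]
}
```

A trade-off to keep in mind: with the write outside the lock, two concurrent `persist()` calls could store their snapshots out of order, so the real change may need a separate guard or a single writer thread.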
@jinliu9508 I believe this PR will break RecoverFromDroppedLoginBug.kt: when it calls OperationRepo.containsInstanceOf(), it assumes all the saved operations have already been loaded from disk. Can you address this in a follow-up PR?
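One possible shape for such a follow-up is to have the query wait until loading has finished. The sketch below is only an assumption about how that could look: `SketchRepo`, `Op`, `LoginOp`, and the `Class`-based signature are hypothetical and do not mirror the real `containsInstanceOf()`.

```kotlin
import java.util.concurrent.CountDownLatch
import kotlin.concurrent.thread

open class Op
class LoginOp : Op()

// Hypothetical repo; readSaved stands in for the disk read.
class SketchRepo(private val readSaved: () -> List<Op>) {
    private val queue = mutableListOf<Op>()
    private val loaded = CountDownLatch(1)

    // Start loading saved operations on a background thread.
    fun start() {
        thread(name = "op-load") {
            try {
                val saved = readSaved()
                synchronized(queue) { queue.addAll(0, saved) }
            } finally {
                loaded.countDown() // release waiters even if the read fails
            }
        }
    }

    // Blocks until loading has finished, so saved operations are visible to the check.
    fun <T : Op> containsInstanceOf(type: Class<T>): Boolean {
        loaded.await()
        return synchronized(queue) { queue.any { type.isInstance(it) } }
    }
}

fun main() {
    val repo = SketchRepo {
        Thread.sleep(50) // simulate a slow disk read
        listOf(LoginOp())
    }
    repo.start()
    println(repo.containsInstanceOf(LoginOp::class.java)) // true
}
```

Blocking like this is only safe off the main thread; if the caller can run on the main thread, a suspending or callback-based variant would be needed to avoid reintroducing the ANR.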