Corruption reported by sqlite on database without warning
App crashed and the reason was database corruption. Though rules sets with SharedPreference were not lost, everything else was. AppInfoRepository recovered by hydrating (via RefreshDatabase.kt) from PackageManager#getInstalledPackages, but the rest of the data was lost forever, with even default entries going empty; that is, after this happened, network logs, dns logs, all dns entires (dns-over-https, dns-proxy, dns-crypt) go empty, there's nothing to show in the UI... This was followed by inconsistency in what the connected-dns was (free.bravedns.com) and what was shown (DoH 6, the last connected DNS when the db was up and running). From the logs, there was no prior warning to be found of the database going kaput.
fdsan:
07-28 01:40:58.502 1332 7799 D OpQuickReply: setQuickReplyResumed focusedApp AppWindowToken{10c8546 token=Token{8557d21 ActivityRecord{ae84f88 u0 com.celzero.bravedns/.ui.FirewallActivity t78795}}} pkgName com.celzero.bravedns
07-28 01:40:58.532 1332 7799 E WindowManager: App trying to use insecure INPUT_FEATURE_NO_INPUT_CHANNEL flag. Ignoring
07-28 01:40:59.773 24028 24028 F DEBUG : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
07-28 01:40:59.773 24028 24028 F DEBUG : Build fingerprint: 'OnePlus/OnePlus6/OnePlus6:10/QKQ1.190716.003/2107132155:user/release-keys'
07-28 01:40:59.773 24028 24028 F DEBUG : Revision: '0'
07-28 01:40:59.773 24028 24028 F DEBUG : ABI: 'arm64'
07-28 01:40:59.773 24028 24028 F DEBUG : Timestamp: 2021-07-28 01:40:59+0530
07-28 01:40:59.773 24028 24028 F DEBUG : pid: 22304, tid: 22410, name: Thread-12 >>> com.celzero.bravedns <<<
07-28 01:40:59.774 24028 24028 F DEBUG : uid: 10494
07-28 01:40:59.774 24028 24028 F DEBUG : signal 35 (<debuggerd signal>), code -1 (SI_QUEUE), fault addr --------
07-28 01:40:59.774 24028 24028 F DEBUG : Abort message: 'attempted to close file descriptor 105, expected to be unowned, actually owned by sqlite 0x69'
07-28 01:40:59.774 24028 24028 F DEBUG : x0 0000000000000000 x1 000000000000578a x2 0000000000000023 x3 0000007089ac0640
07-28 01:40:59.774 24028 24028 F DEBUG : x4 0000000000000036 x5 0000000000000036 x6 0000000000000036 x7 0000000000000039
07-28 01:40:59.774 24028 24028 F DEBUG : x8 00000000000000f0 x9 31358c92a76b60d3 x10 0000000000000000 x11 00000071bd42bcbd
07-28 01:40:59.774 24028 24028 F DEBUG : x12 0000000000000039 x13 0000007089ac021f x14 0000007089ac0202 x15 0000000000000000
07-28 01:40:59.774 24028 24028 F DEBUG : x16 00000071bd5048c0 x17 00000071bd4e0900 x18 0000000000000000 x19 0000000000005720
07-28 01:40:59.774 24028 24028 F DEBUG : x20 000000000000578a x21 0000007089ac1020 x22 00000071c205f8f4 x23 0000000000000001
07-28 01:40:59.774 24028 24028 F DEBUG : x24 0000007089ac0430 x25 0000007089ac03b0 x26 0000007089ac0370 x27 00000071c0cb5d80
07-28 01:40:59.774 24028 24028 F DEBUG : x28 0000007089eecf30 x29 0000007089ac0720
07-28 01:40:59.774 24028 24028 F DEBUG : sp 0000007089ac02f0 lr 00000071bd4956d8 pc 00000071bd4956f8
07-28 01:40:59.780 24028 24028 F DEBUG :
07-28 01:40:59.780 24028 24028 F DEBUG : backtrace:
07-28 01:40:59.780 24028 24028 F DEBUG : NOTE: Function names and BuildId information is missing for some frames due
07-28 01:40:59.780 24028 24028 F DEBUG : NOTE: to unreadable libraries. For unwinds of apps, only shared libraries
07-28 01:40:59.780 24028 24028 F DEBUG : NOTE: found under the lib/ directory are readable.
07-28 01:40:59.780 24028 24028 F DEBUG : #00 pc 00000000000866f8 /apex/com.android.runtime/lib64/bionic/libc.so (fdsan_error(char const*, ...)+388) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.780 24028 24028 F DEBUG : #01 pc 0000000000086510 /apex/com.android.runtime/lib64/bionic/libc.so (android_fdsan_close_with_tag+728) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.780 24028 24028 F DEBUG : #02 pc 0000000000086c7c /apex/com.android.runtime/lib64/bionic/libc.so (close+16) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.780 24028 24028 F DEBUG : #03 pc 00000000002257b4 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG : #04 pc 000000000020d1f8 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG : #05 pc 00000000001bc740 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG : #06 pc 00000000001c4040 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG : #07 pc 0000000000204e48 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG : #08 pc 00000000002072a4 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG : #09 pc 0000000000217a80 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG : #10 pc 00000000001b0e48 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG : #11 pc 00000000000c9398 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG : #12 pc 00000000000ce9e8 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG : #13 pc 0000000000305034 /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG : #14 pc 00000000000e6890 /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+36) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.781 24028 24028 F DEBUG : #15 pc 0000000000084b6c /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
followed by:
07-28 01:41:24.367 22304 22365 E SQLiteLog: (26) statement aborts at 12: [select count(*) from BlockedConnections where uid = -1000] file is not a database
07-28 01:41:24.368 22304 22365 E SupportSQLite: Corruption reported by sqlite on database: /data/user/0/com.celzero.bravedns/databases/bravedns.db
07-28 01:41:24.368 22304 22365 W SupportSQLite: deleting the database file: /data/user/0/com.celzero.bravedns/databases/bravedns.db
07-28 01:41:24.372 22304 22365 E AndroidRuntime: FATAL EXCEPTION: arch_disk_io_2
07-28 01:41:24.372 22304 22365 E AndroidRuntime: Process: com.celzero.bravedns, PID: 22304
07-28 01:41:24.372 22304 22365 E AndroidRuntime: java.lang.RuntimeException: Exception while computing database live data.
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at androidx.room.RoomTrackingLiveData$1.run(RoomTrackingLiveData.java:92)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at java.lang.Thread.run(Thread.java:919)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: Caused by: android.database.sqlite.SQLiteDatabaseCorruptException: file is not a database (code 26 SQLITE_NOTADB)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteConnection.nativeExecuteForCursorWindow(Native Method)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteConnection.executeForCursorWindow(SQLiteConnection.java:942)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteSession.executeForCursorWindow(SQLiteSession.java:838)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:62)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:153)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:140)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.AbstractCursor.moveToPosition(AbstractCursor.java:232)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.AbstractCursor.moveToFirst(AbstractCursor.java:271)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at com.celzero.bravedns.database.BlockedConnectionsDAO_Impl$10.call(BlockedConnectionsDAO_Impl.java:663)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at com.celzero.bravedns.database.BlockedConnectionsDAO_Impl$10.call(BlockedConnectionsDAO_Impl.java:657)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at androidx.room.RoomTrackingLiveData$1.run(RoomTrackingLiveData.java:90)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--... 3 more
Handle such database corruption by:
- Detecting it and informing the user via a dialog / notification that the database is being rebuilt because all state has been lost.
- Hydrating the tables with seed data whenever they go empty.
- Prompt for bug reports. #336
Likely, we must move from using GlobalScope co-routines for table updates and use viewModelScope wherever appropriate. #243
Possibly, buffer writes (batch-write) into network-log and dns-log tables. #343
One of the possible case where the database corruption can happen is explained in the below link,
https://www.fatalerrors.org/a/crash-sqlitedatabase-corruption-exception-file-is-encrypted-or-is-not-a-database.html
Adding the setJournalMode(JournalMode.TRUNCATE) to the App database builder as suggested.
Android 9 introduces a special mode of SQLiteDatabase called "write ahead logging", which allows the database to use journal_mode=WAL, while retaining the behavior of creating at most one connection per database.
WAL changes the way the database works and throws this exception in some states. You need to manually turn off the WAL mode according to the above solution to fix this problem.
Journal mode has been changed back to WAL (since the suggestion linked to from above isn't really based on any concrete facts or anything) since readers on the underlying sqlite3 tables has increased.
Besides, there's been a bunch of other improvements in batching writes (#343) and splitting away the write-heavy logs-related tables (ie, network log & dns log tables) in to its own database (#574), which should improve and isolate corruptions a bit.