rethink-app icon indicating copy to clipboard operation
rethink-app copied to clipboard

Corruption reported by sqlite on database without warning

Open ignoramous opened this issue 4 years ago • 1 comments

App crashed and the reason was database corruption. Though rules sets with SharedPreference were not lost, everything else was. AppInfoRepository recovered by hydrating (via RefreshDatabase.kt) from PackageManager#getInstalledPackages, but the rest of the data was lost forever, with even default entries going empty; that is, after this happened, network logs, dns logs, all dns entires (dns-over-https, dns-proxy, dns-crypt) go empty, there's nothing to show in the UI... This was followed by inconsistency in what the connected-dns was (free.bravedns.com) and what was shown (DoH 6, the last connected DNS when the db was up and running). From the logs, there was no prior warning to be found of the database going kaput.

fdsan:

07-28 01:40:58.502  1332  7799 D OpQuickReply: setQuickReplyResumed focusedApp AppWindowToken{10c8546 token=Token{8557d21 ActivityRecord{ae84f88 u0 com.celzero.bravedns/.ui.FirewallActivity t78795}}} pkgName com.celzero.bravedns
07-28 01:40:58.532  1332  7799 E WindowManager: App trying to use insecure INPUT_FEATURE_NO_INPUT_CHANNEL flag. Ignoring
07-28 01:40:59.773 24028 24028 F DEBUG   : *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
07-28 01:40:59.773 24028 24028 F DEBUG   : Build fingerprint: 'OnePlus/OnePlus6/OnePlus6:10/QKQ1.190716.003/2107132155:user/release-keys'
07-28 01:40:59.773 24028 24028 F DEBUG   : Revision: '0'                        
07-28 01:40:59.773 24028 24028 F DEBUG   : ABI: 'arm64'                         
07-28 01:40:59.773 24028 24028 F DEBUG   : Timestamp: 2021-07-28 01:40:59+0530  
07-28 01:40:59.773 24028 24028 F DEBUG   : pid: 22304, tid: 22410, name: Thread-12  >>> com.celzero.bravedns <<<
07-28 01:40:59.774 24028 24028 F DEBUG   : uid: 10494                           
07-28 01:40:59.774 24028 24028 F DEBUG   : signal 35 (<debuggerd signal>), code -1 (SI_QUEUE), fault addr --------
07-28 01:40:59.774 24028 24028 F DEBUG   : Abort message: 'attempted to close file descriptor 105, expected to be unowned, actually owned by sqlite 0x69'
07-28 01:40:59.774 24028 24028 F DEBUG   :     x0  0000000000000000  x1  000000000000578a  x2  0000000000000023  x3  0000007089ac0640
07-28 01:40:59.774 24028 24028 F DEBUG   :     x4  0000000000000036  x5  0000000000000036  x6  0000000000000036  x7  0000000000000039
07-28 01:40:59.774 24028 24028 F DEBUG   :     x8  00000000000000f0  x9  31358c92a76b60d3  x10 0000000000000000  x11 00000071bd42bcbd
07-28 01:40:59.774 24028 24028 F DEBUG   :     x12 0000000000000039  x13 0000007089ac021f  x14 0000007089ac0202  x15 0000000000000000
07-28 01:40:59.774 24028 24028 F DEBUG   :     x16 00000071bd5048c0  x17 00000071bd4e0900  x18 0000000000000000  x19 0000000000005720
07-28 01:40:59.774 24028 24028 F DEBUG   :     x20 000000000000578a  x21 0000007089ac1020  x22 00000071c205f8f4  x23 0000000000000001
07-28 01:40:59.774 24028 24028 F DEBUG   :     x24 0000007089ac0430  x25 0000007089ac03b0  x26 0000007089ac0370  x27 00000071c0cb5d80
07-28 01:40:59.774 24028 24028 F DEBUG   :     x28 0000007089eecf30  x29 0000007089ac0720
07-28 01:40:59.774 24028 24028 F DEBUG   :     sp  0000007089ac02f0  lr  00000071bd4956d8  pc  00000071bd4956f8
07-28 01:40:59.780 24028 24028 F DEBUG   :                                      
07-28 01:40:59.780 24028 24028 F DEBUG   : backtrace:                           
07-28 01:40:59.780 24028 24028 F DEBUG   :     NOTE: Function names and BuildId information is missing for some frames due
07-28 01:40:59.780 24028 24028 F DEBUG   :     NOTE: to unreadable libraries. For unwinds of apps, only shared libraries
07-28 01:40:59.780 24028 24028 F DEBUG   :     NOTE: found under the lib/ directory are readable.
07-28 01:40:59.780 24028 24028 F DEBUG   :       #00 pc 00000000000866f8  /apex/com.android.runtime/lib64/bionic/libc.so (fdsan_error(char const*, ...)+388) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.780 24028 24028 F DEBUG   :       #01 pc 0000000000086510  /apex/com.android.runtime/lib64/bionic/libc.so (android_fdsan_close_with_tag+728) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.780 24028 24028 F DEBUG   :       #02 pc 0000000000086c7c  /apex/com.android.runtime/lib64/bionic/libc.so (close+16) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.780 24028 24028 F DEBUG   :       #03 pc 00000000002257b4  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG   :       #04 pc 000000000020d1f8  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG   :       #05 pc 00000000001bc740  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG   :       #06 pc 00000000001c4040  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG   :       #07 pc 0000000000204e48  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG   :       #08 pc 00000000002072a4  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG   :       #09 pc 0000000000217a80  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.780 24028 24028 F DEBUG   :       #10 pc 00000000001b0e48  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG   :       #11 pc 00000000000c9398  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG   :       #12 pc 00000000000ce9e8  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG   :       #13 pc 0000000000305034  /data/data/com.celzero.bravedns/code_cache/libjvmtiagent_arm64.so
07-28 01:40:59.781 24028 24028 F DEBUG   :       #14 pc 00000000000e6890  /apex/com.android.runtime/lib64/bionic/libc.so (__pthread_start(void*)+36) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)
07-28 01:40:59.781 24028 24028 F DEBUG   :       #15 pc 0000000000084b6c  /apex/com.android.runtime/lib64/bionic/libc.so (__start_thread+64) (BuildId: a6a4a6a4e20240bbe3173fe560b161af)

followed by:

07-28 01:41:24.367 22304 22365 E SQLiteLog: (26) statement aborts at 12: [select count(*) from BlockedConnections where uid = -1000] file is not a database
07-28 01:41:24.368 22304 22365 E SupportSQLite: Corruption reported by sqlite on database: /data/user/0/com.celzero.bravedns/databases/bravedns.db
07-28 01:41:24.368 22304 22365 W SupportSQLite: deleting the database file: /data/user/0/com.celzero.bravedns/databases/bravedns.db
07-28 01:41:24.372 22304 22365 E AndroidRuntime: FATAL EXCEPTION: arch_disk_io_2
07-28 01:41:24.372 22304 22365 E AndroidRuntime: Process: com.celzero.bravedns, PID: 22304
07-28 01:41:24.372 22304 22365 E AndroidRuntime: java.lang.RuntimeException: Exception while computing database live data.
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at androidx.room.RoomTrackingLiveData$1.run(RoomTrackingLiveData.java:92)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at java.lang.Thread.run(Thread.java:919)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: Caused by: android.database.sqlite.SQLiteDatabaseCorruptException: file is not a database (code 26 SQLITE_NOTADB)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteConnection.nativeExecuteForCursorWindow(Native Method)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteConnection.executeForCursorWindow(SQLiteConnection.java:942)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteSession.executeForCursorWindow(SQLiteSession.java:838)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteQuery.fillWindow(SQLiteQuery.java:62)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteCursor.fillWindow(SQLiteCursor.java:153)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.sqlite.SQLiteCursor.getCount(SQLiteCursor.java:140)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.AbstractCursor.moveToPosition(AbstractCursor.java:232)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at android.database.AbstractCursor.moveToFirst(AbstractCursor.java:271)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at com.celzero.bravedns.database.BlockedConnectionsDAO_Impl$10.call(BlockedConnectionsDAO_Impl.java:663)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at com.celzero.bravedns.database.BlockedConnectionsDAO_Impl$10.call(BlockedConnectionsDAO_Impl.java:657)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--at androidx.room.RoomTrackingLiveData$1.run(RoomTrackingLiveData.java:90)
07-28 01:41:24.372 22304 22365 E AndroidRuntime: >--... 3 more      

Handle such database corruption by:

  1. Detecting it and informing the user via a dialog / notification that the database is being rebuilt because all state has been lost.
  2. Hydrating the tables with seed data whenever they go empty.
  3. Prompt for bug reports. #336

Likely, we must move from using GlobalScope co-routines for table updates and use viewModelScope wherever appropriate. #243

Possibly, buffer writes (batch-write) into network-log and dns-log tables. #343

ignoramous avatar Jul 27 '21 21:07 ignoramous

One of the possible case where the database corruption can happen is explained in the below link,

https://www.fatalerrors.org/a/crash-sqlitedatabase-corruption-exception-file-is-encrypted-or-is-not-a-database.html

Adding the setJournalMode(JournalMode.TRUNCATE) to the App database builder as suggested.

Android 9 introduces a special mode of SQLiteDatabase called "write ahead logging", which allows the database to use journal_mode=WAL, while retaining the behavior of creating at most one connection per database.

WAL changes the way the database works and throws this exception in some states. You need to manually turn off the WAL mode according to the above solution to fix this problem.

hussainmohd-a avatar Jul 28 '21 20:07 hussainmohd-a

Journal mode has been changed back to WAL (since the suggestion linked to from above isn't really based on any concrete facts or anything) since readers on the underlying sqlite3 tables has increased.

Besides, there's been a bunch of other improvements in batching writes (#343) and splitting away the write-heavy logs-related tables (ie, network log & dns log tables) in to its own database (#574), which should improve and isolate corruptions a bit.

ignoramous avatar Jan 12 '23 15:01 ignoramous