mail icon indicating copy to clipboard operation
mail copied to clipboard

Crash when conversation is marked as read

Open danirabbit opened this issue 3 years ago • 5 comments

What Happened?

I get a consistent crash whenever a conversation is marked as read via the timeout. First the conversation list is filled with junk and then I get a crash in ConversationItemModel trying to get MessageInfo flags

In OS 6.1, I can only reproduce the crash with commits after https://github.com/elementary/mail/pull/770 was merged

But in OS 7, I can reproduce it even with version 6.4.0

Steps to Reproduce

  1. Select an unread conversation
  2. Wait for it to change to read via the timeout
  3. Observe the list fill with junk
  4. Crash

Expected Behavior

No crash :)

OS Version

7.x (Early Access)

Software Version

Compiled from git

Log Output

Thread 1 "io.elementary.m" received signal SIGSEGV, Segmentation fault.
0x00007ffff7e436e9 in g_type_check_instance_is_fundamentally_a () from /lib/x86_64-linux-gnu/libgobject-2.0.so.0
(gdb) bt
#0  0x00007ffff7e436e9 in g_type_check_instance_is_fundamentally_a () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#1  0x00007ffff7e31a70 in g_object_get () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#2  0x00005555555a932b in mail_conversation_item_model_get_deleted (self=0x555555e72700) at ../src/ConversationList/ConversationItemModel.vala:186
#3  0x00005555555af068 in mail_conversation_list_box_filter_function (obj=0x555555e72700) at ../src/ConversationList/ConversationListBox.vala:349
#4  0x00005555555b22cf in _mail_conversation_list_box_filter_function_mail_conversation_list_store_row_visibility_func (row=0x555555e72700, self=0x0)
    at /home/dani/Projects/mail/build/ConversationListBox.c:5067
#5  0x00005555555b57a0 in mail_conversation_list_store_get_item_internal (self=0x555555955410, index=38, unfiltered=0) at ../src/ConversationList/ConversationListStore.vala:70
#6  0x00005555555b5432 in mail_conversation_list_store_real_get_item (base=0x555555955410, index=38) at ../src/ConversationList/ConversationListStore.vala:37
#7  0x00005555555d3864 in virtualizing_list_box_model_get_item (self=0x555555955410, index=38) at ../src/VirtualizingListBox/VirtualizingListBoxModel.vala:31
#8  0x00007ffff7232573 in g_list_model_get_object () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#9  0x00005555555d5d34 in virtualizing_list_box_get_widget (self=0x55555597a220, index=38) at ../src/VirtualizingListBox/VirtualizingListBox.vala:208
#10 0x00005555555d76fd in virtualizing_list_box_insert_bottom_widgets (self=0x55555597a220, bin_height=0x7fffffffcc34) at ../src/VirtualizingListBox/VirtualizingListBox.vala:427
#11 0x00005555555d7eb5 in virtualizing_list_box_ensure_visible_widgets (self=0x55555597a220, model_changed=1) at ../src/VirtualizingListBox/VirtualizingListBox.vala:478
#12 0x00005555555d612a in virtualizing_list_box_on_items_changed (self=0x55555597a220, position=0, removed=0, added=121) at ../src/VirtualizingListBox/VirtualizingListBox.vala:246
#13 0x00005555555db860 in _virtualizing_list_box_on_items_changed_g_list_model_items_changed (_sender=0x555555955410, position=0, removed=0, added=121, self=0x55555597a220)
    at ../src/VirtualizingListBox/VirtualizingListBox.vala:39
#14 0x00007ffff7e1ed2f in g_closure_invoke () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#15 0x00007ffff7e3ab76 in  () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#16 0x00007ffff7e3c554 in g_signal_emit_valist () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#17 0x00007ffff7e3c7a3 in g_signal_emit () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#18 0x00005555555ad5e4 in mail_conversation_list_box_folder_changed
    (self=0x55555597a220, change_info=0x555555fd5a90, service_uid=0x55555590e3c0 "a8424df2d19833d62ae360047bd50a58cd7e6c79", cancellable=0x555555e493a0)
    at ../src/ConversationList/ConversationListBox.vala:284
#19 0x00005555555aa6ea in _________lambda24_ (_data20_=0x555555802320, change_info=0x555555fd5a90) at ../src/ConversationList/ConversationListBox.vala:192
#20 0x00005555555aa717 in __________lambda24__camel_folder_changed (_sender=0x555555a16970, changes=0x555555fd5a90, self=0x555555802320) at ../src/ConversationList/ConversationListBox.vala:192
#21 0x00007ffff7e1ed2f in g_closure_invoke () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#22 0x00007ffff7e3ab76 in  () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#23 0x00007ffff7e3c554 in g_signal_emit_valist () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#24 0x00007ffff7e3c7a3 in g_signal_emit () at /lib/x86_64-linux-gnu/libgobject-2.0.so.0
#25 0x00007ffff6fc5b5d in  () at /lib/x86_64-linux-gnu/libcamel-1.2.so.63
#26 0x00007ffff7ebfc24 in g_main_context_dispatch () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#27 0x00007ffff7f146f8 in  () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#28 0x00007ffff7ebd3c3 in g_main_context_iteration () at /lib/x86_64-linux-gnu/libglib-2.0.so.0
#29 0x00007ffff724fc85 in g_application_run () at /lib/x86_64-linux-gnu/libgio-2.0.so.0
#30 0x0000555555574c4d in _vala_main (args=0x7fffffffdd68, args_length1=1) at ../src/Application.vala:150
#31 0x0000555555574c94 in main (argc=1, argv=0x7fffffffdd68) at ../src/Application.vala:148
(gdb)

Hardware Info

No response

danirabbit avatar May 24 '22 04:05 danirabbit

So I built EDS (3.44.2) and elementary Mail (7.0.0) with maximum debug symbols and no optimizations and ran the result through Valgrind (ASan didn't want to work, probably because I didn't instrument glib as the rebuild would've taken hours). I saw some initialization-related memory unsafety, but that didn't cause any visible issues. Might file a separate issue for that.

I saw two main issues related to this crash. First this use-after-free:

==387316== Invalid read of size 8
==387316==    at 0x439711: mail_conversation_item_model_get_flagged (ConversationItemModel.vala:131)
==387316==    by 0x43F0E7: mail_conversation_list_item_assign (ConversationListItem.vala:134)
==387316==    by 0x43AFA8: _mail_conversation_list_box___lambda32_ (ConversationListBox.vala:65)
==387316==    by 0x43AFA8: __mail_conversation_list_box___lambda32__virtualizing_list_box_row_factory_method (ConversationListBox.c:5130)
==387316==    by 0x45161A: virtualizing_list_box_get_widget (VirtualizingListBox.vala:219)
==387316==    by 0x452605: virtualizing_list_box_insert_bottom_widgets (VirtualizingListBox.vala:427)
==387316==    by 0x452605: virtualizing_list_box_ensure_visible_widgets (VirtualizingListBox.vala:478)
==387316==    by 0x453099: virtualizing_list_box_on_items_changed (VirtualizingListBox.vala:246)
==387316==    by 0x453099: _virtualizing_list_box_on_items_changed_g_list_model_items_changed (VirtualizingListBox.vala:39)
==387316==    by 0x49A361E: g_closure_invoke (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49B5636: signal_emit_unlocked_R (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BBE70: g_signal_emit_valist (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BC41E: g_signal_emit (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x43C628: mail_conversation_list_box_folder_changed (ConversationListBox.vala:284)
==387316==    by 0x43C628: _________lambda24_ (ConversationListBox.vala:192)
==387316==    by 0x43C628: __________lambda24__camel_folder_changed (ConversationListBox.vala:192)
==387316==    by 0x49A361E: g_closure_invoke (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==  Address 0x2457c4b8 is 1,464 bytes inside a block of size 1,536 free'd
==387316==    at 0x4843F6C: free (in /nix/store/3h5r17k60dr9236sjap7cmd7z3rjjaqg-valgrind-3.19.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387316==    by 0x578BE31: camel_memchunk_destroy (camel-memchunk.c:347)
==387316==    by 0x5775147: camel_folder_thread_messages_apply (camel-folder-thread.c:816)
==387316==    by 0x43C4FA: mail_conversation_list_box_folder_changed (ConversationListBox.vala:251)
==387316==    by 0x43C4FA: _________lambda24_ (ConversationListBox.vala:192)
==387316==    by 0x43C4FA: __________lambda24__camel_folder_changed (ConversationListBox.vala:192)
==387316==    by 0x49A361E: g_closure_invoke (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49B5636: signal_emit_unlocked_R (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BBE70: g_signal_emit_valist (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BC41E: g_signal_emit (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x5779A48: folder_emit_changed_cb (camel-folder.c:311)
==387316==    by 0x48A9D03: g_main_context_dispatch (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48AA0A7: g_main_context_iterate.constprop.0 (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48AA15E: g_main_context_iteration (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==  Block was alloc'd at
==387316==    at 0x484179B: malloc (in /nix/store/3h5r17k60dr9236sjap7cmd7z3rjjaqg-valgrind-3.19.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387316==    by 0x48AFFB8: g_malloc (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x578BB7A: camel_memchunk_alloc (camel-memchunk.c:133)
==387316==    by 0x578BB7A: camel_memchunk_alloc (camel-memchunk.c:116)
==387316==    by 0x578BBB8: camel_memchunk_alloc0 (camel-memchunk.c:160)
==387316==    by 0x5774981: thread_summary (camel-folder-thread.c:564)
==387316==    by 0x5775023: camel_folder_thread_messages_new (camel-folder-thread.c:752)
==387316==    by 0x43DB2D: mail_conversation_list_box_load_folder_co.isra.0 (ConversationListBox.vala:196)
==387316==    by 0x552B792: g_task_return_now (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgio-2.0.so.0.7200.1)
==387316==    by 0x552B7C8: complete_in_idle_cb (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgio-2.0.so.0.7200.1)
==387316==    by 0x48A9D03: g_main_context_dispatch (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48AA0A7: g_main_context_iterate.constprop.0 (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48AA15E: g_main_context_iteration (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)

This happened with slightly varying stack traces for occurrence, but exactly the same one for both allocation and free.

Then this use-after-free which immediately preceded the crash:

==387316== Invalid read of size 8
==387316==    at 0x578D217: camel_message_info_get_uid (camel-message-info.c:1712)
==387316==    by 0x43C59F: mail_conversation_list_box_folder_changed (ConversationListBox.vala:269)
==387316==    by 0x43C59F: _________lambda24_ (ConversationListBox.vala:192)
==387316==    by 0x43C59F: __________lambda24__camel_folder_changed (ConversationListBox.vala:192)
==387316==    by 0x49A361E: g_closure_invoke (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49B5636: signal_emit_unlocked_R (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BBE70: g_signal_emit_valist (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BC41E: g_signal_emit (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x5779A48: folder_emit_changed_cb (camel-folder.c:311)
==387316==    by 0x48A9D03: g_main_context_dispatch (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48AA0A7: g_main_context_iterate.constprop.0 (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48AA15E: g_main_context_iteration (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x555958C: g_application_run (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgio-2.0.so.0.7200.1)
==387316==    by 0x41D960: _vala_main (Application.vala:150)
==387316==    by 0x41D960: main (Application.vala:148)
==387316==  Address 0x20d48610 is 240 bytes inside a block of size 296 free'd
==387316==    at 0x4843F6C: free (in /nix/store/3h5r17k60dr9236sjap7cmd7z3rjjaqg-valgrind-3.19.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387316==    by 0x49C45B5: g_type_free_instance (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x48766CF: ptr_array_free (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x577517B: camel_folder_thread_messages_apply (camel-folder-thread.c:820)
==387316==    by 0x43C4FA: mail_conversation_list_box_folder_changed (ConversationListBox.vala:251)
==387316==    by 0x43C4FA: _________lambda24_ (ConversationListBox.vala:192)
==387316==    by 0x43C4FA: __________lambda24__camel_folder_changed (ConversationListBox.vala:192)
==387316==    by 0x49A361E: g_closure_invoke (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49B5636: signal_emit_unlocked_R (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BBE70: g_signal_emit_valist (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49BC41E: g_signal_emit (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x5779A48: folder_emit_changed_cb (camel-folder.c:311)
==387316==    by 0x48A9D03: g_main_context_dispatch (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48AA0A7: g_main_context_iterate.constprop.0 (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==  Block was alloc'd at
==387316==    at 0x484179B: malloc (in /nix/store/3h5r17k60dr9236sjap7cmd7z3rjjaqg-valgrind-3.19.0/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==387316==    by 0x48AFFB8: g_malloc (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48C8440: g_slice_alloc (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x48C8AB9: g_slice_alloc0 (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libglib-2.0.so.0.7200.1)
==387316==    by 0x49C4224: g_type_create_instance (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49A8DFC: g_object_new_internal (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49AA817: g_object_new_valist (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x49AAD48: g_object_new (in /nix/store/ckj0g3ycva2an405zf4mh4bvqhj44zr5-glib-2.72.1/lib/libgobject-2.0.so.0.7200.1)
==387316==    by 0x5773194: camel_read_mir_callback (camel-folder-summary.c:1984)
==387316==    by 0xAB69211: sqlite3_exec (in /nix/store/87l7isafm1izrgc398lzrl2sizp1v92v-sqlite-3.38.5/lib/libsqlite3.so.0.8.6)
==387316==    by 0x575C4C7: cdb_sql_exec (camel-db.c:567)
==387316==    by 0x575E754: camel_db_select (camel-db.c:1470)

After this one it immediately crashed because these 8 bytes of freed memory were used as a pointer and pointed into garbage.

Both of those use-after-frees seem to be caused by the same issue: they access memory freed by camel_folder_thread_messages_apply. Looking at that function it frees and recreates a lot of internal data structures. So something is holding onto the old references to these. I'm sadly not very familiar with the GLib/Vala memory model (beyond knowing that Vala is reference-counted) so I'm not sure where to go from here.

lorenz avatar Jun 10 '22 12:06 lorenz

I spent the whole evening debugging this and I was not able to pin things down.

It feels like there are multiple issues at play here: First and foremost the implementation of VirtualizingListBox does not seem to handle async data updates well. In addition, we seem to recycle ConversationListBoxRow's (by assigning new data) which are not meant to be recycled. This is probably related to a wrong row-selection in the list filter mechanism (which explains why we had to temporarily disable it in the past).

Since eOSv7 is (more or less) around the corner, I think it makes way more sense to focus on porting Mail to Gtk4 and retire VirtualizingListBox in favour of Gtk.ListView.

FWIW: I also found that crashes seem to happen way less often in folders which do not contain a lot of messages. This probably explains why I don't experience a lot of crashes on my end: I'm archiving all messages from the inbox once I'm done processing them.

marbetschar avatar Jun 13 '22 20:06 marbetschar

I can reliably reproduce it with a single message on a test account at least with optimizations turned off. It might very well behave in a probabilistic fashion if used with optimizations as it's a memory safety issue. What would be the time horizon to port to GTK4? It's quite annoying so I'd still put some value on fixing it if it takes more than a few weeks.

lorenz avatar Jun 13 '22 20:06 lorenz

Its reproduceable on my end as well. However, we already spent a lot of time debugging and fixing similar crashes - most of them were architectural problems somehow related to VirtualizingListBox.

I agree that its very annoying, but I do have limited time right now and even though porting to Gtk4 will take some time its a better use of resources in my opinion.

However, if you have some time to digg into this I'm happy to help wherever I can!

marbetschar avatar Jun 13 '22 21:06 marbetschar

So I spent some more time looking at this and I think this is not actually directly related to VirtualizingListBox. I disabled all widget recycling in there and this changed nothing. I looked into ConversationListBox and there are some rather big problems in there where a whole bunch of references are kept around, for example load_folder -> add_conversation_item keeps references to absolutely everything in the conversations property, all of which gets invalidated by the next folder_change call as that calls apply on the thread, freeing all the old references as they are passed through an unowned variable.

lorenz avatar Jul 11 '22 19:07 lorenz

This was fixed in #825

danirabbit avatar May 17 '23 16:05 danirabbit