Ticket #1123 (closed defect: fixed)

Opened 4 years ago

Last modified 3 years ago

Sugar crashing when saving a journal entry upon closing an activity

Reported by: sdz Owned by: alsroot
Priority: Unspecified by Maintainer Milestone: 0.86
Component: sugar Version: 0.85.x
Severity: Blocker Keywords: r!
Cc: sdz, sascha_silbe, alsroot, benzea Distribution/OS: Unspecified
Bug Status: New

Description

In the most recent SoaS snapshot, X gets apparently killed each time one saves a Journal entry. At least, the user is logged out and logged in again. I haven't seen this in 0.85.2, so this might have been introduced with 0.85.3. But I'm just guessing here, it could also be caused by the new display manager in SoaS... basically putting it here as a self-reminder to investigate.

Attachments

sugar-1123.patch Download (2.3 KB) - added by alsroot 4 years ago.
moving child_watch_add's callback to standalone method fixes this issue(looks like ActivityCreationHandler object is died in place of invoking child_watch_add's callback)
wm-test.patch Download (1.8 KB) - added by benzea 4 years ago.
Patch that adds error traps to wm.py
wm.patch Download (2.4 KB) - added by benzea 4 years ago.
ok, newer better

Change History

  Changed 4 years ago by sascha_silbe

  • cc sascha_silbe added

  Changed 4 years ago by sascha_silbe

  • status changed from new to assigned
  • severity changed from Unspecified to Blocker
  • cc sdz added; tomeu removed
  • owner changed from sdz to tomeu
  • distribution changed from SoaS to Unspecified
  • status_field changed from Unconfirmed to New

Happened to me today in sugar-jhbuild on DebXO:

==> shell.log <==
1249389164.020492 DEBUG root: __reply_handler_cb
The program 'sugar-session' received an X Window System error.
This probably reflects a bug in the program.
The error was 'BadWindow (invalid Window parameter)'.
  (Details: serial 12424 error_code 3 request_code 20 minor_code 0)
  (Note to programmers: normally, X errors are reported asynchronously;
   that is, you will receive the error a while after causing it.
   To debug your program, run it with the --sync command line
   option to change this behavior. You can then get a meaningful
   backtrace from your debugger if you break on the gdk_x_error() function.)

The first try (open Calculate, change title, close) worked fine, on the second I got the above crash.

The last entries from datastore.log were:

==> datastore.log <==
1249389163.991101 DEBUG root: datastore.update dbus.String(u'242fcbda-1d6c-4d98-b8f2-6b0a4bd3abe3')
1249389163.995086 DEBUG root: IndexStore.flush: False 5
1249389163.998238 DEBUG root: FileStore moving from dbus.String(u'/home/sascha.silbe/.sugar/default/org.laptop.Calculate/instance/1249389163') to u'/home/sascha.silbe/.sugar/default/datastore/24/242fcbda-1d6c-4d98-b8f2-6b0a4bd3abe3/data'
1249389164.002992 DEBUG org.laptop.sugar.DataStore: _update_completion_cb() called with <function <lambda> at 0x9a4133c> / <function <lambda> at 0x9a41374>, exc None
1249389164.007408 DEBUG root: optimize u'/home/sascha.silbe/.sugar/default/datastore/checksums/queue/242fcbda-1d6c-4d98-b8f2-6b0a4bd3abe3'
1249389164.007943 DEBUG org.laptop.sugar.DataStore: updated 242fcbda-1d6c-4d98-b8f2-6b0a4bd3abe3
1249389164.009260 DEBUG root: datastore.get_properties dbus.String(u'242fcbda-1d6c-4d98-b8f2-6b0a4bd3abe3')
1249389164.013969 DEBUG root: datastore.find dbus.Dictionary({dbus.String(u'limit'): dbus.Int32(5, variant_level=1), dbus.String(u'order_by'): dbus.Array([dbus.String(u'+timestamp')], signature=dbus.Signature('s'), variant_level=1), dbus.String(u'activity'): dbus.String(u'org.laptop.Calculate', variant_level=1)}, signature=dbus.Signature('sv'))
1249389164.014454 DEBUG root: _parse_query dbus.Dictionary({dbus.String(u'activity'): dbus.String(u'org.laptop.Calculate', variant_level=1)}, signature=dbus.Signature('sv'))
1249389164.016082 DEBUG org.laptop.sugar.DataStore: find(): 0.0016891956329345703
1249389164.017376 DEBUG root: datastore.get_properties dbus.String(u'242fcbda-1d6c-4d98-b8f2-6b0a4bd3abe3')
1249389164.021708 DEBUG root: _process_entry_cb processing '242fcbda-1d6c-4d98-b8f2-6b0a4bd3abe3'
1249389224.052541 DEBUG root: IndexStore.flush: True 6

sugar and sugar-toolkit were pristine, sugar-datastore had my WIP prefix support changes (#1090).

follow-up: ↓ 6   Changed 4 years ago by tomeu

I suspect we are trying to get a preview on a window that has already been destroyed.

  Changed 4 years ago by sascha_silbe

  • summary changed from Sugar kills X every time when saving a Journal entry to Sugar crashing when saving a journal entry upon closing an activity

I suspect we are trying to get a preview on a window that has already been destroyed.

Plausible theory, given that it also crashed for me when closing the object picker after encountering #1192.

  Changed 4 years ago by alsroot

  • cc alsroot added

in reply to: ↑ 3   Changed 4 years ago by alsroot

Replying to tomeu:

I suspect we are trying to get a preview on a window that has already been destroyed.

looks like nope,
I've commented get_preview() invoking in sugar.activity.activity.Activity.save() and got the same problem

#0  0x00007f8a76e659ca in PyErr_Fetch () from /usr/lib/libpython2.6.so.1.0
(gdb) bt
#0  0x00007f8a76e659ca in PyErr_Fetch () from /usr/lib/libpython2.6.so.1.0
#1  0x00007f8a6c00219c in dbus_py_variant_level_clear () from /usr/lib64/python2.6/site-packages/_dbus_bindings.so
#2  0x00007f8a6c0021f9 in DBusPyStrBase_tp_dealloc () from /usr/lib64/python2.6/site-packages/_dbus_bindings.so
#3  0x00007f8a76dfe6eb in dict_dealloc () from /usr/lib/libpython2.6.so.1.0
#4  0x00007f8a76e1f5b3 in subtype_dealloc () from /usr/lib/libpython2.6.so.1.0
#5  0x00007f8a76dfe6eb in dict_dealloc () from /usr/lib/libpython2.6.so.1.0
#6  0x00007f8a76e1f5b3 in subtype_dealloc () from /usr/lib/libpython2.6.so.1.0
#7  0x00007f8a76dfe6eb in dict_dealloc () from /usr/lib/libpython2.6.so.1.0
#8  0x00007f8a7326c07e in pygobject_dealloc () from /usr/lib64/python2.6/site-packages/gtk-2.0/gobject/_gobject.so
#9  0x00007f8a76e1f516 in subtype_dealloc () from /usr/lib/libpython2.6.so.1.0
#10 0x00007f8a76dd61b3 in instancemethod_dealloc () from /usr/lib/libpython2.6.so.1.0
#11 0x00007f8a73686f82 in child_watch_dnotify () from /usr/lib64/python2.6/site-packages/gtk-2.0/glib/_glib.so
#12 0x00007f8a747f843d in g_source_callback_unref () from /usr/lib/libglib-2.0.so.0
#13 0x00007f8a747f8a25 in g_source_destroy_internal () from /usr/lib/libglib-2.0.so.0
#14 0x00007f8a747f8ea8 in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#15 0x00007f8a747fc5a8 in g_main_context_iterate () from /usr/lib/libglib-2.0.so.0
#16 0x00007f8a747fca9d in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#17 0x00007f8a72924937 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#18 0x00007f8a72f6ce22 in _wrap_gtk_main () from /usr/lib64/python2.6/site-packages/gtk-2.0/gtk/_gtk.so
#19 0x00007f8a76e5b29d in PyEval_EvalFrameEx () from /usr/lib/libpython2.6.so.1.0
#20 0x00007f8a76e5b85d in PyEval_EvalFrameEx () from /usr/lib/libpython2.6.so.1.0
#21 0x00007f8a76e5cbc1 in PyEval_EvalCodeEx () from /usr/lib/libpython2.6.so.1.0
#22 0x00007f8a76e5ccd2 in PyEval_EvalCode () from /usr/lib/libpython2.6.so.1.0
#23 0x00007f8a76e777c1 in run_mod () from /usr/lib/libpython2.6.so.1.0
#24 0x00007f8a76e77896 in PyRun_FileExFlags () from /usr/lib/libpython2.6.so.1.0
#25 0x00007f8a76e78ddd in PyRun_SimpleFileExFlags () from /usr/lib/libpython2.6.so.1.0
#26 0x00007f8a76e8534b in Py_Main () from /usr/lib/libpython2.6.so.1.0
#27 0x00007f8a7617da26 in __libc_start_main () from /lib/libc.so.6
#28 0x00000000004006e9 in _start ()

  Changed 4 years ago by alsroot

look like registering wait-for-pid thread triggering this crash

 http://git.sugarlabs.org/projects/sugar-toolkit/repos/mainline/blobs/master/src/sugar/activity/activityfactory.py#line278

I've commented gobject.child_watch_add and it works w/o crashes

Changed 4 years ago by alsroot

moving child_watch_add's callback to standalone method fixes this issue(looks like ActivityCreationHandler object is died in place of invoking child_watch_add's callback)

  Changed 4 years ago by alsroot

  • keywords r? added

  Changed 4 years ago by tomeu

  • keywords r+ added; r? removed

  Changed 4 years ago by tomeu

  • owner changed from tomeu to alsroot

Please push it, though would be good to have a HACK comment referring to the upstream ticket, so we remove it when it gets fixed.

  Changed 4 years ago by alsroot

  • status changed from assigned to closed
  • resolution set to fixed

  Changed 4 years ago by benzea

  • status changed from closed to reopened
  • resolution fixed deleted

Reopening for now.

I have tried to reproduce a couple of time today (I removed the above patch for that, but don't think that is really relevant). So, I did get a backtrace with an X crash in the end which looks something like the following (sorry, no proper python bt):

-> X error here
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/sugar/wm.py (37): get_bundle_id
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/model/shell.py (196): get_type
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/model/shell.py (200): is_journal
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/view/buddymenu.py (125): _update_invite_menu
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/view/buddymenu.py (79): _add_buddy_items
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/view/buddymenu.py (51): __init__
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/view/buddyicon.py (33): __init__
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/desktop/meshbox.py (810): _add_alone_buddy
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/desktop/meshbox.py (791): _buddy_added_cb
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/jarabe/model/neighborhood.py (218): _buddy_appeared_cb
/home/benjamin/Projects/sugar-jhbuild/install/lib/python2.5/site-packages/sugar/presence/presenceservice.py (204): _emit_buddy_appeared_signal
-> idle callback handler here

So, what seems to happen is that sugar tries to query the activity boundle of a destroyed X window here. It looks like a very hard to hit race condition here, but it can happen. I'll attach a patch that should work around this (though I have never been able to see a backtrace or even get a crash after I applied it).

The race condition would be:

  1. Activity is closed
  2. Buddy appears; idle handler registered
  3. idle handler _emit_buddy_appeared_signal starts running (sugar-session main loop is blocked)
  4. Activities X window is destroyed (sugar-session does not recieve the X event as the main loop blocked)
  5. sugar-session rebuilds the invite menu (_update_invite_menu)
  6. sugar-session tries to get a property of the destroyed window
  7. sugar will/would be notified that the window disappeared and the activity is not the active one anymore

  Changed 4 years ago by benzea

  • cc benzea added

Changed 4 years ago by benzea

Patch that adds error traps to wm.py

  Changed 4 years ago by tomeu

  • keywords r! added; r+ removed

That's great! What about adding a wrapper function around property_change with the trap? That would remove lots of duplicated code.

Changed 4 years ago by benzea

ok, newer better

follow-up: ↓ 16   Changed 4 years ago by benzea

  • status changed from reopened to closed
  • resolution set to fixed

in reply to: ↑ 15   Changed 4 years ago by sascha_silbe

Replying to benzea:

 http://git.sugarlabs.org/projects/sugar-toolkit/repos/mainline/commits/d304f72872fa346dd2c3cedd1d3e508a3aeb5107

Is it intentional that you pop twice in _property_get_trapped()?

  Changed 3 years ago by alsroot

  • keywords olpc-0.84 added

  Changed 3 years ago by alsroot

  • keywords olpc-0.84 removed
Note: See TracTickets for help on using tickets.