Opened 7 years ago

Closed 7 years ago

Last modified 6 years ago

#3670 closed defect (fixed)

Browse segfaults when closing

Reported by: erikos Owned by: erikos
Priority: Immediate Milestone:
Component: Browse Version: 0.96.x
Severity: Major Keywords: 13.1.0, patch olpc-test-passed
Cc: dsd, manuq, humitos, benzea Distribution/OS: OLPC
Bug Status: Assigned

Description

The segfault is visible when looking into the log file.

Attachments (7)

0001-Remove-GObject-threads-SL-3670.patch (1.1 KB) - added by manuq 7 years ago.
pygobject.patch (1.3 KB) - added by benzea 7 years ago.
The untested patch, sorry, it doesn't want to compile here right now.
session-bug.patch (557 bytes) - added by manuq 7 years ago.
Simple script that shows the issue.
test_session-2.py (179 bytes) - added by manuq 7 years ago.
Simpler test script from benzea
pygobject.2.patch (1.6 KB) - added by benzea 7 years ago.
Tested patch, this fixes the crash.
browse-crash-fix.patch (1.0 KB) - added by benzea 7 years ago.
Nasty workaround that could be done in the activity.
0001-Do-not-do-any-python-calls-when-GObjects-are-destroy.patch (2.0 KB) - added by erikos 7 years ago.
Patch that applies on pygobject master

Download all attachments as: .zip

Change History (31)

comment:1 follow-up: Changed 7 years ago by erikos

  • Cc dsd manuq added
Program received signal SIGSEGV, Segmentation fault.
new_threadstate (interp=interp@entry=0x0, init=init@entry=1) at /usr/src/debug/Python-2.7.3/Python/pystate.c:200
200	        tstate->next = interp->tstate_head;
(gdb) where
#0  new_threadstate (interp=interp@entry=0x0, init=init@entry=1) at /usr/src/debug/Python-2.7.3/Python/pystate.c:200
#1  0x4918733e in PyThreadState_New (interp=0x0) at /usr/src/debug/Python-2.7.3/Python/pystate.c:211
#2  0x49187cc6 in PyGILState_Ensure () at /usr/src/debug/Python-2.7.3/Python/pystate.c:598
#3  0x00231706 in pyglib_gil_state_ensure () at pyglib.c:100
#4  0x00254d9b in pygobject_data_free (data=0x9cbd900) at pygobject.c:60
#5  0x003acc93 in g_datalist_clear (datalist=datalist@entry=0x98ba018) at gdataset.c:283
#6  0x001cbf09 in g_object_finalize (object=0x98ba010) at gobject.c:1018
#7  0x45b84119 in finalize (object=0x98ba010) at soup-session.c:269
#8  0x001cbba8 in g_object_unref (_object=0x98ba010) at gobject.c:3018
#9  0x012503ed in webkitExit () at Source/WebKit/gtk/webkit/webkitglobals.cpp:280
#10 0x48e0ed01 in __run_exit_handlers (status=status@entry=0, listp=0x48f873c4, run_list_atexit=run_list_atexit@entry=true) at exit.c:78
#11 0x48e0ed8d in __GI_exit (status=0) at exit.c:100
#12 0x48df663d in __libc_start_main (main=0x8048550 <main>, argc=10, ubp_av=0xbf8312d4, init=0x8048680 <__libc_csu_init>, fini=0x80486f0 <__libc_csu_fini>, rtld_fini=
    0x48dc9a90 <_dl_fini>, stack_end=0xbf8312cc) at libc-start.c:258
#13 0x080485a1 in _start ()
(gdb) list
195	
196	        if (init)
197	            _PyThreadState_Init(tstate);
198	
199	        HEAD_LOCK();
200	        tstate->next = interp->tstate_head;
201	        interp->tstate_head = tstate;
202	        HEAD_UNLOCK();
203	    }
204	

comment:2 Changed 7 years ago by manuq

Help activity, which also uses webkitgtk3, doesn't have this issue.

comment:3 Changed 7 years ago by manuq

The issue is gone if WebKit.get_default_session() call is removed, so I have to figure out what's the problem with it:

diff --git a/webactivity.py b/webactivity.py
index d33913f..5ed5272 100644
--- a/webactivity.py
+++ b/webactivity.py
@@ -154,15 +154,15 @@ class WebActivity(activity.Activity):
 
         _logger.debug('Starting the web activity')
 
-        session = WebKit.get_default_session()
-        session.set_property('accept-language-auto', True)
+#        session = WebKit.get_default_session()
+#        session.set_property('accept-language-auto', True)
 
         # By default, cookies are not stored persistently, we have to
         # add a cookie jar so that they get saved to disk.  We use one
         # with a SQlite database:
         cookie_jar = SoupGNOME.CookieJarSqlite(filename=_cookies_db_path,
                                                read_only=False)
-        session.add_feature(cookie_jar)
+#        session.add_feature(cookie_jar)
 
         _seed_xs_cookie(cookie_jar)

comment:4 Changed 7 years ago by erikos

  • Bug Status changed from Unconfirmed to Assigned

Ok, so the issue is inside the cleanup of webkit, see Source/WebKit/gtk/webkit/webkitglobals.cpp like mentioned in the trace. We fail in atexit(webkitExit), which does the unref of the session: g_object_unref(webkit_get_default_session()).

So suspecting the default session was a good one. There are threads as well involved, we do in Browse "GObject.threads_init()" which wehn you add it to your test program you will be able to reproduce the issue there as well.

We are close now :)

comment:5 Changed 7 years ago by manuq

Great finding erikos,

so as git history says, threads are there to fix this old bug: http://dev.laptop.org/ticket/5639

Which says:

Browse will crash if two things are done in order:

    1) the window is scrolled down at least once, and
    2) the zoom-in button is pressed in View

But I removed threads and I can't reproduce the bug. Tested in the XO and in my dev machine with sugar-jhbuild. Tried scrolling with the mouse, dragging the scrollbar, having the hand key pressed, and with the game pad. Nothing breaks.

So, this patch would fix it.

Changed 7 years ago by manuq

comment:6 Changed 7 years ago by humitos

  • Cc humitos added
  • Keywords patch added

comment:7 in reply to: ↑ 1 ; follow-ups: Changed 7 years ago by humitos

Replying to erikos:

Program received signal SIGSEGV, Segmentation fault.

I don't see the "Segmentation fault." on the log. This is what I get

Terminated by signal 11, pid 429 data (None, <open file '<fdopen>',
mode 'w' at 0x96a5020>, 'b3e26c69860201b3c492994bb7504a3980ea963c')

Signal 11 is the same as SIGSEGV. Where did you get that information?

After aplying the patch I get this:

Exited with status 0, pid 1480 data (None, <open file '<fdopen>',
mode 'w' at 0x96a5020>, 'a67b0adeb68f644d390346936c1be85e828b4338')

Well, reading this document (it's for gtk2, but it's the same concept) I found that if we want to update the GIU by another thread we must use GObject.threads_init() and then GObject.idle_add() but it doesn't say what's up if we use idle_add without threads_init which is our case now[1] that we removed GObject.threads_init()

[1] I'm not sure what is the function of that line

comment:8 in reply to: ↑ 7 Changed 7 years ago by manuq

Replying to humitos:

Replying to erikos:

Signal 11 is the same as SIGSEGV. Where did you get that information?

From wikipedia, "SIGSEGV is the signal sent to a process when it makes an invalid memory reference, or segmentation fault"

Well, reading this document (it's for gtk2, but it's the same concept) I found that if we want to update the GIU by another thread we must use GObject.threads_init() and then GObject.idle_add() but it doesn't say what's up if we use idle_add without threads_init which is our case now[1] that we removed GObject.threads_init()

[1] I'm not sure what is the function of that line

idle_add is used to do a call when the processor is idle, activities use it a lot without involving threads.

comment:9 in reply to: ↑ 7 Changed 7 years ago by manuq

Replying to humitos:

Replying to erikos:

Program received signal SIGSEGV, Segmentation fault.

I don't see the "Segmentation fault." on the log. This is what I get

Note that erikos is using gdb for debugging segfault.

comment:10 Changed 7 years ago by benzea

OK, just happened to see the e-mail thread, and then thinking that this was weird.

From the stack trace it looks to me like the Python Interpreter is already destroyed, but at the same time webkit is still doing some cleaning up and destroys the default session. This calls into pygobject even though the interpreter has already quit.

So, it seems to me that removing the init_threads() is just hiding the real problem. The real question is: Why are there still python objects attached to a GObject even though the interpreter has already quit?

Smells like a pygobject bug to me, it should be destroying all the objects, or guard against this.

comment:11 Changed 7 years ago by benzea

  • Cc benzea added

comment:12 Changed 7 years ago by humitos

OK. I think that GObject.threads_init() allow us to use the Python's threading module, e.i "import threading".

I checked that the threading module is imported in Browse in the pdfviewer.pdf but it's not used (and should be removed) and I think that it was added some time before.

Besides, Get Books uses Python's threading module to download things from the Internet (opds.py file) and if I comment the GObject.threads_init() line it don't work anymore.

The doc link seems to miss something in the middle of the explanation: pay attention to the "the the". Even more, if you copy and paste that line on a text entry you will see a big space between the "the the" words -> that made me think about there is something missing there...

comment:13 Changed 7 years ago by benzea

GObject.threads_init() initializes the threading support inside pygobject. It is essential to call it when using threads.

I expect that this could happen if webkit starts a thread and then calls back into python in a signal handler. I do expect that webkit is using threads internally, though it may be that it ensures that callbacks only happen from the main thread.

I have cooked up a patch against gobject introspection, haven't even compile tested it yet though. Basically the idea is to not do any python calls when GObjects are destroyed after the python interpreter has been finalized.

Changed 7 years ago by benzea

The untested patch, sorry, it doesn't want to compile here right now.

Changed 7 years ago by manuq

Simple script that shows the issue.

Changed 7 years ago by manuq

Simpler test script from benzea

Changed 7 years ago by benzea

Tested patch, this fixes the crash.

comment:14 Changed 7 years ago by manuq

Thanks a lot Benzea! Upstream ticket at https://bugzilla.gnome.org/show_bug.cgi?id=678046

Changed 7 years ago by benzea

Nasty workaround that could be done in the activity.

comment:15 Changed 7 years ago by benzea

So, I just attached a new patch against browse. This is a nasty workaround for the issue, but it will work, and can safely be removed once a final solution has been found upstream.

The idea is to add a reference to the session object so that the cleanup routine of WebKitGtk will not destroy it.

Changed 7 years ago by erikos

Patch that applies on pygobject master

comment:16 Changed 7 years ago by erikos

  • Milestone changed from 0.96 to 0.98
  • Severity changed from Critical to Major

Pushed to master and 0.96 branch: http://git.sugarlabs.org/browse/mainline/commit/480cd56ff330035310a0a24b736e4256a8cc76f4

For the upstream pygobject fix moving out to keep track of the progress.

comment:17 Changed 7 years ago by manuq

  • Keywords olpc-test-pending added
  • Resolution set to fixed
  • Status changed from new to closed

Fixed in os15 from olpc, which has the latest pygobject.

comment:18 Changed 7 years ago by manuq

  • Resolution fixed deleted
  • Status changed from closed to reopened

Not really, we still have PyGObject 3.2.2 and the fix is in 3.3.3 .

comment:19 Changed 7 years ago by manuq

  • Keywords olpc-test-pending removed

comment:20 Changed 7 years ago by dsd

  • Keywords 13.1.0 added; 12.1.0 removed
  • Resolution set to fixed
  • Status changed from reopened to closed

3.3.4 is shipped in 13.1.0 build 1, this can now be tested.

comment:21 Changed 7 years ago by dsd

  • Keywords olpc-test-pending added

comment:22 Changed 7 years ago by greenfeld

  • Keywords olpc-test-passed added; olpc-test-pending removed

I do not see Browse segfaulting on close in 13.1.0 os1.

comment:23 Changed 7 years ago by manuq

We reverted the change that removed the GObject threads in b0c0ba9a because the upstream bug is fixed, and Wikipedia was crashing without it.

comment:24 Changed 6 years ago by dnarvaez

  • Milestone 0.98 deleted

Milestone 0.98 deleted

Note: See TracTickets for help on using tickets.