#4889 closed defect (notsugar)
virtual memory leak in background control panel
| Reported by: | quozl | Owned by: | |
|---|---|---|---|
| Priority: | Normal | Milestone: | Unspecified |
| Component: | Sugar | Version: | Unspecified |
| Severity: | Minor | Keywords: | |
| Cc: | | Distribution/OS: | OLPC |
| Bug Status: | New | | |
Description
Virtual memory size of the shell increases by 20 MB for each ten launches of the background control panel.
To reproduce, capture virtual memory size at one second intervals;
```
while true; do
    echo $(date +%s.%N) \
        $(awk '/VmSize/ { print $2; }' /proc/$(pgrep -f jarabe)/status)
    sleep 1
done | tee memory.log
```
Then open the background control panel, close it, and repeat ten times. Compare the starting VmSize to the ending VmSize.
A graph to be attached.
Cause might be Fedora 18, Python, libraries, or Sugar.
Attachments (2)
Change History (13)
Changed 8 years ago by quozl
comment:1 Changed 8 years ago by quozl
The VmSize leak rate for background control panel show and hide is of the order of 151 kB per second.
A leak occurs with all GTK+ drawing operations done by Sugar. It even reproduces when scrolling the Journal, at a rate of 24 kB per second.
Automated reproducers:
```
sudo yum install xdotool

# journal scrolling
# prepare by adding journal entries until scroll bar is present
function test {
    xdotool \
        mousemove 1191 801 click 1 sleep 0.2 \
        mousemove 1191 200 click 1 sleep 0.2
}

# background control panel
# prepare by using f3 then position cursor in search box
function test {
    xdotool \
        mousemove 600 450 click 3 sleep 0.7 \
        mousemove 700 660 click 1 sleep 0.7 \
        mousemove 935 270 click 1 sleep 7 \
        key Escape sleep 0.1 \
        key Escape sleep 1
}

# test runner and vmsize logger
while true; do
    echo $(date +%s.%N) \
        $(awk '/VmSize/ { print $2; }' /proc/$(pgrep -f jarabe)/status)
    test
done | tee memory.log
```
Log file analysis tool:
```
#!/usr/bin/python
import glob

for name in glob.glob('memory*.log'):
    lines = file(name, 'r').readlines()
    t0 = float(lines[0].split()[0])
    t1 = float(lines[-1].split()[0])
    v0 = float(lines[0].split()[1])
    v1 = float(lines[-1].split()[1])
    print '%.0f | %s' % (((v1 - v0) / (t1 - t0)), name)
```
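For anyone re-running the analysis today, here is a hedged Python 3 port of the tool above (the original is Python 2, where `file()` and the `print` statement no longer exist):

```python
#!/usr/bin/env python3
# Python 3 port of the log analysis tool above.
import glob


def leak_rate(lines):
    """Return the kB/s leak rate from lines of 'timestamp vmsize_kb' pairs."""
    t0, v0 = (float(x) for x in lines[0].split()[:2])
    t1, v1 = (float(x) for x in lines[-1].split()[:2])
    return (v1 - v0) / (t1 - t0)


if __name__ == '__main__':
    for name in glob.glob('memory*.log'):
        with open(name) as log:
            print('%.0f | %s' % (leak_rate(log.readlines()), name))
```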
comment:2 Changed 8 years ago by quozl
Further testing on XO-1.5 with 1 GB RAM.
According to changes in /proc/$(pgrep -f jarabe)/smaps the area of virtual memory that leaks corresponds to the heap. The virtual memory is also part of the resident set VmRSS, since there is no swap.
Eventually the heap consumes system memory and Jarabe hangs. Before it hangs, it grows slower and slower. The test above had to be changed to give time for Jarabe to respond;
```
function test {
    xdotool \
        mousemove 600 450 click 3 sleep 30 \
        mousemove 700 660 click 1 sleep 30 \
        mousemove 935 270 click 1 sleep 100 \
        key Escape sleep 1 \
        key Escape sleep 40
}
```
Next step might be to use a tool to analyse what is in the heap.
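One low-tech way to watch the heap specifically, rather than the whole VmSize, is to sum the `Size:` fields of the `[heap]` mappings in smaps. A sketch, assuming the standard Linux smaps layout (mapping header lines followed by attribute lines like `Size:`, `Rss:`):

```python
# Sum the Size: fields of all [heap] mappings in a /proc/<pid>/smaps dump.
def heap_kb(smaps_text):
    total = 0
    in_heap = False
    for line in smaps_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if fields[0].endswith(':'):
            # attribute line (Size:, Rss:, VmFlags:, ...)
            if in_heap and fields[0] == 'Size:':
                total += int(fields[1])
        else:
            # mapping header line; note whether this mapping is the heap
            in_heap = line.endswith('[heap]')
    return total


if __name__ == '__main__':
    import os
    if os.path.exists('/proc/self/smaps'):
        with open('/proc/self/smaps') as smaps:
            print(heap_kb(smaps.read()), 'kB of heap')
```

Sampling this in the same loop as the VmSize logger would show whether the growth is confined to the heap mapping, as the smaps comparison above suggests.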
comment:3 Changed 8 years ago by quozl
On an XO-1,
- after five uses of the background control panel, free memory is fully consumed, and user interface response times begin to increase,
- after ten uses, user interface response times are more than double normal, and swap space is being used.
A quick fix may be to force a restart after changing background.
comment:4 Changed 8 years ago by SAMdroid
Testing on sugar-build. I used a Python library, Pympler, to view leaked objects [1]. I started the SummaryTracker in the show_section_view function like:
```
diff --git a/src/jarabe/controlpanel/gui.py b/src/jarabe/controlpanel/gui.py
index 442857d..8a51917 100644
--- a/src/jarabe/controlpanel/gui.py
+++ b/src/jarabe/controlpanel/gui.py
@@ -31,6 +31,7 @@ from jarabe.controlpanel.toolbar import MainToolbar
 from jarabe.controlpanel.toolbar import SectionToolbar
 from jarabe import config
 from jarabe.model import shell
+from pympler import tracker

 _logger = logging.getLogger('ControlPanel')

@@ -193,7 +194,9 @@ class ControlPanel(Gtk.Window):
         if self._section_view is not None:
             self._section_view.destroy()
             self._section_view = None
+            self._tr.print_diff()
+            self._tr = tracker.SummaryTracker()

         self._set_toolbar(self._main_toolbar)
         self._main_toolbar.show()
         self._set_canvas(self._scrolledwindow)
```
Here are the outputs:
types | # objects | total size
---|---|---
list | 5343 | 543.05 KB
str | 5422 | 335.25 KB
int | 627 | 14.70 KB
code | 21 | 2.62 KB
dict | 1 | 2.52 KB
<class 'gi.types.GObjectMeta | 2 | 1.77 KB
tuple | 12 | 1.14 KB
method_descriptor | 12 | 864 B
weakref | 5 | 440 B
wrapper_descriptor | 5 | 400 B
function (undo) | 3 | 360 B
function (__init__) | 2 | 240 B
member_descriptor | 3 | 216 B
function (do_get_property) | 1 | 120 B
function (do_set_property) | 1 | 120 B

types | # objects | total size
---|---|---
set | 1 | 232 B
code | 1 | 128 B
function (store_info) | 1 | 120 B
cell | 2 | 112 B
float | 1 | 24 B
list | 0 | 8 B
str | -1 | -50 B
tuple | -1 | -56 B
<class 'sugar3.graphics.icon._IconBuffer | -1 | -64 B
<class 'pympler.tracker.SummaryTracker | -1 | -64 B
instance | -1 | -72 B
<class 'jarabe.model.buddy.BuddyModel | -1 | -80 B
<class 'jarabe.view.buddyicon.BuddyIcon | -1 | -80 B
<class 'gi.repository.SugarGestures.LongPressController | -1 | -80 B
<class 'sugar3.graphics.palettewindow.CursorInvoker | -1 | -80 B

types | # objects | total size
---|---|---
dict | 36 | 30.84 KB
str | 48 | 8.90 KB
instancemethod | 92 | 7.19 KB
<class 'dbus.connection.SignalMatch | 37 | 6.94 KB
dbus.String | 19 | 5.23 KB
list | 29 | 2.59 KB
tuple | 38 | 2.38 KB
cell | 40 | 2.19 KB
function (signal_cb) | 17 | 1.99 KB
function (callback) | 17 | 1.99 KB
<class 'dbus.bus.NameOwnerWatch | 17 | 1.06 KB
dbus.ObjectPath | 6 | 957 B
long | 26 | 728 B
dbus.ByteArray | 2 | 662 B
set | 0 | 512 B

types | # objects | total size
---|---|---
str | 46 | 8.64 KB
<class 'dbus.connection.SignalMatch | 38 | 7.12 KB
cell | 94 | 5.14 KB
list | 44 | 3.40 KB
function (signal_cb) | 19 | 2.23 KB
function (callback) | 19 | 2.23 KB
function (msg_reply_handler) | 15 | 1.76 KB
<class 'dbus.bus.NameOwnerWatch | 19 | 1.19 KB
dbus.ObjectPath | 7 | 1014 B
function (error_cb) | 7 | 840 B
functools.partial | 6 | 528 B
dbus.lowlevel.PendingCall | 22 | 528 B
dbus.UTF8String | 10 | 417 B
code | 3 | 384 B
<class 'threading._RLock | 5 | 320 B
At first appearance, most of these seem to come from the neighbourhood view. I need to do a better test, but yeah.
comment:5 Changed 8 years ago by quozl
- thanks, that's interesting. Looking forward to some more detailed tests. How well does the Pympler summary correlate to the VmSize change?
- some repeated mouse click testing, monitoring VmSize while simulating mouse clicks;
- Moon activity orientation button is unaffected (switches between southern and northern hemisphere view),
- Moon activity grid button is unaffected (overlays a grid or not),
- Moon activity tool bar is affected a tiny bit,
- StopWatch activity tool bar sharing icon is affected a tiny bit,
- an F1 F3 key press loop shows only 21 kbyte/sec leak on XO-1, not the 85 kbyte/sec leak on XO-4 found earlier, so either architecture or the number and complexity of activity icons matters,
- for the record, some earlier discussion of this problem is in pr 520, with an analysis of potential relevant code by Gonzalo,
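On the correlation question raised above, both numbers can be sampled in-process and logged side by side. A sketch, assuming a Linux /proc filesystem; the Pympler part is an optional extra and only runs if the library is installed:

```python
# Sample our own VmSize alongside Pympler's Python-object total, to see
# how much of the VmSize growth is visible as Python objects at all.
def vmsize_kb():
    """Read our own VmSize (kB) from /proc; returns 0 off Linux."""
    try:
        with open('/proc/self/status') as status:
            for line in status:
                if line.startswith('VmSize:'):
                    return int(line.split()[1])
    except OSError:
        pass
    return 0


if __name__ == '__main__':
    size = vmsize_kb()
    try:
        # Pympler is an assumption here: used only if installed.
        from pympler import muppy, summary
        rows = summary.summarize(muppy.get_objects())
        objects_kb = sum(row[2] for row in rows) // 1024
    except ImportError:
        objects_kb = None
    print('VmSize %d kB, Python objects %s kB' % (size, objects_kb))
```

If the Python-object total stays flat while VmSize climbs, the leak is below the interpreter, in C allocations, which would be consistent with the later valgrind findings.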
comment:6 Changed 8 years ago by quozl
- the leak of virtual memory in the background control panel reproduces on Ubuntu 14.04 on x86_64, and Sugar 0.107.0, at a rate of 145 kbyte/sec, and when graphed is a linear ramp,
- a stress test of Icon shows no leak, VmSize is stable, (Hello World was used, see the replaced activity.py),
- increasing the number of icons in the test from 40 to 100 causes the VmSize to ramp up and down like a sawtooth wave as the LRU surface cache is overburdened; in icon.py it is set to 50 surfaces, and because the surfaces are different sizes in the test the result is expected,
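The sawtooth described above is the expected behaviour of any bounded LRU cache when the working set exceeds its capacity: every access misses, so surfaces are continually created and evicted. A minimal illustrative sketch (not Sugar's actual icon.py code), assuming a capacity of 50:

```python
from collections import OrderedDict


class SurfaceCache:
    """Minimal LRU cache; illustrative only, not Sugar's implementation."""

    def __init__(self, capacity=50):
        self.capacity = capacity
        self.misses = 0
        self._cache = OrderedDict()

    def get(self, key, create):
        if key in self._cache:
            self._cache.move_to_end(key)     # mark most recently used
            return self._cache[key]
        self.misses += 1
        self._cache[key] = create(key)
        if len(self._cache) > self.capacity:
            self._cache.popitem(last=False)  # evict least recently used
        return self._cache[key]


# Cycling 100 distinct surfaces through a 50-entry cache: every access
# misses, so the cache churns instead of helping -- hence the sawtooth.
cache = SurfaceCache(capacity=50)
for _ in range(2):
    for key in range(100):
        cache.get(key, lambda k: object())
print(cache.misses)  # 200: no hits at all
```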
comment:7 Changed 8 years ago by SAMdroid
It's not just the background cpanel code causing leaks; the whole cpanel UI leaks as well.
Look at the test results for the frame cpanel (attached). Opening and closing the frame cpanel with the mouse causes a 25 kbyte/sec increase in memory usage. The frame cpanel contains just Gtk+ ranges and labels - no fancy pixbuf loading.
Changed 8 years ago by SAMdroid
comment:8 Changed 8 years ago by quozl
Agreed. However the Background panel consumes the most memory, so it is the easiest to test with. A canary in the coal mine.
I was starting to worry that it might be the unusual way that control panel section views are called, in show_section_view() in src/jarabe/controlpanel/gui.py:
```
mod = __import__('.'.join(('cpsection', option, 'view')),
                 globals(), locals(), ['view'])
view_class = getattr(mod, self._options[option]['view'], None)
```
But still a 175 kbyte/sec leak occurs if the background panel is modified to rescan and reload the images in the Gtk.ListView when the "Clear background" button is pressed. See first file changed in my test patch. That's not closing and opening the panel.
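For reference, the `__import__` pattern quoted above maps directly onto `importlib`, so the unusual call style itself is unlikely to matter. A sketch, with a stdlib module standing in for the `cpsection` view modules since they are not importable outside Sugar (the function name here is a placeholder, not Sugar's):

```python
import importlib


def load_view_class(module_path, view_name):
    """Dynamic-import equivalent of the __import__ call above."""
    mod = importlib.import_module(module_path)
    return getattr(mod, view_name, None)


# In Sugar this would be load_view_class('cpsection.%s.view' % option, ...);
# here, fetch os.path.join by name as a stand-in demonstration.
join = load_view_class('os.path', 'join')
```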
comment:9 Changed 8 years ago by quozl
Comparing an XO-1 with OLPC OS 13.2.1 with Sugar 0.98 against OLPC OS 13.2.6 with Sugar 0.107.0, using only repeated frame control panel entry and exit, automated using xdotool.
OLPC OS and Sugar | VmSize leak (kbyte/sec)
---|---
13.2.1 and 0.98 | 3.711
13.2.6 and 0.107.0 | 1.887
The leak is reduced in the later OLPC OS and Sugar versions.
comment:10 Changed 8 years ago by quozl
- Resolution set to notsugar
- Status changed from new to closed
Testing with valgrind confirms many pygobject memory leaks exist.
Interesting tickets;
- https://bugzilla.gnome.org/show_bug.cgi?id=726999#c7
- https://bugzilla.gnome.org/show_bug.cgi?id=735018
Doesn't seem to be caused by Sugar. Closing ticket.
comment:11 Changed 5 years ago by aperezbios
For posterity, documenting that GNOME Bug 726999 was closed as RESOLVED FIXED back on 2014-08-18.
The other bug does not appear to have ever been fixed, and was closed as RESOLVED OBSOLETE on 2018-01-10.
XO-1.5 OLPC OS 13.2.5 Sugar 0.106.1 My Settings Background VmSize leak test, eleven cycles, 2015-10-06, by James Cameron