Opened 14 years ago
Closed 10 years ago
#1407 closed defect (fixed)
Blanked metadata data-store entry damage (possibly caused during a shell crash)
Reported by: | garycmartin | Owned by: | martin.langhoff |
---|---|---|---|
Priority: | Unspecified by Maintainer | Milestone: | Unspecified |
Component: | Sugar | Version: | Git as of bugdate |
Severity: | Critical | Keywords: | |
Cc: | sascha_silbe, erikos | Distribution/OS: | Unspecified |
Bug Status: | Needinfo |
Description
I've twice seen a case where a single data-store entry is corrupted by having its metadata files zeroed out. This causes the Journal to display it with the default MIME data document icon and no title, and the icon randomly cycles through different colour schemes as you move the mouse cursor over it. You cannot access its details view, and it has no palette, so you can't erase it via the UI. With a recent build (today's) it now also shows a grey 0% download bar.
Finding and looking at the data-store entry in both cases showed the data was valid and intact (one case was a TA project, the other a Labyrinth mind-map). Looking at their metadata files, all were zero bytes, except for the checksum that looked like a valid hash.
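For anyone wanting to check their own data store for the same damage, here is a minimal sketch of a scan. It assumes each entry keeps its properties as one file per property inside a directory named `metadata` (that directory name and the `find_blanked_entries` helper are assumptions for illustration, not part of Sugar), and it ignores the checksum file since that one looked valid in both cases.

```python
import os


def find_blanked_entries(root, ignore=("checksum",)):
    """Return metadata directories under root whose property files are all empty.

    Assumption: every entry stores its properties as separate files in a
    directory called "metadata". Files listed in `ignore` are not checked.
    """
    blanked = []
    for dirpath, _dirnames, filenames in os.walk(root):
        if os.path.basename(dirpath) != "metadata":
            continue
        props = [name for name in filenames if name not in ignore]
        # Only report directories that have property files, all of size zero.
        if props and all(
            os.path.getsize(os.path.join(dirpath, name)) == 0 for name in props
        ):
            blanked.append(dirpath)
    return sorted(blanked)
```

Calling `find_blanked_entries()` with the path of the data store directory returns the list of suspect metadata directories, which can then be inspected by hand.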
I think in both cases Sugar had previously crashed (sugar-jhbuild and either a full Fedora crash or a Xephyr black screen requiring a force quit), but I am not 100% sure, as I spotted the Journal entries a while later. I think it likely that the specific Journal entry was also resumed at the time.
A couple of screenshots attached. Will try to reproduce and get some logs to post if possible.
Attachments (3)
Change History (10)
Changed 14 years ago by garycmartin
Changed 14 years ago by garycmartin
comment:1 in reply to: ↑ description ; follow-up: ↓ 2 Changed 14 years ago by sascha_silbe
- Owner changed from tomeu to sascha_silbe
- Severity changed from Major to Critical
- Status changed from new to accepted
comment:2 in reply to: ↑ 1 Changed 14 years ago by tomeu
Replying to sascha_silbe:
We could modify metadatastore to ensure file content integrity even without data journalling (create new metadata directory, fsync() contents after writing, move new directory in place) but it would be a quite invasive patch, so
a) it won't make it into 0.86 and
b) I'm not sure it's worth the effort (with version support the impact will be much smaller and also easier to handle).
c) if Sugar can tolerate entries without any metadata, it may be better to at least keep the data file.
Though, as Gary has mentioned that he has had several cases of this particular failure, I would suspect a bug in our DS, rather than general system flakiness.
comment:3 Changed 14 years ago by sascha_silbe
- Bug Status changed from Unconfirmed to Needinfo
Patch attached, please report back when you reproduced the issue.
comment:4 Changed 14 years ago by garycmartin
Patched and awaiting recurrence (actually, F11 KP'ed on me earlier today while I was setting up this test).
comment:5 Changed 11 years ago by martin.langhoff
- Cc sascha_silbe erikos added
- Owner changed from sascha_silbe to martin.langhoff
- Status changed from accepted to assigned
This seems to be exactly the issue addressed by http://lists.sugarlabs.org/archive/sugar-devel/2012-September/039729.html
comment:6 Changed 10 years ago by dnarvaez
- Component changed from sugar-datastore to Sugar
comment:7 Changed 10 years ago by dnarvaez
- Resolution set to fixed
- Status changed from assigned to closed
Replying to garycmartin:
There are actually two issues then:
a) datastore entry getting corrupted somehow
b) Journal behaving funkily on entries with unexpected (but syntactically valid) metadata
I suggest opening a new bug for the second issue.
The checksum is set by optimizer.py using metadatastore.set_property(). So there are two possible "culprits":
a) "GUI"-side (shell / activity (framework))
b) datastore
I'll prepare a patch that traces update() calls. Please apply this patch, set SUGAR_LOGGER_LEVEL to trace (or all) and report back with the update() call for the broken entry when you encounter the bug again. Any recipe for reproducing this bug would be even better, of course. :)
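To illustrate the kind of tracing meant here (this is not the attached patch, only a sketch of the idea), a decorator can log every call with its arguments. The `TRACE` level value, the logger name `datastore.trace` and the `FakeDataStore` class are assumptions made up for the example.

```python
import functools
import logging

# Assumption: a custom level below DEBUG, used only for call tracing.
TRACE = 5
logging.addLevelName(TRACE, "TRACE")

_logger = logging.getLogger("datastore.trace")


def trace(func):
    """Log the name and arguments of every call to func at TRACE level."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        _logger.log(TRACE, "%s(args=%r, kwargs=%r)", func.__name__, args, kwargs)
        return func(*args, **kwargs)
    return wrapper


class FakeDataStore:
    """Hypothetical stand-in for the real data store service."""

    @trace
    def update(self, uid, props, file_path=""):
        # A real implementation would write props and the data file here.
        return uid
```

With the logger level lowered to `TRACE`, each update() call leaves one log line showing the uid and the properties that were passed in, which is what is needed to tell whether the blank values came from the caller or from the data store itself.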
A full-machine (=kernel) crash would be a likely candidate for these symptoms. By default, ext3/4 only ensures metadata (=directory entries) integrity, but _not_ file content (=content of our metadata entries) integrity. I'm using data=journal for virtually all of my filesystems for exactly this reason.
We could modify metadatastore to ensure file content integrity even without data journalling (create new metadata directory, fsync() contents after writing, move new directory in place) but it would be a quite invasive patch, so
a) it won't make it into 0.86 and
b) I'm not sure it's worth the effort (with version support the impact will be much smaller and also easier to handle).
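The three steps proposed above (create a new metadata directory, fsync() the contents after writing, move the new directory into place) could look roughly like the sketch below. It is not the metadatastore code: the function names, the `metadata` directory name and the `.old` suffix are assumptions, and it relies on POSIX behaviour (fsync on a directory descriptor). Because rename() cannot replace a non-empty directory, the old directory is moved aside first, so there is still a short window between the two renames that recovery code would have to handle.

```python
import os
import shutil
import tempfile


def _fsync_dir(path):
    """Flush a directory's entries to disk (POSIX only)."""
    fd = os.open(path, os.O_RDONLY)
    try:
        os.fsync(fd)
    finally:
        os.close(fd)


def replace_metadata_dir(entry_dir, properties):
    """Write properties (name -> bytes) to a fresh directory, then swap it in."""
    final = os.path.join(entry_dir, "metadata")
    old = final + ".old"

    # 1. Create the new metadata directory next to the current one.
    new = tempfile.mkdtemp(prefix="metadata.new.", dir=entry_dir)

    # 2. Write every property file and fsync() its contents.
    for name, value in properties.items():
        with open(os.path.join(new, name), "wb") as f:
            f.write(value)
            f.flush()
            os.fsync(f.fileno())
    _fsync_dir(new)

    # 3. Move the new directory into place, keeping the old one until the end.
    shutil.rmtree(old, ignore_errors=True)  # leftover from an earlier crash
    had_old = os.path.isdir(final)
    if had_old:
        os.rename(final, old)
    os.rename(new, final)
    _fsync_dir(entry_dir)
    if had_old:
        shutil.rmtree(old)
```

After a crash at any point, either the previous directory (as `metadata` or `metadata.old`) or the fully written new one is still on disk, so the property files should never end up as zero-byte stubs.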