Opened 8 years ago

Closed 5 years ago

#2974 closed defect (fixed)

Resuming journal entries may cause Sugar to falsely report failed activity launch

Reported by: greenfeld
Owned by: erikos
Priority: Low
Milestone: Unspecified
Component: Sugar
Version: 0.92.x
Severity: Unspecified
Keywords:
Cc:
Distribution/OS: OLPC
Bug Status: Unconfirmed

Description

Resuming journal entries may occasionally cause Sugar to falsely report that the Activity used to resume the entry failed to launch. So far I've had this happen once on an XO-1.5 resuming a simple TurtleArt script, and once resuming a Wikipedia activity session. No other activities were running on the XO at the time.

Pressing the button at the "XYZ failed to launch" message dismisses the launch screen, and the activity, with the Journal entry loaded, can be found underneath it or by tabbing over to it. The activity also appears in the Frame, with the icon for the activity instance still flashing as if it were launching.

May be related to #2958, or just caused by similar conditions. I don't see any obvious explanation in sugar.log or the activity's log when this happens. Seen in 11.2.0 os873.

Change History (9)

comment:1 Changed 8 years ago by greenfeld

  • Component changed from untriaged to sugar

comment:2 Changed 8 years ago by walter

Turtle Art launch times vary depending upon the complexity of the resumed script. Can you please attach the script you were using that caused the problem?

comment:3 Changed 8 years ago by greenfeld

It was the same TurtleArt script being run over and over again, a simple:

  • Start block
  • Forward 100
  • Right 90

Which was then saved to a journal entry, the journal entry restarted, and then the activity stopped, the journal entry restarted, ....

The Wikipedia entry had a single favorite, a few pages of browse history, and a different page than the favorite set to the current one. It only happened the first time I resumed the entry.

Interestingly (for TurtleArt) the trace is not saved, but the turtle location/cursor is saved, which causes a different segment of the resulting square to be drawn each time.

comment:4 Changed 7 years ago by greenfeld

  • Owner set to erikos
  • Status changed from new to assigned

Martin and I are noticing a reasonably high rate of launch failures with OLPC 13.1.0/Sugar 0.97.x os versions. With his datastore patch Martin reports that launching Write "fails" 40% of the time on XO-4.

I am not seeing failure rates that high (going through a variety of activities) but this issue in the latest builds has clearly gone from once in a blue moon to periodic nuisance.

What do we have to do to diagnose this?

comment:5 Changed 7 years ago by dsd

  • Milestone changed from Unspecified by Release Team to 0.98

comment:6 Changed 7 years ago by godiard

  • Priority changed from Unspecified by Maintainer to Low

comment:7 Changed 6 years ago by dnarvaez

  • Milestone changed from 0.98 to Unspecified

comment:8 Changed 5 years ago by quozl

In Sugar 0.104.1, launch failure detection works by scheduling a 90 second timeout in the notify_launch method of ShellModel. The source id of the timeout is not kept, so there is no way to cancel it.

At the expiry of the timeout, _check_activity_launched is called; if the activity has been closed already, nothing is done. But if there is a launcher for the same activity id (which will happen if it is a resume from journal), then a launch failure will be reported spuriously.

A fix could be to delete the timeout if another instance is being started (see pr 481), or detect the process exit status and delete the timeout.
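
The sequence described in this comment can be reproduced in a small simulation. This is only a sketch under assumptions: FakeLoop is a hypothetical stand-in for the GLib main loop, the ShellModel below is a stripped-down model rather than Sugar's real class, and mark_launched and simulate_resume are illustrative names. Only notify_launch, _check_activity_launched and the 90 second timeout come from the description above.

```python
import itertools


class FakeLoop:
    """Hypothetical stand-in for a main loop with one-shot timeouts."""

    def __init__(self):
        self.now = 0
        self._timers = {}  # source id -> (due time, callback, args)
        self._ids = itertools.count(1)

    def timeout_add_seconds(self, seconds, callback, *args):
        source_id = next(self._ids)
        self._timers[source_id] = (self.now + seconds, callback, args)
        return source_id

    def advance(self, seconds):
        """Move the clock forward and fire every timer that is now due."""
        self.now += seconds
        due = sorted(
            (when, sid)
            for sid, (when, _cb, _args) in self._timers.items()
            if when <= self.now
        )
        for _when, sid in due:
            _due, callback, args = self._timers.pop(sid)
            callback(*args)


class ShellModel:
    """Reduced model of the launch bookkeeping (not Sugar's real class)."""

    LAUNCH_TIMEOUT = 90

    def __init__(self, loop):
        self._loop = loop
        self.launching = set()  # activity ids that still have a launcher
        self.failures = []      # activity ids reported as failed

    def notify_launch(self, activity_id):
        self.launching.add(activity_id)
        # The returned source id is discarded, so this timer can never
        # be cancelled, even after the activity has launched and closed.
        self._loop.timeout_add_seconds(
            self.LAUNCH_TIMEOUT, self._check_activity_launched, activity_id)

    def mark_launched(self, activity_id):
        # Hypothetical helper: the launcher goes away once the window is up.
        self.launching.discard(activity_id)

    def _check_activity_launched(self, activity_id):
        # A launcher with the same activity id counts as "still launching".
        if activity_id in self.launching:
            self.failures.append(activity_id)


def simulate_resume():
    loop = FakeLoop()
    model = ShellModel(loop)
    model.notify_launch("abc")   # t=0: first launch, timer due at t=90
    loop.advance(5)
    model.mark_launched("abc")   # t=5: launched fine, later stopped by user
    loop.advance(83)
    model.notify_launch("abc")   # t=88: resume from Journal, same id
    loop.advance(2)              # t=90: the stale first timer fires
    return model.failures
```

Calling simulate_resume() returns ["abc"]: the second launch is reported as failed two seconds after it began, because the timer left over from the first launch finds a launcher with the same activity id.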

Last edited 5 years ago by quozl

comment:9 Changed 5 years ago by James Cameron

  • Resolution set to fixed
  • Status changed from assigned to closed

prevent duplicate launch timers

"failed to start" will occur if an activity is started exactly 90
seconds after any previous start. This is easy to reproduce if an
activity is repeatedly started and stopped.

This was caused by the previous launch timers. There was no limit to
the number of launch timers that may be pending.

Delete any existing launch timer before starting a new one. Also
removed stale FIXME for status checking; this is done in the toolkit.
(in _child_watch_cb method of activityfactory.py)

Fixes #2974.

Tested-by: James Cameron <quozl@…>
Signed-off-by: James Cameron <quozl@…>

Changeset: 6528ba50dfcf2fd59f22812387975beacc1f3fdc
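
The approach in the commit message (delete any existing launch timer before starting a new one) can be sketched as follows. This is not the changeset itself: FakeLoop is a hypothetical stand-in for the GLib main loop, and the _launch_timer attribute, mark_launched and simulate_resume_with_fix are names assumed for illustration.

```python
class FakeLoop:
    """Hypothetical stand-in for a main loop with removable timeouts."""

    def __init__(self):
        self.now = 0
        self._timers = {}  # source id -> (due time, callback, args)
        self._next_id = 1

    def timeout_add_seconds(self, seconds, callback, *args):
        source_id = self._next_id
        self._next_id += 1
        self._timers[source_id] = (self.now + seconds, callback, args)
        return source_id

    def source_remove(self, source_id):
        """Cancel a pending timer; return True if it was still pending."""
        return self._timers.pop(source_id, None) is not None

    def advance(self, seconds):
        """Move the clock forward and fire every timer that is now due."""
        self.now += seconds
        for sid in sorted(self._timers):
            entry = self._timers.get(sid)
            if entry is not None and entry[0] <= self.now:
                del self._timers[sid]
                _due, callback, args = entry
                callback(*args)


class ShellModel:
    """Reduced model of the launch bookkeeping (not Sugar's real class)."""

    LAUNCH_TIMEOUT = 90

    def __init__(self, loop):
        self._loop = loop
        self._launch_timer = None  # assumed attribute holding the source id
        self.launching = set()
        self.failures = []

    def notify_launch(self, activity_id):
        self.launching.add(activity_id)
        # Cancel the timer left over from any earlier launch first.
        if self._launch_timer is not None:
            self._loop.source_remove(self._launch_timer)
        self._launch_timer = self._loop.timeout_add_seconds(
            self.LAUNCH_TIMEOUT, self._check_activity_launched, activity_id)

    def mark_launched(self, activity_id):
        # Hypothetical helper: the launcher goes away once the window is up.
        self.launching.discard(activity_id)

    def _check_activity_launched(self, activity_id):
        self._launch_timer = None  # this timer has fired and is gone
        if activity_id in self.launching:
            self.failures.append(activity_id)


def simulate_resume_with_fix():
    loop = FakeLoop()
    model = ShellModel(loop)
    model.notify_launch("abc")   # t=0: first launch
    loop.advance(5)
    model.mark_launched("abc")   # t=5: launched fine, later stopped by user
    loop.advance(83)
    model.notify_launch("abc")   # t=88: resume; the old timer is removed
    loop.advance(2)              # t=90: nothing fires any more
    at_90 = list(model.failures)
    loop.advance(88)             # t=178: second launch never completed
    return at_90, list(model.failures)
```

simulate_resume_with_fix() returns ([], ["abc"]): no spurious report at the 90 second mark, while a launch that really never completes is still reported 90 seconds after it began. Keeping a single timer slot means only the most recent launch is watched, which matches the wording of the commit message but is a simplification in this sketch.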
