Context Navigation

← Previous Ticket
Next Ticket →

#1160 closed defect (wontfix)

no way to wait for index to finish updating/(re)building

Reported by:	sascha_silbe	Owned by:	tomeu
Priority:	Unspecified by Maintainer	Milestone:
Component:	Sugar	Version:	Git as of bugdate
Severity:	Unspecified	Keywords:	r!
Cc:		Distribution/OS:	Unspecified
Bug Status:	New

Description

The datastore (re)builds the index in the background (idle loop), but has no API to wait for this to finish. That creates a race condition when using the datastore from a test suite:
The test suite needs to run from a clean directory every time (so it's deterministic), thus causing the datastore to do an index build on startup. The datastore is started automatically by DBus and the tests fire their API requests right away, so any API call that uses the index will fail to work as intended (for regular use).

Attachments (1)

0001-add-API-to-check-wait-for-index-rebuild-to-finish.patch (2.4 KB) - added by sascha_silbe 15 years ago.: add API to check and wait for index rebuild to finish

Download all attachments as: .zip

Change History (17)

Changed 15 years ago by sascha_silbe

Attachment 0001-add-API-to-check-wait-for-index-rebuild-to-finish.patch added

add API to check and wait for index rebuild to finish

comment:1 Changed 15 years ago by sascha_silbe

Keywords r? added

Implemented the check_ready() method and Ready() signal from my datastore redesign proposal. Also added a wait_ready() method that blocks the caller until the index (re)build has finished. pep8.py/pylint clean.

comment:2 in reply to: ↑ description Changed 15 years ago by tomeu

Replying to sascha_silbe:

The test suite needs to run from a clean directory every time (so it's deterministic), thus causing the datastore to do an index build on startup.

But in order to be deterministic, you also need to have the same data and metadata, having the same index is not enough. So, why not just delete the whole DS, then adding entries with Create?

comment:3 follow-up: ↓ 4 Changed 15 years ago by sascha_silbe

But in order to be deterministic, you also need to have the same data and metadata, having the same index is not enough. So, why not just delete the whole DS, then adding entries with Create?

That's exactly what I'm doing. The test suite creates a temporary directory and starts up a new datastore instance with HOME set to the temporary directory. But since an empty directory doesn't contain an index, the datastore will schedule an index (re)build. I need to wait for the index build to finish, otherwise API calls like get_uniquevalues() or find() won't work properly.

comment:4 in reply to: ↑ 3 Changed 15 years ago by tomeu

Replying to sascha_silbe:

But in order to be deterministic, you also need to have the same data and metadata, having the same index is not enough. So, why not just delete the whole DS, then adding entries with Create?

That's exactly what I'm doing. The test suite creates a temporary directory and starts up a new datastore instance with HOME set to the temporary directory. But since an empty directory doesn't contain an index, the datastore will schedule an index (re)build. I need to wait for the index build to finish, otherwise API calls like get_uniquevalues() or find() won't work properly.

Right, but I was suggesting creating the index cia create() calls, instead of reindexing the whole directory. Is this so impractical for your use case to warrant new API?

comment:5 Changed 15 years ago by tomeu

Resolution set to obsolete
Status changed from new to closed

Turned this was actually a bug in the DS because the index was marked as rebuilding even when it didn't existed at startup and there were no entries. Fixed that bug, I understand that this new API is not needed any more.

comment:6 Changed 15 years ago by tomeu

Keywords r? removed

comment:7 follow-up: ↓ 8 Changed 15 years ago by sascha_silbe

Keywords r? added
Resolution obsolete deleted
Status changed from closed to reopened

Still relevant because if the index gets corrupted (e.g. due to unclean shutdown on a non-dat journalling filesystem) it will be rebuilt in the background and return different results than during normal operation.

The old data store apparently had a similar method called complete_indexing() (though "indexing" meant full-text extraction there). My data store rewrite proposal contains the same API as proposed here with slightly different semantics as well ("regular" API calls are disallowed until the data store has finished recovering, instead of returning bogus results).

Even though there are slight semantic differences their usage by a caller is the same (wait for data store to be ready to server reliable results). As it's currently not used by any caller (complete_indexing() is still in sugar.datastore.datastore, but not in carquinyol.datastore) it's a low-risk change and would ease transition to the version-enabled data store.

comment:8 in reply to: ↑ 7 Changed 15 years ago by tomeu

Replying to sascha_silbe:

Still relevant because if the index gets corrupted (e.g. due to unclean shutdown on a non-dat journalling filesystem) it will be rebuilt in the background and return different results than during normal operation.

That's fine, but why add API if nobody will use it?

comment:9 follow-up: ↓ 10 Changed 15 years ago by sascha_silbe

Because I intend for it to get used. Introducing the API now allows for independant, small changes in other code.

comment:10 in reply to: ↑ 9 Changed 15 years ago by tomeu

Keywords r! added; r? removed

Replying to sascha_silbe:

Because I intend for it to get used. Introducing the API now allows for independant, small changes in other code.

I really believe we should be putting our energies in issues that are real right now. I also don't think it's a good idea to design APIs when we don't have real needs.