EPrints manages its index as a table in the same MySQL database that
holds the repository metadata. Every time a metadata field or
digital object is created or altered, it is added to an 'index
queue'. The indexer process indexes the items in the queue and then
sleeps. In previous versions of EPrints the indexer repeatedly
indexed all items in the repository, which was a relatively
resource-intensive task. Digital objects are converted into text format for
indexing using external conversion programs to handle office formats
and PDF. Details of the indexing algorithm (stop words, stemming, etc.)
are controlled by the EPrints configuration.
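
To illustrate the queue-based approach described above, here is a rough
sketch in Python. It is not EPrints code (EPrints itself is written in
Perl and keeps both the queue and the index in database tables); the
names Indexer, save, run_once and search, the in-memory data structures
and the tiny stop-word list are all assumptions made for the example,
and stemming is left out entirely.

```python
from collections import deque

# Hypothetical stop-word list; in EPrints this comes from the configuration.
STOP_WORDS = {"the", "a", "an", "of", "and", "is", "in", "to"}


def tokenize(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    words = (w.strip(".,;:!?()'\"").lower() for w in text.split())
    return [w for w in words if w and w not in STOP_WORDS]


class Indexer:
    """Toy in-memory stand-in for a queue-driven indexer (illustrative only)."""

    def __init__(self):
        self.queue = deque()  # the 'index queue' of changed item ids
        self.texts = {}       # item id -> current text of the item
        self.index = {}       # word -> set of item ids containing it

    def save(self, item_id, text):
        """Create or alter an item and queue it for (re)indexing."""
        self.texts[item_id] = text
        self.queue.append(item_id)

    def run_once(self):
        """Index every queued item; return how many were processed."""
        processed = 0
        while self.queue:
            item_id = self.queue.popleft()
            # Drop stale postings for this item before re-indexing it.
            for ids in self.index.values():
                ids.discard(item_id)
            for word in tokenize(self.texts[item_id]):
                self.index.setdefault(word, set()).add(item_id)
            processed += 1
        return processed

    def search(self, word):
        """Return the sorted ids of items whose text contains the word."""
        return sorted(self.index.get(word.lower(), set()))
```

A long-running indexer would call run_once() in a loop and sleep between
passes, so that only changed items are touched rather than the whole
repository.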
Is this the level of detail that you wanted? Do you have a more
specific question?
--
Les
On 22 Jun 2007, at 00:40, Julie Allinson wrote:
> I was asked earlier today about the capabilities for full-text
> indexing within repository software. Although it looks like
> EPrints, DSpace and Fedora all support full-text indexing, I wonder
> whether members of this list might be able to expand further on the
> indexing capabilities of repository software?
>
> Cheers,
>
> Julie
>
> --
> Julie Allinson [log in to unmask]
> Repositories Research Officer
> UKOLN, University of Bath, Bath BA2 7AY, United Kingdom
> tel: ++44 (0) 114 2486457, ++44 (0) 1225 386580
> skype: j.allinson
> http://www.ukoln.ac.uk/repositories/
>
> --