I'm surprised no-one has mentioned the entirely free, open-source,
and very mature Lucene (Java) and Ferret (Ruby) engines.
http://lucene.apache.org/
http://ferret.davebalmain.com/trac/
Lucene is the full-text search engine used in DSpace. Ferret is used
by the majority of Ruby on Rails sites that implement full-text
searching. Both can be configured to work with any database or
application, and there are various utilities, helpers and plugins
available.
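
For anyone unfamiliar with what these engines do underneath, here is a
minimal sketch of an inverted index in plain Python. It is an
illustration only and does not use the Lucene or Ferret APIs; the
function names and the sample records are made up for the example, and
real engines add analysis, ranking and on-disk storage on top of this
idea.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the sorted ids of documents containing every query term."""
    terms = tokenize(query)
    if not terms:
        return []
    # .get() avoids creating empty entries for unknown terms.
    matches = set.intersection(*(index.get(t, set()) for t in terms))
    return sorted(matches)


# Hypothetical records, standing in for text extracted from repository items.
documents = {
    "item-1": "Full-text indexing in repository software",
    "item-2": "Converting PDF files to plain text for indexing",
    "item-3": "Repository metadata standards",
}
index = build_index(documents)
```

Calling search(index, "repository indexing") returns only the items
whose text contains both terms. The text can come from any database or
application, as long as something hands the indexer an id and a string.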
-S
On 25 Jun 2007, at 13:06, Joanne Yeomans wrote:
> Forwarding on behalf of the CDS team which runs CDS Invenio, the
> repository software used at CERN:
>
> <http://cdsware.cern.ch/>
>
> Invenio supports fulltext indexing via conversion to ASCII text format
> from file formats such as PDF, PS, HTML, MS Word, MS Excel, MS
> PowerPoint, etc. For fulltext indexing conversion, we make use of
> free tools like pdftotext, xlhtml, etc.
> For higher quality conversions, we also have a separate conversion
> service that runs native MS Windows applications.
> <http://cdsconv.cern.ch/>
>
> Best regards
> --
> Tibor Simko ** CERN Document Server ** <http://cds.cern.ch/>
>
> ********************
> Joanne Yeomans
> Section Leader CERN Library
> http://library.cern.ch/
> [log in to unmask]
> Mail address: Mailbox C27810,
> CERN CH 1211, Geneva 23, Switzerland
> +41 22 76 70548
>
>> -----Original Message-----
>> From: Repositories discussion list
>> [mailto:[log in to unmask]] On Behalf Of
>> Antoinette Le Maire
>> Sent: 24 June 2007 19:34
>> To: [log in to unmask]
>> Subject: Re: Full-text indexing in repository software
>>
>> Hi,
>>
>> VITAL from VTLS, which is based on FEDORA, also supports full-text
>> indexing for common text formats...
>>
>> More info at: http://www.vtls.com/Products/vital.shtml
>>
>> Antoinette Lemaire
>> Universite catholique de Louvain
>>
>>> Hi
>>>
>>> DigiTool, Ex Libris' Digital Asset Management solution, supports full
>>> text indexing of common text file formats (i.e. DOC, RTF, PDF, TXT,
>>> HTML, and XML).
>>>
>>> Please visit http://www.exlibrisgroup.com/digitool.htm for further
>>> details about DigiTool.
>>>
>>> Best wishes,
>>> Alan
>>>
>>> Alan Oliver
>>> Business Development Manager
>>> Ex Libris (UK) Ltd.
>>>
>>> -----Original Message-----
>>> From: Julie Allinson [mailto:[log in to unmask]]
>>> Sent: 22 June 2007 00:40
>>> To: [log in to unmask]
>>> Subject: Full-text indexing in repository software
>>>
>>> I was asked earlier today about the capabilities for full-text
>>> indexing within repository software. Although it looks like EPrints,
>>> DSpace and Fedora all support full-text indexing, I wonder whether
>>> members of this list might be able to expand further on the indexing
>>> capabilities of repository software?
>>>
>>> Cheers,
>>>
>>> Julie
>>>
>>> --
>>> Julie Allinson [log in to unmask]
>>> Repositories Research Officer
>>> UKOLN, University of Bath, Bath BA2 7AY, United Kingdom
>>> tel: ++44 (0) 114 2486457, ++44 (0) 1225 386580
>>> skype: j.allinson
>>> http://www.ukoln.ac.uk/repositories/
>>>
>>> --
>>> This message has been scanned for viruses and dangerous content by Ex
>>> Libris (UK) Ltd., and is believed to be clean.
>>>
>>