Alan Poulter makes the point (see below) that full-text indexing, or
near-full-text indexing as practised by Web robot search engines, is not
adequate, and that this is a major motivation for metadata. I agree that
this is at least a major reason for moving toward metadata. I also
recognise that there are always concepts shared by an author and reader
that are not expressed directly in a document, and that techniques that
work only on the words in a document have limitations.

However, manual keywording or summarisation is a skilled, labour-intensive
process that people do not like doing and are not good at. Therefore, I
believe that a large degree of automation is essential if the metadata
initiative is to succeed. Such a process might provide a first cut that
suggests terms to assist authors or indexers with important documents and
fully automates the rest, and it might draw on a thesaurus for additional
domain knowledge; but whatever form it takes, something is needed, and the
availability of documents in electronic form now makes it practical.
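
For illustration, such a first cut could be as simple as ranking a
document's words by frequency after removing common stopwords, then
offering the top few as candidate keywords for an author or indexer to
refine. The Python sketch below is deliberately minimal; the function
name and the tiny stopword list are illustrative only, not part of any
metadata standard.

import re
from collections import Counter

# Tiny illustrative stopword list; a real indexer would use a much
# fuller one.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is",
             "that", "for", "it", "on", "with", "as", "are", "be"}

def suggest_keywords(text, n=5):
    # Return the n most frequent non-stopword terms as candidate
    # keywords for a human to accept, reject, or refine.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(n)]

sample = ("Metadata describes documents. Automatic metadata generation "
          "suggests keywords so that authors can refine them.")
print(suggest_keywords(sample))   # e.g. ['metadata', 'describes', ...]

A thesaurus step would slot in after the stopword filter, mapping each
surviving term onto a preferred term from a controlled vocabulary.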

Richard Jones
---------- Forwarded Message ----------

From:   Alan Poulter, INTERNET:[log in to unmask]
To:     (unknown), INTERNET:[log in to unmask]
Date:   5/6/97 5:56 PM

Re:     Re: Automatic content analysis and DCM


Richard Jones wrote:-
> If metadata is going to take off universally on the Web, then it is clear
> that the information has to be automatically generated. A key part of this
> is how to generate sensible keywords and summaries to act as a surrogate
> for the original document.

Sorry but I disagree. 'Automatic generation' of subject content is what the
robot search engines do, and it is the dissatisfaction with the results they
produce that is the impetus for metadata. If the robot search engines were
infallible then there would be no need for metadata.