Andrew Waugh wrote:
>
> Frank Roos wrote...
> >Actually, the *searcher* for a resource will give meaning to the DATE
> >field. He decides that he wants to discover a resource with a specific
> >date. And as long as this date, whatever kind of date this is, is in the
> >DC record, he will succeed in discovering it, and he will be satisfied.
>
> True... at least in theory. But false in practice, unfortunately.
>
> Including many dates in a resource simply increases the number of false
> hits. If you ask for a map of Melbourne in the 1990s, for example, you
> might get maps drawn in the 1880s but digitized in 1995.
>
In reality it wouldn't be that bad, I think.
In your example, "in the 1990s" is actually content for COVERAGE, as far
as I understand this element.
So your example isn't actually a DATE search.
If you do the following search:
TYPE=map AND COVERAGE="1990-1997" AND DESCRIPTION=Melbourne
you would get a high-precision result, if any.
If you are interested in the date of issue of the source item or the
date of issue of the digitized version you would have to refine the
search using the DATE element accordingly.
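To make this concrete, here is a minimal sketch in Python of such a
fielded search. It is not any real DC search engine: the Record class,
the two sample records and the DATE qualifier names "issued" and
"digitized" are all assumptions invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """A toy DC-like record; field names are illustrative only."""
    title: str
    type: str
    description: str
    coverage: tuple   # (first_year, last_year) the content is about
    dates: dict = field(default_factory=dict)  # hypothetical qualifier -> year


def overlaps(a, b):
    """True if the year ranges a and b share at least one year."""
    return a[0] <= b[1] and b[0] <= a[1]


def search(records, type_, coverage, keyword):
    """TYPE=type_ AND COVERAGE overlaps coverage AND DESCRIPTION contains keyword."""
    return [
        r for r in records
        if r.type == type_
        and overlaps(r.coverage, coverage)
        and keyword.lower() in r.description.lower()
    ]


def refine_by_date(records, qualifier, years):
    """Keep records whose DATE with the given (hypothetical) qualifier is in years."""
    return [
        r for r in records
        if qualifier in r.dates and years[0] <= r.dates[qualifier] <= years[1]
    ]


# Two made-up records: a 1996 map, and an 1885 map digitized in 1995.
RECORDS = [
    Record("Melbourne street map", "map", "Street map of Melbourne",
           (1996, 1996), {"issued": 1996}),
    Record("Melbourne survey plan", "map", "Survey plan of Melbourne",
           (1885, 1885), {"issued": 1885, "digitized": 1995}),
]

hits = search(RECORDS, "map", (1990, 1997), "Melbourne")
print([r.title for r in hits])  # ['Melbourne street map']
```

The 1885 map digitized in 1995 is not returned, because the search runs
on COVERAGE and not on DATE. Someone who does care about the digitized
version could call refine_by_date(RECORDS, "digitized", (1990, 1999)).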
BTW, I would be interested in a list of possible DATE instances. Has the
DATE group addressed this? Is this a very long list?
> In the worst case the search engine may return the first N hits and,
> if the resource doesn't occur in this list, the user won't find it.
> More likely, however, the search engine will return results a page at a
> time. The user often has to wade through pages of irrelevant resources
> to find the answer to their question. They frequently give up before
> reaching it. The problem of low precision can be seen (in spades!) on
> the net at the moment.
>
> There is also an important psychological problem with returning many
> apparently irrelevant resources. If the user can't understand *why* the
> search engine chose that resource, a naive user usually judges the
> search engine as poor and may cease to use it.
>
> A major challenge of resource discovery is to build models of 'typical
> users'. These models should be able to answer the question 'the user
> has asked this, what are they likely to mean?' The search engine could
> then return some answers that are likely to be relevant together with
> suggestions as to ways of refining the query if the answers aren't
> relevant. This models the way I've observed reference librarians working
> with customers.
>
> This is very relevant to the discussion of 'DATE' and 'COVERAGE'.
>
Fully agreed.
Recall/precision in combination with notions of practical use and user
groups should always be the measuring rod for DC and DC refinement.
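For readers who want the two measures spelled out, here is a small
sketch in Python using the usual definitions (precision = relevant hits
/ all hits returned, recall = relevant hits / all relevant resources).
The resource identifiers in the example are made up for illustration.

```python
def precision_recall(returned, relevant):
    """Return (precision, recall) for a set of returned and relevant ids."""
    returned, relevant = set(returned), set(relevant)
    hits = returned & relevant
    precision = len(hits) / len(returned) if returned else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


# Toy case: one relevant resource "a"; the search returns four resources.
print(precision_recall(["a", "b", "c", "d"], ["a"]))  # (0.25, 1.0)
```

In the toy case the user does get the resource (recall 1.0), but has to
wade through three irrelevant hits to find it (precision 0.25).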
> In my view, users are most unlikely to use (or understand) 'date' as
> 'date resource was made available electronically'.
>
Yes, so you would add this in your search strategy in the DATE field,
either free text or (for better recall/precision) formatted according to
some standard(s).
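A rough sketch in Python of why a formatted DATE helps. I assume ISO
8601 (YYYY-MM-DD) as the standard here, which is only my choice for the
example; the sample values are invented.

```python
from datetime import date


def in_range(value, start, end):
    """True if value is an ISO 8601 date string falling within [start, end]."""
    try:
        d = date.fromisoformat(value)
    except ValueError:
        return False  # free text cannot be compared as a date
    return start <= d <= end


# Made-up DATE values: two formatted, one free text.
values = ["1995-03-14", "March 1995", "1910-06-01"]
matches = [v for v in values if in_range(v, date(1990, 1, 1), date(1999, 12, 31))]
print(matches)  # ['1995-03-14']
```

The free-text value "March 1995" is missed by the range search although
it belongs in the result, so recall suffers. It could still be found by
a plain text match on "1995", at the cost of precision.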
> A typical query I want answered (wearing my hat as a technical historian)
> is 'Find me books about railway signalling published between 1900 and
> 1920' ...because I know that these books would describe current
> practice c1910. I don't care in the slightest that the book was scanned
> in the 1990s. The same argument applies to maps and data sets.
>
If you want 100% precision on DATE, then I agree you have to
qualify the DATE field accordingly to get such a result. For Reuters
this is probably a necessity, but not for all other users in general.
I suspect that in future date-qualified DC records will not be in the
majority anyhow.
> andrew waugh
--
Frank A. Roos