Rebecca,
It's probably a bad idea for me to jump into this "number of elements" battle
early on a Sunday morning, but here goes.
A number of us had some very useful and "enthusiastic" discussions at the
recent (and very interesting) ERCIM Metadata workshop that Tom Baker ran in
Bonn. A lasting impression that I've been left with (with help from John
Kunze) is that we have lost sight in the Dublin Core process of an
important engineering principle - be certain and firm on your requirements
before you set out on a design.
I hear in your message and those of others confusion about what the
requirements definition for the Dublin Core actually is. We seem to shift
back and forth between a number of requirements:
1. aid to initial resource discovery (i.e., very few fields)
2. method for general short cataloging (i.e., more fields)
3. method for short cataloging in "my" specific domain (where "my" is
geospatial, imaging, etc.) (i.e., more and more and more fields)
A number of us have strong feelings about which one of these requirements
the DC should fulfill. Rather than get up on a soap-box here and campaign
for my favorite (hint, hint: #1), I propose that we delay the discussions
such as:
1. I want to add this field
2. I want to remove this field
3. I want this syntax
4. etc.
and come to consensus on what we are trying to do here. Then, when we
start making specific suggestions, we can review them in the light of our
stated requirement.
Carl
----------
From: Rebecca S. Guenther[SMTP:[log in to unmask]]
Sent: Friday, October 11, 1996 11:32 AM
To: [log in to unmask]
Cc: [log in to unmask]; [log in to unmask]
Subject: Dublin Core and search engines
At the Library of Congress we would like to begin instructing staff to put
meta information in the Web documents they put up. Of course we would like
to support the Dublin Core, but current search engines aren't programmed
to use it. Certainly the META tag can be used, as has been discussed on
this list. However, AltaVista and Infoseek are both currently able to use
only 2 meta tags: "description" and "keywords". Those map in the DC to
Subject role=abstract and Subject without a qualifier. If we encode them
the Dublin Core way, the search engines won't be able to use them now. We
all need something
NOW to help us find what we want on the Web. All this discussion
brings up the following questions.
1. Why are we lumping an abstract into the Subject field? Aren't keywords
and abstracts different enough that they warrant their own fields?
Abstracts are very usable in search and retrieval, and one could imagine
wanting to limit a search to an abstract. They are also sufficiently
different from keywords in that stop words shouldn't be indexed. Also,
don't we want to consider consistency with what the search engines are
already doing, so that once our guidelines are developed enough that
everyone starts using metadata, we can grandfather in what has already
been done? To have to use Subject and role= for the abstract makes
it harder to create metadata; don't we want to keep it simple for anyone
off the street to use? Can we consider having two different elements for
what is now "Subject" and make them consistent with AltaVista (Description
and Keywords)? For those that want to go further, they could still
qualify Keywords by scheme=LCSH or whatever.
2. I can't remember when the "DC" part of the META NAME was added (e.g.
DC.subject). To use that implies there is some other scheme out there. Is
there really any other attempt to standardize meta information that we
have to include DC? Isn't the LINK REL enough to identify that Dublin Core
is being used? Again, can't we use it as it has already been used, without
specifying "DC"? If we want our scheme to be the standard, then we
wouldn't want to be forever having to put in "DC", since it adds
complexity to adding metadata for the average person.
3. Does anyone know of any progress with getting the Web search engines to
use DC meta elements? Why haven't they jumped at the chance to make some
order out of chaos?
We have a bit of a dilemma here in deciding what meta information to put
in our documents, because we want to support the Dublin Core but need to
have something that can be used by search engines right now. We considered
putting it in both ways (the way that AltaVista can now use and also
repeating it in the Dublin Core form), but it seems too much to ask people
to key it in twice.
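One possible way around the double keying is to generate both forms from a
single entry. The following is only a minimal sketch in Python under stated
assumptions: the function names meta_tag and dual_meta are hypothetical, and
the way the role=abstract qualifier is written inside CONTENT is an
illustrative guess, not an agreed Dublin Core syntax. Only the names
"description", "keywords", and "DC.subject" come from the discussion above.

```python
from html import escape


def meta_tag(name, content):
    """Build one META tag, escaping characters that would break the attribute."""
    return '<META NAME="{}" CONTENT="{}">'.format(
        escape(name, quote=True), escape(content, quote=True)
    )


def dual_meta(abstract, keywords):
    """Return META tags in both forms from one abstract and one keyword list.

    The first two tags use the names the search engines are said to read
    today; the last two repeat the same values under DC.subject.
    """
    joined = ", ".join(keywords)
    return [
        meta_tag("description", abstract),
        meta_tag("keywords", joined),
        # Assumption: the qualifier is carried as a prefix in CONTENT.
        meta_tag("DC.subject", "(role=abstract) " + abstract),
        meta_tag("DC.subject", joined),
    ]


if __name__ == "__main__":
    for tag in dual_meta("Maps of Ohio", ["maps", "Ohio"]):
        print(tag)
```

With something like this, a staff member would type the abstract and
keywords once, and the four tags could be pasted into the document head.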
Rebecca
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
^^ Rebecca S. Guenther ^^
^^ Senior MARC Standards Specialist ^^
^^ Network Development and MARC Standards Office ^^
^^ Library of Congress ^^
^^ Washington, DC 20540-4020 ^^
^^ (202) 707-5092 (voice) (202) 707-0115 (FAX) ^^
^^ [log in to unmask] ^^
^^ ^^
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^