Thus spoke Jon Knight (at least at 09:39 PM 10/11/96 +0100)
>On Fri, 11 Oct 1996, Rebecca S. Guenther wrote:
>> 1. Why are we lumping an abstract into the Subject field? Aren't keywords
>> and abstracts different enough that they warrant their own fields?
I'm afraid that I may be the guilty party on this one. The reason
for it was that we were limiting the number of elements, and under
such a constraint, "Subject (scheme=abstract)" seemed the best place
to put an abstract. Yes, an abstract is rather different from
keywords, which are different from controlled subject descriptors,
but creating "Abstract", "Keyword", "LCSH", ... as part of the
core seemed impossible.
>> They are also sufficiently
>> different from keywords in that stop words shouldn't be indexed.
[...]
>With Dublin Core the search engine can determine from the sub-elements
>whether the Subject element is an unconstrained keywords list, a
>description, a set of terms from a constrained thesaurus or whatever. It
>can thus apply its stop word mechanisms differently to the different types
>of Subject element as it sees fit.
Yeah, what he said.
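
To make that concrete, here is a minimal sketch of how an indexer might
treat Subject differently depending on its scheme qualifier. The scheme
names ("abstract", "keywords", "LCSH"), the stop word list, and the
function name index_terms are all assumptions made for illustration;
nothing here is defined by the Dublin Core itself.

```python
# Hypothetical stop word list; a real engine would use its own.
STOP_WORDS = {"a", "an", "and", "of", "the", "in", "on", "to", "is"}


def index_terms(scheme, content):
    """Return index terms for one Subject element, chosen by its scheme.

    The scheme values handled here are assumed for illustration only.
    """
    scheme = scheme.lower()
    if scheme == "abstract":
        # Free text: split into words, strip punctuation, drop stop words.
        words = [w.strip(".,;:!?()\"'").lower() for w in content.split()]
        return [w for w in words if w and w not in STOP_WORDS]
    if scheme == "keywords":
        # Uncontrolled keywords: keep each comma-separated phrase whole.
        return [p.strip().lower() for p in content.split(",") if p.strip()]
    # Anything else is treated as a controlled vocabulary (e.g. LCSH):
    # keep the heading verbatim so it can be matched exactly.
    return [content.strip()]


print(index_terms("abstract", "The history of the Dublin Core."))
print(index_terms("keywords", "metadata, Dublin Core"))
print(index_terms("LCSH", "Metadata -- Standards"))
```

The point is only that one element plus a qualifier gives the engine
enough information to pick its own indexing policy per type.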
>> Also,
>> don't we want to consider consistency with what the search engines are
>> already doing so that when we have sufficiently developed guidelines so
>> that everyone starts using metadata that we can grandfather in what had
>> already been done?
Not really. Alta Vista didn't exist when the Dublin Core was first put
together. Not all search engines support what Alta Vista does.
We should do what we think is best. If it becomes popular,
search engines will start to use it as a way to improve their
performance in an increasingly competitive market.
>> To have to use Subject and role= for the abstract makes
>> it harder to create metadata; don't we want to keep it simple for anyone
>> off the street to use?
Having a large list of elements is also hard to use.
>However if the general consensus is to split Subject in to Keywords and
>Description (or is it Descriptor - a bit inconsistent on that) then I'd
>have no problem with it.
I could live with a separate element for Description, but this
contradicts the recent trend to reduce the number of elements
in the core even further.
> However these decisions mustn't be taken lightly
>and we've got to make them _now_ so that we've got a stable Dublin Core
>that we can tell people about.
Yeah, what he said.
>> To use that implies there is some other scheme out there. Is
>> there really any other attempt to standardize meta information that we
>> have to include DC?
Yes. We have some work going on with other sites in the weapons
complex that is different enough from the DC that we can't just
use it, but may want to use some of it. We will have to add other
things too. Then there are things like the medical, GIS, ... work going on.
>> Again, can't we use it as it has already been used, without
>> specifying "DC"? If we want our scheme to be the standard, then we
>> wouldn't want to be forever having to put in "DC", since it adds
>> complexity to adding metadata for the average person.
>
>I seriously doubt that Dublin Core is going to be _the_ standard for all
>time. _A_ standard maybe, but not _the_ standard.
Yeah, what he said.
Average people are not going to put even the simplest metadata into
their documents, no matter how simple we make it. Every time this
subject of "metadata for the masses" comes up, I run a little
poll to see how many people in the room use MS Word (most do) and
of those, how many fill out file summary info (few do - the highest
percentage I ever got was 2/7, I was one of the 2, and the rest were
at a company that was asking me about URC, the DC, and metadata).
Since this topic only comes up among people who care about metadata,
the results are not promising for the general public.
If they do put metadata into their documents, it will typically be
through the use of something like the file summary dialog box,
which can easily supply the "DC.".
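
As a rough sketch of what such a tool could do, the function below turns
summary-style fields into HTML META tags and adds the "DC." prefix itself,
so the author never types it. The function name meta_tags, the field names
in the example, and the exact tag layout are assumptions for illustration,
not an agreed syntax.

```python
from html import escape


def meta_tags(summary, prefix="DC."):
    """Build one META tag per non-empty summary field.

    `summary` maps element names (e.g. "title") to text. The prefix is
    added by the tool, not by the person filling out the dialog box.
    """
    tags = []
    for name, value in summary.items():
        if not value:
            continue  # skip fields the author left blank
        tags.append(
            '<META NAME="%s%s" CONTENT="%s">' % (prefix, name, escape(value))
        )
    return tags


# Hypothetical summary info, as a dialog box might collect it.
for tag in meta_tags({"title": "Metadata & the Web", "subject": ""}):
    print(tag)
```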
>> 3. Does anyone know of any progress with getting the Web search engines to
>> use DC meta elements? Why haven't they jumped at the chance to make some
>> order out of chaos?
Because there is no appreciable population of documents that use DC
meta elements.
>When the search engine
>vendors see a growing community of users, data and code, they'll probably
>want to follow suit.
Yeah, what he said.
Ron Daniel Jr. email: [log in to unmask]
Advanced Computing Lab voice: +1 505 665 0597
MS B287 fax: +1 505 665 4939
Los Alamos National Laboratory http://www.acl.lanl.gov/~rdaniel/
Los Alamos, NM, USA 87545 obscure_term: "hyponym"