On Mon, 11 Aug 2003, Andy Powell wrote:

> Largely prompted by the need to write up a discussion paper on 'structured
> values' for the DC Usage Board, I've been working on writing up an
> 'abstract model' for DC metadata records.
>
>   http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/
>
> As the introduction says, "the primary purpose of [the abstract model] is
> to provide a reference model against which particular DC encoding
> guidelines can be compared, in order to facilitate better mappings and
> translations between different syntaxes".  However, I strongly suspect
> that documenting the DC abstract model in this way will also be very
> helpful to people who want to use DC properties and constructs within
> their own non-DCMI metadata applications.


I think this will be a very useful document.

Some comments:

1.      Section 3 Qualified DC model : Third bullet and note

I am wondering about the implications of the note

<quote>
It (is) recognised that many real-world metadata applications will use
additional properties beyond those indicated in the third bullet point
above. While such usage does not fall strictly within the definition of
'qualified DC' provided here, such applications are strongly encouraged to
conform to the DCMI abstract model in order to achieve maximum
interoperability with other DC metadata records.
</quote>

The third bullet states: Each property must be one of the elements or
element refinements recommended by the DCMI.

I am wondering whether it is worthwhile defining the model of qualified DC
so strictly that 'many real-world' applications of DC will be excluded,
particularly since, as you mention above, this model may be quite helpful
to such applications.

I would prefer the third bullet to say something like:

Wherever possible existing DCMI elements or element refinements should be
used to describe properties.

And possibly add a bullet along the lines of:

Local (novel? non-DCMI?) terms should follow DCMI conventions as regards
relationships between terms i.e. properties should be expressed as
elements or element refinements with no nesting or grouping.


In any case, the note as it reads now seems ambiguous to me. It
encourages conformance with the abstract model, but it is unclear whether
you are encouraging applications to adhere strictly to DCMI terms only, or
to conform with the model as far as possible despite their use of local
terms. I am assuming it's the latter. I agree that interoperability will
be best served by applications using DC terms in a consistent way where
possible, but I am not sure that excluding appropriate local terms
benefits interoperability. It might well make processing the metadata
easier, but that is not quite the same thing.


In which case I would prefer the note to say: .....such applications are
strongly encouraged to conform to the DCMI abstract model in all other
ways in order to achieve maximum interoperability with other DC metadata
records.

... and this would to my mind make this note closer in intent to section 6
within the "Guidelines for implementing Dublin Core in XML" recommendation
  http://dublincore.org/documents/dc-xml-guidelines/


2. What is a 'record'?

<quote>A record is some structured metadata about a resource, comprising
one or more properties and their associated values. </quote>

Perhaps my concerns above arise from too 'concrete' an understanding of
'record'. If a 'record' is as defined here 'some structured metadata' then
in effect an XML document might consist of a 'Qualified DC record' plus
some additional properties?

You touch on the ambiguity of 'record' in the appendix on RDF expression.
Presumably a number of RDF DC Qualified records could make up a more
complex RDF document describing various aspects of a resource.

I notice the PRISM spec [1] which I've been looking at recently seems to
refer to 'metadata' or 'document' rather than 'records'. I guess I am
getting somewhat confused as to whether the DCMI should be trying to model
DC 'records' or the DC 'vocabulary', and what difference that would make
to the model.

3. Enabling evolution of DC

I rather like the approach Thomas Habing outlined in his mail defining
Simple DC at the element level. This gets over the exclusion of audience
from Simple DC which has always seemed a bit odd to me. This approach no
doubt does vary from existing DCMI recommendations, but it seems somewhat
unfortunate not to encourage use of 'audience' where appropriate. I am
inclined to think there are benefits to enabling 'audience' to be
considered in the same way as other so-called DCMI 'elements' as far as
possible given the constraints of historical recommendations.

In particular I am not convinced that the dumb-down process as specified
in this model should necessarily remove 'audience'. Can we not view
'dumb-down' as resulting in 'parent properties' i.e. elements?  This could
be achieved by removing the last bullet from the process. Such an approach
would also give flexibility for any future addition of elements.

In addition I'm not sure it will always be appropriate to throw away
non-DC metadata. I don't think we want to constrain systems from
'respecting' unknown metadata especially if there is a chance that other
systems are going to process that metadata. Maybe this is worth a note
somewhere in the 'dumbing down' section, even if it is more applicable to
RDF based systems?
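
To make the two suggestions above concrete, here is a rough sketch in
Python of a dumb-down that maps element refinements to their parent
elements and, optionally, passes unknown properties through untouched.
The function name, the REFINES table and the 'ex:localTerm' property are
illustrative assumptions of mine, not anything taken from the abstract
model document.

```python
# Illustrative only: a small, incomplete table of
# element refinement -> parent element.
REFINES = {
    "dcterms:created": "dc:date",
    "dcterms:abstract": "dc:description",
    "dcterms:mediator": "dcterms:audience",
}

# Illustrative only: the elements this sketch treats as known.
ELEMENTS = {"dc:title", "dc:date", "dc:description", "dcterms:audience"}


def dumb_down(record, keep_unknown=True):
    """Return a copy of record with refinements replaced by parent elements.

    record is a list of (property, value) pairs. Properties that are
    neither known elements nor known refinements are kept when
    keep_unknown is true and dropped otherwise.
    """
    result = []
    for prop, value in record:
        # Follow the refinement chain up to the top-level parent.
        while prop in REFINES:
            prop = REFINES[prop]
        if prop in ELEMENTS or keep_unknown:
            result.append((prop, value))
    return result


record = [
    ("dcterms:created", "2003-08-11"),
    ("dcterms:audience", "teachers"),
    ("ex:localTerm", "some local value"),
]
preserved = dumb_down(record)
stripped = dumb_down(record, keep_unknown=False)
```

Here 'audience' survives because it is treated as a parent property in
its own right, and the keep_unknown switch is the difference between
'respecting' and discarding non-DC metadata.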

Within the PRISM spec [1] there is a constraint on 'PRISM compliant
software' not to 'throw an error when a novel element is encountered'.
This is argued to support new functionality, and for cost reasons.

<quote>
Because PRISM obeys the RDF constraints on XML structure, implementations
are guaranteed to correctly parse even unknown elements and attributes.
PRISM-compliant applications MUST NOT throw an error if they encounter
unknown elements or attributes. They are free to delete or preserve such
information, although recommended practice is to retain them and pass them
along. Retaining the information is an architectural principle which helps
new functionality be established in the presence of older versions of
software. [1 p16]
</quote>

And later

<quote>
Discarding metadata is discouraged but not forbidden. A major cost occurs
when metadata has to be recreated after it was discarded earlier in the
production process. Therefore implementations MAY retain
and retransmit any information that they do not know is actually wrong.
[1 p37]
</quote>

Enough!

Rachel

[1] PRISM: Publishing Requirements for Industry Standard Metadata,
Version 1.2 (g)
http://www.prismstandard.org/spec1.2g.pdf


---------------------------------------------------------------------------
Rachel Heery
UKOLN
University of Bath                              tel: +44 (0)1225 826724
Bath, BA2 7AY, UK                               fax: +44 (0)1225 826838
http://www.ukoln.ac.uk/