Dear all,
at the Max Planck Digital Library we have started implementing DCAP
metadata profiles and, naturally, got into discussions on DSP usage,
implementation issues etc.
We develop an SOA-based repository infrastructure (eSciDoc, see
http://escidoc.org), and thus have to deal with different known
(standardized) and unknown (proprietary) metadata formats.
DSP usage is related to known, DC-based application profiles.
The other metadata formats we deal with are MODS and completely
proprietary, unknown (i.e. pure XML-based) resource descriptions (e.g.
scientific data).
The overall tendency is to develop DCAPs wherever possible, and
thereby enable some tools for generic handling of metadata.
I am therefore sending a somewhat longer email with some concrete
examples, and I look forward to hearing your opinions.
After several intensive team discussions, the questions we have are:
*known vs. unknown metadata profiles - how do we deal with them in the
technical implementation (which goes beyond metadata entry masks)?
*DCAP vs. non-DCAP - is DCAP sufficient or not?
*DCAP and DSPs - what are the benefits of DSPs (interoperability is
clear, but which tools do we have that are based on DSP)?
...
...
1) We clearly understand that a DSP itself is not, and cannot be, a
configuration for the user interfaces (i.e. forms) for editing, viewing
and searching metadata. For this purpose we realized that we need
something we currently call a "screen configuration" (e.g. an XML-based
definition of the part of the user interface that works with metadata
elements).
2) If we use a "screen configuration" anyway, what is the starting point?
*The initial idea was to have a DSP definition of the metadata profile
(at present we have settled on a DSP-XML expression; in future it would
probably be DSP-RDF) and to transform it into a "screen configuration"
that can be further customized
**a screen configuration may include the grouping of fields, their
ordering on the screen etc.
3) DSP and "screen configuration" example
DSP (excerpt) is the following (apologies for the XML):
<DescriptionSetTemplate
xmlns="http://dublincore.org/xml/dc-dsp/2008/01/14"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://dublincore.org/xml/dc-dsp/2008/01/14
.\dcmi-dsp.xsd">
<DescriptionTemplate ID="diamond-item" minOccurs="1" maxOccurs="1"
standalone="yes">
<ResourceClass>http://purl.org/escidoc/metadata/profiles/diamond-item</ResourceClass>
<StatementTemplate ID="diamond-elements" minOccurs="1" maxOccurs="1"
type="literal">
<Property>http://purl.org/escidoc/metadata/terms/diamond-elements</Property>
</StatementTemplate>
<StatementTemplate ID="shape" minOccurs="1" maxOccurs="1" type="literal">
<Property>http://purl.org/escidoc/metadata/terms/shape</Property>
</StatementTemplate>
<StatementTemplate ID="color" minOccurs="1" maxOccurs="1" type="literal">
<Property>http://purl.org/escidoc/metadata/terms/color</Property>
</StatementTemplate>
</DescriptionTemplate>
</DescriptionSetTemplate>
3.1) A "screen configuration" for a metadata view page would then look
as follows:
<generic-metadata:description screen-id="edit" configuration="formatted"
resource-class="diamond-item"
xmlns:generic-metadata="http://purl.org/escidoc/schemas/generic-metadata/metadata/0.1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://purl.org/escidoc/schemas/generic-metadata/metadata/0.1
.\metadata-scr.xsd">
<generic-metadata:statement id="diamond-elements"
label="diamond-elements" display="true" optional="false"
repeatable="false" gui-component="text-field"
namespace="http://purl.org/escidoc/metadata/terms/">
<generic-metadata:value xml:lang="en-US"></generic-metadata:value>
</generic-metadata:statement>
...
...
</generic-metadata:description>
Within this screen configuration there may also be groups or fields that
are not related to any metadata from the profile.
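The step from the DSP in 3) to the screen configuration in 3.1) could be sketched roughly as follows. This is only a minimal illustration in Python (standard library only), not our actual implementation. The function name dsp_to_screen_config, the default "text-field" component and the mapping of minOccurs/maxOccurs to optional/repeatable are assumptions made for the example, as are the fallback defaults of "0" and "infinity" when the attributes are missing.

```python
import xml.etree.ElementTree as ET

DSP_NS = "http://dublincore.org/xml/dc-dsp/2008/01/14"
SCR_NS = "http://purl.org/escidoc/schemas/generic-metadata/metadata/0.1"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Shortened version of the DSP excerpt from 3).
DSP_EXCERPT = """<DescriptionSetTemplate xmlns="http://dublincore.org/xml/dc-dsp/2008/01/14">
  <DescriptionTemplate ID="diamond-item" minOccurs="1" maxOccurs="1" standalone="yes">
    <ResourceClass>http://purl.org/escidoc/metadata/profiles/diamond-item</ResourceClass>
    <StatementTemplate ID="shape" minOccurs="1" maxOccurs="1" type="literal">
      <Property>http://purl.org/escidoc/metadata/terms/shape</Property>
    </StatementTemplate>
  </DescriptionTemplate>
</DescriptionSetTemplate>"""


def dsp_to_screen_config(dsp_xml, screen_id="edit"):
    """Derive a default screen configuration from a DSP-XML document (hypothetical helper)."""
    ET.register_namespace("generic-metadata", SCR_NS)
    ns = {"dsp": DSP_NS}
    template = ET.fromstring(dsp_xml).find("dsp:DescriptionTemplate", ns)
    description = ET.Element(f"{{{SCR_NS}}}description", {
        "screen-id": screen_id,
        "configuration": "formatted",
        "resource-class": template.get("ID"),
    })
    for st in template.findall("dsp:StatementTemplate", ns):
        prop = st.findtext("dsp:Property", namespaces=ns).strip()
        namespace, _, local = prop.rpartition("/")
        statement = ET.SubElement(description, f"{{{SCR_NS}}}statement", {
            "id": st.get("ID"),
            "label": local,
            "display": "true",
            # Assumed mapping: minOccurs="0" means optional, maxOccurs="1" means not repeatable.
            "optional": "true" if st.get("minOccurs", "0") == "0" else "false",
            "repeatable": "false" if st.get("maxOccurs", "infinity") == "1" else "true",
            "gui-component": "text-field",  # assumed default, to be customized later
            "namespace": namespace + "/",
        })
        # Empty value element, to be filled in by the user.
        value = ET.SubElement(statement, f"{{{SCR_NS}}}value")
        value.set(XML_LANG, "en-US")
    return description


if __name__ == "__main__":
    print(ET.tostring(dsp_to_screen_config(DSP_EXCERPT), encoding="unicode"))
```

The generated configuration is only a starting point; grouping, ordering and fields unrelated to the profile would still be added by hand.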
3.2) After the user input is done, the fields are populated with values,
and an XML serialization would look something like:
<generic-metadata:description screen-id="edit" configuration="formatted"
resource-class="diamond-item"
xmlns:generic-metadata="http://purl.org/escidoc/schemas/generic-metadata/metadata/0.1"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://purl.org/escidoc/schemas/generic-metadata/metadata/0.1
.\metadata-scr.xsd">
<generic-metadata:statement id="diamond-elements"
label="diamond-elements" display="true" optional="false"
repeatable="false" gui-component="text-field"
namespace="http://purl.org/escidoc/metadata/terms/">
<generic-metadata:value xml:lang="en-US">diamond elements
description</generic-metadata:value>
</generic-metadata:statement>
...
...
</generic-metadata:description>
3.3) we need to do an additional transformation to "write" the diamond
elements to a simple metadata record in XML such as (namespaces omitted
for clarity here):
<diamond-item >
<diamond-elements xml:lang="en-US"> diamond elements description
</diamond-elements>
...
</diamond-item>
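The transformation from the populated screen configuration (3.2) to the simple record (3.3) could look roughly like the sketch below. It is an illustration under assumptions, not our production code: the function name screen_to_record is hypothetical, and element names are left unqualified, mirroring the example above where namespaces are omitted.

```python
import xml.etree.ElementTree as ET

SCR_NS = "http://purl.org/escidoc/schemas/generic-metadata/metadata/0.1"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Shortened version of the populated screen configuration from 3.2).
FILLED_SCREEN = """<generic-metadata:description screen-id="edit"
    configuration="formatted" resource-class="diamond-item"
    xmlns:generic-metadata="http://purl.org/escidoc/schemas/generic-metadata/metadata/0.1">
  <generic-metadata:statement id="diamond-elements" label="diamond-elements"
      namespace="http://purl.org/escidoc/metadata/terms/">
    <generic-metadata:value xml:lang="en-US">diamond elements
description</generic-metadata:value>
  </generic-metadata:statement>
</generic-metadata:description>"""


def screen_to_record(screen_xml):
    """Write the values of a filled screen configuration into a plain XML record."""
    description = ET.fromstring(screen_xml)
    # The record root is named after the resource class, as in 3.3).
    record = ET.Element(description.get("resource-class"))
    for statement in description.findall(f"{{{SCR_NS}}}statement"):
        for value in statement.findall(f"{{{SCR_NS}}}value"):
            # Collapse line breaks and repeated spaces inside the value.
            text = " ".join((value.text or "").split())
            if not text:
                continue  # skip fields the user left empty
            field = ET.SubElement(record, statement.get("id"))
            field.text = text
            if value.get(XML_LANG):
                field.set(XML_LANG, value.get(XML_LANG))
    return record


if __name__ == "__main__":
    print(ET.tostring(screen_to_record(FILLED_SCREEN), encoding="unicode"))
```

All GUI-related attributes (display, gui-component etc.) are dropped on purpose, so that only the metadata itself reaches the record.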
Note: we cannot store the generic format above (as given in 3.2, or a
generic one without the GUI element information) internally, because
many services would have trouble with it (Note: we ingest metadata as
they come and do not try to modify their format, as long as valid
namespaces are defined in the metadata record).
3.4) The problem begins :)
*the XML record representation in 3.3 is fine and works without
problems with all our infrastructure services
*but: we need to validate the XML record representation against one of
XSD, RelaxNG schema, Schematron rules or the DSP definition
*if we have any one of them, there is no need to duplicate the
definition of a profile for the purpose of validation, e.g. if we have a
DSP definition of a metadata profile and a service that validates the
metadata record based on the DSP definition, we would not need an
additional, manually created and maintained XSD schema. What we need are
automatically generated validation/structural rules (in any of the above
formats) if the resource (metadata) is described via DSP. For this
purpose, as we already have a metadata validation service that uses
Schematron rules, we would need an additional component that is able to
"translate" a DSP definition into Schematron rules (Explanation: this
service was developed earlier and validates the semantic correctness of
the metadata values in a metadata record for a particular event, e.g. if
the author is from the Max Planck Society, then an author identifier has
to be provided when the metadata record is created).
We are not aware that such a component already exists, so we are
considering developing it.
*If we have a DSP, there can be 2 levels of validation:
a) one would be structural, i.e. it makes sure that the metadata
presented in 3.3 conforms structurally to the DSP (this may also include
LiteralOption constraints, as long as they are discrete values)
b) another would be controlled-vocabulary related, i.e. it makes sure
that a value, if related to a controlled vocabulary (CV), is validated
against the CV entries.
We are not aware that such a service or component exists at the moment.
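To make the idea of a DSP-to-Schematron "translator" more concrete, here is a rough sketch of level a) in Python (standard library only). It is a sketch under assumptions and not an existing component: the function names are hypothetical, the ISO Schematron namespace is assumed (our service may use a different Schematron version), the LiteralOption values are invented for the example, and LiteralOption is assumed to be nested inside LiteralConstraint in DSP-XML. Element names are unqualified, as in 3.3. Level b) would additionally need a lookup against a vocabulary service and is not shown.

```python
import xml.etree.ElementTree as ET

DSP_NS = "http://dublincore.org/xml/dc-dsp/2008/01/14"
SCH_NS = "http://purl.oclc.org/dsdl/schematron"  # ISO Schematron (assumption)
NS = {"dsp": DSP_NS}

# Shortened DSP; the LiteralOption values are hypothetical.
DSP_EXCERPT = """<DescriptionSetTemplate xmlns="http://dublincore.org/xml/dc-dsp/2008/01/14">
  <DescriptionTemplate ID="diamond-item" minOccurs="1" maxOccurs="1" standalone="yes">
    <StatementTemplate ID="shape" minOccurs="1" maxOccurs="1" type="literal">
      <Property>http://purl.org/escidoc/metadata/terms/shape</Property>
      <LiteralConstraint>
        <LiteralOption>crystal</LiteralOption>
        <LiteralOption>fragment</LiteralOption>
      </LiteralConstraint>
    </StatementTemplate>
    <StatementTemplate ID="color" minOccurs="1" maxOccurs="1" type="literal">
      <Property>http://purl.org/escidoc/metadata/terms/color</Property>
    </StatementTemplate>
  </DescriptionTemplate>
</DescriptionSetTemplate>"""


def add_assert(rule, test, message):
    """Append one Schematron assert with the given XPath test to a rule."""
    node = ET.SubElement(rule, f"{{{SCH_NS}}}assert", {"test": test})
    node.text = message


def dsp_to_schematron(dsp_xml):
    """Translate occurrence and LiteralOption constraints into Schematron rules."""
    ET.register_namespace("sch", SCH_NS)
    schema = ET.Element(f"{{{SCH_NS}}}schema")
    for template in ET.fromstring(dsp_xml).findall("dsp:DescriptionTemplate", NS):
        pattern = ET.SubElement(schema, f"{{{SCH_NS}}}pattern")
        # Assumption: the record root element is named like the template ID.
        rule = ET.SubElement(pattern, f"{{{SCH_NS}}}rule",
                             {"context": template.get("ID")})
        for st in template.findall("dsp:StatementTemplate", NS):
            prop = st.findtext("dsp:Property", namespaces=NS).strip()
            name = prop.rsplit("/", 1)[-1]
            lo = st.get("minOccurs", "0")
            hi = st.get("maxOccurs", "infinity")
            if lo != "0":
                add_assert(rule, f"count({name}) >= {lo}",
                           f"{name}: at least {lo} required")
            if hi != "infinity":
                add_assert(rule, f"count({name}) <= {hi}",
                           f"{name}: at most {hi} allowed")
            options = [(o.text or "").strip() for o in
                       st.findall("dsp:LiteralConstraint/dsp:LiteralOption", NS)]
            if options:
                # Values containing an apostrophe would need escaping (not handled).
                allowed = " or ".join(f"normalize-space(.) = '{o}'" for o in options)
                add_assert(rule, f"not({name}[not({allowed})])",
                           f"{name}: value must be one of {', '.join(options)}")
    return schema


if __name__ == "__main__":
    print(ET.tostring(dsp_to_schematron(DSP_EXCERPT), encoding="unicode"))
```

The resulting rules could then be handed to an existing Schematron-based validation service, so that the DSP remains the only manually maintained definition of the profile.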
@Diane Hillmann (NSDL):
You mentioned that you are already working on a DSP management tool. We
are very interested in collaboration, as this would benefit not only the
MPDL, NSDL and eSciDoc communities, but also the overall DC community.
Our focus is manifold:
*have DSP profile management (including metadata profile versioning)
*allow for metadata profiles not expressed as DSP (e.g. MODS)
*have proper validation of metadata records (when the profile is based
on DSP, some automation is needed)
*be able to quickly develop user interfaces for viewing and editing
various resources: publications, images (all with various metadata, e.g.
facial expressions, diamond research etc.), digitized manuscripts etc.
*be able to automatically transform from XML to RDF/XML (based on DSP)
*At present, the native format of our metadata is XML-based metadata
records (based on the Fedora repository)
*RDF/XML and OAI-ORE would be available in future (as dissemination
formats) for all resources
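As an illustration of the XML to RDF/XML transformation mentioned in the list above, a DSP-driven conversion might look roughly like the sketch below. It rests on several assumptions: the function name record_to_rdfxml and the subject URI are hypothetical, the ResourceClass of the DSP is used as rdf:type, and the simple record uses unqualified element names as in 3.3.

```python
import xml.etree.ElementTree as ET

DSP_NS = "http://dublincore.org/xml/dc-dsp/2008/01/14"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"
NS = {"dsp": DSP_NS}

# Shortened version of the DSP excerpt from 3).
DSP_EXCERPT = """<DescriptionSetTemplate xmlns="http://dublincore.org/xml/dc-dsp/2008/01/14">
  <DescriptionTemplate ID="diamond-item" minOccurs="1" maxOccurs="1" standalone="yes">
    <ResourceClass>http://purl.org/escidoc/metadata/profiles/diamond-item</ResourceClass>
    <StatementTemplate ID="shape" minOccurs="1" maxOccurs="1" type="literal">
      <Property>http://purl.org/escidoc/metadata/terms/shape</Property>
    </StatementTemplate>
  </DescriptionTemplate>
</DescriptionSetTemplate>"""


def record_to_rdfxml(record, dsp_xml, subject_uri):
    """Map a simple XML record to RDF/XML using the property URIs from the DSP."""
    ET.register_namespace("rdf", RDF_NS)
    template = ET.fromstring(dsp_xml).find("dsp:DescriptionTemplate", NS)
    rdf = ET.Element(f"{{{RDF_NS}}}RDF")
    desc = ET.SubElement(rdf, f"{{{RDF_NS}}}Description",
                         {f"{{{RDF_NS}}}about": subject_uri})
    resource_class = template.findtext("dsp:ResourceClass", namespaces=NS)
    if resource_class:
        # Assumption: the resource class constraint becomes an rdf:type statement.
        ET.SubElement(desc, f"{{{RDF_NS}}}type",
                      {f"{{{RDF_NS}}}resource": resource_class.strip()})
    for st in template.findall("dsp:StatementTemplate", NS):
        prop = st.findtext("dsp:Property", namespaces=NS).strip()
        namespace, _, local = prop.rpartition("/")
        # Every record element named like the property becomes a literal statement.
        for el in record.findall(local):
            statement = ET.SubElement(desc, f"{{{namespace}/}}{local}")
            statement.text = (el.text or "").strip()
            if el.get(XML_LANG):
                statement.set(XML_LANG, el.get(XML_LANG))
    return rdf


if __name__ == "__main__":
    item = ET.fromstring('<diamond-item><shape xml:lang="en-US">crystal</shape></diamond-item>')
    print(ET.tostring(record_to_rdfxml(item, DSP_EXCERPT, "http://example.org/item/1"),
                      encoding="unicode"))
```

Only literal statement templates are covered here; non-literal statements and vocabulary encoding schemes would need additional handling.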
See also:
http://colab.mpdl.mpg.de/mediawiki/Generic_handling_of_metadata
and linked pages for more details on our work.
On our development server (be careful, it sometimes does not work :)
there is an example application that handles different types of images
(based on screen configurations created manually from DSP):
*
http://dev-faces.mpdl.mpg.de/search/result/escidoc.diamond-item.shape.crystal
The very same tool will also deal with face images, see for example:
*
http://faces.mpib-berlin.mpg.de/home/1/12/emotion/asc/personid/asc/pictureset/asc
Even though the application works with images in both cases, they are
described by different metadata.
Best regards,
Natasa Bulatovic
Stuart Sutton wrote:
> I recently sent Tom Baker a personal email asking a question about the DSP specification documentation.[1] My specific question was whether there was a schema available to augment the current online documentation. Apparently, Tom has had several other private inquiries of late regarding the DSP specification from others launching initiatives that rely on it. He suggested that all of us inquirers post a description of what we are doing (planning to do) here on the architecture list in order to raise awareness of DSP deployment and, perhaps, do some cross-pollination of thinking. If this is not the most appropriate forum for such revelations (and consequent inquiries and discussion), please let me know. Obviously, any suggestions regarding a more suitable forum would be appreciated.
>
> The Achievement Standards Network (ASN)[2] is in the preliminary stages of developing a configurable RDF authoring tool for representing resources like national curricula in the education, further education and vocation sectors by those unfamiliar (and uninterested) in RDF but needing to represent such curricula in a Semantic Web-amendable manner. While based in RDF schemas developed by ASN for representation of such resources, we want the tool we develop to be configurable by means of DSPs to enable jurisdiction-specific extensions to the ASN schema (i.e., jurisdictional application profiles) in order to meet specific "local" needs. I would certainly be interested in knowing whether, and how, others are engaging in similar (and even dissimilar) initiatives to deploy the DSP specification in functional applications.
>
> [1] http://dublincore.org/documents/2008/03/31/dc-dsp/
> [2] http://www.achievementstandards.org/
>
> Stuart Sutton
>
> Associate Professor & Chair,
> MLIS Degree Program
> The Information School
> Mary Gates Hall, Suite 370
> Box: 352840
> University of Washington
> Seattle, WA 98195-2840
> Tel. 206-685-6618
>
--
Natasa Bulatovic
Max Planck Digital Library (MPDL)
Amalienstrasse 33
80799 Munich, Germany
http://www.mpdl.mpg.de
phone: +49-89-38602-223
fax: +49-89-38602-280