Would you suggest converting an existing HTML document into XHTML using
Tidy? I presume this can be an automated process, saving a lot of time,
since we can teach Tidy to handle specific conversions. It can help to
preprocess existing HTML documents so that they conform to the new profile.
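
As a rough illustration of the automated step suggested above, here is a minimal Python sketch that shells out to the Tidy command-line tool. It assumes a `tidy` binary is on the PATH; the function names `tidy_command` and `convert` and the file names are hypothetical, chosen only for this example.

```python
import shutil
import subprocess


def tidy_command(src, dst):
    """Build the argument list for an HTML-to-XHTML conversion with Tidy."""
    # -asxhtml: emit well-formed XHTML; -q: suppress non-essential output;
    # -o: write the result to dst instead of stdout.
    return ["tidy", "-asxhtml", "-q", "-o", dst, src]


def convert(src, dst):
    """Run Tidy on src, writing XHTML to dst; True if no errors were reported."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy is not installed or not on PATH")
    result = subprocess.run(tidy_command(src, dst))
    # Tidy exits with 0 on success, 1 if there were warnings, 2 on errors.
    return result.returncode < 2
```

A call such as `convert("page.html", "page.xhtml")` would cover the syntax conversion only. Tidy on its own would not add the profile URI, so that step would still need separate handling.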
Digital Resources & Services
National Library Board, Singapore
DID +65 6846 6740
Cell +65 9187 4275
Pete Johnston <[log in to unmask]> on 29/11/2007
Please respond to DCMI Architecture Forum <[log in to unmask]>
Sent by: DCMI Architecture Forum <[log in to unmask]>
To: [log in to unmask]
Subject: Re: [DC-HTML Public Comment]
Apologies for the slow response...
> This new recommendation for DC in (X)HTML looks good. And it
> has lots of examples which is good.
> It does feel a bit tedious duplicating every example in HTML
> and XHTML. Shouldn't we be encouraging people to use XHTML?
It felt a bit tedious writing them too ;-) But I'd been working on the
basis that we needed to support both HTML and XHTML, since the current
recommendation does cover both cases. But it is a question worth asking.
Certainly if the choice is that this spec should focus only on XHTML, it
simplifies (and shortens!) it, and allows us to sidestep the thorny
question of how to specify an extraction algorithm for the HTML case,
which isn't handled by GRDDL.
> So just places where it differs could be indicated (just
> mention once the '/>' needed in XHTML but not HTML).
I think there are a few other issues too (e.g. lang v xml:lang). I'd
have a slight preference for keeping the parallel sets of examples (but,
yes, I need to adjust the spacing to clarify what the captions apply
to), but I don't feel strongly about it if there's a strong preference
for dropping/reducing them.
> However I wonder if it should say something about changes
> from the current recommendation, and backwards compatibility
> (or not). There is a lot of DC out there on Web page headers.
> Do they suddenly become invalid? I suspect there will be a
> hard education task to encourage people to change current
> practices and internal / local recommendations.
The first point to emphasise is the critical role of the profile URI in
determining whether an XHTML instance should be subject to the
interpretation of the meta/link elements as expressing a DC description
set, as described in the proposal.
i.e. it is the presence of the URI in the profile attribute of the head
element which acts as the licence - the "hook" if you like - which signals
that this interpretation should be applied to an X/HTML instance.
Without that profile URI - or some other external indicator that the
document creator intends the X/HTML doc to be interpreted in accordance
with this profile - an instance should not be subject to the
interpretation described here.
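
To make the "hook" idea concrete, here is a minimal Python sketch of the check a processor might make before applying the interpretation. It is an illustration under stated assumptions, not part of the proposal: the URI `http://example.org/dc-html-profile` is a placeholder rather than the real profile URI, the function name `uses_profile` is hypothetical, and the input is assumed to be well-formed XHTML in the XHTML namespace.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
# Placeholder only: NOT the real profile URI from the proposal.
PROFILE = "http://example.org/dc-html-profile"


def uses_profile(xhtml_text, profile_uri):
    """Return True if the head element's profile attribute lists profile_uri."""
    root = ET.fromstring(xhtml_text)
    head = root.find(f"{{{XHTML_NS}}}head")
    if head is None:
        return False
    # The profile attribute may hold several whitespace-separated URIs.
    return profile_uri in head.get("profile", "").split()


with_profile = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    '<head profile="http://example.org/dc-html-profile"><title>t</title></head>'
    "<body/></html>"
)
without_profile = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>t</title></head><body/></html>"
)

print(uses_profile(with_profile, PROFILE))
print(uses_profile(without_profile, PROFILE))
```

Only the first document would be handed on to the DC extraction step; the second is left alone, whatever meta/link elements it happens to contain.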
So existing documents, which don't use that profile URI, are not subject
to the interpretation described in the proposal.
Of course this raises the question of what _is_ the DCAM interpretation of
an X/HTML instance which uses the profile URI specified by the current
DC-in-X/HTML DCMI Recommendation?
Strictly speaking, the answer is that at this point in time there
isn't one, or at least not one formally described/endorsed by DCMI. The
existing DC-in-X/HTML spec pre-dated the DCAM and is based on a
different "abstract model".
Having said that, it is possible to "retro-fit" an interpretation as a
description set for the conventions used in the existing DC-in-X/HTML
spec, particularly given that there are some existing interpretations of
the existing conventions as RDF.
So I agree that there needs to be some accompanying documentation which
seeks to address questions like
What are the relationships between the new profile and the 2003 profile?
What is the DCAM interpretation for the 2003 profile?
What are the differences between the 2003 profile and the 2007 profile?
What are the reasons for using one rather than the other?
(and probably some other related questions?)
I think this belongs in some sort of contextual note rather than in the
spec itself though, and we've made a start at developing this in a page
on the wiki
though, TBH, at the moment the content (of the latter parts at least)
isn't much more than some outline notes, so might not make a lot of
sense as it stands.
> For example, I have local recommendations such as:
> <meta name="DC.subject" scheme="DCTERMS.LCSH" content="economics" />
> <meta name="DC.subject" content="science; arts; social science" />
> <meta name="DC.subject" scheme="DCTERMS.LCSH"
>       content="economics; social science; engineering" />
> <meta name="DC.subject" scheme="DCTERMS.DDC" content="050; 100; 150" />
> <meta name="DCTERMS.created" scheme="DCTERMS.W3CDTF"
>       content="2002-05-01" />
> Actually I'm not sure if this last one is still valid. The
> document uses XSD.date but doesn't say anything about why it
> uses that rather than DCTERMS.W3CDTF.
So as I say, nothing in the new spec does anything to change the
interpretation of any existing data which uses the 2003 profile.
And yes, using the new profile, you can still use DCTERMS.W3CDTF as a
prefixed name for the DCMI W3CDTF Syntax Encoding Scheme. I used the XML
Schema datatype in the example to emphasise that non-DCMI-owned
datatypes were supported, but I didn't mean to imply that W3CDTF wasn't.
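
For illustration, here is a simplified Python sketch of how a prefixed name such as DCTERMS.W3CDTF might be expanded to a URI. It assumes the prefix is declared with a link element of the form rel="schema.DCTERMS" and that the URI is formed by appending the local part to the declared namespace. The class name `DCMetaParser` and its methods are hypothetical, and this is not the extraction algorithm from the spec.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect schema.* namespace declarations and meta name/scheme/content."""

    def __init__(self):
        super().__init__()
        self.namespaces = {}   # prefix -> namespace URI
        self.statements = []   # (name, scheme, content) tuples

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        rel = a.get("rel") or ""
        if tag == "link" and rel.startswith("schema."):
            self.namespaces[rel[len("schema."):]] = a.get("href") or ""
        elif tag == "meta" and a.get("name"):
            self.statements.append((a["name"], a.get("scheme"), a.get("content")))

    def expand(self, prefixed_name):
        """Expand 'PREFIX.local' to a URI, or return None if undeclared."""
        prefix, sep, local = prefixed_name.partition(".")
        if not sep or prefix not in self.namespaces:
            return None
        return self.namespaces[prefix] + local


sample = """
<link rel="schema.DCTERMS" href="http://purl.org/dc/terms/" />
<meta name="DCTERMS.created" scheme="DCTERMS.W3CDTF" content="2002-05-01" />
"""

parser = DCMetaParser()
parser.feed(sample)
for name, scheme, content in parser.statements:
    print(parser.expand(name), parser.expand(scheme), content)
```

Under these assumptions, both the property name and the scheme in the sample expand against the same declared namespace, and a name with an undeclared prefix is simply left unexpanded.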
> Also a minor point: Is there a reason why section 3 and 4
> headings have 'Meta Data' rather than 'Metadata'?
They use 'Meta Data' because that is the form used in the HTML
specification's notion of a "meta data profile",
and we're saying explicitly that this document is one of those things.
So actually, now that I look at it again, I think the reference in the
first sentence of 4.1 should also be to "meta data profile", not
Technical Researcher, Eduserv Foundation
Email: [log in to unmask]
Tel: +44 (0)1225 474323