The Bath Profile specifies that, for levels 1 & 2, XML record syntax
is used with a DTD for Basic Dublin Core (i.e. version 1.1). The DTD
is from the CIMI project, and allows a <record-list> of <dc-record>s.
This DTD does not reference any of the standard SGML
character entity sets, so I assume the only character entities
available are the built-in XML ones: &amp;, &lt;, &gt;, &apos;, &quot;. I am wondering
how to include non-keyboard characters. Should they be in
Unicode, or should they be in plain text, e.g. é becomes 'e',
α becomes 'alpha'? I assume that if I were to include any
character entity sets within a DTD it would no longer be
interoperable.
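
One option that needs no change to the DTD is XML's numeric character
references (e.g. &#233; for é), which are part of XML itself rather than
declared entities. The following is only a minimal sketch, assuming the
records are generated by a Python script; the function name to_ascii_xml
is hypothetical and nothing here comes from the Bath Profile or the CIMI DTD.

```python
from xml.sax.saxutils import escape


def to_ascii_xml(text: str) -> str:
    """Return text safe for an ASCII-only XML record (hypothetical helper).

    Markup characters (&, <, >) are escaped with the built-in XML entities;
    every non-ASCII character becomes a decimal numeric character reference,
    which requires no entity declarations in the DTD.
    """
    escaped = escape(text)
    return escaped.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_xml("Caf\u00e9 & \u03b1-particles"))
# prints: Caf&#233; &amp; &#945;-particles
```

A conforming XML parser turns the references back into the original
characters, so the receiving end sees Unicode either way.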
Following on from this, I'm wondering how to encode superscripts
and subscripts. I don't think I can do this in Unicode for the general
case, though Unicode may include simple ones like
<SUP>2</SUP>. Again, if I include extra tags in text like <SUP>
and <SUB> the DTD will no longer be interoperable. The same
applies to formatting tags like italic and bold, but I expect one can
live without these - the content being more important than the
format. Super/subscripts are quite likely to occur in scientific article
titles. The only solution I have so far thought of is to include them
as plain text, e.g. ^2^ for a superscript and ~2~ for a subscript.
Does anyone know if there is a 'standard' convention for this?
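
To make the plain-text idea concrete, here is a minimal sketch of how
titles marked up with <SUP> and <SUB> could be flattened to the ^...^ and
~...~ notation before going into a <dc-record>. The notation is only the
ad hoc one suggested above, not an established standard, and the function
name flatten_scripts is hypothetical.

```python
import re


def flatten_scripts(title: str) -> str:
    """Rewrite <SUP>x</SUP> as ^x^ and <SUB>x</SUB> as ~x~ (ad hoc notation)."""
    flags = re.IGNORECASE | re.DOTALL
    title = re.sub(r"<SUP>(.*?)</SUP>", r"^\1^", title, flags=flags)
    title = re.sub(r"<SUB>(.*?)</SUB>", r"~\1~", title, flags=flags)
    return title


print(flatten_scripts("Photolysis of H<SUB>2</SUB>O at 10<SUP>-3</SUP> bar"))
# prints: Photolysis of H~2~O at 10^-3^ bar
```

Nested tags and titles that already contain a literal ^ or ~ are not
handled, so a real converter would need an escaping rule as well.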
Thanks for any help.
Ann
--------------------------------------------------------------------------
Mrs. Ann Apps. Electronic Publishing @ MIMAS. Manchester Computing,
University of Manchester, Oxford Road, Manchester, M13 9PL, UK
Tel: +44 (0) 161 275 6039 Fax: +44 (0) 0161 275 6040
WWW: http://epub.mimas.ac.uk/ann.html
--------------------------------------------------------------------------