[I'm copying the [log in to unmask] list]
Ann,
> The Bath Profile specifies that, for levels 1 & 2, XML record syntax
> is used with a DTD for Basic Dublin Core (ie. version 1.1). The DTD
> is from the CIMI project, and allows a <record-list> of <dc-record>s.
>
> This DTD does not include reference to any standard SGML
> character entity sets. So I assume the only character entities
> available are the in-built XML ones: &amp;, &lt;, &gt;, &apos;, &quot;. I am wondering
> how to include non-keyboard characters. Should they be in
> Unicode, or should they be in plain text, eg. é becomes 'e',
> α becomes 'alpha'? I assume that if I were to include any
> character entity sets within a DTD it would no longer be
> interoperable.
If you have difficulties using Unicode directly, I suggest you use
Numeric Character References (NCRs). In both HTML and XML, NCRs always
refer to the Unicode Standard. Both decimal and hexadecimal versions
are supported. For example, all of these mean the same thing:
A
&#65;
&#x41;
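As a rough illustration only (nothing here comes from the Bath Profile or the CIMI DTD), the conversion could be scripted along these lines in Python. The helper name to_ncr is made up for this sketch, and it assumes the text has already had &, < and > escaped separately:

```python
import html


def to_ncr(text, use_hex=False):
    """Replace every non-ASCII character with a numeric character reference.

    Hypothetical helper for illustration; ASCII characters pass through
    unchanged, so markup characters must be escaped elsewhere.
    """
    out = []
    for ch in text:
        cp = ord(ch)  # Unicode code point of the character
        if cp < 128:
            out.append(ch)
        elif use_hex:
            out.append(f"&#x{cp:X};")  # hexadecimal NCR
        else:
            out.append(f"&#{cp};")     # decimal NCR
    return "".join(out)


print(to_ncr("caf\u00e9 \u03b1"))        # decimal form
print(to_ncr("caf\u00e9 \u03b1", True))  # hexadecimal form

# The decimal and hexadecimal references decode to the same character.
print(html.unescape("&#65;") == html.unescape("&#x41;") == "A")
```

Any XML parser should turn these references back into the original characters, so the record stays plain ASCII on the wire without needing extra entity sets in the DTD.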
> Following on from this, I'm wondering how to encode superscripts
> and subscripts. I don't think I can do this in Unicode for the general
> case, though Unicode may include simple ones like ².
> Again, if I include extra tags in text like <SUP>
> and <SUB> the DTD will no longer be interoperable. The same
> applies to formatting tags like italic and bold, but I expect one can
> live without these - the content being more important than the
> format. Super/subscripts are quite likely to occur in scientific article
> titles. The only solution I have so far thought of is to include them
> as plain text, eg. ^2^ for a superscript and ~2~ for a subscript.
> Does anyone know if there is a 'standard' convention for this?
This is more of a problem. When I was involved in the W3C RDF WG and the
DC Datamodel WG, my hope was that arbitrary markup could be included, and
be "passed through" transparently. This is obviously needed for titles
of mathematical papers etc. I've rather lost touch, and don't know what
the current situation is.
> Thanks for any help.
> Ann
>
> --------------------------------------------------------------------------
> Mrs. Ann Apps. Electronic Publishing @ MIMAS. Manchester Computing,
> University of Manchester, Oxford Road, Manchester, M13 9PL, UK
> Tel: +44 (0) 161 275 6039 Fax: +44 (0) 0161 275 6040
> Email: [log in to unmask] WWW: http://epub.mimas.ac.uk/ann.html
> --------------------------------------------------------------------------
Misha
[This mail was written using voice recognition software]
-----------------------------------------------------------------
Visit our Internet site at http://www.reuters.com
Any views expressed in this message are those of the individual
sender, except where the sender specifically states them to be
the views of Reuters Ltd.