Andrew Cunningham wrote:
> Ok .. i understand your point ... i know in html you are only allowed to
> have one encoding in the document .. does the same hold for xml?
Hmmm... I must admit that I had not considered this.
Thanks for raising it.
Re-reading the W3C Recommendation of 10 February 1998 for the
Extensible Markup Language (XML) 1.0, I find that
sec. 4.3.3 says: "Each external parsed entity in an XML
document may use a different encoding for its characters";
and "it is an error [...] for an encoding declaration to
occur other than at the beginning of an external entity".
The way I read this is that there may be one, and only
one, encoding in use for each external parsed entity.
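
To make that reading concrete, here is a minimal sketch in Python of how
a robot might decide the encoding of each external parsed entity on its
own. The helper names (declared_encoding, decode_entity) and the sample
entities are hypothetical, made up for illustration. It also assumes
ASCII-compatible encodings, so it does not handle UTF-16 or a byte order
mark, which a real XML processor has to detect first.

```python
import re

# Matches an XML or text declaration at the very start of an entity and
# captures the encoding name, e.g. <?xml encoding="KOI8-R"?>.
_ENC_DECL = re.compile(
    rb'^<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def declared_encoding(entity: bytes, default: str = "utf-8") -> str:
    """Return the encoding declared at the start of one entity.

    Falls back to UTF-8 when there is no declaration (assumption: no
    byte order mark and no transport-level charset information).
    """
    m = _ENC_DECL.match(entity)
    return m.group(1).decode("ascii").lower() if m else default

def decode_entity(entity: bytes) -> str:
    """Decode one entity using its own declared encoding."""
    return entity.decode(declared_encoding(entity))

# Three hypothetical external parsed entities, each in a different encoding.
norwegian = '<?xml encoding="ISO-8859-1"?><title>Blåbær</title>'.encode("iso-8859-1")
russian = '<?xml encoding="KOI8-R"?><title>Черника</title>'.encode("koi8-r")
plain = b"<title>Blueberry</title>"  # no declaration, so UTF-8 is assumed

for entity in (norwegian, russian, plain):
    print(declared_encoding(entity), decode_entity(entity))
```

The point of the sketch is only that the encoding is looked up once per
entity, at its beginning, and then applies to the whole entity.
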
Which means that doing as I have done so far (i.e.: putting
encoding declarations into plain parsed entities) is in fact
_not_ valid XML. Bummer.
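
One way to check this against an existing parser is sketched below in
Python, using the expat-based parser behind the standard library's
xml.etree.ElementTree. The helper name is_well_formed and the two sample
documents are hypothetical. This only shows how one parser behaves; it
is not an authoritative reading of the Recommendation.

```python
import xml.etree.ElementTree as ET

def is_well_formed(data: bytes) -> bool:
    """Return True if the parser accepts `data` as a document entity."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False

# Encoding declaration at the very beginning of the entity.
at_start = (b'<?xml version="1.0" encoding="ISO-8859-1"?>'
            b'<doc><p>bl\xe5b\xe6r</p></doc>')

# A second declaration further down, trying to switch encoding mid-entity.
in_middle = (b'<?xml version="1.0" encoding="ISO-8859-1"?><doc>'
             b'<?xml version="1.0" encoding="UTF-8"?><p>x</p></doc>')

print(is_well_formed(at_start))   # declaration at the start
print(is_well_formed(in_middle))  # declaration in the middle
```

With this parser the first document is accepted and the second is
rejected, which matches the reading above.
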
I would welcome further clarification from anyone
who can provide an authoritative answer on this issue.
I can't honestly see why there should be such a constraint.
But I guess it is not very productive to disagree with W3C
recommendations when creating Internet applications :-( .
> personally .. if i was mixing scripts like that .. i'd use a
> single encoding like the default encodings utf-8 or utf-16
Of course. But my question here was meant to learn whether
the other way of doing it was legal XML, not whether it was
possible to do it differently.
> does anyone know of a practical example of a parser or xml agent that can
> handle multiple encodings within the same document?
Well, currently mine can :-) (but that doesn't make it XML,
so I'll take that out again).
> or are you thinking of a
> case where only the appropriate language information is extracted (ie
> language negotiation) and the appropriate encoding used?
>
> i'm just trying to clarify for myself whether your example was meant for
> lang based extraction or whether all languages would be displayed ? since
> each scenario would place very different constraints on the parser .. or xml
> agent
My application is a web robot that cruises the web looking for XML
metadata to enter into the database that lives in the heart of our
Internet search engine. I was just thinking about how a multilingual
web site may want to present itself to such a robot, and constructed the
example from that scenario.
--
- gisle hannemyr ( http://home.sol.no/home/gisle/ )
------------------------------------------------------------------------
"Use the Source, Luke. Use the Source." -- apologies to Obi-Wan Kenobi
------------------------------------------------------------------------