On Fri, 21 Feb 1997, Misha Wolf wrote:
> Jon wrote:
>
> >That sounds groovy to me. I was talking to Dave Beckett the other day and
> >he suggested that I18N of DC metadata would be nudged in the right
> >direction if we assume that the default charset is ISO-8859-1 (or better
>
> Jon, wash out your mouth with soap. No default charsets.
What?? No default charset? So you might have octets from any old charset
in your metadata? Yucky. I think I'll pass the soap over to someone else
on this one. I thought ISO-8859-1 or ISO-10646 as the default at least.
So if some DC metadata turns up with no Charset qualifier, how
exactly should software process it? If you're going to index this data
and then make it searchable (possibly via a multilingual front end), how
do you interpret what is effectively an octet stream in order to do
matching? Is it from ISO-8859-1? One of the other ISO-8859-x char sets?
ISO-10646? Big5? Etc, etc, etc.
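To make the ambiguity concrete, here is a minimal sketch in Python. The
octet values, the list of candidate charsets and the helper name
interpretations() are all made up for illustration; none of this comes
from any DC specification. It just shows one unlabelled octet string
giving a different result under each charset you guess at:

```python
# Hypothetical example: two octets that arrive with no Charset qualifier.
RAW = b"\xc3\xa9"

def interpretations(octets, charsets=("iso-8859-1", "utf-8", "iso-8859-5")):
    """Decode the same octets under each candidate charset (illustrative only)."""
    results = {}
    for name in charsets:
        try:
            results[name] = octets.decode(name)
        except UnicodeDecodeError:
            # The octets are not a valid sequence in this charset.
            results[name] = None
    return results

readings = interpretations(RAW)
for name, text in readings.items():
    print(name, repr(text))
```

Each guess yields a different string, so an indexer with no declared or
default charset has no principled way to decide which one to match
queries against.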
> >yet the full Unicode 2.0, though not much implements that at the moment)
>
> Try Alis' Tango or Netscape Communicator 4.0.
Exactly. Not exactly widespread yet, is it? Hopefully time will change
that though.
> >with an encoding of UTF-8 (and say a default language of International
> >English).
>
> Where's that bar of soap? No default languages.
Again, my mouth is more than clean enough, thank you. I think I've argued
that point before and I'm sticking to it. I want to know what the default
interpretation of the DC elements is so that any software I write can do
sensible things with them.
Tatty bye,
Jim'll
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Jon "Jim'll" Knight, Researcher, Sysop and General Dogsbody, Dept. Computer
Studies, Loughborough University of Technology, Leics., ENGLAND. LE11 3TU.
* I've found I now dream in Perl. More worryingly, I enjoy those dreams. *