On Sat, 12 Apr 2003, Gorissen,Pierre P.J.B. wrote:
> It looks like the strict check by XMLSpy is causing more problems related to the <language></language> element.
> I had a look at the banana example (http://www.rdn.ac.uk/resourcefinder/?query=banana) and the first four entries there fail to validate in XMLSpy:
>
> http://www.rdn.ac.uk/record/lom/oai:rdn:agrifor:2011603
> http://www.rdn.ac.uk/record/lom/oai:rdn:agrifor:2024084
> http://www.rdn.ac.uk/record/lom/oai:rdn:agrifor:2011641
> http://www.rdn.ac.uk/record/lom/oai:rdn:agrifor:2014720
>
> This is because <language>eng</language> is used in the records instead
> of <language>en</language> or <language>en-UK</language>
Hmmm... that is a slight pain. The current RFC for language tags is
RFC-3066
http://www.ietf.org/rfc/rfc3066.txt
which allows both 2- and 3-letter language codes. This is what Dublin Core
now suggests using and is what I recommend in the RDN/LTSN application
profile. Unfortunately XML doesn't seem to have caught up with this :-(
Looks like I'll have to go back to referring to RFC-1766 OR explicitly
state that only 2-letter language codes from RFC-3066 are allowed.
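
To make the mismatch concrete, here is a rough sketch in Python. It assumes
that a strict validator applies the LanguageID production from XML 1.0
(Second Edition), which only accepts a 2-letter primary code (or an i-/x-
prefixed one), while RFC-3066 allows 1 to 8 letters. Both patterns are my
own transcription of those grammars and the function names are made up for
illustration; the sketch checks the raw string and does no whitespace
handling.

```python
import re

# Assumed transcription of LanguageID from XML 1.0 (Second Edition):
# a 2-letter ISO 639 code, or an i-/x- prefixed code, then alphabetic subcodes.
XML10_LANGUAGE_ID = re.compile(
    r"(?:[A-Za-z]{2}|[iI]-[A-Za-z]+|[xX]-[A-Za-z]+)(?:-[A-Za-z]+)*"
)

# Assumed transcription of the RFC-3066 syntax:
# a primary subtag of 1-8 letters, then subtags of 1-8 letters or digits.
RFC3066_TAG = re.compile(r"[A-Za-z]{1,8}(?:-[A-Za-z0-9]{1,8})*")


def is_xml10_language_id(value: str) -> bool:
    """True if the raw string matches the strict XML 1.0 (2nd Ed.) pattern."""
    return XML10_LANGUAGE_ID.fullmatch(value) is not None


def is_rfc3066_tag(value: str) -> bool:
    """True if the raw string is syntactically an RFC-3066 tag."""
    return RFC3066_TAG.fullmatch(value) is not None


def is_language_id_or_none(value: str) -> bool:
    """Mimics the LanguageIdOrNone union: the literal 'none' or a language id."""
    return value == "none" or is_xml10_language_id(value)


if __name__ == "__main__":
    for candidate in ["en", "en-UK", "eng", " en", "en ", "e n", "none"]:
        print(repr(candidate),
              "strict:", is_language_id_or_none(candidate),
              "rfc3066:", is_rfc3066_tag(candidate))
```

Under these assumptions 'eng' is fine as an RFC-3066 tag but is rejected by
the stricter pattern, which matches what XMLSpy reports.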
For now, I've also added an explicit fix for 'eng'->'en' in my DC->LOM
output filter for RDN records. So all the records above should be OK.
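
A fix of this kind could look roughly like the Python sketch below. This is
not the actual filter code: the function name, the mapping table and the
whitespace stripping are assumptions for illustration, and only the
'eng'->'en' entry comes from this thread (the other entries are standard
ISO 639-2 to ISO 639-1 equivalents added as examples).

```python
# Illustrative table only; a real filter would need the full ISO 639 mapping.
ISO639_3LETTER_TO_2LETTER = {
    "eng": "en",
    "fre": "fr", "fra": "fr",
    "ger": "de", "deu": "de",
    "dut": "nl", "nld": "nl",
}


def normalise_language(value: str) -> str:
    """Strip surrounding whitespace and map a known 3-letter primary code
    to its 2-letter equivalent; anything else is returned unchanged."""
    code = value.strip()
    # Split off the primary subtag so that 'eng-GB' becomes 'en-GB'.
    primary, sep, rest = code.partition("-")
    mapped = ISO639_3LETTER_TO_2LETTER.get(primary.lower(), primary)
    return mapped + sep + rest


if __name__ == "__main__":
    for raw in ["eng", " eng\n", "en", "eng-GB", "xyz"]:
        print(repr(raw), "->", repr(normalise_language(raw)))
```

Unknown codes are passed through untouched rather than guessed at, so a
record with an unexpected value still fails validation visibly.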
Andy.
> The W3C validator you mention doesn't report errors because of this.
>
> Pierre
>
>
> -----Original message-----
> From: Gorissen,Pierre P.J.B.
> To: [log in to unmask]
> Sent: 12-4-03 10:54
> Subject: Re: LOM Test Data
>
> Andy,
>
> The file validates OK now in XMLSpy. Thanks for the modifications.
>
> I did some more digging trying to find the reason for the behavior:
>
> The language element is defined in elementNames.xsd:
> <xs:group name="languageIdOrNone">
> <xs:sequence>
> <xs:element name="language" type="LanguageIdOrNone"/>
> </xs:sequence>
> </xs:group>
>
> The LanguageIdOrNone type can be found in dataTypes.xsd:
> <!-- LanguageId -->
> <xs:simpleType name="LanguageIdOrNone">
> <xs:union memberTypes="LanguageId LanguageIdNone"/>
> </xs:simpleType>
> <xs:simpleType name="LanguageId">
> <xs:restriction base="xs:language"/>
> </xs:simpleType>
> <xs:simpleType name="LanguageIdNone">
> <xs:restriction base="xs:string">
> <xs:enumeration value="none"/>
> </xs:restriction>
> </xs:simpleType>
>
> Where it limits the allowed value space to either "none" or a value that
> is restricted to the language type as defined in the
> http://www.w3.org/2001/XMLSchema namespace.
> The W3C page on the XML Schema recommendation describes the language
> datatype as:
> [Definition:]language represents natural language identifiers as defined
> by [RFC 1766]. The ·value space· of language is the set of all strings
> that are valid language identifiers as defined in the language
> identification section of [XML 1.0 (Second Edition)]. The ·lexical
> space· of language is the set of all strings that are valid language
> identifiers as defined in the language identification section of [XML
> 1.0 (Second Edition)]. The ·base type· of language is token.
> (http://www.w3.org/TR/xmlschema-2/#language)
>
> The RFC 1766 Tags for the Identification of Languages
> (http://www.ietf.org/rfc/rfc1766.txt) says in section 2 "Whitespace is
> not allowed within the tag".
>
> So that would suggest that this behaviour by XMLSpy isn't a bug, but
> just a strict conformance to the rules?
>
> Pierre
>
> -----Original message-----
> From: Andy Powell
> To: Gorissen,Pierre P.J.B.
> Sent: 12-4-03 10:21
> Subject: Re: LOM Test Data
>
> On Sat, 12 Apr 2003, Gorissen,Pierre P.J.B. wrote:
>
> >
> > No it didn't help. So I had another look.
>
> Can you do me a favour and try again now? Thanks.
>
> Andy.
>
> > It looks like XMLSpy doesn't allow ANY spaces or linebreaks in the
> > language element.
> > So despite what I said and thought, the <title> plays no role in the
> > problem, you can leave that one unchanged, but if the
> > <language></language> element has any spaces it fails:
> >
> > <language> en</language> fails
> > <language>
> > en</language> fails
> > <language>
> > en</language> fails
> > <language>en </language> fails
> > <language>e n</language> fails
> > <language>en</language> passes (and makes the file valid)
> >
> > Are there any rules in the schema used that disallow spaces in that
> > element or that might explain why XMLSpy interprets it like this?
> >
> > Pierre
> >
> > -----Original message-----
> > From: Andy Powell
> > To: [log in to unmask]
> > Sent: 12-4-03 9:30
> > Subject: Re: LOM Test Data
> >
> > On Sat, 12 Apr 2003, Gorissen,Pierre P.J.B. wrote:
> >
> > > When I loaded
> > > http://www.rdn.ac.uk/record/lom/oai:rdn:sosig:916739272-28853 into
> > > XMLSpy Professional Edition Version 5 rel. 4 I received the same error
> > > as Howard reported (Mandatory element 'identifier' expected in place of
> > > 'language'). I looked at the XML and saw that there were a couple of
> > > empty lines in both the string for the title and the language tag:
> >
> > Thanks. Is it any better now? I've removed the empty lines...
> >
> > Andy
> > --
> > Distributed Systems, UKOLN, University of Bath, Bath, BA2 7AY, UK
> > http://www.ukoln.ac.uk/ukoln/staff/a.powell +44 1225 383933
> > Resource Discovery Network http://www.rdn.ac.uk/
> >
>
> Andy
> --
> Distributed Systems, UKOLN, University of Bath, Bath, BA2 7AY, UK
> http://www.ukoln.ac.uk/ukoln/staff/a.powell +44 1225 383933
> Resource Discovery Network http://www.rdn.ac.uk/
>
Andy
--
Distributed Systems, UKOLN, University of Bath, Bath, BA2 7AY, UK
http://www.ukoln.ac.uk/ukoln/staff/a.powell +44 1225 383933
Resource Discovery Network http://www.rdn.ac.uk/