David and others, hello.

[this is a long un', but it's worth it -- promise]

On 2006 Aug 7 , at 10.29, David Berry wrote:

> On Wed, 2 Aug 2006, Norman Gray wrote:
>
>> Mark,
>>
>> On 2006 Aug 2 , at 12.32, Norman Gray wrote:
>>
>>>>>> think about whether you
>>>>>> need to perform unit conversions for the quantity that you've
>>>>>> identified to mean what you think it means...
>
> Sounds to me like some standard library for handling all this system
> conversion, units conversion, searching, etc, stuff is needed :-)

Ah, but _that_ we've already got.  What we don't have is something
_generic_ for what I shall suddenly decide to call semantic conversion.

Until now!  Herewith the demo premiere (I plan to talk about this at
the Strasbourg VOTech meeting, and I hope at the IVOA, but I'll run it
past youse first).

>> I meant to add that unit conversions wouldn't be addressed by any
>> sort of solution I'm talking about, but they're rather separate
>> anyway, since unit specifications address how the value is
>> represented -- and thus are to some extent syntactic -- rather than
>> what it is.  No?
>
> In that sense a velocity (say) is a velocity is a velocity, and *all*
> metadata describing it is syntactic, not just the units.
>
> To say "velocity A and B are the same, but just measured in different
> units" seems to me to be no different to saying "velocity A and B are
> the same but just measured in different rest frames". In both cases, A
> and B are representations of the same physical phenomenon. So I can't
> immediately see any reason for treating units differently to any other
> item of metadata. They are all needed if you want to be able to compare
> two values.

You're really pining for the good old Quantity discussion, aren't you?

I think that fundamentally, in the abstract, you're right, and that
units are as much a part of the meaning of a velocity (say) as
anything else.
However I think they are practically distinct, and I have just now
come across what I believe to be a good illustration of why.  But
first the demo (I'm on the edge of my seat -- I don't know about you).

I'm working on the utype-to-utype-to-ucd mappings I was talking about
a week or so ago, and I'm using the USNO-B catalogue at ROE as a test
case, simply because it was handy.  That resource has an IVO-ID of
<ivo://roe.ac.uk/DSA_USNOB/TDB>, and has a set of column descriptions
which includes

    <column>
      <name>ra</name>
      <description>J2000 Celestial Right Ascension</description>
      <datatype>datatype='float'</datatype>
      <ucd>POS_EQ_RA_MAIN</ucd>
      <unit>deg</unit>
    </column>

(this is a type defined by
<http://www.ivoa.net/xml/VODataService/v0.5>, and yes, that <datatype>
does look a bit odd...).

So, there's implicitly a UTYPE <ivo://roe.ac.uk/DSA_USNOB/TDB#ra>,
which is a subclass of
<http://cdsweb.u-strasbg.fr/UCD/old#POS_EQ_RA_MAIN>.  That is, I can
convert the VODataService information to RDF.

From the UCD1 to UCD1+ mappings, I can get that POS_EQ_RA_MAIN is a
subclass of (well, was mapped to) pos.eq.ra;meta.main.  I can generate
RDF from that, too.

We might also decide that there is a set of types which is of interest
to us, or a community we're part of, and that:

    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>.
    @prefix x: <http://example.edu/utypes#>.

    x:ra a rdfs:Class.
    <ivo://roe.ac.uk/DSA_USNOB/TDB#ra> rdfs:subClassOf x:ra.

(that's RDF, in the form of `Notation3', and says that <...#ra> is a
subclass of the concept <http://example.edu/utypes#ra>, so that the
USNO-B RA is a more specific type of RA than the one we've defined and
documented at that URL).

So we load those different bits of information into the reasoner, and
then query it:

    % cat query.rq
    prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
    select ?t
    where { <ivo://roe.ac.uk/DSA_USNOB/TDB#ra> rdfs:subClassOf ?t }
    %

(that's SPARQL, and is a broadly SQL-like query language for RDF).
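
For anyone who would rather see the inference spelled out than take the
reasoner on trust, here is a minimal sketch, in Python, of what the
query amounts to: following rdfs:subClassOf links transitively from the
USNO-B 'ra' column.  It is a toy, not how the reasoning service is
implemented.  The function name `superclasses` and the hard-coded
statement list are made up for illustration, and treating the UCD1 to
UCD1+ mapping as a plain subclass statement is the same simplification
made above.

```python
# Toy illustration only: a transitive closure over rdfs:subClassOf.
# This is NOT the reasoner's implementation; names here are hypothetical.

SUBCLASS_OF = [
    # from the VODataService column description (ucd = POS_EQ_RA_MAIN)
    ("ivo://roe.ac.uk/DSA_USNOB/TDB#ra",
     "http://cdsweb.u-strasbg.fr/UCD/old#POS_EQ_RA_MAIN"),
    # from the UCD1 to UCD1+ mapping, treated as a subclass statement
    ("http://cdsweb.u-strasbg.fr/UCD/old#POS_EQ_RA_MAIN",
     "http://cdsweb.u-strasbg.fr/UCD/words#pos.eq.ra;meta.main"),
    # the extra statement we added ourselves
    ("ivo://roe.ac.uk/DSA_USNOB/TDB#ra",
     "http://example.edu/utypes#ra"),
]


def superclasses(cls, statements):
    """Return every class reachable from cls by following subClassOf."""
    found = set()
    todo = [cls]
    while todo:
        current = todo.pop()
        for sub, sup in statements:
            if sub == current and sup not in found:
                found.add(sup)
                todo.append(sup)
    return found


for t in sorted(superclasses("ivo://roe.ac.uk/DSA_USNOB/TDB#ra", SUBCLASS_OF)):
    print(t)
```

A real RDFS reasoner will typically report a little more than this toy
does: every class counts as a subclass of itself and of rdfs:Resource,
so those can turn up in the answer as well.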

So we POST the query to the reasoning service:

    % curl --data-binary @query.rq \
        --header 'Accept: text/csv' \
        --header 'Content-Type: application/sparql-query' \
        http://localhost:8080/quaestor/kb/ucd
    t
    http://example.edu/utypes#ra
    http://cdsweb.u-strasbg.fr/UCD/old#POS_EQ_RA_MAIN
    ivo://roe.ac.uk/DSA_USNOB/TDB#ra
    http://www.w3.org/2000/01/rdf-schema#Resource
    http://cdsweb.u-strasbg.fr/UCD/words#pos.eq.ra;meta.main
    %

(obviously, you could dereference that URL from any code, and if
you're prepared to URL-encode the query, you can GET it as well).

So, that gives you a list of all the things that the USNO-B 'ra'
column is a subclass of.  Our software has presumably been written so
that it already knows what a <http://example.edu/utypes#ra> is (that's
why we added the extra mapping information); but if not, it'll know
what the pos.eq.ra;meta.main UCD is.

Thus, we've gathered together information from a variety of loosely
cooperating sources:

  * the ROE folk declared that the USNO 'ra' column was a particular
    old-style UCD, but they haven't updated it;

  * there's a fixed mapping of old-style to new-style UCDs;

  * you added the mapping to <http://example.edu/utypes#ra> yourself,
    for your own purposes.  Perhaps you had to work it out from
    hard-to-find documentation, or perhaps the example.edu namespace
    is a discipline-specific standard, or an IVOA one.

Then we queried it with a very simple expression, getting output from
which it's easy to extract the information we want.

It means all the various actors here can remain fairly loosely
coupled, and the software reading this can operate at whatever level
of generality it needs to.

The link to units (getting back to that, David) is that when
assembling and using this information, I really couldn't see a place
for the units information which is in the VODataService element above.
The statement "USNO-B's 'ra' column is a type of pos.eq.ra" is true
independently of units.
Once I've established just what this USNO-B column is supposed to be
(aha, an RA!), then I'm going to have to discover what units the data
there has, in order to actually read it.

So yes, a complete description of the numbers in that column requires
unit information, but that description can be usefully
decomposed/factored into orthogonal components, namely the semantic
information (which I'm taking to mean "column 'ra' is a pos.eq.ra")
and the unit information.  That's not a principled factorisation, but
a practical one.

So, what does all that sound like?

See you,

Norman

-- 
----------------------------------------------------------------------------
Norman Gray  /  http://nxg.me.uk
eurovotech.org  /  University of Leicester, UK