On Tue, 1 Aug 2006, Norman Gray wrote:

> >   You've got to decide whether you're looking
> > at UCD1 or UCD1+, attempt to make sense of what a load of words
> > separated by semicolons mean, decide whether, say, phot.mag.reddFree
> > is an acceptable stand-in for phot.mag, think about whether you
> > need to perform unit conversions for the quantity that you've
> > identified to mean what you think it means...

and I forgot to add: what do you do if there are multiple columns which 
have the UCD you're looking for?
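
To make the ambiguity concrete, here is a minimal Python sketch of a UCD1+ column lookup. The column list, the function name and the rule that any dotted refinement of the wanted word (such as phot.mag.reddFree for phot.mag) counts as a match are all assumptions made for illustration; whether that rule is acceptable is exactly the kind of decision discussed above.

```python
def columns_matching_ucd(columns, wanted):
    """Return the names of columns whose primary UCD word is `wanted`
    or a dotted refinement of it (an assumed matching rule)."""
    wanted = wanted.lower()
    matches = []
    for name, ucd in columns:
        if not ucd:
            continue  # the UCD may simply be absent
        # In UCD1+ the words are separated by semicolons; the first is primary.
        primary = ucd.split(";")[0].strip().lower()
        if primary == wanted or primary.startswith(wanted + "."):
            matches.append(name)
    return matches


# Hypothetical table metadata: (column name, UCD or None).
columns = [
    ("RA", "pos.eq.ra;meta.main"),
    ("Jmag", "phot.mag;em.IR.J"),
    ("Kmag", "phot.mag;em.IR.K"),
    ("J0", "phot.mag.reddFree;em.IR.J"),
    ("flag", None),
]

print(columns_matching_ucd(columns, "phot.mag"))  # ['Jmag', 'Kmag', 'J0']
```

Three columns come back for a single request, and nothing in the UCDs themselves says which one the caller wanted.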

> > Worst of all, you can't rely on the UCDs being there, so if you
> > really care about where to find RA and Dec, say, you're still
> > realistically likely to be checking column names etc.
> > IMHO UCDs sound like a good idea, but do not in fact provide
> > machine-readable semantics in any very useful sense.
> 
> I will pin this user-story above my desk!

Glad to have been of service.  You may at your option add a footnote
to the effect that it comes from a curmudgeonly old sceptic who
would have voted firmly against the adoption of agriculture on
the grounds that it sounded far too complicated ever to work.

> > Utypes are easier to work with, but I don't think(?) there are the
> > public data models to derive them from.  If you're communicating
> > with yourself by reading utypes which you've just written,
> > referencing your own data model, it may make sense to use them,
> > but until/unless that data model is taken up more widely
> > it doesn't really gain you much over just knowing the J mag is
> > in column 4 because that's where you always write it.
> 
> You don't really need a big public data model (argh, that phrase  
> again) to get value from UTYPEs.
> 
> If you add UTYPEs to your published data, then you have at least  
> documented what you intend that data to mean, via a dereferenceable  
> URL, whether or not any of the stuff I talk about below ever actually  
> happens.

I think that's true, and makes utypes worth using, but we're still 
talking about semantic value which is only accessible to humans.

> If you or someone else declares that jach:jmag is a slightly more  
> specific type of iau:j-magnitude, and that both are associated with  
> UCD phot.mag.j:
> 
>      jach:jmag a iau:j-magnitude.
>      iau:j-magnitude a [ hasUcd "phot.mag.j" ].
> 
> (perhaps at some well-known or easily-guessable URL in the archive  
> you're taking the data from), then it becomes possible for an  
> application to ask `here is a list of the UTYPEs I've got: are any  
> like phot.mag?', and `what UCD is UTYPE <X> most like?'.
> 
> Possible, that is, once I've written the service that does it.  This  
> would not be hard to do (a couple of weeks' work, I'd think).  But  
> would that be useful to you?  Are those the sort of questions you'd  
> want to ask?

Well, in principle it might be some use, but it's not going to be that
much help until one can have a good chance of getting a reliable
answer; that might be the case in a restricted domain but for, e.g., 
a generic table analysis package it's unlikely to provide reliable 
results any time soon for more than a small proportion of the
data that it comes across.  And it still falls foul of several
of the problems I noted about UCDs.  My feeling is that most of
the questions to which UCDs/utypes appear to provide an answer
are ones which actually require a human in the loop.  For example,
there may well be no correct answer to "are any of these utypes
like phot.mag?", even given a well-defined state of a particular
data processing system, because it depends on the kind of analysis
that the scientist using the software has got in mind at the time.
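
For reference, the mechanical part of such a query is small. The Python sketch below is not the proposed service: it assumes the two quoted declarations have already been loaded into plain dictionaries, and the names IS_A, HAS_UCD, ucd_for_utype, utypes_like and the utype jach:exptime are hypothetical.

```python
# Assumed in-memory form of the two declarations quoted above.
IS_A = {"jach:jmag": "iau:j-magnitude"}
HAS_UCD = {"iau:j-magnitude": "phot.mag.j"}


def ucd_for_utype(utype):
    """Follow the is-a links from `utype` until a UCD is declared, else None."""
    seen = set()
    while utype is not None and utype not in seen:
        seen.add(utype)  # guard against cyclic declarations
        if utype in HAS_UCD:
            return HAS_UCD[utype]
        utype = IS_A.get(utype)
    return None


def utypes_like(utypes, ucd):
    """Return the utypes whose declared UCD is `ucd` or a dotted refinement of it."""
    result = []
    for utype in utypes:
        found = ucd_for_utype(utype)
        if found is not None and (found == ucd or found.startswith(ucd + ".")):
            result.append(utype)
    return result


print(utypes_like(["jach:jmag", "jach:exptime"], "phot.mag"))  # ['jach:jmag']
```

The lookup answers only what has been declared; whether jach:jmag is "like" phot.mag for a given analysis is still a judgement the code cannot make.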

Mark

-- 
Mark Taylor   Astronomical Programmer   Physics, Bristol University, UK
+44-117-928-8776 http://www.star.bris.ac.uk/~mbt/