On Tue, 1 Aug 2006, Norman Gray wrote:
> > You've got to decide whether you're looking
> > at UCD1 or UCD1+, attempt to make sense of what a load of words
> > separated by semicolons mean, decide whether, say, phot.mag.reddFree
> > is an acceptable stand-in for phot.mag, think about whether you
> > need to perform unit conversions for the quantity that you've
> > identified to mean what you think it means...
and I forgot to add: what do you do if there are multiple columns which
have the UCD you're looking for?
> > Worst of all, you can't rely on the UCDs being there, so if you
> > really care about where to find RA and Dec, say, you're still
> > realistically likely to be checking column names etc.
> > IMHO UCDs sound like a good idea, but do not in fact provide
> > machine-readable semantics in any very useful sense.
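
To make the steps in that complaint concrete, here is a minimal Python sketch of what "find the column with UCD phot.mag" involves. It is an illustration only: the function names and the sample column list are hypothetical, it assumes UCD1+ style strings (words separated by semicolons, atoms by dots), and it treats the first word as the primary one. It makes no attempt at UCD1 or at unit conversion.

```python
def ucd_words(ucd):
    """Split a UCD1+ style string into lower-cased words (';'-separated)."""
    return [w.strip().lower() for w in ucd.split(";") if w.strip()]


def word_matches(word, wanted):
    """True if word equals wanted or is a dot-separated refinement of it.

    Whether a refinement such as phot.mag.reddFree is an acceptable
    stand-in for phot.mag is a policy choice; this sketch simply says yes.
    """
    wanted = wanted.lower()
    return word == wanted or word.startswith(wanted + ".")


def candidate_columns(columns, wanted):
    """Return names of columns whose first (primary) UCD word matches.

    columns is a list of (name, ucd) pairs; ucd may be None when the
    table supplies no UCD for that column.
    """
    hits = []
    for name, ucd in columns:
        words = ucd_words(ucd or "")
        if words and word_matches(words[0], wanted):
            hits.append(name)
    return hits


# Hypothetical table metadata, invented for this example.
columns = [
    ("Jmag", "phot.mag;em.IR.J"),
    ("Jmag0", "phot.mag.reddFree;em.IR.J"),
    ("e_Jmag", "stat.error;phot.mag;em.IR.J"),
    ("RAJ2000", None),  # no UCD at all
]

hits = candidate_columns(columns, "phot.mag")
print(hits)
```

With this sample data the search returns two candidate columns and cannot say which one the user wants, and the column with no UCD is invisible to it, so a fallback to column names is still needed.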
>
> I will pin this user-story above my desk!
Glad to have been of service. You may at your option add a footnote
to the effect that it comes from a curmudgeonly old sceptic who
would have voted firmly against the adoption of agriculture on
the grounds that it sounded far too complicated ever to work.
> > Utypes are easier to work with, but I don't think(?) there are the
> > public data models to derive them from. If you're communicating
> > with yourself by reading utypes which you've just written,
> > referencing your own data model, it may make sense to use them,
> > but until/unless that data model is taken up more widely
> > it doesn't really gain you much over just knowing the J mag is
> > in column 4 because that's where you always write it.
>
> You don't really need a big public data model (argh, that phrase
> again) to get value from UTYPEs.
>
> If you add UTYPEs to your published data, then you have at least
> documented what you intend that data to mean, via a dereferenceable
> URL, whether or not any of the stuff I talk about below ever actually
> happens.
I think that's true, and makes utypes worth using, but we're still
talking about semantic value which is only accessible to humans.
> If you or someone else declares that jach:jmag is a slightly more
> specific type of iau:j-magnitude, and that both are associated with
> UCD phot.mag.j:
>
>     jach:jmag a iau:j-magnitude.
>     iau:j-magnitude a [ hasUcd "phot.mag.j" ].
>
> (perhaps at some well-known or easily-guessable URL in the archive
> you're taking the data from), then it becomes possible for an
> application to ask `here is a list of the UTYPEs I've got: are any
> like phot.mag?', and `what UCD is UTYPE <X> most like?'.
>
> Possible, that is, once I've written the service that does it. This
> would not be hard to do (a couple of weeks' work, I'd think). But
> would that be useful to you? Are those the sort of questions you'd
> want to ask?
Well in principle it might be some use, but it's not going to be that
much help until one can have a good chance of getting a reliable
answer; that might be the case in a restricted domain but for, e.g.,
a generic table analysis package it's unlikely to provide reliable
results any time soon for more than a small proportion of the
data that it comes across. And it still falls foul of several
of the problems I noted about UCDs. My feeling is that most of
the questions to which UCDs/utypes appear to provide an answer
are ones which actually require a human in the loop. For example,
there may well be no correct answer to "are any of these utypes
like phot.mag?", even given a well-defined state of a particular
data processing system, because it depends on the kind of analysis
that the scientist using the software has got in mind at the time.
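
For reference, the two questions quoted above ("are any of these UTYPEs like phot.mag?" and "what UCD is UTYPE X most like?") could be answered mechanically along the following lines. This is a sketch under stated assumptions, not the proposed service: the two dictionaries stand in for published declarations such as the jach:jmag example, the function names are hypothetical, and "like" is assumed to mean "equal to, or a dot-separated refinement of".

```python
# Hypothetical declarations, standing in for statements like
#   jach:jmag a iau:j-magnitude.
#   iau:j-magnitude a [ hasUcd "phot.mag.j" ].
BROADER = {"jach:jmag": "iau:j-magnitude"}
HAS_UCD = {"iau:j-magnitude": "phot.mag.j"}


def ucd_for(utype):
    """Walk up the 'is a kind of' chain until a UCD is found, else None."""
    seen = set()
    while utype is not None and utype not in seen:
        if utype in HAS_UCD:
            return HAS_UCD[utype]
        seen.add(utype)  # guard against cyclic declarations
        utype = BROADER.get(utype)
    return None


def utypes_like(utypes, ucd):
    """Return the UTYPEs whose associated UCD equals or refines ucd."""
    result = []
    for utype in utypes:
        found = ucd_for(utype)
        if found is not None and (found == ucd or found.startswith(ucd + ".")):
            result.append(utype)
    return result


# jach:exptime is an invented UTYPE with no declarations at all.
matches = utypes_like(["jach:jmag", "jach:exptime"], "phot.mag")
print(ucd_for("jach:jmag"), matches)
```

The lookup itself is simple. What it cannot decide is whether "refines" is the right notion of "like" for a given analysis, which is the part that needs the human.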
Mark
--
Mark Taylor Astronomical Programmer Physics, Bristol University, UK
+44-117-928-8776 http://www.star.bris.ac.uk/~mbt/