On Fri, 23 Jan 2004, David Berry wrote:
> I am ashamed to say I bottled out of the recent VOTable deluge at a fairly
> early stage. Is anyone able to give an "executive summary" of it all? I
good move!
Briefly, Tony Linde is unhappy about blanket use of the VOTable format
because it's not XMLish enough. As far as I can see, the main
thing he doesn't like is that if you've got a document like:
<TABLE>
<FIELD name="Object name" ID="NAME" datatype="char" arraysize="*"/>
<FIELD name="V Magnitude" ID="VMAG" datatype="double"/>
<DATA>
<TABLEDATA>
<TR><TD>M31</TD><TD>3.4</TD></TR>
<TR><TD>Fomalhaut</TD><TD>1.23</TD></TR>
</TABLEDATA>
</DATA>
</TABLE>
then you can't write a very good schema for it, since there's nothing
much you can say about what's in a TD element - some TDs will contain
numbers, some will contain strings. This means you can't use XML binding
to generate automatic code for parsing such documents; instead someone
would have to (horror!) fire up an editor and write some actual
source code. I get the feeling there are other issues about storing
such documents in XML databases, and something to do with the registry
that this impacts on as well, though I'm not familiar enough with
these things to know what they are.
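To illustrate the kind of hand-written parsing meant here, a minimal
Python sketch follows. It is not from the list discussion: the
CONVERTERS mapping and the read_tabledata function are hypothetical
names, only the two datatypes in the example are handled, and the
namespaces and enclosing elements of a real VOTable are ignored. The
point is just that the type of each TD has to be looked up from the
FIELD declarations at run time, since the element itself says nothing.

```python
import xml.etree.ElementTree as ET

VOTABLE = """<TABLE>
<FIELD name="Object name" ID="NAME" datatype="char" arraysize="*"/>
<FIELD name="V Magnitude" ID="VMAG" datatype="double"/>
<DATA>
<TABLEDATA>
<TR><TD>M31</TD><TD>3.4</TD></TR>
<TR><TD>Fomalhaut</TD><TD>1.23</TD></TR>
</TABLEDATA>
</DATA>
</TABLE>"""

# Hypothetical mapping from VOTable datatype names to Python converters;
# only the two types used in the example above are covered.
CONVERTERS = {"char": str, "double": float}


def read_tabledata(xml_text):
    """Return one dict per TR, keyed by FIELD ID, with typed values."""
    table = ET.fromstring(xml_text)
    fields = table.findall("FIELD")
    ids = [f.get("ID") for f in fields]
    converters = [CONVERTERS[f.get("datatype")] for f in fields]
    rows = []
    for tr in table.iter("TR"):
        cells = [td.text or "" for td in tr.findall("TD")]
        # The column position, not the element name, decides the type.
        rows.append({i: conv(text)
                     for i, conv, text in zip(ids, converters, cells)})
    return rows


rows = read_tabledata(VOTABLE)
```

A schema-driven binding tool cannot derive the equivalent of CONVERTERS
from the document structure alone, because every cell is just a TD.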
Cue massive outpouring by anyone and everyone of their
not-uniformly-well-informed pet gripes about VOTable.
A fair bit of this seemed to revolve around different understandings
of the term "metadata" by computer scientists and astronomers
(this fact was pointed out early on by Clive Page and rediscovered
by various other people at subsequent points in the debate).
To the Tony Lindes of this world, metadata characterises what you
can store in a column (basically data type, array size/shape,
limits on legal values - the sort of thing that goes to make up
a schema). To astronomers, it includes a lot of semantic information
such as UCDs, co-ordinate systems, instrument characteristics etc etc.
This settled down after a day or two into a concrete suggestion by
Roy Williams of an alternative table representation (codenamed "V2")
which would encode the data from above like this:
<TABLE>
<FIELD ID="NAME" name="Object name" datatype="char" arraysize="*"/>
<FIELD ID="VMAG" name="V Magnitude" datatype="double"/>
<DATA>
<XMLDATA>
<VORecord> <NAME>M31</NAME> <VMAG>3.4</VMAG> </VORecord>
<VORecord> <NAME>Fomalhaut</NAME> <VMAG>1.23</VMAG> </VORecord>
</XMLDATA>
</DATA>
</TABLE>
.. actually there's some namespace stuff in there too and a reference
to a (potentially auto-generated) schema which can say useful
things about what constitutes a VMAG and a NAME element etc.
The intention is that (at least initially) this would co-exist with
VOTable; lossless conversions would probably be possible in both
directions. Seems harmless enough to me, and it seems to satisfy
many of the contributors to the list. I can't quite tell if Tony
Linde is happy with it or not. I get the impression that Roy is
planning to do some additional work on this and present a concrete
proposal, though I don't know on what timescale.
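As a rough sketch of why conversion in both directions looks
straightforward, the Python below rewrites the TABLEDATA example as
XMLDATA/VORecord and back again. The function names are hypothetical,
the element names are taken from the two examples in this message, and
the namespace and schema reference that the real V2 suggestion carries
are left out, so "lossless" here only means that the cell text
survives the round trip.

```python
import copy
import xml.etree.ElementTree as ET

VOTABLE = """<TABLE>
<FIELD name="Object name" ID="NAME" datatype="char" arraysize="*"/>
<FIELD name="V Magnitude" ID="VMAG" datatype="double"/>
<DATA>
<TABLEDATA>
<TR><TD>M31</TD><TD>3.4</TD></TR>
<TR><TD>Fomalhaut</TD><TD>1.23</TD></TR>
</TABLEDATA>
</DATA>
</TABLE>"""


def tabledata_to_v2(table):
    """Return a copy of TABLE with TABLEDATA replaced by XMLDATA."""
    table = copy.deepcopy(table)
    ids = [f.get("ID") for f in table.findall("FIELD")]
    data = table.find("DATA")
    old = data.find("TABLEDATA")
    new = ET.Element("XMLDATA")
    for tr in old.findall("TR"):
        record = ET.SubElement(new, "VORecord")
        # Each TD becomes an element named after its FIELD ID.
        for field_id, td in zip(ids, tr.findall("TD")):
            ET.SubElement(record, field_id).text = td.text
    data.remove(old)
    data.append(new)
    return table


def v2_to_tabledata(table):
    """Return a copy of TABLE with XMLDATA replaced by TABLEDATA."""
    table = copy.deepcopy(table)
    ids = [f.get("ID") for f in table.findall("FIELD")]
    data = table.find("DATA")
    old = data.find("XMLDATA")
    new = ET.Element("TABLEDATA")
    for record in old.findall("VORecord"):
        tr = ET.SubElement(new, "TR")
        # FIELD order fixes the column order of the TDs.
        for field_id in ids:
            ET.SubElement(tr, "TD").text = record.findtext(field_id)
    data.remove(old)
    data.append(new)
    return table


src = ET.fromstring(VOTABLE)
v2 = tabledata_to_v2(src)
back = v2_to_tabledata(v2)
```

Both directions rely only on the FIELD IDs and their order, which is
the information the two representations share.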
There were various other paths followed in the discussion which generated
a number of messages, but my guess is that they will probably not lead
any further in the foreseeable future.
My personal view is that TL's worries about the insufficient XMLishness
of the current VOTable format are misplaced - I believe that to do
useful things with a VOTable you're better off converting it
into something that looks like a table rather than use generic XML
tools on it. As I say though, I don't understand the registry-related
issues, so there may be strong arguments against this position.
Implications for our software:
Me:
If V2 does surface, I should be able to write I/O handlers for it
and slot it into my infrastructure quite easily. In fact in
response to a blanket request from Roy Williams I have volunteered
to come up with an implementation if a concrete proposal
is put forward (I don't think anyone else has).
In a way, it would be rather nice if this does go ahead, since
it would vindicate the format-independent nature of the tables
library I've written.
David:
Nothing much got said about COOSYS, so I don't think the position
on coordinate systems has changed any from what is already in
VOTable.
Alasdair:
Alasdair tried at various points to kickstart a debate about how to
represent time series and spectra in a tabular format, but I don't
think it got taken up very much. I must admit, I don't really
understand why VOTable as it stands is not adequate for this -
though most VOTables seem to be used for object catalogues there
is little (only the COOSYS element) in the VOTable specification
which is specific to that domain of data. Insofar as time series
are tabular data, I'd have thought that with appropriate use of UCDs
they would fit happily in VOTables. Al, feel free to shove your
oar in here and enlighten me.
Anyone else??
I don't claim to have captured every last nuance of the debate here -
by all means go back and pore over the original text to form
your own conclusions.
Mark
--
Mark Taylor Starlink Programmer Physics, Bristol University, UK
+44-117-928-8776 http://www.star.bris.ac.uk/~mbt/