Chris
Although I can't comment on the formats you list, I commend what you are doing and hope others will help in this initiative.
From my experience in the MRD world (which is pretty limited at the moment) it seems a 'no brainer' that we need an authoritative catalogue of formats in order to stand any chance of being able to manage scientific data in a meaningful way. In the research information world, which I'm more familiar with, one of the persistent blockers to information use and reuse, particularly for aggregation and interoperability, is a lack of authoritative lists of such things as research organisations, funders, even an agreed list of output types, and so on.
Seems we have the same issue in the MRD world too!
Is there no organisation out there that can act as the authority for scientific data formats?
All the best
Anna
Anna Clements
Enterprise Architect
University of St Andrews
St Andrews, Fife, KY16 9AL
On 5 Nov 2012, at 11:16, "Chris Rusbridge" <[log in to unmask]> wrote:
> Last week I wrote to try to get some of you involved in Jason Scott's "Let's solve the file format problem" effort this November. I don't think I had much success, so I'm trying again. Having started this, my experience so far suggests that pretty much anyone who aims to support research data management could benefit from some involvement. Let me try to explain...
>
> Since last week I have identified and listed around a hundred or so scientific data formats. I'm sure the list is nowhere near complete; I could do with heads-up on further formats, or further sources (I've used DataOne, Wikipedia and the Library of Congress so far). The list is at http://justsolve.archiveteam.org/index.php/Scientific_Data_formats.
>
> I've also researched a small number of formats and written them up based on a simple template. Here's an example of a format I didn't know anything about but found interesting: http://justsolve.archiveteam.org/index.php/EAS3. Last night I was researching sdf, and found at least 4 scientific data formats of that acronym, of which two were called Simple Data Format but are quite different. There's an older one that appears to be in a similar arena to EAS3, and a newer one from the Data Protocols Team involving CSV and JSON that looks particularly interesting. I'm not equipped to work out if the older one was used much; it may need someone much more connected with that particular world for that.
>
> What I've learned is that trying to find out about a data format teaches you something interesting, and in your case (if you are supporting data management) probably relevant to your work. I've also learned that no single source has a comprehensive set of information on scientific data formats. Maybe Wikipedia would be a better choice for them, but there are notability and other requirements on Wikipedia that the "Just Solve" effort doesn't have. Anyway, it's what we've got right now.
>
> I'd really like to persuade you to join in. It would be great if Simon Hodson asked everyone involved in JISCMRD to research at least one format, or if Kevin Ashley asked the same of each member of the DCC. Ditto for UKDA, BADC, etc etc. It would be even better if I managed to inspire a few of you to get involved off your own bat!
>
> You can register to make changes to the wiki by sending a username and email address to [log in to unmask]. Attached is the template I'm currently using, which basically just asks for general and background information on the data format, software that processes it, sample files, identification information, and references. Please do join in and help.
>
> <Scientific data.doc>
>
>
> --
> Chris Rusbridge
> Mobile: +44 791 7423828
> Email: [log in to unmask]
> Adopt the email charter! http://emailcharter.org/