The following presentation is based on a survey about the quality of
chemical data on the Web, but it should have wider implications for
the electronic library.
==========================================================================
Ian Winship
Information Services Dept. | e-mail: [log in to unmask]
University of Northumbria at Newcastle | phone: 0191 227 4132
City Campus Library | fax: 0191 227 4563
Newcastle upon Tyne |
NE1 8ST |
UK |
===========================================================================
Data Needs of Academic Research on the Internet
Gary Wiggins
Indiana University Chemistry Library
Data on the Web
"All in all, the chemical data now available on
the web is in a different class from the data
found in refereed journals, critical reviews and
books from reputable publishers."
- David Lide (CHMINF-L, 30 October 1996)
The above comment was one of several received in reply to
questions sent to three chemistry-oriented discussion lists
in the fall of 1996. The questions were asked in preparation
for a lecture and demonstration delivered at the National
Institute of Standards and Technology on December 4, 1996.
Most of the information in this paper was included in that
presentation.
Questions were sent to CHMINF-L, CHEMWEB, & CHEMIND-L in
late October 1996. They were designed to:
- Gauge the extent of inaccurate data in Web databases
- Define the characteristics of data on the Web
>> Sources of data
>> Need for standardization of data formats
- Determine the best guides to data.
Respondents to the survey noted these problems with the
accuracy of data on the Web:
- Units are frequently omitted
- Transcription errors are often encountered
- These errors create a need to find redundant data for
cross-checking
- Very few sources have quality assurance statements
- Few of the Web data sites give the source of their data
- Those that do are likely to have copied the data from
outdated sources.
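The problems above suggest a simple cross-check when the same
quantity can be found at more than one site. The following is a
minimal Python sketch, not something taken from the survey: the
Reading record, the cross_check function, the 1% tolerance, and the
site names in the usage note are all hypothetical. It flags readings
with no unit and readings that disagree with the median of the
others that share the same unit.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Reading:
    source: str            # hypothetical site name
    value: float
    unit: Optional[str]    # None when the site omits the unit


def cross_check(readings: List[Reading], rel_tol: float = 0.01) -> List[str]:
    """Return warnings for a set of redundant readings of one quantity."""
    warnings: List[str] = []

    # Problem 1: units omitted.
    for r in readings:
        if not r.unit:
            warnings.append(f"{r.source}: unit omitted")

    # Only readings with the same stated unit are compared directly;
    # no unit conversion is attempted in this sketch.
    by_unit: Dict[str, List[Reading]] = {}
    for r in readings:
        if r.unit:
            by_unit.setdefault(r.unit, []).append(r)

    # Problem 2: possible transcription errors, detected as outliers
    # relative to the (upper) median of the group.
    for unit, group in by_unit.items():
        values = sorted(r.value for r in group)
        median = values[len(values) // 2]
        for r in group:
            if abs(r.value - median) > rel_tol * abs(median):
                warnings.append(
                    f"{r.source}: {r.value} {unit} disagrees with "
                    f"median {median} {unit}"
                )
    return warnings
```

For example, given three invented readings of the boiling point of
water in degC (100.0, 100.0, and a mistyped 10.0) plus a fourth with
no unit, cross_check reports the missing unit and the 10.0 outlier.
A group with only one or two readings gives little protection, which
is why redundant sources matter.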
Other Survey Results
Several people commented on efforts or practices that will
likely improve the quality of data on the Internet,
including:
- Standardization efforts:
>> CLIC, Chemical MIME, CML
>> Roles for IUPAC, CODATA: certification?
(One person, however, questioned whether standardization
efforts were worthwhile.)
- Efforts to share data or to cooperatively compile data
sources
>> Open Molecule Foundation
>> Molecule of the Month
>> Reciprocal Net
>> Structure and Reactivity Across the Periodic Table
- Provision of a minimal level of auxiliary information
(metadata)
>> authorship
>> units
>> conditions of measurement
>> references to primary and secondary sources of data
- Use of standard symbols and terminology
- Guidelines on how to handle special characters.
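The "minimal level of auxiliary information" listed above can be
pictured as a small record attached to every data value. The sketch
below is only one possible layout, assumed for illustration: the
DataPoint class, its field names, and the missing_metadata helper
are hypothetical and do not correspond to CLIC, Chemical MIME, CML,
or any other standard named by respondents.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataPoint:
    quantity: str                 # what was measured
    value: float
    units: str = ""               # e.g. "degC"
    authorship: str = ""          # who compiled or measured the value
    conditions: str = ""          # conditions of measurement
    references: List[str] = field(default_factory=list)  # primary/secondary sources


def missing_metadata(point: DataPoint) -> List[str]:
    """List the auxiliary fields that are empty for this data point."""
    required = ("authorship", "units", "conditions", "references")
    return [name for name in required if not getattr(point, name)]
```

A value published with only its units, for instance
DataPoint("boiling point", 100.0, units="degC"), would be reported
as lacking authorship, conditions, and references.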
General Comments on Data on the Web
"While some might argue that the Internet is designed to make
information in a single location accessible to users around the
world, the large number of mirrored sites already in existence
points out the Net's inadequacy."
- Byte, December 1996
Several steps are needed to improve the quality of data found
on the Web. Among them are:
- Mechanisms to synchronize changes made at multiple
sites
- Faster access to resources
- More secure transactions
- Progress on chemical metadata standards
- Interoperability of chemical plug-in programs.
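As a rough illustration of the first step, a site could at least
detect when a mirror has fallen out of step with the master copy by
comparing content digests. The Python sketch below assumes the file
contents have already been fetched as bytes; the function names and
mirror names are hypothetical, and it only detects stale copies
rather than synchronizing them.

```python
import hashlib
from typing import Dict, List


def digest(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    return hashlib.sha256(content).hexdigest()


def stale_mirrors(master: bytes, mirrors: Dict[str, bytes]) -> List[str]:
    """Return the names of mirrors whose content differs from the master."""
    expected = digest(master)
    return sorted(
        name for name, content in mirrors.items()
        if digest(content) != expected
    )
```

Calling stale_mirrors with the master file and a dictionary of
mirror contents returns the names of the mirrors that need to be
refreshed.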
Some Goals for Improving Data on the Web
- Assemble the most reliable data available
- Arrange data for easy retrieval
- Provide a "SuperIndex" of available data sources
- Establish criteria for evaluation of data sources:
>> descriptions of physical theories on which data are
based
>> full references to literature
>> format of the database
>> search capabilities
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%