On Sat, 23 Nov 2002, Subbiah Arunachalam wrote:
> Why is it that Open Archives/ E-prints works well in
> some fields (physics, astronomy, computer science) and
> not in other fields (say, agriculture)? I would like
> to hear from members of the list.
Others are invited to reply too. Here is my own candidate explanation:
(1) It is not that physics or astronomy or computer science are
different from other fields with regard to the benefits or feasibility
of self-archiving and open access in their fields. All fields can benefit
from it and it is feasible in all fields. There are reasons, however,
why self-archiving BEGAN in physics/astronomy, and why it came early in
computer science too.
(2) Self-archiving began in physics (and soon generalized to astronomy)
because physics already had, in paper days, a "preprint culture."
Physicists had already learned, well before the online era, that they
could accelerate the pace and interactivity of research if they did not
wait till published versions of papers appeared in print. Especially in
high-energy physics, they adopted the practice of mailing preprints of
their work to one another, to routing lists, and to a number of central
depositories.
(3) This practice simply generalized, in the beginning of the '90s,
quite naturally, as the technology became available, first to email
routing lists, and then to a web depository. Given the existing preprint
culture, this subsequent development requires no special explanation.
The physicists were smarter than the rest of us in having already
discovered the benefits to research progress of sharing preprints as
early as possible. They would have had to be rather thick to keep doing
that in paper once email and the web were available!
(4) The practice of self-archiving immediately began to spread to other
areas of physics and allied fields (astronomy, mathematics), but it is
important to note that from the very beginning in August 1991 to the
present day, over a decade later, that growth has been merely linear;
the current rate is 3500 deposits per month.
http://arxiv.org/show_monthly_submissions
(5) At that linear growth rate, it would take another 10 years (until
2012) before everything published in physics in that year was being
self-archived.
Physics/astronomy/maths are still ahead of all disciplines, but their
lead is not dramatic enough, and another decade would be far, far too
long a wait. What is needed is something that will not only (i) accelerate
self-archiving in those fields to a curvilinear upward growth-rate
that will capture their total current research output much sooner, but
also something that will (ii) universalize the practice of self-archiving
to all the other disciplines, and capture their full research output
too (currently about 2,000,000 articles per year, appearing in the
approximately 20,000 peer-reviewed journals in all disciplines and
languages worldwide).
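
To make the arithmetic of "merely linear" growth concrete, here is a rough
sketch in Python. It assumes (these are illustrative assumptions, not
figures from the archive itself) that the monthly deposit rate grew in a
straight line from zero in August 1991 to 3500 per month in November 2002,
i.e. over 135 months. The function names are hypothetical. The sketch does
not reproduce the 10-year estimate above, which also depends on the total
annual output of physics, a number not given here.

```python
# Illustrative sketch only: assumes the monthly deposit rate rose in a
# straight line from zero (August 1991) to its current value.

MONTHS_SINCE_START = 135  # August 1991 to November 2002 (assumption)
CURRENT_RATE = 3500       # deposits per month, as quoted in the post


def monthly_slope(current_rate, months_elapsed):
    """Increase in the monthly deposit rate per month of elapsed time."""
    return current_rate / months_elapsed


def projected_rate(current_rate, months_elapsed, months_ahead):
    """Monthly deposit rate after months_ahead more months of linear growth."""
    slope = monthly_slope(current_rate, months_elapsed)
    return current_rate + slope * months_ahead


def years_to_reach(current_rate, months_elapsed, target_rate):
    """Years of further linear growth needed to reach target_rate."""
    slope = monthly_slope(current_rate, months_elapsed)
    return (target_rate - current_rate) / slope / 12


if __name__ == "__main__":
    in_ten_years = projected_rate(CURRENT_RATE, MONTHS_SINCE_START, 120)
    print(f"Projected rate in 10 years: {in_ten_years:.0f} deposits/month")
    doubling = years_to_reach(CURRENT_RATE, MONTHS_SINCE_START, 2 * CURRENT_RATE)
    print(f"Years needed to double the current rate: {doubling:.2f}")
```

Under these assumptions, merely doubling the current monthly rate would
take as long again as the archive has already existed, which is the sense
in which linear growth is too slow.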
(6) My own hypothesis is that distributed, institutional self-archiving
will be the critical factor that will induce this acceleration and
universalization of self-archiving, as centralized, discipline-based
self-archiving alone has so far failed to do.
(7) The reason is that the rationale for institutional self-archiving
makes the benefits of open access explicit for all researchers.
Researchers and their own institutions (not their disciplines) are the
co-beneficiaries of the maximized research visibility, accessibility,
usage, citation and impact that are provided by maximizing research
access (i.e., universal, open access) through self-archiving. It is
researchers and their institutions whose research output and research
impact, and the indirect rewards that they bring -- in the form of
research funding, income and standing, prizes and prestige -- benefit
from open access.
(8) In addition, it is research institutions that have the motivation to
try to relieve their serials subscription/license crises by doing whatever
they can to promote open access through self-archiving: Distributed
self-archiving is reciprocal.
(9) And the motivation for institutional reciprocity in self-archiving
is not just based on (a) the potential to maximize the impact of
institutional research output, nor on the possibility of eventually
(b) relieving their serials budget burdens. Access itself -- (c) access
to the peer-reviewed research output of all other universities -- can
only enhance their own researchers' productivity, for in the current
toll-access system no institution, not even the biggest or wealthiest,
can afford to provide access to anywhere near the total peer-reviewed
research literature for its researchers.
(10) The fourth reason that distributed institutional self-archiving may
well prove to be the way to accelerate and universalize open access is
that (d) internal and external research assessment (to reward researchers
for their past contributions and to fund their future contributions
http://www.hero.ac.uk/rae/ ) also promises to be greatly strengthened
through the creation of a global, open-access digital database of total
institutional research output, accessible for the many new scientometric
assessment tools that are being and will be created to analyze and monitor
research productivity and impact (e.g., http://citebase.eprints.org)
on this rich new resource. This cause/effect loop, and the means to
monitor and measure it, will not be lost for long on either university
administrations or research funders.
(11) I have still to reply about computer science: This is another sort
of special case. The content of computer science, as a discipline,
is by its nature closest to the medium of self-archiving, namely,
computers, digital data, and networks themselves. It was only natural that
computer-scientists should create and store their digital research output
on the Net, and they did so, in huge numbers -- greater even than those
of physics and the other head-start disciplines. But they stored them
on their home websites or departmental tech-report pages, rather
than in a centralized computer science archive like ArXiv. (There is a
computer-science sector in ArXiv too, but it is still one of the smaller
sectors and growing no faster than the others.)
(12) The brilliant (but also quite natural) strategy of NEC's Steve
Lawrence, Lee Giles and Kurt Bollacker had then been to try to *harvest*
all of the anarchically self-archived computer science papers distributed
all over the web (and this was before the days of OAI-interoperability --
http://www.openarchives.org -- and OAI-compliant institutional Eprints
Archives -- http://www.eprints.org -- which have made harvesting so much
easier). The result, ResearchIndex -- http://citeseer.nj.nec.com/cs --
was (and still is!) the biggest open-access archive of them all, with
over twice as many computer science papers (currently 500,000) as those of
all the disciplines in the Physics ArXiv (currently 200,000) put together.
But ResearchIndex is a "virtual" archive, not a centralized one at all;
it is a Google-style selective harvest from distributed websites. Lawrence
et al. also demonstrated the power of such a virtual database to provide
rich new citation-based scientometric measures.
(13) All these currents are currently converging. The Physics ArXiv is
OAI-compliant, as are all the distributed institutional Eprint Archives,
so they can all be harvested and navigated seamlessly as if they were
all one global archive. The computer science archive has announced that
it will shortly become OAI-compliant too. So there is no longer any
difference between central and distributed archiving. Universities
worldwide are becoming increasingly aware of the causal connections
between research access and research impact, and their implications for
research productivity and funding, and are moving towards self-archiving
their institutional research output and the reciprocal benefits it
confers.
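
As a rough illustration of what "harvested as if they were all one global
archive" means in practice, here is a minimal Python sketch of the
OAI-PMH pattern: every compliant archive answers the same ListRecords
request, so records from a central archive and from an institutional one
can be merged with the same code. The archive names, base URL, identifiers
and sample responses below are hypothetical placeholders; the sketch makes
no network calls and ignores details such as resumption tokens and error
responses.

```python
# Minimal OAI-PMH sketch. All archive names, URLs and sample records are
# hypothetical; only the verb, metadata prefix and XML namespaces come
# from the OAI-PMH 2.0 protocol.
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

OAI_NS = "{http://www.openarchives.org/OAI/2.0/}"
DC_NS = "{http://purl.org/dc/elements/1.1/}"


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build the ListRecords request that any OAI-compliant archive accepts."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return f"{base_url}?{query}"


def parse_records(xml_text, archive_name):
    """Extract identifier and title from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iter(OAI_NS + "record"):
        identifier = record.findtext(f"{OAI_NS}header/{OAI_NS}identifier")
        title = record.findtext(f".//{DC_NS}title")
        records.append(
            {"archive": archive_name, "identifier": identifier, "title": title}
        )
    return records


def merge_harvests(responses):
    """Merge responses from many archives into one flat list of records."""
    merged = []
    for archive_name, xml_text in responses.items():
        merged.extend(parse_records(xml_text, archive_name))
    return merged


def sample_response(identifier, title):
    """Hypothetical one-record response, standing in for a real archive."""
    return f"""<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>{identifier}</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>{title}</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


if __name__ == "__main__":
    print(list_records_url("http://archive.example.org/oai"))
    responses = {
        "central-archive": sample_response("oai:central.example:1", "Paper A"),
        "university-archive": sample_response("oai:univ.example:7", "Paper B"),
    }
    for rec in merge_harvests(responses):
        print(rec["archive"], rec["identifier"], rec["title"])
```

The point of the design is that the harvester never needs to know whether
a given archive is central or institutional; it only needs its base URL.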
(14) But it is all still happening far too slowly! We need not, and
should not, wait another decade to reap the immense benefits of open
access to the planet's research output.
(15) For ideas about what researchers, their institutions, and their
research funders can do to hasten us all along the road to the optimal
and inevitable, see:
http://www.eprints.org/self-faq/#researcher/authors-do
http://www.eprints.org/self-faq/#institution-facilitate-filling
http://www.eprints.org/self-faq/#research-funders-do
Replies to Arun's question are invited from others too!
Stevan Harnad