Hi Simeon
Thanks for sharing these very interesting results from arXiv. It's
really useful to see some data about how many versions of papers a
large-scale mature repository is likely to contain.
The fact that 31% of the submissions in arXiv have more than one version
(albeit only 2 or 3 in most cases of multiple versions) suggests that
multiple versions and revisions are a fact of life for repositories,
especially when deposit begins with the preprint, and that revisions
will need managing in institutional repositories as well as in subject
repositories.
In arXiv, where multiple versions are very well managed, the pattern of
accesses does suggest that the vast majority of readers are being
directed quickly and easily to the latest version without being troubled
by undifferentiated earlier versions. Time taken to look at different
versions (to identify the latest one) is a matter of concern for
researchers, judging by the VERSIONS Project survey responses.
Equally, the 0.3% of accesses made to specific (earlier) versions in
arXiv in one week in June (1380 accesses) suggest that the record
structure is nevertheless able to accommodate the needs of the small
minority of readers who have a specific reason to seek out earlier
versions.
Best wishes
Frances
-----Original Message-----
From: Repositories discussion list
[mailto:[log in to unmask]] On Behalf Of Simeon Warner
Sent: 16 June 2006 16:24
To: [log in to unmask]
Subject: Re: Subject based repositories
Hi Frances and friends,
Discussion of versions of articles prompted me to look at arXiv to see
how many submissions have multiple versions *within* arXiv. This may be a
little tangential to the discussion but I thought I'd share the results
in case it is interesting:
We have stored all versions submitted since October 1997. Since then
there have been about 307k submissions, of which 96k (31%) have more
than one version [* breakdown of versions is appended below].
Of the 460k accesses to abstracts and full-text on the main arXiv site
in the first week of June 2006, just 0.3% specified an explicit version
[** details below]. Thus the overwhelming majority were accesses to the
"latest version".
Cheers,
Simeon
* Breakdown of counts of articles on arXiv with different number of
versions. Articles from 1997-10-01 to 2006-06-12 as of 2006-06-13.
highest version   count
1                 210812
2                 69654
3                 19172
4                 4921
5                 1339
6                 441
7                 169
8                 76
9                 35
10                23
11                15
12                11
13                10
14                6
15                2
16                4
17                3
18                3
19                0
20+               7
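The breakdown can be cross-checked against the totals quoted above with a short tally. This is only a sketch: the variable names are illustrative, and the key 20 is an assumed stand-in for the "20+" bucket.

```python
# Counts copied from the breakdown: highest version -> number of articles.
# The key 20 stands for the "20+" bucket (an illustrative convention).
version_counts = {
    1: 210812, 2: 69654, 3: 19172, 4: 4921, 5: 1339,
    6: 441, 7: 169, 8: 76, 9: 35, 10: 23,
    11: 15, 12: 11, 13: 10, 14: 6, 15: 2,
    16: 4, 17: 3, 18: 3, 19: 0, 20: 7,
}

# All submissions in the period.
total = sum(version_counts.values())

# Submissions with more than one version.
multi = total - version_counts[1]
share = multi / total

# Among multi-version submissions, those that stopped at 2 or 3 versions.
two_or_three = version_counts[2] + version_counts[3]

print(f"total submissions: {total}")
print(f"more than one version: {multi} ({share:.0%})")
print(f"of those, 2 or 3 versions: {two_or_three / multi:.0%}")
```

The tally agrees with the rounded figures in the message: about 307k submissions, of which about 96k (31%) have more than one version.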
** Access to arXiv articles via an identifier without an explicit
version number takes the user to the latest version (e.g.
http://arxiv.org/abs/hep-th/9901001). The previous versions are linked
from this page. Accesses to a specific version are indicated by a
version number appended to the internal identifier (e.g.
hep-th/9901001v2 instead of hep-th/9901001, to make a URL like:
http://arxiv.org/abs/hep-th/9901001v2). Robots, admin and repeat
downloads have been removed to a good degree from these numbers.
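The distinction between "latest version" and explicit-version accesses could be made mechanically along the following lines. This is a hedged sketch, not arXiv's actual log-processing code: the function names are hypothetical, and the pattern is an assumption that covers only identifiers of the form shown in the examples above.

```python
import re

# Assumed pattern for identifiers like "hep-th/9901001" with an optional
# "vN" suffix. It is not a full description of arXiv identifiers.
ID_PATTERN = re.compile(
    r"^(?P<base>[A-Za-z.-]+/\d{7})(?:v(?P<version>\d+))?$"
)


def split_version(identifier):
    """Return (base identifier, version number), with None for no version."""
    match = ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"unrecognised identifier: {identifier!r}")
    version = match.group("version")
    return match.group("base"), (int(version) if version else None)


def is_explicit_version_access(url):
    """True if the abstract URL names a specific version, not the latest."""
    # Everything after "/abs/" is treated as the identifier.
    identifier = url.split("/abs/", 1)[1]
    _, version = split_version(identifier)
    return version is not None


print(split_version("hep-th/9901001v2"))
print(is_explicit_version_access("http://arxiv.org/abs/hep-th/9901001"))
```

Counting the URLs for which `is_explicit_version_access` returns true, after filtering out robots and repeat downloads, would give the kind of proportion reported above.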
On Tue, 13 Jun 2006, Frances Shipsey wrote:
> Hi Joanne, Patrick
>
> It's excellent to hear how CERN is going about getting all those
> papers in - very inspiring!
>
> On the question of identifying and differentiating different versions,
> the VERSIONS Project is currently asking researchers (in the field of
> economics) about their experience of finding multiple versions and/or
> copies of academic papers online.
>
> The survey (see http://www.lse.ac.uk/library/versions/surveys.html),
> which covers other aspects of version identification and use, is open
> for another couple of weeks, so figures given below are provisional.
> Responses so far indicate that:
>
> 94.3% of researchers (to date) find multiple versions/copies of
> articles online either Very Frequently (16.9%), Frequently (38.7%) or
> Sometimes (38.7%).
>
> 55.6% of survey respondents (to date) find it generally quick and easy
> to establish which version(s) they want to read, while a significant
> minority (39.8%) do not.
>
> We will be analysing the results in full when the survey closes and
> making recommendations about how repositories could identify different
> versions more clearly, based on what's important to authors and
> readers.
>
> Best wishes
>
> Frances Shipsey
> VERSIONS Project Manager
> Library
> London School of Economics and Political Science
> 10 Portugal Street
> London WC2A 2HD
>
> t: +44(0)20 7955 6915
> f: +44(0)20 7955 7454
> e: [log in to unmask]
> w: www.lse.ac.uk/versions