LIS-ELIB Archives

LIS-ELIB@JISCMAIL.AC.UK


Subject: Re: Consultation Draft - Study on Preservation of eprints
From: Stevan Harnad <[log in to unmask]>
Reply-To: Stevan Harnad <[log in to unmask]>
Date: Tue, 10 Jun 2003 20:15:18 +0100


> On Tue, 10 Jun 2003, Neil BEAGRIE wrote:
>
> Dear Stevan
>
> Many thanks for this posting in response to the consultation draft. Some
> thoughts on issues raised:
>
> 1. No-one doubts the overall importance of expanding the content held
> in institutional repositories. The draft report itself points out
> there are only around 5,000 eprints estimated currently to be held
> in UK institutional repositories. This is clearly a major issue and
> significant cultural change is still needed to change this position.

Dear Neil,

On this we agree completely. The question is, what will help facilitate
that cultural change, and bring it about as soon as possible (as it is
already overdue!).

> 2. The consultation draft is one of a series of reports commissioned
> under the JISC Continuing Access and Digital Preservation Strategy. The
> archiving of subscription e-journals and the issues surrounding
> the preservation of, and ongoing access to, the primary corpus of published
> literature are considered in parallel.

There are in fact not two but three parallel issues there:

(a) the archiving of subscription-based e-journals (part of electronic
collection management, classified as II "MAN", below)

(b) preservation of all journals (mostly now hybrid, with both a paper
and an online edition, classified as III "PRES", below)

(c) ongoing access to all (research) journals (for those whose
institutions do not have subscription access to them, classified as I
"RES", below).

Yes, these are parallel, but they must be faithfully kept distinct,
because the solution for one is definitely not the solution for another.

> 3. Although the majority of preservation effort is clearly needed on the
> published corpus, the consultation draft for eprints is surely right to
> point to the preservation issues which are likely to arise in time for
> institutional repositories.

I am afraid this mixes up the three again. Institutional repositories
have the problem of managing their own online collections of whatever
they might have (MAN). To a great extent, managing these collections
also entails preserving them (PRES). This preservation burden, in the
case of the primary, subscription-based journal literature, is one that
is perhaps shared by university libraries and the publishers of the
journal literature.

(I say "perhaps" because I know that often the publishers are not sharing
the preservation burden, feeling that that is traditionally a library
responsibility, not a publisher responsibility. On these matters I have
no views, except to point out that they have *nothing* to do with the
ongoing-access problem, and should be handled separately, however they
are handled.)

Self-archived eprints (of which there are so few) do *not* face any
preservation problem for the following reasons:

(i) The long-term preservation problem is currently 100% on the primary
corpus (and whoever is handling that, and how).

(ii) For the short-term (decades at least, as the still-with-us and
still-100%-useable Physics ArXiv and others demonstrate since at least
1991) there is no preservation problem for the secondary, self-archived,
open-access corpus, which merely *duplicates* the primary corpus,
for ongoing-access purposes, for those whose institutions cannot afford
the subscription-based access to the primary versions. The *only*
short-term problem for this back-up corpus is its still minuscule
size! This is the *content* problem (or, better, the *access* problem)
which has nothing to do with preservation issues, and should not
be weighted down with any. It needs facilitation, not further (and
irrelevant) loads.

> 4. Preservation in this context is a means to an end - ensuring continuing
> access to cited research output (published or not).

Preservation of what? With the primary corpus, the answer is clear. But
with the secondary, duplicate corpus, meant only to remedy access
problems (and doing so quite brilliantly for well over a decade now, for
the little that has been self-archived so far) the "continuing access"
problem is rather different, isn't it, from the "continuing access"
problem for those institutions that have and can afford the primary
corpus?

To put it another way, suppose there were no self-archived versions at
all, and the only access was subscription-access. We would still be
facing all there is of the PRES problem: How to ensure that access to
the subscription-based corpus, for those who have it now, continues to
be had next year, and the next. Fair enough. But nothing whatsoever to
do with the problem of the nonexistent access of those who cannot afford
the subscription access! It is for *them* that the duplicate self-archived
versions are created. And their 1st, 2nd, and 3rd worry is still access to
that corpus *today*, because for most of the 2,000,000 annual articles
appearing in the 20,000 subscription journals today, there *is* no
self-archived version. So "continuing access" is not only moot for this
nonexistent secondary corpus, but it is also beside the point for the
little of it that exists so far. There is no need to fret about
continuing-access to the self-archived corpus: Sort out the preservation
problem for its primary incarnation, and meanwhile let those who are
concerned with secondary access worry about increasing that access. When
all of the contents of the 20K have been self-archived and made openly
accessible, *then* we can see whether there is some way it can help
solve the preservation problem for the primary corpus (if it has not
already been solved by then). Not before. For access, the only problem
is access today.

> 5. The report considers issues which may lead to the retention (or
> withdrawal) of eprints over time. For related material such as e-theses
> mentioned in your (RES) category the need for continuing access and
> preservation will always be present.

First the theses: Here too, there is a primary corpus, with the true
preservation burden: How are/were theses preserved even when no one
self-archived them? And then there is the secondary corpus, for access.
Exactly the same story as above.

Now about self-archived versions *other* than the self-archived final,
refereed, revised, accepted, published journal version. I would say it
is premature to fuss too much about those. The culture of self-archiving
has not yet established itself, whether it be the self-archiving of
unrefereed preprints or refereed postprints. In the scheme of things,
the urgency of getting all 2M annual postprints self-archived and openly
accessible *vastly* outweighs the problem of self-archiving or preserving
the unrefereed prior drafts. Not that there is no point having and
saving preprints, and archiving them for ever. That would be desirable; but
it is completely eclipsed at the moment by the access problem for the
postprints. Moreover, again the preprint corpus is minuscule. So whether
it comes and goes is not the issue.

See the American Scientist Forum thread on
"Eprint versions and removals"
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2847.html
(and its predecessor threads).

The reality there is quite illuminating: We would "love" to be able to
implement and enforce a strict policy that no self-archived draft may
subsequently be removed from an Eprint Archive. It would certainly be
the best for the scholarly record if publicly accessible documents that
users had read, used and cited, didn't vanish thereafter. But the fact is
that one reads, uses and cites unrefereed literature at one's own risk
anyway. It is not only the text that might go up in smoke subsequently,
but its content, if it does not manage to meet the standards of peer
review.

Never mind; the real constraint is this: The reason Eprint Archives
cannot at this time impose a draconian "no removals permitted" policy on
their self-archivers is that the self-archivers are still so few, and
skittish. This is no time to slap their wrists or implant the fear of
god in their heads. It's a time to encourage them to do what none of
them are yet used to doing: self-archive their refereed postprints. If
they also self-archived pre-refereeing preprints, that's well and good,
but don't make it into a handicap, or a deterrent, by staying their
already trembling hands with warnings that "if you hit the entry key on
this draft, it's forever!".

(In reality, it *is* forever, for even if the file is later removed from
the mother-archive, having been openly accessible even for a while, it
may well have been downloaded, harvested, and cached all over the
planet, and it might even have had a probity-based, time-stamped
"snapshot," bit-perfect, stored of it somewhere. But there's no point
even talking about *that* now, when it too could only be seen as
a deterrent before the requisite cultural change has taken place.)
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0807.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/0997.html
http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/1159.html

> 6. You note all five aims listed in your email are worthwhile and
> important - but most are not urgent ie needing action now. Although
> preservation is a long-term challenge there are actions which are best
> taken at the earliest possible stage.

For the primary corpus. But not for the frail secondary corpus, where
these measures can only serve as further deterrents and retardants at a
time when facilitators are the only thing needed!

> Institutional approaches to IPR is one and this is being addressed
> elsewhere in the FAIR programme because of its impact across the board
> on repositories.

And I think casting the access problem and self-archiving in anything
faintly resembling "Intellectual Property Rights" terms is just asking
for trouble (and confusion, and still more delay): Self-archiving is
not an IPR issue! The sole IP question -- to which we know the answer --
is whether it is ok for the author of a peer-reviewed journal article
to self-archive his own final, refereed ("vanilla") draft. The short
answer is "Yes" (and that's really all there is to it: it was certainly
enough for the authors of the 250,000 articles self-archived by the
physicists lo these dozen-odd years, and for many times more authors in
other disciplines who have been doing it on their own websites for
at least as long).

But if would-be self-archivers want more details, there's the other
JISC project, Romeo:
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm
and the self-archiving FAQ:
http://www.eprints.org/self-faq/#self-archiving-legal
and
http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#5

But the less that authors get involved in digital IPR matters, the
better. Their time is far better spent self-archiving!

> Another is the
> issue of capturing the technical metadata which can support long-term
> management.

Again: Management of what? For the secondary, self-archived, open-access
versions of the refereed corpus, the minimal OAI-protocol is already
more than satisfactory. (For the primary corpus, nolo contendere.)
http://www.openarchives.org/

To mix the two is just to blur the picture, a picture that urgently
wants focusing, not blurring!

> As the report points out it is possible most if not all of
> this can be automated at deposit and it recommends exploring interfaces
> between institutional repository software (eprints, DSpace etc) and file
> format recognition software.

All fine, as long as the secondary ongoing-access problem, RES, is not
conflated with any of the other four, MAN, PRES, TEACH and EPUB. It
is special, different, and very urgent.

> Overall I think the consultation draft is carefully balanced. It is
> not attempting to say there are preservation problems to solve before
> repositories can be filled. It is pointing out the role that preservation
> will play as these repositories grow and the steps that can be taken
> to address issues which become far more problematic over time if not
> addressed at an early stage. It clearly recognises the overall importance
> of growing content in repositories.

The more I think of it, though, it is not at all clear that it is
beneficial to see the kinds of repositories we need for RES as having
anything at all to do with the kind we need for MAN, PRES, TEACH and
EPUB. Is it even such a good idea to treat them all as one repository? One
of the lessons we have learned, and the powers we have gained, from
the OAI-protocol is that we need not think of big central repositories
(analogous to libraries) at all any more. OAI-interoperability makes it
much more sensible to think in terms of small, distributed archives,
unified only by the glue of interoperability. Some kinds of archives
may require much finer-grained metadata. Fine. The eprint archives
consisting of secondary self-archived versions of the primary refereed
journal corpus do not need such fine-grained metadata at this time.
Nothing much more than author, title, journal, year (plus a few more, as
in the OAI-protocol) is needed to serve the immediate and pressing
access needs we have in this area (in face of the content that we lack).
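
To make "coarse-grained" concrete, here is a minimal sketch, not taken from any archive's actual software, of how little a self-archived postprint record needs to carry. The class name EprintRecord, the function to_oai_dc, and the mapping of "journal" onto dc:source are illustrative assumptions; the two namespace URIs are the ones used for unqualified Dublin Core in the OAI protocol.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET

# Namespaces used by the OAI protocol's unqualified Dublin Core format.
DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


@dataclass
class EprintRecord:
    """Hypothetical coarse-grained record: just the four fields named above."""
    author: str
    title: str
    journal: str
    year: int


def to_oai_dc(record: EprintRecord) -> str:
    """Serialise a record as a minimal oai_dc XML fragment.

    Assumption: journal is carried in dc:source and year in dc:date.
    """
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)
    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    fields = (
        ("creator", record.author),
        ("title", record.title),
        ("source", record.journal),
        ("date", str(record.year)),
    )
    for tag, value in fields:
        ET.SubElement(root, f"{{{DC_NS}}}{tag}").text = value
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values, not a real article.
    print(to_oai_dc(EprintRecord("A. Author", "An Example Title",
                                 "Journal of Examples", 2003)))
```

Anything finer-grained (file formats, checksums, provenance) would belong to the PRES and MAN archives, not to this record.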

So let the Eprint archives be coarsely interoperable, via the
OAI-protocol, with all the other archives, including many more at the same
university. No need to try to force them into the same Procrustean
meta-bed! Let the fine-grained metadata be worked out for PRES, MAN,
EPUB (and perhaps TEACH). They have the time. But (back-door, vanilla)
*ongoing-access* (to the peer-reviewed corpus) is urgently needed, now.
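
As a rough illustration of "small, distributed archives, unified only by the glue of interoperability", the sketch below builds OAI ListRecords harvest requests for several archives and merges whatever coarse records come back. The archive base URLs are made-up placeholders, and the helper names list_records_url and merge_records are assumptions for illustration; only the verb and metadataPrefix=oai_dc parameters come from the OAI protocol itself. No network access is attempted.

```python
from urllib.parse import urlencode

# Hypothetical base URLs; real archives publish their own.
ARCHIVES = {
    "eprints": "https://eprints.example.org/oai",
    "teaching": "https://teach.example.org/oai",
}


def list_records_url(base_url, from_date=None):
    """Build an OAI ListRecords request for unqualified Dublin Core."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if from_date is not None:
        params["from"] = from_date  # selective harvesting by date
    return f"{base_url}?{urlencode(params)}"


def merge_records(*record_lists):
    """Union of coarse records from several archives.

    Records are plain dicts with 'title' and 'year' keys; duplicates
    (same title, ignoring case, and same year) are kept only once.
    """
    seen = set()
    merged = []
    for records in record_lists:
        for rec in records:
            key = (rec["title"].strip().lower(), rec["year"])
            if key not in seen:
                seen.add(key)
                merged.append(rec)
    return merged


if __name__ == "__main__":
    for name, base in ARCHIVES.items():
        print(name, list_records_url(base, from_date="2003-06-01"))
```

The point of the design is that each archive stays independent: a harvester needs only the base URL and the shared minimal format, not a common fine-grained schema.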

> I hope it is a report which will be widely read by emerging institutional
> repositories and look forward to comments from colleagues in the FAIR
> programme in due course.

I hope it will advance the finer-grained and less-pressing needs of II-V
without retarding the coarser-grained and much more pressing needs of I!

Stevan

> *********************************************************************
> Neil Beagrie                   JISC Digital Preservation Focus
> Programme Director             Secretary, Digital Preservation Coalition
> JISC London Office,            Tel/Fax/Voicemail :+44 (0)709 2048179
> King's College London          email: [log in to unmask]
> Strand Bridge House            url: www.jisc.ac.uk/index.cfm?=pres_home
> 138 - 142, The Strand,              www.dpconline.org
> London WC2R 1HH        www.jiscmail.ac.uk/lists/digital-preservation.html
> ************************************************************************
>
>  -----Original Message-----
>  From: Stevan Harnad [mailto:[log in to unmask]]
>  Sent: Fri 06/06/2003 20:23
>  To: [log in to unmask]
>  Cc:
>  Subject: Re: Consultation Draft - Study on Preservation of eprints
>
>  The institutional eprint repository movement would benefit greatly
>  from clearly separating the 5 quasi-independent aims that currently
>  constitute its very mixed agenda. All 5 aims are worthwhile and important,
>  but only the first is urgent, and it is the heart of the challenge for
>  filling institutional repositories with university research output for
>  the sake of maximizing its impact by maximizing access to it:
>
>  The 5 distinct aims for institutional repositories
>
>      I. (RES) self-archiving institutional research output (preprints,
>      postprints and theses)
>
>      II. (MAN) digital collection management (all kinds of digital content)
>
>      III. (PRES) digital preservation (all kinds of digital content)
>
>      IV. (TEACH) online teaching materials
>
>      V. (EPUB) electronic publication (journals and books)
>
>  As long as we keep blurring or mixing these 5 distinct aims, the first
>  and by far the most pressing of them, RES -- the filling of university eprint
>  archives with all university research output, pre- and post-peer-review,
>  in order to maximize its impact through open access -- will be needlessly
>  delayed (and so will any eventual relief from the university serials
>  budget crisis).
>
>  Perhaps the two most counterproductive of the conflations among these
>  five distinct aims have been that between I and III (research
>  self-archiving, RES, and digital preservation, PRES) and that between
>  I and V (research self-archiving, RES, and electronic publication,
>  EPUB).
>
>  The RES/PRES mix-up, much discussed in the American Scientist Forum,
>  can easily be seen to be a needless and misleading conflation once we
>  recall that insofar as the peer-reviewed research literature is
>  concerned, the current preservation burden is on its primary corpus,
>  which is the published literature (online and on paper). The much-needed
>  filling of university research-output archives is a *supplement* to this
>  primary corpus, for the purpose of maximizing its impact by maximizing
>  access to it; it is not a *substitute* for it. It is simply a mistake
>  and a needless retardant on the filling of the university research output
>  archives to imply that there are preservation problems to solve before
>  they can be filled.
>
>  The RES/EPUB mix-up is really two mixups. The first is the conflation of
>  self-archiving with self-publishing: The urgent archive-filling challenge,
>  RES, concerns the self-archiving of peer-reviewed, *published* research
>  output. Again, this is a *supplement* to publication, for the purpose of
>  maximizing its impact by maximizing access to it; it is not a *substitute*
>  for it.  http://www.ecs.soton.ac.uk/~harnad/Tp/resolution.htm#1.4
>
>  The second RES/EPUB mix-up has to do with university e-publishing
>  ambitions (perhaps along the lines of High-Wire Press-Hopes!). It is
>  fine to have these ambitions, but they should not be conflated in any
>  way with the completely independent and urgent aim of self-archiving
>  the university's peer-reviewed, *published* research output.
>
>  Most of this is discussed in the thread:
>
>      "EPrints, DSpace or ESpace?"
>      http://www.ecs.soton.ac.uk/~harnad/Hypermail/Amsci/2670.html
>
>  See: "Enhance UK research impact and assessment by making the RAE webmetric"
>        http://www.ecs.soton.ac.uk/~harnad/Temp/thes.html
>
>  Stevan Harnad
