Stevan, et al,
Let's go back to the beginning.
Local archives/repositories were intended to store items but no-one outside the organisation concerned knew what was in them.
OAI-PMH was invented (by the Open Archives Initiative) to enable the builders of archives to make the metadata that described their contents available to the outside world.
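To make "available to the outside world" concrete: an outside harvester asks a repository for its metadata with a plain HTTP request. The following is a minimal sketch only. The base URL is made up for illustration; the `ListRecords` verb and the `oai_dc` metadata prefix are part of OAI-PMH itself.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; every real repository publishes its own base URL.
BASE_URL = "https://repository.example.org/oai"


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build the URL a harvester would fetch to list a repository's records."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec is not None:
        # Optional selective harvesting of one set defined by the repository.
        params["set"] = set_spec
    return base_url + "?" + urlencode(params)


print(list_records_url(BASE_URL))
```

The response to such a request is an XML list of metadata records. Whether each record describes something actually held in the repository is exactly the point at issue below.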
The Open Access Movement (OAM) came along and adopted this idea of open (i.e. visible) archives via OAI-PMH to build Institutional Repositories (IRs) as an alternative to subject archives.
This is where the line crossing started. In the original OAI proposal the metadata exposed described the actual content of the repository, but the OAM view (or at least the view of some of its members) was that the IR was a record of research outputs and a provider of access to some variant of those outputs. Under this view the metadata is not a description of the content of the repository; it is a description of a research output, which may have a version of that output attached to it. This has led to a situation where many (most?) IRs today contain many more bibliographic records (metadata with no associated full text) than records with associated full text.
There is another area of line-crossing. When IRs were true repositories they were meant to be stable environments where a link to a full text item was maintained indefinitely and always led to the same full text item. This was the approach adopted by the original subject archives: a new version did not overwrite an earlier version. This was done (I believe in part) to discourage the depositing of early inaccurate versions, because the authors knew it was difficult or impossible to retract an item after deposit. Now it seems to be becoming common to replace earlier versions as later versions become available. Once you start to do this you no longer have a repository; you have a publications list with associated full text (where available). I have no problem with this, but let's stop calling it a 'repository' and call it a 'publications list with associated full text (where available)'. [While we are clarifying naming systems, let's get rid of the totally misleading 'pre-prints' and 'post-prints'; I propose 'pre-refereed' and 'post-refereed'.]
The EPrints package has in part contributed to this confusion, because some of its questions are ambiguous between the two interpretations or models. When entering an item with associated full text you are asked to tag it as 'refereed/non-refereed' and also as 'draft/submitted/etc'. If you are following the repository model, both questions refer to the full text item being entered; if you are following the publications list model, the first question is about the item itself and the second is about the full text associated with it *at the time of entry*. This means two users of EPrints can build two different IRs whilst using the same package, because the questions allow two different (but internally consistent) interpretations.
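One way to remove the ambiguity would be to record the status of the item and the status of the attached full text as two separate things. The following is a rough sketch of that idea only. It is not how EPrints stores its fields, and every class and field name in it is made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FullText:
    """The file actually held in the repository (hypothetical model)."""
    version: str    # e.g. "draft", "submitted", "accepted"
    refereed: bool  # is THIS text the refereed one?


@dataclass
class Record:
    """A research output, with or without an attached full text."""
    title: str
    item_refereed: bool                   # status of the output itself
    full_text: Optional[FullText] = None  # None = bibliographic record only


def describe(record):
    """Say plainly what a searcher would actually get."""
    if record.full_text is None:
        return "bibliographic record only"
    if not record.item_refereed:
        return "non-refereed item"
    if record.full_text.refereed:
        return "refereed item, post-refereed full text"
    return "refereed item, pre-refereed full text"


paper = Record("Example article", item_refereed=True,
               full_text=FullText(version="submitted", refereed=False))
print(describe(paper))
```

With the two statuses kept apart, the repository model and the publications list model can no longer give different answers to the same question.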
It may be that I am the only person feeling confused about IRs and everyone else is clear, but are you sure you are all clear about the same thing? :-)
Just to complete my reply to Stevan's note:
Nothing in my previous note was about certification and I'm not sure how that crept in.
Finally, my comment about metrics and citation counts was an attempt to suggest a way of giving searchers some form of quality assessment that did not require any reference to peer review or refereeing.
Regards,
John Smith.
> -----Original Message-----
> From: Repositories discussion list [mailto:JISC-
> [log in to unmask]] On Behalf Of Stevan Harnad
> Sent: 04 March 2008 21:16
> To: [log in to unmask]
> Subject: OA IRs Are Research Access Providers, Not Publishers or
> Library Collections
>
> On Tue, 4 Mar 2008, John Smith, University of Kent, wrote:
>
> > The problem is that the item referred to may have been published
> in a
> > peer reviewed journal but the full text available from the
> repository
> > may not be the final peer reviewed version.
>
> John, I may be wrong, but I think that some lines may be crossing
> here,
> between the immediate needs and concerns of researchers, trying to
> access
> and use the current research literature, and the needs and
> concerns of
> library cataloguing, concerned to collect and tag the canonical
> text.
>
> Green OA is author self-archiving. If the author deposits a text
> in his
> OA IR and tags it as "peer-reviewed" (and specifies the journal
> [date,
> volume] that accepted it for publication), that is good enough for
> Green
> OA, and for the immediate needs and concerns of researchers,
> trying to
> access and use the current research literature (and particularly
> the major
> portions of it to which they do not have subscribed institutional
> access).
>
> Researchers have no pressing current need for a new certification
> system, certifying that (1) the self-archiving author is telling
> the
> truth that his self-archived draft was indeed peer-reviewed, and
> that
> (2) the self-archived postprint is word-for-word identical with
> that
> published article.
>
> (Both this certification and this version authentification are
> technically
> feasible, of course, but they are simply not worth the bother at
> this
> critical time, when what is missing and urgently needed for 85% of
> annual peer-reviewed research articles is not certification that
> they
> have indeed been peer reviewed or that they are word-identical to
> the
> published version; what is needed is the articles themselves!)
>
> > Is this peer reviewed or
> > not? Strictly it is not but it is the best copy we can provide
> of a peer
> > reviewed item. Ideally we need to distinguish between the item
> status
> > (peer reviewed) and the full text status (non-peer reviewed). As
> far as
> > I am aware EPrints does not support this out of the box. Also it
> is not
> > clear how one would represent this to a harvester program.
>
> EPrints tags the self-archived item as peer-reviewed if the self-
> archiving
> author tags it as peer-reviewed. The journal is the one that
> certifies
> peer review. If a user's urgent need is for the certification, and
> not for
> access to the item itself, then there is always the journal to
> turn to.
>
> (Yes, some sloppy authors, sometimes, will not update their self-
> archived
> unrefereed preprints, and instead simply re-label them as "peer-
> reviewed"
> when the final draft is accepted. Scholarly practice will take
> care of
> that sloppiness in due course. But what's incomparably more
> important
> right now is to stay focussed on solving the real problem -- not
> yet
> solved, and not even being conscientiously attended to -- which is
> getting
> the authors to self-archive those preprints and postprints in the
> first
> place! We are not working here to make a rich corpus spic-and-span
> for
> a library collection catalogue: We are trying to enrich the
> impoverished
> research corpus, so researchers can get on with their work!)
>
> > This is related to a note I sent previously asking if the
> repository
> > should be seen as a store (and delivery mechanism) or as a
> publication
> > list. If it is a store (as indicated by the name 'repository')
> the
> > associated metadata should describe the item contained which in
> many
> > cases will be a non-refereed version of a refereed article.
>
> The repository should be seen as a way for researchers to make
> their
> findings accessible to and usable by all their would-be users,
> worldwide,
> and not just those that can afford access to the publisher's
> proprietary
> version, as subscribed to by their institution's library.
>
> It is ever so important to remind ourselves that an institution's
> OA IR is not a library collection, and its tags are not library
> card
> catalogues.
>
> IRs can certainly generate CVs and publication lists, but, as
> always,
> the published items that are cited in those CVs and publication
> lists
> are the *articles published in their respective journals*! A
> postprint
> in the author's IR is not the publication itself. It is a
> supplementary
> draft provided for access purposes, for those researchers whose
> institutions cannot afford access to the publisher's proprietary
> version.
>
> > Finally, there is a certain amount of academic snobbishness about
> > peer-review, in many non-STM subjects the peer-reviewed article is not
> > the main form of accepted publication and in others the first publication
> > of new knowledge is in conference or working papers which may be later
> > written up for journal publication. Any system that automatically says
> > 'refereed=good', 'non-refereed=bad' is going to miss a lot of good
> > quality material.
>
> Who is saying 'refereed=good', 'non-refereed=bad'? The IR's tag is just
> "refereed" and "unrefereed". For the rest, it's caveat emptor. If you
> want to restrict your search and usage to research tagged 'refereed',
> fine. If you want to broaden your search and usage, that's fine too.
> Again entirely a matter for scholarly practice to decide.
>
> The primary target literature for the OA movement, and OA IRs, is
> peer-reviewed journal and conference articles (about 2.5 million
> per
> year), because those are all, without exception, author give-aways,
> written purely for the sake of researcher usage and impact, not
> for the
> sake of royalty income and/or a prestigious hard-copy imprimatur.
> For the
> disciplines that also rely on books, those books are also welcome
> in OA
> IRs, but not many of them are likely to be deposited for the time
> being,
> because they are decidedly *not* just author give-aways, written
> for
> the sake of researcher usage and impact, not for royalty income
> and/or
> a prestigious hard-copy imprimatur.
>
> So whether or not to restrict search and usage to refereed content
> is a matter for the user-scholars alone to decide, and whether or
> not
> to self-archive their books alongside their articles is a matter
> for the
> self-archiving author-scholars alone to decide. (It certainly
> cannot and
> should not be mandated at this time!)
>
> > It occurs to me that what we need is a post-publication quality
> > indicator - otherwise known as a weighted citation-count :-) .
> Could
> > we automatically include this in our repositories (taken from a
> central
> > service?) or should we leave this to the search services?
>
> I cannot follow this at all: The post-publication metrics concern
> the
> *publication* itself and not just, or primarily, the version that
> happens
> to be accessible in the author's IR! By all means add IR download
> counts
> and their growth metrics to the growing spectrum of research usage
> and impact
> metrics, both before and after publication (EPrints IRs are
> already
> doing this); and by all means couple IR-native metrics with global
> harvested metrics, harvesting them back to the IR from citebase,
> google
> scholar, citeseerx (and ISI and Scopus where permitted/licensed!).
>
> But remember that the IR's primary function is to provide open
> access
> to the institution's refereed research output, along with open
> access
> metrics to provide feedback and incentives for self-archiving
> authors.
> The search and usage of external users, however, will almost never
> be
> at the local IR level; it will be at the global harvester/indexer
> level.
>
> Stevan Harnad
>
> >> -----Original Message-----
> >> From: Repositories discussion list [mailto:JISC-
> >> [log in to unmask]] On Behalf Of Stevan Harnad
> >> Sent: 29 February 2008 18:07
> >> To: [log in to unmask]
> >> Subject: Re: Required and Desirable metadata in a repository
> >>
> >> Bill Hubbard is spot-on on the utility of an explicitly searchable
> >> field indicating whether or not an item has been peer reviewed. The
> >> EPrints software has such a tag.
> >>
> >> (It is only likely to be useful at a harvester level, as individual
> >> repositories (IR) are only likely to be searched for
> >> institution-internal purposes. So this is a metadatum worth displaying
> >> for harvesters, and harvesters should be set up in such a way as to make
> >> it possible to search on only the peer-reviewed items, if the user
> >> wishes.)
> >
> > TEXT DELETED
> >
> >> Stevan Harnad
> >>
> >> On 08-02-29, at 12:10, Hubbard Bill wrote:
> >>
> >>> Dear Colleagues,
> >>>
> >>> Just picking up on Ian Stuart's question as to opinion on
> >> "Required"
> >>> and
> >>> "Desired" metadata fields for eprints records.
> >>>
> >>> Could I ask colleagues how they view a "peer-reviewed" field?
> >>>
> >>> In terms of what users want, my own experience from talking to
> >>> academics
> >>> is that when faced with a mass of Open Access eprints the
> great
> >>> majority
> >>> have asked unprompted about how to search only within peer-
> >> reviewed
> >>> material.
> >>>
> >>> And for this facility we need to give services a peer-review
> >> field,
> >>> unless they start interpolating from other metadata features
> >> like
> >>> journal-title or somesuch.
> >>>
> >>> Copyright and peer-review (p-r) are the two topics that can be
> >>> guaranteed to come up in academic discussions in relation to
> >>> repositories: the first from their perspective as an author,
> the
> >> second
> >>> from their perspective as researcher/user.
> >>>
> >>> My strong suspicion is that most of those academics that
> haven't
> >> asked
> >>> about a p-r filter would want the feature before they used OA
> >> material
> >>> as a habitual source for research. Again, it may be that they
> >> didn't
> >>> ask
> >>> because they assumed that it was all p-r, or, that it was all
> >> non-p-r.
> >>> (I have found repositories have a slighted reputation in some
> >> quarters
> >>> (often BioMedical) as being all referred to as "pre-print
> >> servers").
> >>>
> >>> In terms of ingest, I think that the author is the best person
> >> to know
> >>> if their eprint has been p-r'd and that a peer-review tick-box
> >> would be
> >>> an acceptable additional task. Authors are generally pleased
> >> that their
> >>> article has passed p-r and would probably be happy about
> noting
> >> that.
> >>> As
> >>> to how that information is recorded, that is another matter.
> >>>
> >>> Does this agree with other colleagues' experience? Is a p-r
> >> field
> >>> required to facilitate future use of the material?
> >>>
> >>> Regards,
> >>>
> >>> Bill
> >>>
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Repositories discussion list
> >>>> [mailto:[log in to unmask]] On Behalf Of Ian
> >> Stuart
> >>>> Sent: 21 February 2008 14:41
> >>>> To: [log in to unmask]
> >>>> Subject: Required and Desirable metadata in a repository
> >>>>
> >>>> [This is primarily a question for those involved in
> >> repositories for
> >>>> e-prints, but others may have interesting views]
> >>>>
> >>>> Within your own Repository, what [primarily metadata] fields
> >> are
> >>>> *Required* and what are *Desired*?
> >>>>
> >>>> If you were advising a fellow Institution about setting up a
> >>>> repository,
> >>>> what fields would you advise as *Required* and what are
> >> *Recommended*?
> >>>>
> >>>> If you were to harvest[1] from a repository, what fields
> would
> >> you
> >>>> consider essential, and what would you consider helpful?
> >>>>
> >>>> Following on from that: if you were to harvest the Depot (or
> >> even the
> >>>> Intute Repository Search), how would you hope to identify[2]
> >> deposits
> >>>> that could be imported into your own Institutional Repository
> >>>>
> >>>> [1] This is where I come in: The depot will have a transfer
> >>>> service, but
> >>>> what to transfer?
> >>>> [2] I've had loads of thoughts on this one, and they all seem
> >>>> to spiral
> >>>> and knit and knot and hide their threads, and not actually
> >>>> conclude in
> >>>> any meaningful way.... for me.
> >>>>
> >>>>
> >>>> --
> >>>>
> >>>> Ian Stuart.
> >>>> Developer for The Depot,
> >>>> EDINA,
> >>>> The University of Edinburgh.
> >>>>
> >>>> http://edina.ac.uk/
> >>>>
> >>>
> >>> --
> >>>
> >>> Bill Hubbard
> >>> SHERPA Manager
> >>>
> >>> SHERPA - www.sherpa.ac.uk
> >>> RSP - www.rsp.ac.uk
> >>> RoMEO - www.sherpa.ac.uk/romeo
> >>> JULIET - www.sherpa.ac.uk/juliet
> >>> OpenDOAR - www.opendoar.org
> >>>
> >>> SHERPA
> >>> Greenfield Medical Library
> >>> University of Nottingham
> >>> Queens Medical Centre
> >>> Nottingham
> >>> NG7 2UH
> >>> UK
> >>>
> >>> Tel +44(0) 115 846 7657
> >>> Fax +44(0) 115 846 8244
> >>>
> >