Steve, a very useful series of postings - thanks.
 
UK Research Councils have a variety of OA mandates - including two which mandate deposition in CRs (MRC - UK PubMed and ESRC - Society Today).  With the exception of EPSRC (and this may well change) the others do mandate deposition, but are unspecific about where.  NERC, for example, says:
 
"From 1 October 2006 NERC requires that, for new funding awards, an electronic copy of any published peer-reviewed paper, supported in whole or in part by NERC-funding, is deposited at the earliest opportunity in an e-print repository.  NERC also encourages award-holders to deposit published peer-reviewed papers arising from awards made before October 2006.  "
 
BUT it's very difficult to check compliance with these mandates!  Councils have reduced their final reporting requirements on the expectation that it will be possible to collect outputs information (not just publications) electronically from grantholders.  RCUK is assessing options for doing this - either pushing/pulling from Institutional Repositories or from HEI CRIS systems, or both.  Whatever is decided, it's certain that we'd be assisted by inclusion in IRs of metadata fields for a) "Funder" (perhaps using a dropdown list of funder URIs); and b) "GrantReference".
 
The disadvantage of using IRs rather than Central Repositories is the absence of minimum standards and formats in the former.  Both the above fields exist in CRs (e.g. UK PubMed and Society Today)
 
So, three questions re IRs (reply offline if you prefer)..........
 
1. Funder and GrantRef fields exist in EPrints (as free text) from version 3.0  - do they exist in DSpace and Fedora - and in what form?
 
2. Can a standard be introduced where they allow multiple funders - like multiple authors? (it's unlikely we'd want to be as sophisticated as adding a %DueToGrant field!)
 
3. If Councils were to add to their mandates a sentence like:  'By [date] such records should be tagged with Funder and Grant Reference information, and made available for harvesting', what would be an appropriate [date]?  I guess this depends on the harvesting tool.  I'm told that standard OAI-PMH doesn't handle these fields and that SWAP is not widely used.  What is the best approach?
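
To make question 2 concrete, here is a minimal sketch (Python, purely illustrative) of a deposit record in which funding entries are repeatable in the same way as authors. The class names, field names and the example.org funder URIs are assumptions made up for this illustration; they are not the actual schema of EPrints, DSpace or Fedora.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Funding:
    """One funder/grant pair attached to a deposit (hypothetical structure)."""
    funder_uri: str       # assumed to come from a controlled dropdown list
    grant_reference: str  # grant reference as issued by the funder


@dataclass
class DepositRecord:
    """A repository record with repeatable authors and repeatable funding."""
    title: str
    authors: List[str]
    funding: List[Funding] = field(default_factory=list)


def records_for_funder(records: List[DepositRecord], funder_uri: str) -> List[DepositRecord]:
    """Return the records that acknowledge the given funder at least once."""
    return [
        record for record in records
        if any(f.funder_uri == funder_uri for f in record.funding)
    ]


# Hypothetical funder identifiers, for illustration only.
FUNDER_A = "http://funders.example.org/funder-a"
FUNDER_B = "http://funders.example.org/funder-b"

joint_paper = DepositRecord(
    title="A jointly funded paper",
    authors=["A. Author", "B. Author"],
    funding=[Funding(FUNDER_A, "GRANT-001"), Funding(FUNDER_B, "GRANT-002")],
)
unfunded_paper = DepositRecord(title="An unfunded paper", authors=["C. Author"])
```

With a structure like this, a funder checking compliance would only need to filter harvested records on its own URI, e.g. `records_for_funder([joint_paper, unfunded_paper], FUNDER_A)`.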
 
Additionally, some Councils mandate deposition only 'where a suitable repository exists'.  Should we change this to something like 'where a suitable Institutional Repository does not exist it is expected that the JISC-supported repository of last resort, 'The Depot', will be used.'?
 
Many thanks, Gerry Lawson
 
NERC Research Information Systems
RCUK fEC Review
01793-444417
 

From: Repositories discussion list on behalf of Stevan Harnad
Sent: Thu 05/02/2009 22:33
To: [log in to unmask]
Subject: Re: Repositories: Institutional or Central ? [in French, from Rector's blog, U. Liège]

On Thu, Feb 5, 2009 at 12:34 PM, Chanier Thierry <[log in to unmask]> wrote:

TC: 
I agree. The question of tools for central repository (CR) is central.

- it is preferable to avoid opposing CR and institutional repository (IR).

They are not opposed. Both are welcome and useful. What is under discussion is locus of deposit. (The deposited document itself, once deposited, may be exported, imported, harvested to/from as many repositories as desired. The crucial question is where it is actually deposited, and especially where deposit mandates from funders stipulate that it should be deposited.)

The issues for locus-of-deposit are:

(1) Single or multiple deposit? 

I think everyone would agree that at a time when most authors (85%) are not yet depositing at all, this is not the time to talk about depositing the same paper more than once.

(2) If single deposit: where, institution-internally or institution-externally? 

The author's institutional repository (IR) might be his university's IR, or his research institute's IR, or the IR of some subset of his institution, such as his department's IR or his laboratory's IR. The point is that the locus of production of all research output -- funded and unfunded, in all disciplines and worldwide -- is the author's institution. The author's institution also has a shared stake and interest with its authors in hosting and showcasing their joint research output.

All other links to the author's research are fragmented: Some of it will be funded by some funders, some by others, and some will be unfunded. Some will be in some discipline or subdiscipline, some in another, some in several. There is much scope for collecting it together in various combinations into such institution-external collections, but it makes no sense at all to deposit directly in some or all of them: One deposit is enough, and the rest can be harvested automatically. The natural and optimal locus for that one deposit is at the universal source: the author's own institution.

(3) Import/Export/Harvest from where to where?

The natural and optimal procedure is: deposit institution-internally and then, where desired, import/export/harvest institution-externally. This one-to-many procedure makes sense from every standpoint: Single convergent deposit, convergent mandates, maximal flexibility and efficiency, minimal effort and complication (hence maximal willingness and compliance from authors). The alternative, of many-to-one importation, or many-to-many import/export means multiple, divergent deposit, divergent mandates, reduced flexibility and efficiency, increased effort and complications (and hence reduced willingness and compliance from authors).

TC: 
In some countries, CRs may be prominent (particularly because local
institutions have a low status, so IRs may not mean much to researchers ...
when they exist), because centralized procedures for evaluating research
may offer opportunity to researchers to start depositing - see hereafter
about France -).

Institutional status-level is irrelevant, because research is not searched at the individual IR level but at the harvester (CR) level. We are discussing here what is the optimal locus of deposit, so as to capture (and mandate the capture) all of OA's target content, worldwide, and as quickly and efficiently as possible. What matters for this is to find a procedure for systematically capturing all research output, and the natural and exhaustive locus for that is at the source: the institution (university, research institute, department, laboratory) that hosts the researcher, pays his salary, and provides his institutional affiliation.

There is of course research evaluation at the institution-internal as well as the institution-external (funder and national) level. But even for national research assessment exercises, such as the RAE in the UK, the institution and department are the "unit of assessment"; they are local, and distributed. And the natural locus for their research output is their own IRs. And that is exactly how many UK universities provided their submissions to RAE 2008. See the IRRA.


TC: 
- Researchers should be free to choose where they deposit but with
requirements to deposit. They may do it in different repositories (I mean
one document is only in one place, but depending on the nature of the
document / data, one may choose various repositories)

I am afraid that it is here that we reach the gist of the matter (and the height of the misunderstanding and equivocation):

First, the only kind of deposit under discussion here is OA's primary target content: refereed journal articles. That is also the only deposit requirement (mandate) under discussion here, because although there are many other things an author might choose to deposit too -- books, software, multimedia, courseware, research data -- those are optional contents insofar as OA deposit mandates are concerned. And it is specifically the locus of deposit of the required contents (refereed journal articles) that matters so much, particularly in funder mandate policies.

So whereas it may seem optimal for a funder to simply require deposit in some OA repository or other, but to leave it up to the author to choose which (and such a funder mandate is certainly preferable to a mandate that specifies deposit in a CR, or to no mandate at all), this is in fact far from being the optimal mandate, for the reasons discussed by Prof. Rentier: 

Most researchers (85%) do not deposit unless they are required to. Funders can only mandate the deposit of the research that they fund. If they require that it must be deposited in a specific CR, they are in direct competition with institutional mandates (necessitating double or divergent deposit). If funder mandates simply leave it open where authors deposit, then they are not in competition with IR mandates, but they are not helping them either. As noted, institutions are the producers of all research output -- funded and unfunded, in all disciplines, worldwide. Only 30 institutions mandate deposit so far, worldwide (out of tens of thousands). If a funder mandates deposit, but is open-ended about locus of deposit, it leaves institutions in their current state of inertia. But if the funder specifically stipulates IR deposit, it thereby immediately increases the probability and the motivation for creating an IR as well as adopting an institutional deposit mandate for the rest of the research output of every one of the institutions that have a researcher funded by that funder.

TC: 
- It is a tactical decision for OA supporters, knowing the local habits,
to advertise ways of deposit to colleagues

But we already know that advertisement, encouragement, exhortation, evidence of benefits, assistance -- none of these is sufficient to get most researchers to deposit. Only requirements (mandates) work (and you seem to agree).

Now institutions are the "sleeping giant" of OA, because they are the universal providers of all of OA's target content. So to induce the "sleeping giant" to wake up and mandate OA for all of his research output, there has to be something in it for him (or rather them, because the "sleeping giant" is in fact a global network of universities and research institutions). What is in it for each of them? A collection of its own institutional research output that it can host, manage, audit, assess and showcase. What use is it to each of them if their research output is scattered globally willy-nilly, in diverse CRs? It increases the research impact of the institution's research output, to be sure, but how to measure, credit, showcase and benefit from that, institutionally, when it is scattered willy-nilly? 

Now, as noted, importation/exportation/harvesting can in principle work both ways. But if a university that might wish to host its own research assets has to go out and find and harvest them back from all over the web, because they were deposited institution-externally, instead of being deposited institutionally in the first place, the time and effort involved is considerably greater than simply mandating direct institutional deposit would have been -- and that back-harvest does not even yield all of the university's output: only whatever institutional research output happened to be funded by funders that also mandate OA! Yet if those funders had mandated IR deposit, all that work would already be done, and the university would have a strong incentive to adopt a mandate requiring the rest of its research output to be deposited too.

Meanwhile, for a mandating funder, harvesting the distributed IR content of all of its fundees into a CR is far easier, as the fulfillment conditions for the grant need only specify that the author should send the funder the URL for the IR deposit of all articles resulting from the grant. The rest can be done automatically by software.
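
As a rough illustration of what "done automatically by software" could look like, here is a minimal Python sketch that reads an OAI-PMH ListRecords response in unqualified Dublin Core and checks the URLs reported by a grantholder against what was harvested. The sample XML, the ir.example.org URLs, the grant reference GRANT-001 and the function names are all invented for the example; a real funder workflow would fetch the response over HTTP from each IR and would differ in many details.

```python
import xml.etree.ElementTree as ET

# Standard namespaces for OAI-PMH 2.0 and unqualified Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "oai_dc": "http://www.openarchives.org/OAI/2.0/oai_dc/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Invented sample of what an IR might return for a ListRecords request.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:ir.example.org:101</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Peat carbon fluxes</dc:title>
          <dc:identifier>http://ir.example.org/101/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
    <record>
      <header><identifier>oai:ir.example.org:102</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>River sediment transport</dc:title>
          <dc:identifier>http://ir.example.org/102/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
"""


def parse_records(xml_text):
    """Map each dc:identifier found in the response to its record's title."""
    root = ET.fromstring(xml_text)
    harvested = {}
    for record in root.findall(".//oai:record", NS):
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        for ident in record.findall(".//dc:identifier", NS):
            harvested[(ident.text or "").strip()] = title
    return harvested


def match_reported(reported, harvested):
    """For each grant, split the reported URLs into found titles and missing URLs."""
    result = {}
    for grant_ref, urls in reported.items():
        result[grant_ref] = {
            "found": [harvested[u] for u in urls if u in harvested],
            "missing": [u for u in urls if u not in harvested],
        }
    return result
```

The grantholder's only task is to report the IR URLs against the grant, e.g. `match_reported({"GRANT-001": ["http://ir.example.org/101/"]}, parse_records(SAMPLE_RESPONSE))`; anything listed under "missing" is a deposit the funder could not verify.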

TC: 
- we have to make sure that people in charge of funding research (EU,
National) do not oblige researchers to deposit in one specific place
(their CR or any other)

On the contrary, there is every reason that funders should specify the fundee's IR as the preferred locus of deposit, for the reasons just adduced. Open-ended mandates are better than competing CR mandates, but they are not nearly as good as convergent, synergistic IR mandates (to help awaken the sleeping giant).

(As I was writing this posting, two new funder mandates have been announced -- FRSQ in Canada and NRC in Norway: Both are welcome, but both are open-ended about deposit locus, and consequently both miss the opportunity to have a far greater positive effect on global OA growth, by stipulating IR deposit.)
 
TC:
- But I understand them, because when they ask researchers to give access
to their work and advertise the fact that they have been paid by them,
there is currently no practical way of doing it (labels put on deposit
with the name of the program which gave the money, and harvesters able to
compute this information ?)

Yes, precisely. Funding metadata can easily be added as a field in the IR deposit software -- and institutions will be only too happy to help in monitoring grant fulfillment conditions in this way, in exchange for the jump-start it provides for the filling of their own IRs.

TC:
- I also understand them because I feel that they want to add interesting
tools (search, computation, meta-engine), tools which could be developed
by central harvesters (CH). We are late on this issue and harvesters have
not made much progress (see hereafter).

To repeat: Locus of direct deposit has nothing whatever to do with harvester-level search. Search is not done at the IR level but at the harvester (e.g., CR)  level.

TC: 
1) HAL and research evaluation
---------------------
3 years ago I tried to convince my former lab to open a sub-archive within
HAL (same repository, but URL specific to the lab, with proper interface).
I also tried to convince my university to have a general meeting with
directors of local labs in order to invite them to do the same and, at
another level, to manage the sub-archive in HAL for the university (a
solution somewhere in between CR and IR). My colleague of the lab agreed,
started the work but gave up because of lack of time. My university never
answered my proposal.

HAL is a nationwide resource that can in principle be used (much the way the Web itself is used) to allow an institution to create and manage its own "virtual IR". As such, HAL is partly a platform for creating virtual IRs, rather than a CR.

So, essentially, what you and your colleague tried to do (and only partly succeeded) was to create and manage an IR. That's splendid, and welcome, but we already know that IRs alone are not enough. Without a mandate, they idle at the usual 15% baseline.

(Please note that a lab repository is an IR.)

TC: 
Now, thanks to procedures for evaluating research in France, labs will have
to choose the way they want to be evaluated (I mean the technical
procedure to achieve it). Some software used by the national board will
do the computation out of HAL. Consequently, my lab decided this week to
urgently re-open and manage its sub-archive in HAL. Of course, the first
thing they have to do is deposit of metadata. Actual deposit of
corresponding papers is not mandatory. But they will take the opportunity
to suggest to researchers to deposit as well their full papers.

It won't work; it's been tried many times before. So this is a great opportunity lost. As you see, the IR clearly languishes neglected without a mandate. With a mandate -- particularly one in which evaluation is based on what is deposited, as in Prof. Rentier's mandate at Liège -- researchers perk up and deposit. But if all they have to deposit is metadata, that's all they will deposit (even though adding the full-text is just one more keystroke).

The reason is that the effect of mandates is mostly not coercive. Researchers don't jump to deposit just because they are required to deposit. They actually want to deposit, but they are held back by two main constraints, one small, the other big: 

(1) The small constraint is ergonomic. Researchers are overloaded, and they will not do something extra unless it really has a high priority. A deposit mandate, especially one tied to funding and/or evaluation, gives the few minutes-worth of keystrokes per paper (which is all that a deposit amounts to) the requisite priority that they otherwise lack.

(2) The big constraint is psychological: Researchers are (groundlessly) afraid to deposit their papers (even the 63% for which the journal already gives them its explicit blessing to do so) -- afraid until and unless their institutions and/or their funders tell them they must, because then they know it is officially okay to do so! The mandate unburdens their souls, and unlocks their fingers.

TC: 
Last thing : I do not mean that in France, only HAL should be used. We
should make sure we have the choice to deposit where we please.

What France needs, like every other country, is funder and institutional mandates converging on single-locus IR deposit (irrespective of whether the IR is hosted by HAL). But if mandating funders leave locus-of-deposit open, or insist on generic deposit in some CR or other, the giant will keep hibernating, institutional (departmental, laboratory) mandates will not be adopted, and what IRs there are will continue to lie fallow.

TC:
2) Harvesters: advantages and current limits
----------
Just a personal experience. Till recently I used to advertise my list of
publications by giving the URL of an open archive Edutice (a thematic one,
VERY USEFUL in our domain, sub-part of HAL but with its local procedure,
interface, etc.).
Now I give to colleagues the OAISTER URL (with the path to follow) to get
all my publications (because some of them are in other archives).
The problem is: deposits in Edutice appear twice in the OAISTER list (as
deposits of Edutice and of HAL - but there is only one deposit).
It is a concrete example of progress which should be made to avoid
repetitions in harvesters (among many other new features).

If they had all been deposited in your own IR you would have had an automatic listing of all your works (without duplications) through a simple Google IR site-search "chanier site:http-IRetc." -- and your institution would have it all too. And so would OAIster. And you could have exported to Edutice with SWORD if you wished.

De-duplication and version-comparator software is already being developed (though it's hardly worth it, when the problem is not the presence of duplicates but the absence of even a singleton for 85% of global refereed research output) -- and that's what mandates in general -- and convergent IR mandates in particular, to awaken the slumbering giant -- are needed for.
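
For what it is worth, the harvester-side de-duplication at issue can be sketched in a few lines of Python. This is only a naive illustration, assuming that two harvested records with the same normalised title and year are the same deposit; the record layout and the sample titles are invented, and real de-duplication or version-comparison software would need to be far more careful.

```python
import re
import unicodedata


def normalise_title(title):
    """Lower-case a title, strip accents and collapse punctuation to spaces."""
    decomposed = unicodedata.normalize("NFKD", title)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return re.sub(r"[^a-z0-9]+", " ", unaccented.lower()).strip()


def deduplicate(records):
    """Merge records sharing a normalised title and year, keeping all sources."""
    merged = {}
    for record in records:
        key = (normalise_title(record["title"]), record.get("year"))
        if key in merged:
            # Same deposit seen through another archive: remember the source only.
            merged[key]["sources"].append(record["source"])
        else:
            merged[key] = {
                "title": record["title"],
                "year": record.get("year"),
                "sources": [record["source"]],
            }
    return list(merged.values())


# Invented sample: one deposit exposed by both a thematic archive and its parent.
harvested = [
    {"title": "Apprentissage des langues à distance", "year": 2007, "source": "Edutice"},
    {"title": "Apprentissage des langues a distance.", "year": 2007, "source": "HAL"},
    {"title": "Another paper", "year": 2008, "source": "HAL"},
]
unique = deduplicate(harvested)
```

Here `unique` holds two entries, the first listing both Edutice and HAL as sources, so a harvester could show the paper once while still crediting each archive.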

Stevan Harnad


 

****************************** end of Thierry's message ************

On Wed, 4 February 2009 22:12, Bernard Rentier wrote:
> I agree. It is exactly what I was trying to say in my last paragraph :
> it is my belief that launching a centralised and/or thematic repository
> (C-TR) can make sense, but only if it does not discourage authors from
> posting their publications in an institutional repository (IR),
> otherwise many publications will be lost in the process (I mean lost
> for easy and open access).
>
> In addition, direct posting in C-TRs will shortcut IRs and it will be
> a loss for universities in their attempt to  host their entire
> scholarly production (this is just a collateral effect, I know, but
> being a University President, it is a worry for me).
>
> C-TRs are of much more interest if they collect data at a secondary
> level by harvesting from primary IRs.
>
> Bernard Rentier
