The natural and optimal procedure is: deposit institution-internally and then, where desired, import/export/harvest institution-externally. This one-to-many procedure makes sense from every standpoint: single convergent deposit, convergent mandates, maximal flexibility and efficiency, minimal effort and complication (hence maximal willingness and compliance from authors). The alternative, many-to-one importation or many-to-many import/export, means multiple, divergent deposit, divergent mandates, reduced flexibility and efficiency, and increased effort and complication (hence reduced willingness and compliance from authors).
TC:
In some countries, CRs may be prominent (particularly because local
institutions have a low status, so IRs may not mean much to researchers ...
when they exist), because centralized procedures for evaluating research
may offer opportunity to researchers to start depositing - see hereafter
about France -).
Institutional status-level is irrelevant, because research is not searched at the individual IR level but at the harvester (CR) level. We are discussing here what is the optimal locus of deposit, so as to capture (and to mandate the capture of) all of OA's target content, worldwide, as quickly and efficiently as possible. What matters for this is to find a procedure for systematically capturing all research output, and the natural and exhaustive locus for that is at the source: the institution (university, research institute, department, laboratory) that hosts the researcher, pays his salary, and provides his institutional affiliation.
There is of course research evaluation at the institution-internal as well as the institution-external (funder and national) level. But even for national research assessment exercises, such as the RAE in the UK, the institution and department are the "unit of assessment"; they are local, and distributed. And the natural locus for their research output is their own IRs. And that is exactly how many UK universities provided their submissions to RAE 2008. See the IRRA.
TC:
- Researchers should be free to choose where they deposit but with
requirements to deposit. They may do it in different repositories (I mean
one document is only in one place, but depending on the nature of the
document / data, one may choose various repositories)
I am afraid that it is here that we reach the gist of the matter (and the height of the misunderstanding and equivocation):
First, the only kind of deposit under discussion here is OA's primary target content: refereed journal articles. That is also the only deposit requirement (mandate) under discussion here, because although there are many other things an author might choose to deposit too -- books, software, multimedia, courseware, research data -- those are optional contents insofar as OA deposit mandates are concerned. And it is specifically the locus of deposit of the required contents (refereed journal articles) that matters so much, particularly in funder mandate policies.
So whereas it may seem optimal for a funder to simply require deposit in some OA repository or other, but to leave it up to the author to choose which (and such a funder mandate is certainly preferable to a mandate that specifies deposit in a CR, or to no mandate at all), this is in fact far from being the optimal mandate, for the reasons discussed by Prof. Rentier:
Most researchers (85%) do not deposit unless they are required to. Funders can only mandate the deposit of the research that they fund. If they require that it must be deposited in a specific CR, they are in direct competition with institutional mandates (necessitating double or divergent deposit). If funder mandates simply leave it open where authors deposit, then they are not in competition with IR mandates, but they are not helping them either. As noted, institutions are the producers of all research output -- funded and unfunded, in all disciplines, worldwide. Only 30 institutions mandate deposit so far, worldwide (out of tens of thousands). If a funder mandates deposit, but is open-ended about locus of deposit, it leaves institutions in their current state of inertia. But if they specifically stipulate IR deposit, they thereby immediately increase the probability and the motivation for creating an IR as well as adopting an institutional deposit mandate for the rest of the research output of every one of the institutions that have a researcher funded by that funder.
TC:
- It is a tactical decision for OA supporters, knowing the local habits,
to advertise ways of deposit to colleagues
But we already know that advertisement, encouragement, exhortation, evidence of benefits, assistance -- none of these is sufficient to get most researchers to deposit. Only requirements (mandates) work (and you seem to agree).
Now institutions are the "sleeping giant" of OA, because they are the universal providers of all of OA's target content. So to induce the "sleeping giant" to wake up and mandate OA for all of his research output, there has to be something in it for him (or rather them, because the "sleeping giant" is in fact a global network of universities and research institutions). What is in it for each of them? A collection of its own institutional research output that it can host, manage, audit, assess and showcase. What use is it to each of them if their research output is scattered globally willy-nilly, in diverse CRs? It increases the research impact of the institution's research output, to be sure, but how to measure, credit, showcase and benefit from that, institutionally, when it is scattered willy-nilly?
Now, as noted, importation/exportation/harvesting can in principle work both ways. But if a university that might wish to host its own research assets has to go out and find and harvest them back from all over the web, because they were deposited institution-externally, instead of being deposited institutionally in the first place, the time and effort involved is considerably greater than simply mandating direct institutional deposit would have been -- and that back-harvest does not even yield all of the university's output: only whatever institutional research output happened to be funded by funders that also mandate OA! Yet if those funders had mandated IR deposit, all that work would already be done, and the university would have a strong incentive to adopt a mandate requiring the rest of its research output to be deposited too.
Meanwhile, for a mandating funder, harvesting the distributed IR content of all of its fundees into a CR is far easier, as the fulfillment conditions for the grant need only specify that the author should send the funder the URL for the IR deposit of all articles resulting from the grant. The rest can be done automatically by software.
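To make the "done automatically by software" step a little more concrete, here is a minimal sketch of the funder's side, assuming only that each fundee reports a (grant identifier, IR deposit URL) pair. The function name build_funder_index, the grant identifiers and the example.* hostnames are all hypothetical, invented for illustration; an actual funder CR would presumably go on to harvest the metadata behind each URL rather than just index the links.

```python
from collections import defaultdict
from urllib.parse import urlparse


def build_funder_index(reports):
    """Group fundee-reported IR deposit URLs by grant and by hosting repository.

    `reports` is an iterable of (grant_id, url) pairs. Both the pair format
    and this function are illustrative assumptions, not an existing tool.
    """
    by_grant = defaultdict(list)
    by_repository = defaultdict(list)
    for grant_id, url in reports:
        host = urlparse(url).netloc
        if not host:
            # A fundee must report a resolvable, absolute URL.
            raise ValueError(f"not an absolute URL: {url!r}")
        by_grant[grant_id].append(url)
        by_repository[host].append(url)
    return dict(by_grant), dict(by_repository)


# Hypothetical reports sent in by fundees at three grant-fulfillment points.
reports = [
    ("GRANT-001", "https://ir.univ-a.example/eprint/101"),
    ("GRANT-001", "https://ir.univ-b.example/eprint/7"),
    ("GRANT-002", "https://ir.univ-a.example/eprint/102"),
]
by_grant, by_repository = build_funder_index(reports)
```

The point of the sketch is only that the funder needs nothing from the author beyond the URL: the grouping by grant serves grant monitoring, and the grouping by repository tells the funder's harvester which IRs to visit.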
TC:
- we have to make sure that people in charge of funding research (EU,
National) do not oblige researchers to deposit in one specific place
(their CR or any other)
On the contrary, there is every reason that funders should specify the fundee's IR as the preferred locus of deposit, for the reasons just adduced. Open-ended mandates are better than competing CR mandates, but they are not nearly as good as convergent, synergistic IR mandates (to help awaken the sleeping giant).
(As I was writing this posting, two new funder mandates have been announced -- FRSQ in Canada and NRC in Norway: Both are welcome, but both are open-ended about deposit locus, and consequently both miss the opportunity to have a far greater positive effect on global OA growth, by stipulating IR deposit.)
TC:
- But I understand them, because when they ask researchers to give access
to their work and advertise the fact that they have been paid by them,
there is currently no practical way of doing it (labels put on deposit
with the name of the program which gave the money, and harvesters able to
compute this information ?)
Yes, precisely. Funding metadata can easily be added as a field in the IR deposit software -- and institutions will be only too happy to help in monitoring grant fulfillment conditions in this way, in exchange for the jump-start it provides for the filling of their own IRs.
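As a rough illustration of what such a funding field could look like, here is a sketch of a deposit record carrying (funder, grant) labels, with two helper functions a funder might use to monitor fulfillment. The Deposit class, its field names and the helper functions are assumptions made for this example; they do not describe the schema of any particular IR software.

```python
from dataclasses import dataclass, field


@dataclass
class Deposit:
    """One IR deposit record (hypothetical schema, for illustration only)."""
    title: str
    authors: list
    full_text_url: str = ""  # empty when only metadata was deposited
    funders: list = field(default_factory=list)  # (funder, grant_id) pairs


def grant_outputs(deposits, funder, grant_id):
    """Return every deposit labelled with the given funder and grant."""
    return [d for d in deposits if (funder, grant_id) in d.funders]


def missing_full_text(deposits, funder, grant_id):
    """Titles deposited for a grant as metadata only, with no full text."""
    return [d.title for d in grant_outputs(deposits, funder, grant_id)
            if not d.full_text_url]


# Hypothetical IR contents: two funded papers and one unfunded paper.
ir_deposits = [
    Deposit("Paper A", ["Author One"], "https://ir.example/a.pdf",
            [("Funder X", "G-1")]),
    Deposit("Paper B", ["Author Two"], "", [("Funder X", "G-1")]),
    Deposit("Paper C", ["Author Three"], "https://ir.example/c.pdf"),
]
funded = grant_outputs(ir_deposits, "Funder X", "G-1")
incomplete = missing_full_text(ir_deposits, "Funder X", "G-1")
```

Note that the unfunded Paper C sits in the same IR as the funded papers, which is the institutional benefit at issue: the funder sees its own slice, and the institution keeps the whole.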
TC:
- I also understand them because I feel that they want to add interesting
tools (search, computation, meta-engine), tools which could be developed
by central harvesters (CH). We are late on this issue and harvesters have
not made much progress (see hereafter).
To repeat: Locus of direct deposit has nothing whatever to do with harvester-level search. Search is not done at the IR level but at the harvester (e.g., CR) level.
TC:
1) HAL and research evaluation
---------------------
3 years ago I tried to convince my former lab to open a sub-archive within
HAL (same repository, but URL specific to the lab, with proper interface).
I also tried to convince my university to have a general meeting with
directors of local labs in order to invite them to do the same and, at
another level, to manage the sub-archive in HAL for the university (a
solution somewhere in between CR and IR). My colleague of the lab agreed,
started the work but gave up because of lack of time. My university never
answered my proposal.
HAL is a nationwide resource that can in principle be used (much the way the Web itself is used) to allow an institution to create and manage its own "virtual IR". As such, HAL is partly a platform for creating virtual IRs, rather than a CR.
So, essentially, what you and your colleague tried to do (and only partly succeeded in doing) was to create and manage an IR. That's splendid, and welcome, but we already know that IRs alone are not enough. Without a mandate, they idle at the usual 15% baseline.
(Please note that a lab repository is an IR.)
TC:
Now, thanks to procedures for evaluating research in France, labs will have
to choose the way they want to be evaluated (I mean the technical
procedure to achieve it). Some software used by the national board will
do the computation out of HAL. Consequently, my lab decided this week to
urgently re-open and manage its sub-archive in HAL. Of course, the first
thing they have to do is deposit of metadata. Actual deposit of
corresponding papers is not mandatory. But they will take the opportunity
to suggest to researchers to deposit as well their full papers.
It won't work; it's been tried many times before. So this is a great opportunity lost. As you see, the IR clearly languishes neglected without a mandate. With a mandate -- particularly one in which evaluation is based on what is deposited, as in Prof. Rentier's mandate at Liège -- researchers perk up and deposit. But if all they have to deposit is metadata, that's all they will deposit (even though adding the full-text is just one more keystroke).
The reason is that the effect of mandates is mostly not coercive. Researchers don't jump to deposit just because they are required to deposit. They actually want to deposit, but they are held back by two main constraints, one small, the other big:
(1) The small constraint is ergonomic. Researchers are overloaded, and they will not do something extra unless it really has a high priority. A deposit mandate, especially one tied to funding and/or evaluation, gives the few minutes-worth of keystrokes per paper (which is all that a deposit amounts to) the requisite priority that they otherwise lack.
(2) The big constraint is psychological: Researchers are (groundlessly) afraid to deposit their papers (even the 63% for which the journal already gives them its explicit blessing to do so) -- afraid until and unless their institutions and/or their funders tell them they must, because then they know it is officially okay to do so! The mandate unburdens their souls, and unlocks their fingers.
TC:
Last thing : I do not mean that in France, only HAL should be used. We
should make sure we have the choice to deposit where we please.
What France needs, like every other country, is funder and institutional mandates converging on single-locus IR deposit (irrespective of whether the IR is hosted by HAL). But if mandating funders leave locus-of-deposit open, or insist on generic deposit in some CR or other, the giant will keep hibernating, institutional (departmental, laboratory) mandates will not be adopted, and what IRs there are will continue to lie fallow.
TC:
2) Harvesters : advantages and current limits
----------
Just a personal experience. Till recently I used to advertise my list of
publications by giving the URL of an open archive Edutice (a thematic one,
VERY USEFUL in our domain, sub-part of HAL but with its local procedure,
interface, etc.).
Now I give to colleagues the OAISTER URL (with the path to follow) to get
all my publications (because some of them are in other archives).
The problem is : deposits in Edutice appear twice in the OAISTER list (as
deposits of Edutice and of HAL - but there is one only deposit).
It is a concrete example of progress which should be made to avoid
repetitions in harvesters (among many other new features).
If they had all been deposited in your own IR you would have had an automatic listing of all your works (without duplications) through a simple Google IR site-search "chanier site:http-IRetc." -- and your institutions would have it all too. And so would OAIster. And you could have exported to Edutice with SWORD if you wished.
De-duplication and version-comparator software is already being developed (though it's hardly worth it, when the problem is not the presence of duplicates but the absence of even a singleton for 85% of global refereed research output). Remedying that absence is what mandates in general -- and convergent IR mandates in particular, to awaken the slumbering giant -- are needed for.
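For what it is worth, the kind of duplication described above (one deposit surfacing once as Edutice and once as HAL) is the easy case. Here is a minimal sketch, assuming a harvester holds each record as a dictionary with "title", "year" and "source" keys. Matching on normalized title plus year is a deliberately crude heuristic chosen for this illustration; it is not a description of how OAIster or any other harvester actually de-duplicates, and the record contents below are invented.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation and spacing differences."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def deduplicate(records):
    """Merge records that share a normalized title and year.

    Keeps the first record seen for each paper and accumulates the list
    of sources it was harvested from. Order of first appearance is kept.
    """
    merged = {}
    for rec in records:
        key = (normalize_title(rec["title"]), rec["year"])
        if key not in merged:
            merged[key] = {"title": rec["title"], "year": rec["year"],
                           "sources": [rec["source"]]}
        elif rec["source"] not in merged[key]["sources"]:
            merged[key]["sources"].append(rec["source"])
    return list(merged.values())


# Hypothetical harvest: the first paper arrives via two archives.
harvested = [
    {"title": "Learning Online: A Study", "year": 2007, "source": "Edutice"},
    {"title": "Learning online - a study", "year": 2007, "source": "HAL"},
    {"title": "Another Paper", "year": 2008, "source": "HAL"},
]
unique = deduplicate(harvested)
```

A heuristic like this would wrongly merge two distinct papers that share a title and year, and would miss versions whose titles changed, which is why version-comparison is the harder part of the problem.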
Stevan Harnad
****************************** end of Thierry's message ************
On Wed, 4 February 2009 22:12, Bernard Rentier wrote:
> I agree. It is exactly what I was trying to say in my last paragraph :
> it is my belief that launching a centralised and/or thematic repository
> (C-TR) can make sense, but only if it does not discourage authors from
> posting their publications in an institutional repository (IR),
> otherwise many publications will be lost in the process (I mean lost
> for easy and open access).
>
> In addition, direct posting in C-TRs will shortcut IRs and it will be
> a loss for universities in their attempt to host their entire
> scholarly production (this is just a collateral effect, I know, but
> being a University President, it is a worry for me).
>
> C-TRs are of much more interest if they collect data at a secondary
> level by harvesting from primary IRs.
>
> Bernard Rentier