JISC-REPOSITORIES Archives

Subject: Re: SUPPLEMENT to PMC & UKPMC Should Harvest From Institutional Repositories
From: Stevan Harnad <[log in to unmask]>
Reply-To: Stevan Harnad <[log in to unmask]>
Date: Sat, 14 Apr 2012 07:51:05 -0400

** Cross-Posted **

1. The harvesting of institutional repository (IR) content by central
repositories (CRs) such as PMC and UKPMC is being recommended as a
SUPPLEMENT to CRs' current content.

2. The purpose is to help open the door to funder mandates designating
direct IR deposit as the means of fulfilling the funder mandate, and
thereby to facilitate the adoption of deposit mandates by institutions
for all of their research output, not just the output covered by
funder mandates.

3. The immediate benefit of supplementing PMC and UKPMC content with
harvested IR content is (3a) that it increases PMC/UKPMC content (with
existing IR biomedical content) and (3b) that it makes the content that
is embargoed in PMC/UKPMC (for 6 to 12 months or more) immediately
accessible via the IR link (or, for IRs' embargoed content, via the
IR's automated email-eprint-request Button).

4. The metadata and rights specification of this supplementary IR
content will not be as rich, but that is incomparably less important
than the additional open access to research that will be immediately
provided.

(I would only add that this principle of harvesting instead of direct
deposit also applies to UKPMC itself: Should UKPMC not just be
harvesting from PMC? Is there really a need for a repository for
directly depositing UK biomedical research output, in addition to a
repository for depositing worldwide biomedical research output? -- But
don't be distracted by this minor matter. What's far, far more
important is to supplement the current direct deposits in PMC and
UKPMC with harvesting from IRs, thereby (a)  increasing OA and at the
same time (b) encouraging funders to mandate IR deposit, thereby (c)
increasing OA orders of magnitude more, by (d) facilitating the
adoption and implementation of universal institutional mandates.)

Stevan Harnad

On Fri, Apr 13, 2012 at 6:33 PM, Stevan Harnad <[log in to unmask]> wrote:
> Johanna McEntyre of EBI has raised an important series of questions
> about institutional deposit and institution-external harvesting (by
> PMC and UKPMC) versus direct institution-external deposit (in PMC and
> UKPMC), so I have replied in quote/commentary format:
>
> On Fri, Apr 13, 2012 at 11:27 AM, Johanna McEntyre <[log in to unmask]> wrote:
>
>> Stevan,
>>
>> Thanks for these comments on how PMC & UKPMC could be improved. While I can't respond to the mandate changes suggested, I can comment on the suggestion that UKPMC should harvest/link to IR versions of papers.
>>
>> We have considered doing this in some depth.  However, for a number of reasons this is not as straightforward to actually do as it is to say:
>>
>> (1) Firstly, UKPMC is a full text article database. Harvesting protocols such as OAI-PMH deal in metadata only. UKPMC is already supplemented by PubMed, Agricola, and EPO patent abstracts (about 26 million of them), so it is unclear how much content routine harvesting would add.
>
>
> It will (i) add to UKPMC all UK biomedical research output that is
> currently being self-archived -- spontaneously or mandatorily -- in
> its respective authors' respective institutional repositories but not
> mandated for UKPMC deposit.
>
> Much more important, it will (ii) greatly facilitate and strengthen
> the adoption of self-archiving mandates by the rest of the UK's
> institutions, thereby (iii) generating much more UK OA content (in all
> disciplines) -- including  much more UK biomedical output for
> harvesting into UKPMC.
>
>> (2) Secondly, there is no clean way to identify life science & related content in IRs (this is a matter of research not production-level functionality), apart from perhaps resolving metadata to PMIDs, which then of course would not add new content to UKPMC.
>
>
> If UKPMC harvested from IRs (and, even more important, if the funders
> that now mandate direct deposit in UKPMC instead mandated deposit in
> IRs, for harvesting by UKPMC), the software for identifying UK
> biomedical output would rapidly (and happily) be developed.
>
> The lack of identifying software is not the problem: the lack of
> institutional self-archiving mandates is; and funders insisting on
> UKPMC instead of IR deposit and UKPMC harvest compounds the problem
> instead of contributing to its solution.
>
>> (3) Thirdly, because UKPMC is primarily interested in full text articles, we would want to identify those records in IRs that have full text. Again, there is no clean programmatic way of doing this that we know of. If anyone knows how to do this programmatically then we would be interested in learning how.
>
>
> This too is a problem that IR software can easily solve -- if given
> the incentive of (a) IR deposit mandates and (b) UKPMC harvesting
> capability.
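
A minimal sketch of one heuristic for point (3), spotting records that probably carry full text. It assumes the IR exposes unqualified Dublin Core and records a MIME type in dc:format or a file URL in dc:identifier; not every repository does, so a negative result proves nothing. The function name and the sample records are hypothetical.

```python
import xml.etree.ElementTree as ET

# Clark-notation prefix for the Dublin Core element namespace.
DC = "{http://purl.org/dc/elements/1.1/}"

# Assumption: a PDF MIME type or a ".pdf" link signals an attached full text.
FULLTEXT_MIME_TYPES = {"application/pdf"}

def looks_like_full_text(dc_element):
    """Return True if the Dublin Core record hints at an attached full text."""
    formats = {(e.text or "").strip().lower() for e in dc_element.iter(DC + "format")}
    if formats & FULLTEXT_MIME_TYPES:
        return True
    return any(
        (e.text or "").strip().lower().endswith(".pdf")
        for e in dc_element.iter(DC + "identifier")
    )

# Hypothetical records: one with a deposited PDF, one with metadata only.
WITH_PDF = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                      xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Deposited paper</dc:title>
  <dc:format>application/pdf</dc:format>
  <dc:identifier>http://ir.example.ac.uk/1234/1/paper.pdf</dc:identifier>
</oai_dc:dc>"""

METADATA_ONLY = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>Citation-only record</dc:title>
  <dc:identifier>http://ir.example.ac.uk/5678/</dc:identifier>
</oai_dc:dc>"""

if __name__ == "__main__":
    print(looks_like_full_text(ET.fromstring(WITH_PDF)))
    print(looks_like_full_text(ET.fromstring(METADATA_ONLY)))
```

A heuristic like this cannot tell an open PDF from a Closed Access one, so the embargo status would still have to come from the repository itself.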
>
>> (4) Finally, PMC & UKPMC (and PMC Canada) archive full text articles in XML. This structured content facilitates:
>>
>> (a) linking to related public life science databases such as UniProt;
>> (b) operations such as text mining and smart indexing (e.g. restricting searches to figure legends);
>> (c) ensures the integrity of the archive, since viewed articles are rendered from the XML database to HTML on the fly; and
>> (d) reuse by third parties, in the case of OA articles.
>
>
> That's all fine, for the OA content already being deposited in UKPMC (+ PMC).
>
> But that is only a small fraction of total biomedical (or UK
> biomedical) output, all of which is provided by institutions.
>
> Surely additional OA content, even if less optimally tagged, is
> preferable to less OA content, optimally tagged. That will also
> provide the incentive to upgrade the tagging of the extra IR content
> to XML -- and eventually IRs will graduate to XML too: but first
> things first. And the overwhelming priority is not XML but OA itself!
>
>> Therefore, in the event that we could identify life science full text articles in IRs, we would want to add the ones we don't already have to UKPMC, not just link to them. For those articles, there is a lack of clarity regarding licensing information. Establishing the license of a given article currently requires a manual process and therefore is not at all scalable or sustainable. The only way around this that I can envision is for licensing information to be represented formally in structured data, with the best enabling licenses for content exchange being CC-BY or CC0.
>
>
> Same reply about licensing as about XML tagging, above:  Surely
> additional OA content, even if less optimally licensed, is preferable
> to less OA content, optimally licensed.
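
A minimal sketch of what "licensing information represented formally in structured data" could allow: if a record's rights field carried a Creative Commons licence URL, a harvester could sort records without manual checking. It assumes the rights statements are available as plain strings (for example from dc:rights); the function name and the two-licence table are hypothetical and cover only the CC-BY and CC0 cases named in the message.

```python
import re

# Assumption: licences are expressed as Creative Commons URLs.
# The trailing digit keeps "by-nc", "by-nd" and similar variants from matching.
CC_PATTERNS = [
    (re.compile(r"creativecommons\.org/licenses/by/\d"), "CC-BY"),
    (re.compile(r"creativecommons\.org/publicdomain/zero/\d"), "CC0"),
]

def classify_rights(rights_values):
    """Return "CC-BY", "CC0", or None for a list of rights statements."""
    for value in rights_values:
        lowered = value.lower()
        for pattern, label in CC_PATTERNS:
            if pattern.search(lowered):
                return label
    # No recognised licence URL: the record would still need a manual check.
    return None

if __name__ == "__main__":
    print(classify_rights(["http://creativecommons.org/licenses/by/3.0/"]))
    print(classify_rights(["All rights reserved"]))
```

Records that return None are the ones for which only the metadata and link could safely be harvested.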
>
>> If we harvest full-text content into UKPMC - which we do not have the right to harvest - we know from experience that this would be subject to a take-down request.  Harvesting content, converting it to XML, and then being asked to remove it from the repository is not a strategy we wish to follow.
>
>
> That provides yet another good reason for just harvesting the metadata
> and URL for the time being. It will facilitate the generation of much
> more OA, for the reasons mentioned, and eventually will lead to
> optimal tagging and licensing too.
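
A minimal sketch of "just harvesting the metadata and URL", assuming the IR answers an OAI-PMH ListRecords request with unqualified Dublin Core (oai_dc). The response below is a hand-written, hypothetical sample rather than output from a real repository, and the function name is illustrative; no network call is made.

```python
import xml.etree.ElementTree as ET

# Namespaces defined by the OAI-PMH 2.0 and Dublin Core specifications.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical ListRecords response with a single record.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:ir.example.ac.uk:1234</identifier>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>A hypothetical biomedical paper</dc:title>
          <dc:identifier>http://ir.example.ac.uk/1234/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

def extract_links(xml_text):
    """Return the OAI identifier, title, and http(s) URLs of each record."""
    root = ET.fromstring(xml_text)
    results = []
    for record in root.iterfind(".//oai:record", NS):
        oai_id = record.findtext("oai:header/oai:identifier", default="", namespaces=NS)
        title = record.findtext(".//dc:title", default="", namespaces=NS)
        # dc:identifier may also hold DOIs or citations; keep only web links.
        urls = [
            e.text.strip()
            for e in record.iterfind(".//dc:identifier", NS)
            if e.text and e.text.strip().startswith("http")
        ]
        results.append({"oai_id": oai_id, "title": title, "urls": urls})
    return results

if __name__ == "__main__":
    for rec in extract_links(SAMPLE_RESPONSE):
        print(rec["title"], rec["urls"])
```

Nothing is copied from the IR here beyond the link, so the take-down concern raised above would not apply to the central repository.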
>
>> Content exchange to maximize usage in different contexts need not be a one-way process. Another option to consider is to encourage authors to deposit centrally (so we can do the things listed above) and then push content from UKPMC to populate IRs, for the purpose of institutional reporting, for example. We have an FTP site of OA articles: http://ukpmc.ac.uk/ftp/oa (there are over 400,000 OA articles there now) and will soon be releasing a web service that will retrieve metadata and full text (in the case of OA articles).
>
>
> There are perhaps 3-4 major discipline-based central repositories of
> any nontrivial size (mainly arXiv in physics, PMC/UKPMC in biomedicine
> and SSRN in social sciences). In contrast, there are at least 10,000
> research-active institutions generating all of the planet's research
> output in at least 40 STM and humanities disciplines.
>
> Do you really think that a realistic and natural way to make the
> research output of all those institutions and disciplines OA is to
> wait for it to be spontaneously deposited in an institution-external
> repository, and then back-harvest it to the institution from which
> it originated?
>
> What is needed is institutional self-archiving mandates, for all
> research, funded and unfunded. Funder mandates that require
> institution-external deposits, and institution-external repositories
> that require direct deposit instead of harvesting are needlessly
> creating impediments to the adoption and implementation of OA mandates
> by the universal providers of all research, funded and unfunded: the
> planet's universities and research institutes.
>
>> I'd also like to add that we are actively exploring how UKPMC can integrate with IRs, in particular with respect to related data resources via the EBI's partnership in the OpenAIRE Plus project. We will be continuing to collaborate to explore how IRs and UKPMC can interoperate better.
>
> The returns from integrating with the sparse contents of IRs (most of
> them unmandated, hence near empty) are a far cry from what they could
> be if PMC and UKPMC (and funder mandates!) took the simple step of
> harvesting from IRs instead of requiring direct institution-external
> deposit.
>
> Stevan Harnad
>>
>> Jo McEntyre
>>
>>
>> On Apr 12, 2012, at 12:05 PM, Stevan Harnad wrote:
>>
>> > On 2012-04-12, at 5:44 AM, Steve Hitchcock wrote:
>> >
>> >> Do we know why Pubmed does not apparently link to papers in IRs?
>> >> Is this Pubmed policy, or is there a technical reason?
>> >>
>> >> Stephen Curry: PubMed, the first port of call for anyone searching
>> >> the biomedical literature, frequently links to publisher’s site but
>> >> never to institutional repositories
>> >> http://occamstypewriter.org/scurry/2012/03/18/elsevier-the-research-works-act-and-open-access-where-to-now/
>> >
>> > PubMed & PubMed Central are wonderful resources, but not nearly
>> > as resourceful or wonderful as they easily could be.
>> >
>> > (1) PMC & UKPMC should of course be harvesting or linking
>> > institutional repository (IR) versions of papers, not just
>> > PMC/UKPMC-deposited and publisher-hosted papers.
>> >
>> > (2) Funders should be mandating IR deposit and PMC harvesting
>> > rather than direct PMC deposit. By thus making funder mandates
>> > and institutional mandates convergent and collaborative instead
>> > of divergent and competitive, this will motivate and facilitate adoption of
>> > and compliance with institutional mandates: institutions are the universal
>> > providers of all research output, funded and unfunded.
>> >
>> > (3) IRs should mandate immediate deposit irrespective of publisher
>> > OA policy: If authors wish to honor publisher OA embargoes, they
>> > can set access to the deposit as Closed Access during the embargo
>> > and rely on providing almost-OA via the IR's email eprint request button.
>> >
>> > (4) Funder mandates should require deposit by the fundee -- the one
>> > bound by the mandate -- rather than by the publisher, who is not
>> > bound by the mandate, and indeed in conflict of interest with it.
>> > http://openaccess.eprints.org/index.php?/archives/876-.html
>> >
>> > (5) Publishers (partly to protect from rival publisher free-loading,
>> > partly to discourage funder mandates, and partly out of simple
>> > misunderstanding of network capability) are much more likely
>> > to endorse immediate institutional self-archiving than institution-external
>> > deposit. This is yet another reason funders should mandate institutional
>> > deposit and metadata harvesting instead of direct institution-external deposit.
>> >
>> > Stevan Harnad
>> >
>> >
>> > _______________________________________________
>> > GOAL mailing list
>> > [log in to unmask]
>> > http://mailman.ecs.soton.ac.uk/mailman/listinfo/goal
