At USQ, one of our systems analysts, Tim McCallum, has used some PHP code that loops through all publicly available items within an IR and queries Google Scholar for visibility. The end result is a report in the form of a spreadsheet that lists which items are visible and which are not. It also gives a total count (the percentage of visible items).
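
For anyone curious about the shape of such a report, here is a minimal sketch in Python. It is not Tim's PHP code, which is not reproduced in this message. The function name visibility_report, the column names, and the is_visible callback are all assumptions made for illustration, and how the callback actually checks Google Scholar is deliberately left to the caller.

```python
import csv
import io
from typing import Callable, Iterable, Tuple


def visibility_report(
    items: Iterable[Tuple[str, str]],
    is_visible: Callable[[str], bool],
) -> Tuple[str, float]:
    """Return (CSV text, percentage of visible items).

    items      -- (item_id, title) pairs for the public items in the repository
    is_visible -- hypothetical callback reporting whether a title was found
                  in Google Scholar; how it checks is left to the caller
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["item_id", "title", "visible"])

    total = 0
    visible_count = 0
    for item_id, title in items:
        found = bool(is_visible(title))
        writer.writerow([item_id, title, "yes" if found else "no"])
        total += 1
        visible_count += int(found)

    # Guard against an empty repository to avoid dividing by zero.
    percentage = 100.0 * visible_count / total if total else 0.0
    return buffer.getvalue(), percentage


if __name__ == "__main__":
    # Made-up sample data; a real run would read the items from the repository.
    sample = [("1", "Paper A"), ("2", "Paper B"), ("3", "Paper C"), ("4", "Paper D")]
    known = {"Paper A", "Paper C", "Paper D"}
    report, pct = visibility_report(sample, lambda title: title in known)
    print(report)
    print(f"Visible: {pct:.1f}%")
```

The CSV text opens directly in a spreadsheet application, and the percentage is returned separately so it can be shown as a summary line.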

 

If anyone would like a copy of the code, you can contact Tim directly at [log in to unmask].

 

 

Pasquale (Pat) Loria

Research Librarian, Library Services

Global Learning Division

University of Southern Queensland | Toowoomba 4350 QLD Australia

T: +61 7 4631 1778 | Fax: +61 7 4631 1493 | Email: [log in to unmask]

 

From: Repositories discussion list [mailto:[log in to unmask]] On Behalf Of Gould, Sara
Sent: Thursday, 16 February 2012 1:29 AM
To: [log in to unmask]
Subject: Re: Google Scholar discoverability of repository content

 

Dear all

This discussion is very timely for EThOS because we're now talking to Google about having Google Scholar index the EThOS metadata, something that has been on the EThOS plans for quite some time.

 

It would be really helpful to hear of any specific challenges to look out for, or suggestions for a smooth path to successful indexing. Feel free to reply to the list or to me or Andrew direct.

 

We're already consulting with EThOS participating institutions that fund digitisation of their own theses through EThOS because we would naturally expect demand for digitisation to increase once the records are searchable through Google Scholar, but we'd also be keen to hear more generally from EThOS member institutions if you have questions or comments about our plans for Google Scholar indexing.

 

With best wishes

Sara

 

Sara Gould

EThOS Service Manager

The British Library

Boston Spa

LS23 7BQ

 

T: 01937 546123

M: 07768467929

[log in to unmask]

 

 


From: Repositories discussion list [mailto:[log in to unmask]] On Behalf Of Azhar Hussain
Sent: 15 February 2012 13:12
To: [log in to unmask]
Subject: Re: Google Scholar discoverability of repository content

Dear All,

 

Regarding repository content discoverability by Google Scholar.

 

Very recently, we at OpenDOAR (http://www.opendoar.org/index.html) have been working closely with Google Scholar to disseminate advice from them direct to repository administrators on improving indexing and discoverability. We have sent advice specific to the repository software to the contact details of administrators we hold as part of OpenDOAR (http://www.opendoar.org/tools/emailservice.html).

 

A quick summary of the advice from Google Scholar is as follows:

 

****

 

Google Scholar uses automated processes for indexing. To index a repository well, our search engine robots need to be able to quickly reach all articles by following links from the home page and they need to be able to recognize bibliographic data for the article. If our robots are unable to find the URL for an article or are unable to fetch it, the article cannot be included in the Scholar index. Furthermore, if they are unable to determine correct metadata for the article, we may not be able to identify citations to the article - which in turn will impact its ranking and visibility.

 

We encourage you to go over the Google Scholar inclusion guidelines at http://scholar.google.com/intl/en/scholar/inclusion.html to see if your repository meets the guidelines for indexing. It would probably also be good to discuss the guidelines with the provider of your repository software. We would be happy to discuss technical details as needed with your repository software provider. If your repository system doesn't meet the guidelines, you may also wish to evaluate alternatives that are well-structured for indexing (see the inclusion guidelines page above).

 

****
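
The point in the advice above about robots reaching every article by following links from the home page can be checked offline. What follows is only a rough sketch, assuming you already have a map of which pages link to which (for example from a site crawl or the repository database). The function name unreachable_articles and the sample paths are made up for illustration and do not come from Google Scholar or OpenDOAR.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def unreachable_articles(
    links: Dict[str, List[str]],
    home: str,
    articles: Iterable[str],
) -> Set[str]:
    """Return the article URLs that cannot be reached by following links from home.

    links -- maps each page URL to the URLs it links to (assumed to be known)
    """
    seen = {home}
    queue = deque([home])
    # Breadth-first walk over the link graph, starting at the home page.
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return set(articles) - seen


if __name__ == "__main__":
    # Made-up link graph: /item/3 exists but nothing links to it.
    site = {
        "/": ["/browse", "/about"],
        "/browse": ["/item/1", "/item/2"],
        "/item/1": ["/"],
    }
    print(unreachable_articles(site, "/", ["/item/1", "/item/2", "/item/3"]))
```

Any URL the function returns is an item that a crawler starting from the home page would never find, whatever its metadata looks like.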

 

Azhar

 

--

 

Azhar Hussain

Open Access Services Co-ordinator

 

Centre for Research Communications

Greenfield Medical Library

University of Nottingham Medical School

Queen's Medical Centre

Nottingham

NG7 2UH

UK

T: +44(0)1158-467235

F: +44(0)1158-468244

[log in to unmask]

CRC - http://crc.nottingham.ac.uk

RoMEO - http://www.sherpa.ac.uk/romeo

JULIET - http://www.sherpa.ac.uk/juliet/index.php

OpenDOAR - http://www.opendoar.org

 

-----Original Message-----
From: Repositories discussion list [mailto:[log in to unmask]] On Behalf Of Stevan Harnad
Sent: 15 February 2012 11:32
To: [log in to unmask]
Subject: Google Scholar discoverability of repository content

 

Can we enhance the google-scholar discoverability of EPrints (and

DSpace) repositories?

 

http://linksource.ebsco.com/linking.aspx?sid=google&auinit=K&aulast=Arlitsch&atitle=Invisible+Institutional+Repositories:+Addressing+the+Low+Indexing+Ratios+of+IRs+in+Google+Scholar&title=Library+Hi+Tech&volume=30&issue=1&date=2012&spage=4&issn=0737-8831

 

Kenning Arlitsch, Patrick Shawn OBrien, (2012) "Invisible Institutional

Repositories: Addressing the Low Indexing Ratios of IRs in Google

Scholar", Library Hi Tech, Vol. 30 Iss: 1

 

Purpose - Google Scholar has difficulty indexing the contents of

institutional repositories, and the authors hypothesize the reason is

that most repositories use Dublin Core, which cannot express

bibliographic citation information adequately for academic papers.

Google Scholar makes specific recommendations for repositories,

including the use of publishing industry metadata schemas over Dublin

Core. This paper tests a theory that transforming metadata schemas in

institutional repositories will lead to increased indexing by Google

Scholar.

 

Design/methodology/approach - The authors conducted two surveys of

institutional and disciplinary repositories across the United States,

using different methodologies. They also conducted three pilot projects

that transformed the metadata of a subset of papers from USpace, the

University of Utah's institutional repository, and examined the results

of Google Scholar's explicit harvests.

 

Findings - Repositories that use GS recommended metadata schemas and

express them in HTML meta tags experienced significantly higher indexing

ratios. The ease with which search engine crawlers can navigate a

repository also seems to affect indexing ratio. The second and third

metadata transformation pilot projects at Utah were successful,

ultimately achieving an indexing ratio of greater than 90%.

Research limitations/implications - The second survey was limited to

forty titles from each of seven repositories, for a total of 280 titles.

A larger survey that covers more repositories may be useful.

 

Practical implications - Institutional repositories are achieving

significant mass, and the rate of author citations from those

repositories may affect university rankings. Lack of visibility in

Google Scholar, however, will limit the ability of IRs to play a more

significant role in those citation rates.

Originality/value - Little or no research has been published about

improving the indexing ratio of institutional repositories in Google

Scholar. The authors believe that they are the first to address the

possibility of transforming IR metadata to improve indexing ratios in

Google Scholar.
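
The abstract's finding about expressing metadata in HTML meta tags can be illustrated with a small sketch. It is not the method used in the Utah pilot projects, which this thread does not describe. The tag names (citation_title, citation_author, citation_publication_date, citation_pdf_url) are the Highwire Press-style names listed in Google Scholar's inclusion guidelines; the function name scholar_meta_tags and the record keys (title, authors, date, pdf_url) are hypothetical.

```python
from html import escape
from typing import Dict, List, Union

Record = Dict[str, Union[str, List[str]]]

# Hypothetical record keys mapped to the tag names from the inclusion guidelines.
FIELD_TO_TAG = [
    ("title", "citation_title"),
    ("authors", "citation_author"),
    ("date", "citation_publication_date"),
    ("pdf_url", "citation_pdf_url"),
]


def scholar_meta_tags(record: Record) -> List[str]:
    """Build <meta> tags for an item page from a simple metadata record."""
    tags: List[str] = []
    for field, tag_name in FIELD_TO_TAG:
        value = record.get(field)
        if not value:
            continue  # skip missing fields rather than emit empty tags
        values = value if isinstance(value, list) else [value]
        for single in values:  # one tag per author, in order
            content = escape(single, quote=True)
            tags.append(f'<meta name="{tag_name}" content="{content}">')
    return tags


if __name__ == "__main__":
    # Made-up record for illustration only.
    example = {
        "title": "An Example Thesis",
        "authors": ["Doe, Jane", "Roe, Richard"],
        "date": "2012/02/15",
        "pdf_url": "http://example.org/1.pdf",
    }
    print("\n".join(scholar_meta_tags(example)))
```

The resulting lines would go in the head section of each item's landing page. Whether a given repository platform allows that is a question for the software provider.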

 

 
