At USQ, one of our systems analysts, Tim McCallum, has used some PHP code that loops through all publicly available items within an IR and queries Google Scholar to check each item's visibility. The end result is a report in the form of a spreadsheet that lists which items are visible and which are not. It also gives a total count (the percentage of visible items).
If anyone would like a copy of the code, you can contact Tim directly at [log in to unmask].
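For anyone who wants a feel for the shape of such a report before asking for the real code, here is a minimal sketch in Python (not Tim's PHP). The function names, the item fields `id` and `title`, and the `is_visible` callback are all assumptions made for illustration; the callback stands in for whatever Google Scholar lookup is actually used, which is not shown here.

```python
import csv
import io
from typing import Callable, Dict, Iterable, List, Tuple


def build_visibility_report(
    items: Iterable[Dict[str, str]],
    is_visible: Callable[[Dict[str, str]], bool],
) -> Tuple[List[Dict[str, str]], float]:
    """Return one row per item plus the percentage of visible items.

    `is_visible` is a hypothetical stand-in for the real visibility check.
    """
    rows = []
    for item in items:
        rows.append({
            "id": item["id"],
            "title": item["title"],
            "visible": "yes" if is_visible(item) else "no",
        })
    visible_count = sum(1 for row in rows if row["visible"] == "yes")
    # Avoid dividing by zero when the repository has no public items.
    percent = 100.0 * visible_count / len(rows) if rows else 0.0
    return rows, percent


def report_to_csv(rows: List[Dict[str, str]], percent: float) -> str:
    """Render the rows as CSV text, with the overall percentage on a final line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["id", "title", "visible"])
    writer.writeheader()
    writer.writerows(rows)
    buffer.write(f"# visible: {percent:.1f}%\n")
    return buffer.getvalue()


if __name__ == "__main__":
    sample = [{"id": "1", "title": "Thesis A"}, {"id": "2", "title": "Paper B"}]
    # Pretend only item 1 was found; a real check would go here instead.
    rows, percent = build_visibility_report(sample, lambda item: item["id"] == "1")
    print(report_to_csv(rows, percent))
```

Keeping the visibility check as a separate callback means the reporting logic can be tested without making any network requests.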
Pasquale (Pat) Loria
Research Librarian, Library Services
Global Learning Division
University of Southern Queensland | Toowoomba 4350 QLD Australia
T: +61 7 4631 1778 | Fax: +61 7 4631 1493 | Email: [log in to unmask]
From: Repositories discussion list [mailto:[log in to unmask]] On Behalf Of Gould, Sara
Sent: Thursday, 16 February 2012 1:29 AM
To: [log in to unmask]
Subject: Re: Google Scholar discoverability of repository content
Dear all
This discussion is very timely for EThOS because we're now talking to Google about having Google Scholar index the EThOS metadata, something that has been on the EThOS plans for quite some time.
It would be really helpful to hear of any specific challenges to look out for, or suggestions for a smooth path to successful indexing. Feel free to reply to the list or to me or Andrew direct.
We're already consulting with EThOS participating institutions that fund digitisation of their own theses through EThOS because we would naturally expect demand for digitisation to increase once the records are searchable through Google Scholar, but we'd also be keen to hear more generally from EThOS member institutions if you have questions or comments about our plans for Google Scholar indexing.
With best wishes
Sara
Sara Gould
EThOS Service Manager
The British Library
Boston Spa
LS23 7BQ
T: 01937 546123
M: 07768467929
From: Repositories discussion list [mailto:[log in to unmask]] On Behalf Of Azhar Hussain
Sent: 15 February 2012 13:12
To: [log in to unmask]
Subject: Re: Google Scholar discoverability of repository content
Dear All,
Regarding repository content discoverability by Google Scholar.
Very recently, we at OpenDOAR (http://www.opendoar.org/index.html) have been working closely with Google Scholar to pass on their advice directly to repository administrators on improving indexing and discoverability. We have sent advice specific to each repository software platform to the administrator contact details we hold as part of OpenDOAR (http://www.opendoar.org/tools/emailservice.html).
A quick summary of the advice from Google Scholar is as follows:
****
Google Scholar uses automated processes for indexing. To index a repository well, our search engine robots need to be able to quickly reach all articles by following links from the home page and they need to be able to recognize bibliographic data for the article. If our robots are unable to find the URL for an article or are unable to fetch it, the article cannot be included in the Scholar index. Furthermore, if they are unable to determine correct metadata for the article, we may not be able to identify citations to the article - which in turn will impact its ranking and visibility.
We encourage you to go over the Google Scholar inclusion guidelines at http://scholar.google.com/intl/en/scholar/inclusion.html to see if your repository meets the guidelines for indexing. It would probably also be good to discuss the guidelines with the provider of your repository software. We would be happy to discuss technical details as needed with your repository software provider. If your repository system doesn't meet the guidelines, you may also wish to evaluate alternatives that are well-structured for indexing (see the inclusion guidelines page above).
****
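As a rough way to check the first point in the advice above (every article reachable by following links from the home page), a repository administrator could run something like the following over a link map exported from their own site. This is only a sketch: the function name, the `links` dictionary format and the example paths are assumptions for illustration, and it says nothing about how Google Scholar's robots actually crawl.

```python
from collections import deque
from typing import Dict, Iterable, List, Set


def unreachable_articles(
    links: Dict[str, List[str]],
    home: str,
    articles: Iterable[str],
) -> Set[str]:
    """Return the article URLs that cannot be reached from `home`.

    `links` maps each page URL to the URLs it links to (hypothetical format).
    """
    seen = {home}
    queue = deque([home])
    # Breadth-first walk over the link graph, starting at the home page.
    while queue:
        page = queue.popleft()
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return {url for url in articles if url not in seen}


if __name__ == "__main__":
    site = {
        "/": ["/browse"],
        "/browse": ["/item/1"],
        "/orphan": ["/item/2"],  # nothing links to /orphan from the home page
    }
    print(unreachable_articles(site, "/", ["/item/1", "/item/2"]))
```

Any article reported here is only reachable through search forms or pages that nothing links to, so a link-following robot would not find it.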
Azhar
--
Azhar Hussain
Open Access Services Co-ordinator
Centre for Research Communications
Greenfield Medical Library
University of Nottingham Medical School
Queen's Medical Centre
Nottingham
NG7 2UH
UK
T: +44(0)1158-467235
F: +44(0)1158-468244
RoMEO - http://www.sherpa.ac.uk/romeo
OpenDOAR - http://www.sherpa.ac.uk/juliet/index.php
-----Original Message-----
From: Repositories discussion list [mailto:[log in to unmask]] On Behalf Of Stevan Harnad
Sent: 15 February 2012 11:32
To: [log in to unmask]
Subject: Google Scholar discoverability of repository content
Can we enhance the google-scholar discoverability of EPrints (and
DSpace) repositories?
Kenning Arlitsch, Patrick Shawn OBrien, (2012) "Invisible Institutional
Repositories: Addressing the Low Indexing Ratios of IRs in Google
Scholar", Library Hi Tech, Vol. 30 Iss: 1
Purpose - Google Scholar has difficulty indexing the contents of
institutional repositories, and the authors hypothesize the reason is
that most repositories use Dublin Core, which cannot express
bibliographic citation information adequately for academic papers.
Google Scholar makes specific recommendations for repositories,
including the use of publishing industry metadata schemas over Dublin
Core. This paper tests a theory that transforming metadata schemas in
institutional repositories will lead to increased indexing by Google
Scholar.
Design/methodology/approach - The authors conducted two surveys of
institutional and disciplinary repositories across the United States,
using different methodologies. They also conducted three pilot projects
that transformed the metadata of a subset of papers from USpace, the
University of Utah's institutional repository, and examined the results
of Google Scholar's explicit harvests.
Findings - Repositories that use GS recommended metadata schemas and
express them in HTML meta tags experienced significantly higher indexing
ratios. The ease with which search engine crawlers can navigate a
repository also seems to affect indexing ratio. The second and third
metadata transformation pilot projects at Utah were successful,
ultimately achieving an indexing ratio of greater than 90%.
Research limitations/implications - The second survey was limited to
forty titles from each of seven repositories, for a total of 280 titles.
A larger survey that covers more repositories may be useful.
Practical implications - Institutional repositories are achieving
significant mass, and the rate of author citations from those
repositories may affect university rankings. Lack of visibility in
Google Scholar, however, will limit the ability of IRs to play a more
significant role in those citation rates.
Originality/value - Little or no research has been published about
improving the indexing ratio of institutional repositories in Google
Scholar. The authors believe that they are the first to address the
possibility of transforming IR metadata to improve indexing ratios in
Google Scholar.
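To make the "express them in HTML meta tags" finding concrete, here is a minimal sketch of mapping a few record fields to the `citation_*` meta tag names listed in the Google Scholar inclusion guidelines. It is not the method used in the Utah pilots, which the abstract does not describe in detail. The function name and the record keys (`dc.title`, `dc.creator`, `dc.date`, `dc.identifier.pdf`) are assumptions for illustration, and richer fields such as journal, volume and issue are left out.

```python
from html import escape
from typing import Dict, List

# Hypothetical record keys on the left; meta tag names on the right are
# taken from the Google Scholar inclusion guidelines.
FIELD_TO_META = [
    ("dc.title", "citation_title"),
    ("dc.creator", "citation_author"),
    ("dc.date", "citation_publication_date"),
    ("dc.identifier.pdf", "citation_pdf_url"),
]


def scholar_meta_tags(record: Dict[str, List[str]]) -> List[str]:
    """Build one <meta> tag per value, in the order given by FIELD_TO_META."""
    tags = []
    for field, meta_name in FIELD_TO_META:
        for value in record.get(field, []):
            # Escape quotes and angle brackets so the attribute stays valid HTML.
            content = escape(value, quote=True)
            tags.append(f'<meta name="{meta_name}" content="{content}">')
    return tags


if __name__ == "__main__":
    example = {
        "dc.title": ["An example thesis"],
        "dc.creator": ["Doe, Jane", "Roe, Richard"],
        "dc.date": ["2012/02/15"],
    }
    for tag in scholar_meta_tags(example):
        print(tag)
```

Each author gets its own `citation_author` tag, which is why the record stores a list of values per field.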