Hi Chris,
I don't know whether you are using DSpace, EPrints or another system, but getting inlinks for a DSpace repository is neither easy nor reliable, so hopefully some of my experience will carry over to other repository systems.
I spent an afternoon a few months ago trying to get a reasonable set of KPIs that we could use to measure our repository. There is no guarantee that these methods are accurate, but they seem to be more stable than the other methods I have tried. For inlinks I have two sets of measures, one for Yahoo and one for Google.
The Yahoo one was easy to set up and uses Yahoo's API, specifically:
http://search.yahooapis.com/SiteExplorerService/V1/inlinkData?appid=%%APP_ID%%&query=http://%%REPO%%&entire_site=1&omit_inlinks=subdomain&results=1
(Replace %%APP_ID%% with your Yahoo API ID and %%REPO%% with the hostname of your repository)
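For anyone who would rather script the substitution, here is a minimal Python sketch that fills in the template above. The function name and the placeholder values are my own illustration, not part of Yahoo's API. It only builds the URL and makes no request, so it says nothing about what the service returns.

```python
from urllib.parse import urlencode

# Endpoint and parameter names are copied from the URL template above.
SITE_EXPLORER_ENDPOINT = (
    "http://search.yahooapis.com/SiteExplorerService/V1/inlinkData"
)


def build_yahoo_inlink_url(app_id, repo_host):
    """Return the inlink query URL for one repository host.

    app_id    -- your Yahoo API ID (replaces %%APP_ID%%)
    repo_host -- hostname of the repository (replaces %%REPO%%)
    """
    params = {
        "appid": app_id,
        "query": "http://" + repo_host,
        "entire_site": 1,             # count links to the whole site
        "omit_inlinks": "subdomain",  # ignore links from the same subdomain
        "results": 1,
    }
    # safe=":/" keeps "http://" readable, matching the template as written.
    return SITE_EXPLORER_ENDPOINT + "?" + urlencode(params, safe=":/")


if __name__ == "__main__":
    # "YOUR_APP_ID" is a placeholder, not a real key.
    print(build_yahoo_inlink_url("YOUR_APP_ID", "dspace.lboro.ac.uk"))
```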
The biggest problem I had with Yahoo's inlinks measure is that it is not very meaningful for a DSpace repository. People are asked to link to items in the repository by their handles, so a lot of links to DSpace repositories point to http://hdl.handle.net/ and Yahoo doesn't count them.
The Google inlinks measure was my attempt to track the number of handle inlinks for the site. The search I came up with was the following:
"http://hdl.handle.net/%%HANDLE_ID%%/" -site:%%HOST%%
(Replace %%HANDLE_ID%% with your repository's handle ID and %%HOST%% with the fully qualified hostname of your repository, e.g. "http://hdl.handle.net/2134/" -site:dspace.lboro.ac.uk)
If you are using another repository system that doesn't use handles, you can vary this search to get an approximate inlink count for your items by replacing the handle server URL ("http://hdl.handle.net/%%HANDLE_ID%%/") with your repository's URL. This effectively asks Google to list the pages that mention your repository but aren't part of it.
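As a rough sketch, both forms of the Google search can be built with a couple of small Python helpers. The function names and the second example host are hypothetical. The code only assembles the query string to paste into Google and does not call any search API.

```python
def handle_prefix_url(handle_id):
    """Return the handle server URL for a repository's handle ID."""
    return "http://hdl.handle.net/{}/".format(handle_id)


def build_inlink_query(target_url, exclude_host):
    """Build a search for pages that mention target_url but are not
    hosted on exclude_host (the repository itself)."""
    return '"{}" -site:{}'.format(target_url, exclude_host)


if __name__ == "__main__":
    # Handle-based repository (DSpace), using the example from this message.
    print(build_inlink_query(handle_prefix_url("2134"), "dspace.lboro.ac.uk"))

    # Repository without handles: search for its own URL instead.
    # "eprints.example.ac.uk" is a made-up host for illustration.
    print(build_inlink_query("http://eprints.example.ac.uk/",
                             "eprints.example.ac.uk"))
```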
Another thing to consider is Google's Webmaster Tools (http://www.google.com/webmasters/tools/), which has an option to show you who links to your site, though it seems a bit hit and miss with repositories and again doesn't take the handle server into account.
Regards
Jason Cooper,
Library Systems Team,
Loughborough University.
-----Original Message-----
From: Repositories discussion list [mailto:[log in to unmask]] On Behalf Of Chris Rusbridge
Sent: 26 April 2011 11:48
To: [log in to unmask]
Subject: Inlinks to repositories
I think I asked earlier (various individuals if not the whole list) about reliable ways of counting inlinks to repositories. I had lots of trouble getting the Google "site:" command to give me reliable data that seemed to make sense. I remembered that someone had once told me that Yahoo had a better response. I seemed to get good results using the Yahoo version of the "site:" command, which takes you to their Site Explorer. Click on the Inlinks tab, and I then include links TO the entire site, but exclude links FROM this subdomain (you can exclude from the domain to get incoming from outwith your institution). It's a tedious exercise, but it does seem to work.
No clear conclusions, however, from the few I looked at: there was no strong correlation (sensed, not calculated) with numbers of records, although age does seem to be a big factor, as you would expect. The weak link with record counts is not surprising given:
a) the proportion of metadata-only records is rarely clear; there are fewer reasons to link to a metadata-only record
b) citations should always be to the authoritative version, whereas much IR content is second copies
c) the nature of the "stuff" in IRs is highly variable anyway.
--
Chris Rusbridge
Mobile: +44 791 7423828
Email: [log in to unmask]