In that case, why does the software make its own text files from the
originals? (DSpace is my reference point here, everybody else.) Can't
that benefit still be had, but with the search hit linked to the metadata
page as the jump-off point instead, should the repository manager prefer
it? Perhaps this is not compatible with the way Google etc. work? I do
see your point, of course. Thanks,
Talat
Stuart Lewis [sdl] wrote:
> There is a way to do this. There is a file called robots.txt that you
> can use to tell crawlers not to index certain things (e.g. pdf files).
>
> BUT... we want Google to crawl and index the full text files, otherwise
> all it can work on to see if a user query matches a page is the
> metadata, rather than the richness of the words in the full text.
>
> If Google doesn't see the link to the file, or doesn't see the file, we
> lose the big benefit of Google indexing our full texts.
>
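For illustration, the robots.txt approach mentioned above can be checked
with Python's standard urllib.robotparser module. This is only a sketch:
the host repository.example.org and the /bitstream/ and /handle/ paths
are assumptions loosely modelled on a DSpace-style layout, not taken from
any real repository. It blocks by path prefix, since wildcard patterns
such as *.pdf are an extension that not every parser honours.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: keep all crawlers away from the file
# downloads while leaving the metadata (record) pages crawlable.
ROBOTS_TXT = """\
User-agent: *
Disallow: /bitstream/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Assumed example URLs; the paths are illustrative only.
FILE_URL = "http://repository.example.org/bitstream/1234/56/paper.pdf"
RECORD_URL = "http://repository.example.org/handle/1234/56"

file_allowed = parser.can_fetch("Googlebot", FILE_URL)
record_allowed = parser.can_fetch("Googlebot", RECORD_URL)

print("file crawlable:", file_allowed)
print("record crawlable:", record_allowed)
```

A rule like this also stops the full text from being indexed at all,
which is exactly the trade-off described in this message.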
>
> -----Original Message-----
> From: Repositories discussion list
> [mailto:[log in to unmask]] On Behalf Of Talat Chaudhri
> Sent: 25 September 2008 12:01
> To: [log in to unmask]
> Subject: Re: Header sheets for files
>
> There ought to be a technical solution to force people to see the
> metadata first. The easiest would seem to be to find a way to ensure
> that the link to the file on the metadata page, while it can be accessed
> by a human user, cannot be indexed by search engines. Does anyone know
> if you could do this by dynamically generating the page in some way (or
> at least the part with the link to the file) or alternatively by using
> an intermediary re-direct page with a no-robots declaration? Is the
> latter even possible server-side? If the search engine doesn't see the
> link to the file then it won't be the jump-off point for the user.
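As a rough sketch of the intermediary-page idea, the page between the
metadata record and the file could carry a robots meta tag asking
crawlers not to index it or follow its links. The function name
intermediary_page and the example path are hypothetical. Note that the
tag is only a request to well-behaved crawlers, and the file can still be
found through any other link that points at it.

```python
from html import escape


def intermediary_page(file_url: str) -> str:
    """Build a small HTML page linking to the file, marked noindex/nofollow."""
    # Escape the URL so it is safe inside the href attribute.
    safe_url = escape(file_url, quote=True)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        # The robots meta tag asks crawlers not to index this page
        # and not to follow the link it contains.
        '<meta name="robots" content="noindex, nofollow">\n'
        "<title>Download</title>\n"
        "</head>\n<body>\n"
        f'<p><a href="{safe_url}">Continue to the file</a></p>\n'
        "</body>\n</html>\n"
    )


# Assumed example path, for illustration only.
page = intermediary_page("/bitstream/1234/56/paper.pdf")
print(page)
```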
>
> Just a thought - thanks,
>
>
> Talat
>
> Delasalle, Jenny wrote:
>
>> I asked about this earlier this year, and posted a summary in May, under
>> the heading "cover sheets"... If you search the list archives on
>> http://jiscmail.ac.uk you'll find that summary. It would be interesting
>> to know if there are other practices emerging since my posting...
>>
>> One thing that I didn't consider at the time, but which I realised only
>> the other day: if we attach the cover sheets directly to the files
>> themselves, we will not help our own community with sharing files and
>> records. Academics might move from one Uni to another and want to
>> populate their latest institutional repository with an exported/imported
>> selection of records. The files might then have cover sheets from the
>> other institution attached to them... Likewise for co-authored papers:
>> we might find a paper in another institution's repository, as deposited
>> by the lead author, and want to use that for our own records for the
>> co-author, and we'd have to remove a cover sheet if that institution
>> used them. So although they can be helpful for many reasons, I'd very
>> much like a technical solution that created them "on the fly", or a way
>> to force everyone to access the text via the metadata record as the only
>> route...
>>
>> Kind regards
>>
>> Jenny Delasalle
>> E-Repositories Manager
>> Research & Innovation Unit
>> University of Warwick Library
>> Gibbet Hill Road
>> Coventry CV4 7AL
>> United Kingdom
>> Tel: (+44) (0) 24 765 75793
>> http://go.warwick.ac.uk/repositories
>>
>>> -----Original Message-----
>>> From: Repositories discussion list
>>> [mailto:[log in to unmask]] On Behalf Of Sheila Scott
>>> Sent: 25 September 2008 10:40
>>> To: [log in to unmask]
>>> Subject: Header sheets for files
>>>
>>> Does anyone routinely attach header sheets with some kind of
>>> copyright statement as the front page of any files which are
>>> placed in their institutional repository and would you be
>>> willing to give us a link to any examples or give any advice
>>> on their experience of this?
>>> Many thanks for any help you can offer
>>> Sheila Scott
>>>
>>> Sheila Scott (Cataloguer)
>>> The Georgina Scott Sutherland Library
>>> The Robert Gordon University
>>> Garthdee Road
>>> Aberdeen
>>> AB10 7QE
>>> email : [log in to unmask]
>>> Tel : 01224 263461
>>>
>>>
>>>
>
>
--
Dr Talat Chaudhri
------------------------------------------------------------
Research Officer
UKOLN, University of Bath, Bath BA2 7AY, Great Britain
Telephone: +44 (0)1225 385105 Fax: +44 (0)1225 385105
E-mail: [log in to unmask] Web: http://www.ukoln.ac.uk/
------------------------------------------------------------