****************************************************************
 Find out more about the UKMW11 conference on 25 November 2011
      and how to register at http://bit.ly/ukmw11
****************************************************************

Of course it is not correct that one has to add sites explicitly in order to be indexed :) 

What I said (or thought I said) was that being indexed was not, for me, a default position. I choose to submit the site, and I also choose to use a robots.txt file to enforce my wishes. I choose to control what happens with the information on the site to the best of my ability.

As a universal default, a site will, eventually, be found by search engines, and it will, eventually, be indexed, though almost never in its entirety. My point is that a good webmaster is proactive in both what is allowed to be indexed and what is not. Even when a site is found, all indexing may be prohibited.
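
A minimal sketch of that last point, assuming a crawler that behaves the way Python's standard urllib.robotparser module does. The two robots.txt bodies, the example.org URLs, the "ExampleBot" user agent and the may_fetch helper are all made up for illustration, and robots.txt only has any effect on crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt that asks every crawler to stay out of the whole site.
BLOCK_ALL = """\
User-agent: *
Disallow: /
"""

# Hypothetical robots.txt that allows crawling except for one directory.
BLOCK_PRIVATE = """\
User-agent: *
Disallow: /private/
"""


def may_fetch(robots_txt: str, url: str, agent: str = "ExampleBot") -> bool:
    """Return True if a compliant crawler named `agent` may fetch `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    print(may_fetch(BLOCK_ALL, "https://example.org/"))
    print(may_fetch(BLOCK_PRIVATE, "https://example.org/collection/1"))
    print(may_fetch(BLOCK_PRIVATE, "https://example.org/private/notes.html"))
```

The first file is the "found, but all indexing prohibited" case; the second is the more usual position of allowing most of the site while keeping one area out.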

Your weird moments may be explained easily enough. Someone else may have submitted the site, or they may have mentioned it in (e.g.) Gmail.

On 12 Oct 2011, at 13:15, Mike Ellis wrote:

> Thanks Tehmina - yes, I agree and that's a good idea / point.
> 
> Tim - yes I understand that the right to removal is extremely important, as is compliance with robots.txt.
> 
> Having said that, it's not correct that you have to explicitly add sites to Google/etc in order for them to spider you - actually, spidering of sites *is* the default position for most search engines. Any incoming links (and actually I've had weird moments in the past when there have been *no* incoming links AFAICT) will likely result in sites getting added to their index...
> 
> But anyway. Your points are good ones - thanks.
> 
> cheers
> 
> Mike 
> 
> 
> _____________________________
> 
> 
> Mike Ellis 
> 
> I've gone freelance! Find out more about our new digital agency: http://thirty8.co.uk (http://thirty8.co.uk/) 
> 
> ** I've written a book: http://heritageweb.co.uk (http://heritageweb.co.uk/) **
> 
> On Wednesday, 12 October 2011 at 13:01, Tehmina Goskar wrote:
> 
>> You could add to your list, 'Explain why I'm doing this' i.e. for fun,
>> greater good, to see what happens, to show how easy it can be, to
>> start my quest for world domination/the revolution etc.
>> 
>> This is probably more relevant and more important than seeking prior
>> permission per se (some organisations may say no before they've even
>> seen the results). This usually negates complaints if attribution is
>> also clear and helps a lot if third-parties from whom original
>> information/images were sought get twitchy.
>> 
>> With this in mind, the current site I'm working on has clear terms of
>> use that encourage non-commercial reuse so long as attribution and
>> source to the original site are made.
>> 
>> TG.
>> 
>> On 12 October 2011 12:39, Mike Ellis <[log in to unmask]> wrote:
>>> 
>>> Thanks Tehmina and others - always really interested to hear other points of view!
>>> 
>>> James' email also follows up on the bigger question - that lots of people are doing this already.
>>> 
>>> Take this from Google Images:
>>> 
>>> http://bit.ly/oCZDvD
>>> 
>>> Result 1 - clearly spidered from (and linked to) the British Museum web page at http://bit.ly/qDcNxG
>>> 
>>> Another example - a while back I built a cross-collections search using Google CSE - http://bit.ly/nBpdzL ...
>>> 
>>> So - why do we think these examples are ok (or don't we)...?
>>> 
>>> Is it:
>>> 
>>> 1) Google is big and probably wouldn't listen to any complaints anyway
>>> 2) Google is big and therefore sends traffic to my site and I'm not about to complain
>>> 3) We don't actually care that much
>>> 4) We know the "rules" are unworkable
>>> 
>>> I'm slightly sad if it is purely 1) and 2) - ie someone small, unimportant and likely to not want to be sued like my "friend" (alright, me) can't build something which might do something similar. As for the "image X can only be legally viewed on website Y" thing....
>>> 
>>> Anyway.
>>> 
>>> What I'm taking from all this is:
>>> 
>>> - seek permission
>>> - use an API if it exists
>>> - take notice of robots.txt
>>> - be responsive if anyone complains
>>> 
>>> Does that sum it up?
>>> 
>>> cheers
>>> 
>>> Mike
>>> 
>>> 
>>> _____________________________
>>> 
>>> 
>>> Mike Ellis
>>> 
>>> I've gone freelance! Find out more about our new digital agency: http://thirty8.co.uk (http://thirty8.co.uk/)
>>> 
>>> ** I've written a book: http://heritageweb.co.uk (http://heritageweb.co.uk/) **
>>> 
>>> On Wednesday, 12 October 2011 at 09:19, James Morley wrote:
>>> 
>>>> On that last point, if you can remember the website address then http://www.archive.org/web/web.php is always worth a try (possibly the ultimate web scraper, and some 'interesting' debate in their forums!)
>>>> 
>>>> ________________________________________
>>>> From: Museums Computer Group [[log in to unmask] (mailto:[log in to unmask])] On Behalf Of J DAVIS [[log in to unmask] (mailto:[log in to unmask])]
>>>> Sent: 11 October 2011 17:34
>>>> To: [log in to unmask] (mailto:[log in to unmask])
>>>> Subject: Re: Screen-scraping
>>>> 
>>>> Well expressed comments, Tehmina.
>>>> 
>>>> I agree, of course (whilst also recognising the difficulty of getting some content to try something out just enough to see if it could be a viable project).
>>>> And I wish screenscrapers had been around just before some websites created in the Great Cultural Content Creation period decayed beyond repair or simply vanished.
>>>> 
>>>> Janet
>>>> 
>>>> Janet E Davis
>>>> 
>>>> 
>>>> 
>>>> ****************************************************************
>>>> website: http://museumscomputergroup.org.uk/
>>>> Twitter: http://www.twitter.com/ukmcg
>>>> Facebook: http://www.facebook.com/museumscomputergroup
>>>> [un]subscribe: http://museumscomputergroup.org.uk/email-list/
>>>> ****************************************************************
>>> 
>>> 
>> 
>> 
>> 
>> -- 
>> Dr Tehmina Goskar, MA AMA
>> [log in to unmask] (mailto:[log in to unmask])
>> 
>> http://tehmina.goskar.com/
>> 
>> Research Associate
>> History & Classics
>> Prifysgol Abertawe / Swansea University
>> 
> 
> 

Tim Trent - Consultant
Tel: +44 (0)7710 126618
web: ComplianceAndPrivacy.com - where busy executives go to find the news first
personal blog: timtrent.blogspot.com/ - news, views, and opinions
personal website: Tim's Personal Website - more than anyone needs to know


Important: This message is private and confidential. If you have received this message in error, please notify us and remove it from your system. This email and any attachment(s) are believed to be virus-free, but it is the responsibility of the recipient to make all the necessary virus checks. This email and any attachments to it are copyright of Meadowood Associates, owners of Compliance And Privacy, unless otherwise stated. Their copying, transmission, reproduction in whole or in part may only be undertaken with the express permission, in writing, of Meadowood Associates, at 16 Coombe Road, Dartmouth, Devon, United Kingdom TQ6 9PQ


