I agree with all the points you make. Yes, the duration of "linger" on a
page is inferred.
It gets even more fun if the webmaster employs Google Analytics (was
Urchin). Now Google knows "everything".
-----Original Message-----
From: This list is for those interested in Data Protection issues
[mailto:[log in to unmask]] On Behalf Of Tony Bowden
Sent: 09 June 2006 12:03
To: [log in to unmask]
Subject: Re: [data-protection] US Government and Data Retention.
On Fri, Jun 09, 2006 at 11:38:01AM +0100, Tim Trent wrote:
> Website log files store pretty much everything about a user's visit.
> They store where you entered the site, which pages you visited and for
> how long, and where you left the site. They even store the interval
> between visits (to an extent) to allow calculations of the uniqueness
> of your visit. A description of some of this is found at
> http://httpd.apache.org/docs/1.3/logs.html
Almost true; the small nit is that they don't know anything about "how
long". Each page request is logged, but there is no concept of how long
someone "stayed on a page". Marketing folks often ask for this information,
but it doesn't exist. You can (perhaps) infer from the interval between page
requests, but there's no information at all beyond the final page request.
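To make the "infer from the interval" point concrete, here is a rough sketch in Python. It assumes logs in the Common Log Format described in the Apache documentation linked above, and it treats the client host as a stand-in for a visitor, which is only an approximation (proxies and shared machines break it). The names parse_line, inferred_linger and SAMPLE, and the sample addresses and paths, are made up for illustration.

```python
import re
from collections import defaultdict
from datetime import datetime

# Common Log Format: host ident authuser [date] "request" status bytes
CLF = re.compile(r'^(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)[^"]*" (\d{3}) (\S+)')


def parse_line(line):
    """Return (host, timestamp, path) for one log line, or None if it does not match."""
    m = CLF.match(line)
    if m is None:
        return None
    host, stamp, _method, path, _status, _size = m.groups()
    when = datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")
    return host, when, path


def inferred_linger(lines):
    """Map each host to a list of (path, seconds until that host's next request).

    The log holds no dwell time; the gap to the next request is only a guess.
    The final request for a host gets None, because nothing follows it.
    """
    by_host = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            host, when, path = parsed
            by_host[host].append((when, path))

    result = {}
    for host, hits in by_host.items():
        hits.sort(key=lambda hit: hit[0])  # order by timestamp
        rows = []
        for i, (when, path) in enumerate(hits):
            if i + 1 < len(hits):
                gap = (hits[i + 1][0] - when).total_seconds()
            else:
                gap = None  # last page request: no information at all
            rows.append((path, gap))
        result[host] = rows
    return result


# Hypothetical sample lines, invented for this sketch.
SAMPLE = [
    '192.0.2.1 - - [09/Jun/2006:11:38:01 +0100] "GET /index.html HTTP/1.0" 200 2326',
    '192.0.2.1 - - [09/Jun/2006:11:38:31 +0100] "GET /about.html HTTP/1.0" 200 1045',
    '192.0.2.1 - - [09/Jun/2006:11:40:01 +0100] "GET /contact.html HTTP/1.0" 200 512',
]

if __name__ == "__main__":
    for host, rows in inferred_linger(SAMPLE).items():
        for path, gap in rows:
            print(host, path, gap)
```

Note that the gap says nothing about whether the visitor was actually reading the page, and the None on the last request is exactly the missing information described above.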
This all gets even more complicated when you encounter users who browse in
multiple windows simultaneously, or open lots of tabs in parallel, or who
use the 'back' button and then follow a different link. Almost all the log
analysis software I've examined (and I've played with a *lot* of it) tends
to ignore these issues, as (a) they're hard to deal with properly, (b) they
pretend they're rare on most sites (although tabbed browsing is becoming
increasingly common, and anyone who has built a site with session management
knows just how much of a problem the 'back button' issue actually is - note
that many banking sites explicitly terminate your session if you try to use
it!), and (c) for the most part everything just evens out.
But, yes, most commercial websites spend (or should spend) a considerable
amount of time and effort on logfile analysis.
Very popular non-commercial sites, on the other hand, sometimes don't keep
these sorts of logs for very long, or sometimes even at all, as the disk
space required would be too great for the minimal value. Wikipedia, for
example, just doesn't bother logging requests at all.
> Your ISP also holds logs of your activity. So it is perfectly
> possible to determine precisely which machine (and login) searched for
> 'rose scented talcum powder' on St Smellbetter's Day at 10am.
Although the government has considered forcing ISPs to record and keep this
sort of data, there is currently no requirement to do so. Some ISPs
*may* keep this data, but for others it serves no useful purpose and so they
don't keep the logs for longer than a few weeks. These logs also usually
aren't anywhere near as "useful" as website logs; they're just logs of raw
data packets moving through the network. Again, there are tools for parsing
this data into more useful information, but most ISPs I've come across (I've
worked for two) don't really do anything with these logs unless someone
reports a problem. The bigger ISPs, particularly in the US, have realised
that this information is generally useful to others, and some sell the data
in anonymised form. One search engine, for example, purchases web
clickstream data from ISPs so they can discover how users browse the sites
that they're directed to from search results, and can better tailor their
results in future.
Tony
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
All archives of messages are stored permanently and are
available to the world wide web community at large at
http://www.jiscmail.ac.uk/lists/data-protection.html
If you wish to leave this list please send the command
leave data-protection to [log in to unmask]
All user commands can be found at : -
http://www.jiscmail.ac.uk/help/commandref.htm
Any queries about sending or receiving message please send to the list owner
[log in to unmask]
(all commands go to [log in to unmask] not the list please)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^