Last week I explored
<http://www.forbes.com/sites/kalevleetaru/2015/11/16/how-much-of-the-internet-does-the-wayback-machine-really-archive/>
what precisely makes up the 20-year archive of the web held in the Internet
Archive’s Wayback Machine. Several of those findings have spawned
considerable discussion over the past week within the library and web
archival communities about what it means to archive the web, how much
documentation and metadata is enough, the tradeoffs in completeness versus
reach, and how to better engage with the myriad constituencies served by
web archives.

Why is it so important to understand what’s in our web archives? Perhaps
the most important reason is that the web is an infinite and ever-changing
landscape, so it is simply impossible to archive the “entire internet” and
perfectly preserve every change to every page in existence. Web archives
are by their very nature an imperfect record of the web, and constructing
them is an exercise in countless tradeoffs over how to preserve an infinite
stream with finite resources.


http://onforb.es/1QMw9XQ




-- 
Peterk
Dallas, Tx
[log in to unmask]
Save our in-boxes! http://emailcharter.org
"The problems of our economy have occurred not as an outgrowth of
laissez-faire, unbridled competition.
They have occurred under the guidance of federal agencies, and under the
umbrella of federal regulations."
Senator Ted Kennedy, in defending trucking deregulation in 1978.
