Last week I explored
what precisely makes up the 20-year archive of the web held in the Internet
Archive’s Wayback Machine. Several of those findings have since spawned
considerable discussion within the library and web archiving communities
about what it means to archive the web, how much documentation and
metadata is enough, the tradeoffs between completeness and reach, and how
to better engage with the myriad constituencies served by web archives.

Why is it so important to understand what’s in our web archives? Perhaps
the most important reason is that the web is an infinite and ever-changing
landscape: it is simply impossible to archive the “entire internet” and
perfectly preserve every change to every page in existence. Web archives
are by their very nature an imperfect record of the web, and constructing
them is an exercise in countless tradeoffs over how to preserve an
infinite stream with finite resources.

