Five Tips for Designing Preservable Websites | The Bigger Picture
Here at the Smithsonian Institution Archives, we take pride in
preserving the Institution’s history, including its sizable web
presence. While various offices at the Smithsonian create and back up
the contents of their websites, the Archives also crawls each website
using Heritrix, an open-source tool created by the Internet Archive,
to capture content in an archival format. Our aim is to preserve the
ABCs of digital objects: appearance, behavior, and content. We tailor
the crawl configuration for each website to capture as much of its
ABCs as possible while adhering to our collections policy.