Apologies for cross-posting.

The three-year Collaborative Electronic Records Project (CERP) of the
Smithsonian Institution Archives and the Rockefeller Archive Center
concluded in December 2008. Among the project's outcomes is the CERP
Email Parser, which we are pleased to offer to the archival and related
communities as an open source software tool for the preservation of
email accounts. The Email Parser
(http://siarchives.si.edu/cerp/parserdownload.htm) migrates an email
account and its messages into a single XML file using the Email Account
XML Schema developed in collaboration with the North Carolina State
Archives and the EMCAP project.

 

The CERP Email Parser migrates an email account in MBOX format into XML,
using the schema to preserve the full body of messages, together with
their attachments, and keeps intact the account's internal organization
(e.g., an Inbox containing subfolders labeled Policies, Special Events,
and Projects). The CERP team successfully preserved email accounts from
a variety of applications, including Microsoft Outlook, Apple Mail,
Lotus Notes, and Netscape. All email messages retain their full header
content, in contrast to some tools produced in earlier research efforts.
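
For readers who want a feel for what an MBOX-to-XML migration involves,
the following is a minimal Python sketch. It is not the CERP Email
Parser and does not follow the Email Account XML Schema: the function
name mbox_to_xml and the element names (Account, Folder, Message,
Header, Body, Attachment) are hypothetical, chosen only for
illustration. It assumes one MBOX file per folder and leaves nested
subfolders out.

```python
import base64
import mailbox
import xml.etree.ElementTree as ET


def _decode_text(part):
    """Decode a text MIME part to str, falling back to UTF-8."""
    payload = part.get_payload(decode=True) or b""
    charset = part.get_content_charset() or "utf-8"
    try:
        return payload.decode(charset, errors="replace")
    except LookupError:
        # The message declared a charset Python does not recognize.
        return payload.decode("utf-8", errors="replace")


def mbox_to_xml(mbox_path, folder_name="Inbox"):
    """Build an illustrative XML tree from one MBOX file.

    Element names are hypothetical; they are NOT those of the
    Email Account XML Schema.
    """
    account = ET.Element("Account")
    folder = ET.SubElement(account, "Folder", {"name": folder_name})
    # create=False: raise an error instead of creating a missing file.
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            message = ET.SubElement(folder, "Message")
            # Keep every header in order, including repeated ones.
            for name, value in msg.items():
                header = ET.SubElement(message, "Header", {"name": name})
                header.text = str(value)
            # Walk the MIME tree: parts with a filename become
            # base64 Attachment elements, other text parts become Body.
            for part in msg.walk():
                if part.is_multipart():
                    continue
                filename = part.get_filename()
                if filename:
                    att = ET.SubElement(
                        message,
                        "Attachment",
                        {"filename": filename,
                         "type": part.get_content_type()},
                    )
                    data = part.get_payload(decode=True) or b""
                    att.text = base64.b64encode(data).decode("ascii")
                elif part.get_content_maintype() == "text":
                    body = ET.SubElement(
                        message, "Body", {"type": part.get_content_type()}
                    )
                    body.text = _decode_text(part)
    finally:
        box.close()
    return account
```

The resulting tree could then be written out with, for example,
ET.ElementTree(mbox_to_xml("Inbox.mbox")).write("account.xml",
encoding="utf-8", xml_declaration=True), where both file names are
placeholders.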

 

The parser runs on a workstation in a virtual machine environment
compatible with Windows, Macintosh, Linux, and some Unix platforms. CERP
testing was limited to the Windows XP environment. The CERP Email Parser
is licensed as open source software so that it may be used, supported,
and enhanced by all organizations that adopt it.

 

The Email Parser is designed to preserve bodies of email, such as an
entire account, without requiring access to the original email system.
Email accounts from active email systems may also be preserved using
this tool. The CERP Email Parser will be featured in the pre-conference
workshop "Achieving Email Account Preservation With XML" at the Society
of American Archivists 2009 Annual Meeting this August.

 

For more information and to download the parser, visit
http://siarchives.si.edu/cerp/parserdownload.htm. For more on the
Collaborative Electronic Records Project, visit
http://siarchives.si.edu/cerp/. Please direct email inquiries to
[log in to unmask]

 

 

Riccardo Ferrante

IT Archivist and Electronic Records Program Director

Smithsonian Institution Archives

600 Maryland Ave SW   MRC 507

Washington, DC 20013-7012

 

[log in to unmask]  |  phone 202.633.5906 | fax 202.633.5928  |  cell
202.341.4658