E-Mail Archives: Keys to Message Classification and Retention

How intelligent archiving utilizes intelligent classification and retention technologies while reducing storage costs and simplifying management.

by Art Gilliland

E-mail’s mission-critical nature has forced organizations to evaluate their overall policies and systems for managing it—from resources and costs to retention and discovery.

Enterprises are evaluating or using e-mail archiving software solutions to manage these issues. With these systems, IT can control the growth in e-mail storage costs while giving end users e-mail storage and search in a more user-friendly manner and providing legal departments a consistent system for retaining and finding e-mails. While these systems simplify the issues of archive-storage size, archive retention period, and archive search, they do not make them disappear.

Not all e-mail is created equal. Some e-mail messages are considered assets, while others are liabilities. How long a message should be retained depends upon the category into which it falls. But organizations typically either have no automated archiving system, archive but keep everything for the same period of time, or archive but keep everything forever.

There must be a better way.

Enter intelligent archiving. The natural evolution of early e-mail archiving software solutions, intelligent archiving utilizes intelligent classification and retention technologies to capture, categorize, index, and store target data to enforce policies and protect corporate assets—all while helping to reduce storage costs and simplify management.

Fundamental Policy Decisions: What’s Required

Intelligent archiving solutions address one of the most fundamental challenges of e-mail storage and discovery: data classification. Rather than treating all e-mail the same, intelligent archiving offers intelligent classification and categorizes messages according to their relevance to specific business purposes. Only when data is appropriately classified can it then be intelligently filtered, retained, and discovered.

Not only do different types of e-mail messages have different values, but different companies have different classification needs for their information. For example, highly process-driven organizations (such as insurance firms or mortgage companies) may require much more granular classification than would a manufacturer or other business with more fluid interaction. Other companies may already have an enterprise content management (ECM) system in place and simply want to extend it to archived e-mail.

Intelligent archiving accommodates these classification approaches, offering user classification that allows individuals to sort messages as part of archiving, automated classification that tags messages based on rules, and integration with ECM systems that applies existing ECM policies to e-mail messages.

User Classification

Many organizations must rely on their users to make difficult decisions about what e-mail to save or delete. However, this often burdens them with too many processes and impacts their productivity. For example, the user may be tasked with using a Web interface, saving an e-mail to a specific folder, or using an application plug-in to specify metadata.

To help reduce the number of steps the user must take in classifying e-mails, intelligent archiving systems offer a seamless, intelligent user-driven classification model. This software monitors user e-mail activity, identifies e-mail that needs to be classified, and prompts the user to choose from a subset of predefined classifications only when necessary.

With policy-based e-mail capture, the user classification engine enables all business-critical and regulated e-mail to be sorted as each item is created or read by the user. This helps enforce user retention policies more effectively by taking control of records where they are most vulnerable.

Automated Classification

In contrast to user classification, automated classification takes decision-making out of the hands of users and puts it into the archiving system. Today’s classification engines use a combination of approaches to analyze a message and determine its content type.

For example, an automated classification engine may evaluate senders and recipients as well as the groups in which they reside to determine content type. It may also evaluate message direction, since messages sent externally often merit a higher degree of scrutiny and retention. An automated engine may evaluate messages for keywords or phrases or for patterns, searching e-mails for sequences that identify Social Security numbers, for example.
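The checks described above can be illustrated with a minimal Python sketch. It is not taken from any particular archiving product: the internal domain, the keyword list, the tag names, and the simple 3-2-4 digit pattern for Social Security numbers are all assumptions chosen for illustration.

```python
import re
from dataclasses import dataclass

# Hypothetical pattern for U.S. Social Security numbers in the common
# 3-2-4 digit layout; a real deployment would need stricter validation.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Assumed values, for illustration only.
INTERNAL_DOMAIN = "example.com"
KEYWORDS = {"contract", "invoice"}


@dataclass
class Message:
    sender: str
    recipients: list
    body: str


def is_external(address: str) -> bool:
    """Treat any address outside the internal domain as external."""
    return not address.lower().endswith("@" + INTERNAL_DOMAIN)


def classify(msg: Message) -> set:
    """Return tags suggested by direction, keyword, and pattern checks."""
    tags = set()
    # Message direction: anything sent to an outside address is tagged.
    if any(is_external(r) for r in msg.recipients):
        tags.add("outbound-external")
    # Keyword check against the words in the body.
    words = set(re.findall(r"[a-z]+", msg.body.lower()))
    if words & KEYWORDS:
        tags.add("business-record")
    # Pattern check for sequences that look like Social Security numbers.
    if SSN_PATTERN.search(msg.body):
        tags.add("sensitive-ssn")
    return tags
```

A production engine would combine many more signals, such as group membership from a directory service, but the shape is the same: each check contributes a tag, and the tags drive later decisions.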

The most robust intelligent archiving systems offer a wide variety of tagging rules based on customizable or predefined conditions. Flexibility is key because rules can be established on multiple levels. Tagging rules also allow for certain actions to be taken, including setting message retention policies, exclusion criteria, and review flags.
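As a rough sketch of how tagging rules might map conditions to actions, the Python below models each rule as a condition plus optional retention, exclusion, and review settings. The rule names, message fields, retention periods, and the "longest retention wins" tie-break are all hypothetical choices for this example, not features of a specific product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Outcome:
    retention_days: Optional[int] = None  # None means "use the default"
    exclude: bool = False                 # True means "do not archive"
    review: bool = False                  # True means "flag for review"


@dataclass
class Rule:
    name: str
    condition: Callable[[dict], bool]
    retention_days: Optional[int] = None
    exclude: bool = False
    review: bool = False


def apply_rules(message: dict, rules: list) -> Outcome:
    """Combine the actions of every rule whose condition matches."""
    outcome = Outcome()
    for rule in rules:
        if not rule.condition(message):
            continue
        if rule.retention_days is not None:
            # Assumed tie-break: keep the longest period any rule asks for.
            outcome.retention_days = max(outcome.retention_days or 0,
                                         rule.retention_days)
        outcome.exclude = outcome.exclude or rule.exclude
        outcome.review = outcome.review or rule.review
    return outcome


# Hypothetical rule set; periods are placeholders, not recommendations.
RULES = [
    Rule("newsletters", lambda m: "unsubscribe" in m["body"].lower(), exclude=True),
    Rule("finance", lambda m: m["sender_group"] == "finance", retention_days=7 * 365),
    Rule("external", lambda m: m["external"], retention_days=3 * 365, review=True),
]
```

Keeping conditions and actions separate is what lets rules be layered on multiple levels: a company-wide rule and a departmental rule can both match the same message, and the combining step decides which setting prevails.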

Integration with ECM

Many organizations already have an ECM system in place that categorizes and manages records across multiple content types. These systems can be integrated with intelligent e-mail archiving systems to allow the archive to store and optimize e-mail while enabling the ECM system to drive retention decisions that are consistent across different types of data. Once messages are in the integrated system, users can browse and search for messages the system manages.

For external management of retention policies, objects are created in the ECM system that reference archived messages in the intelligent archiving system. These objects are then controlled by the ECM system’s standard policies, which age objects through configured retention lifecycles and ultimately delete objects as they reach expiration. When a retained message is deleted by the ECM system, the integration ensures that the corresponding message is removed from the archive.
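The reference-and-expire flow described above might look something like the following Python sketch. The `Archive`, `EcmObject`, and `Ecm` classes are stand-ins invented for this example; real ECM and archiving products expose their own interfaces.

```python
from dataclasses import dataclass
from datetime import date, timedelta


class Archive:
    """Stand-in for the e-mail archive; stores message bodies by ID."""

    def __init__(self):
        self.messages = {}

    def store(self, message_id, body):
        self.messages[message_id] = body

    def delete(self, message_id):
        self.messages.pop(message_id, None)


@dataclass
class EcmObject:
    """ECM-side record that references an archived message by ID."""
    message_id: str
    expires_on: date


class Ecm:
    """Stand-in for the ECM system that owns the retention decision."""

    def __init__(self, archive):
        self.archive = archive
        self.objects = []

    def register(self, message_id, received, retention_days):
        expires = received + timedelta(days=retention_days)
        self.objects.append(EcmObject(message_id, expires))

    def expire(self, today):
        """Drop expired objects and remove their messages from the archive."""
        kept = []
        for obj in self.objects:
            if obj.expires_on <= today:
                self.archive.delete(obj.message_id)
            else:
                kept.append(obj)
        self.objects = kept
```

The point of the design is that the archive never decides when to delete. It only stores messages and carries out deletions that the ECM system's policies trigger.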

Putting Intelligence to Work

Once messages are categorized (using user or automated classification or by an integrated ECM system), the intelligent archiving system puts that classification to work in three ways: intelligent filtering deletes non-relevant e-mail before archiving; intelligent retention determines how long to keep archived e-mails based upon their classification; and intelligent discovery or review tags e-mails with metadata to make them easier to search and discover in the future.

Organizations can augment the benefits of an intelligent archiving system with best practices for e-mail retention. These include archiving all e-mail for at least the same period of time that backup tapes are retained.

To comply with new amendments to the United States Federal Rules of Civil Procedure, organizations must place holds on all e-mail subject to outstanding investigations to ensure it is not deleted. Organizations should also ask users to drag e-mail into records folders in their e-mail system to classify e-mail that needs to be stored beyond the default period; these folders should be pushed out only to users who tend to be process-oriented.

Finally, organizations should apply a default policy using automated classification for other groups of users and enforce an overriding policy to retain e-mail that has been flagged as containing sensitive information.
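One way to picture how these practices combine is the short Python sketch below. It assumes a default retention period, an optional longer period set by a records folder, an overriding period for sensitive messages, and a legal hold that blocks deletion outright. The field names and the periods themselves are illustrative assumptions, not recommendations.

```python
from datetime import date, timedelta

DEFAULT_RETENTION_DAYS = 90          # assumed default period
SENSITIVE_RETENTION_DAYS = 7 * 365   # assumed overriding period


def retention_days(message: dict) -> int:
    """Pick the retention period for a message."""
    # A records folder may set a period other than the default.
    base = message.get("folder_retention_days", DEFAULT_RETENTION_DAYS)
    # The sensitive-information policy overrides, but never shortens.
    if "sensitive" in message["tags"]:
        return max(base, SENSITIVE_RETENTION_DAYS)
    return base


def may_delete(message: dict, today: date) -> bool:
    """A message on legal hold is never deleted, whatever its age."""
    if message.get("legal_hold", False):
        return False
    expires = message["received"] + timedelta(days=retention_days(message))
    return today >= expires
```

Checking the hold first reflects the order of precedence the practices imply: a hold beats every retention period, and the sensitive-information policy beats the default.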

Regardless of the direction an organization takes for managing e-mail, adding intelligence to archiving policies helps balance storage optimization, records retention, and fast discovery while capturing the business value of e-mail archives.

- - -

Art Gilliland is the senior director of product marketing for the Symantec Information Foundation team. Gilliland joined Symantec through the acquisition of IMlogic, an instant messaging security and management vendor. You can reach the author at
