IBM, Symantec Team Automates Anti-Virus Efforts

Until recently, whenever a new computer virus was discovered, "deprogrammers" at various antivirus research labs would spring into action, identifying signatures and how the virus spreads.

Within days, a fix would be prepared and distributed to companies. Now, malicious code can be identified and an antidote generated and turned around almost as soon as the code is received. IBM, working in partnership with Symantec Corp., has automated the entire virus identification and eradication process so that most users will never see the threat.

The core of this effort is the "Digital Immune System," a system developed within IBM’s labs and licensed to Symantec that quickly responds to new threats with appropriate antivirus code.

Until recently, "the anti-virus industry used to cope with basically manual methods," says Dr. Steve R. White, senior manager of the massively distributed systems group at IBM’s T.J. Watson Research Center. "They had a barn full of skilled deprogrammers that disassemble viruses and figure out what they do." But as the rate of virus infections climbs, the "current methods used by the antivirus industry break down. You can’t get 10 or 100 more barns full of people to go look at viruses on such a scale. There will be too many viruses for humans to analyze, moving too quickly for people to get ahead of them."

Researchers estimate that as many as 10 computer viruses a day are unleashed on the world.

The Digital Immune System concept is modeled on the human body’s immune response to viruses: it detects viruses and automatically routes them from client systems to the Symantec AntiVirus Research Center (SARC) or to IBM’s Watson labs.

IBM researchers have devised a PC-based system that can scan files for suspicious patterns and identify the virus’s signature. Then, in the fashion of a body’s immune system, "antibodies" are sent out over the Internet to Norton AntiVirus client sites.

Eventually, researchers hope, corporate servers will be able to automatically initiate the process when unusual code is detected coming through the firewall.

Symantec is releasing its implementation of the Digital Immune System in phases over the coming year, starting with a managed anti-virus solution built on top of Norton AntiVirus Corporate Edition. This will be linked to the Symantec AntiVirus Research Center (SARC), featuring technology conceived at IBM Research and jointly developed with the scientists at SARC.

The goal of the Digital Immune System is to identify and dissect a virus and send out a cure within hours, and eventually even minutes. When a virus is addressed for one customer, every Digital Immune System user receives an update. Potentially, the entire process can be automated, with viruses detected either at the firewall or by the client system, packaged up, and sent to IBM or Symantec. "If it’s a virus that’s already known, Norton AntiVirus will deal with it routinely," says David Chess, researcher for IBM. "But if it’s a new virus, then it would automatically get bundled up and sent off."

Of course, some companies may be a little nervous about automatically shipping files to an outside service, sight unseen. "The administrator has control, and has the option to look at the files first," says Chess. "We’re hoping the administrators will come to trust IBM, because we’re good guys, and not mind having the documents sent to our system, knowing we’ll protect them carefully. The system works best the more automatic it can be. If it has to wait for the administrator to come in on Monday and push the ‘okay’ button, then that slows down the process of finding a cure for the virus."
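The client-side flow described above (known viruses handled locally, new ones bundled up and sent off, with an optional administrator hold) can be sketched roughly as follows. This is only an illustration: the class `ImmuneClient`, its fields, the `suspicious` flag, and the returned status strings are all hypothetical names, not part of Norton AntiVirus or IBM's system, and the "submission" is just a list standing in for a transfer to an analysis center.

```python
from dataclasses import dataclass, field


@dataclass
class ImmuneClient:
    """Hypothetical client; all names here are assumptions for illustration."""
    known_signatures: dict = field(default_factory=dict)  # virus name -> signature bytes
    outbox: list = field(default_factory=list)            # stand-in for the analysis center
    auto_submit: bool = True                               # False = administrator reviews first

    def handle(self, filename: str, data: bytes, suspicious: bool) -> str:
        # A known virus is dealt with routinely on the client.
        for name, signature in self.known_signatures.items():
            if signature in data:
                return f"known:{name}"
        # `suspicious` stands in for whatever heuristic flags unusual code.
        if not suspicious:
            return "clean"
        # A possible new virus is either sent off automatically or held for review.
        if self.auto_submit:
            self.outbox.append((filename, data))
            return "submitted"
        return "held-for-admin"

    def apply_update(self, name: str, signature: bytes) -> None:
        # A cure produced for one customer is distributed to every client.
        self.known_signatures[name] = signature
```

With `auto_submit` left on, a suspicious file goes straight to the outbox; with it off, the file waits for the administrator, which is the delay Chess describes.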

This antivirus ecosystem -- accompanied by enterprise-scale virus protection in corporate firewalls -- is moving the focus of antivirus solutions away from desktop computers and PC-based packages, says Rob Enderle, analyst with GIGA Group. "Three or four years from now, you won’t buy an antivirus product, you will buy a service," Enderle predicts. "All the antivirus checking will exist automatically, some of which will run on your machine, some of which will not."

Products such as ScanMail from Trend Micro, Inc. scan e-mail attachments for viruses at the Exchange Server before they reach users’ in-boxes, and also provide server-based e-mail content filtering and spam blocking for Exchange’s messaging and collaboration environment.

The close relationship between Symantec and IBM in antivirus research makes this combination one of the industry’s strongest antivirus solutions, Enderle notes. "Symantec gets access to technologies out of Watson research which are not enjoyed by the other players. It gives them a fairly significant advantage, and makes them the technical leader."

IBM’s Watson labs is one of the few sites that currently has a system that automatically examines a virus’s signature and generates an antidote. New viruses are sent -- either electronically or on disk -- to IBM Watson, where potentially malicious code is dissected and cataloged. At this point, about 20,000 viruses are stored on disks inside the lab. Researchers in the lab maintain standalone PCs and networks on which they can run a virus and watch how it replicates.

First, IBM researchers "automated the process of replication," says Chess. "Instead of having to sit in front of the keyboard and try to make the virus spread, we had a machine that automatically got the virus to spread and made samples. Then I would take them and figure out which part of the sample was the virus, and then figure out which part of the virus to take the signature from. We gradually automated those things too."

Code was developed that extracts the signature from the virus. One of the challenges was developing a system that could recognize signature properties in affected files that were not present in unaffected files, says Chess. "That’s the hard part. You want to avoid a false positive as much as possible." The signature evaluator runs on a statistical model of the composition of "clean code." By identifying the likely signature for a virus, the system can generate a fix with no human intervention.
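As a rough illustration of the idea (not IBM's actual method, which the article does not detail), signature selection can be sketched as: find byte sequences present in every infected sample, then keep the one a statistical model of clean code considers least likely, to limit false positives. The byte-bigram model, the add-one smoothing, the 8-byte signature length, and every function name below are assumptions made for this sketch.

```python
import math
from collections import Counter


def common_substrings(samples, length):
    """Byte sequences of the given length that occur in every infected sample."""
    first, *rest = samples
    candidates = {first[i:i + length] for i in range(len(first) - length + 1)}
    for sample in rest:
        candidates = {c for c in candidates if c in sample}
    return candidates


def train_bigram_model(clean_files):
    """Count single bytes and adjacent byte pairs across the clean corpus."""
    bigrams, unigrams = Counter(), Counter()
    for data in clean_files:
        unigrams.update(data)
        bigrams.update(zip(data, data[1:]))
    return bigrams, unigrams


def log_prob(candidate, bigrams, unigrams):
    """Estimated log-probability of the candidate under the clean-code model."""
    total = sum(unigrams.values())
    # Add-one smoothing over the 256 possible byte values.
    lp = math.log((unigrams[candidate[0]] + 1) / (total + 256))
    for a, b in zip(candidate, candidate[1:]):
        lp += math.log((bigrams[(a, b)] + 1) / (unigrams[a] + 256))
    return lp


def pick_signature(samples, clean_files, length=8):
    """Return the shared sequence least likely to appear in clean code, or None."""
    bigrams, unigrams = train_bigram_model(clean_files)
    candidates = common_substrings(samples, length)
    if not candidates:
        return None  # nothing shared: a case to hand to a human analyst
    return min(candidates, key=lambda c: log_prob(c, bigrams, unigrams))
```

A production system would need a far richer model and a much larger clean corpus; the point here is only that the signature is chosen by how improbable it is in uninfected files, not merely by its presence in infected ones.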

However, some viruses may still require human intervention, Chess cautions. "Once in a while, it will have to kick one out to a human being. Some viruses only spread on alternate Wednesdays, or only under DOS version 3.0, or something to that effect. We’ve managed to automate a good chunk of that kind of weird stuff, too. For a typical virus, we can take it from reception to finished definition update without human intervention."

Another challenge is non-virus code that is identified as a virus. "The hardest task for the immune system -- or any heuristic program -- is when you give it a file that doesn’t have a virus," says Chess. "You spend a lot of time trying to find the virus, when there wasn’t one there at all." IBM is developing a mathematical model to determine the probability of a virus lurking in a set of files.

"False positives can be more expensive to a company than a real virus detection," Chess continues. "You spend a lot of time running around, trying to figure out what’s going on, maybe erasing files, maybe taking machines offline, and you discover there was no virus there at all." For example, one rumor that made the rounds was that any file named EXPLORER.EXE was a virus.

"We got thousands of copies of Windows Explorer," he recalls.