IronPort Anti-Spam™ combines best-of-breed conventional techniques with IronPort’s breakthrough context-sensitive detection technology to revolutionize the fight against email threats. Today’s spam attacks have become too sophisticated for earlier-generation spam systems. These systems share a common weakness: they rely heavily on analyzing content that can easily be manipulated by spammers. State-of-the-art anti-spam systems must go beyond content examination and analyze messages in the full context in which they are sent.
As spam continues to evolve, near real-time rules will need to remain a critical part of the anti-spam equation in order to eliminate spam and blended threats. With spam on the rise, this type of multi-layer defense is critical to protecting networks worldwide.

IronPort’s Multi-Layer Spam Defense: Architectural Overview
White Paper
Doc Rev 04.06

TABLE OF CONTENTS
Executive Summary
Introduction
From Content to Context
Analyzing in Context with CASE
The IronPort Anti-Spam Ecosystem
Enterprise Management
Summary

EXECUTIVE SUMMARY
Email threats have expanded from nuisance spam to sophisticated blended attacks. IronPort Anti-Spam eliminates the broadest range of known and emerging threats.

INTRODUCTION
The volume of spam has been steadily increasing every year since 2002. In addition to sheer volume, the sophistication of spammer tactics has also grown. This flood of illegitimate email is propelled by a powerful motive: profit. Spammers make money from activities ranging from selling marginal products such as herbal supplements, low-interest mortgages, and ergonomic mice to criminal enterprises such as credit card fraud, pornography, and illegal pharmaceutical sales.
The profits behind these endeavors are being plowed back into new technology and infrastructure for delivering spam.

When spam initially became a problem, corporations and networks began to deploy first-generation spam filters. These filters relied primarily on heuristic analysis: looking at the words in a message and using a weighting system to produce a probability that the message was spam. As these solutions became more widespread, spammers began to develop new, more sophisticated tactics to circumvent the filters. This spawned a cat-and-mouse game in which spammers would develop a new tactic to get past filters, anti-spam vendors would add a new technique to their cocktail to stop that tactic, spammers would come out with yet another tactic, and so on.

Recently, spam has been using increasingly sophisticated obfuscation techniques and mutating faster than ever. Most spam now includes blocks of text containing words known to score as “not spam,” often technical terms or a passage from a textbook. Other tricks involve white-on-white text or replacing letters with numbers (e.g., “l0ve”). Spammers have also become increasingly clever in using URLs. Some spam contains minimal content but includes a URL with a call to action, while other spam attacks host their URLs on the same servers used by legitimate websites by using free Web hosting services such as Geocities.

These obfuscation techniques have effectively defeated most content-based filters. While most vendors still claim spam capture rates in the high 90s, in reality their capture rate may be in the 80s (or worse). At the same time, content-based filters occasionally delete legitimate mail that happens to contain words associated with spam, creating a “false positive.” The table below highlights the evolution of spam filtering, along with the limitations of each approach.
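
To make the first-generation approach concrete, the following Python sketch scores a message by summing per-word weights and converting the total into a probability. It is an illustration only, not IronPort’s or any other vendor’s implementation: the word list, the weights, and the function name spam_probability are all hypothetical. It also shows why such filters are easy to defeat: spacing out letters hides the words from the filter, and padding a message with innocuous words pulls the score down.

```python
import math
import re

# Hypothetical per-word weights; positive means "spammy", negative means "legitimate".
# The values are invented for illustration only.
WORD_WEIGHTS = {
    "cheap": 1.5,
    "viagra": 3.0,
    "mortgage": 1.0,
    "meeting": -1.0,
    "office": -1.0,
    "shakespeare": -1.5,
}


def spam_probability(text: str) -> float:
    """Sum the weights of known words and squash the total into the range 0..1."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    total = sum(WORD_WEIGHTS.get(word, 0.0) for word in words)
    return 1.0 / (1.0 + math.exp(-total))


# A plain spam message, a letter-spaced variant, and a variant padded with "good" words.
plain = spam_probability("Cheap Viagra")
obfuscated = spam_probability("C H E A P V.i.a.g.r.a")
padded = spam_probability("Cheap Viagra office meeting Shakespeare")
```

With these invented weights, the plain message scores close to 1, the letter-spaced variant falls back to the neutral 0.5 because none of its fragments match a known word, and the padded variant scores lower than the plain one.
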
The first three generations each overcame weaknesses of the prior generation, but all of these approaches suffer from a common limitation. Each approach can be circumvented by spammers because it relies on something the spammers themselves have full control over: the content of the message. This is like building a house on top of a weak foundation.

Generation 1: Heuristics
  Limitations: Spoofable (spammers change words so filters don’t recognize spam but humans do). False positives (legitimate email often contains “spammy” words).
  Example: C H E A P V.i.a.g.r.a

Generation 2: Signatures
  Limitations: Spoofable (“hashbusters” fool bulk detection systems by making spam look dissimilar). Reactive (writing signatures first requires collecting spam samples).
  Example: Cheap Viagra dgjk#"

Generation 3: Adaptive
  Limitations: Spoofable (defeated by inserting “good” words that only machines see). High overhead (adaptive learning systems, like Bayesian, are hard to train and maintain).
  Example: Cheap Viagra here: http://abc.com Cancer, office, Shakespeare...

Generation 4: Context Adaptive
  Limitations: Emerging (requires extensive vendor investment in tracking email and Web reputation).

FROM CONTENT TO CONTEXT
Maintaining consistently high spam efficacy requires a new approach to the problem. This approach should leverage the latest in adaptive learning technology, but be based on a more holistic understanding of the context in which a message is sent. Importantly, this technology must incorporate information that the spammer cannot influence. This includes tracking the identity and reputation of the email sender and of the website advertised in the message.

The spam filtering technology employed in the IronPort email security appliances uses a highly advanced, multi-layer approach to evaluate a message. IronPort’s anti-spam solution moves beyond traditional content-based analysis by analyzing four broad areas:

1. Who (what do we know about the sender?)
2. Where (if the message contains links, what do we know about where those links go?)
3. What (what is the nature of the contents of the message?)
4. How (how was the message technically constructed?)

By examining a broad set of data, beyond the mere contents of a message, IronPort’s anti-spam system yields robust, highly accurate results that require no administrator intervention. This technology is currently deployed at the largest ISPs and enterprises in the world, protecting millions of end-user mailboxes.

The First Question: Who is sending me mail?
When a message arrives at an IronPort appliance, before any processing begins, the IronPort system evaluates the nature of the sender. This process is called Reputation Filtering, a technique pioneered by IronPort more than three years ago and subsequently adopted by every leading anti-spam vendor. The concept behind Reputation Filtering is simple but powerful: analyze the traffic patterns and network characteristics of a given sender to determine trustworthiness. The foundation of any reputation system lies in the quantity, quality, and breadth of data tracked.

Quantity: IronPort’s SenderBase is the world’s first, largest, and most accurate traffic monitoring network. SenderBase collects data on more than 120 parameters from over 100,000 different networks to characterize the behavior of a sender. This network includes eight of the ten largest ISPs in the world and a wide array of large and small enterprises, distributed globally. This powerful network gives SenderBase a view into an astounding 25 percent of the world’s email traffic. SenderBase traffic represents a statistically significant sample size, resulting in the extremely high accuracy of IronPort Reputation Filters™.

Quality: In addition to the size and breadth of its data, IronPort has developed a sophisticated data quality engine that allows SenderBase to account for data feeds from different sources with different circumstances, normalizing them for proper interpretation. (See “The IronPort Anti-Spam Ecosystem” section for more information.)

Breadth: The data measured by SenderBase includes the global volume of mail being sent by a particular sender, how long the sender has been sending mail and at what volume, whether the sender accepts mail in return, what the country of origin is, and whether the sender’s DNS is configured properly. These are all objective, network-based parameters that can be accurately measured.

Because they look at such a broad set of sender data, IronPort Reputation Filters are robust enough to overcome occasional outlying data points. This effect is illustrated in Figure 1.

As an example, a sender who has a long history of sending reasonable mail volumes, accepts mail in return, and is a Global 2000 company, but happens to have a misconfigured DNS record, will still have a positive reputation despite one or two questionable parameters. However, a sender who is sending 10 million messages per day, has just begun mailing on their IP, does not accept mail in return, is sending from a zombie PC, and is located in the Ukraine is likely to have a negative reputation. Consequently, mail from this sender can be stopped before it even enters the network.

DNS blacklists and whitelists were the predecessors to reputation systems, and some reputation systems today are still based solely on this early-generation technology. The advantage of a true reputation system is not only the breadth of data, but also granularity. Traditional blacklists are binary: a sender is either guilty (which means they are blocked) or not guilty (which means they can send as much mail as they would like).
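
The robustness to outlying data points described above can be sketched in a few lines of Python. This is a simplified illustration, not the SenderBase algorithm: it uses only four of the parameters named in the text, and the thresholds, the equal weighting, and the names SenderStats and reputation are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SenderStats:
    """A handful of the network-level signals named in the text."""
    days_sending: int          # how long the sender has been sending mail
    daily_volume: int          # messages per day observed globally
    accepts_return_mail: bool  # does the sender accept mail in return?
    dns_configured: bool       # is the sender's DNS configured properly?


def reputation(stats: SenderStats) -> float:
    """Average several signed signals so that no single one decides the result.

    The cut-offs (one year, 100,000 messages per day) are hypothetical.
    """
    signals = [
        1.0 if stats.days_sending >= 365 else -1.0,
        1.0 if stats.daily_volume <= 100_000 else -1.0,
        1.0 if stats.accepts_return_mail else -1.0,
        1.0 if stats.dns_configured else -1.0,
    ]
    return sum(signals) / len(signals)


# An established sender with one bad data point (misconfigured DNS).
established = reputation(SenderStats(2000, 5_000, True, False))

# A brand-new, high-volume sender that does not accept mail in return.
new_bulk_sender = reputation(SenderStats(1, 10_000_000, False, True))
```

The established sender keeps a positive result despite its DNS problem, while the new bulk sender ends up negative even though its DNS happens to be configured correctly. Because many signals are combined, a single outlier cannot flip the outcome.
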
IronPort Reputation Filters offer higher granularity, measuring sender reputation on a scale of -10 to +10. This allows the IronPort appliance to deal more gracefully with ambiguity. IronPort Reputation Filters are linked to IronPort’s unique rate limiting capability, which allows the appliance to intelligently push back on, or slow down, a sender that appears suspicious but has not yet earned a reputation worthy of blocking. In short, the more “spammy” a sender appears, the slower they go. The ability to dynamically apply limits to new or suspicious senders allows the IronPort appliance to greatly reduce the amount of incoming spam without incurring false positives, because suspicious or ambiguous mail is slowed but not blocked.

In production for more than three years, IronPort Reputation Filters are so accurate they can stop 80 percent of incoming spam at the connection level. This powerful outer layer reduces email bandwidth consumption and hardware administration costs by as much as 80 percent. It also serves as a valuable shock absorber by dramatically reducing the number of messages that need to be scanned during a denial-of-service attack, spam outbreak, or virus outbreak. This filtering is illustrated in Figure 2.

Figure 1: Global Efficacy for Broad Threats. Broad data analysis drives accuracy.

But the concept of reputation does not stop at the perimeter. The reputation of the sender is passed to IronPort’s Context Adaptive Scanning Engine™, known as CASE. This is the engine that looks at a broad set of data, including sender reputation, to evaluate a message in context and make a final determination of “spaminess.”

The Second Question: Where do the links in this message take me?
In order to make money, spammers need a call to action in their message.
This call to action may be a phone number to call to buy a product, a physical address to send money to, or the ticker symbol of a penny stock the spammer wants you to buy. More often than not, though, the call to action in an email message is a URL linking to a website with a product offer or malicious content. Over 85 percent of spam today contains a URL in the message.

Just as they did with blacklists and whitelists of IP addresses three years ago, vendors are trying to address this problem by constructing blacklists and whitelists of URLs. This approach is like a game of whack-a-mole, however, as spammers generate hundreds or thousands of URLs, often only for a few hours. By the time traditional URL blacklists list a new URL, the attacker has defrauded his victims and moved on to a new URL. As with email reputation, solving this problem requires the ability to track the reputation of both the URL and the entity that controls it in near real time.

Figure 2: IronPort Reputation Filters are the outer layer of defense, stopping 80 percent of hostile mail at the door.

IronPort collects more than 40 different parameters to determine the reputation of a website. For example:

- How long has the domain been registered?
- Are the domain’s whois records valid?
- What country is the website hosted in?
- What is the reputation of the network hosting the website?
- What is the global volume of requests to this site?
- How has that volume changed over time?
- What is the nature of the content on the site?
- What is the reputation of the mail server that sends URLs linking to the site?

This data is collected in SenderBase from more than 100,000 different networks in a manner similar to email traffic data. IronPort’s statisticians have developed Web reputation algorithms that are similar to the email reputation algorithms.
These algorithms produce a Web reputation score, which is made available to the IronPort CASE for spam filtering.

The Third Question: How was this message constructed?
Today it is increasingly easy to buy an off-the-shelf spamware package to generate millions of email messages. These packages are extremely powerful, but often leave traces that identify the program that generated the message. Spammers also have a vested interest in masking their real identities, and they exploit the weaknesses of SMTP by forging elements of their messages.

Structural rules examine how the message is constructed, looking for subtle patterns that differentiate good mail from bad. For example, does the message contain signs of obfuscation, such as legitimate text hidden using a font color that is nearly identical to the background color? Do the message headers contain the fingerprint of a known “spamware” toolkit, such as SendSafe, used by spammers to send their messages? Structural rules also help identify signs of forgery. For example, does a message claim to come from a trusted webmail provider, but really originate from an entirely separate source?

The Fourth Question: What does the message contain?
While inadequate in and of itself, content analysis is useful when applied in the full context in which the message was received.

IronPort Anti-Spam includes advanced lexical analysis that examines the contents of each message and considers them in the context of who is sending the message, how it was sent, and where links in the message point. A message may contain the word “Viagra,” but if it is coming from a source that is a known pharmaceuticals company, the positive sender reputation score will offset any suspicions raised by the content and the message will pass through.
Similarly, a message that contains many financial terms such as “mortgage” and “interest rate,” but appears to be coming from a consumer broadband network that does not accept mail in return, will have a high likelihood of being spam. IronPort Anti-Spam can interpret all major international languages, including the double-byte characters used in most Asian languages.

ANALYZING IN CONTEXT WITH CASE
IronPort’s Context Adaptive Scanning Engine (CASE) pulls all of these questions together. By examining a message in its full context, considering who it is from, where the links point, how it was constructed, and what language it contains, CASE acts as an extremely powerful machine learning system that makes accurate spam/not-spam decisions. In doing so, it begins to emulate the logic that a human would use when evaluating an unknown message. Who is it from? Where does it take me? Does it look real? What language does it contain? As illustrated earlier in Figure 1, this broad contextual analysis allows the CASE technology to look beyond a few attributes that might appear anomalous and accurately classify messages as spam or not.

One of the challenges associated with contextual analysis is that a comprehensive examination of every message can be extremely computationally intensive. IronPort offsets this challenge by eliminating unwanted email as soon as enough information about a spam message is known to block it. Reputation Filters ensure that CASE only examines the small percentage of mail that is not clearly known good or known bad. CASE technology uses a unique early exit algorithm to efficiently reach verdicts.

Early exit allows IronPort’s CASE to stop scanning a message once a verdict is reached. By running the most applicable rules first, the majority of spam can be stopped without running the entire rule set.
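
The early exit idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the CASE algorithm, whose rules and scoring are not described in this paper: the rule names, weights, thresholds, and the classify function are all hypothetical. One simplification is worth noting: exiting on a running total is only safe if the remaining rules could not reverse the verdict, which a production system would need to guarantee.

```python
from typing import Callable, List, Tuple

# A rule is (name, weight, predicate). Positive weights point toward spam.
Rule = Tuple[str, float, Callable[[str], bool]]

# Hypothetical rules, ordered with the most decisive ones first.
RULES: List[Rule] = [
    ("known_spamware_header", 6.0, lambda m: "x-mailer: sendsafe" in m.lower()),
    ("known_partner_signature", -6.0, lambda m: "x-partner-signature:" in m.lower()),
    ("mentions_viagra", 2.0, lambda m: "viagra" in m.lower()),
    ("mentions_mortgage", 1.0, lambda m: "mortgage" in m.lower()),
]

SPAM_THRESHOLD = 5.0       # at or above this total, stop and call it spam
NOT_SPAM_THRESHOLD = -5.0  # at or below this total, stop and call it legitimate


def classify(message: str, rules: List[Rule] = RULES) -> Tuple[str, int]:
    """Return (verdict, rules_evaluated), stopping as soon as a verdict is clear."""
    score = 0.0
    evaluated = 0
    for _name, weight, matches in rules:
        evaluated += 1
        if matches(message):
            score += weight
        if score >= SPAM_THRESHOLD:
            return "spam", evaluated
        if score <= NOT_SPAM_THRESHOLD:
            return "not spam", evaluated
    # No early exit: fall back to the sign of the accumulated score.
    return ("spam" if score > 0 else "not spam"), evaluated


spamware = classify("X-Mailer: SendSafe\n\nCheap Viagra")
partner = classify("X-Partner-Signature: ok\n\nYour mortgage statement")
ambiguous = classify("viagra and mortgage")
```

In this sketch the first message is classified after a single rule and the second after two, while the ambiguous message has to run through the whole list. The saving comes from ordering the rules so that the most decisive ones run first.
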
Two unique aspects of the CASE early exit system are that the order in which rules are processed is updated dynamically, and that the early exit algorithm is applied to legitimate email as well as spam. This approach yields a massive increase in throughput, allowing CASE to deliver more than three times the throughput of traditional rules-based spam filters. The early exit concept is illustrated in Figure 3.

THE IRONPORT ANTI-SPAM ECOSYSTEM
The tactics of spam are always changing, so a world-class anti-spam system must constantly measure and respond to these changing tactics and have facilities to provide real-time updates to stay ahead of the flood. IronPort has developed unique technology and invested in large-scale infrastructure to measure and characterize spam behavior, providing a dynamic stream of updates to its appliances in the field.

A critical component of anti-spam efficacy is the quality of the rules run by CASE. IronPort’s Threat Operations Center (TOC) has built a very sophisticated system to measure and manage rule efficacy and to generate a constant flow of new rules to respond to the shifting tactics of spammers.

At the heart of the TOC is a massive and highly diverse database. Data streams into the TOC from more than 100,000 different networks around the world, including very large entities such as eight of the ten largest ISPs in the world. This data feed includes SMTP and HTTP traffic data, used in the email and Web reputation systems, as well as a stream of millions of spam messages from a variety of sources. To account for the varying quality and sources of this huge feed of incoming spam, IronPort has developed a data quality engine. This engine uses statistical techniques to compare the results of a given data feed with the characteristics of a known sample and then normalize the results.
For example, a feed from trained and proven-reliable human reporters may indicate a 90 percent probability of spam, while a feed from a large consumer ISP might indicate only a 50 percent probability. If the same message shows up in both feeds, the probability might grow to 96 percent. The technicians and statisticians in IronPort’s TOC have years of experience properly interpreting and weighting different data sources.

Figure 3: IronPort Anti-Spam Advantage: Performance. Early exit accelerates scan time.

This network of over 100,000 sources feeds the world’s largest corpus of email. The corpus contains messages from around the world that have been classified with certainty as either spam or legitimate email. It is constantly updated with millions of new messages daily, which are automatically classified and then verified with human oversight by the TOC analysts.

The corpus is used to generate new rules automatically as well as manually. The TOC’s rule-writing technicians are tasked with detecting the small subset of spam that automated systems fail to detect. These technicians are equipped with tools to group messages that share similar underlying characteristics, using a patent-pending technique called Feature Similarity Vectoring (FSV). Unlike fuzzy checksum approaches, which rely on several message attributes to determine message similarity, FSV determines message relatedness by analyzing thousands of message attributes. By associating seemingly disparate messages, analysts are able to quickly write rules based on the underlying attributes common across the attack.

Once a technician creates a new rule, it is added to the body of rules and goes through a battery of tests to ensure that it is accurate. Using advanced statistical techniques, the entire body of rules is repeatedly run against the corpus, and each rule is assigned an optimal weight.
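
One simple way to derive a rule weight from a labeled corpus is a smoothed log-odds ratio, sketched below in Python. The paper does not say which statistical techniques the TOC uses, so this is a stand-in chosen for clarity rather than a description of IronPort’s method. The six-message corpus, the three rules, and the function name rule_weight are all invented for the example.

```python
import math
from typing import Callable, List, Tuple

# Tiny stand-in corpus of (message, is_spam) pairs. A real corpus holds millions.
CORPUS: List[Tuple[str, bool]] = [
    ("cheap viagra here", True),
    ("viagra discount today", True),
    ("low interest mortgage offer", True),
    ("quarterly mortgage report attached", False),
    ("meeting moved to noon", False),
    ("lunch at noon?", False),
]


def rule_weight(rule: Callable[[str], bool],
                corpus: List[Tuple[str, bool]] = CORPUS) -> float:
    """Smoothed log-odds of a rule firing on spam versus on legitimate mail."""
    n_spam = sum(1 for _text, is_spam in corpus if is_spam)
    n_ham = len(corpus) - n_spam
    spam_hits = sum(1 for text, is_spam in corpus if is_spam and rule(text))
    ham_hits = sum(1 for text, is_spam in corpus if not is_spam and rule(text))
    # Add-one smoothing keeps the weight finite when a rule never fires on one class.
    spam_rate = (spam_hits + 1) / (n_spam + 2)
    ham_rate = (ham_hits + 1) / (n_ham + 2)
    return math.log(spam_rate / ham_rate)


viagra_weight = rule_weight(lambda text: "viagra" in text)
mortgage_weight = rule_weight(lambda text: "mortgage" in text)
noon_weight = rule_weight(lambda text: "noon" in text)
```

On this toy corpus the “viagra” rule receives a positive weight, the “noon” rule a negative one, and the “mortgage” rule a weight of zero because it fires equally often on both classes. A weight near zero marks a rule that does little to separate spam from legitimate mail.
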
Rules that are less effective are expired or dropped from the rule set. Rules are automatically ordered based on their contribution toward catching spam. This dynamic ordering of rules is a key enabler for IronPort’s unique early exit algorithm, described earlier.

The extensive technology and infrastructure of the Threat Operations Center creates over one hundred thousand new rules every day. These rules are sent to IronPort appliances using both push and pull updates. In some cases the appliances pull new rules or launch a query to SenderBase about a particular sender. In other cases new rules are pushed to the appliance. The update mechanisms include HTTP and DNS text records. Rules on the system are automatically updated, deployed, cached, and expired depending on the class of rule. This highly robust update scheme allows the IronPort appliances to provide reliable protection even if some or all of the centralized rule infrastructure were to become unavailable.

ENTERPRISE MANAGEMENT
When serving large enterprises, having cutting-edge technology is obviously an important part of a successful solution. Equally important are enterprise reporting and management tools that minimize administrator burden and help address the business case for the investment required in the email security infrastructure. IronPort has developed a world-class reporting system that allows IT staff to measure the return on their investment, as well as advanced management tools to adapt to the varied needs of a global enterprise end-user population.

Building a truly enterprise-class reporting and management system is non-trivial. Enterprise reporting and management have been designed into IronPort appliances since their inception. Every IronPort appliance contains a real-time reporting system called Mail Flow Monitor™. Mail Flow Monitor gives a real-time view into what is happening in the system.
Is mail backing up in a queue? Is the system being attacked by a spam outbreak or DDoS attack? The system automatically highlights anomalies and generates SNMP and/or email alerts as required. Mail Flow Monitor also provides a historical summary of how much traffic has been received and what percentage is spam, carries a virus, was blocked by reputation filtering, and so on. These historical reports can be automatically generated and distributed periodically.

In addition to the on-box reporting of Mail Flow Monitor, IronPort offers powerful centralized reporting with Mail Flow Central™. Mail Flow Central pulls log data off multiple appliances and stores it in a SQL database. This database can be queried using IronPort’s simple Web-based tools to generate historical reports, perform capacity planning, and support ROI analysis. In addition, Mail Flow Central has a powerful message tracking capability that allows IT staff to easily see what happened to any given message. Tracking can be done by sender, recipient, domain, size, or virtually any other message attribute. This unique capability significantly reduces the troubleshooting burden.

In addition to real-time and centralized reporting systems, IronPort has developed a family of end-user-facing controls. IronPort Anti-Spam supports a simple Outlook plug-in that allows end users to identify and report missed spam at the click of a button. This spam is automatically routed back to IronPort, and the filter algorithms are tuned based on the feedback. The IronPort appliances also support a fully integrated end-user quarantine that can store either “suspect spam” or “suspect and definite spam.” Many customers who use the quarantine do so only for suspected spam and drop known spam because of the extremely low false positive rate of IronPort Anti-Spam. The quarantine can automatically generate a summary email that is sent to end users with the subject lines of all quarantined messages.
If a user sees a subject of interest, they simply click on the link and launch a familiar webmail interface. There they can view messages and release or delete them. Released messages are routed through the IronPort appliance so that it can automatically adjust its algorithms. Quarantine size limits can be set, and messages are automatically purged. The quarantine application is fully integrated into the appliance.

SUMMARY
Today’s spam attacks have become too sophisticated for earlier-generation spam systems. These systems share a common weakness: they rely heavily on analyzing content that can easily be manipulated by spammers. State-of-the-art anti-spam systems must go beyond content analysis and analyze messages in the full context in which they are sent. Maintaining leading efficacy also requires publishing high-quality rules in near real time. Rule quality is driven by the size, breadth, and quality of the data that feeds the rule generation system. Finally, the most effective rule development systems have humans in the loop, analyzing and responding to the last few percent of spam messages that escape automated defenses.

IronPort Anti-Spam is unique in the industry: it analyzes messages in their full context, allowing the system to be very robust and accurate. IronPort pioneered the concept of reputation filtering, starting with email reputation and more recently Web reputation. These two factors are very powerful components of full-context analysis because they are based on data not easily controlled by spammers. IronPort has also innovated with its Context Adaptive Scanning Engine, which examines email reputation, Web reputation, message construction, and content as efficiently and accurately as possible. This system is supported by the industry’s most sophisticated Threat Operations Center, which captures and processes massive quantities of data to keep IronPort Anti-Spam one step ahead of ever-changing email threats.

IronPort Systems is the leading email and Web security products provider for organizations ranging from small businesses to the Global 2000. IronPort provides high-performance, easy-to-use, and technically innovative products for those faced with the monumental task of managing and protecting their mission-critical networks from Internet threats.

IronPort Systems, Inc.
950 Elm Avenue, San Bruno, CA 94066
Tel 650.989.6500  Fax 650.989.6543
Email info@ironport.com  Web www.ironport.com

Copyright 2006 IronPort Systems, Inc. All rights reserved. IronPort and SenderBase are registered trademarks of IronPort Systems, Inc. All other trademarks are the property of IronPort Systems, Inc. or their respective owners. Specifications are subject to change without notice. P/N 434-0202-1 4/06






