Below is a list of a dozen research projects now under way that focus on new technology and techniques to stop spam. Many of these projects react to exploits already in use, such as image spam and phishing, but the researchers' work is designed both to counter spammers' current tactics and to help prevent future ones. The list is by no means complete; it covers select papers recently made public.
Spam filter makers were stumped when image spam made its debut last fall: by hiding the spam message inside an image whose contents filters could not read, spammers got their messages through to inboxes.
"Learning Fast Classifiers for Image Spam" is the name of a research paper from the University of Pennsylvania that describes how filters can be tweaked to quickly determine whether or not an inbound message containing an image is spam.
The paper discusses three techniques: focusing on simple properties of the image so that classification is as fast as possible; an algorithm that selects features for classification based on both speed and predictive power; and just-in-time feature extraction, which, according to the paper, "creates features at classification time as needed by the classifier".
Researchers claim a 90% to 99% success rate using real-world data in their own tests.
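To make the just-in-time idea concrete, here is a minimal Python sketch. It is not the Pennsylvania researchers' code: the feature names, the thresholds and the tiny hand-written decision rule are all invented for illustration, and the image is assumed to be a plain dictionary of metadata and pixel rows. The point is only that a feature is extracted the first time the classifier asks for it, so a message that can be classified on a cheap feature never pays for an expensive one.

```python
from typing import Any, Callable, Dict, List

# Hypothetical image record: metadata plus rows of pixel values.
Image = Dict[str, Any]


def file_size_kb(image: Image) -> float:
    """Cheap feature: read straight from metadata."""
    return image["size_bytes"] / 1024.0


def aspect_ratio(image: Image) -> float:
    """Cheap feature: width divided by height."""
    return image["width"] / image["height"]


def distinct_colours(image: Image) -> float:
    """Costlier feature: has to look at every pixel."""
    return float(len({p for row in image["pixels"] for p in row}))


EXTRACTORS: Dict[str, Callable[[Image], float]] = {
    "file_size_kb": file_size_kb,
    "aspect_ratio": aspect_ratio,
    "distinct_colours": distinct_colours,
}


class LazyFeatures:
    """Computes each feature only when it is first requested."""

    def __init__(self, image: Image) -> None:
        self._image = image
        self._cache: Dict[str, float] = {}
        self.computed: List[str] = []  # order in which features were extracted

    def get(self, name: str) -> float:
        if name not in self._cache:
            self._cache[name] = EXTRACTORS[name](self._image)
            self.computed.append(name)
        return self._cache[name]


def classify(features: LazyFeatures) -> str:
    """Toy decision rule; the thresholds are assumptions, not from the paper."""
    if features.get("file_size_kb") < 2.0:
        return "ham"   # decided on the cheapest feature alone
    if features.get("aspect_ratio") > 4.0:
        return "spam"  # banner-shaped image
    # Only now is the per-pixel feature worth computing.
    return "spam" if features.get("distinct_colours") < 8 else "ham"


tiny_icon: Image = {
    "size_bytes": 1000, "width": 16, "height": 16,
    "pixels": [[0, 1], [2, 3]],
}
features = LazyFeatures(tiny_icon)
label = classify(features)
print(label, features.computed)
```

In this sketch the small icon is labelled on file size alone, and the other two extractors never run. A real system would learn both the rule and the order of features from data rather than hard-coding them.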
Another project, "Filtering Image Spam with Near-Duplicate Detection", from Princeton University, also targets spam hidden in pictures. According to the researchers behind the project, image spam is often sent in batches of visually similar images that differ only through the application of randomisation algorithms.
The researchers propose a near-duplicate detection system that relies on traditional antispam filtering to whittle inbound mail down to a subset of spam images, then applies multiple image-spam filters to flag all the images that look like the spam caught by traditional means. The prototype, its developers say, has reached "high detection rates" and a false-positive rate (legitimate mail classified as spam) of less than 0.001%.
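The general idea of near-duplicate matching can be illustrated with a short Python sketch. This is not the Princeton prototype, whose filters are not described here; it swaps in a simple average hash compared by Hamming distance, a common stand-in for this kind of matching. The function names, the 4x4 greyscale grids and the distance threshold are all assumptions made for the example.

```python
from typing import Iterable, Sequence

Grid = Sequence[Sequence[int]]  # greyscale pixel rows, values 0-255


def average_hash(pixels: Grid) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which two hashes differ."""
    return bin(a ^ b).count("1")


def is_near_duplicate(candidate: Grid,
                      known_spam_hashes: Iterable[int],
                      max_distance: int = 2) -> bool:
    """True if the candidate hashes close to any image already caught as spam."""
    h = average_hash(candidate)
    return any(hamming(h, known) <= max_distance for known in known_spam_hashes)


# An image the traditional filter has already caught (hypothetical data).
caught = [[200, 200, 10, 10],
          [200, 200, 10, 10],
          [10, 10, 200, 200],
          [10, 10, 200, 200]]

# The same picture with small random perturbations, as a spammer might send.
randomised = [[198, 203, 12, 9],
              [201, 199, 11, 14],
              [8, 13, 202, 197],
              [12, 9, 199, 204]]

# A visually different image.
unrelated = [[10, 200, 10, 200],
             [200, 10, 200, 10],
             [10, 200, 10, 200],
             [200, 10, 200, 10]]

spam_hashes = [average_hash(caught)]
print(is_near_duplicate(randomised, spam_hashes))
print(is_near_duplicate(unrelated, spam_hashes))
```

Small pixel-level noise leaves the hash unchanged, so the randomised copy matches the caught image while the unrelated one does not. Heavier randomisation, such as cropping or rotation, would defeat a hash this simple, which is presumably why the researchers describe applying several filters rather than one.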