UCSD's Approaches to Spam

Greylisting

What is greylisting, and why does it stop spam?

Greylisting is a technique that refuses to accept mail from mail senders that do not follow the proper procedure for delivering mail. When a message arrives from a sender we have not seen before, the sender is told to try again later. Proper procedure dictates that the sender wait a while and then try again.

This is how we separate legitimate mail senders from spammers. Spammers don't really care if a particular message gets through, so they usually do not try again later. Thus, we never accept the message, and it is not received by you.

Legitimate senders, however, will try again. When they do, we accept the message and put them in a list of legitimate mail senders. Once a mail sender has been shown to be legitimate, we remember them the next time they try to send a message, so that any future mail is accepted without delay. If many legitimate messages come through one mail server, all senders on that server are considered legitimate.

A more detailed introduction to greylisting can be found on the Texas A&M University Computing & Information Services Help Desk.

How greylisting is implemented on UCSD's mailers: The technical details

For each incoming message, we generate a data pair consisting of the relay IP address of the sender and the envelope sender address (the SMTP "From:<user@site.com>" address). We look that pair up in a common database to get its current status.

  1. If they aren't in the database, we mark them as "grey" status, with the timestamp set to now, and return a temporary 451 message telling them to come back later.
  2. If they're in the database as status "grey", we examine the timestamp.

    a) if they've been grey for less than T1, we tell them to go away (451) for the remainder of the T1 interval.

    b) if they've been grey for longer than T2, we update the timestamp and leave them grey; they'll have to try again.

    c) if their last visit in grey status was between T1 and T2, we turn them into "autowhitelist" status, update the timestamp, and accept the mail.

  3. If their status was "autowhitelist", we update their timestamp and accept the mail.
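
The steps above can be sketched in a few lines of Python. This is only an illustration, not the code that runs on UCSD's mailers: the in-memory dictionary stands in for the common database, and the names check, Entry, T1_SECONDS and T2_SECONDS are made up for this example. It also assumes that a retry made before T1 has elapsed leaves the original timestamp unchanged.

```python
from dataclasses import dataclass

T1_SECONDS = 1 * 60        # earliest retry that is accepted (1 minute)
T2_SECONDS = 8 * 60 * 60   # latest retry that is accepted (8 hours)


@dataclass
class Entry:
    status: str       # "grey" or "autowhitelist"
    timestamp: float  # seconds; time of the last status update


def check(db, relay_ip, sender, now):
    """Return "accept" or "defer" (a 451 reply) and update db in place."""
    key = (relay_ip, sender)
    entry = db.get(key)
    if entry is None:
        # Step 1: unknown pair, record it as grey and defer.
        db[key] = Entry("grey", now)
        return "defer"
    if entry.status == "grey":
        age = now - entry.timestamp
        if age < T1_SECONDS:
            # Step 2a: retried too soon (assumption: timestamp is kept).
            return "defer"
        if age > T2_SECONDS:
            # Step 2b: retried too late, restart the clock.
            entry.timestamp = now
            return "defer"
        # Step 2c: retried inside the window, promote and accept.
        entry.status = "autowhitelist"
        entry.timestamp = now
        return "accept"
    # Step 3: already autowhitelisted.
    entry.timestamp = now
    return "accept"


db = {}
print(check(db, "192.0.2.10", "user@site.com", now=0))    # defer
print(check(db, "192.0.2.10", "user@site.com", now=30))   # defer
print(check(db, "192.0.2.10", "user@site.com", now=300))  # accept
```

Because the lookup key is the (relay IP, sender address) pair, a retry from a different IP address or with a different sender address starts again at step 1.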

Cleanup

Periodically, we clean out entries from the database that are either

  1. "grey" status and much more than T2 old, or
  2. "autowhitelist" status and several weeks old.

In our current implementation, T1 is 1 minute, and T2 is 8 hours.
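
A cleanup pass of this kind might look like the sketch below. The exact cut-offs are not stated above, so the two constants are assumptions chosen only for illustration ("much more than T2" is taken as 24 hours and "several weeks" as 5 weeks), and purge is a hypothetical name.

```python
GREY_MAX_AGE = 24 * 60 * 60            # assumed: "much more than T2"
WHITE_MAX_AGE = 5 * 7 * 24 * 60 * 60   # assumed: "several weeks"


def purge(db, now):
    """Delete expired entries from db and return how many were removed.

    db maps (relay_ip, sender) -> (status, timestamp_in_seconds).
    """
    limits = {"grey": GREY_MAX_AGE, "autowhitelist": WHITE_MAX_AGE}
    # Collect the keys first so the dict is not modified while iterating.
    stale = [
        key
        for key, (status, timestamp) in db.items()
        if now - timestamp > limits.get(status, float("inf"))
    ]
    for key in stale:
        del db[key]
    return len(stale)
```

Entries with an unrecognized status are left alone rather than deleted, which is the cautious choice for a sketch like this.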

What are the limitations and problems with greylisting?

Some mail servers do not retry delivery, do not retry soon enough, or retry too soon and then give up; some simply ignore the 451 and try to deliver the message anyway. Others misinterpret the 451 response and return a fatal delivery error to the sender. All such mail servers are improperly configured (not in keeping with Internet standards), and their owners will have to fix them to prevent further communication difficulty. Luckily, there are few such mail servers.

There are also ISPs and similar providers that use pooled outbound relays. These retry mail from a set of IP addresses, so the "try" and "retry" may well originate from separate systems. Their mail will get through eventually, but we whitelist any such relays that we find out about so that their mail won't be delayed.

Why might a piece of real mail be delayed, or not arrive?

A "pathological" delay can occur when a mail sender pools its outbound relays, meaning that delivery retries may come from different IP addresses each time. The greylister will see that as different hosts, and defer each one separately. Those deferrals may sum to an appreciable time. [This can also occur for mailing lists that use a separate error-return (SMTP "From") for each delivery attempt. Very few lists are known to do that, besides BugTraq.]

There are basically only two reasons why mail would be unable to pass through the greylist, causing delivery to fail:

  1. each retry has a different IP address and/or error return (SMTP From) address, so it doesn't look like a retry, or
  2. it's simply not retrying.

For case 1, we can add the host or subnet to the permitted relay list, and they'll be able to deliver without impediment. To request this, send email to postmaster@ucsd.edu with information on the sender.

For case 2, the sender really has to get their mail server fixed.
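
For case 1, a permitted relay list of hosts and subnets could be checked before the greylist is consulted. The sketch below uses Python's standard ipaddress module; the list entries are documentation-only address ranges, not real senders, and PERMITTED_RELAYS and is_permitted_relay are hypothetical names.

```python
import ipaddress

# Hypothetical entries for illustration only.
PERMITTED_RELAYS = [
    ipaddress.ip_network("192.0.2.0/24"),     # a pooled relay subnet
    ipaddress.ip_network("198.51.100.7/32"),  # a single host
]


def is_permitted_relay(relay_ip, permitted=PERMITTED_RELAYS):
    """Return True if relay_ip falls inside any permitted host or subnet."""
    address = ipaddress.ip_address(relay_ip)
    return any(address in network for network in permitted)
```

With a whole subnet listed, every relay in a sender's pool matches, so it no longer matters which machine makes the retry.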

Keep in mind that the retry delay is set by the sender's mail relay. If they do not retry for an hour, you will not receive your message for an hour. The retry time delay of the sender’s mail relay is not under our control.

Spam and Virus Scanning

Recent Statistics

In 2004:

  • There were 245 million incoming messages.
  • Spam or viruses made up 40% of incoming e-mail. There were 74 million spam messages, and 18 million virus-containing messages.
  • Three virus families contributed almost 90% of the viruses:
    • NetSky: 9 million
    • MyDoom: 6 million
    • Bagle: 2 million

After greylisting was implemented in February 2005, the percentage of messages marked as spam dropped from 25% to 10.9%.

About the Spam Scanner SpamAssassin

Contrary to the belief of some, we do not filter out or otherwise prevent the delivery of any incoming email messages at the central campus network level. The one exception is messages containing viruses, and in those cases a notice is sent to the recipient. The only other action we perform is to scan and tag messages to indicate how likely each one is to be spam.

In February 2003, ACS/Network Operations implemented campus spam scanning using the SpamAssassin software package, which uses a set of heuristics and tests to assign a "spam score" to each email message (based on particular characteristics of the message). All mail to UCSD email addresses that passes through the central email servers is scanned and receives a score that represents how likely that piece of mail is to be spam.

That spam score is inserted into the message headers so that users can create email client filters that act on it (e.g., moving messages with 5 to 12 stars into a spam folder) and reduce the spam they see in their inbox. For example:

X-Spam-Level: Level *****
X-Spam-Flag: Spam YES

UCSD email users can then use the ‘filter’ capacity of their mail readers to redirect high-scoring messages into a separate folder to be viewed at the reader's convenience.
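
Most mail readers let you set up such a filter through their own settings, but the logic is simple enough to sketch in Python. The example below counts the stars in the X-Spam-Level header and picks a folder. The threshold of 5 comes from the "5 to 12 stars" example above; the folder names and the function names spam_stars and choose_folder are assumptions made for illustration.

```python
from email import message_from_string

SPAM_THRESHOLD = 5  # assumed cut-off, from the "5 to 12 stars" example


def spam_stars(raw_message):
    """Count the asterisks in the X-Spam-Level header (0 if it is absent)."""
    message = message_from_string(raw_message)
    return message.get("X-Spam-Level", "").count("*")


def choose_folder(raw_message):
    """Return the (hypothetical) folder a client filter might file this in."""
    if spam_stars(raw_message) >= SPAM_THRESHOLD:
        return "Spam"
    return "Inbox"


raw = (
    "X-Spam-Level: Level *****\n"
    "X-Spam-Flag: Spam YES\n"
    "Subject: example\n"
    "\n"
    "Message body.\n"
)
print(choose_folder(raw))  # Spam
```

Filing into a folder rather than deleting keeps any mistakenly tagged mail available for review.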

This was most recently explained to all UCSD employees in February 2004, when an all-at-ucsd notice was sent reminding them of the campus spam scanning system, directing them to our online documentation, and inviting them to our Sharecase presentation.

Since the software can mark legitimate messages as spam for a variety of reasons, we recommend that users redirect high-scoring messages into a separate folder (rather than the trash) so they can browse through it on occasion for any false positives.

In the future, we would like to offer the option of letting users put their own definitions into the per-user SpamAssassin MySQL database, but for now that is not possible.

Description of Sophos Antivirus Software

The UCSD campus-wide virus scanning system, activated May 1, 2002, uses Sophos anti-virus software to scan each incoming email message transmitted through the central campus mail servers and intercept e-mail viruses.

Incoming e-mail messages containing viruses are quarantined when identified through the scanning process. An explanatory message for the recipient is inserted, and the message is then sent on as originally addressed.

In the case of a new virus, there is likely to be a delay before the anti-virus software vendors can identify the new virus and distribute files to allow their customers to do the same.

Clean messages are marked as scanned and queued for delivery to the addressees.

Additional information about the campus virus scanning may be found at http://www-no.ucsd.edu/security/virus-scan.html.

Network and Mail Security

Many spammers take advantage of poor security to use others' computers to send spam. We have taken steps to prevent our systems from becoming part of the problem.

SMTP server: Stopping spam at the source

When you send a message using UCSD's outgoing (SMTP) mail server, the server requires either that you are sending mail from on campus or that you have authorized your account to send mail. By rejecting messages from unauthorized users, we prevent spammers from taking advantage of our servers to send spam.
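
The rule amounts to a simple either/or test, sketched below in Python. The campus address range is a documentation-only prefix rather than UCSD's real network, and CAMPUS_NETWORKS, may_send and account_authorized are hypothetical names; how an account actually becomes authorized is outside the scope of this sketch.

```python
import ipaddress

# Hypothetical campus range for illustration only.
CAMPUS_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]


def may_send(client_ip, account_authorized):
    """Allow sending if the client is on campus or the account is authorized."""
    address = ipaddress.ip_address(client_ip)
    on_campus = any(address in network for network in CAMPUS_NETWORKS)
    return on_campus or account_authorized
```

An off-campus client with no authorized account fails both tests and is refused, which is what keeps outside spammers from relaying through the server.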

Sender Quarantine

Frequently, computers infected by viruses are turned into spam-sending "zombies". When such an infection is detected, mail is not accepted from the infected computer until we can verify that the owner has regained control of their systems.

Future Technologies

Coming Soon!
