Detecting Spam: Types of Spam Tests


 

E-Mail first passes through our perimeter routers, where it receives its first layer of virus protection. It then reaches our perimeter gateway, where we apply selective greylisting, tarpitting, and recipient validation. It is then passed to the server farm, where it is examined by two more anti-virus applications and checked against a Zero Hour antivirus database. There it also undergoes tests that review the sending server, the mail route taken, how the message was assembled, and common mail vulnerabilities. After that the message goes through a round of internal tests, a number of public and private RBLs are consulted, and then the message content is examined. The syntax is compared to a number of filters, and then links within the message are compared to URIBLs.

Then the mail is delivered to your inbox.  We work pretty hard to deliver your mail, don't we?

This list enumerates some of our internal tests.

 

Test Name

Description

BADHEADERS

This test checks the E-mail for illegal headers that are common in spam, but not common in legitimate E-mail. This test can catch about 50% of all spam, with the only false positives being mail that comes from broken mail clients. This is a very good test to use.

BASE64

This test will catch E-mail that uses MIME "base64" encoding for text or HTML segments. Using base64 encoding in these segments is becoming common in spam, as it allows spammers to bypass most filtering systems. However, there is no advantage for legitimate mail to be sent this way (worse, it ends up causing the size of the E-mail to be greater). Very few legitimate E-mails will be caught by this test.

BCC

This test will catch E-mail that has a lot of local recipients that are not listed in the E-mail headers. This test is normally only used in advanced setups, as most mailing list E-mail has many recipients not listed in the headers.

BITMASK

This is a type of external test that allows multiple test results to be returned by a single value. An example:

ESPAM bitmask 0 "[drive]\[path]\execfile.exe" 0 0
ESPAM-URIBL bitmask 1 "ESPAM" 8 0
ESPAM-PHISH bitmask 2 "ESPAM" 4 0
ESPAM-BULK bitmask 4 "ESPAM" 6 0

The first line with a bitmask 0 defines the master test, which must contain the complete path to the executable. The actual subtests define the bits of the values that will be analyzed when the executable ends. The value following the bitmask directive is the bit value, not the bit position. Not all bits have to be used. After the bit value is the name of the master test of which these subtests are a part. The subtests must be contiguous and must immediately follow the master test. If the executable returns a value of 5, it would mean that the email failed both the first and the third tests.
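To make the "bit value, not bit position" point concrete, here is a minimal Python sketch of how a return value could be decoded against the ESPAM example above. It is only an illustration of the arithmetic, not Declude's own code; the `SUBTESTS` table and the `failed_subtests` function are names made up for this example.

```python
# Hypothetical table mirroring the ESPAM example: bit value -> subtest name.
SUBTESTS = {1: "ESPAM-URIBL", 2: "ESPAM-PHISH", 4: "ESPAM-BULK"}


def failed_subtests(return_value, subtests=SUBTESTS):
    """Return the subtests whose bit value is set in the executable's return value."""
    return [name for bit, name in sorted(subtests.items()) if return_value & bit]


# 5 = 1 + 4, so the first and third subtests are reported as failed.
print(failed_subtests(5))  # ['ESPAM-URIBL', 'ESPAM-BULK']
```

A return value of 0 sets no bits, so no subtest is reported.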

BYPASSWHITELIST

This optional test instructs Declude to bypass any whitelisting for E-mails with at least a specific number of recipients and at least a specific weight. 
For example, you could define a test with the following line in the global.cfg file: BYPASSWHITELIST bypasswhitelist 60 5 0 0. The 60 refers to the weight the E-mail must reach, and the 5 refers to the minimum number of recipients. In this case, it would attempt to bypass the whitelisting for E-mail with 5 or more recipients and a weight of 60 or higher.
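The condition described above can be sketched in a few lines of Python. This is only an illustration of the two thresholds from the example line (weight 60, 5 recipients); the function name and its parameters are invented for this sketch and are not part of Declude.

```python
def should_bypass_whitelist(weight, recipients, min_weight=60, min_recipients=5):
    """True when both thresholds from the example line are met.

    min_weight and min_recipients mirror "BYPASSWHITELIST bypasswhitelist 60 5 0 0".
    """
    return weight >= min_weight and recipients >= min_recipients


print(should_bypass_whitelist(weight=72, recipients=8))   # True
print(should_bypass_whitelist(weight=72, recipients=2))   # False
```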

CATCHALLMAILS

This one isn't really a test. Declude will mark all E-mail as spam if you use the CATCHALLMAILS test. This might be useful if you wanted to add a footer to all E-mails in a certain domain, for example.

CMDSPACE

The CMDSPACE test looks for a technical violation of the RFCs. This test works very well because it catches about half of all spam, while no legitimate mail servers fail this test. The one drawback is that some mail clients will fail this test, so the test is most useful if you whitelist your own users (see the "WHITELIST AUTH" option), or do not have very strict anti-spam settings.

COMMENTS

The COMMENTS test will catch spam that uses HTML comments to bypass filters. It is a very effective test, since it will not catch standard comments that occasionally appear in legitimate bulk mail; it only catches comments that are designed to bypass filters.

CONTSPACES

This optional test tells Declude to check whether an E-mail has a specific number of continuous spaces in the subject. For example, you could define a test with the following line in the global.cfg file: "CONTSPACES contspaces 5 x 0 0", which would be triggered for E-mail with more than 5 continuous spaces in the subject.
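As a rough illustration of what "continuous spaces" means here, the Python sketch below measures the longest run of spaces in a subject line and compares it to the limit. It follows the "more than 5" wording of the example; how Declude itself treats tabs or other whitespace is not covered by this list, so only plain spaces are counted. The function names are hypothetical.

```python
import re


def longest_space_run(subject):
    """Length of the longest run of consecutive space characters in the subject."""
    runs = re.findall(r" +", subject)
    return max((len(run) for run in runs), default=0)


def contspaces_triggered(subject, limit=5):
    """Assumption: triggers when the longest run exceeds the limit ("more than 5")."""
    return longest_space_run(subject) > limit


print(contspaces_triggered("Great offer" + " " * 12 + "x8kq2"))  # True
print(contspaces_triggered("Meeting notes for Tuesday"))         # False
```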

DNSBL

The "dnsbl" test type is used to support future DNS-based spam databases that use something other than the IP address (ip4r) or return address (rhsbl) to detect spam.

DOW

This optional test will tell Declude to test to see if an E-mail arrived during a specific day of the week. For example, you could define a test with the following line in the global.cfg file: "DOW dow 1 5 0 0", which would be triggered for E-mail that came in between Monday (1) and Friday (5).

FROMNOMATCH

Available in Declude version 3.1 and Declude 4.x or later. This test type compares the sender of the message in the envelope to the sender specified in the FROM: line in the header of the message. If the two do not match, the test is triggered. This test should not be weighted too high, as many legitimate bulk mail newsletters, email lists, notifications, and messages forwarded from another email system will fail this test.
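The comparison can be sketched with Python's standard `email` package. This is a simplified illustration, assuming a case-insensitive comparison of the bare addresses; Declude's exact matching rules are not described in this list, and `from_no_match` is a name made up for the example.

```python
from email import message_from_string
from email.utils import parseaddr


def from_no_match(envelope_sender, raw_message):
    """True when the envelope sender differs from the address in the From: header."""
    msg = message_from_string(raw_message)
    _, header_addr = parseaddr(msg.get("From", ""))
    # Assumption: addresses are compared case-insensitively.
    return envelope_sender.strip().lower() != header_addr.strip().lower()


raw = "From: Weekly News <news@example.com>\nSubject: Issue 12\n\nHello\n"
print(from_no_match("bounces@mailer.example.net", raw))  # True
print(from_no_match("news@example.com", raw))            # False
```

The first call shows why legitimate newsletters often fail: the bounce address used in the envelope differs from the address readers see.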

FILTER

The "filter" test type lets you create your own filters that work with Declude's actions. See the "Filtering" section of the manual for more details.

HELOBOGUS

This test will detect bogus (non-RFC-compliant) "HELO/EHLO" data. When another mail server connects to yours, it will identify itself using an SMTP command (either "HELO" or "EHLO"). It is required to send a valid host name. However, spammers (and a few poorly designed mail servers) will occasionally not send a valid host name, which will trigger this test.

HOUR

This optional test will tell Declude to test to see if an E-mail arrived during a specific range of hours. For example, you could define a test with the following line in the global.cfg file: "HOUR hour 9 16 0 0", which would be triggered for E-mail that came in between 9:00AM and 4:00PM (16:00).
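The DOW and HOUR range checks described above can be sketched together in Python. Two assumptions are made that this list does not confirm: days are numbered as in the DOW example (Monday is 1, Friday is 5), and the HOUR range includes the start hour but stops at the end hour. The function names are hypothetical.

```python
from datetime import datetime


def dow_triggered(arrival, first_day=1, last_day=5):
    """isoweekday() numbers Monday as 1 and Friday as 5, matching the DOW example."""
    return first_day <= arrival.isoweekday() <= last_day


def hour_triggered(arrival, start_hour=9, end_hour=16):
    """Assumption: start hour is inclusive, end hour is exclusive."""
    return start_hour <= arrival.hour < end_hour


wednesday_morning = datetime(2024, 1, 3, 10, 30)
saturday_evening = datetime(2024, 1, 6, 20, 0)
print(dow_triggered(wednesday_morning), hour_triggered(wednesday_morning))  # True True
print(dow_triggered(saturday_evening), hour_triggered(saturday_evening))    # False False
```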

IPNOTINMX

This test should NOT be used to detect spam! It will be triggered when an E-mail is sent from an IP address that is not in its MX record. Although this test will catch a lot of spam (perhaps 80%), it will also catch a lot of legitimate mail (as quite a few larger mailers will send their mail through a different mail server than they use to receive mail). What this test is good for is helping reduce false positives. By default, Declude will subtract several points from the weighting system when an E-mail does not fail this test (which is very different from the way a spam test normally works).

MAILFROM

This test checks the SMTP envelope "Mail From:" address (which should be the sender of the E-mail) and makes sure that the domain name it is coming from is valid. This way, if mail is sent from "user@$$$success$$$.com", it will get caught (since "$$$success$$$.com" is not a valid domain).
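The "$$$success$$$.com" example fails on syntax alone, which the Python sketch below illustrates. It checks only hostname syntax (letters, digits, and hyphens in each label); whether Declude also performs DNS lookups is not stated here, and the null sender used by bounce messages is not handled. All names in the sketch are hypothetical.

```python
import re

# One DNS label: 1-63 letters, digits, or hyphens, not starting or ending with a hyphen.
LABEL = re.compile(r"(?!-)[A-Za-z0-9-]{1,63}(?<!-)")


def domain_syntax_ok(domain):
    """Syntax-only check; it does not confirm that the domain exists in DNS."""
    if not domain or len(domain) > 253:
        return False
    labels = domain.rstrip(".").split(".")
    return len(labels) >= 2 and all(LABEL.fullmatch(label) for label in labels)


def mailfrom_triggered(address):
    """True when the domain part of the envelope sender is not syntactically valid."""
    domain = address.rpartition("@")[2]
    return not domain_syntax_ok(domain)


print(mailfrom_triggered("user@$$$success$$$.com"))  # True
print(mailfrom_triggered("user@example.com"))        # False
```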

NOLEGITCONTENT

This test should NOT be used to detect spam! It will be triggered when Declude does not detect any legitimate content in an E-mail. Note that a lot of legitimate E-mail will fail this test, but almost all spam will fail it. Like the IPNOTINMX test, this test is good for helping reduce false positives. By default, Declude will subtract several points from the weighting system when an E-mail does not fail this test (which is very different from the way a spam test normally works).

NONENGLISH

The NONENGLISH test will catch a lot of E-mail that is in languages other than English. If your organization does not receive any E-mail in languages other than English, this test may be useful, as it will catch spam in Japanese, Chinese, Taiwanese, Korean, and several other languages common in spam.

PERCENT

This test will catch all mail with "To:" addresses that contain a percent sign. The percent sign indicates an outdated routing method that can be used by spammers to bypass closed relays.

REVDNS

This test will check to see if the remote mail server (or client) has a reverse DNS entry. If not, it will fail this test. All Internet hosts are required to have a reverse DNS entry, although most do not. Most mail servers do have the required reverse DNS entry, but there are still large numbers that do not, so it is likely that this test will catch a lot of legitimate mail. A warning in the headers might be appropriate for this test.

ROUTING

This test will analyze the route that an E-mail takes, and look for highly inefficient routing that is very common in spam. For example, an E-mail might get caught if it is sent from a dialup in the U.S. to another account in the U.S., but is routed through a server in China, but not if it goes from a mail server in China directly to a U.S. mail server. This may occasionally produce false positives, especially if a mailing list is hosted outside of the United States. This test will probably not work well if your mail server is located outside of the United States.

SIZE

Available in Declude version 3.1 and Declude 4.x or later. This test type checks the message size. The size is specified in KB (for example, 500). If the message is the specified size or larger, the test is triggered. The test can be used multiple times in a scaled setup, as in the example below.

SIZE-500KB size 500 x -1 0 
SIZE-750KB size 750 x -2 0 
SIZE-1MB size 1000 x -3 0 

Another way ISPs can use this test is to prevent large files from being sent to dial-up customers' mail clients. To do this, use a per-user configuration with the test ACTION set to MAILBOX Large in your user.junkmail file, so that large emails are redirected to the mail server's web account folder rather than being downloaded.

SIZE-1MB size 1000 x 0 0
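To show how the scaled SIZE lines above could combine, here is a small Python sketch that parses the three example lines and totals the weights of every test a message triggers. It assumes the fifth field is the weight applied when the test is triggered, which is how the -1, -2, -3 values read in the example but is not spelled out in this list. The parser and function names are hypothetical.

```python
CONFIG = """\
SIZE-500KB size 500 x -1 0
SIZE-750KB size 750 x -2 0
SIZE-1MB size 1000 x -3 0
"""


def parse_size_tests(text):
    """Return (name, threshold_kb, weight) for each six-field "size" line."""
    tests = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == 6 and fields[1].lower() == "size":
            name, _, threshold_kb, _, weight, _ = fields
            # Assumption: the fifth field is the weight applied on trigger.
            tests.append((name, int(threshold_kb), int(weight)))
    return tests


def size_weight(message_kb, tests):
    """Names of triggered tests and their combined weight for a message size in KB."""
    triggered = [(name, weight) for name, kb, weight in tests if message_kb >= kb]
    return [name for name, _ in triggered], sum(weight for _, weight in triggered)


tests = parse_size_tests(CONFIG)
print(size_weight(800, tests))  # (['SIZE-500KB', 'SIZE-750KB'], -3)
```

Because each larger threshold also satisfies the smaller ones, the weights accumulate as the message grows.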

SPAMHEADERS

This test checks the E-mail for headers that are common in spam, but not common in legitimate E-mail. This test is very similar to the BADHEADERS test, except the problems this test looks for are not RFC violations, so there's a good chance the test will catch a small amount of legitimate E-mail (typically mail sent from mail clients written by webmasters rather than programmers).

SPAMDOMAINS

This test will catch E-mail that is not coming from a mail server that it should be coming from. This test will only work if you set up a file listing domains that you wish to be included in this test. Specifically, it will check the return address of the E-mail, and then check to see if the reverse DNS entry of the IP that the E-mail was sent from contains the domain name. If not, the E-mail fails the test. For example, if "hotmail.com" is listed in the \MAILSERVER\Declude\spamdomains.txt file, then an E-mail coming from "law2.hotmail.com" would not fail the test, but an E-mail from "mail.example.ru" would fail the test.
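The hotmail.com example can be sketched as follows in Python. The reverse DNS name is passed in as a string rather than looked up, and the "contains the domain name" rule is implemented as a plain substring check, as the description words it. The function name and the in-memory set standing in for spamdomains.txt are assumptions for this illustration.

```python
def spamdomains_triggered(return_address, reverse_dns, listed_domains):
    """True when a listed sender domain does not appear in the reverse DNS name.

    listed_domains: lowercase domain names, standing in for spamdomains.txt.
    """
    sender_domain = return_address.rpartition("@")[2].lower()
    if sender_domain not in listed_domains:
        return False  # the test only applies to domains listed in the file
    return sender_domain not in reverse_dns.lower()


listed = {"hotmail.com"}
print(spamdomains_triggered("someone@hotmail.com", "law2.hotmail.com", listed))  # False
print(spamdomains_triggered("someone@hotmail.com", "mail.example.ru", listed))   # True
```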

SPFFAIL

This test will be triggered if an E-mail fails SPF. Note that it will not be triggered for E-mail that has other problems (no SPF record, unknown results from the SPF record, etc.). So any E-mail failing the SPFFAIL test is E-mail that is not authorized by the administrator of the domain the E-mail claims to be sent from.

SPFPASS

This test will be triggered if an E-mail passes SPF. Note that normally no weight should be added to the E-mail for triggering this test, as it indicates that the E-mail came from an IP that the sending domain has authorized to send its mail.

SUBJECTCHARS

The "subjectchars" test type will catch E-mail that has a certain number of characters in the subject. This test is normally used only in very advanced setups.

SUBJECTSPACES

The "subjectspaces" test type will catch E-mail that has a certain number of spaces in the subject. This test is normally used only in very advanced setups.

WEIGHT10

This test will catch E-mail that has a total "weight" of at least 10. This will occur if the E-mail fails several different spam tests.

WEIGHT20

This test will catch E-mail that has a total "weight" of at least 20. This will occur if the E-mail fails a number of different spam tests. Although less spam will fail the WEIGHT20 test than the WEIGHT10 test, the WEIGHT20 test will be less likely to have false positives.
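The relationship between individual test weights and the WEIGHT10 and WEIGHT20 tests can be sketched in Python. The per-test weights below are invented purely for illustration and are not Declude defaults; the function names are also hypothetical.

```python
# Made-up weights for illustration only; real values come from your configuration.
EXAMPLE_WEIGHTS = {"BADHEADERS": 8, "BASE64": 5, "REVDNS": 4, "SPAMHEADERS": 5}


def total_weight(failed_tests, weights=EXAMPLE_WEIGHTS):
    """Sum the weights of every test the E-mail failed."""
    return sum(weights[name] for name in failed_tests)


def weight_tests(total, thresholds=(10, 20)):
    """Names of the WEIGHTnn tests whose threshold the total reaches."""
    return [f"WEIGHT{t}" for t in thresholds if total >= t]


print(weight_tests(total_weight(["BADHEADERS", "BASE64"])))  # ['WEIGHT10']
print(weight_tests(total_weight(["BADHEADERS", "BASE64", "REVDNS", "SPAMHEADERS"])))
# ['WEIGHT10', 'WEIGHT20']
```

Any E-mail that fails WEIGHT20 necessarily fails WEIGHT10 as well, which is why the higher threshold catches less spam but produces fewer false positives.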

 

 

 
