Each RBL service has a DNS name, such as "zen.spamhaus.org".
Many RBL services combine data from different sources. These data sources are encoded in the DNS "A" records that are returned when the RBL service is queried. This record is shown on the graph as the "x@" before the RBL service name. As an example, "4@zen.spamhaus.org" means that zen.spamhaus.org returned "127.0.0.4" in its reply, which means the IP address was listed in the CBL data set. Each service has its own encoding, so if you want to know what those numbers really mean, you'll have to consult the RBL service's web site.
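The mapping from a DNS reply to a graph label can be sketched as follows. This is a minimal illustration, not the code that produces the graph: the function names are hypothetical, the reversed-octet query name is the conventional DNSBL lookup format, and using only the last octet of the "A" record as the "x" is an assumption based on the 127.0.0.4 example above.

```python
import ipaddress

def rbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in an RBL zone.

    DNSBL lookups conventionally reverse the octets of the address and
    append the zone name.
    """
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone

def graph_labels(zone: str, a_records: list[str]) -> list[str]:
    """Turn the "A" records of an RBL reply into "x@zone" labels.

    Assumption: the "x" is the last octet of each returned address.
    An empty reply (the IP is not listed) yields no labels.
    """
    labels = []
    for record in a_records:
        last_octet = int(ipaddress.IPv4Address(record)) & 0xFF
        labels.append(f"{last_octet}@{zone}")
    return labels

# 192.0.2.99 is a documentation address used purely as an example.
print(rbl_query_name("192.0.2.99", "zen.spamhaus.org"))
print(graph_labels("zen.spamhaus.org", ["127.0.0.4"]))
```

Performing the actual lookup is left out on purpose; any resolver that returns the list of "A" records for the query name would feed `graph_labels` directly.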
dnswl.org responses have two columns in the form "127.0.CATEGORY.TRUST". Individual responses are mapped to CATEGORY@TRUST.list.dnswl.org (e.g. "13@2.list.dnswl.org" represents a medium-trust sender in the manufacturing/industrial category). The TRUST column values are aggregated over all categories the same way as a one-column RBL (e.g. "2@list.dnswl.org" represents a medium-trust sender in all categories). The CATEGORY column values are aggregated over all values of trust using the pseudo-domain "Y.list.dnswl.org" (e.g. "13@Y.list.dnswl.org" represents a manufacturing/industrial category sender at any trust level).
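As a rough sketch of the three mappings just described, the function below splits a two-column dnswl.org reply into its individual, per-trust, and per-category labels. The function name and the order of the returned labels are illustrative choices, not part of any dnswl.org interface.

```python
def dnswl_labels(a_record: str) -> list[str]:
    """Map a "127.0.CATEGORY.TRUST" reply to the three graph labels.

    Returns, in order: the individual CATEGORY@TRUST label, the label
    aggregated over all categories, and the label aggregated over all
    trust levels (using the "Y" pseudo-domain).
    """
    parts = a_record.split(".")
    if len(parts) != 4 or parts[:2] != ["127", "0"]:
        raise ValueError(f"not a dnswl-style reply: {a_record!r}")
    category, trust = int(parts[2]), int(parts[3])
    zone = "list.dnswl.org"
    return [
        f"{category}@{trust}.{zone}",  # individual response
        f"{trust}@{zone}",             # trust, all categories
        f"{category}@Y.{zone}",        # category, any trust level
    ]

print(dnswl_labels("127.0.13.2"))
```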
For convenience, here is a non-authoritative list of RBL services and their response codes. Check with the RBL service providers directly for current and correct information.
|none||Sender IP appears in no RBL data set|
|firstname.lastname@example.org||Another Spam Prevention Early Warning System|
|email@example.com||SpamCop SCBL - a mix of data from end-user reports and spam traps|
|firstname.lastname@example.org||Anonymous Postmasters Early Warning System|
|email@example.com||Passive Spam Block List - Rik van Riel at surriel.com (Spamikaze)|
|firstname.lastname@example.org||Hosts that send mail to addresses on do-not-mail or unsubscribe lists|
|email@example.com||SBL - known spammers|
|firstname.lastname@example.org||SBL CSS - snowshoe spammers|
|4@zen.spamhaus.org||XBL - Composite Block List (aka cbl.abuseat.org)|
|firstname.lastname@example.org||XBL - Customized NJABL data|
|email@example.com||PBL - ISP maintained non-email-sending IPs|
|firstname.lastname@example.org||PBL - Spamhaus maintained|
|email@example.com||Hosts that abused the SMTP sender address in the last 4 weeks|
|firstname.lastname@example.org||Hosts that paid to be whitelisted|
|email@example.com||Conservative: spammer IPs|
|firstname.lastname@example.org||Strict: spammer netblocks|
|email@example.com||Draconian: spammer ASNs|
|firstname.lastname@example.org||Dial-up/dynamic IP ranges (deprecated)|
|email@example.com||Multi-stage open relays (deprecated)|
|firstname.lastname@example.org||Bad Host, No Cookie (probably spam proxies)|
|email@example.com||formmail.cgi and similar insecure mail scripts|
|firstname.lastname@example.org||Open HTTP proxy servers|
|email@example.com||Open SOCKS proxy servers|
|firstname.lastname@example.org||Open misc proxy servers|
|email@example.com||Open SMTP relay servers|
|firstname.lastname@example.org||Exploitable (mostly web) servers which can send mail|
|email@example.com||Denies SORBS testing|
|firstname.lastname@example.org||Dynamic IP addresses|
|SORBS spamtrap RBLs|
|email@example.com||Spam - see below|
|firstname.lastname@example.org||Sent spam in the last 48 hours|
|email@example.com||Sent spam in the last 28 days|
|firstname.lastname@example.org||Sent spam in the last year|
|email@example.com||All the above plus their "supporters"|
|firstname.lastname@example.org||All of the above plus those who have not made an attempt to delist|
|3@Y.list.dnswl.org||Email Service Providers|
|4@Y.list.dnswl.org||Organisations (both for-profit [ie companies] and non-profit)|
|9@Y.list.dnswl.org||Media and Tech companies|
|10@Y.list.dnswl.org||some special cases|
|15@Y.list.dnswl.org||Email Marketing Providers|
|0@list.dnswl.org||None - only avoid outright blocking (eg Hotmail, Yahoo mailservers)|
|1@list.dnswl.org||Low - reduce chance of false positives|
|2@list.dnswl.org||Medium - make sure to avoid false positives but allow override for clear cases|
|3@list.dnswl.org||High - avoid override|
When mail travels through the Internet, it can be sent directly from the sender's machine to the mail server, or it can be relayed through intermediary servers. The server that connects directly to the ultimate destination machine is called the "origin" server in this graph. When relay servers are used, the "hop" numbers count upwards from the origin server. Thus, "x@zen.spamhaus.org 2 hops" is an IP address listed at zen.spamhaus.org, which sent mail through two intermediary servers before it arrived here.
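The hop numbering can be illustrated with a short sketch. It assumes the relay chain has already been extracted from the message (for example from its Received headers) and ordered from the server that connected to us back toward the original sender; the function name and the singular "1 hop" wording are hypothetical.

```python
def hop_labels(relay_ips: list[str]) -> list[tuple[str, str]]:
    """Label each IP in a relay chain as "origin" or "N hops".

    relay_ips[0] is the server that connected directly to the
    destination; each later entry is one relay further away.
    """
    labelled = []
    for hops, ip in enumerate(relay_ips):
        if hops == 0:
            label = "origin"
        elif hops == 1:
            label = "1 hop"
        else:
            label = f"{hops} hops"
        labelled.append((ip, label))
    return labelled

# Documentation addresses stand in for a real relay chain.
for ip, label in hop_labels(["203.0.113.5", "198.51.100.7", "192.0.2.99"]):
    print(ip, label)
```

In this example 192.0.2.99 is two hops away: its mail passed through two intermediary servers before arriving.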
The "score" (blue) data is a value in the range 0..1, where 0 is unlikely to be spam and 1 is very likely to be spam. A typical good RBL has a score near 1, because it is a blacklist of spammers; however, a few RBLs are whitelists of non-spammers, and such RBLs are expected to have scores near 0.
RBLs that list IP addresses for reasons that have nothing to do with whether the IP address is currently sending spam via SMTP (e.g. "escalated listings" and "spam support" lists, as well as RBLs that contain old data) will list senders of a mixture of spam and non-spam. An RBL with a score between 0.01 and 0.99 is effectively useless for distinguishing spam from non-spam, and will usually be ignored by the spam filter.
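To make the 0.01 and 0.99 cut-offs concrete, here is a minimal sketch. It assumes the score is simply the fraction of messages from listed senders that were classified as spam; the formula actually used by the spam filter is not stated here and may well differ. The function names are hypothetical.

```python
from typing import Optional

def rbl_score(spam: int, ham: int) -> Optional[float]:
    """Fraction of messages from listed senders that were spam.

    Assumption: a plain spam / (spam + ham) ratio. Returns None when
    no messages have been counted yet.
    """
    total = spam + ham
    if total == 0:
        return None
    return spam / total

def is_useful(score: Optional[float], low: float = 0.01, high: float = 0.99) -> bool:
    """True when the score is extreme enough to separate spam from ham.

    Scores strictly between the two thresholds are treated as noise.
    """
    return score is not None and (score <= low or score >= high)

print(is_useful(rbl_score(999, 1)))    # blacklist-like counts
print(is_useful(rbl_score(1, 999)))    # whitelist-like counts
print(is_useful(rbl_score(500, 500)))  # mixed senders
```

Note that a good whitelist passes the same test as a good blacklist: what matters is that the score sits near either end of the range.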
RBLs that list dynamic, "dial-up" or non-email-sending IP addresses (e.g. the Policy Block List, "firstname.lastname@example.org") will normally have high scores when the IP address is the "origin" and low scores when the IP address is one or more "hops" away. This is an expected result of RBL listing policy and sites that implement current best practices, which encourage users on such IP addresses to send their outgoing mail through their ISP's mail server.
For spam blocking in the SMTP session, the most significant pieces of data are the accuracy of the RBL when it is used to look up the "origin" IP and the number of non-spam messages received from IP addresses listed on the RBL. Ideally there would be zero non-spam messages detected from IP addresses listed in RBLs used for this purpose. Thanks to the logarithmic scale on this graph, even a single non-spam message is very easy to see.
All counters shown in this graph are reset automatically each year to avoid polluting the spam filter with stale data. Each individual counter is reset on a different day of the year to avoid sudden changes that would affect the spam filter's accuracy. This causes RBL services to disappear from the graph for a few days each year, and return when a sufficient number of messages from senders listed by the affected RBL have been processed.
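One way to give each counter its own reset day is to derive the day from a stable hash of the counter's name. This is only a sketch of the idea: how the reset days are really chosen is not described here, and the function names and the use of SHA-256 are assumptions made for illustration.

```python
import hashlib
from datetime import date

def reset_day_of_year(counter_name: str) -> int:
    """Pick a stable day of the year (1..365) for a counter.

    A cryptographic hash is used instead of Python's built-in hash(),
    which is randomized per process and would change between runs.
    """
    digest = hashlib.sha256(counter_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:4], "big") % 365 + 1

def should_reset(counter_name: str, today: date) -> bool:
    """True on the one day each year when this counter is cleared.

    Day 366 of a leap year is never chosen, so no counter resets then.
    """
    return today.timetuple().tm_yday == reset_day_of_year(counter_name)

print(reset_day_of_year("4@zen.spamhaus.org"))
```

Because the day depends only on the name, resets are spread across the year without any stored schedule.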
This data is the result of automatic classification by the spam filter combined with manual retraining when the filter fails. The automatic classification may use data obtained from the RBLs to make classification decisions. Some feedback can occur from this arrangement: the filter will classify messages from IP addresses on an RBL as spam when the RBL listing itself is the most statistically significant data in the message. If one message is incorrectly classified, the spam filter will train itself to incorrectly classify similar messages. To counteract this effect, this spam filter is closely supervised. Data from multiple sources are combined during classification, and cases where reliable data sources disagree are examined. Misclassified messages that reach (or fail to reach) their intended recipients are fed back to the filter for retraining. A small number of erroneously classified messages may appear on the graph (especially when a new spammer starts up somewhere on the Internet), but they are typically corrected within 24 hours.
Spam and ham hit counts are tracked by IP address, and those addresses which send thousands of messages that are all spam or all ham are placed on internal black and white lists to avoid generating unnecessary RBL server load. A sample of mail received from addresses on these lists is still checked against RBLs, but the remainder cannot contribute to statistics reported here. This pre-filtering makes the hit rates reported here much lower than they would be if every message were checked, and disproportionately emphasizes data from unfamiliar or infrequent mail senders.
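The pre-filtering described above might look roughly like this. The threshold of 1000 messages and the 1% sample rate are invented for illustration (the text says only "thousands" and "a sample"), and the function names are hypothetical.

```python
import random

THRESHOLD = 1000    # assumed cut-off; the text says only "thousands"
SAMPLE_RATE = 0.01  # assumed fraction of listed mail still checked

def internal_list(spam: int, ham: int, threshold: int = THRESHOLD):
    """Return "black", "white", or None for an IP's hit counts.

    Only senders whose mail is all spam or all ham qualify.
    """
    if spam >= threshold and ham == 0:
        return "black"
    if ham >= threshold and spam == 0:
        return "white"
    return None

def should_query_rbls(spam: int, ham: int, rng=random.random,
                      sample_rate: float = SAMPLE_RATE) -> bool:
    """Decide whether a message from this IP is checked against RBLs.

    Unlisted senders are always checked; senders on an internal list
    are checked only for a random sample of their mail.
    """
    if internal_list(spam, ham) is None:
        return True
    return rng() < sample_rate

print(internal_list(5000, 0), should_query_rbls(3, 2))
```

Passing the random source as a parameter keeps the sampling decision easy to test.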
The usual disclaimers apply. This is data collected from email servers operated by a single organization with a specific, possibly atypical email workload. This data is provided for information only and there is no reason to believe any of it is accurate, correct, timely, or complete. You are required to not use this data in any way that could cause you (or anyone else) to lose your (their) mail or otherwise adversely affect you (them). Your results may vary--I know mine do.