Exim: Adding the Autonomous System Number as a header in received emails

Updates ^

2022-11-05 ^

  • Added a bit about timeouts, as concern was expressed that I am “bonkers”.

The Problem ^

For statistical purposes I wanted to add the Autonomous System Number (ASN) for the IP address of the connecting host as a header in the received email, like this:

X-ASN: AS63949 2a01:7e01::/32

The Answer ^

You can obtain this information through a DNS query to Team Cymru:

$ sipcalc -r 2a01:7e01::f03c:92ff:fe32:a408
-[ipv6 : 2a01:7e01::f03c:92ff:fe32:a408] - 0
 
[IPV6 DNS]
Reverse DNS (ip6.arpa)  -
8.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.ip6.arpa.
 
-
$ dig +short -t txt 8.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.origin6.asn.cymru.com
"63949 | 2a01:7e01::/32 | US | ripencc | 2011-02-01"

Or for legacy Internet addresses:

$ dig +noall +question -x 199.59.150.116
;116.150.59.199.in-addr.arpa.   IN      PTR
$ dig +short -t txt 116.150.59.199.origin.asn.cymru.com
"13414 | 199.59.148.0/22 | US | arin | 2010-11-23"

So for IPv6 addresses the process is:

  1. Expand the address out fully (2a01:7e01::f03c:92ff:fe32:a4082a01:7e01:0000:0000:f03c:92ff:fe32:a408)
  2. Remove the colons (2a01:7e01:0000:0000:f03c:92ff:fe32:a4082a017e0100000000f03c92fffe32a408)
  3. Reverse it (2a017e0100000000f03c92fffe32a408804a23efff29c30f0000000010e710a2)
  4. Add a dot after every hexadecimal number (804a23efff29c30f0000000010e710a28.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.)
  5. Add origin6.asn.cymru.com on the end (8.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.origin6.asn.cymru.com)
  6. Query that TXT record and parse out the first two values separated by ‘|’ in the response.

For legacy IP addresses the process is much simpler; reverse the octets, add origin.asn.cymru.com on the end and query that.

An Exim Answer ^

In Exim configuration you can do it like this:

(This is meant to go inside an ACL like your check_rcpt or check_data. Maybe near the end of check_data at the point where you’ve already decided to accept the email. No point in doing this for an email you will reject.)

# Add X-ASN: header for IPv6 senders.
  warn message = X-ASN: AS${sg{${extract{1}{|}{$acl_m9}}}{\N\s+\N}{}} ${sg{${extract{2}{{|}{$acl_m9}}}{\N\s+\N}{}}
     condition = ${if isip6{$sender_host_address}}
    set acl_m9 = ${lookup dnsdb{txt=${reverse_ip:$sender_host_address}.origin6.asn.cymru.com}}
 
# Add X-ASN: header for legacy IP senders.
  warn message = X-ASN: AS${sg{${extract{1}{|}{$acl_m9}}}{\N\s+\N}{}} ${sg{${extract{2}{{|}{$acl_m9}}}{\N\s+\N}{}}
     condition = ${if isip4{$sender_host_address}}
    set acl_m9 = ${lookup dnsdb{txt=${reverse_ip:$sender_host_address}.origin.asn.cymru.com}}

I dislike that I’ve had to use two tests that are almost exactly the same except they query slightly different DNS names (origin6.asn.cymru.com vs origin.asn.cymru.com). I’m sure it could be done in one, but I’m not good enough with the Exim string evaluations. They send me cross-eyed. I couldn’t find a better way so I decided to use the time-honoured tactic of posting what I have in order to provoke people into correcting me. Please let me know if you can improve it!

The amount of nested {} will probably drive you mad, but basically:

  • ${reverse_ip:$sender_host_address} handles expanding and reversing an IP address into the form you would use for a reverse DNS query.
  • That gets queried in DNS with the correct suffix and the full response stored in $acl_m9.
  • warn message = X-ASN: adds a header to the email, the content of which is built from two fields extracted out of $acl_m9 with all whitespace removed (${sg{source}{regex}{replacement}}).

What about timeouts? ^

One piece of feedback I got was that I am “bonkers” to make my email delivery rely on a real time network lookup. I can kind of see the argument, but also not: this is a DNS query exactly the same as a typical DNSBL query (Team Cymru IP-to-ASN service is used exactly like a typical DNSBL).

Most people’s mail servers do multiple DNSBL queries already and nobody really is up in arms saying it’s bonkers to do so. My Exim already does a couple of DNSBL queries and then if it is going to deliver the email it will call out to SpamAssassin which does many DNSBL queries itself. If these hit a timeout then it would slow down my mail delivery.

In the past where a DNSBL has unceremoniously shut down and made its nameservers unresponsive I have seen problems, as it caused the delivery processes to pile up while they waited on their timeouts and then Exim would complain that there’s too many processes. That would be resolved by removing the errant DNSBL(s) from the configuration.

Query load is not a concern as DNS is highly scalable and my system is not going to add noticeable load to Team Cymru’s already public service. The SpamAssassin ASN plugin is already out there, hard coded to use this same service and must have many many users already.

As far as I can tell, in Exim dnsdb queries use the same timeouts and retries as dnslist queries do, that being controlled by the dns_retrans and dns_rety settings. These settings both default to 0, which means “operating system / resolver library default”. If you were worried you could explicitly set these to their minimum value:

If still worried then you would first have to either turn off all DNSBLs or make sure you had local copies of them (e.g. by arranging AXFR to your own local servers). Then to do the IP-to-ASN locally you’d arrange to have a local BGP feed that you could query. I think you’d need to have an absolutely huge mail server before these issues became real concerns.

…
set acl_m9 = ${lookup dnsdb{retrans_1s,retry_1,txt=${reverse_ip:$sender_host_address}.origin6.asn.cymru.com}}
…

As for dnslist, the consequence of a time out is that you get no data, so it would just result in an empty header.

But Why? ^

I’ve actually been doing this for a while with SpamAssassin’s ASN plugin but I’ve changed the way in which I query SpamAssassin and now I don’t directly get the rewritten email that SpamAssassin makes (with its X-Spam-ASN: header in).

I use it for feeding into Bayes to see if there’s a particular prevalence of ASN amongst the email that is classified as spam, and I sometimes add a few points on manually for ASNs that are particularly bad. That is a lot less work than trying to track down all their IP addresses and keep that up to date.