PowerDNS Truncated SOA Response Problem

Posted on November 18, 2022 by Andy

I recently upgraded bind9 on my primary nameserver and soon after I noticed that one particular zone would no longer transfer to my secondary nameservers, which run PowerDNS. All the PowerDNS servers were saying:

Nov 18 00:25:26 daiquiri pdns_server[32452]: While checking domain freshness: Query to '2001:ba8:1f1:f085::53' for SOA of 'example.com' did not return a SOA

The confusing thing was that manually using dig to query for this did work fine:

daiquiri$ dig +short -t soa example.com @2001:ba8:1f1:f085::53
ns0.example.com. bind.example.com. 1668670704 28800 14400 3600000 86400

After scratching my head for several hours over this yesterday, I eventually broke out tcpdump and was surprised to see that the response to PowerDNS’s SOA query was indeed empty. And it was also truncated!

Back to dig, I could see that this zone was DNSSEC-signed and the SOA query with DNSSEC info was 2293 bytes in size:

daiquiri$ dig +dnssec -t soa example.com @2001:ba8:1f1:f085::53 | grep MSG
;; MSG SIZE  rcvd: 2293

That’s bigger than a DNS response can be in UDP, so it truncates and the client is supposed to retry over TCP. dig has no problem doing that, but PowerDNS can’t (yet).

Specifically what has changed in bind9 is the EDNS buffer size, down from its previous default of 4096 bytes to 1232 bytes.

I can stop PowerDNS from doing the SOA check at all by upgrading all PowerDNS servers to v4.7.x and using the secondary-check-signature-freshness=no option.

I could put bind9’s EDNS buffer size back up to 4096, but it doesn’t seem advisable to go over about 1400 bytes and so that won’t help.

For now I have enabled the minimal-responses option in bind9, which drops extra records from the Authority and Additional sections of responses unless they are absolutely required. This reduces the response size of that SOA query to 685 bytes, so it no longer truncates and PowerDNS is happy.

I’m not sure if an SOA response can ever go above 1232 bytes now. Maybe as DNSSEC signatures get bigger. So this might not be a permanenet solution and hopefully PowerDNS will gain the ability to retry those SOA queries over TCP.

Exim: Adding the Autonomous System Number as a header in received emails

Posted on November 4, 2022November 5, 2022 by Andy

Contents

Updates

2022-11-05

The Problem
The Answer
An Exim Answer

What about timeouts?

But Why?

Updates ^

2022-11-05 ^

Added a bit about timeouts, as concern was expressed that I am “bonkers”.

The Problem ^

For statistical purposes I wanted to add the Autonomous System Number (ASN) for the IP address of the connecting host as a header in the received email, like this:

X-ASN: AS63949 2a01:7e01::/32

The Answer ^

You can obtain this information through a DNS query to Team Cymru:

$ sipcalc -r 2a01:7e01::f03c:92ff:fe32:a408
-[ipv6 : 2a01:7e01::f03c:92ff:fe32:a408] - 0
 
[IPV6 DNS]
Reverse DNS (ip6.arpa)  -
8.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.ip6.arpa.
 
-
$ dig +short -t txt 8.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.origin6.asn.cymru.com
"63949 | 2a01:7e01::/32 | US | ripencc | 2011-02-01"

Or for legacy Internet addresses:

$ dig +noall +question -x 199.59.150.116
;116.150.59.199.in-addr.arpa.   IN      PTR
$ dig +short -t txt 116.150.59.199.origin.asn.cymru.com
"13414 | 199.59.148.0/22 | US | arin | 2010-11-23"

So for IPv6 addresses the process is:

Expand the address out fully (2a01:7e01::f03c:92ff:fe32:a408 → 2a01:7e01:0000:0000:f03c:92ff:fe32:a408)
Remove the colons (2a01:7e01:0000:0000:f03c:92ff:fe32:a408 → 2a017e0100000000f03c92fffe32a408)
Reverse it (2a017e0100000000f03c92fffe32a408 → 804a23efff29c30f0000000010e710a2)
Add a dot after every hexadecimal number (804a23efff29c30f0000000010e710a2 → 8.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.)
Add origin6.asn.cymru.com on the end (8.0.4.a.2.3.e.f.f.f.2.9.c.3.0.f.0.0.0.0.0.0.0.0.1.0.e.7.1.0.a.2.origin6.asn.cymru.com)
Query that TXT record and parse out the first two values separated by ‘|’ in the response.

For legacy IP addresses the process is much simpler; reverse the octets, add origin.asn.cymru.com on the end and query that.

An Exim Answer ^

In Exim configuration you can do it like this:

(This is meant to go inside an ACL like your check_rcpt or check_data. Maybe near the end of check_data at the point where you’ve already decided to accept the email. No point in doing this for an email you will reject.)

# Add X-ASN: header for IPv6 senders.
  warn message = X-ASN: AS${sg{${extract{1}{|}{$acl_m9}}}{\N\s+\N}{}} ${sg{${extract{2}{{|}{$acl_m9}}}{\N\s+\N}{}}
     condition = ${if isip6{$sender_host_address}}
    set acl_m9 = ${lookup dnsdb{txt=${reverse_ip:$sender_host_address}.origin6.asn.cymru.com}}
 
# Add X-ASN: header for legacy IP senders.
  warn message = X-ASN: AS${sg{${extract{1}{|}{$acl_m9}}}{\N\s+\N}{}} ${sg{${extract{2}{{|}{$acl_m9}}}{\N\s+\N}{}}
     condition = ${if isip4{$sender_host_address}}
    set acl_m9 = ${lookup dnsdb{txt=${reverse_ip:$sender_host_address}.origin.asn.cymru.com}}

I dislike that I’ve had to use two tests that are almost exactly the same except they query slightly different DNS names (origin6.asn.cymru.com vs origin.asn.cymru.com). I’m sure it could be done in one, but I’m not good enough with the Exim string evaluations. They send me cross-eyed. I couldn’t find a better way so I decided to use the time-honoured tactic of posting what I have in order to provoke people into correcting me. Please let me know if you can improve it!

The amount of nested {} will probably drive you mad, but basically:

${reverse_ip:$sender_host_address} handles expanding and reversing an IP address into the form you would use for a reverse DNS query.
That gets queried in DNS with the correct suffix and the full response stored in $acl_m9.
warn message = X-ASN: adds a header to the email, the content of which is built from two fields extracted out of $acl_m9 with all whitespace removed (${sg{source}{regex}{replacement}}).

What about timeouts? ^

One piece of feedback I got was that I am “bonkers” to make my email delivery rely on a real time network lookup. I can kind of see the argument, but also not: this is a DNS query exactly the same as a typical DNSBL query (Team Cymru IP-to-ASN service is used exactly like a typical DNSBL).

Most people’s mail servers do multiple DNSBL queries already and nobody really is up in arms saying it’s bonkers to do so. My Exim already does a couple of DNSBL queries and then if it is going to deliver the email it will call out to SpamAssassin which does many DNSBL queries itself. If these hit a timeout then it would slow down my mail delivery.

In the past where a DNSBL has unceremoniously shut down and made its nameservers unresponsive I have seen problems, as it caused the delivery processes to pile up while they waited on their timeouts and then Exim would complain that there’s too many processes. That would be resolved by removing the errant DNSBL(s) from the configuration.

Query load is not a concern as DNS is highly scalable and my system is not going to add noticeable load to Team Cymru’s already public service. The SpamAssassin ASN plugin is already out there, hard coded to use this same service and must have many many users already.

As far as I can tell, in Exim dnsdb queries use the same timeouts and retries as dnslist queries do, that being controlled by the dns_retrans and dns_rety settings. These settings both default to 0, which means “operating system / resolver library default”. If you were worried you could explicitly set these to their minimum value:

If still worried then you would first have to either turn off all DNSBLs or make sure you had local copies of them (e.g. by arranging AXFR to your own local servers). Then to do the IP-to-ASN locally you’d arrange to have a local BGP feed that you could query. I think you’d need to have an absolutely huge mail server before these issues became real concerns.

…
set acl_m9 = ${lookup dnsdb{retrans_1s,retry_1,txt=${reverse_ip:$sender_host_address}.origin6.asn.cymru.com}}
…

As for dnslist, the consequence of a time out is that you get no data, so it would just result in an empty header.

But Why? ^

I’ve actually been doing this for a while with SpamAssassin’s ASN plugin but I’ve changed the way in which I query SpamAssassin and now I don’t directly get the rewritten email that SpamAssassin makes (with its X-Spam-ASN: header in).

I use it for feeding into Bayes to see if there’s a particular prevalence of ASN amongst the email that is classified as spam, and I sometimes add a few points on manually for ASNs that are particularly bad. That is a lot less work than trying to track down all their IP addresses and keep that up to date.

Using Duplicity to back up to Amazon S3 over IPv6 (only)

Posted on March 14, 2022 by Andy

Contents

Scenario
S3 endpoint
Specifying the S3 endpoint
The full script

Scenario ^

I have a server that I use for making backups. I also send backups from that server into Amazon S3 at the “Infrequent Access” storage class. That class is cheaper to store but expensive to access. It’s intended for backups of last resort that you only access in an emergency. I use Duplicity to handle the S3 part.

(I could save a bit more by using one of the “Glacier” classes but at the moment the cost is minimal and I’m not that brave.)

I recently decided to change which server I use for the backups. I noticed that renting a server with only IPv6 connectivity was cheaper, and as all the hosts I back up have IPv6 connectivity I decided to give that a go.

This mostly worked fine. The only thing I really noticed was when I tried to install some software from GitHub. GitHub doesn’t support IPv6, so I had to piggy back that download through another host.

Then I came to set up Duplicity again and found that I needed to make some non-obvious changes to make it work with S3 over IPv6-only.

S3 endpoint ^

The main issue is that the default S3 endpoint URL is https://s3.<region>.amazonaws.com, and this host only has an A (IPv4) record! For example:

$ host s3.us-east-1.amazonaws.com
s3.us-east-1.amazonaws.com has address 52.216.89.254

If you run Duplicity with a target like s3://yourbucketname/path/to/backup then it will try that endpoint, get only an IPv4 address, and return Network unreachable.

S3 does actually support IPv6, but for that to work you need to use a dual stack endpoint! They look like this:

$ host s3.dualstack.us-east-1.amazonaws.com
s3.dualstack.us-east-1.amazonaws.com has address 54.231.129.0
s3.dualstack.us-east-1.amazonaws.com has IPv6 address 2600:1fa0:80dc:5101:34d9:451e::

So we need to specify the S3 endpoint to use.

Specifying the S3 endpoint ^

In order to do this you need to switch Duplicity to the “boto3” backend. Assuming you’ve installed the correct package (python3-boto3 on Debian), this is as simple as changing the target from s3://… to boto3+s3://….

That then allows you to use the command line arguments --s3-region-name and --s3-endpoint-url so you can tell it which host to talk to. That ends up giving you both an IPv4 and an IPv6 address and your system correctly chooses the IPv6 one.

The full script ^

The new, working script now looks something like this:

export PASSPHRASE="highlysecret"
export AWS_ACCESS_KEY_ID="notquiteassecret"
export AWS_SECRET_ACCESS_KEY="extremelysecret"
# Somewhere with plenty of free space.
export TMPDIR=/var/tmp
 
duplicity --encrypt-key ABCDEF0123456789 \
          --asynchronous-upload \
          -v 4 \
          --archive-dir=/path/to/your/duplicity/archives \
          --s3-use-ia \
          --s3-use-multiprocessing \
          --s3-use-new-style \
          --s3-region-name "us-east-1" \
          --s3-endpoint-url "https://s3.dualstack.us-east-1.amazonaws.com" \
          incr \
          --full-if-older-than 30D \
          /stuff/you/want/backed/up \
          "boto3+s3://yourbucketname/path/to/backups"

The previous version of the script looked a bit like:

# All the exports stayed the same
duplicity --encrypt-key ABCDEF0123456789 \
          --asynchronous-upload \
          -v 4 \
          --archive-dir=/path/to/your/duplicity/archives \
          --s3-use-ia \
          --s3-use-multiprocessing \
          incr \
          --full-if-older-than 30D \
          /stuff/you/want/backed/up \
          "s3+http://yourbucketname/path/to/backups"

Keeping firewall logs out of Linux’s kernel log with ulogd2

Posted on July 24, 2021 by Andy

Contents

A few words about iptables vs nft
Logging with iptables
The Problem
An answer: ulogd2

Configuring ulogd2
Configuring iptables

A few words about `iptables` vs `nft` ^

nftables is the new thing and iptables is deprecated, but I haven’t found time to convert everything to nft rules syntax yet.

I’m still using iptables rules but it’s the iptables frontend to nftables. All of this works both with legacy iptables and with nft but with different syntax.

Logging with `iptables` ^

As a contrived example let’s log inbound ICMP packets at a maximum rate of 1 per second:

-A INPUT -m limit --limit 1/s -p icmp -j LOG --log-level 7 --log-prefix "ICMP: "

The Problem ^

If you have logging rules in your firewall then they’ll log to your kernel log, which is available at /dev/kmsg. The dmesg command displays the contents of /dev/kmsg but /dev/kmsg is a fixed size circular buffer, so after a while your firewall logs will crowd out every other thing.

On a modern systemd system this stuff does get copied to the journal, so if you set that to be persistent then you can keep the kernel logs forever. Or you can additionally run a syslog daemon like rsyslogd, and have that keep things forever.

Either way though your dmesg or journalctl -k commands are only going to display the contents of the kernel’s ring buffer which will be a limited amount.

I’m not that interested in firewall logs. They’re nice to have and very occasionally valuable when debugging something, but most of the time I’d rather they weren’t in my kernel log.

An answer: `ulogd2` ^

One answer to this problem is ulogd2. ulogd2 is a userspace logging daemon into which you can feed netfilter data and have it log it in a flexible way, to multiple different formats and destinations.

I actually already use it to log certain firewall things to a MariaDB database for monitoring purposes, but you can also emit plain text, JSON, netflow and all manner of things. Since I’m already running it I decided to switch my general firewall logging to it.

Configuring `ulogd2` ^

I added the following to /etc/ulogd.conf:

# This one for logging to local file in emulated syslog format.
stack=log2:NFLOG,base1:BASE,ifi1:IFINDEX,ip2str1:IP2STR,print1:PRINTPKT,emu1:LOGEMU
 
[log2]
group=2
 
[emu1]
file="/var/log/iptables_ulogd2.log"
sync=1

I already had a stack called log1 for logging to MariaDB, so I called the new one log2 with its output being emu1.

The log2 section can then be told to expect messages from netfilter group 2. Don’t worry about this, just know that this is what you refer to in your firewall rules, and you can’t use group 0 because that’s used for something else.

The emu1 section then says which file to write this stuff to.

That’s it. Restart the daemon.

Configuring `iptables` ^

Now it’s time to make iptables log to netfilter group 2 instead of its normal LOG target. As a reminder, here’s what the rule was like before:

-A INPUT -m limit --limit 1/s -p icmp -j LOG --log-level 7 --log-prefix "ICMP: "

And here’s what you’d change it to:

-A INPUT -m limit --limit 1/s -p icmp -j NFLOG --nflog-group 2 --nflog-prefix "ICMP:"

The --nflog-group 2 needs to match what you put in /etc/ulogd.conf.

You’re now logging with ulogd2 and none of this will be going to the kernel log buffer. Don’t forget to rotate the new log file! Or maybe you’d like to play with logging this as JSON or into a SQLite DB?

Fun With SpamAssassin Meta Rules

Posted on December 31, 2020December 31, 2020 by Andy

I’ve got a ticketing system. Let’s say you open a ticket by emailing support@example.com. You then get an automated response confirming that you’ve opened a ticket, and on my side people get bothered by a notification about this support ticket that needs attention.

A problem here is that absolutely anyone or anything emailing that will open a ticket. And it’s pretty easy to find that email address.

As a result lots of ~~scum of the earth~~enterprising individuals seem to be passing that email address around to other enterprising individuals who decide to add it to their email marketing mailshots.

A reasonable response to this would perhaps be to move away from email to a web form, and put it behind a login so that only existing, authenticated customers could submit new tickets. Thing is, I still have to have a way for previously-unknown people to create tickets by email, and I kind of like email. So I persevere.

One thing I can do though is block all kinds of newsletters. There is no scenario where people who send newsletters should be trying to open support tickets. I’m prepared to disallow any email from MailJet or SendGrid being sent to my ticketing system, for example.

But how to do it?

Well, I am already able to identify emails from MailJet and SendGrid because I use the ASN plugin. This inserts a header in the email to say which Autonomous System it came from.

MailJet’s ASN is 200069 and SendGrid’s is 11377. I know that because I’ve seen mail from them before, and the ASN plugin put a header in with those numbers.

You can add some custom rules to match mails from these ASNs:

header   LOCAL_ASN_MAILJET X-ASN =~ /\b200069\b/
score    LOCAL_ASN_MAILJET 0.001
describe LOCAL_ASN_MAILJET Sent by MailJet (ASN200069)

What this will do is check the header that the ASN plugin added and if it matches it will add this label LOCAL_ASN_MAILJET with a score of 0.001 to the list of SpamAssassin scores.

Scores that are very close to zero (but not actually zero!) are typically used just to annotate an email. You can’t use zero exactly because that disables the rule entirely.

Now, if you really didn’t want any email from MailJet at all you could crank that score up and it would all be rejected. But my users do actually get quite a lot of wanted email from the likes of MailJet and SendGrid. These senders are sadly too big to block. They know this, and this probably contributes to their noted preference for taking spammers’ money, but that is a rant for another day.

Back to the original goal: I only want to reject mail from these companies if it is destined for my ticketing system. So how to identify mail that’s for the support queue? Well that’s pretty simple:

header   LOCAL_TO_SUPPORT ToCc:addr =~ /^support\@example\.com$/i
score    LOCAL_TO_SUPPORT 0.001
describe LOCAL_TO_SUPPORT Recipient is support queue

This checks just the address part(s) of the To and Cc headers to see if any match support@example.com. The periods (‘.’) and the at symbol (‘@’) need escaping because this is a Perl regular expression. If there’s a match then the LOCAL_TO_SUPPORT tag will be added.

Now all that remains is to make a new rule that only fires if both of these conditions are true, and assigns a real score to that:

meta     LOCAL_MAILSHOT_TO_SUPPORT (LOCAL_TO_SUPPORT && (LOCAL_ASN_MAILJET || LOCAL_ASN_SENDGRID))
score    LOCAL_MAILSHOT_TO_SUPPORT 10.0
describe LOCAL_MAILSHOT_TO_SUPPORT Mailshot sent to support queue

There. Now the support queue will never get emails from these companies, but the rest of my users still can.

Of course you don’t have to match those mails by ASN. There are many other indicators of senders that just shouldn’t be opening support tickets, and if you can find any other sort of rule that matches them reliably then you can chain that with other rules that identify the support queue recipient.

Another way to do it would be to run the support queue as its own SpamAssassin user with its own per-user rules. I have a fairly simple SpamAssassin setup though with only a global set of rules so I didn’t want to do that just for this.

Forcing zone transfers with BIND and PowerDNS

Posted on May 16, 2019 by Andy

Contents

The Problem
The Fix(es)

Increment the serial a bit
Increment the serial a lot
Delete the zones and re-add them
Force a zone transfer

BIND
PowerDNS

The Problem ^

Today a customer told me that they had messed up the serial numbers on their DNS zones such that their primary server now had a lower serial number than my secondary servers. Once that happens the secondary servers will stop doing zone transfers.

The Fix(es) ^

TL;DR: I chose the last one, “force a zone transfer”. I knew the BIND one but had to look up the PowerDNS way. Having me look things up for you is (sometimes) part of the BitFolk value proposition. 😀

Increment the serial a bit ^

They could fix it by simply incrementing their serial again to make it larger than mine, but they wanted to continue to use a YYYYMMDDXX format for it.

Increment the serial a lot ^

As the serial is an unsigned integer, if you increment it far enough it will wrap around and become actually smaller than your desired new serial, which you can then set. This is a complicated process which is best described elsewhere.

Delete the zones and re-add them ^

If zones were deleted from all secondary servers then the next update should put them back. This would however cause an outage in between, so it’s not a good idea.

Force a zone transfer ^

Here’s how to force a zone transfer on BIND and PowerDNS.

BIND ^

$ rndc retransfer example.com

PowerDNS ^

$ pdns_control retrieve example.com

Forcing the source address of an SNMP client (e.g. snmpwalk)

Posted on February 3, 2019 by Andy

I was using snmpwalk earlier and it kept using the “wrong” IP address to send packets from. The destination was firewalled to only accept packets from certain sources, and I didn’t want to poke another hole just because snmpwalk was being stupid.

I read a lot of man pages to try to find out how to specify the source address but couldn’t find anything anywhere. Eventually I uncovered a post from 2004 saying that you can use the clientaddr directive in snmp.conf.

So, just so this is easy to find the next time I need it, you can force it on the command line with:

$ snmpwalk --clientaddr=192.168.0.1 …

And you do have to put the ‘=’ in there.

The Internet of Unprofitable Things

Posted on December 24, 2018December 25, 2018 by Andy

Contents

Gather round children
A modest proposal
wats 85.119.80.232 precious?
NTP?
They want what?
That’s not gonna work
Down the rabbit hole
Weekly??
We made a deal
Not plain sailing
Yearly report card: must try harder
u wot m8
All good things…

Gather round children ^

Uncle Andrew wants to tell you a festive story. The NTPmare shortly after Christmas.

A modest proposal ^

Nearly two years ago, on the afternoon of Monday 16th January 2017, I received an interesting BitFolk support ticket from a non-customer. The sender identified themselves as a senior software engineer at NetThings UK Ltd.

Subject: Specific request for NTP on IP 85.119.80.232

Hi,

This might sound odd but I need to setup an NTP server instance on IP address 85.119.80.232.

wats `85.119.80.232` precious? ^

85.119.80.232 is actually one of the IP addresses of one of BitFolk’s customer-facing NTP servers. It was also, until a few weeks before this email, part of the NTP Pool project.

“Was” being the important issue here. In late December of 2016 I had withdrawn BitFolk’s NTP servers from the public pool and firewalled them off to non-customers.

I’d done that because they were receiving an unusually large amount of traffic due to the Snapchat NTP bug. It wasn’t really causing any huge problems, but the number of traffic flows were pushing useful information out of Jump‘s fixed-size netflow database and I didn’t want to deal with it over the holiday period, so this public service was withdrawn.

NTP? ^

This article was posted to Hacker News and a couple of comments there said they would have liked to have seen a brief explanation of what NTP is, so I’ve now added this section. If you know what NTP is already then you should probably skip this section because it will be quite brief and non-technical.

Network Time Protocol is a means by which a computer can use multiple other computers, often from across the Internet on completely different networks under different administrative control, to accurately determine what the current time is. By using several different computers, a small number of them can be inaccurate or even downright broken or hostile, and still the protocol can detect the “bad” clocks and only take into account the more accurate majority.

NTP is supposed to be used in a hierarchical fashion: A small number of servers have hardware directly attached from which they can very accurately tell the time, e.g. an atomic clock, GPS, etc. Those are called “Stratum 1” servers. A larger number of servers use the stratum 1 servers to set their own time, then serve that time to a much larger population of clients, and so on.

It used to be the case that it was quite hard to find NTP servers that you were allowed to use. Your own organisation might have one or two, but really you should have at least 3 to 7 of them and it’s better if there are multiple different organisations involved. In a university environment that wasn’t so difficult because you could speak to colleagues from another institution and swap NTP access. As the Internet matured and became majority used by corporations and private individuals though, people still needed access to accurate time, and this wasn’t going to cut it.

The NTP Pool project came to the rescue by making an easy web interface for people to volunteer their NTP servers, and then they’d be served collectively in a DNS zone with some basic means to share load. A private individual can just use three names from the pool zone and they will get three different (constantly changing) NTP servers.

Corporations and those making products that need to query the NTP pool are supposed to ask for a “vendor zone”. They make some small contribution to the NTP pool project and then they get a DNS zone dedicated to their product, so it’s easier for the pool administrators to direct the traffic.

Sadly many companies don’t take the time to understand this and just use the generic pool zone. NetThings UK Ltd went one step further in a very wrong direction by taking an IP address from the pool and just using it directly, assuming it would always be available for their use. In reality it was a free service donated to the pool by BitFolk and as it had become temporarily inconvenient for that arrangement to continue, service was withdrawn.

On with the story…

They want what? ^

The Senior Software Engineer continued:

The NTP service was recently shutdown and I am interested to know if there is any possibility of starting it up again on the IP address mentioned. Either through the current holder of the IP address or through the migration of the current machine to another address to enable us to lease 85.119.80.232.

Um…

I realise that this is a peculiar request but I can assure you it is genuine.

That’s not gonna work ^

Obviously what with 85.119.80.232 currently being in use by all customers as a resolver and NTP server I wasn’t very interested in getting them all to change their configuration and then leasing it to NetThings UK Ltd.

What I did was remove the firewalling so that 85.119.80.232 still worked as an NTP server for NetThings UK Ltd until we worked out what could be done.

I then asked some pertinent questions so we could work out the scope of the service we’d need to provide. Questions such as:

How many clients do you have using this?
Do you know their IP addresses?
When do they need to use the NTP server and for how long?
Can you make them use the pool properly (a vendor zone)?

Down the rabbit hole ^

The answers to some of the above questions were quite disappointing.

It would be of some use for our manufacturing setup (where the RTCs are initially set) but unfortunately we also have a reasonably large field population (~500 units with weekly NTP calls) that use roaming GPRS SIMs. I don’t know if we can rely on the source IP of the APN for configuring the firewall in this case (I will check though). We are also unable to update the firmware remotely on these devices as they only have a 5MB per month data allowance. We are able to wirelessly update them locally but the timeline for this is months rather than weeks.

Basically it seemed that NetThings UK Ltd made remote controlled thermostats and lighting controllers for large retail spaces etc. And their devices had one of BitFolk’s IP addresses burnt into them at the factory. And they could not be identified or remotely updated.

Oh, and whatever these devices were, without an external time source their clocks would start to noticeably drift within 2 weeks.

By the way, they solved their “burnt into it at the factory” problem by bringing up BitFolk’s IP address locally at their factory to set initial date/time.

I’ll admit, at this point I was slightly tempted to work out how to identify these devices and reply to them with completely the wrong times to see if I could get some retail parks to turn their lights on and off at strange times.

Weekly?? ^

We are triggering ntp calls on a weekly cron with no client side load balancing. This would result in a flood of calls at the same time every Sunday evening at around 19:45.

Yeah, they made every single one of their unidentifiable devices contact a hard coded IP address within a two minute window every Sunday night.

The Senior Software Engineer was initially very worried that they were the cause of the excess flows I had mentioned earlier, but I reassured them that it was definitely the Snapchat bug. In fact I never was able to detect their devices above background noise; it turns out that ~500 devices doing a single SNTP query is pretty light load. They’d been doing it for over 2 years before I received this email.

I did of course point out that they were lucky we caught this early because they could have ended up as the next Netgear vs. University of Wisconsin.

I am feeling really, really bad about this. I’m very, very sorry if we were the cause of your problems.

Bless. I must point out that throughout all of this, their Senior Software Engineer was a pleasure to work with.

We made a deal ^

While NTP service is something BitFolk provides as a courtesy to customers, it’s not something that I wanted to sell as a service on its own. And after all, who would buy it, when the public pool exists? The correct thing for a corporate entity to do is support the pool with a vendor zone.

But NetThings UK Ltd were in a bind and not allowing them to use BitFolk’s NTP server was going to cause them great commercial harm. Potentially I could have asked for a lot of money at this point, but (no doubt to my detriment) that just felt wrong.

I proposed that initially they pay me for two hours of consultancy to cover work already done in dealing with their request and making the firewall changes.

I further proposed that I charged them one hour of consultancy per month for a period of 12 months, to cover continued operation of the NTP server. Of course, I do not spend an hour a month fiddling with NTP, but this unusual departure from my normal business had to come at some cost.

I was keen to point out that this wasn’t something I wanted to continue forever:

Finally, this is not a punitive charge. It seems likely that you are in a difficult position at the moment and there is the temptation to charge you as much as we can get away with (a lot more than £840 [+VAT per year], anyway), but this seems unfair to me. However, providing NTP service to third parties is not a business we want to be in so we would expect this to only last around 12 months. If you end up having to renew this service after 12 months then that would be an indication that we haven’t charged you enough and we will increase the price.

Does this seem reasonable?

NetThings UK Ltd happily agreed to this proposal on a quarterly basis.

Thanks again for the info and help. You have saved me a huge amount of convoluted and throwaway work. This give us enough time to fix things properly.

Not plain sailing ^

I only communicated with the Senior Software Engineer one more time. The rest of the correspondence was with financial staff, mainly because NetThings UK Ltd did not like paying its bills on time.

NetThings UK Ltd paid 3 of its 4 invoices in the first year late. I made sure to charge them statutory late payment fees for each overdue invoice.

Yearly report card: must try harder ^

As 2017 was drawing to a close, I asked the Senior Software Engineer how NetThings UK Ltd was getting on with ceasing to hard code BitFolk’s IP address in its products.

To give you a quick summary, we have migrated the majority of our products away from using the fixed IP address. There is still one project to be updated after which there will be no new units being manufactured using the fixed IP address. However, we still have around 1000 units out in the field that are not readily updatable and will continue to perform weekly NTP calls to the fixed IP address. So to answer your question, yes we will still require the service past January 2018.

This was a bit disappointing because a year earlier the number had been “about 500” devices, yet despite a year of effort the number had apparently doubled.

That alone would have been enough for me to increase the charge, but I was going to anyway due to NetThings UK Ltd’s aversion to paying on time. I gave them just over 2 months of notice that the price was going to double.

u wot m8 ^

Approximately 15 weeks after being told that the price doubling was going to happen, NetThings UK Ltd’s Financial Controller asked me why it had happened, while letting me know that another of their late payments had been made:

Date: Wed, 21 Feb 2018 14:59:42 +0000

We’ve paid this now, but can you explain why the price has doubled?

I was very happy to explain again in detail why it had doubled. The Financial Controller in response tried to agree a fixed price for a year, which I said I would be happy to do if they paid for the full year in one payment.

My rationale for this was that a large part of the reason for the increase was that I had been spending a lot of time chasing their late payments, so if they wanted to still make quarterly payments then I would need the opportunity to charge more if I needed to. If they wanted assurance then in my view they should pay for it by making one yearly payment.

There was no reply, so the arrangement continued on a quarterly basis.

All good things… ^

On 20 November 2018 BitFolk received a letter from Deloitte:

Netthings Limited – In Administration (“The Company”)

Company Number: SC313913

[…]

Cessation of Trading

The Company ceased to trade with effect from 15 November 2018.

Investigation

As part of our duties as Joint Administrators, we shall be investigating what assets the Company holds and what recoveries if any may be made for the benefit of creditors as well as the manner in which the Company’s business has been conducted.

And then on 21 December:

Under paragraph 51(1)(b) of the Insolvency Act 1986, the Joint Administrators are not required to call an initial creditors’ meeting unless the Company has sufficient funds to make a distribution to the unsecured creditors, or unless a meeting is requested on Form SADM_127 by 10% or more in value of the Company’s unsecured creditors. There will be no funds available to make a distribution to the unsecured creditors of the Company, therefore a creditors’ meeting will not be convened.

Luckily their only unpaid invoice was for service from some point in November, so they didn’t really get anything that they hadn’t already paid for.

So that’s the story of NetThings UK Ltd, a brave pioneer of the Internet of Things wave, who thought that the public NTP pool was just an inherent part of the Internet that anyone could use for free, and that the way to do that was to pick one IP address out of it at random and bake that into over a thousand bits of hardware that they distributed around the country with no way to remotely update.

This coupled with their innovative reluctance to pay for anything on time was sadly not enough to let them remain solvent.

Using a different theme for Mediawiki’s SyntaxHighlight extension

Posted on March 20, 2018 by Andy

Probably the best syntax highlighting plugin for Mediawiki at the moment is the one simply called SyntaxHighlight. It uses Pygments to do the heavy lifting. What sets it apart from the other extensions is that it supports line numbers and picking out highlighted lines.

Unfortunately the default style (theme) is dark-on-light whereas for most of my syntax highlighting I am giving examples of either shell sessions or code. All my shell sessions and code are viewed as light-on-dark, so I would prefer that the wiki’s syntax highlighting followed suit.

I spent quite a while messing about with editing the extension itself but to little effect, until Robert pointed out that I just needed to edit the Common.css file inside the wiki itself. Then you get some decent results.

I used something like this to generate the correct CSS for the “native” style:

$ ./extensions/SyntaxHighlight_GeSHi/pygments/pygmentize -S native -f html|sed -e 's/^/.mw-highlight > pre /'
.mw-highlight > pre .hll { background-color: #404040 }
.mw-highlight > pre .c { color: #999999; font-style: italic } /* Comment */
.mw-highlight > pre .err { color: #a61717; background-color: #e3d2d2 } /* Error */
.mw-highlight > pre .esc { color: #d0d0d0 } /* Escape */
.mw-highlight > pre .g { color: #d0d0d0 } /* Generic */
.mw-highlight > pre .k { color: #6ab825; font-weight: bold } /* Keyword */
.mw-highlight > pre .l { color: #d0d0d0 } /* Literal */
.mw-highlight > pre .n { color: #d0d0d0 } /* Name */
.mw-highlight > pre .o { color: #d0d0d0 } /* Operator */
.mw-highlight > pre .x { color: #d0d0d0 } /* Other */
.mw-highlight > pre .p { color: #d0d0d0 } /* Punctuation */
.mw-highlight > pre .ch { color: #999999; font-style: italic } /* Comment.Hashbang */
.mw-highlight > pre .cm { color: #999999; font-style: italic } /* Comment.Multiline */
.mw-highlight > pre .cp { color: #cd2828; font-weight: bold } /* Comment.Preproc */
.mw-highlight > pre .cpf { color: #999999; font-style: italic } /* Comment.PreprocFile */
.mw-highlight > pre .c1 { color: #999999; font-style: italic } /* Comment.Single */
.mw-highlight > pre .cs { color: #e50808; font-weight: bold; background-color: #520000 } /* Comment.Special */
.mw-highlight > pre .gd { color: #d22323 } /* Generic.Deleted */
.mw-highlight > pre .ge { color: #d0d0d0; font-style: italic } /* Generic.Emph */
.mw-highlight > pre .gr { color: #d22323 } /* Generic.Error */
.mw-highlight > pre .gh { color: #ffffff; font-weight: bold } /* Generic.Heading */
.mw-highlight > pre .gi { color: #589819 } /* Generic.Inserted */
.mw-highlight > pre .go { color: #cccccc } /* Generic.Output */
.mw-highlight > pre .gp { color: #aaaaaa } /* Generic.Prompt */
.mw-highlight > pre .gs { color: #d0d0d0; font-weight: bold } /* Generic.Strong */
.mw-highlight > pre .gu { color: #ffffff; text-decoration: underline } /* Generic.Subheading */
.mw-highlight > pre .gt { color: #d22323 } /* Generic.Traceback */
.mw-highlight > pre .kc { color: #6ab825; font-weight: bold } /* Keyword.Constant */
.mw-highlight > pre .kd { color: #6ab825; font-weight: bold } /* Keyword.Declaration */
.mw-highlight > pre .kn { color: #6ab825; font-weight: bold } /* Keyword.Namespace */
.mw-highlight > pre .kp { color: #6ab825 } /* Keyword.Pseudo */
.mw-highlight > pre .kr { color: #6ab825; font-weight: bold } /* Keyword.Reserved */
.mw-highlight > pre .kt { color: #6ab825; font-weight: bold } /* Keyword.Type */
.mw-highlight > pre .ld { color: #d0d0d0 } /* Literal.Date */
.mw-highlight > pre .m { color: #3677a9 } /* Literal.Number */
.mw-highlight > pre .s { color: #ed9d13 } /* Literal.String */
.mw-highlight > pre .na { color: #bbbbbb } /* Name.Attribute */
.mw-highlight > pre .nb { color: #24909d } /* Name.Builtin */
.mw-highlight > pre .nc { color: #447fcf; text-decoration: underline } /* Name.Class */
.mw-highlight > pre .no { color: #40ffff } /* Name.Constant */
.mw-highlight > pre .nd { color: #ffa500 } /* Name.Decorator */
.mw-highlight > pre .ni { color: #d0d0d0 } /* Name.Entity */
.mw-highlight > pre .ne { color: #bbbbbb } /* Name.Exception */
.mw-highlight > pre .nf { color: #447fcf } /* Name.Function */
.mw-highlight > pre .nl { color: #d0d0d0 } /* Name.Label */
.mw-highlight > pre .nn { color: #447fcf; text-decoration: underline } /* Name.Namespace */
.mw-highlight > pre .nx { color: #d0d0d0 } /* Name.Other */
.mw-highlight > pre .py { color: #d0d0d0 } /* Name.Property */
.mw-highlight > pre .nt { color: #6ab825; font-weight: bold } /* Name.Tag */
.mw-highlight > pre .nv { color: #40ffff } /* Name.Variable */
.mw-highlight > pre .ow { color: #6ab825; font-weight: bold } /* Operator.Word */
.mw-highlight > pre .w { color: #666666 } /* Text.Whitespace */
.mw-highlight > pre .mb { color: #3677a9 } /* Literal.Number.Bin */
.mw-highlight > pre .mf { color: #3677a9 } /* Literal.Number.Float */
.mw-highlight > pre .mh { color: #3677a9 } /* Literal.Number.Hex */
.mw-highlight > pre .mi { color: #3677a9 } /* Literal.Number.Integer */
.mw-highlight > pre .mo { color: #3677a9 } /* Literal.Number.Oct */
.mw-highlight > pre .sa { color: #ed9d13 } /* Literal.String.Affix */
.mw-highlight > pre .sb { color: #ed9d13 } /* Literal.String.Backtick */
.mw-highlight > pre .sc { color: #ed9d13 } /* Literal.String.Char */
.mw-highlight > pre .dl { color: #ed9d13 } /* Literal.String.Delimiter */
.mw-highlight > pre .sd { color: #ed9d13 } /* Literal.String.Doc */
.mw-highlight > pre .s2 { color: #ed9d13 } /* Literal.String.Double */
.mw-highlight > pre .se { color: #ed9d13 } /* Literal.String.Escape */
.mw-highlight > pre .sh { color: #ed9d13 } /* Literal.String.Heredoc */
.mw-highlight > pre .si { color: #ed9d13 } /* Literal.String.Interpol */
.mw-highlight > pre .sx { color: #ffa500 } /* Literal.String.Other */
.mw-highlight > pre .sr { color: #ed9d13 } /* Literal.String.Regex */
.mw-highlight > pre .s1 { color: #ed9d13 } /* Literal.String.Single */
.mw-highlight > pre .ss { color: #ed9d13 } /* Literal.String.Symbol */
.mw-highlight > pre .bp { color: #24909d } /* Name.Builtin.Pseudo */
.mw-highlight > pre .fm { color: #447fcf } /* Name.Function.Magic */
.mw-highlight > pre .vc { color: #40ffff } /* Name.Variable.Class */
.mw-highlight > pre .vg { color: #40ffff } /* Name.Variable.Global */
.mw-highlight > pre .vi { color: #40ffff } /* Name.Variable.Instance */
.mw-highlight > pre .vm { color: #40ffff } /* Name.Variable.Magic */
.mw-highlight > pre .il { color: #3677a9 } /* Literal.Number.Integer.Long */

$ ./extensions/SyntaxHighlight_GeSHi/pygments/pygmentize -S native -f html|sed -e 's/^/.mw-highlight > pre /' .mw-highlight > pre .hll { background-color: #404040 } .mw-highlight > pre .c { color: #999999; font-style: italic } /* Comment */ .mw-highlight > pre .err { color: #a61717; background-color: #e3d2d2 } /* Error */ .mw-highlight > pre .esc { color: #d0d0d0 } /* Escape */ .mw-highlight > pre .g { color: #d0d0d0 } /* Generic */ .mw-highlight > pre .k { color: #6ab825; font-weight: bold } /* Keyword */ .mw-highlight > pre .l { color: #d0d0d0 } /* Literal */ .mw-highlight > pre .n { color: #d0d0d0 } /* Name */ .mw-highlight > pre .o { color: #d0d0d0 } /* Operator */ .mw-highlight > pre .x { color: #d0d0d0 } /* Other */ .mw-highlight > pre .p { color: #d0d0d0 } /* Punctuation */ .mw-highlight > pre .ch { color: #999999; font-style: italic } /* Comment.Hashbang */ .mw-highlight > pre .cm { color: #999999; font-style: italic } /* Comment.Multiline */ .mw-highlight > pre .cp { color: #cd2828; font-weight: bold } /* Comment.Preproc */ .mw-highlight > pre .cpf { color: #999999; font-style: italic } /* Comment.PreprocFile */ .mw-highlight > pre .c1 { color: #999999; font-style: italic } /* Comment.Single */ .mw-highlight > pre .cs { color: #e50808; font-weight: bold; background-color: #520000 } /* Comment.Special */ .mw-highlight > pre .gd { color: #d22323 } /* Generic.Deleted */ .mw-highlight > pre .ge { color: #d0d0d0; font-style: italic } /* Generic.Emph */ .mw-highlight > pre .gr { color: #d22323 } /* Generic.Error */ .mw-highlight > pre .gh { color: #ffffff; font-weight: bold } /* Generic.Heading */ .mw-highlight > pre .gi { color: #589819 } /* Generic.Inserted */ .mw-highlight > pre .go { color: #cccccc } /* Generic.Output */ .mw-highlight > pre .gp { color: #aaaaaa } /* Generic.Prompt */ .mw-highlight > pre .gs { color: #d0d0d0; font-weight: bold } /* Generic.Strong */ .mw-highlight > pre .gu { color: #ffffff; text-decoration: underline } /* Generic.Subheading */ .mw-highlight > pre .gt { color: #d22323 } /* Generic.Traceback */ .mw-highlight > pre .kc { color: #6ab825; font-weight: bold } /* Keyword.Constant */ .mw-highlight > pre .kd { color: #6ab825; font-weight: bold } /* Keyword.Declaration */ .mw-highlight > pre .kn { color: #6ab825; font-weight: bold } /* Keyword.Namespace */ .mw-highlight > pre .kp { color: #6ab825 } /* Keyword.Pseudo */ .mw-highlight > pre .kr { color: #6ab825; font-weight: bold } /* Keyword.Reserved */ .mw-highlight > pre .kt { color: #6ab825; font-weight: bold } /* Keyword.Type */ .mw-highlight > pre .ld { color: #d0d0d0 } /* Literal.Date */ .mw-highlight > pre .m { color: #3677a9 } /* Literal.Number */ .mw-highlight > pre .s { color: #ed9d13 } /* Literal.String */ .mw-highlight > pre .na { color: #bbbbbb } /* Name.Attribute */ .mw-highlight > pre .nb { color: #24909d } /* Name.Builtin */ .mw-highlight > pre .nc { color: #447fcf; text-decoration: underline } /* Name.Class */ .mw-highlight > pre .no { color: #40ffff } /* Name.Constant */ .mw-highlight > pre .nd { color: #ffa500 } /* Name.Decorator */ .mw-highlight > pre .ni { color: #d0d0d0 } /* Name.Entity */ .mw-highlight > pre .ne { color: #bbbbbb } /* Name.Exception */ .mw-highlight > pre .nf { color: #447fcf } /* Name.Function */ .mw-highlight > pre .nl { color: #d0d0d0 } /* Name.Label */ .mw-highlight > pre .nn { color: #447fcf; text-decoration: underline } /* Name.Namespace */ .mw-highlight > pre .nx { color: #d0d0d0 } /* Name.Other */ .mw-highlight > pre .py { color: #d0d0d0 } /* Name.Property */ .mw-highlight > pre .nt { color: #6ab825; font-weight: bold } /* Name.Tag */ .mw-highlight > pre .nv { color: #40ffff } /* Name.Variable */ .mw-highlight > pre .ow { color: #6ab825; font-weight: bold } /* Operator.Word */ .mw-highlight > pre .w { color: #666666 } /* Text.Whitespace */ .mw-highlight > pre .mb { color: #3677a9 } /* Literal.Number.Bin */ .mw-highlight > pre .mf { color: #3677a9 } /* Literal.Number.Float */ .mw-highlight > pre .mh { color: #3677a9 } /* Literal.Number.Hex */ .mw-highlight > pre .mi { color: #3677a9 } /* Literal.Number.Integer */ .mw-highlight > pre .mo { color: #3677a9 } /* Literal.Number.Oct */ .mw-highlight > pre .sa { color: #ed9d13 } /* Literal.String.Affix */ .mw-highlight > pre .sb { color: #ed9d13 } /* Literal.String.Backtick */ .mw-highlight > pre .sc { color: #ed9d13 } /* Literal.String.Char */ .mw-highlight > pre .dl { color: #ed9d13 } /* Literal.String.Delimiter */ .mw-highlight > pre .sd { color: #ed9d13 } /* Literal.String.Doc */ .mw-highlight > pre .s2 { color: #ed9d13 } /* Literal.String.Double */ .mw-highlight > pre .se { color: #ed9d13 } /* Literal.String.Escape */ .mw-highlight > pre .sh { color: #ed9d13 } /* Literal.String.Heredoc */ .mw-highlight > pre .si { color: #ed9d13 } /* Literal.String.Interpol */ .mw-highlight > pre .sx { color: #ffa500 } /* Literal.String.Other */ .mw-highlight > pre .sr { color: #ed9d13 } /* Literal.String.Regex */ .mw-highlight > pre .s1 { color: #ed9d13 } /* Literal.String.Single */ .mw-highlight > pre .ss { color: #ed9d13 } /* Literal.String.Symbol */ .mw-highlight > pre .bp { color: #24909d } /* Name.Builtin.Pseudo */ .mw-highlight > pre .fm { color: #447fcf } /* Name.Function.Magic */ .mw-highlight > pre .vc { color: #40ffff } /* Name.Variable.Class */ .mw-highlight > pre .vg { color: #40ffff } /* Name.Variable.Global */ .mw-highlight > pre .vi { color: #40ffff } /* Name.Variable.Instance */ .mw-highlight > pre .vm { color: #40ffff } /* Name.Variable.Magic */ .mw-highlight > pre .il { color: #3677a9 } /* Literal.Number.Integer.Long */

(Yes, I also need to do the light-on-dark thing here in this blog)

To get a list of available styles:

$ ./extensions/SyntaxHighlight_GeSHi/pygments/pygmentize -L styles
Pygments version 2.2.0, (c) 2006-2017 by Georg Brandl.
Styles:
~~~~~~~
* manni:
A colorful style, inspired by the terminal highlighting style.
* igor:
Pygments version of the official colors for Igor Pro procedures.
* lovelace:
The style used in Lovelace interactive learning environment. Tries to avoid the "angry fruit salad" effect with desaturated and dim colours.
* xcode:
Style similar to the Xcode default colouring theme.
* vim:
Styles somewhat like vim 7.0
* autumn:
A colorful style, inspired by the terminal highlighting style.
* abap:
* vs:
* rrt:
Minimalistic "rrt" theme, based on Zap and Emacs defaults.
* native:
Pygments version of the "native" vim theme.
* perldoc:
Style similar to the style used in the perldoc code blocks.
* borland:
Style similar to the style used in the borland IDEs.
* arduino:
The Arduino® language style. This style is designed to highlight the Arduino source code, so exepect the best results with it.
* tango:
The Crunchy default Style inspired from the color palette from the Tango Icon Theme Guidelines.
* emacs:
The default style (inspired by Emacs 22).
* friendly:
A modern style based on the VIM pyte theme.
* monokai:
This style mimics the Monokai color scheme.
* paraiso-dark:
* colorful:
A colorful style, inspired by CodeRay.
* murphy:
Murphy's style from CodeRay.
* bw:
* pastie:
Style similar to the pastie default style.
* rainbow_dash:
A bright and colorful syntax highlighting theme.
* algol_nu:
* paraiso-light:
* trac:
Port of the default trac highlighter design.
* default:
The default style (inspired by Emacs 22).
* algol:
* fruity:
Pygments version of the "native" vim theme.

Although you may find it easier looking at the Pygments style gallery.

Let’s Encrypt wildcard certificates, acme.sh and automated DNS verification

Posted on March 19, 2018March 19, 2018 by Andy

Contents

Let’s Encrypt’s wildcard certificates
dns-01 ACME challenges

strugglers.net zone file content
acmesh.strugglers.net zone file content

DNS server configuration

Generate a key for dynamic DNS updates
Put the key in the BIND config
Put the new zone into the BIND config
Reload the DNS server
Check things out with nsupdate

acme.sh

Issuing the first wildcard certificate
Examining a certificate

Let’s Encrypt’s wildcard certificates ^

Now that Let’s Encrypt can issue wildcard TLS certificates I found some time to look into that.

I already use a Lua script with haproxy which takes care of automatically answering http-01 ACME challenges, but to issue/renew a wildcard certificate you need to answer a dns-01 challenge. A different client/setup would be needed.

dns-01 ACME challenges ^

Most of the clients that support ACME v2 offer a range of integrations for DNS providers, plus a manual mode that prints out the DNS record that you need to add and then waits for you to indicate that you’ve done it. I run my own DNS infrastructure so the thing to do would be RFC2136 dynamic DNS updates.

One wrinkle here is that currently none of my DNS zones have dynamic updates enabled. At the moment I manage them as zone files (some are automatically generated by scripts though). After looking at a few of the client options I found that acme.sh supports an “alias zone”.

Basically, in your main zone you create a CNAME for the challenge record that points at another zone, and then enable dynamic updates in that other zone. The other zone is dedicated for this purpose, so the only updates which will be happening will be for the purpose of answering dns-01 ACME challenges. I made my dynamic zone a sub-zone of my main one:

strugglers.net zone file content ^

These records need to be added to the main zone for this to work.

.
.
.
; sub-zone purely used for dns-01 ACME challenges.
acmesh          NS a.authns.bitfolk.co.uk.
                NS b.authns.bitfolk.com.
                NS c.authns.bitfolk.com.
 
; Alias the dns-01 challenge record into the dedicated zone.
_acme-challenge CNAME _acme-challenge.acmesh.strugglers.net.
.
.
.

acmesh.strugglers.net zone file content ^

Initially this just needs to be an empty zone with only SOA and NS records, so this is the entire content of the file.

$ORIGIN .
$TTL 86400      ; 1 day
acmesh.strugglers.net   IN SOA  a.authns.bitfolk.co.uk. hostmaster.bitfolk.com. (
                                2018031905 ; serial
                                14400      ; refresh (4 hours)
                                7200       ; retry (2 hours)
                                1209600    ; expire (2 weeks)
                                43200      ; minimum (12 hours)
                                )
                        NS      a.authns.bitfolk.co.uk.
                        NS      b.authns.bitfolk.com.
                        NS      c.authns.bitfolk.com.

DNS server configuration ^

The DNS server needs to know a key by which it will authenticate acme.sh‘s updates, and also needs to be told that the new zone is a dynamic zone. I use BIND, so it goes as follows.

Generate a key for dynamic DNS updates ^

Use the dnssec-keygen command to generate a key suitable for authenticating DNS updates.

$ dnssec-keygen -r /dev/urandom -a HMAC-SHA512 -b 512 -n HOST DDNS_UPDATE

This creates two files named like Kddns_update.+165+14059.key and Kddns_update.+165+14059.private.

Put the key in the BIND config ^

Look in the private file and take the key from the line that starts “Key:”. Put that in some config file that you will load into your BIND like this:

key "strugglers" {
    algorithm hmac-sha512;
    secret "Sb8nvwpO8bhiU4haPB+NiJKoMO6vVJumrr29Bj3daSuB8hBoTKoqPKMBKTYLRUv12pbKPwJATgdsU6BtL4Hmcw==";
};

The thing in quotes after “key” is a symbolic name for this key and can be anything that makes sense to you. The “secret” is the key from the private file. You can delete the two Kddns_update.+165+14059.* files now.

Put the new zone into the BIND config ^

The config for the zone itself looks something like this:

zone "acmesh.strugglers.net" {
    type master;
    file "/path/to/acmesh.strugglers.net";
    allow-update {
        key "strugglers";
    };
};

Reload the DNS server ^

Once BIND has been reloaded the log file should indicate that the acemsh.strugglers.net zone was loaded correctly, and in my case that triggers DNS NOTIFY to my secondary servers which automatically begin zone transfers.

Check things out with `nsupdate` ^

At this point it might be worth using the nsupdate command to check that you can do dynamic DNS updates.

Just type the nsupdate line in the shell, the > is a prompt at which you will type the updates you wish to send. We’ll add a trivial TXT record. The -k argument is the path to the file containing the key.

$ nsupdate -k /path/to/strugglers.key -v
> server a.authns.bitfolk.co.uk
> debug yes
> zone acmesh.strugglers.net.
> update add foo.acmesh.strugglers.net. 86400 TXT "bar"
> show
Outgoing update query:
;; ->>HEADER<<- opcode: UPDATE, status: NOERROR, id:      0
;; flags:; ZONE: 0, PREREQ: 0, UPDATE: 0, ADDITIONAL: 0
;; ZONE SECTION:
;acmesh.strugglers.net.         IN      SOA
 
;; UPDATE SECTION:
foo.acmesh.strugglers.net. 86400 IN     TXT     "bar"
 
> send
Sending update to 85.119.80.222#53
Outgoing update query:
;; ->>HEADER<<- opcode: UPDATE, status: NOERROR, id:  19987
;; flags:; ZONE: 1, PREREQ: 0, UPDATE: 1, ADDITIONAL: 1
;; ZONE SECTION:
;acmesh.strugglers.net.         IN      SOA
 
;; UPDATE SECTION:
foo.acmesh.strugglers.net. 86400 IN     TXT     "bar"
 
;; TSIG PSEUDOSECTION:
strugglers.             0       ANY     TSIG    hmac-sha512. 1521454639 300 64 dPndp1/ZyqzmSEn0AKIsGR62HrsplJBhntWioM4oBdPlNXUIAwg7Jwpg DGSM2S3kY+5hfGTleNqwXZrMvnBhUQ== 19987 NOERROR 0 
 
 
Reply from update query:
;; ->>HEADER<<- opcode: UPDATE, status: NOERROR, id:  19987
;; flags: qr; ZONE: 1, PREREQ: 0, UPDATE: 0, ADDITIONAL: 1
;; ZONE SECTION:
;acmesh.strugglers.net.         IN      SOA
 
;; TSIG PSEUDOSECTION:
strugglers.             0       ANY     TSIG    hmac-sha512. 1521454639 300 64 NfH/78kvq6f+59RXnyJwC6kfFRLGjG6Rh9jdYRId7UjH0jwIbtRVpqCu xx4HToGmlJrDTUqpgbYZq2orUOZlkQ== 19987 NOERROR 0
 
> [Ctrl-D]

And to verify it really got added (though the status of NOERROR should be confirmation enough):

$ dig +short -t txt foo.acmesh.strugglers.net
"bar"

That it; you can do dynamic DNS updates.

acme.sh ^

I’m going to assume you’ve installed acme.sh according to one of its supported installation methods. Personally I am not into curl | sh so I:

Create a system user that can’t log in.
git clone the source.
acme.sh --install it as that user.

acme.sh doesn’t have to be run on the primary DNS server, because it’s going to use a dynamic DNS update to do all the DNS things. It just needs access to the dynamic DNS update key file. Either you can install acme.sh on each host that will need to generate/renew certificates and copy the DNS key there, or else do all the certificate generation/renewal in one place and copy the certificate files around.

However you manage it, make sure that the user you’re going to run acme.sh as can read the dynamic DNS update key file.

Issuing the first wildcard certificate ^

The first time you issue the certificate you need to set NSUPDATE_KEY and NSUPDATE_SERVER in your environment. After the first successful issuance acme.sh will store these variables in its configuration for use in the automated renewals.

$ NSUPDATE_SERVER=a.authns.bitfolk.co.uk NSUPDATE_KEY=/path/to/strugglers.key ./acme.sh --issue -d strugglers.net -d '*.strugglers.net' --challenge-alias acmesh.strugglers.net --dns dns_nsupdate
[Mon 19 Mar 09:19:00 UTC 2018] Multi domain='DNS:strugglers.net,DNS:*.strugglers.net'
[Mon 19 Mar 09:19:00 UTC 2018] Getting domain auth token for each domain
[Mon 19 Mar 09:19:03 UTC 2018] Getting webroot for domain='strugglers.net'
[Mon 19 Mar 09:19:03 UTC 2018] Getting webroot for domain='*.strugglers.net'
[Mon 19 Mar 09:19:04 UTC 2018] Found domain api file: /path/to/acmesh/dnsapi/dns_nsupdate.sh
[Mon 19 Mar 09:19:04 UTC 2018] adding _acme-challenge.acmesh.strugglers.net. 60 in txt "WmenhbXRtenhpNLYLOBjznyHcVvFk-jjxurCVTrhWc8"
[Mon 19 Mar 09:19:04 UTC 2018] Found domain api file: /path/to/acmesh/dnsapi/dns_nsupdate.sh
[Mon 19 Mar 09:19:04 UTC 2018] adding _acme-challenge.acmesh.strugglers.net. 60 in txt "fwZPUBHijOQkJJaoOF_nIn3Z_FtuVU9R635NDVz_hPA"
[Mon 19 Mar 09:19:04 UTC 2018] Sleep 120 seconds for the txt records to take effect

At this point a DNS update has been crafted and sent so you should see your zone update and zone transfer happen to any secondary servers. If that doesn’t happen within 120 seconds then when Let’s Encrypt tries to verify the challenge it might query a DNS server that doesn’t yet have the record. Your zone transfers need to be reliable.

[Mon 19 Mar 09:21:08 UTC 2018] Verifying:strugglers.net
[Mon 19 Mar 09:21:12 UTC 2018] Success
[Mon 19 Mar 09:21:12 UTC 2018] Verifying:*.strugglers.net
[Mon 19 Mar 09:21:15 UTC 2018] Success
[Mon 19 Mar 09:21:15 UTC 2018] Removing DNS records.
[Mon 19 Mar 09:21:15 UTC 2018] removing _acme-challenge.acmesh.strugglers.net. txt
[Mon 19 Mar 09:21:16 UTC 2018] removing _acme-challenge.acmesh.strugglers.net. txt
[Mon 19 Mar 09:21:16 UTC 2018] Verify finished, start to sign.
[Mon 19 Mar 09:21:18 UTC 2018] Cert success.
-----BEGIN CERTIFICATE-----
MIIFETCCA/mgAwIBAgISAz4ZQV27n1FgemVAEhIqiUZnMA0GCSqGSIb3DQEBCwUA
MEoxCzAJBgNVBAYTAlVTMRYwFAYDVQQKEw1MZXQncyBFbmNyeXB0MSMwIQYDVQQD
.
.
.
NeAmr5I=
-----END CERTIFICATE-----
[Mon 19 Mar 09:21:18 UTC 2018] Your cert is in  /path/to/acmesh/.acme.sh/strugglers.net/strugglers.net.cer 
[Mon 19 Mar 09:21:18 UTC 2018] Your cert key is in  /path/to/acmesh/.acme.sh/strugglers.net/strugglers.net.key 
[Mon 19 Mar 09:21:18 UTC 2018] The intermediate CA cert is in  /path/to/acmesh/.acme.sh/strugglers.net/ca.cer 
[Mon 19 Mar 09:21:18 UTC 2018] And the full chain certs is there:  /path/to/acmesh/.acme.sh/strugglers.net/fullchain.cer

Examining a certificate ^

Just for peace of mind…

$ openssl x509 -text -noout -certopt no_subject,no_header,no_version,no_serial,no_signame,no_subject,no_issuer,no_pubkey,no_sigdump,no_aux -in /path/to/acmesh/.acme.sh/strugglers.net/strugglers.net.cer
        Validity
            Not Before: Mar 19 08:21:17 2018 GMT
            Not After : Jun 17 08:21:17 2018 GMT
        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage: 
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier: 
                BF:C7:8E:F5:87:05:D0:6E:15:AC:7B:37:9F:82:05:C3:E3:11:B7:32
            X509v3 Authority Key Identifier: 
                keyid:A8:4A:6A:63:04:7D:DD:BA:E6:D1:39:B7:A6:45:65:EF:F3:A8:EC:A1
 
            Authority Information Access: 
                OCSP - URI:http://ocsp.int-x3.letsencrypt.org
                CA Issuers - URI:http://cert.int-x3.letsencrypt.org/
 
            X509v3 Subject Alternative Name: 
                DNS:*.strugglers.net, DNS:strugglers.net
            X509v3 Certificate Policies: 
                Policy: 2.23.140.1.2.1
                Policy: 1.3.6.1.4.1.44947.1.1.1
                  CPS: http://cps.letsencrypt.org
                  User Notice:
                    Explicit Text: This Certificate may only be relied upon by Relying Parties and only in accordance with the Certificate Policy found at https://letsencrypt.org/repository/

From the Subject Alternative Name we can see it is a wildcard certificate.

I'll get there one day.

Updates ^

2022-11-05 ^

The Problem ^

The Answer ^

An Exim Answer ^

What about timeouts? ^

But Why? ^

Scenario ^

S3 endpoint ^

Specifying the S3 endpoint ^

The full script ^

A few words about iptables vs nft ^

Logging with iptables ^

The Problem ^

An answer: ulogd2 ^

Configuring ulogd2 ^

Configuring iptables ^

The Problem ^

The Fix(es) ^

Increment the serial a bit ^

Increment the serial a lot ^

Delete the zones and re-add them ^

Force a zone transfer ^

BIND ^

PowerDNS ^

Gather round children ^

A modest proposal ^

wats 85.119.80.232 precious? ^

NTP? ^

They want what? ^

That’s not gonna work ^

Down the rabbit hole ^

Weekly?? ^

We made a deal ^

Not plain sailing ^

Yearly report card: must try harder ^

u wot m8 ^

All good things… ^

Let’s Encrypt’s wildcard certificates ^

dns-01 ACME challenges ^

strugglers.net zone file content ^

acmesh.strugglers.net zone file content ^

DNS server configuration ^

Generate a key for dynamic DNS updates ^

Put the key in the BIND config ^

Put the new zone into the BIND config ^

Reload the DNS server ^

Check things out with nsupdate ^

acme.sh ^

Issuing the first wildcard certificate ^

Examining a certificate ^

A few words about `iptables` vs `nft` ^

Logging with `iptables` ^

An answer: `ulogd2` ^

Configuring `ulogd2` ^

Configuring `iptables` ^

wats `85.119.80.232` precious? ^

Check things out with `nsupdate` ^