Tricky issues when upgrading to the GoCardless “Pro” API

Background ^

Since 2012 BitFolk has been using GoCardless as a Direct Debit payment provider. On the whole it has been a pleasant experience:

  • Their API is a pleasure to integrate against, having excellent documentation
  • Their support is responsive and knowledgeable
  • Really good sandbox environment with plenty of testing tools
  • The fees, being 1% capped at £2.00, are pretty good for any kind of payment provider (much less than PayPal, Stripe, etc.)

Of course, if I was submitting Direct Debits myself there would be no charge at all, but BitFolk is too small and my bank (Barclays) are not interested in talking to me about that.

The “Pro” API ^

In September 2014 GoCardless came out with a new version of their API called the “Pro API”. It made a few things nicer but didn’t come with any real new features applicable to BitFolk, and also added a minimum fee of £0.20.

The original API I’d integrated against has a 1% fee capped at £2.00, and as BitFolk’s smallest plan is £10.79 including VAT the fee would generally be £0.11. Having a £0.20 fee on these payments would represent nearly a doubling of fees for many of my payments.

So, no compelling reason to use the Pro API.

Over the years, GoCardless made more noise about their Pro API and started calling their original API the “legacy API”. I could see the way things were going. Sure enough, eventually they announced that the legacy API would be disabled on 31 October 2017. No choice but to move to the Pro API now.

Payment caps ^

There aren’t normally any limits on Direct Debit payments. When you let your energy supplier or council or whatever do a Direct Debit, they can empty your bank account if they like.

The Direct Debit Guarantee has very strong provisions in it for protecting the payer: essentially if you dispute anything, any time, you get your money back without question, and the supplier has to pursue you for the money by other means if they still think the charge was correct. A company that repeatedly gets Direct Debit chargebacks is going to be kicked off the service by their bank or payment provider.

The original GoCardless API had the ability to set caps on the mandate which would be enforced their side. A simple “X amount per Y time period”. I thought that this would provide some comfort to customers who may not be otherwise familiar with authorising Direct Debits from small companies like BitFolk, so I made use of that feature by default.

This turned out to be a bad decision.

The main problem with this was that there was no way to change the cap. If a customer upgraded their service then I’d have to cancel their Direct Debit mandate and ask them to authorise a new one because it would cease being possible to charge them the correct amount. Authorising a new mandate was not difficult—about the same amount of work as making any sort of online payment—but asking people to do things is always a pain point.

There was a long-standing feature request with GoCardless to implement some sort of “follow this link to authorise the change” feature, but it never happened.

Payment caps and the new API ^

The Pro API does not support mandates with a capped amount per interval. Given that I’d already established that using caps was a mistake, I wasn’t too bothered by that.

I’ve since discovered, however, that the Pro API not only does not support setting the caps, it does not have any way to query them either. This is bad, because I need to use the Pro API with mandates that were created in the legacy API. And all of those have caps.

Here’s the flow I had using the legacy API:

[Flowchart: legacy payment process, showing the mandate’s remaining cap and next possible charge date being checked before the payment is actually created.]

This way if the charge was coming a little too early, I could give some latitude and let it wait a couple of days until it could be charged. I’d also know if the problem was that the cap was too low. In that case there would be no choice but to cancel the customer’s mandate and ask them to authorise another one, but at least I would know exactly what the problem was.

With the Pro API, there is no way to check timings and charge caps. All I can do is make the charge, and then if it’s too soon or too much I get the same error message:

“Validation failed / exceeds mandate cap”

That’s it. It doesn’t tell me what the cap is, whether I’m charging too soon, or whether I’m charging too much. There is no way to distinguish between those situations.
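For illustration, creating a payment in the Pro API is just an authenticated HTTPS POST, so the failure path looks roughly like this (a sketch using LWP::UserAgent; the token and mandate ID are made up, and the exact shape of the error body may differ slightly):

use strict;
use warnings;
use JSON::PP qw(encode_json decode_json);
use LWP::UserAgent;

my $token   = 'pro_api_access_token';   # made up
my $mandate = 'MD0000EXAMPLE';          # made up, legacy-imported mandate ID

my $ua  = LWP::UserAgent->new;
my $res = $ua->post(
    'https://api.gocardless.com/payments',
    'Authorization'      => "Bearer $token",
    'GoCardless-Version' => '2015-07-06',
    'Content-Type'       => 'application/json',
    Content => encode_json({
        payments => {
            amount   => 1079,              # pence
            currency => 'GBP',
            links    => { mandate => $mandate },
        },
    }),
);

unless ($res->is_success) {
    # For a capped legacy mandate all you get back is a generic
    # validation failure; nothing says whether the amount or the
    # timing was the problem.
    my $err = eval { decode_json($res->decoded_content) };
    warn 'payment failed: '
        . ($err->{error}{message} // $res->status_line) . "\n";
}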

Backwards compatible – sort of ^

GoCardless talk about the Pro API being backwards compatible with the legacy API, so that once switched I would still be able to create payments against mandates created using the legacy API. I would not need to get customers to re-authorise.

This is true to a point, but my use of caps per interval in the legacy API has severely restricted how compatible things are, and that’s something I wasn’t aware of. Sure, their “Guide to upgrading” does briefly mention that caps would continue to be enforced:

“Pre-authorisation mandates are not restricted, but the maximum amount and interval that you originally specified will still apply.”

That is the only mention of this issue in the entire document, and the statement would have been fine by me if there had continued to be a way to tell which failure mode I was going to hit.

Thinking that I was just misunderstanding, I asked GoCardless support about this. Their reply:

Thanks for emailing.

I’m afraid the limits aren’t exposed within the new API. The only solution as you suggest, is to try a payment and check for failure.

Apologies for the inconvenience caused here and if you have any further queries please don’t hesitate to let us know.

What now? ^

I am not yet sure of the best way to handle this.

The nuclear option would be to cancel all mandates and ask customers to authorise them again. I would like to avoid this if possible.

I am thinking that most customers will continue to be fine on the “amount per interval” legacy mandates as long as they don’t upgrade, so I can leave them as they are until that happens. If they upgrade, or if a DD payment ever fails with “exceeds mandate cap”, then I will have to cancel their mandate and ask them to authorise again. I can also check whether their mandate was created before ~today and advise them on the web site to cancel it and authorise a new one.
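In code the handling will probably end up being no more sophisticated than this (the helper names here are hypothetical; the point is that the opaque error can only be treated as a signal to re-authorise):

# Hypothetical helpers, just to show the shape of the handling.
my $result = attempt_pro_api_payment($customer, $amount_in_pence);

if (!$result->{ok} && $result->{error} =~ /exceeds mandate cap/i) {
    # Could mean "too soon" or "too much"; the API won't say which,
    # so both cases get the same treatment.
    cancel_legacy_mandate($customer);
    ask_customer_to_reauthorise($customer);
}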

Conclusion ^

I’m a little disappointed that GoCardless didn’t think there needed to be a way to query mandate caps, given that they continue to enforce them even though creating new mandates with those limits is no longer possible.

I can’t really accept that there is a good level of backwards compatibility here if there is a feature that you can’t even tell is in use until it causes a payment to fail, and even then you can’t tell which details of that feature cause the failure.

I understand why they haven’t just stopped honouring the caps: it wouldn’t be in line with the consumer-focused spirit of the Direct Debit Guarantee to alter things against customer expectations, and even sending out a notification to the customer might not be enough. I think they should have gone the other way and allowed querying of things that they are going to continue to enforce, though.

Could I have tested for this? Well, the difficulty there is that the GoCardless sandbox environment for the Pro API starts off clean, with no access to any of your legacy activity from either the live environment or the legacy sandbox. So I couldn’t do something like the following:

  1. Create legacy mandate in legacy sandbox, with amount per interval caps
  2. Try to charge against the legacy mandate from the Pro API sandbox, exceeding the cap
  3. Observe that it fails but with no way to tell why

I did note that there didn’t seem to be attributes of the mandate endpoint that would let me know when it could be charged and what the amount left to charge was, but it didn’t set off any alarm bells. Perhaps it should have.

Also I will admit I’ve had years to switch to the Pro API and am only doing it now when forced. Perhaps if I had made a start on this years ago, I’d have noted what I consider to be a deficiency, asked them to remedy it, and they might have had time to do so. I don’t actually think it’s likely they would bump the API version for that, though. In my defence, as I mentioned, there is nothing attractive about the Pro API for my use, and it does cost more, so it’s no surprise I’ve been reluctant to explore it.

So, if you are scrambling to update your GoCardless integration before 31 October, do check that you are prepared for payments against capped mandates to fail.

Removing dead tracks from your Banshee library

I used MusicBrainz Picard to reorganize hundreds of the files in my music collection. For some reason Banshee spotted that new files appeared, but it didn’t remove the old ones from the library. My library had hundreds of entries relating to files that no longer existed on disk.

I would have hoped that Banshee would just skip past a playlist entry whose file didn’t exist, but unfortunately (as of v2.4.1 in Ubuntu 12.04) it decides to stop playing at that point. I have to switch to it and double click on the next playlist entry, and even then it thinks it is playing the file that it tried to play before, so I have to double click again.

I have filed a bug about this.

In the meantime I wanted to remove all the dead entries from my library. I could delete my library and re-import it all, but I wanted to keep some of the metadata. I knew it was an SQLite database, so I poked around a bit and came up with this. It seems to work, so maybe it will be useful for someone else.
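The general idea is simple enough to sketch. Something like the following should do it, assuming Banshee 2.x with its library in ~/.config/banshee-1/banshee.db and tracks in a CoreTracks table keyed by TrackID with a file:// URI in the Uri column (quit Banshee and back the database up first):

use strict;
use warnings;
use DBI;
use URI::Escape qw(uri_unescape);

my $db  = "$ENV{HOME}/.config/banshee-1/banshee.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$db", '', '', { RaiseError => 1 });

my $tracks = $dbh->selectall_arrayref(
    q{SELECT TrackID, Uri FROM CoreTracks WHERE Uri LIKE 'file://%'}
);
my $del = $dbh->prepare(q{DELETE FROM CoreTracks WHERE TrackID = ?});

for my $track (@$tracks) {
    my ($id, $uri) = @$track;
    # Turn the file:// URI back into a filesystem path.
    (my $path = uri_unescape($uri)) =~ s{^file://}{};
    next if -e $path;
    print "removing dead entry: $path\n";
    $del->execute($id);
    # You may also want to clean up CorePlaylistEntries rows that
    # reference this TrackID.
}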

Scanning for open recursive DNS resolvers

A few days ago we unfortunately received some abuse reports regarding customers whose DNS resolvers were being abused in order to participate in a distributed denial of service attack.

Amongst other issues, DNS servers which are misconfigured to allow arbitrary hosts to do recursive queries through them can be used by attackers to direct an amplified flood of traffic at a forged source address.
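If you just want to check a single host by hand, a quick Net::DNS test like this will do it (the target address and query name below are only placeholders):

use strict;
use warnings;
use Net::DNS;

my $target = '192.0.2.53';    # placeholder
my $res    = Net::DNS::Resolver->new(
    nameservers => [$target],
    recurse     => 1,
    udp_timeout => 5,
);

# Ask it to recursively resolve a name it isn't authoritative for.
my $reply   = $res->send('www.example.com', 'A');
my @answers = $reply ? $reply->answer : ();

if ($reply && $reply->header->ra && @answers) {
    print "$target looks like an open recursive resolver\n";
} else {
    print "$target refused, or didn't answer recursively\n";
}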

I try to scan our address space reasonably often but I must admit I hadn’t done so for some time. I kicked off another scan and found one more customer with a misconfigured resolver, which has since been fixed.

After mentioning that I would do a scan I was asked how I do that.

I use a Perl script I’ve hacked together over the last couple of years. I took a few minutes to tidy it up and add a small amount of documentation (run it with --man to read that), so here it is in case anyone finds it useful:

Update: This code has now been moved to GitHub. If you have any comments, problems or improvements please do submit them as an issue and I will get to it much quicker. The gist below is now out of date so please don’t use it.

Using the default of 100 concurrent queries it scans a /21 in about 80 seconds (YMMV depending upon how many hosts you have that firewall 53/UDP). That scales more or less linearly with the number of concurrent queries, so using -q 200 for example will cut it down to about 40 seconds. It’s only a select loop though, so it’ll use more CPU if you do that.

Two things I’ve noticed since:

  • It doesn’t handle failing to create a socket with bgsend, so for example if you run up against your limit of file descriptors (commonly ~1024 on Linux) the whole thing will get stuck at 100% CPU.
  • One person reported a similar situation (bgsend fails, stuck at 100% CPU) when they allowed it to try to send to a broadcast address. I haven’t been able to replicate that one yet.

Converting an IPv6 address to its reverse zone in Perl

I need to work out the IPv6 reverse zone for a given IPv6 CIDR prefix, that is, an address with the number of network bits on the end after a forward slash, e.g.:

  • 2001:ba8:1f1:f004::/64 → 4.0.0.f.1.f.1.0.8.a.b.0.1.0.0.2.ip6.arpa
  • 4:2::/32 → 2.0.0.0.4.0.0.0.ip6.arpa
  • 2001:ba8:1f1:400::/56 → 0.0.4.0.1.f.1.0.8.a.b.0.1.0.0.2.ip6.arpa

I had a quick look for a module that does it, but couldn’t find one, so I hacked this subroutine together:

Is there a more elegant way? Is there a module I can replace this with?

Must support:

  • Arbitrary prefix length
  • Use of ‘::’ anywhere legal in the address
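For reference, one way of doing it looks roughly like this (a sketch, not necessarily the subroutine above; it leans on Socket’s inet_pton to handle the ‘::’ expansion and assumes the prefix length falls on a nibble boundary):

use strict;
use warnings;
use Socket qw(inet_pton AF_INET6);

sub cidr_to_ip6_arpa {
    my ($cidr) = @_;
    my ($addr, $len) = split m{/}, $cidr;

    my $packed = inet_pton(AF_INET6, $addr)
        or die "'$addr' doesn't look like an IPv6 address\n";

    # 32 hex nibbles, most significant first.
    my @nibbles = split //, unpack('H32', $packed);

    # Keep one nibble per 4 bits of network prefix, reverse them and
    # append the ip6.arpa suffix.
    return join('.', reverse @nibbles[0 .. ($len / 4) - 1]) . '.ip6.arpa';
}

print cidr_to_ip6_arpa('2001:ba8:1f1:f004::/64'), "\n";
# 4.0.0.f.1.f.1.0.8.a.b.0.1.0.0.2.ip6.arpa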

Adding watermarks to a PDF with Perl’s PDF::API2

For ages I’ve been trying to work out how to programmatically add a watermark to an existing PDF using the Perl PDF::API2 module.

In case it’s useful for anyone else, here goes:

use warnings;
use strict;

use PDF::API2;

my $ifile = "/some/file.pdf";
my $ofile = "/some/newfile.pdf";
my $pdf   = PDF::API2->open($ifile);
my $fnt   = $pdf->ttfont('verdana.ttf', -encode => 'utf8');
# Only on the front page
my $page  = $pdf->openpage(1);
my $text  = $page->text;

$page->mediabox('A4');

# Create two extended graphic states; one for transparent content
# and one for normal content
my $eg_trans = $pdf->egstate();
my $eg_norm  = $pdf->egstate();

$eg_trans->transparency(0.7);
$eg_norm->transparency(0);

# Go transparent
$text->egstate($eg_trans);
# Black edges
$text->strokecolor('#000000');
# With light green fill
$text->fillcolor('#d3e6d2');
# Text render mode: fill text first then put the edges on top
$text->render(2);
# Diagonal transparent text
$text->textlabel(55, 235, $fnt, 56, "i'm in ur pdf", -rotate => 30);
# Back to non-transparent to do other stuff
$text->egstate($eg_norm);

# …

# Save
$pdf->saveas($ofile);
$pdf->end;
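If you wanted the watermark on every page rather than just the front, you could loop over the pages before saving, something along these lines (an untested sketch, reusing the $fnt and graphics states set up above):

for my $pagenum (1 .. $pdf->pages) {
    my $p = $pdf->openpage($pagenum);
    my $t = $p->text;
    $t->egstate($eg_trans);
    $t->strokecolor('#000000');
    $t->fillcolor('#d3e6d2');
    $t->render(2);
    $t->textlabel(55, 235, $fnt, 56, "i'm in ur pdf", -rotate => 30);
    $t->egstate($eg_norm);
}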

I think there may be better ways of doing this within Perl now, but I already had some code using PDF::API2.

Copying block devices between machines

Having a bunch of Linux servers that run Linux virtual machines, I often find myself having to move a virtual machine from one server to another. The tricky thing is that I’m not in a position to be using shared storage, i.e. the virtual machines’ storage is local to the machine they are running on. So the data has to be moved first.

A naive approach ^

The naive approach is something like the following:

  1. Ensure that I can SSH as root, using an SSH key, from the source host to the destination host.
  2. Create a new LVM logical volume on the destination host that’s the same size as the source volume.
  3. Shut down the virtual machine.
  4. Copy the data across using something like this:
    $ sudo dd bs=4M if=/dev/mapper/myvg-src_lv |
      sudo ssh root@dest-host 'dd bs=4M of=/dev/mapper/myvg-dest_lv'
    
  5. While that is copying, do any other configuration transfer that’s required.
  6. When it’s finished, start up the virtual machine on the destination host.

I also like to stick pv in the middle of that pipeline so I get a nice text mode progress bar (a bit like what you see with wget):

$ sudo dd bs=4M if=/dev/mapper/myvg-src_lv | pv -s 10g |
  sudo ssh root@dest-host 'dd bs=4M of=/dev/mapper/myvg-dest_lv'

The above transfers data between hosts via ssh, which will introduce some overhead since it will be encrypting everything. You may or may not wish to force it to do compression, or pipe it through a compressor (like gzip) first, or even avoid ssh entirely and just use nc.

Personally I don’t care about the ssh overhead; on the whole this is customer data and I’m happier if it’s encrypted. I also don’t bother compressing it unless it’s going over the Internet. Over a gigabit LAN I’ve found it fastest to use ssh with the -c arcfour option.

The above process works, but it has some fairly major limitations:

  1. The virtual machine needs to be shut down for the whole time it takes to transfer data from one host to another. For 10GiB of data that’s not too bad. For 100GiB of data it’s rather painful.
  2. It transfers the whole block device, even the empty bits. For example, if it’s a 10GiB block device with 2GiB of data on it, 10GiB still gets transferred.

Limitation #2 can be mitigated somewhat by compressing the data. But we can do better.

LVM snapshots ^

One of the great things about LVM is snapshots. You can take a snapshot of a virtual machine’s logical volume while it is still running, and transfer that using the above method.

But what do you end up with? A destination host with an out of date copy of the data on it, and a source host that is still running a virtual machine that’s still updating its data. How to get just the differences from the source host to the destination?

Again there is a naive approach, which is to shut down the virtual machine and mount the logical volume on the host itself, do the same on the destination host, and use rsync to transfer the differences.

This will work, but again has major issues such as:

  1. It’s technically possible for a virtual machine admin to maliciously construct a filesystem that interferes with the host that mounts it. Mounting random filesystems is risky.
  2. Even if you’re willing to risk the above, you have to guess what the filesystem is going to be. Is it ext3? Will it have the same options that your host supports? Will your host even support whatever filesystem is on there?
  3. What if it isn’t a filesystem at all? It could well be a partitioned disk device, which you can still work with using kpartx, but it’s a major pain. Or it could even be a raw block device used by some tool you have no clue about.

The bottom line is, it’s a world of risk and hassle interfering with the data of virtual machines that you don’t admin.

Sadly rsync doesn’t support syncing a block device. There’s a --copy-devices patch that allows it to do so, but after applying it I found that while it can now read from a block device, it would still only write to a file.

Next I found a --write-devices patch by Darryl Dixon, which provides the other end of the functionality – it allows rsync to write to a block device instead of files in a filesystem. Unfortunately no matter what I tried, this would just send all the data every time, i.e., it was no more efficient than just using dd.

Read a bit, compare a bit ^

While searching about for a solution to this dilemma, I came across this horrendous and terrifying bodge of shell and Perl on serverfault.com:

ssh -i /root/.ssh/rsync_rsa $remote "
  perl -'MDigest::MD5 md5' -ne 'BEGIN{\$/=\1024};print md5(\$_)' $dev2 | lzop -c" |
  lzop -dc | perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\1024};$b=md5($_);
    read STDIN,$a,16;if ($a eq $b) {print "s"} else {print "c" . $_}' $dev1 | lzop -c |
ssh -i /root/.ssh/rsync_rsa $remote "lzop -dc |
  perl -ne 'BEGIN{\$/=\1} if (\$_ eq\"s\") {\$s++} else {if (\$s) {
    seek STDOUT,\$s*1024,1; \$s=0}; read ARGV,\$buf,1024; print \$buf}' 1<> $dev2"

Are you OK? Do you need to have a nice cup of tea and a sit down for a bit? Yeah. I did too.

I’ve rewritten this thing into a single Perl script so it’s a little bit more readable, but I’ll attempt to explain what the above abomination does.

Even though I do refer to this script in unkind terms like “abomination”, I will be the first to admit that I couldn’t have come up with it myself, and that I’m not going to show you my single Perl script version because it’s still nearly as bad. Sorry!

It connects to the destination host and starts a Perl script which begins reading the block device over there, 1024 bytes at a time, running that through md5 and piping the output to a Perl script running locally (on the source host).

The local Perl script is reading the source block device 1024 bytes at a time, doing md5 on that and comparing it to the md5 hashes it is reading from the destination side. If they’re the same then it prints “s” otherwise it prints “c” followed by the actual data from the source block device.

The output of the local Perl script is fed to a third Perl script running on the destination. It takes the sequence of “s” or “c” as instructions on whether to skip 1024 bytes (“s”) of the destination block device or whether to take 1024 bytes of data and write it to the destination block device (“c<1024 bytes of data>“).

The lzop bits are just doing compression and can be changed for gzip or omitted entirely.

Hopefully you can see that this is behaving like a very very dumb version of rsync.
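Stripped of the ssh and lzop plumbing, the core of the idea fits in a short local sketch (the device paths are made up, and both devices are assumed to be visible on one host purely for illustration; in the real thing the md5s and the changed blocks are what cross the network):

use strict;
use warnings;
use Digest::MD5 qw(md5);

my ($src, $dst) = ('/dev/vg0/src_snapshot', '/dev/vg0/dst_lv');
my $blocksize = 1024 * 1024;    # 1MiB rather than the original 1024 bytes

open my $in,  '<:raw',  $src or die "open $src: $!";
open my $out, '+<:raw', $dst or die "open $dst: $!";

my ($offset, $changed) = (0, 0);
while (1) {
    my $len = read $in, my $sbuf, $blocksize;
    die "read $src: $!" unless defined $len;
    last unless $len;

    seek $out, $offset, 0 or die "seek $dst: $!";
    read $out, my $dbuf, $len;
    $dbuf = '' unless defined $dbuf;

    # Locally we could just compare the data directly; the md5s are
    # what the real version exchanges over the network.
    if (length($dbuf) != $len or md5($sbuf) ne md5($dbuf)) {
        seek $out, $offset, 0 or die "seek $dst: $!";
        print {$out} $sbuf or die "write $dst: $!";
        $changed++;
    }
    $offset += $len;
}
close $out or die "close $dst: $!";
print "rewrote $changed block(s) of up to $blocksize bytes\n";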

The thing is, it works really well. If you’re not convinced, run md5sum (or sha1sum or whatever you like) on both the source and destination block devices to verify that they’re identical.

The process now becomes something like:

  1. Take an LVM snapshot of the virtual machine’s block device while the virtual machine is still running.
  2. Create suitable logical volume on destination host.
  3. Use dd to copy the snapshot volume to the destination volume.
  4. Move over any other configuration while that’s taking place.
  5. When the initial copy is complete, shut down the virtual machine.
  6. Run the script of doom to sync over the differences from the real device to the destination.
  7. When that’s finished, start up the virtual machine on the destination host.
  8. Delete snapshot on source host.

1024 bytes seemed like rather a small buffer to be working with so I upped it to 1MiB.

I find that on a typical 10GiB block device there might only be a few hundred MiB of changes between snapshot and virtual machine shut down. The entire device does have to be read through of course, but the down time and data transferred is dramatically reduced.

There must be a better way ^

Is there a better way to do this, still without shared storage?

It’s getting difficult to sell the disk capacity that comes with the number of spindles I need for performance, so maybe I could do something with DRBD so that there’s always another server with a copy of the data?

This seems like it should work, but I’ve no experience of DRBD. Presumably the active node would have to be using the /dev/drbdX devices as disks. Does DRBD scale to having say 100 of those on one host? It seems like a lot of added complexity.

I’d love to hear any other ideas.

Setting envelope sender when using Mail::Send

I just thought I would blog this so that people can find it in a search, as I could not when I tried.

I have a Perl script on a machine that sends out transactional emails with a from address like billing@example.com. The machine it sends from is not example.com. Let’s call it admin.foobar.example.com. Since some misguided people think that sender verification callouts are a good idea, their MX hosts try to connect to admin.foobar to see if the billing address is there and accepting mail. This fails because admin.foobar does not run a public MTA, and I do not want it to run one either.

It’s clear that the correct thing is for the script to set its envelope sender to billing@example.com, since billing@example.com should be an email address that does actually work! How to do that with Perl’s Mail::Send, however, is not so clear.

After trying to work this out myself for a few hours I asked dg, who is not just a perl monger but a veritable perl Tesco, making mere perl mongers compromise on their margins and putting many of them completely out of business. Every little helps.

Based on his reading of the source of Mail/Util.pm, he thought it might be accomplished by setting the MAILADDRESS environment variable. He was wrong however:


<dg> oh, that only works for SMTP
<dg> how crap

The next suggestion was a no-go also, but third time lucky:


<dg> actually, $msg->open("sendmail", "-f ...");

and that works. So there you go.
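Put together, a minimal version of the fix looks something like this (the addresses are made up; the important part is passing -f through to sendmail so the envelope sender is a real, deliverable address):

use strict;
use warnings;
use Mail::Send;

my $msg = Mail::Send->new(
    Subject => 'Your invoice',
    To      => 'customer@example.net',
);
$msg->set('From', 'billing@example.com');    # header From:

# The 'sendmail' mailer passes extra arguments straight to the sendmail
# binary, so -f sets the envelope sender (return path).
my $fh = $msg->open('sendmail', '-f', 'billing@example.com');
print $fh "Thanks for your order.\n";
$fh->close or die "couldn't send message: $!";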

Thanks dg.