Recovering From an Exif Disaster

The Discovery ^

Sometime in late December (2019) I noticed that when I clicked on a tag in Shotwell, the photo management software that I use, it was showing either zero or hardly any matching photos when I knew for sure that there should be many more.

(When I say “tag” in this article it’s mostly going to refer to the type of tags you generally put on an image, i.e. the tags that identify who or what is in the image, what event it is associated with, the place it was taken, etc. Images can have many different kinds of tags containing all manner of metadata, but for the avoidance of doubt please assume that I don’t mean any of those.)

I have Shotwell set to store the tags in the image files themselves, in the metadata. There is a standard for this called Exif. What seems to have happened is that Shotwell had removed a huge number of tags from the files themselves. At the time of discovery I had around 15,500 photos in my collection and it looked like the only way to tell what was in them would be by looking at them. Disaster.

Here follow some notes about what I found out when trying to recover from this situation, in case it is ever useful for anyone.

Shotwell still had a visible tag hierarchy, so I could for example click on the “Pets/Remy” tag, but this brought up only one photo that I took on 14 December 2019. I’ve been taking photos of Remy for years so I knew there should be many more. Here’s Remy.

Remy at The Avenue Ealing Christmas Fair, December 2019

Luckily, I have backups.

Comparing Good and Bad Copies of a Photo ^

I knew this must have happened fairly recently because I’d have noticed quite quickly that photos were “missing”. I had a look for a recent photo that I knew I’d tagged with a particular thing, and then looked in the backups to see when it was last modified.

As an example I found a photo that was taken on 30 October 2019 that should have been tagged “Pets/Violet” but no longer was. It had been modified (but not by me) on 7 December 2019.

A broken photo of Violet

(Sorry about the text-as-images; I’m reconstructing this series of events from a Twitter thread, where things necessarily had to be posted as screenshots.)

What the above shows is that the version of the photo that existed on 30 October 2019 had the tags “Pets”, “Edna”, and “Violet” but then the version that was written on 7 December 2019 lost the “Violet” tag.

Here I used the exiftool utility to display EXIF tags from the photo files. You can do that like this:

$ exiftool -s $filename

Using egrep I limited this to the tag keys “Subject”, “Keywords”, and “TagsListLastKeywordXMP”, though including that last one was a slight mistake: “TagsListLastKeywordXMP” was a typo, is totally irrelevant and should be ignored.

“Subject” and “Keywords” were always identical for any photo I examined and contained the flattened list of tags. For example, in Shotwell that photo originally had the tags:

  • Pets/Edna
  • Pets/Violet

It seems that Shotwell flattens that to:

  • Pets
  • Edna
  • Violet

and then stores it in “Subject” and “Keywords”.

The tags with hierarchy are actually in the key “TagsList” like:

  • Pets
  • Pets/Edna
  • Pets/Violet

Fixing One Photo ^

I tested stuffing the tag “Violet” back into this file under the keys “Subject” and “Keywords”:

$ exiftool -keywords+="…" -subject+="…" $filename

Stuffing the Violet tag back in

This shows that the “Violet” tag is now back in the current version of the file. After restarting Shotwell and doing a free text search for “Violet”, this photo now shows up whereas before it did not. It still did not show up when I clicked on “Pets/Violet” in the tag hierarchy however. It was then that I realised I also needed to put “Pets/Violet” into the “TagsList” key.

I ended up using a script to do this in bulk, but for an individual file I think you should be able to do it like this:

$ exiftool -keywords+=Violet -subject+=Violet -TagsList+=Pets/Violet $filename

After restarting Shotwell I was able to click on the “Pets/Violet” tag and see this photo.

Fixing All the Photos? ^

My process to recover from this, then, was to compile a list of each file that had been modified at the suspected time of disaster, and for each:

  1. Read the list of tags from “Keywords”
  2. Read the list of tags from “Subject”
  3. De-duplicate them and store them as $keywords
  4. Read the list of tags from “TagsList” and store them as $tagslist
  5. Stuff $keywords back into both “Subject” and “Keywords” of the current version of the file
  6. Stuff $tagslist back into “TagsList” of the current version of the file

Gulp.

Which files were tampered with? ^

It was relatively easy to work out which files had been screwed with, because thankfully I didn’t make any other photo modifications on 7 December 2019. So any photo that got modified that day was probably a candidate.

I haven’t mentioned what actually caused this problem yet. I don’t know exactly. At 16:53 on 7 December 2019 I was importing some photos into Shotwell, and I do seem to recall it crashed at some point, either while I was doing that or shortly after.

The photos from that import and all others afterwards had retained their tags correctly, but many that existed prior to that time seemed to be missing some or all tags. I have no idea why such a crash would cause Shotwell to do that but that must have been what did it.

Running the following against my backups identified 3,721 files that had been modified on 7 December 2019:

$ cd weekly.2/specialbrew.21tc.bitfolk.com/srv/tank/Photos/Andy
$ find . -type f \
  -newermt "2019-12-07 00:00:00" \! \
  -newermt "2019-12-07 23:59:59" > ~/busted.txt

The next thing I did was to check that each of these file paths still existed in the current photo store and in the known-good backups (weekly.3).
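
That check is trivial to script. A sketch is below; the known-good backup path is the weekly.3 equivalent of the directory the find was run in, while the current photo store location is a guess, so adjust both to taste.

use warnings;
use strict;

# Read busted.txt on stdin and complain about any path that is missing
# from either location. The second root is a hypothetical example.
my @roots = (
    'weekly.3/specialbrew.21tc.bitfolk.com/srv/tank/Photos/Andy',
    '/srv/tank/Photos/Andy',    # hypothetical current photo store
);

while (my $path = <STDIN>) {
    chomp $path;
    $path =~ s{^\./}{};    # find(1) output starts with "./"
    for my $root (@roots) {
        print "$root/$path is missing\n" unless -e "$root/$path";
    }
}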

Extract tags from known-good copies ^

Next up, I wrote a script which:

  1. Goes to the known-good copies of the files
  2. Extracts the Subject and Keywords and deduplicates them
  3. Extracts the TagsList
  4. Writes it all into a hash
  5. Dumps that out as a YAML file

All scripts mentioned here use the Perl module Image::ExifTool, which is part of the exiftool package.

backup_host$ ./gather_tags.pl < ~/busted.txt > ~/tags.yaml
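
gather_tags.pl itself isn’t reproduced here, but a rough sketch of the same idea, using Image::ExifTool and (as an assumption on my part) YAML::XS for the output, might look something like this when run from the root of the backup copy of the photos:

use warnings;
use strict;

use Image::ExifTool;
use YAML::XS qw(Dump);

# Not the real gather_tags.pl: reads file paths on stdin (as produced by
# the find command above) and dumps their tags as YAML on stdout.
my %tags;

while (my $file = <STDIN>) {
    chomp $file;
    $file =~ s{^\./}{};

    my $et = Image::ExifTool->new;
    $et->Options(List => 1);    # return list-type tags as array refs
    my $info = $et->ImageInfo($file, qw(Subject Keywords TagsList));

    # Flatten Subject and Keywords together and de-duplicate them.
    my %seen;
    my @keywords = sort grep { !$seen{$_}++ }
                   map  { ref $_ eq 'ARRAY' ? @$_ : $_ }
                   grep { defined }
                   @{$info}{qw(Subject Keywords)};

    my $tl = $info->{TagsList};
    my @tagslist = ref $tl eq 'ARRAY' ? @$tl
                 : defined $tl        ? ($tl)
                 :                      ();

    $tags{$file} = { keywords => \@keywords, tagslist => \@tagslist };
}

print Dump(\%tags);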

tags.yaml looks a bit like this:

---
2011/01/16/16012011163.jpg:
  keywords:
  - Hatter
  - Pets
  tagslist:
  - Pets
  - Pets/Hatter
[…]
2019/11/29/20191129_095218~2.jpg:
  keywords:
  - Bedfont Lakes
  - Feltham
  - London
  - Mandy
  - Pets
  - Places
  tagslist:
  - Pets
  - Pets/Mandy
  - Places
  - Places/London
  - Places/London/Feltham
  - Places/London/Feltham/Bedfont Lakes

Stuff tags back into current versions of photos ^

After transferring tags.yaml back to my home fileserver it was time to use it to stuff the tags back into the files that had lost them.

One thing to note while doing this is that if you just add a tag, it gets added even if the same tag is already present, leading to duplicates. I thought it best to delete each tag first and then add it again so that there would only be one instance of each.

I called that one fix_tags.pl.

$ ./fix_tags.pl tags.yaml
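
Again, the real fix_tags.pl isn’t shown, but a sketch of the idea is below. Rather than deleting and re-adding each tag as described above, this version just sets the whole list in one go with SetNewValue, which (as far as I can tell) replaces whatever is there and so also avoids the duplicates. Run it from the root of the current photo store.

use warnings;
use strict;

use Image::ExifTool;
use YAML::XS qw(LoadFile);

# Not the real fix_tags.pl: takes the YAML file from gather_tags.pl and
# writes the tags back into the current copies of the files.
my $yaml = shift @ARGV or die "usage: $0 tags.yaml\n";
my $tags = LoadFile($yaml);

for my $file (sort keys %$tags) {
    my $et = Image::ExifTool->new;

    # Setting the whole list at once replaces what is already there.
    $et->SetNewValue(Subject  => $tags->{$file}{keywords});
    $et->SetNewValue(Keywords => $tags->{$file}{keywords});
    $et->SetNewValue(TagsList => $tags->{$file}{tagslist});

    my $result = $et->WriteInfo($file);
    warn "$file: " . ($et->GetValue('Error') // 'unknown error') . "\n"
        if $result == 0;
}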

Profit! Or, only slight loss, I guess ^

16m53s of runtime later, it had completed its work… 🙌 2020 will definitely be the year of Linux on the desktop¹.

¹ As long as you know how to manipulate EXIF tags from a programming language and have a functioning backup system and even then don’t mind losing some stuff

Losing some stuff…? ^

Unfortunately there were some things I couldn’t restore. It was at this point that I discovered that Shotwell does not ever put tags into video files (even though they do support EXIF tags…)

That means that the only record of the tags on a video file is in Shotwell’s own database, which I did not back up as I didn’t think I needed to.

Getting Tags Out of Shotwell ^

I am now backing that up, but should this sort of thing happen in the future I’d need to know how to manipulate the tags for videos in Shotwell’s database.

Shotwell’s database is an SQLite file that’s normally at $HOME/.local/share/shotwell/data/photo.db. I’m fairly familiar with SQLite so I had a poke around, but couldn’t immediately see how these tags were stored. I had to ask on the Shotwell mailing list.

Here’s how Shotwell does it. There’s a table called TagTable which stores the name of each tag and a comma-separated list of every photo/video which matches it:

sqlite> .schema TagTable 
CREATE TABLE TagTable (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL, photo_id_list TEXT, time_created INTEGER);

The photo_id_list column holds the comma-separated list. Each item in the list is of the form:

  1. “thumb” or “video-” depending on whether the item is a photo or a video
  2. 16 hex digits, zero padded, which is the ID value from the PhotoTable or VideoTable for that item
  3. a comma

Full example of extracting tags for the video file 2019/12/31/20191231_121604.mp4:

$ sqlite3 /home/andy/.local/share/shotwell/data/photo.db
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
sqlite> SELECT id
        FROM VideoTable
        WHERE filename LIKE '%20191231%';
553
sqlite> SELECT printf("%016x", 553);
0000000000000229
sqlite> SELECT name
        FROM TagTable
        WHERE photo_id_list LIKE '%video-0000000000000229,%';
/Places
/Places/London
/Places/London/Feltham
/Pets
/Places/London/Feltham/Bedfont Lakes
/Pets/Marge
/Pets/Mandy

If that is not completely clear:

  • The ID for that video file is 553
  • 553 in hexadecimal is 229
  • Pad that to 16 digits, add “video-” at the front and “,” at the end (even the last item in the list has a comma at the end)
  • Search for that string in photo_id_list
  • If a row matches then the name column is a tag that is attached to that file
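
Putting those steps together, a small script (a sketch, not something I’d ship; the script name is made up) to print the tags for a given video might look like this:

use warnings;
use strict;

use DBI;

# Look up a video by part of its filename and print every tag attached
# to it, using the TagTable scheme described above.
my $like = shift @ARGV or die "usage: $0 <part-of-filename>\n";

my $db  = "$ENV{HOME}/.local/share/shotwell/data/photo.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$db", '', '', { RaiseError => 1 });

my ($id) = $dbh->selectrow_array(
    q{SELECT id FROM VideoTable WHERE filename LIKE ?}, undef, "%$like%"
);
die "no video matching '$like'\n" unless defined $id;

# "video-", 16 zero-padded hex digits, then the trailing comma.
my $needle = sprintf 'video-%016x,', $id;

my $tags = $dbh->selectcol_arrayref(
    q{SELECT name FROM TagTable WHERE photo_id_list LIKE ?},
    undef, "%$needle%"
);

print "$_\n" for @$tags;

Run with a unique part of the filename, e.g. ./video_tags.pl 20191231_121604, and it should print the same list of tags as the manual session above.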

I don’t exactly know how I would have identified which videos got messed with, but at least I would have had both versions of the database to compare, and I now know how I would do the comparison.

Should Tags Even Be In Photos? ^

During my Twitter thread it was suggested to me that tags should not be stored in photos, but only in the photo cataloging software, where they can be backed up along with everything else.

I disagree with this for several reasons:

  • Exif exists for the purpose of storing tags like this.

  • When I move my photos from one piece of software to another I want it to be able to read the tags. I don’t want to have to input them all over again. That would be unimaginably tedious.

    When I moved from F-Spot to Shotwell the fact that the tags were in the files saved me countless hours of work. It just worked on import.

    If there wasn’t a dedicated importer feature then it would be so much work that really the only way to do it would be to extract the tags from the database and insert them again programmatically, which is basically admitting that to change software you need to be an expert. That really isn’t how this should work.

  • If the only copy of my tags is in the internal database of a unique piece of cataloging software, then I have to become an expert on the internal data store of that piece of software. I don’t want to have to do that.

    I’ve been forced to do that here for Shotwell because of a deficiency of Shotwell in not storing video tags in the files. But if we’re only talking about photos then I could have avoided it, and could also avoid having to be an expert on every future piece of cataloging software.

  • Even if I’m not moving to a different cataloging solution, lots of software understands Exif and it’s useful to be able to query those things from other software.

    I regard it very much like artist, album, author, genre etc tags in the metadata of digital music and ebooks, all of which are in the files; you would not expect to have to reconstruct these out of the database of some other bit of software every time you wanted to use them elsewhere.

It was a mistake not to back up the Shotwell database though; I thought I did not need to, since I believed all tags were being stored in the files, and tags were the only things I cared about. As it happened, tags were not being stored in video files, and tags for video files only exist in Shotwell’s database.

Other Thoughts ^

Having backups was obviously a lifesaver here, especially as it took me ~3 weeks to notice.

Being able to manipulate them like a regular filesystem made things a lot more convenient, so that’s a property I will want to keep in whatever future backup arrangements I have.

I might very well switch to different photo management software now, assuming I could find any that I prefer, but all software has bugs. Whatever I switch to I would have to ensure that I knew how to extract the tags from that as well, if it doesn’t store them in the files.

I don’t want to store my photos and videos “in the cloud” but it is a shortcoming of Shotwell that I can basically only use it from my desktop at home. Its database does not support multiple or remote access. I wonder if there is some web-based thing that can just read (and cache) the tags out of the files, build dynamic galleries and allow arbitrary searches on them…

Shotwell’s database schema and its use of 16 hexadecimal digits (nibbles?) mean I can only store a maximum of 18,446,744,073,709,551,615 (2⁶⁴−1, roughly 1.8×10¹⁹) photos or videos of dogs. Arbitrary limits suck so much.

Greyhounds Marge, Janti and Will at Sainsbury’s Staines with Wimbledon Greyhound Welfare, December 2019

Google App Engine started requiring Content-Length header on POST requests

TL;DR ^

Update: It’s GoCardless who moved api.gocardless.com to Google Cloud. Google Cloud has behaved this way for years.

I think that Google App Engine may have recently started requiring every POST request to have a Content-Length header, even if there is no request body.

That will cause you problems if your library doesn’t add one for POST requests that have no content. Perl’s HTTP::Request is one such library.

You might be experiencing this if an API has just started replying to you with:

Error 411 (Length Required)!!1

411.That’s an error.

POST requests require a Content-length header. That’s all we know.

(Yes, the title does contain “!!1”.)

You can fix it by adding the header yourself, e.g.:

use LWP::UserAgent;
use HTTP::Request;
use JSON;

my $ua = LWP::UserAgent->new;

# $id and $params come from elsewhere in the surrounding code.
my $req = HTTP::Request->new(
    POST => "https://api.example.com/things/$id/actions/fettle"
);

$req->header('Accept' => 'application/json');
$req->content_type('application/json');

my $json;
$json = JSON->new->utf8->canonical->encode($params) if $params;

$req->content($json) if $json;
# Explicitly set Content-Length to zero as HTTP::Request doesn't add one
# when there's no content.
$req->header( 'Content-Length' => 0 ) unless $json;

my $res = $ua->request( $req );

This is a bit far outside of my comfort zone so I’m not sure if I’m 100% correct, but I do know that sending the header fixes things for me.

What happened? ^

Yesterday a BitFolk customer tried to cancel their Direct Debit mandate, and it didn’t work. The server logs contained the above message.

For Direct Debit payments we use the Perl module Business::GoCardless for integrating with GoCardless, but the additional HTML styling in the message (which I’ve left out for brevity) made clear that the message was coming from Google. api.gocardless.com is hosted on Google App Engine (or some bit of Google cloud anyway).

After a bit of debugging I established that HTTP::Request was only setting a Content-Length header when there was actually request content. The API call for cancelling a Direct Debit mandate is to send an empty POST to https://api.gocardless.com/mandates/$id/actions/cancel.

Adding Content-Length: 0 makes it work again.

When did it change? ^

There was a successful mandate cancellation on 25 October 2018, so some time between then and 12 December 2018. I haven’t looked for any change notice put out by Google as I’m not a Google Cloud user and wouldn’t know where to look.

Who’s to blame ^

I haven’t yet looked into whether the HTTP standard requires POST requests to have a Content-Length header. I welcome comments from someone who wants to do the digging.

Realistically even if it doesn’t and Google is just being overly strict, other servers might also be strict, so I guess HTTP::Request should always send the header.

Tricky issues when upgrading to the GoCardless “Pro” API

Background ^

Since 2012 BitFolk has been using GoCardless as a Direct Debit payment provider. On the whole it has been a pleasant experience:

  • Their API is a pleasure to integrate against, having excellent documentation
  • Their support is responsive and knowledgeable
  • Really good sandbox environment with plenty of testing tools
  • The fees, being 1% capped at £2.00, are pretty good for any kind of payment provider (much less than PayPal, Stripe, etc.)

Of course, if I was submitting Direct Debits myself there would be no charge at all, but BitFolk is too small and my bank (Barclays) are not interested in talking to me about that.

The “Pro” API ^

In September 2014 GoCardless came out with a new version of their API called the “Pro API”. It made a few things nicer but didn’t come with any real new features applicable to BitFolk, and also added a minimum fee of £0.20.

The original API I’d integrated against has a 1% fee capped at £2.00, and as BitFolk’s smallest plan is £10.79 including VAT the fee would generally be £0.11. Having a £0.20 fee on these payments would represent nearly a doubling of fees for many of my payments.

So, no compelling reason to use the Pro API.

Over the years, GoCardless made more noise about their Pro API and started calling their original API the “legacy API”. I could see the way things were going. Sure enough, eventually they announced that the legacy API would be disabled on 31 October 2017. No choice but to move to the Pro API now.

Payment caps ^

There aren’t normally any limits on Direct Debit payments. When you let your energy supplier or council or whatever do a Direct Debit, they can empty your bank account if they like.

The Direct Debit Guarantee has very strong provisions in it for protecting the payer: essentially if you dispute anything, any time, you get your money back without question and the supplier has to pursue you for the money by other means if they still think the charge was correct. A company that repeatedly gets Direct Debit chargebacks is going to be kicked off the service by their bank or payment provider.

The original GoCardless API had the ability to set caps on the mandate which would be enforced their side. A simple “X amount per Y time period”. I thought that this would provide some comfort to customers who may not be otherwise familiar with authorising Direct Debits from small companies like BitFolk, so I made use of that feature by default.

This turned out to be a bad decision.

The main problem with this was that there was no way to change the cap. If a customer upgraded their service then I’d have to cancel their Direct Debit mandate and ask them to authorise a new one because it would cease being possible to charge them the correct amount. Authorising a new mandate was not difficult—about the same amount of work as making any sort of online payment—but asking people to do things is always a pain point.

There was a long-standing feature request with GoCardless to implement some sort of “follow this link to authorise the change” feature, but it never happened.

Payment caps and the new API ^

The Pro API does not support mandates with a capped amount per interval. Given that I’d already established that using such caps was a mistake, I wasn’t too bothered by that.

I’ve since discovered however that the Pro API not only does not support setting the caps, it does not have any way to query them either. This is bad because I need to use the Pro API with mandates that were created in the legacy API. And all of those have caps.

Here’s the flow I had using the legacy API.

Legacy payment process

This way if the charge was coming a little too early, I could give some latitude and let it wait a couple of days until it could be charged. I’d also know if the problem was that the cap was too low. In that case there would be no choice but to cancel the customer’s mandate and ask them to authorise another one, but at least I would know exactly what the problem was.

With the Pro API, there is no way to check timings and charge caps. All I can do is make the charge, and then if it’s too soon or too much I get the same error message:

“Validation failed / exceeds mandate cap”

That’s it. It doesn’t tell me what the cap is, it doesn’t tell me if it’s because I’m charging too soon, nor if I’m charging too much. There is no way to distinguish between those situations.

Backwards compatible – sort of ^

GoCardless talk about the Pro API being backwards compatible with the legacy API, so that once switched I would still be able to create payments against mandates that were created using the legacy API. I would not need to get customers to re-authorise.

This is true to a point, but my use of caps per interval in the legacy API has severely restricted how compatible things are, and that’s something I wasn’t aware of. Sure, their “Guide to upgrading” does briefly mention that caps would continue to be enforced:

“Pre-authorisation mandates are not restricted, but the maximum amount and interval that you originally specified will still apply.”

That is the only mention of this issue in that entire document, and that statement would have been fine by me if there had continued to be a way to tell which failure mode would be encountered.

Thinking that I was just misunderstanding, I asked GoCardless support about this. Their reply:

Thanks for emailing.

I’m afraid the limits aren’t exposed within the new API. The only solution as you suggest, is to try a payment and check for failure.

Apologies for the inconvenience caused here and if you have any further queries please don’t hesitate to let us know.

What now? ^

I am not yet sure of the best way to handle this.

The nuclear option would be to cancel all mandates and ask customers to authorise them again. I would like to avoid this if possible.

I am thinking that most customers will continue to be fine on the “amount per interval” legacy mandates as long as they don’t upgrade, so I can leave them as they are until that happens. If they upgrade, or if a DD payment ever fails with “exceeds mandate cap”, then I will have to cancel their mandate and ask them to authorise again. I can see if their mandate was created before ~today and advise them on the web site to cancel it and authorise it again.

Conclusion ^

I’m a little disappointed that GoCardless didn’t think that there would need to be a way to query mandate caps even though creating new mandates with those limits is no longer possible.

I can’t really accept that there is a good level of backwards compatibility here if there is a feature that you can’t even tell is in use until it causes a payment to fail, and even then you can’t tell which details of that feature cause the failure.

I understand why they haven’t just stopped honouring the caps: it wouldn’t be in line with the consumer-focused spirit of the Direct Debit Guarantee to alter things against customer expectations, and even sending out a notification to the customer might not be enough. I think they should have gone the other way and allowed querying of things that they are going to continue to enforce, though.

Could I have tested for this? Well, the difficulty there is that the GoCardless sandbox environment for the Pro API starts off clean, with no access to any of your legacy activity from either the live environment or the legacy sandbox. So I couldn’t do something like the following:

  1. Create legacy mandate in legacy sandbox, with amount per interval caps
  2. Try to charge against the legacy mandate from the Pro API sandbox, exceeding the cap
  3. Observe that it fails but with no way to tell why

I did note that there didn’t seem to be attributes of the mandate endpoint that would let me know when it could be charged and what the amount left to charge was, but it didn’t set off any alarm bells. Perhaps it should have.

Also I will admit I’ve had years to switch to Pro API and am only doing it now when forced. Perhaps if I had made a start on this years ago, I’d have noted what I consider to be a deficiency, asked them to remedy it and they might have had time to do so. I don’t actually think it’s likely they would bump the API version for that though. In my defence, as I mentioned, there is nothing attractive about the Pro API for my use, and it does cost more, so no surprise I’ve been reluctant to explore it.

So, if you are scrambling to update your GoCardless integration before 31 October, do check that you are prepared for payments against capped mandates to fail.

Removing dead tracks from your Banshee library

I used MusicBrainz Picard to reorganize hundreds of the files in my music collection. For some reason Banshee spotted that new files appeared, but it didn’t remove the old ones from the library. My library had hundreds of entries relating to files that no longer existed on disk.

I would have hoped that Banshee would just skip when it got to a playlist entry that didn’t exist, but unfortunately (as of v2.4.1 in Ubuntu 12.04) it decides to stop playing at that point. I have to switch to it and double click on the next playlist entry, and even then it thinks it is playing the file that it tried to play before. So I have to double click again.

I have filed a bug about this.

In the meantime I wanted to remove all the dead entries from my library. I could delete my library and re-import it all, but I wanted to keep some of the metadata. I knew it was an SQLite database, so I poked around a bit and came up with this. It seems to work, so maybe it will be useful for someone else.
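
The gist with what I actually ran isn’t reproduced here. As a rough Perl sketch of the same idea (quit Banshee and take a copy of the database first; the database location and the CoreTracks/Uri names are my assumptions about Banshee’s schema):

use warnings;
use strict;

use DBI;
use URI::Escape qw(uri_unescape);

# Assumptions: Banshee 2.x keeps its library in ~/.config/banshee-1/banshee.db
# and local tracks live in CoreTracks with a file:// URI in the Uri column.
my $db  = "$ENV{HOME}/.config/banshee-1/banshee.db";
my $dbh = DBI->connect("dbi:SQLite:dbname=$db", '', '', { RaiseError => 1 });

my $rows = $dbh->selectall_arrayref(
    q{SELECT TrackID, Uri FROM CoreTracks WHERE Uri LIKE 'file://%'}
);

my $del = $dbh->prepare(q{DELETE FROM CoreTracks WHERE TrackID = ?});

for my $row (@$rows) {
    my ($id, $uri) = @$row;
    (my $path = $uri) =~ s{^file://}{};
    $path = uri_unescape($path);
    next if -e $path;    # file still exists; keep the library entry
    print "removing dead track: $path\n";
    $del->execute($id);
}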

Scanning for open recursive DNS resolvers

A few days ago we unfortunately received some abuse reports regarding customers whose DNS resolvers were being used to participate in a distributed denial of service attack.

Amongst other issues, DNS servers which are misconfigured to allow arbitrary hosts to do recursive queries through them can be used by attackers to launch an amplified attack on a forged source address.

I try to scan our address space reasonably often but I must admit I hadn’t done so for some time. I kicked off another scan and found one more customer with a misconfigured resolver, which has since been fixed.

After mentioning that I would do a scan I was asked how I do that.

I use a Perl script I’ve hacked together over the last couple of years. I took a few minutes to tidy it up and add a small amount of documentation (run it with --man to read that), so here it is in case anyone finds it useful:

Update: This code has now been moved to GitHub. If you have any comments, problems or improvements please do submit them as an issue and I will get to it much quicker. The gist below is now out of date so please don’t use it.

Using the default 100 concurrent queries it scans a /21 in about 80 seconds (YMMV depending upon how many hosts you have that firewall 53/UDP). That scales sort of linearly with how many you do, so using -q 200 for example will cut that down to about 40 seconds. It’s only a select loop though so it’ll use more CPU if you do that.

Two things I’ve noticed since:

  • It doesn’t handle failing to create a socket with bgsend so for example if you run up against your limit of file descriptors (commonly ~1024 on Linux) the whole thing will get stuck at 100% CPU.
  • One person reporting a similar situation (bgsend fails, stuck at 100% CPU) when they allowed it to try to send to a broadcast address. I haven’t been ale to replicate that one yet.
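
If you just want to check a single host rather than scan a range, the core of the test is simply a recursive query with the RD bit set and a look at what comes back. A minimal, blocking sketch with Net::DNS (not the script above, which does its queries concurrently with bgsend) might be:

use warnings;
use strict;

use Net::DNS;

# Check whether a single IP address answers recursive queries. The name
# queried here is just an example; any name the target shouldn't be
# authoritative for will do.
my $target = shift @ARGV or die "usage: $0 <ip-address>\n";

my $res = Net::DNS::Resolver->new(
    nameservers => [$target],
    recurse     => 1,      # set the RD bit
    retry       => 1,
    udp_timeout => 3,
);

my $reply = $res->send('bitfolk.com', 'A');

if ($reply and $reply->header->ra and $reply->header->ancount) {
    print "$target looks like an open recursive resolver\n";
} else {
    print "$target did not answer recursively\n";
}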

Converting an IPv6 address to its reverse zone in Perl

I’m needing to work out the IPv6 reverse zone for a given IPv6 CIDR prefix, that is, a prefix with the number of bits in the network part given after a forward slash, e.g.:

  • 2001:ba8:1f1:f004::/64 → 4.0.0.f.1.f.1.0.8.a.b.0.1.0.0.2.ip6.arpa
  • 4:2::/32 → 2.0.0.0.4.0.0.0.ip6.arpa
  • 2001:ba8:1f1:400::/56 → 4.0.1.f.1.0.8.a.b.0.1.0.0.2.ip6.arpa

I had a quick look for a module that does it, but couldn’t find one, so I hacked this subroutine together:

Is there a more elegant way? Is there a module I can replace this with?

Must support:

  • Arbitrary prefix length
  • Use of ‘::’ anywhere legal in the address
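
The subroutine itself was embedded as a gist and isn’t reproduced here. For reference, a minimal sketch of the general approach (expand the address to 32 nibbles, keep the network part, reverse it) might look like the following; prefix lengths that aren’t a multiple of four get rounded down to the enclosing nibble boundary.

use warnings;
use strict;

# Not the original subroutine: a sketch of one way to do it.
sub cidr_to_ip6_arpa {
    my ($cidr) = @_;
    my ($addr, $len) = split m{/}, $cidr;

    # Expand '::' into the missing zero groups.
    my ($left, $right) = split /::/, $addr, 2;
    my @l = length $left ? split /:/, $left : ();
    my @r = defined $right && length $right ? split /:/, $right : ();
    my @groups = (@l, ('0') x (8 - @l - @r), @r);

    # Build the full 32-nibble string and keep only the network part.
    my $nibbles = join '', map { sprintf '%04x', hex $_ } @groups;
    my @net = split //, substr $nibbles, 0, int($len / 4);

    return join('.', reverse @net) . '.ip6.arpa';
}

print cidr_to_ip6_arpa('2001:ba8:1f1:f004::/64'), "\n";
# prints: 4.0.0.f.1.f.1.0.8.a.b.0.1.0.0.2.ip6.arpa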

Adding watermarks to a PDF with Perl’s PDF::API2

For ages I’ve been trying to work out how to programmatically add a watermark to an existing PDF using the Perl PDF::API2 module.

In case it’s useful for anyone else, here goes:

use warnings;
use strict;

use PDF::API2;

my $ifile = "/some/file.pdf";
my $ofile = "/some/newfile.pdf";
my $pdf   = PDF::API2->open($ifile);
my $fnt   = $pdf->ttfont('verdana.ttf', -encode => 'utf8');
# Only on the front page
my $page  = $pdf->openpage(1);
my $text  = $page->text;

$page->mediabox('A4');

# Create two extended graphic states; one for transparent content
# and one for normal content
my $eg_trans = $pdf->egstate();
my $eg_norm  = $pdf->egstate();

$eg_trans->transparency(0.7);
$eg_norm->transparency(0);

# Go transparent
$text->egstate($eg_trans);
# Black edges
$text->strokecolor('#000000');
# With light green fill
$text->fillcolor('#d3e6d2');
# Text render mode: fill text first then put the edges on top
$text->render(2);
# Diagonal transparent text
$text->textlabel(55, 235, $fnt, 56, "i'm in ur pdf", -rotate => 30);
# Back to non-transparent to do other stuff
$text->egstate($eg_norm);

# …

# Save
$pdf->saveas($ofile);
$pdf->end;

I think there may be better ways of doing this within Perl now, but I already had some code using PDF::API2.

Copying block devices between machines

Having a bunch of Linux servers that run Linux virtual machines I often find myself having to move a virtual machine from one server to another. The tricky thing is that I’m not in a position to be using shared storage, i.e., the virtual machines’ storage is local to the machine they are running on. So, the data has to be moved first.

A naive approach ^

The naive approach is something like the following:

  1. Ensure that I can SSH as root using SSH key from the source host to the destination host.
  2. Create a new LVM logical volume on the destination host that’s the same size as the source volume.
  3. Shut down the virtual machine.
  4. Copy the data across using something like this:
    $ sudo dd bs=4M if=/dev/mapper/myvg-src_lv |
      sudo ssh root@dest-host 'dd bs=4M of=/dev/mapper/myvg-dest_lv'
    
  5. While that is copying, do any other configuration transfer that’s required.
  6. When it’s finished, start up the virtual machine on the destination host.

I also like to stick pv in the middle of that pipeline so I get a nice text mode progress bar (a bit like what you see with wget):

$ sudo dd bs=4M if=/dev/mapper/myvg-src_lv | pv -s 10g |
  sudo ssh root@dest-host 'dd bs=4M of=/dev/mapper/myvg-dest_lv'

The above transfers data between hosts via ssh, which will introduce some overhead since it will be encrypting everything. You may or may not wish to force it to do compression, or pipe it through a compressor (like gzip) first, or even avoid ssh entirely and just use nc.

Personally I don’t care about the ssh overhead; this is on the whole customer data and I’m happier if it’s encrypted. I also don’t bother compressing it unless it’s going over the Internet. Over a gigabit LAN I’ve found it fastest to use ssh with the -c arcfour option.

The above process works, but it has some fairly major limitations:

  1. The virtual machine needs to be shut down for the whole time it takes to transfer data from one host to another. For 10GiB of data that’s not too bad. For 100GiB of data it’s rather painful.
  2. It transfers the whole block device, even the empty bits. For example, if it’s a 10GiB block device with 2GiB of data on it, 10GiB still gets transferred.

Limitation #2 can be mitigated somewhat by compressing the data. But we can do better.

LVM snapshots ^

One of the great things about LVM is snapshots. You can do a snapshot of a virtual machine’s logical volume while it is still running, and transfer that using the above method.

But what do you end up with? A destination host with an out of date copy of the data on it, and a source host that is still running a virtual machine that’s still updating its data. How to get just the differences from the source host to the destination?

Again there is a naive approach, which is to shut down the virtual machine and mount the logical volume on the host itself, do the same on the destination host, and use rsync to transfer the differences.

This will work, but again has major issues such as:

  1. It’s technically possible for a virtual machine admin to maliciously construct a filesystem that interferes with the host that mounts it. Mounting random filesystems is risky.
  2. Even if you’re willing to risk the above, you have to guess what the filesystem is going to be. Is it ext3? Will it have the same options that your host supports? Will your host even support whatever filesystem is on there?
  3. What if it isn’t a filesystem at all? It could well be a partitioned disk device, which you can still work with using kpartx, but it’s a major pain. Or it could even be a raw block device used by some tool you have no clue about.

The bottom line is, it’s a world of risk and hassle interfering with the data of virtual machines that you don’t admin.

Sadly rsync doesn’t support syncing a block device. There’s a --copy-devices patch that allows it to do so, but after applying it I found that while it can now read from a block device, it would still only write to a file.

Next I found a --write-devices patch by Darryl Dixon, which provides the other end of the functionality – it allows rsync to write to a block device instead of files in a filesystem. Unfortunately no matter what I tried, this would just send all the data every time, i.e., it was no more efficient than just using dd.

Read a bit, compare a bit ^

While searching about for a solution to this dilemma, I came across this horrendous and terrifying bodge of shell and Perl on serverfault.com:

ssh -i /root/.ssh/rsync_rsa $remote "
  perl -'MDigest::MD5 md5' -ne 'BEGIN{\$/=\1024};print md5(\$_)' $dev2 | lzop -c" |
  lzop -dc | perl -'MDigest::MD5 md5' -ne 'BEGIN{$/=\1024};$b=md5($_);
    read STDIN,$a,16;if ($a eq $b) {print "s"} else {print "c" . $_}' $dev1 | lzop -c |
ssh -i /root/.ssh/rsync_rsa $remote "lzop -dc |
  perl -ne 'BEGIN{\$/=\1} if (\$_ eq\"s\") {\$s++} else {if (\$s) {
    seek STDOUT,\$s*1024,1; \$s=0}; read ARGV,\$buf,1024; print \$buf}' 1<> $dev2"

Are you OK? Do you need to have a nice cup of tea and a sit down for a bit? Yeah. I did too.

I’ve rewritten this thing into a single Perl script so it’s a little bit more readable, but I’ll attempt to explain what the above abomination does.

Even though I do refer to this script in unkind terms like “abomination”, I will be the first to admit that I couldn’t have come up with it myself, and that I’m not going to show you my single Perl script version because it’s still nearly as bad. Sorry!

It connects to the destination host and starts a Perl script which begins reading the block device over there, 1024 bytes at a time, running that through md5 and piping the output to a Perl script running locally (on the source host).

The local Perl script is reading the source block device 1024 bytes at a time, doing md5 on that and comparing it to the md5 hashes it is reading from the destination side. If they’re the same then it prints “s” otherwise it prints “c” followed by the actual data from the source block device.

The output of the local Perl script is fed to a third Perl script running on the destination. It takes the sequence of “s” or “c” as instructions on whether to skip 1024 bytes (“s”) of the destination block device or whether to take 1024 bytes of data and write it to the destination block device (“c<1024 bytes of data>“).

The lzop bits are just doing compression and can be changed for gzip or omitted entirely.

Hopefully you can see that this is behaving like a very very dumb version of rsync.

The thing is, it works really well. If you’re not convinced, run md5sum (or sha1sum or whatever you like) on both the source and destination block devices to verify that they’re identical.

The process now becomes something like:

  1. Take an LVM snapshot of virtual machine block device while the virtual machine is still running.
  2. Create suitable logical volume on destination host.
  3. Use dd to copy the snapshot volume to the destination volume.
  4. Move over any other configuration while that’s taking place.
  5. When the initial copy is complete, shut down the virtual machine.
  6. Run the script of doom to sync over the differences from the real device to the destination.
  7. When that’s finished, start up the virtual machine on the destination host.
  8. Delete snapshot on source host.

1024 bytes seemed like rather a small buffer to be working with so I upped it to 1MiB.

I find that on a typical 10GiB block device there might only be a few hundred MiB of changes between snapshot and virtual machine shut down. The entire device does have to be read through of course, but the down time and data transferred is dramatically reduced.

There must be a better way ^

Is there a better way to do this, still without shared storage?

It’s getting difficult to sell the disk capacity that comes with the number of spindles I need for performance, so maybe I could do something with DRBD so that there’s always another server with a copy of the data?

This seems like it should work, but I’ve no experience of DRBD. Presumably the active node would have to be using the /dev/drbdX devices as disks. Does DRBD scale to having say 100 of those on one host? It seems like a lot of added complexity.

I’d love to hear any other ideas.

Setting envelope sender when using Mail::Send

I just thought I would blog this so that people can find it in a search, as I could not when I tried.

I have a perl script on a machine that sends out transactional emails with a From address like billing@example.com. The machine it sends from is not example.com. Let’s call it admin.foobar.example.com. Since some misguided people think that sender verification callouts are a good idea, their MX hosts try to connect to admin.foobar to see if the billing address is there and accepting mail. This fails because admin.foobar does not run a public MTA and I do not want it to run one either.

It’s clear that the correct thing is for the script to be setting its envelope sender to billing@example.com since billing@example.com should be an email address that does actually work! How to do it with perl’s Mail::Send, however, is not so clear.

After trying to work this out myself for a few hours I asked dg, who is not just a perl monger but a veritable perl Tesco, making mere perl mongers compromise on their margins and putting many of them completely out of business. Every little helps.

Based on his reading of the source of Mail/Util.pm, he thought it might be accomplished by setting the MAILADDRESS environment variable. He was wrong however:


<dg> oh, that only works for SMTP
<dg> how crap

The next suggestion was a no-go also, but third time lucky:


<dg> actually, $msg->open("sendmail", "-f ...");

and that works. So there you go.
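
For completeness, here’s roughly what that ends up looking like in context. The recipient address and body are made up; the envelope sender is the billing address discussed above.

use warnings;
use strict;

use Mail::Send;

my $msg = Mail::Send->new(
    To      => 'customer@example.org',    # illustrative recipient
    Subject => 'Your invoice',
);
$msg->set('From', 'billing@example.com');

# Extra arguments are passed through to the sendmail binary; -f sets
# the envelope sender.
my $fh = $msg->open('sendmail', '-f', 'billing@example.com');
print $fh "Body of the email goes here.\n";
$fh->close or die "couldn't send message: $!";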

Thanks dg.