Recovering From an Exif Disaster

The Discovery ^

Sometime in late December 2019 I noticed that when I clicked on a tag in Shotwell, the photo management software that I use, it showed either zero or hardly any matching photos when I knew for sure that there should be many more.

(When I say “tag” in this article it’s mostly going to refer to the type of tags you generally put on an image, i.e. the tags that identify who or what is in the image, what event it is associated with, the place it was taken etc. Images can have many different kinds of tags containing all manner of metadata, but for avoidance of doubt please assume that I don’t mean any of those.)

I have Shotwell set to store the tags in the image files themselves, in the metadata. There is a standard for this called Exif. What seems to have happened is that Shotwell had removed a huge number of tags from the files themselves. At the time of discovery I had around 15,500 photos in my collection and it looked like the only way to tell what was in them would be by looking at them. Disaster.

Here follow some notes about what I found out when trying to recover from this situation, in case it is ever useful for anyone.

Shotwell still had a visible tag hierarchy, so I could for example click on the “Pets/Remy” tag, but this brought up only one photo that I took on 14 December 2019. I’ve been taking photos of Remy for years so I knew there should be many more. Here’s Remy.

Remy at The Avenue Ealing Christmas Fair, December 2019

Luckily, I have backups.

Comparing Good and Bad Copies of a Photo ^

I knew this must have happened fairly recently because I’d have noticed quite quickly that photos were “missing”. I had a look for a recent photo that I knew I’d tagged with a particular thing, and then looked in the backups to see when it was last modified.

As an example I found a photo that was taken on 30 October 2019 that should have been tagged “Pets/Violet” but no longer was. It had been modified (but not by me) on 7 December 2019.

A broken photo of Violet

(Sorry about the text-as-images; I’m reconstructing this series of events from a Twitter thread, where things necessarily had to be posted as screenshots.)

What the above shows is that the version of the photo that existed on 30 October 2019 had the tags “Pets”, “Edna”, and “Violet”, but the version that was written on 7 December 2019 lost the “Violet” tag.

Here I used the exiftool utility to display EXIF tags from the photo files. You can do that like this:

$ exiftool -s $filename

Using egrep I limited this to the tag keys “Subject”, “Keywords”, and “TagsListLastKeywordXMP”, but that last one was a slight mistake: “TagsListLastKeywordXMP” was actually a typo, is totally irrelevant and should be ignored.

“Subject” and “Keywords” were always identical for any photo I examined and contained the flattened list of tags. For example, in Shotwell that photo originally had the tags:

  • Pets/Edna
  • Pets/Violet

It seems that Shotwell flattens that to:

  • Pets
  • Edna
  • Violet

and then stores it in “Subject” and “Keywords”.
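As a rough shell illustration of that flattening (purely illustrative, using the example tags above): splitting the hierarchical names on “/” and de-duplicating yields the same flat list.

```shell
# Split hierarchical tag names into their components and de-duplicate,
# mimicking what Shotwell appears to store in Subject/Keywords.
printf '%s\n' 'Pets/Edna' 'Pets/Violet' | tr '/' '\n' | sort -u
# prints (alphabetically): Edna, Pets, Violet
```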

The tags with hierarchy are actually in the key “TagsList” like:

  • Pets
  • Pets/Edna
  • Pets/Violet

Fixing One Photo ^

I tested stuffing the tag “Violet” back into this file under the keys “Subject” and “Keywords”:

$ exiftool -keywords+="…" -subject+="…" $filename

Stuffing the Violet tag back in

This shows that the “Violet” tag is now back in the current version of the file. After restarting Shotwell and doing a free text search for “Violet”, this photo now shows up whereas before it did not. It still did not show up when I clicked on “Pets/Violet” in the tag hierarchy however. It was then that I realised I also needed to put “Pets/Violet” into the “TagsList” key.

I ended up using a script to do this in bulk fashion, but individually I think you should be able to do this like:

$ exiftool -keywords+=Violet -subject+=Violet -TagsList+=Pets/Violet $filename

After restarting Shotwell I was able to click on the “Pets/Violet” tag and see this photo.

Fixing All the Photos? ^

My process to recover from this, then, was to compile a list of each file that had been modified at the suspected time of disaster, and for each:

  1. Read the list of tags from “Keywords”
  2. Read the list of tags from “Subject”
  3. De-duplicate them and store them as $keywords
  4. Read the list of tags from “TagsList” and store them as $tagslist
  5. Stuff $keywords back into both “Subject” and “Keywords” of the current version of the file
  6. Stuff $tagslist back into “TagsList” of the current version of the file

Gulp.

Which files were tampered with? ^

It was relatively easy to work out which files had been screwed with, because thankfully I didn’t make any other photo modifications on 7 December 2019. So any photo that got modified that day was probably a candidate.

I haven’t mentioned what actually caused this problem yet. I don’t know exactly. At 16:53 on 7 December 2019 I was importing some photos into Shotwell, and I do seem to recall it crashed at some point, either while I was doing that or shortly after.

The photos from that import and all others afterwards had retained their tags correctly, but many that existed prior to that time seemed to be missing some or all tags. I have no idea why such a crash would cause Shotwell to do that but that must have been what did it.

Running this against my backups identified 3,721 files that had been modified on 7 December 2019:

$ cd weekly.2/specialbrew.21tc.bitfolk.com/srv/tank/Photos/Andy
$ find . -type f \
  -newermt "2019-12-07 00:00:00" \! \
  -newermt "2019-12-07 23:59:59" > ~/busted.txt

The next thing I did was to check that each of these file paths still existed in the current photo store and in the known-good backups (weekly.3).
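That check is simple enough to sketch in shell. Here is a self-contained toy version with throwaway paths, not the script I actually ran:

```shell
# Build a tiny demo tree: a.jpg exists everywhere, b.jpg is absent
# from the known-good backup.
mkdir -p demo/current demo/weekly.3
printf 'a.jpg\nb.jpg\n' > demo/busted.txt
touch demo/current/a.jpg demo/weekly.3/a.jpg demo/current/b.jpg

# Report any listed path missing from either location.
while IFS= read -r f; do
    [ -f "demo/current/$f" ]  || echo "missing from current: $f"
    [ -f "demo/weekly.3/$f" ] || echo "missing from backup: $f"
done < demo/busted.txt
# prints: missing from backup: b.jpg
```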

Extract tags from known-good copies ^

Next up, I wrote a script which:

  1. Goes to the known-good copies of the files
  2. Extracts the Subject and Keywords and deduplicates them
  3. Extracts the TagsList
  4. Writes it all into a hash
  5. Dumps that out as a YAML file

All scripts mentioned here use the Perl module Image::ExifTool, which is part of the exiftool package.

backup_host$ ./gather_tags.pl < ~/busted.txt > ~/tags.yaml

tags.yaml looks a bit like this:

---
2011/01/16/16012011163.jpg:
  keywords:
  - Hatter
  - Pets
  tagslist:
  - Pets
  - Pets/Hatter
[…]
2019/11/29/20191129_095218~2.jpg:
  keywords:
  - Bedfont Lakes
  - Feltham
  - London
  - Mandy
  - Pets
  - Places
  tagslist:
  - Pets
  - Pets/Mandy
  - Places
  - Places/London
  - Places/London/Feltham
  - Places/London/Feltham/Bedfont Lakes

Stuff tags back into current versions of photos ^

After transferring tags.yaml back to my home fileserver it was time to use it to stuff the tags back into the files that had lost them.

One thing to note while doing this is that if you just add a tag, it adds it even if the same tag already exists, leading to duplicates. I thought it best to first delete the tag and then add it again so that there would only be one instance of each one.
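The delete-then-add dance can be sketched on a plain list. This shows only the list manipulation; the real script did the equivalent for each Exif key via Image::ExifTool:

```shell
# A tag list that has picked up a duplicate.
tags='Pets
Violet
Violet'
tag='Violet'

# Delete every copy of the tag, then add it back exactly once.
tags=$(printf '%s\n' "$tags" | grep -Fxv "$tag")
tags=$(printf '%s\n%s\n' "$tags" "$tag")
printf '%s\n' "$tags"
# prints: Pets, then Violet -- one copy each
```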

I called that one fix_tags.pl.

$ ./fix_tags.pl tags.yaml

Profit! Or, only slight loss, I guess ^

16m53s of runtime later, it had completed its work… 🙌 2020 will definitely be the year of Linux on the desktop¹.

¹ As long as you know how to manipulate EXIF tags from a programming language and have a functioning backup system and even then don’t mind losing some stuff

Losing some stuff…? ^

Unfortunately there were some things I couldn’t restore. It was at this point that I discovered that Shotwell does not ever put tags into video files (even though they do support EXIF tags…).

That means that the only record of the tags on a video file is in Shotwell’s own database, which I did not back up as I didn’t think I needed to.

Getting Tags Out of Shotwell ^

I am now backing that up, but should this sort of thing happen in the future I’d need to know how to manipulate the tags for videos in Shotwell’s database.

Shotwell’s database is an SQLite file that’s normally at $HOME/.local/share/shotwell/data/photo.db. I’m fairly familiar with SQLite so I had a poke around, but couldn’t immediately see how these tags were stored. I had to ask on the Shotwell mailing list.

Here’s how Shotwell does it. There’s a table called TagTable which stores the name of each tag and a comma-separated list of every photo/video which matches it:

sqlite> .schema TagTable 
CREATE TABLE TagTable (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL, photo_id_list TEXT, time_created INTEGER);

The photo_id_list column holds the comma-separated list. Each item in the list is of the form:

  1. “thumb” or “video-” depending on whether the item is a photo or a video
  2. 16 hex digits, zero padded, which is the ID value from the PhotoTable or VideoTable for that item
  3. a comma

Full example of extracting tags for the video file 2019/12/31/20191231_121604.mp4:

$ sqlite3 /home/andy/.local/share/shotwell/data/photo.db
SQLite version 3.22.0 2018-01-22 18:45:57
Enter ".help" for usage hints.
sqlite> SELECT id
        FROM VideoTable
        WHERE filename LIKE '%20191231%';
553
sqlite> SELECT printf("%016x", 553);
0000000000000229
sqlite> SELECT name
        FROM TagTable
        WHERE photo_id_list LIKE '%video-0000000000000229,%';
/Places
/Places/London
/Places/London/Feltham
/Pets
/Places/London/Feltham/Bedfont Lakes
/Pets/Marge
/Pets/Mandy

If that is not completely clear:

  • The ID for that video file is 553
  • 553 in hexadecimal is 229
  • Pad that to 16 digits, add “video-” at the front and “,” at the end (even the last item in the list has a comma at the end)
  • Search for that string in photo_id_list
  • If a row matches then the name column is a tag that is attached to that file
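In other words, the search key for a given numeric ID can be built with a single printf (shell shown here; swap the “video-” prefix for “thumb” when looking up photos, per the list above):

```shell
id=553
needle=$(printf 'video-%016x,' "$id")
echo "$needle"
# prints: video-0000000000000229,
```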

I don’t exactly know how I would have identified which videos got messed with, but at least I would have had both versions of the database to compare, and I now know how I would do the comparison.

Should Tags Even Be In Photos? ^

During my Twitter thread it was suggested to me that tags should not be stored in photos, but only in the photo cataloging software, where they can be backed up along with everything else.

I disagree with this for several reasons:

  • Exif exists for the purpose of storing tags like this.

  • When I move my photos from one piece of software to another I want it to be able to read the tags. I don’t want to have to input them all over again. That would be unimaginably tedious.

    When I moved from F-Spot to Shotwell the fact that the tags were in the files saved me countless hours of work. It just worked on import.

    If there wasn’t a dedicated importer feature then it would be so much work that really the only way to do it would be to extract the tags from the database and insert them again programmatically, which is basically admitting that to change software you need to be an expert. That really isn’t how this should work.

  • If the only copy of my tags is in the internal database of a unique piece of cataloging software, then I have to become an expert on the internal data store of that piece of software. I don’t want to have to do that.

    I’ve been forced to do that here for Shotwell because of a deficiency of Shotwell in not storing video tags in the files. But if we’re only talking about photos then I could have avoided it, and could also avoid having to be an expert on every future piece of cataloging software.

  • Even if I’m not moving to a different cataloging solution, lots of software understands Exif and it’s useful to be able to query those things from other software.

    I regard it very much like artist, album, author, genre etc tags in the metadata of digital music and ebooks, all of which are in the files; you would not expect to have to reconstruct these out of the database of some other bit of software every time you wanted to use them elsewhere.

It was a mistake not to back up the Shotwell database though; I thought I did not need it as I thought all tags were being stored in the files, and tags were the only things I cared about. As it happened, tags were not being stored in video files and tags for video files only exist in Shotwell’s database.

Other Thoughts ^

Having backups was obviously a lifesaver here. It took me ~3 weeks to notice.

Being able to manipulate them like a regular filesystem made things a lot more convenient, so that’s a property I will want to keep in whatever future backup arrangements I have.

I might very well switch to different photo management software now, assuming I could find any that I prefer, but all software has bugs. Whatever I switch to I would have to ensure that I knew how to extract the tags from that as well, if it doesn’t store them in the files.

I don’t want to store my photos and videos “in the cloud” but it is a shortcoming of Shotwell that I can basically only use it from my desktop at home. Its database does not support multiple or remote access. I wonder if there is some web-based thing that can just read (and cache) the tags out of the files, build dynamic galleries and allow arbitrary searches on them…

Shotwell’s database schema and its use of 16 hexadecimal digits (i.e. nibbles) means I can only store a maximum of 18,446,744,073,709,551,615 (2⁶⁴ − 1) photos or videos of dogs. Arbitrary limits suck so much.

Greyhounds Marge, Janti and Will at Sainsbury's Staines with Wimbledon Greyhound Welfare, December 2019

The Internet of Unprofitable Things

Gather round children ^

Uncle Andrew wants to tell you a festive story. The NTPmare shortly after Christmas.

A modest proposal ^

Nearly two years ago, on the afternoon of Monday 16th January 2017, I received an interesting BitFolk support ticket from a non-customer. The sender identified themselves as a senior software engineer at NetThings UK Ltd.

Subject: Specific request for NTP on IP 85.119.80.232

Hi,

This might sound odd but I need to setup an NTP server instance on IP address 85.119.80.232.

wats 85.119.80.232 precious? ^

85.119.80.232 is actually one of the IP addresses of one of BitFolk’s customer-facing NTP servers. It was also, until a few weeks before this email, part of the NTP Pool project.

“Was” being the important word here. In late December of 2016 I had withdrawn BitFolk’s NTP servers from the public pool and firewalled them off to non-customers.

I’d done that because they were receiving an unusually large amount of traffic due to the Snapchat NTP bug. It wasn’t really causing any huge problems, but the number of traffic flows was pushing useful information out of Jump’s fixed-size netflow database and I didn’t want to deal with it over the holiday period, so this public service was withdrawn.

NTP? ^

This article was posted to Hacker News and a couple of comments there said they would have liked to have seen a brief explanation of what NTP is, so I’ve now added this section. If you know what NTP is already then you should probably skip this section because it will be quite brief and non-technical.

Network Time Protocol is a means by which a computer can use multiple other computers, often from across the Internet on completely different networks under different administrative control, to accurately determine what the current time is. By using several different computers, a small number of them can be inaccurate or even downright broken or hostile, and still the protocol can detect the “bad” clocks and only take into account the more accurate majority.

NTP is supposed to be used in a hierarchical fashion: A small number of servers have hardware directly attached from which they can very accurately tell the time, e.g. an atomic clock, GPS, etc. Those are called “Stratum 1” servers. A larger number of servers use the stratum 1 servers to set their own time, then serve that time to a much larger population of clients, and so on.

It used to be the case that it was quite hard to find NTP servers that you were allowed to use. Your own organisation might have one or two, but really you should have at least 3 to 7 of them and it’s better if there are multiple different organisations involved. In a university environment that wasn’t so difficult because you could speak to colleagues from another institution and swap NTP access. As the Internet matured and became majority used by corporations and private individuals though, people still needed access to accurate time, and this wasn’t going to cut it.

The NTP Pool project came to the rescue by making an easy web interface for people to volunteer their NTP servers, and then they’d be served collectively in a DNS zone with some basic means to share load. A private individual can just use three names from the pool zone and they will get three different (constantly changing) NTP servers.

Corporations and those making products that need to query the NTP pool are supposed to ask for a “vendor zone”. They make some small contribution to the NTP pool project and then they get a DNS zone dedicated to their product, so it’s easier for the pool administrators to direct the traffic.
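For illustration, the difference is only which DNS zone the client is configured to use. A sketch of an ntpd-style configuration (the vendor name here is made up):

```
# Generic pool zone -- fine for private individuals:
server 0.pool.ntp.org
server 1.pool.ntp.org
server 2.pool.ntp.org

# A vendor zone dedicated to one product (hypothetical vendor name):
server 0.examplevendor.pool.ntp.org
server 1.examplevendor.pool.ntp.org
server 2.examplevendor.pool.ntp.org
```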

Sadly many companies don’t take the time to understand this and just use the generic pool zone. NetThings UK Ltd went one step further in a very wrong direction by taking an IP address from the pool and just using it directly, assuming it would always be available for their use. In reality it was a free service donated to the pool by BitFolk and as it had become temporarily inconvenient for that arrangement to continue, service was withdrawn.

On with the story…

They want what? ^

The Senior Software Engineer continued:

The NTP service was recently shutdown and I am interested to know if there is any possibility of starting it up again on the IP address mentioned. Either through the current holder of the IP address or through the migration of the current machine to another address to enable us to lease 85.119.80.232.

Um…

I realise that this is a peculiar request but I can assure you it is genuine.

That’s not gonna work ^

Obviously what with 85.119.80.232 currently being in use by all customers as a resolver and NTP server I wasn’t very interested in getting them all to change their configuration and then leasing it to NetThings UK Ltd.

What I did was remove the firewalling so that 85.119.80.232 still worked as an NTP server for NetThings UK Ltd until we worked out what could be done.

I then asked some pertinent questions so we could work out the scope of the service we’d need to provide. Questions such as:

  • How many clients do you have using this?
  • Do you know their IP addresses?
  • When do they need to use the NTP server and for how long?
  • Can you make them use the pool properly (a vendor zone)?

Down the rabbit hole ^

The answers to some of the above questions were quite disappointing.

It would be of some use for our manufacturing setup (where the RTCs are initially set) but unfortunately we also have a reasonably large field population (~500 units with weekly NTP calls) that use roaming GPRS SIMs. I don’t know if we can rely on the source IP of the APN for configuring the firewall in this case (I will check though). We are also unable to update the firmware remotely on these devices as they only have a 5MB per month data allowance. We are able to wirelessly update them locally but the timeline for this is months rather than weeks.

Basically it seemed that NetThings UK Ltd made remote controlled thermostats and lighting controllers for large retail spaces etc. And their devices had one of BitFolk’s IP addresses burnt into them at the factory. And they could not be identified or remotely updated.

Facepalm

Oh, and whatever these devices were, without an external time source their clocks would start to noticeably drift within 2 weeks.

By the way, they solved their “burnt into it at the factory” problem by bringing up BitFolk’s IP address locally at their factory to set initial date/time.

Group Facepalm

I’ll admit, at this point I was slightly tempted to work out how to identify these devices and reply to them with completely the wrong times to see if I could get some retail parks to turn their lights on and off at strange times.

Weekly?? ^

We are triggering ntp calls on a weekly cron with no client side load balancing. This would result in a flood of calls at the same time every Sunday evening at around 19:45.

Yeah, they made every single one of their unidentifiable devices contact a hard coded IP address within a two minute window every Sunday night.

The Senior Software Engineer was initially very worried that they were the cause of the excess flows I had mentioned earlier, but I reassured them that it was definitely the Snapchat bug. In fact I never was able to detect their devices above background noise; it turns out that ~500 devices doing a single SNTP query is pretty light load. They’d been doing it for over 2 years before I received this email.

I did of course point out that they were lucky we caught this early because they could have ended up as the next Netgear vs. University of Wisconsin.

I am feeling really, really bad about this. I’m very, very sorry if we were the cause of your problems.

Bless. I must point out that throughout all of this, their Senior Software Engineer was a pleasure to work with.

We made a deal ^

While NTP service is something BitFolk provides as a courtesy to customers, it’s not something that I wanted to sell as a service on its own. And after all, who would buy it, when the public pool exists? The correct thing for a corporate entity to do is support the pool with a vendor zone.

But NetThings UK Ltd were in a bind and not allowing them to use BitFolk’s NTP server was going to cause them great commercial harm. Potentially I could have asked for a lot of money at this point, but (no doubt to my detriment) that just felt wrong.

I proposed that initially they pay me for two hours of consultancy to cover work already done in dealing with their request and making the firewall changes.

I further proposed that I charge them one hour of consultancy per month for a period of 12 months, to cover continued operation of the NTP server. Of course, I do not spend an hour a month fiddling with NTP, but this unusual departure from my normal business had to come at some cost.

I was keen to point out that this wasn’t something I wanted to continue forever:

Finally, this is not a punitive charge. It seems likely that you are in a difficult position at the moment and there is the temptation to charge you as much as we can get away with (a lot more than £840 [+VAT per year], anyway), but this seems unfair to me. However, providing NTP service to third parties is not a business we want to be in so we would expect this to only last around 12 months. If you end up having to renew this service after 12 months then that would be an indication that we haven’t charged you enough and we will increase the price.

Does this seem reasonable?

NetThings UK Ltd happily agreed to this proposal on a quarterly basis.

Thanks again for the info and help. You have saved me a huge amount of convoluted and throwaway work. This give us enough time to fix things properly.

Not plain sailing ^

I only communicated with the Senior Software Engineer one more time. The rest of the correspondence was with financial staff, mainly because NetThings UK Ltd did not like paying its bills on time.

NetThings UK Ltd paid 3 of its 4 invoices in the first year late. I made sure to charge them statutory late payment fees for each overdue invoice.

Yearly report card: must try harder ^

As 2017 was drawing to a close, I asked the Senior Software Engineer how NetThings UK Ltd was getting on with ceasing to hard code BitFolk’s IP address in its products.

To give you a quick summary, we have migrated the majority of our products away from using the fixed IP address. There is still one project to be updated after which there will be no new units being manufactured using the fixed IP address. However, we still have around 1000 units out in the field that are not readily updatable and will continue to perform weekly NTP calls to the fixed IP address. So to answer your question, yes we will still require the service past January 2018.

This was a bit disappointing because a year earlier the number had been “about 500” devices, yet despite a year of effort the number had apparently doubled.

That alone would have been enough for me to increase the charge, but I was going to anyway due to NetThings UK Ltd’s aversion to paying on time. I gave them just over 2 months of notice that the price was going to double.

u wot m8 ^

Approximately 15 weeks after being told that the price doubling was going to happen, NetThings UK Ltd’s Financial Controller asked me why it had happened, while letting me know that another of their late payments had been made:

Date: Wed, 21 Feb 2018 14:59:42 +0000

We’ve paid this now, but can you explain why the price has doubled?

I was very happy to explain again in detail why it had doubled. The Financial Controller in response tried to agree a fixed price for a year, which I said I would be happy to do if they paid for the full year in one payment.

My rationale for this was that a large part of the reason for the increase was that I had been spending a lot of time chasing their late payments, so if they wanted to still make quarterly payments then I would need the opportunity to charge more if I needed to. If they wanted assurance then in my view they should pay for it by making one yearly payment.

There was no reply, so the arrangement continued on a quarterly basis.

All good things… ^

On 20 November 2018 BitFolk received a letter from Deloitte:

Netthings Limited – In Administration (“The Company”)

Company Number: SC313913

[…]

Cessation of Trading

The Company ceased to trade with effect from 15 November 2018.

Investigation

As part of our duties as Joint Administrators, we shall be investigating what assets the Company holds and what recoveries if any may be made for the benefit of creditors as well as the manner in which the Company’s business has been conducted.

And then on 21 December:

Under paragraph 51(1)(b) of the Insolvency Act 1986, the Joint Administrators are not required to call an initial creditors’ meeting unless the Company has sufficient funds to make a distribution to the unsecured creditors, or unless a meeting is requested on Form SADM_127 by 10% or more in value of the Company’s unsecured creditors. There will be no funds available to make a distribution to the unsecured creditors of the Company, therefore a creditors’ meeting will not be convened.

Luckily their only unpaid invoice was for service from some point in November, so they didn’t really get anything that they hadn’t already paid for.

So that’s the story of NetThings UK Ltd, a brave pioneer of the Internet of Things wave, who thought that the public NTP pool was just an inherent part of the Internet that anyone could use for free, and that the way to do that was to pick one IP address out of it at random and bake that into over a thousand bits of hardware that they distributed around the country with no way to remotely update.

This coupled with their innovative reluctance to pay for anything on time was sadly not enough to let them remain solvent.

Google App Engine started requiring Content-Length header on POST requests

TL;DR ^

Update: It’s GoCardless who moved api.gocardless.com to Google Cloud. Google Cloud has behaved this way for years.

I think that Google App Engine may have recently started requiring every POST request to have a Content-Length header, even if there is no request body.

That will cause you problems if your library doesn’t add one for POST requests that have no content. Perl’s HTTP::Request is one such library.

You might be experiencing this if an API has just started replying to you with:

Error 411 (Length Required)!!1

411.That’s an error.

POST requests require a Content-length header. That’s all we know.

(Yes, the title does contain “!!1”.)

You can fix it by adding the header yourself, e.g.:

use LWP::UserAgent;
use HTTP::Request;
use JSON;
 
my $ua = LWP::UserAgent->new;
 
# Double quotes so that $id interpolates into the URL.
my $req = HTTP::Request->new(
    POST => "https://api.example.com/things/$id/actions/fettle"
);
 
$req->header('Accept' => 'application/json');
$req->content_type('application/json');
 
my $json;
$json = JSON->new->utf8->canonical->encode($params) if $params;
 
$req->content($json) if $json;
# Explicitly set Content-Length to zero as HTTP::Request doesn't add one
# when there's no content.
$req->header( 'Content-Length' => 0 ) unless $json;
 
my $res = $ua->request( $req );

This is a bit far outside of my comfort zone so I’m not sure if I’m 100% correct, but I do know that sending the header fixes things for me.

What happened? ^

Yesterday a BitFolk customer tried to cancel their Direct Debit mandate, and it didn’t work. The server logs contained the above message.

For Direct Debit payments we use the Perl module Business::GoCardless for integrating with GoCardless, but the additional HTML styling in the message (which I’ve left out for brevity) made clear that the message was coming from Google. api.gocardless.com is hosted on Google App Engine (or some bit of Google cloud anyway).

After a bit of debugging I established that HTTP::Request was only setting a Content-Length header when there was actually request content. The API for cancelling a Direct Debit mandate is to send an empty POST to https://api.gocardless.com/mandates/$id/actions/cancel.

Adding Content-Length: 0 makes it work again.

When did it change? ^

There was a successful mandate cancellation on 25 October 2018, so some time between then and 12 December 2018. I haven’t looked for any change notice put out by Google as I’m not a Google Cloud user and wouldn’t know where to look.

Who’s to blame ^

I haven’t yet looked into whether the HTTP standard requires POST requests to have a Content-Length header. I welcome comments from someone who wants to do the digging.

Realistically even if it doesn’t and Google is just being overly strict, other servers might also be strict, so I guess HTTP::Request should always send the header.

Fun with Supermicro motherboard serial headers

or, “LOL, standards” ^

TL;DR: Most motherboards have a serial header in an IDC-10 (5×2 pins) arrangement with the pins as a row of even numbered pins (2,4,6,8,X) followed by a row of odd numbered pins (1,3,5,7,9). Supermicro ones appear to have the pins in sequential order (6,7,8,9,X and then 1,2,3,4,5). As a result a standard IDC-10 to DB-9 cable will not work and you’ll need to either hack one about or buy the Supermicro one.

Update ^

A comment below kindly points out that Supermicro actually is using a standard header pinout, it’s just that it’s a competing and lesser-used standard. It’s apparently called Intel/DTK or crossover, so that may help you find a working cable.

Are we sitting comfortably? ^

I bought a Supermicro motherboard. It doesn’t have a serial port exposed at the back. I like to use serial ports for a serial console even though I am aware that IPMI exists. IPMI on this board works okay but I like knowing I can always get to the “real” serial port as well.

The motherboard has a COM1 serial header, and I wasn’t using the PCI expansion slot on the back of the chassis, so I decided to put a serial port there. I bought a typical IDC-10 / DB-9 cable and plate:

IDC-10 to DB-9

Didn’t work. Serial-over-LAN (IPMI) worked alright. On COM1 I would get either nothing or a run of garbage characters from time to time. I wasted a good number of hours messing with BIOS settings, baud rates, checking if my USB serial adaptor actually worked with another device (of which I only have one in my home), before I decided to sit down and check the pin numbering for both the header and the cable.

Looking at the motherboard manual we see this:

x10sdv board com1 pin layout

And the cable?

IDC-10 to DB-9 pinout

Notice anything amiss?

The cable’s pins go in a row of even numbers and then a row of odd numbers:

2 4 6 8 X
1 3 5 7 9
    -

The X is the missing pin (serial uses 9 pins) and the - indicates where the notch for the connector would be: next to pin 5 in this case.

The header’s pins go in sequential order:

6 7 8 9 X
1 2 3 4 5
    -

As a result all but pin 1 are incorrect.
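To see just how wrong, here’s a small Python sketch. The two lists are the physical pin layouts shown above (top row then bottom row, `None` for the missing/keyed pin), and the signal names are the standard DB-9 assignments; it works out which signal each DB-9 pin actually receives when the standard cable meets the DTK-style header:

```python
# What each DB-9 pin receives when a standard (AT/Everex) IDC-10 cable
# is plugged into a DTK/Intel-pinout header. Each list gives the header
# pin number at each physical position: top row, then bottom row.
STANDARD = [2, 4, 6, 8, None, 1, 3, 5, 7, 9]  # what the cable assumes
DTK      = [6, 7, 8, 9, None, 1, 2, 3, 4, 5]  # what is actually wired

SIGNAL = {1: "DCD", 2: "RxD", 3: "TxD", 4: "DTR", 5: "GND",
          6: "DSR", 7: "RTS", 8: "CTS", 9: "RI"}

for expected, actual in zip(STANDARD, DTK):
    if expected is None:
        continue  # keyed/missing pin position
    status = "OK" if expected == actual else "WRONG"
    print(f"DB-9 pin {expected}: wanted {SIGNAL[expected]}, "
          f"got {SIGNAL[actual]} ({status})")
```

Only pin 1 (DCD) comes out OK; TxD and RxD in particular end up somewhere useless, which would explain the occasional runs of garbage.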

You actually need a Supermicro cable for this. CBL-0010L is the part number in my case. CBL-0010LP would be the low profile version. Good luck finding it mentioned on Supermicro’s site, but your favourite reseller will probably know of it. As it was I found one on Ebay for £1.58+VAT, and it works now.

After knowing what to search for I also found someone else having a similar issue with a Supermicro board.

You could of course instead hack any existing cable’s pins about or fit an adaptor in between (as the person in the above link did).

Thanks Supermicro. Thupermicro.

Firefox, Ubuntu and middlemouse.contentLoadURL

I use the Firefox web browser, currently on Ubuntu 10.04 LTS. For many years I have set the config option middlemouse.contentLoadURL to true so that middle clicking anywhere in the page (that does not accept input) will load the URL that is in my clipboard.

After restarting my web browser somewhere near the end of January 2012 I found my Firefox 3.x had been upgraded to Firefox 9.x. Also the middle click behaviour no longer worked.

Perusing about:config showed that the option had been set to false again. I set it back to true but on restart of the browser it was set back to false. A bit of searching about found various suggestions about forcing it in my user.js file, but none of those worked either.

Finally, in desperation, I did a search of every file beneath /usr for the string “middlemouse”. Lo and behold:

/usr/lib/firefox-9.0.1/extensions/ubufox@ubuntu.com/defaults/preferences/ubuntu-mods.js

…
pref("middlemouse.contentLoadURL", false); //setting to false disables pasting urls on to the page
…

Commenting this line out once more allowed me to change the setting myself.
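For reference, the search and the fix amounted to something like this (paths as on my system at the time; the sed is shown here against a scratch copy of the file, since on a real system the target is the ubuntu-mods.js path above and you’d need sudo):

```shell
# Find which file under /usr sets the pref (this is how I found it):
#   grep -rl middlemouse /usr 2>/dev/null
# Then comment the offending line out. Demonstrated on a scratch copy:
f=$(mktemp)
printf 'pref("middlemouse.contentLoadURL", false);\n' > "$f"
sed -i 's|^pref("middlemouse.contentLoadURL"|// &|' "$f"
cat "$f"   # // pref("middlemouse.contentLoadURL", false);
```

Bear in mind a package upgrade can simply put the line back, or move the file entirely.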

It seems this override was discussed by Ubuntu as far back as 2004, but it only became something that I could not override upon the upgrade to Firefox 9.

I reported a bug about this, and one of the comments seems to suggest that the method Ubuntu uses to change these settings has changed because they were breaking Firefox Sync, and that this outcome (overriding middlemouse.contentLoadURL) is not as bad as breaking Firefox Sync.

Even so, I would suggest that this outcome is very confusing for people and that as middlemouse.contentLoadURL is a popular setting which is easy to change, it should not be overridden in some obscure file.

As of the recent upgrade to Firefox 11, the file with the override in it has now moved to /usr/share/xul-ext/ubufox/defaults/preferences/ubuntu-mods.js.

Dear System Integrators, a few words about screwing

Right, System Integrators – those companies that buy components from Supermicro et al and build you a server out of them. You guys seem to have a bit of a fascination with screwing. Screwing things in as tight as you can. Please stop.

It’s 100% true that vibration of components like hard disks is bad. Numerous studies have shown that vibration causes performance problems, as drives need to do more corrective work.

However, this does not mean that you have to screw in the drives to the caddies to the limit of what is physically possible. They just need to be tightened until a little force won’t tighten them any more.

When you supply me with a server that’s got four super-tightened screws for each drive in it, and I deploy that server, chances are that one of the first things that will break in that server is one of the disk drives.

During the years those screws have been there they haven’t got any looser. It’s likely that if you tightened them all to the limit of your strength and tools, by now the force required to unscrew them will be less than the force required to deform the screw head. Like this:

Stripped screw heads in a drive caddy

Close-up of a stripped screw head

No, this is not an issue of using the wrong driver head. Yes, you will strip a screw if you use the wrong driver head. That’s why I carry this stuff every time I go to a datacentre:

A selection of screwdrivers for your pleasure

There are two exactly correct drivers in there, and several that should also work despite being a little bit off. I have never had a problem unscrewing any screw that I originally put in. Probably because I don’t tighten them like I am some sort of lunatic. I can even unscrew them around a corner with the offset driver. Oh yeah baby. So far nothing I have screwed in with merely normal force has fallen apart.

And this is not an isolated occurrence! Nearly all of you seem to do this with every screw, everywhere. Stop it!

The drive in that caddy is a dead one, and luckily I had a spare caddy with me for the replacement drive to go in, otherwise I too would have been screwed beyond the limits of my endurance.

So, now I’ve got to drill those out just to get this caddy back to being useful again. Or more likely find someone else to drill it out for me as I don’t trust myself with power tools really.

ffffuuuuu

Did anyone else get this spam to an address they gave to Red Hat?

On November 2nd I received this spam:

(some headers removed; xxxxxxxxxxx@strugglers.net is my censored email address)

Received: from mail15.soatube.com ([184.105.143.66])
        by mail.bitfolk.com with esmtp (Exim 4.72)
        (envelope-from <bounce@soatube.com>)
        id 1RLikr-00070I-6U
        for xxxxxxxxxxx@strugglers.net; Wed, 02 Nov 2011 21:53:57 +0000
Received: from [64.62.145.53] (mail3.soatube.com [64.62.145.53])
        by mail15.soatube.com (Postfix) with ESMTP id 6B324181CFF
        for <xxxxxxxxxxx@strugglers.net>;
        Wed,  2 Nov 2011 14:46:01 -0700 (PDT)
To: xxxxxxxxxxx@strugglers.net
From: events@idevnews.com
Date: Wed, 02 Nov 2011 14:00:40 -0700
Subject: BPM Panel Discussion: IBM, Oracle and Progress Software

-------------
BPM-CON: BPM Panel Discussion - IBM, Oracle and Progress Software
-------------
Online Conference

Expert Speakers:
IBM, Oracle, Progress Software
etc..

The email address it arrived at was an email address I created in November 2004 in order to take a web-based test on Red Hat’s web site prior to going on an RHCE course. It has only ever been provided to Red Hat, and has not received any email since 2007 (and all of that was from Red Hat). Until November 2nd.

The spam email contains no reference to Red Hat and is not related to any Red Hat product.

From my point of view, I can only think that one of the following things has happened:

  1. Spammers guessed this email address out of the blue, first time, without trying any of the other possible variations of it all of which would still reach me.
  2. One of my computers has been cracked into and the only apparent repercussion is that someone spammed an email address that appears only in an email archive from 2004/2005.
  3. Red Hat knowingly gave/sold my email address to some spammers.
  4. Red Hat or one of its agents have accidentally lost a database containing email addresses.

Possibility #4 seems far and away the most likely.

I contacted Red Hat to ask them if they knew what had happened, but they ignored all of my questions and simply sent me the following statement:

“Hello.

Thank you for contacting Red Hat.

we apologies for the inconvenience caused however we would like to inform you that we have not provided your email address to anyone.

Thank You.

Red Hat Training coordinator.”

That wasn’t really what I was asking. Let’s try again.

“Hi Red Hat Training coordinator,

Thanks for your reply, but I’m afraid I am not very reassured by your response. Do you have any suggestions as to how an email address created in 2004 and used only by yourselves for my RHCE exam managed to be used for unrelated marketing by a third party in 2011, unless Red Hat either provided my email address or leaked my email address?

For clarity we are talking about the email address “xxxxxxxxxxx@strugglers.net” which has never ever received any email except from Red Hat, until yesterday, when it got some unwanted
marketing email from a third party.”

“Hi Andy,

Please be assured that Red Hat does not circulate student’s e-mail address to any third party.

Thanks,
Red Hat Training Coordinator”

I’m not getting anywhere am I? I was only after some reassurance that they would actually look into it. Maybe they are looking into it, and for some reason decided that the best way to assure me of this was to show complete disinterest.

Oh well, I can send that email address to the bitbucket, but I can’t help thinking it’s not just my email address that has been leaked.

Anyone else received similar email? If so, was it to an address you gave to Red Hat?

Update 2011-11-10: Someone suggested I politely ask the marketer where they obtained my email address. It’s worth a try.

“Hi Integration Developer News,

May I ask where you obtained my email address
“xxxxxxxxxxx@strugglers.net”? I’m concerned that it may have been
given to you without my authority.

Thanks,
Andy”

Also I have now been contacted by someone from Red Hat’s Information Security team, who is looking into it. Thanks!

My email marketing adventure with British Telecom

The saga so far ^

I have a phone line from BT. I only use it for ADSL (which I get from Zen Internet). I gave my email address to BT because they offered to tell me useful things about my account via email. I now wish I had never done this.

I use extension addresses to identify what the email addresses are being used for. This is not a new idea and I didn’t invent it. For those who don’t know what an extension address is, it’s an email address like andy+foo@example.com. It ends up at the same place as andy@example.com. The point is that if I receive an email to andy+foo@example.com then I know that it’s either from whoever I gave that address to, or it’s from someone they gave/lost my address to. It’s handy for working out who’s sold their database to spammers, or had it stolen.
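Mechanically, delivery just folds the extension away before the mailbox lookup. A minimal sketch (the function name and delimiter handling are mine, not any particular mail server’s):

```python
def base_mailbox(address: str, delimiters: str = "+-") -> str:
    """Strip an extension ('+foo' or '-foo') from the local part."""
    local, _, domain = address.partition("@")
    for d in delimiters:
        if d in local:
            local = local.split(d, 1)[0]
            break
    return local + "@" + domain

print(base_mailbox("andy+foo@example.com"))     # andy@example.com
print(base_mailbox("andy-redhat@example.com"))  # andy@example.com
```

A real mail server has to be smarter than this, of course, since “-” can legitimately appear in a local part.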

I used to prefer using “+” in the extension address just because it looks nicer to me than other popular alternatives like “-”. Unfortunately, some web developers are idiots and don’t believe that “+” is valid in an email address, so they try to help by refusing to accept the address. For that reason my email servers accept both “+” and “-”, and I used to use “-” when “+” wasn’t accepted.

After I started doing that, I began to experience an even more annoying failure: web sites that accepted “+” in my email address when I signed up, but later got redeveloped by idiots who think that “+” is no longer valid. That means that I can no longer log in to those sites, and predictably customer service is not trained to deal with situations like that.
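The failure mode is usually an over-strict validation pattern. Something like this sketch (a made-up but depressingly typical regex) rejects a perfectly valid address:

```python
import re

# An over-strict pattern of the kind these sites use; note the local-part
# character class omits "+", which the SMTP grammar (RFC 5321) permits.
NAIVE_EMAIL = re.compile(r"^[A-Za-z0-9._-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$")

print(bool(NAIVE_EMAIL.match("andy+foo@example.com")))  # False: rejected
print(bool(NAIVE_EMAIL.match("andy-foo@example.com")))  # True: accepted
```

When a site swaps in a pattern like this after you signed up, your login simply stops validating.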

It seems that BT is an example of such a company, and I am having unbelievable difficulty finding anyone there that can understand this.

When I signed up with BT, the email address I gave them had a “+” in it. They accepted it at the time.

March 2011 ^

I start to receive marketing emails from BT for extra BT services, as well as BT group companies such as Dabs and Plusnet.

29th March 2011 ^

I receive another marketing email from BT, decide I don’t want to receive them any more, and follow the unsubscribe link. The unsubscribe page at http://bt.custhelp.com/app/contact/c/769,978 tells me that the email address (which BT is emailing me on) is invalid.

I contact BTCare on Twitter to ask how I can opt out, and to ask them to opt me out on my behalf. Also sent a request via BT’s site for someone to call me back about it.

Am called back by a polite BT chap who totally failed to understand the problem, told me I was opted out (funny, I never opted in…) and advised that I sign up to a no commercial email scheme.

18th April 2011 ^

Receive more marketing email from BT. Ask BTCare on Twitter why that is. Am told that it can take a month to take effect.

18th May 2011 ^

Receive more marketing email from BT. Ask BTCare on Twitter why that is.

29th May 2011 ^

BTCare tells me on Twitter that they opted out the wrong address last time. Apologises and says it may take a further month.

25th July 2011 ^

BT sends me a marketing email on behalf of Plusnet.

2nd August 2011 ^

I (somewhat exasperatedly) ask BTCare if, since they can’t opt me out of the emails, we can come to a more formal arrangement for my proofreading services of £50 per future email.

BTCare replies that “We can’t opt you out of emails for other companies” and that “no compensation is available sorry.”

I point out that Plusnet is a BT company, that the emails are sent by BT on an email address given only to BT, and contain a BT unsubscribe link which does not work.

3rd August 2011 ^

BTCare asks if the email was from BT, and advises the use of a US-based commercial email opt-out site.

4th August 2011 ^

BTCare tells me that their unsubscribe link works now and that I should try it again. I try it again. It fails the same way. I tell BTCare.

5th August 2011 ^

BTCare tells me that I need to contact Plusnet directly: “the link may be BT related but its seperate to us and we have no control over them”

PlusNet (on Twitter and Identica) disagrees with BTCare and says BT sends those emails and operates the unsubscribe facility. They give me an email address at Plusnet to forward the marketing to anyway.

I have forwarded the email there and have so far got nothing back except an out of office email bounce. Oh well, it’s not really their problem anyway.

What to do now? ^

I would quite like to send a snail mail letter to BT to complain about this cluelessness. Does anyone know the best postal address and entity within BT for that to be directed to? If nothing else perhaps I can start sending the £50 a time invoices there?

I’d also quite like to not be a BT customer after this. I’m not too aware of my choices on that front though. My DSL is currently through Zen Internet, who I’m fairly happy with. I’d like something a bit faster but don’t want to become a Sky or Virgin Media customer.

I’m told I can get Zen to “take over the copper”. What does this mean? Would it cause me difficulty in switching to another ISP in future?

Finally I have a feeling that there’s some DPA consequences for failing to opt me out of marketing in 4 months of asking, and then saying that I can’t get them to opt me out of marketing from companies they have given my email address to. Worth dropping a line to ICO?

Just hit delete / block all email from BT ^

Yeah it’s not that annoying but hopefully you can agree that this run-around is ridiculous. While I remain a BT customer I would prefer not to bitbucket all email from them as they do sometimes send stuff related to the operation of my account.

On extension addresses ^

It’s a shame, but I now consider “+” as unusable in an extension address because of idiot web developers who turn sites that used to accept these completely valid addresses into sites that reject them.

Just use “-” instead. It doesn’t look as pretty but at least not even the most ill-informed developer can think that “-” is invalid. If your email address already contains “-” (perhaps because your name does?), shit, sucks to be you.

Domain name as hostname not recommended

I had an interesting support ticket yesterday.

Someone was trying to do an apt-get update via BitFolk’s apt cache and was ending up connecting to 2607:f0d0:1003:85::c40a:2942, where it was failing to update. This is not a BitFolk IPv6 address, nor is it the IPv6 address of a Debian mirror. Where was it coming from?

I’d asked the customer for the contents of a bunch of config files and output of the dig command, and while I was waiting for that I mentioned the problem on IRC, where Graham said:

<gdb> $ dig -t aaaa +short apt-cacher.com.net
<gdb> 2a00:1c10:3:634::3486:75a0
<gdb> 2607:f0d0:1003:85::c40a:2942
<grifferz> interesting
<gdb> Same for apt-cacher.bitfolk.com.net
<grifferz> so he's probably got some  search line in
           his resolv.conf
<gdb> I would ask what the search line is
<grifferz> r
<grifferz> search lines always good entertainment for
           those times when wtf moments are scarce
<gdb> Actually it's possible that the hostname is
      foo.net and there's no search line.

It seems that the enterprising folks at com.net have put in wildcard A and AAAA records which basically means that if you try to resolve *.com.net you end up at their “search portal”. That’s all web-based of course.

The customer didn’t have a search line, but the issue was that their host had a fully-qualified domain name (FQDN) along the lines of example.net.

This meant that according to default resolver settings it considered itself to be inside the domain net, and when searching for hosts (like apt-cacher.bitfolk.com) it would try to find them with .net appended first.

Massively confusing.
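To make it concrete, here’s a rough model of the candidate list a glibc-style stub resolver builds. This is a simplification (the real logic also honours resolv.conf options like ndots), and the “net” suffix is the domain derived from the host calling itself example.net:

```python
def candidates(name, search=("net",), ndots=1):
    """Approximate order in which a glibc-style resolver tries names."""
    tries = []
    if name.count(".") >= ndots:
        tries.append(name)                      # enough dots: as-is first
    tries += [name + "." + s for s in search]   # then the search suffixes
    if name.count(".") < ndots:
        tries.append(name)                      # bare names: as-is last
    return tries

print(candidates("apt-cacher.bitfolk.com"))
# ['apt-cacher.bitfolk.com', 'apt-cacher.bitfolk.com.net']
```

If the first lookup doesn’t produce the record being asked for, the resolver moves on to apt-cacher.bitfolk.com.net, which the com.net wildcard happily answers.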

It can be fixed by giving the resolver libraries a hint as to which domain you are actually in, in /etc/resolv.conf:

domain example.net
nameserver 192.168.1.2
nameserver 192.168.1.3

Having said that, it’s better not to pick your domain as the FQDN for any host and this is just one of the weird issues I have seen.

Sometimes customers order a VPS with an FQDN set to something like this, and I’ve yearned for an authoritative bit of documentation that says it’s not recommended. I also asked about it on HantsLUG a while back, and while there seemed to be some agreement, it still seems to come down to preference.

I’ve never really tried to tell a prospective customer that they should pick a host within their domain (e.g. foo.example.net) instead of the domain name as the FQDN, because it always seemed like too complicated a subject to explain. Maybe I should try to find a way in future.

Which site’s database got sold/leaked?

Earlier today I received several emails of the form:

Return-path: macdaddy@dedibox.fr
Envelope-to: andy@example.com
Delivery-date: Wed, 01 Jun 2011 00:58:02 +0000
Received: from impaqm2.telefonica.net ([213.4.138.10]
        helo=telefonica.net)
        by bitfolk.com with esmtp (Exim 4.69)
        (envelope-from <macdaddy@dedibox.fr>)
        id 1QRZl3-0006v3-06
        for andy@example.com; Wed, 01 Jun 2011 00:58:02 +0000
Received: from IMPmailhost3.adm.correo ([10.20.102.124])
        by IMPaqm2.telefonica.net with bizsmtp
        id qQYS1g01y2h2L9m3MQlr7A; Wed, 01 Jun 2011 02:45:51
        +0200
Received: from sd-1622.dedibox.fr ([88.191.14.154])
        by IMPmailhost3.adm.correo with BIZ IMP
        id qQlq1g00D3KS0VC1jQlqTB; Wed, 01 Jun 2011 02:45:5
        +0200
X-Brightmail-Tracker: ??
X-original-sender: electricidadromero@telefonica.net
Received: from [88.191.14.154] by sd-1622.dedibox.fr id
        96YxWPB6QbSt with SMTP; Wed, 01 Jun 2011 02:52:25
        +0200
Date: Wed, 01 Jun 2011 02:52:25 +0200
From: Support <macdaddy@dedibox.fr>
X-Mailer: The Bat! (v4.05.2) Personal
X-Priority: 3 (Normal)
Message-ID: <0288215865.30146090204853@sd-1622.dedibox.fr>
To: XXXX <andy@example.com>
MIME-Version: 1.0
Content-Type: text/plain;
        charset="windows-1252"
Content-Transfer-Encoding: 8bit
Subject: Your order reference is 1460489

Dear User, XXXX.

Your order has been accepted.

Your order reference is 18973.

Terms of delivery and the date can be found with the auto-generated msword
file located at:
http://www.macarthurmumsnbubs.com/Orders/Orders.zip?id:11190401Generation_mail=andy@example.com

============================
Best regards, ticket service.
Tel.: (050) 404 53 824

The above is verbatim other than I’ve replaced my email address with “andy@example.com” and the “XXXX” is actually a password that I’ve used on multiple web sites.

I assume that the linked Zip file is a trojan; I haven’t looked at it.

Does anyone else who’s received the same email know which site it might be who’s leaked or sold their user database?

Please don’t contact me to tell me that I should use a different password on every web site. That is impractical for me; I already use several different classes of password and the one in the email is one I only use on the most trivial sites. I’m not particularly worried over what details have been leaked, I’m more interested in which site leaked because whoever they are, they store their passwords in the clear.

I also can’t tell by email address. They seem to have used my generic email address, so this would be from before I started using a unique email address for each site.

Any ideas?

Sites which it is not:

Amazon, Apple, The Book Depository, Ebay, Facebook, Forbidden Planet, Giffgaff, Lulu, Moonpig, Novatech, PayPal, Play, T-Mobile, Twitter

(either I’m not a user of these services or my email/password there isn’t what was used)

Update 2011-Jun-02: It was Friendster.

Reporting it was hard work, but they did eventually agree to look into it.