Adventures in entropy, part 2

Recap ^

Back in part 1 I discussed what entropy is as far as Linux is concerned, why I’ve started to look into entropy as it relates to a Linux/Xen-based virtual hosting platform, how much entropy I have available, and how this might be improved.

If you haven’t read that part yet then you might want to do so before carrying on with this one.

As before, click on any graph to see the full-size version.

Hosting server with an Entropy Key ^

Recently I colocated a new hosting server so it seemed like a good opportunity to try out the Entropy Key at the same time. Here’s what the available entropy looks like whilst ekeyd is running.

urquell.bitfolk.com available entropy with ekey, daily

First impressions: this is pretty impressive. Available entropy hovers very close to 4096 bits, the size of the kernel’s pool, at all times. There is very little jitter.

Trying to deplete the entropy pool, while using an Entropy Key ^

As per Hugo’s comment in part 1, I tried watch -n 0.25 cat /proc/sys/kernel/random/entropy_avail to see if I could deplete the entropy pool, but it had virtually no effect. I then tried watch -n 0.1 cat /proc/sys/kernel/random/entropy_avail (so every tenth of a second) and the available entropy fluctuated mostly around 4000 bits, with a brief dip to ~3600 bits:

urquell.bitfolk.com available entropy with ekey, trying to deplete the pool

In the above graph, the first watch invocation was at ~1100 UTC. The second one was at ~1135 UTC.
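For anyone wanting to repeat this without the measurement itself getting in the way, here is a minimal Python sketch that polls entropy_avail from a single long-running process instead of spawning cat every time (a commenter below explains why each new process costs some entropy). The function names and the sampling parameters are my own illustrative choices, not part of any tool mentioned in this post.

```python
import time
from pathlib import Path

# Standard location of the kernel's entropy estimate on Linux.
ENTROPY_AVAIL = Path("/proc/sys/kernel/random/entropy_avail")


def read_entropy(path=ENTROPY_AVAIL):
    """Return the kernel's current entropy estimate (in bits) as an integer."""
    return int(Path(path).read_text().strip())


def sample_entropy(count, interval, path=ENTROPY_AVAIL):
    """Take `count` readings, `interval` seconds apart, from one process."""
    readings = []
    for i in range(count):
        readings.append(read_entropy(path))
        if i < count - 1:
            # Sleep between readings only, not after the last one.
            time.sleep(interval)
    return readings
```

On a Linux host, sample_entropy(20, 0.1) returns twenty readings taken a tenth of a second apart; taking min() and max() of that list gives a rough idea of the jitter.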

Disabling the Entropy Key ^

Unfortunately I forgot to get graphs of urquell before ekeyd was started, so I have no baseline for this machine.

I assumed it would be the same as all the other host machines, but decided to shut down ekeyd to verify that. Here’s what happened.

urquell.bitfolk.com available entropy with ekeyd shut down, daily

The huge chasm of very little entropy in the middle of this graph is urquell running without ekeyd. At first I was at a loss to explain why it should only have ~400 bits of entropy by itself, when the other hosting servers manage somewhere between 3250 and 4096 bits.

I now believe that it’s because urquell is newly installed and has no real load. Looking into how modern Linux kernels obtain entropy, it’s basically:

  • keyboard interrupts;
  • mouse interrupts;
  • other device driver interrupts with the flag IRQF_SAMPLE_RANDOM.

Bear in mind that headless servers usually don’t have a mouse or keyboard attached!

You can see which other drivers are candidates for filling up the entropy pool by looking where the IRQF_SAMPLE_RANDOM identifier occurs in the source of the kernel:

http://www.cs.fsu.edu/~baker/devices/lxr/http/ident?i=IRQF_SAMPLE_RANDOM
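If you would rather check a kernel tree you have unpacked locally than rely on an online cross-reference, something like the following Python sketch does the same job. It simply walks the tree and lists the .c files that mention the identifier; the function name is hypothetical and the tree path is whatever you pass in.

```python
import os

FLAG = "IRQF_SAMPLE_RANDOM"


def find_flag_users(tree, flag=FLAG):
    """Return sorted relative paths of .c files under `tree` that mention `flag`."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(tree):
        for name in filenames:
            if not name.endswith(".c"):
                continue
            full = os.path.join(dirpath, name)
            # Kernel sources are not guaranteed to be valid UTF-8 throughout,
            # so undecodable bytes are replaced rather than raising an error.
            with open(full, encoding="utf-8", errors="replace") as handle:
                if flag in handle.read():
                    hits.append(os.path.relpath(full, tree))
    return sorted(hits)
```

Printing each entry of find_flag_users("/path/to/linux") gives a list of candidate drivers to compare against the hardware actually present in the machine.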

(As an aside, in 2.4.x kernels most of the network interface card drivers had IRQF_SAMPLE_RANDOM. These flags were all removed through the 2.6.x cycle, since it was decided that IRQF_SAMPLE_RANDOM is really only for interrupts that can’t be observed or tampered with by an outside party. That’s why a lot of people reported problems with lack of entropy after upgrading their kernels.)

My hosting servers are typically Supermicro motherboards with Intel gigabit NICs and a 3ware RAID controller. The most obvious device in the list that could be supplying entropy is probably block/xen-blkfront, since there’s one of those for each block device exported to a Xen virtual machine on the system.

To test the hypothesis that the other servers are getting entropy from busy Xen block devices, I shut down ekeyd and then hammered on a VM filesystem:

urquell.bitfolk.com available entropy with ekeyd shut down, hammering a VM filesystem

The increase you see towards the end of the graph was while I was hammering the virtual machine’s filesystem. I was able to raise the available entropy to a stable ~2000 bits doing this, so I’m satisfied that if urquell were as busy as the other servers then it would have similar available entropy to them, even without the Entropy Key.

Feeding entropy to other hosts ^

ekeyd by default feeds entropy from the key directly into the Linux kernel of the host it’s on, but it can be configured to listen on a Unix or TCP socket and mimic the egd protocol. I set it up this way and then put an instance of HAProxy into a VM with my ekeyd as a back end. So at this point I had a service IP which would talk the egd protocol, and which client machines could use to request entropy.

On the client side, ekeyd-egd-linux can be found in Debian lenny-backports and in Debian squeeze, as well as Ubuntu universe since Jaunty. This daemon can read from a Unix or TCP socket using the egd protocol and will feed the received entropy into the Linux kernel.
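To give an idea of what a client does on the wire, here is a minimal Python sketch of the read side of the egd protocol. It is written from my recollection of the EGD documentation, so treat the command bytes as assumptions to check against that documentation: 0x00 asks for the entropy level and is answered with a 4-byte big-endian bit count, and 0x02 followed by a length byte is a blocking read answered with exactly that many bytes. The function names are illustrative, and this is not how ekeyd-egd-linux itself is implemented.

```python
import socket
import struct

# Command bytes as I understand the EGD protocol (assumption, verify first).
CMD_ENTROPY_LEVEL = 0x00
CMD_READ_NONBLOCKING = 0x01
CMD_READ_BLOCKING = 0x02


def build_read_request(nbytes, blocking=True):
    """Build a two-byte EGD read request for between 1 and 255 bytes."""
    if not 1 <= nbytes <= 255:
        raise ValueError("EGD requests carry the length in a single byte")
    command = CMD_READ_BLOCKING if blocking else CMD_READ_NONBLOCKING
    return struct.pack("BB", command, nbytes)


def recv_exact(sock, nbytes):
    """Read exactly `nbytes` from `sock`, or raise if the peer closes early."""
    chunks = []
    remaining = nbytes
    while remaining:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("EGD server closed the connection early")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def request_entropy(sock, nbytes):
    """Ask an EGD server for `nbytes` of entropy using a blocking read."""
    sock.sendall(build_read_request(nbytes, blocking=True))
    return recv_exact(sock, nbytes)


def entropy_level(sock):
    """Ask how many bits of entropy the server currently claims to hold."""
    sock.sendall(struct.pack("B", CMD_ENTROPY_LEVEL))
    (bits,) = struct.unpack(">I", recv_exact(sock, 4))
    return bits
```

A client would open the connection with something like socket.create_connection(("entropy.example.net", 8888)), where both the host name and the port are placeholders, and then call request_entropy(sock, 64). Getting the received bytes into the kernel’s pool is a separate, root-only step that this sketch leaves out.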

I took a look at which of my VMs had the lowest available entropy and installed ekeyd-egd-linux on them, pointing it at my entropy service IP:

admin.obstler.bitfolk.com available entropy after hooking up to entropy service

panel0.bitfolk.com available entropy after hooking up to entropy service

spamd0.lon.bitfolk.com available entropy after hooking up to entropy service

Success!

Where next? ^

  • Get some customers using it, explore the limits of how much entropy can be served.
  • Buy another Entropy Key so that it doesn’t all grind to a halt if one of them should die.
  • Investigate a way to get egd to read from another egd so I can serve the entropy directly from a VM and not have so many connections to my real hardware. Anyone interested in coding that?
  • Monitor the served entropy both for availability and for quality.
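On the last point, a very crude first step for quality monitoring might look like the Python sketch below: a monobit check that compares the number of 1 bits in a sample against what a fair coin would give. Passing it says very little about real quality; it is only a tripwire for a source that has died or got stuck. The function names and the threshold of 4.0 are arbitrary choices of mine.

```python
import math


def ones_fraction(data):
    """Fraction of bits set to 1 in `data` (ideally close to 0.5)."""
    if not data:
        raise ValueError("need at least one byte")
    ones = sum(bin(byte).count("1") for byte in data)
    return ones / (8 * len(data))


def monobit_z_score(data):
    """Standard score of the ones count under a fair-coin model."""
    n_bits = 8 * len(data)
    ones = ones_fraction(data) * n_bits
    # For n fair bits the ones count has mean n/2 and variance n/4.
    return (ones - n_bits / 2) / math.sqrt(n_bits / 4)


def looks_suspicious(data, threshold=4.0):
    """Flag samples whose bit balance is far from what a fair source gives."""
    return abs(monobit_z_score(data)) > threshold
```

A monitoring check could fetch a few kilobytes from the entropy service at intervals and raise an alert whenever looks_suspicious() returns True.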

7 thoughts on “Adventures in entropy, part 2”

  1. Really interesting pair of articles!

    I’ve tried running ‘watch -n 0.25 cat entropy_avail’ and my entropy goes down too. What I did find interesting about that command though is that leaving my computer alone the entropy stayed put for up to a few seconds at a time. But, moving the mouse and typing on my keyboard caused the entropy to jump around, obviously as a result of all the interrupts being generated.

  2. The reason entropy drops when you try to look at it is that you’re causing the execution of a program. Modern Linux runtime linkers randomise where shared libraries are loaded to make attacks more difficult. It does this by reading 32 or 64 bytes from /dev/random.

    Glad you’re having success with the Entropy Key! (Disclaimer: I work for the manufacturer.)

  3. I’ve been doing some experiments with Entropy Key too. Thanks for the explanation about why starting a process (such as cat) to read entropy_avail actually uses entropy.

    To counteract this, I used the following little Perl script which reads entropy_avail every 0.1 seconds without spawning a process each time:

    http://gist.github.com/485595

  4. Two very interesting articles. I really need to get hold of an Entropy Key now that they are available.

    Another solution to this problem is to strip out the Linux kernel PRNG, and use a better PRNG, one that can not be drained of entropy. There are patches against the Linux kernel for both Yarrow and Fortuna PRNGs – have you considered this approach as well or instead?

    1. Gav, I must admit I personally haven’t explored that because I’m in the position of being asked to provide entropy rather than being the one in need of it. Although my examples have been based on (mostly virtual) machines I operate, that’s only because those are the only machines I have access to. 🙂

      While I can measure particularly low available entropy on some of them, it wasn’t actually causing me any problems. If it had been causing me problems then I would probably have set it up to top up /dev/random from /dev/urandom and considered that secure enough.

      What brought all this to my attention was customers complaining to me about lack of entropy in their virtual machines. I started suggesting they just use /dev/urandom or top up their /dev/random from something else, but I’m not willing to promise someone else that this is secure enough for their needs. I didn’t think of suggesting they change the PRNG too. I think it might be more work than they are willing to take on (customers support their own kernels, so almost all use their distribution’s packaged one). Anyway, they now have the Entropy Key as another option.

  5. Gav: Replacing Linux’s current RNG with another doesn’t really solve any problems. /dev/urandom will just turn into a PRNG (instead of a TRNG) when the pool runs low anyway. If you want that, just symlink /dev/urandom to /dev/random. Otherwise, stick with having them separate and have things you really care about use /dev/random, and stuff you don’t use /dev/urandom; best of both worlds.
