Red Hat-based Linux under Xen, from Debian Etch

I find myself in the position of needing to overhaul BitFolk’s Red Hat-alike offerings, which at the moment are limited to Fedora Core 6. Apparently that is out of date now and Fedora aren’t supplying any updates for it anymore, and my understanding is that in Red Hat land one can’t just upgrade to the latest release without reinstalling. Sadly I don’t think I can convince every customer to upgrade to Debian or Ubuntu, so I need to move with the Red Hat times. I decided to start with Centos 5 because that’s the one with the most demand.

The first thing I find is what looks like a very nice guide to installing Centos under Xen. At the moment all of my dom0s are Debian Etch, some with Linux software RAID (md) and the newer ones with hardware RAID. All of the Xen domains currently have their block devices exported as individual LVM logical volumes from dom0.

Going by the HOWTO, that’s not the way to do it in Centos (and by extension, every other Red Hat-alike) anymore. They want their storage as a disk image, exported as /dev/xvda which they will partition themselves. A departure from my current setup, but should be possible. Unfortunately it seems that there is some bug somewhere in md/LVM/Xen about trying to export an LV as a whole disk like this. Say this is my domU config file:

name = 'centos5'
memory = 120
kernel = '/boot/vmlinuz-centos5-install'
ramdisk = '/boot/initrd.img-centos5-install'
extra = 'text'
disk = [ 'phy:mapper/mainvg-domu_centos5_xvda,xvda,w' ]
vif = [ 'mac=00:16:3e:1b:b2:e7, bridge=xenbr0, vifname=v-centos5' ]

with /dev/mapper/mainvg-domu_centos5_xvda being an LV.
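For readers unfamiliar with the disk line, here is a minimal Python sketch of how such a spec breaks down into backend type, backend device, guest-visible device and access mode. The function name parse_disk_spec is hypothetical and this is not Xen's own parser. It assumes, in line with the LV path given above, that a relative phy: path is resolved under /dev.

```python
def parse_disk_spec(spec):
    """Split a Xen disk spec 'backend:path,frontend,mode' into its parts."""
    backend_part, frontend, mode = (field.strip() for field in spec.split(","))
    backend_type, _, path = backend_part.partition(":")
    # Assumption: relative 'phy:' paths are resolved under /dev, which matches
    # the /dev/mapper/... path quoted in the post.
    if backend_type == "phy" and not path.startswith("/"):
        path = "/dev/" + path
    return {"type": backend_type, "path": path,
            "frontend": frontend, "mode": mode}


spec = parse_disk_spec("phy:mapper/mainvg-domu_centos5_xvda,xvda,w")
print(spec["path"], "is seen by the guest as", spec["frontend"])
```

A frontend name of xvda presents the LV to the guest as a whole disk, whereas names such as xvda1 would present it as a single partition.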

If I start this up, the installer kernel kicks in and begins the process, but any operations on the exported disk result in thousands of dom0 kernel messages like this:

Jan 19 01:54:43 montelena kernel: raid10_make_request bug: can't convert block across chunks or bigger than 64k 653279743 4

At some point the domU will see this as a corrupted block device, remount it read-only and the installation will fail.
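One plausible reading of that kernel message, which is an assumption and not something confirmed in this post, is that the two trailing numbers are the starting sector of the request and its size in KiB, and that the RAID10 chunk size is the 64k mentioned. Under that reading the arithmetic can be checked with a short Python sketch (crosses_chunk is a hypothetical helper, not kernel code):

```python
SECTOR_BYTES = 512


def crosses_chunk(start_sector, size_kib, chunk_kib=64):
    """Return True if a request spills past the end of its RAID chunk."""
    chunk_sectors = chunk_kib * 1024 // SECTOR_BYTES    # 64 KiB -> 128 sectors
    request_sectors = size_kib * 1024 // SECTOR_BYTES   # 4 KiB  -> 8 sectors
    offset_in_chunk = start_sector % chunk_sectors
    return offset_in_chunk + request_sectors > chunk_sectors


# Values taken from the log line above (interpretation assumed, see lead-in).
print(653279743 % 128)              # 127: the request starts on the last sector of a chunk
print(crosses_chunk(653279743, 4))  # True: 127 + 8 sectors exceeds the 128-sector chunk
```

If that interpretation is right, the failing request is a 4 KiB block that starts on an odd sector and straddles two chunks. That would fit a partition inside the exported disk starting at a sector not aligned to the chunk size, though that is a guess.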

Searching for more information on this problem leads me only to a similar report from January 2007 by Ask Bjørn Hansen with no follow-ups except from me, reporting the same thing. I mailed Ask to see if he solved the problem, but he told me he just moved to image files and then the problem went away. I do not want to move to image files.

The problem does not happen when I try it on a machine with hardware RAID. I still have too few machines with hardware RAID to accept that as a solution though.

I’m thinking that it wouldn’t happen if I did away with exporting whole disks and went back to exporting individual block devices. To test my theory I thought I’d go ahead and do the Centos install on one of the machines with hardware RAID, which would leave me with a disk image on an LV that I could later split out into separate block devices.

I then encountered problems getting pygrub to work, so I may as well note down what I did to get that going in case it would be useful for anyone else.

For those who aren’t aware, pygrub is a Python tool which can look inside a disk image or a filesystem to find a GRUB menu.lst. It then emulates the GRUB menu and returns the kernel, initrd, boot arguments etc. etc. that GRUB would normally have selected. It’s a way to keep all the kernel stuff inside the guest filesystem so it can be managed as usual by the admin of the guest.
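To make that description concrete, here is a small Python sketch of the kind of parsing involved: reading a GRUB legacy menu.lst and picking out the kernel, initrd and boot arguments of the default entry. It is not pygrub's actual code. The parse_menu_lst function and the sample menu text (including the kernel version) are made up for illustration.

```python
import re

# key, then either "=" or whitespace as separator, then the rest of the line
LINE_RE = re.compile(r"(\w+)(?:\s*=\s*|\s+)?(.*)")

# Hypothetical menu.lst contents, roughly what a Red Hat-style guest might have.
SAMPLE = """
default=0
timeout=5
title CentOS (2.6.18-53.el5xen)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-53.el5xen ro root=LABEL=/
        initrd /initrd-2.6.18-53.el5xen.img
"""


def parse_menu_lst(text):
    """Return (default_index, entries) from GRUB legacy menu.lst text."""
    default, entries = 0, []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        match = LINE_RE.match(line)
        if not match:
            continue
        key, value = match.groups()
        if key == "default" and value.isdigit():
            default = int(value)
        elif key == "title":
            entries.append({"title": value})
        elif entries and key == "kernel":
            path, _, args = value.partition(" ")
            entries[-1]["kernel"] = path
            entries[-1]["args"] = args.strip()
        elif entries and key in ("root", "initrd"):
            entries[-1][key] = value
    return default, entries


default, entries = parse_menu_lst(SAMPLE)
chosen = entries[default]
print(chosen["kernel"], chosen["initrd"], chosen["args"])
```

The real tool also has to read the file out of the guest's filesystem, which is where the filesystem modules discussed further down come in.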

pygrub comes as part of Xen, and on Etch it can be found at /usr/lib/xen-3.0.3-1/bin/pygrub. To test it you can just run it with a disk or filesystem image/device as the parameter:

$ sudo /usr/lib/xen-3.0.3-1/bin/pygrub /dev/mainvg/domu_centos5_xvda

What that got me at first was a python error about “GrubConfig” not being found (sorry, the exact text has scrolled off my screen and I forgot to save it to a file…). That’s because of Debian bug #390678. To fix it just do as it says and edit the sys.path in /usr/lib/xen-3.0.3-1/lib/python/grub/GrubConf.py to be /usr/lib/xen-3.0.3-1/lib/python.

That got me a new error about being unable to read the filesystem. Searching around provided me with some clues that it would be a case of putting some python filesystem stuff in /usr/lib/xen-3.0.3-1/lib/python/grub/fsys/.

I downloaded the source of Xen 3.0.3, went into tools/pygrub/, installed the python2.4-dev and e2fslibs-dev packages, then did a “make”. That completed without error and left me with:

$ ls -la build/lib.linux-i686-2.4/grub/fsys/ext2/
total 36
drwxr-xr-x 2 andy andy  1024 2008-01-19 22:33 .
drwxr-xr-x 3 andy andy  1024 2008-01-19 22:33 ..
-rw-r--r-- 1 andy andy  1126 2006-10-15 12:22 __init__.py
-rwxr-xr-x 1 andy andy 30352 2008-01-19 22:33 _pyext2.so
-rw-r--r-- 1 andy andy   220 2006-10-15 12:22 test.py

which I copied into /usr/lib/xen-3.0.3-1/lib/python/grub/fsys/ext2/.

Success! I’ve now got Centos 5 in a disk image which can be booted from my Etch dom0 with the following domU config file:

name = 'centos5'
memory = 120
disk = [ 'phy:mapper/mainvg-domu_centos5_xvda,xvda,w' ]
vif = [ 'mac=00:16:3e:1b:b2:e7, bridge=xenbr0, vifname=v-centos5' ]
bootloader = '/usr/lib/xen-3.0.3-1/bin/pygrub'

(Remember this is only on a dom0 with hardware RAID; if I try to do this on LVM over md I suffer disk corruption in the exported LV)

The next thing I wanted to do was get access to the first partition of that disk image from the dom0 so that I could take a copy of it and use it as the basis for another block device. It may not be immediately obvious to you how to get at a partition inside a disk image; the way to do it is with kpartx. On Debian that can be found in the package multipath-tools:

$ sudo kpartx -a /dev/mainvg/domu_centos5_xvda
$ sudo ls -la /dev/mapper/*centos5*
brw-rw---- 1 root disk 254, 96 2008-01-19 23:51 /dev/mapper/domu_centos5_xvda1
brw-rw---- 1 root disk 254, 97 2008-01-19 23:51 /dev/mapper/domu_centos5_xvda2
brw-rw---- 1 root disk 254, 95 2008-01-19 20:20 /dev/mapper/mainvg-domu_centos5_xvda
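For the curious, the partition boundaries that kpartx maps come from the partition table at the start of the disk image. The following Python sketch reads the four primary entries of a classic MBR table. It assumes an MBR-style table and does nothing more than list the entries. The mbr_partitions and fake_entry helpers and the sample values are made up for illustration.

```python
import struct


def mbr_partitions(sector0):
    """Yield (number, type, start_sector, sector_count) for used primary entries."""
    if sector0[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature")
    for number in range(1, 5):
        entry = sector0[446 + 16 * (number - 1): 446 + 16 * number]
        ptype = entry[4]                                  # partition type byte
        start, count = struct.unpack_from("<II", entry, 8)  # LBA start, length
        if ptype != 0:
            yield number, ptype, start, count


def fake_entry(ptype, start, count):
    """Build a 16-byte partition entry for the demonstration below."""
    return struct.pack("<B3sB3sII", 0, b"\0\0\0", ptype, b"\0\0\0", start, count)


# Synthetic first sector: a Linux partition and a swap partition (made-up sizes).
sector0 = (bytes(446)
           + fake_entry(0x83, 63, 1000000)
           + fake_entry(0x82, 1000063, 200000)
           + bytes(32)
           + b"\x55\xaa")

for number, ptype, start, count in mbr_partitions(sector0):
    print(f"partition {number}: type 0x{ptype:02x}, "
          f"byte offset {start * 512}, {count} sectors")
```

Against a real device the first sector would be obtained by opening the LV in binary mode and reading 512 bytes, which needs root.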

The first two (domu_centos5_xvda1 and domu_centos5_xvda2) were added by kpartx and can be mounted like any block device. I mounted the first, took a tar archive of its contents, created a new test LV, and unpacked the archive into it. I then labelled the new LV as “/” (because that’s how Red Hatters find their root filesystem):

$ sudo e2label /dev/mainvg/domu_test_root /

and came up with a new domU config file:

name = 'centos5'
memory = 120
disk = [ 'phy:mapper/mainvg-domu_test_root,xvda1,w',
         'phy:mapper/mainvg-domu_test_swap,xvda2,w' ]
vif = [ 'mac=00:16:3e:1b:b2:e7, bridge=xenbr0, vifname=v-centos5' ]
bootloader = '/usr/lib/xen-3.0.3-1/bin/pygrub'

This works, no disk corruption.
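As an aside on the labelling step: the label that e2label writes lives in the ext2/ext3 superblock, so it travels with the filesystem rather than with the device name. A minimal Python sketch of reading it back is below. It assumes the standard ext2 superblock layout (superblock at byte 1024, magic at offset 56, 16-byte volume name at offset 120). The ext2_label function and the synthetic image are for illustration only.

```python
import struct

SUPERBLOCK_OFFSET = 1024   # superblock starts 1 KiB into the filesystem
MAGIC_OFFSET = 56          # s_magic, little-endian 16-bit
LABEL_OFFSET = 120         # s_volume_name, 16 bytes, NUL padded
EXT2_MAGIC = 0xEF53


def ext2_label(image):
    """Return the volume label stored in an ext2/ext3 superblock."""
    sb = image[SUPERBLOCK_OFFSET:SUPERBLOCK_OFFSET + 1024]
    (magic,) = struct.unpack_from("<H", sb, MAGIC_OFFSET)
    if magic != EXT2_MAGIC:
        raise ValueError("not an ext2/ext3 filesystem")
    raw = sb[LABEL_OFFSET:LABEL_OFFSET + 16]
    return raw.split(b"\0", 1)[0].decode("ascii")


# Synthetic image: only the two superblock fields used above are filled in.
sb = bytearray(1024)
struct.pack_into("<H", sb, MAGIC_OFFSET, EXT2_MAGIC)
sb[LABEL_OFFSET:LABEL_OFFSET + 1] = b"/"
image = bytes(1024) + bytes(sb)

print(ext2_label(image))
```

On a real system the same bytes could be read from the first 2 KiB of /dev/mainvg/domu_test_root, although e2label with no new label argument already prints the current label.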

So Centos 5 is available now, and I will try to get around to Fedora 8 as well at some point.

10 thoughts on “Red Hat-based Linux under Xen, from Debian Etch”

  1. Thanks for this post. Pulling my hair out w/ pygrub, trying to build a centos domU on my debian dom0. I’m also using LV’s for domU’s, and wasn’t about to switch to images. Downloading the xen source and building the ext2 pieces worked like a champ.

  2. Just for the record: The disk corruption problem does not occur when using RAID1 (at least not with Xen 3.1/3.2, kernel 2.6.22 [xen patch from ubuntu] in dom0 and 2.6.18 [xen patch from debian/fedora] in domU). I can’t remember if i tried xvda with Xen 3.0.x and 2.6.18 in dom0, though.

  3. I really love how i just need to waste an hour googling and messing around in the pygrub source and just download the xen sources on top of it just because no one at debian bothers to TEST or FIX this issue since 2006.
    wouldnt be as funny if there werent this exec exploit in pygrub which debian now at least can claim to be absolutely prone against.
    You can’t hack whats broken…

  4. Hi,

    The broken pygrub is not a security bug so not going to be fixed at least until the next stable release I suppose.

    Do you have the details of the pygrub exploit? I thought I saw it fixed, which is what I would expect as a security issue.

    If you’re just here to bash Debian then I’m not really interested however.

  5. of course I hadnt come here just to bash debian…?! I just ran into this site while trying to make pygrub work like everyone else; just at some point I got real angry about how many peoples time is spent on this; the libs are missing in the package and it should -obviously- just be fixed.

    that aside, I think only rPath has actually bothered to fix the pygrub exploit, at least I didnt find much further references. There’s a long way to go for pygrub anyway, it’ll be interesting to see if the other grub features like tftp support or trusted grub extensions will go in at some point.

    About the exploits nature, let’s just say pygrub was very eager to execute domu:/boot/grub/menu.lst, so instructions could be crafted to run in dom0, without using overflows or such.

    for shared hosting scenarios this is an obvious nightmare 🙂
