Red Hat-based Linux under Xen, from Debian Etch

I find myself in the position of needing to overhaul BitFolk’s Red Hat-alike offerings, which at the moment are limited to Fedora Core 6. Apparently that is out of date now and Fedora aren’t supplying any updates for it anymore, and my understanding is that in Red Hat land one can’t just upgrade to the latest release without reinstalling. Sadly I don’t think I can convince every customer to upgrade to Debian or Ubuntu, so I need to move with the Red Hat times. I decided to start with CentOS 5 because that’s the one with the most demand.

The first thing I find is what looks like a very nice guide to installing CentOS under Xen. At the moment all of my dom0s are Debian Etch, some with Linux software RAID (md) and the newer ones with hardware RAID. All of the Xen domains currently have their block devices exported as individual LVM logical volumes from dom0.

Going by the HOWTO, that’s not the way to do it in CentOS (and by extension, every other Red Hat-alike) anymore. They want their storage as a disk image, exported as /dev/xvda, which they will partition themselves. That is a departure from my current setup, but it should be possible. Unfortunately there seems to be a bug somewhere in md/LVM/Xen that is triggered by exporting an LV as a whole disk like this. Say this is my domU config file:

name = 'centos5'
memory = 120
kernel = '/boot/vmlinuz-centos5-install'
ramdisk = '/boot/initrd.img-centos5-install'
extra = 'text'
disk = [ 'phy:mapper/mainvg-domu_centos5_xvda,xvda,w' ]
vif = [ 'mac=00:16:3e:1b:b2:e7, bridge=xenbr0, vifname=v-centos5' ]

with /dev/mapper/mainvg-domu_centos5_xvda being an LV.

If I start this up, the installer kernel kicks in and begins the process, but any operations on the exported disk result in thousands of dom0 kernel messages like this:

Jan 19 01:54:43 montelena kernel: raid10_make_request bug: can't convert block across chunks or bigger than 64k 653279743 4

At some point the domU will see this as a corrupted block device, remount it read-only and the installation will fail.
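One rough way to read that kernel message is sketched below. It assumes the two trailing numbers are the starting sector of the request and its size in KiB, and that sectors are 512 bytes; that is my reading of the message, not something confirmed above. The function names are made up for illustration. Under those assumptions, the request in the log line starts on the very last sector of a 64k chunk and so spills into the next one.

```python
SECTOR_BYTES = 512  # assumed sector size


def offset_in_chunk(start_sector, chunk_kib=64):
    """Sector offset of the request's start within its RAID chunk."""
    chunk_sectors = chunk_kib * 1024 // SECTOR_BYTES
    return start_sector % chunk_sectors


def crosses_chunk(start_sector, size_kib, chunk_kib=64):
    """Return True if the request touches more than one RAID chunk."""
    chunk_sectors = chunk_kib * 1024 // SECTOR_BYTES
    size_sectors = size_kib * 1024 // SECTOR_BYTES
    first_chunk = start_sector // chunk_sectors
    last_chunk = (start_sector + size_sectors - 1) // chunk_sectors
    return first_chunk != last_chunk


# Numbers taken from the log line above: "... 64k 653279743 4"
print(offset_in_chunk(653279743))   # 127, the last sector of a 128-sector chunk
print(crosses_chunk(653279743, 4))  # True
```

A request of the same size starting one sector later (653279744) would sit entirely inside a single chunk, which is consistent with the complaint being about alignment rather than size.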

Searching for more information on this problem leads me only to a similar report from January 2007 by Ask Bjørn Hansen with no follow-ups except from me, reporting the same thing. I mailed Ask to see if he solved the problem, but he told me he just moved to image files and then the problem went away. I do not want to move to image files.

The problem does not happen when I try it on a machine with hardware RAID. I still have too few machines with hardware RAID to accept that as a solution though.

I suspect that it won’t happen if I do away with exporting whole disks and go back to exporting individual block devices. To test my theory I thought I’d go ahead and do the CentOS install on one of the machines with hardware RAID, which would leave me with a disk image on an LV that I could later split out into separate block devices.

I then encountered problems getting pygrub to work, so I may as well note down what I did to get that going in case it would be useful for anyone else.

For those who aren’t aware, pygrub is a Python tool which can look inside a disk image or a filesystem to find a GRUB menu.lst. It then emulates the GRUB menu and returns the kernel, initrd, boot arguments and so on that GRUB would normally have selected. It’s a way to keep all the kernel stuff inside the guest filesystem so it can be managed as usual by the admin of the guest.
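To make the idea concrete, here is a minimal sketch of the "read menu.lst, pick the default entry" step. It is not pygrub's actual code: the function names and the sample menu.lst contents are invented for illustration, and it only understands the default, title, kernel and initrd lines of a GRUB legacy config.

```python
import re

# Made-up sample; a real guest's menu.lst will have its own kernel names.
SAMPLE_MENU_LST = """\
default=0
timeout=5
title Example Xen kernel
        root (hd0,0)
        kernel /vmlinuz-example ro root=LABEL=/
        initrd /initrd-example.img
title Example fallback kernel
        root (hd0,0)
        kernel /vmlinuz-fallback ro root=LABEL=/ single
        initrd /initrd-fallback.img
"""


def parse_menu_lst(text):
    """Return (default_index, entries) from GRUB legacy menu.lst text."""
    default = 0
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        # Commands look like "default=0" or "kernel /path args..."
        m = re.match(r"(\w+)(?:\s*=\s*|\s+|$)(.*)", line)
        if m is None:
            continue
        cmd, arg = m.group(1), m.group(2)
        if cmd == "default" and arg.isdigit():
            default = int(arg)
        elif cmd == "title":
            entries.append({"title": arg, "kernel": None,
                            "args": "", "initrd": None})
        elif cmd == "kernel" and entries:
            path, _, args = arg.partition(" ")
            entries[-1]["kernel"] = path
            entries[-1]["args"] = args.strip()
        elif cmd == "initrd" and entries:
            entries[-1]["initrd"] = arg
    return default, entries


def select_boot_entry(text):
    """Pick the entry GRUB would boot by default."""
    default, entries = parse_menu_lst(text)
    return entries[default]


entry = select_boot_entry(SAMPLE_MENU_LST)
print(entry["kernel"], entry["initrd"], entry["args"])
```

The real tool also has to open the disk image and read the guest filesystem to find the file in the first place, which is the part that caused me trouble below.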

pygrub comes as part of Xen, and on Etch it can be found at /usr/lib/xen-3.0.3-1/bin/pygrub. To test it you can just run it with a disk or filesystem image/device as the parameter:

$ sudo /usr/lib/xen-3.0.3-1/bin/pygrub /dev/mainvg/domu_centos5_xvda

What that got me at first was a Python error about “GrubConfig” not being found (sorry, the exact text has scrolled off my screen and I forgot to save it to a file…). That’s because of Debian bug #390678. To fix it, just do as the bug report says and edit the sys.path in /usr/lib/xen-3.0.3-1/lib/python/grub/GrubConf.py to be /usr/lib/xen-3.0.3-1/lib/python.
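For anyone wondering why a sys.path edit would matter: Python only finds a module if the directory containing its package is on sys.path. The sketch below is not the actual Debian fix. It builds a throwaway package in a temporary directory (demo_grub is a made-up name, and the temporary directory merely stands in for /usr/lib/xen-3.0.3-1/lib/python) and shows the import failing until that directory is added.

```python
import importlib
import os
import sys
import tempfile

# Stand-in for the real library directory; purely illustrative.
lib_dir = tempfile.mkdtemp()
pkg_dir = os.path.join(lib_dir, "demo_grub")  # hypothetical package name
os.mkdir(pkg_dir)
open(os.path.join(pkg_dir, "__init__.py"), "w").close()
with open(os.path.join(pkg_dir, "GrubConf.py"), "w") as f:
    f.write("LOADED = True\n")


def can_import(name):
    """Return True if the named module can be imported right now."""
    try:
        importlib.import_module(name)
        return True
    except ImportError:
        return False


before = can_import("demo_grub.GrubConf")  # directory not on sys.path yet
sys.path.insert(0, lib_dir)                # the equivalent of the path edit
importlib.invalidate_caches()
after = can_import("demo_grub.GrubConf")

print(before, after)
```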

That got me a new error about being unable to read the filesystem. Searching around provided me with some clues that it would be a case of putting some Python filesystem modules in /usr/lib/xen-3.0.3-1/lib/python/grub/fsys/.

I downloaded the source of Xen 3.0.3, went into tools/pygrub/, installed the python2.4-dev and e2fslibs-dev packages, then did a “make”. That completed without error and left me with:

$ ls -la build/lib.linux-i686-2.4/grub/fsys/ext2/
total 36
drwxr-xr-x 2 andy andy  1024 2008-01-19 22:33 .
drwxr-xr-x 3 andy andy  1024 2008-01-19 22:33 ..
-rw-r--r-- 1 andy andy  1126 2006-10-15 12:22 __init__.py
-rwxr-xr-x 1 andy andy 30352 2008-01-19 22:33 _pyext2.so
-rw-r--r-- 1 andy andy   220 2006-10-15 12:22 test.py

which I copied into /usr/lib/xen-3.0.3-1/lib/python/grub/fsys/ext2/.

Success! I’ve now got CentOS 5 in a disk image which can be booted from my Etch dom0 with the following domU config file:

name = 'centos5'
memory = 120
disk = [ 'phy:mapper/mainvg-domu_centos5_xvda,xvda,w' ]
vif = [ 'mac=00:16:3e:1b:b2:e7, bridge=xenbr0, vifname=v-centos5' ]
bootloader = '/usr/lib/xen-3.0.3-1/bin/pygrub'

(Remember this is only on a dom0 with hardware RAID; if I try to do this on LVM over md I suffer disk corruption in the exported LV.)

The next thing I wanted to do was get access to the first partition of that disk image from the dom0 so that I could take a copy of it and use it as the basis for another block device. It may not be immediately obvious to you how to get at a partition inside a disk image; the way to do it is with kpartx. On Debian that can be found in the package multipath-tools:

$ sudo kpartx -a /dev/mainvg/domu_centos5_xvda
$ sudo ls -la /dev/mapper/*centos5*
brw-rw---- 1 root disk 254, 96 2008-01-19 23:51 /dev/mapper/domu_centos5_xvda1
brw-rw---- 1 root disk 254, 97 2008-01-19 23:51 /dev/mapper/domu_centos5_xvda2
brw-rw---- 1 root disk 254, 95 2008-01-19 20:20 /dev/mapper/mainvg-domu_centos5_xvda
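As an aside, the information kpartx needs lives in the partition table at the start of the disk image. The sketch below is not kpartx's code; it is a minimal reader for the four primary slots of a classic MBR, with invented function names, and it ignores extended/logical partitions. It is run here against a synthetic MBR with made-up sizes rather than the real LV.

```python
import struct

SECTOR_BYTES = 512


def read_mbr_partitions(mbr):
    """Return (number, start_sector, sector_count, type) per used primary slot."""
    if len(mbr) < 512 or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR signature found")
    parts = []
    for i in range(4):
        # Each 16-byte entry starts at offset 446 + 16 * i.
        entry = mbr[446 + 16 * i: 446 + 16 * (i + 1)]
        ptype = entry[4]
        start, count = struct.unpack("<II", entry[8:16])
        if ptype != 0 and count > 0:
            parts.append((i + 1, start, count, ptype))
    return parts


def make_entry(ptype, start, count):
    """Build one partition table entry (CHS fields left as zeros)."""
    return struct.pack("<B3sB3sII", 0, b"\0\0\0", ptype, b"\0\0\0", start, count)


# Synthetic example: a Linux partition (0x83) and a swap partition (0x82).
fake_mbr = (bytes(446)
            + make_entry(0x83, 63, 1000)
            + make_entry(0x82, 1063, 500)
            + bytes(32)
            + b"\x55\xaa")

for number, start, count, ptype in read_mbr_partitions(fake_mbr):
    print(number, hex(ptype), "starts at byte", start * SECTOR_BYTES)
```

To try it on a real device you would read the first 512 bytes of the disk image as root and pass them to read_mbr_partitions().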

The domu_centos5_xvda1 and domu_centos5_xvda2 devices were added by kpartx and can be mounted like any block device. I mounted the first one, took a tar archive of the contents, created a new test LV, and unpacked the archive into it. I then labelled the new LV as “/” (because that’s how Red Hatters find their root filesystem):

$ sudo e2label /dev/mainvg/domu_test_root /

and came up with a new domU config file:

name = 'centos5'
memory = 120
disk = [ 'phy:mapper/mainvg-domu_test_root,xvda1,w',
         'phy:mapper/mainvg-domu_test_swap,xvda2,w' ]
vif = [ 'mac=00:16:3e:1b:b2:e7, bridge=xenbr0, vifname=v-centos5' ]
bootloader = '/usr/lib/xen-3.0.3-1/bin/pygrub'

This works, with no disk corruption.

So CentOS 5 is available now, and I will try to get around to Fedora 8 as well at some point.