Monday, June 7, 2010

Experimental Virtual Machines

I'm doing some work these days on trying to get Python 2.7 as the default Python in the next version of Ubuntu, Maverick Meerkat (10.10). This work will occasionally require me to break my machine by installing experimental packages. That's a good and useful thing because I want to test various potentially disruptive changes before I think about unleashing them on the world. This is where virtual machines really shine!

To be efficient, I need a really fast turnaround from known good state, to broken state, back to known good state. In the past, I've used VMware Fusion on my Mac to create a VM, then take a live snapshot of the disk before making my changes. It was really easy then to revert to the last known good snapshot, try something else and iterate.

But lately Fusion has sprouted a nasty habit of freezing the host OS, such that a hard reboot is necessary. This will inevitably cause havoc on the host, by losing settings, trashing mail, corrupting VMs, etc. VMware can't reproduce the problem but it happens every time to me, and it hurts, so I'm not doing that any more :).

Back to my Lucid host and libvirt/kvm and the sanctuary of FLOSS. It's really easy to create new VMs, and there are several ways of doing it, from virt-manager to vmbuilder to straight up kvm (thanks Colin for some recipes). The problem is that none of these are exactly fast to go from bare metal to working Maverick VM with all the known good extras I need (like openssh-server and bzr, plus my comfortable development environment).

I didn't find a really good fit for vmbuilder or the kvm commands, and I'm not smart enough to use the libvirt command line tools, but I think I've figured out a hack using virt-manager that will work well enough.

1. Manually create a disk for the baseline VM (named 'scars' in my case :)
% qemu-img create -f qcow2 scars.qcow2 20G

2. Create the baseline VM using virt-manager
* I use DHCP internally, so I give this thing a MAC address, assign it 1GB
of RAM and 1 processor.
* For storage, I tell it to use the scars.qcow2 file I created above
* Boot from the maverick ISO of your choice, install everything you want,
and get your development environment in place
* Shut this machine down

3. Clone your baseline VM
* In the virt-manager Manager window, right click on your baseline VM and
select Clone
* You will not be given an opportunity to select a disk or a MAC address,
so for now just go with the defaults.
* Do not start your clone

4. Create an 'overlay' disk that is backed by your baseline disk.
% qemu-img create -f qcow2 -b scars.qcow2 scars.ovl

5. Edit your clone
* Delete the disk given to your clone by default
* Create a new virtio storage that points to scars.ovl
* Delete the nic given to your clone by default
* Create a new virtio network device with the MAC address of your
baseline. You'll get a warning about a MAC address collision, but this
can be ignored (see below).

6. Boot your clone
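
Before booting, it can be worth checking that the overlay really points at the baseline disk. The following is only a sketch: the `backing_file_of` helper and the overridable `QEMU_IMG` variable are names I made up for illustration, and it assumes `qemu-img info` prints a `backing file:` line for overlay images, which may vary between versions.

```shell
#!/bin/sh
# QEMU_IMG is a hypothetical override hook; it defaults to the real tool.
QEMU_IMG="${QEMU_IMG:-qemu-img}"

# Print the backing file recorded in an image, or nothing if there is none.
# Assumes "qemu-img info" emits a line of the form "backing file: NAME",
# optionally followed by " (actual path: ...)", which is stripped here.
backing_file_of() {
    "$QEMU_IMG" info "$1" \
        | sed -n 's/^backing file: \(.*\)$/\1/p' \
        | sed 's/ (actual path: .*)$//'
}
```

With the disks from the steps above, `backing_file_of scars.ovl` should print `scars.qcow2` if the overlay was created as described.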

At this point you'll have a baseline which is your known good system, and a clone/overlay which you can break to your heart's content. When it's time to iterate back to a known good state, shut down your clone, delete the overlay disk, and create a new one from the baseline qcow2 disk. This is pretty fast, and your turnaround time is not much more than the time it takes to shut down one machine and boot another. It actually feels a lot faster by the wall clock than Fusion ever was to snapshot and restore.
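
The reset step can be wrapped in a small shell function. This is a minimal sketch rather than a polished tool: `reset_overlay` and the `QEMU_IMG` override are hypothetical names, it assumes the clone is already shut down, and it passes `-F qcow2` to name the backing format explicitly, which newer qemu-img releases expect.

```shell
#!/bin/sh
# QEMU_IMG is a hypothetical override hook; it defaults to the real tool.
QEMU_IMG="${QEMU_IMG:-qemu-img}"

# Throw away a (possibly broken) overlay and recreate it from the baseline.
# Assumes the VM using the overlay has been shut down first.
reset_overlay() {
    base="$1"      # known good disk, e.g. scars.qcow2
    overlay="$2"   # disposable disk, e.g. scars.ovl
    if [ ! -f "$base" ]; then
        echo "baseline disk not found: $base" >&2
        return 1
    fi
    rm -f "$overlay"
    # -b sets the backing file, -F states its format explicitly.
    "$QEMU_IMG" create -f qcow2 -F qcow2 -b "$base" "$overlay"
}
```

For the disks in this post that would be `reset_overlay scars.qcow2 scars.ovl`, followed by booting the clone again.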

One downside is that you cannot run both VMs at the same time. I think mostly this is because of the MAC address collision, but also because creating the overlay requires that both machines be powered off.

The other downside seems to be that if you want to update your known good baseline, say by installing more packages or apt-get update/upgrade, you will have to recreate your overlay disk for your next experiment. Changes to the underlying disk do not seem to propagate to the overlay automatically. Maybe that's intentional; I can't find much documentation on it. (Note too that the manpage for qemu-img does not describe the -b option.)
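
One rough way to notice that an overlay needs recreating is to compare file modification times. This is a heuristic sketch only: `overlay_is_stale` is a made-up helper, it relies on the `-nt` test that common shells support but POSIX does not require, and it only catches the case where the baseline was written after the overlay was last touched.

```shell
#!/bin/sh
# Succeed (exit 0) if the overlay is missing or older than its baseline,
# meaning it should be recreated before the next experiment.
overlay_is_stale() {
    base="$1"
    overlay="$2"
    [ ! -e "$overlay" ] || [ "$base" -nt "$overlay" ]
}
```

A typical use would be `if overlay_is_stale scars.qcow2 scars.ovl; then ...recreate the overlay...; fi` just before booting the clone.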

I guess the last downside is that I spent way too much time trying to figure all this out. The Googles were not a lot of help but did give me the qemu-img clue. But at least now you don't have to! :)


  1. I was talking with David Malcolm at PyCon, and I told him I think the distros that rely heavily on Python should have their own executable *not* called "python". Call it "syspython" or something. The user--that drooling malicious idiot behind the keyboard--would be shooed away from it, encouraged to use the nominal "python". This way, if they installed some wonky new version of an important module, it wouldn't break their package manager, system notifications, etc. As I recall he thought it was an interesting idea but didn't have much else to say. I reckon it'd be a lot of work to get there, and has only marginal benefit, and perhaps for those reasons it's not worth pursuing.

  2. Wouldn't the persistent disk feature of VMware be useful in this context? Changes to the disk image are stored incrementally, and a cold restart (as opposed to a warm one) reverts to the original disk image.

  3. Curious. I snapshot and revert like a madman with Fusion 3/3.1 and Ubuntu Server 10.04 - never had an issue. Normally I don't run multiple Ubuntu VMs at the same time though.

  4. "Changes to the underlying disk do not seem to propagate to the overlay automatically".

    This is intentional, as disk images are unaware of the filesystem used on top of them (if any); they are just blocks of data.

    Once an overlay is made, the base image should not be changed or you will see a corrupt filesystem in the overlay (unless you only change blocks in the base image which have already been masked by blocks written in the overlay, etc.).

    You can have 2 or more overlays sharing a base image (which never changes).

    If you want to re-combine an overlay into its base image (this will of course invalidate any other overlays using this base), use "qemu-img commit <overlay-filename>".

    I like playing with qemu, as you can tell, and I hope this comment sounds friendly.

  5. @Larry: I completely agree - and have for *years* (much longer than I have been in a position to do anything about it :). I think I've had similar conversations with David about it too, and it's part of the reason why on Debian/Ubuntu we have the dist-packages hack. Still, it would be a useful conversation to have at PyCon 2011.

  6. @Steve: yes, something like that would help. I don't think the Linux virtualization stuff is that far along though (would love to be proven wrong).

    @BrettH: yep, the freezes I see are *always* related to starting the second VM, or switching to it. It used to work though. :(

    @pysquared: very friendly! :) I didn't understand this aspect of it at first, but it was explained the same way to me by a colleague. Thanks for the great comment.