EC2 AMI creation without magic

Magic

While I appreciate that there are people out there maintaining EC2 AMIs for others to use, I was faced with two problems. First, there were no AMIs maintained for the Linux distribution I wanted to use (Arch). Second, I don’t like the idea of relying on something magical that is out of my hands, that I don’t understand and cannot affect. In this case, I am referring to the kernel AKIs, which were traditionally not under the control of an average EC2 user (I believe Amazon itself and select partners were able to provide these kernels). By using one of those AKIs I would essentially be relying on someone else to release-engineer kernels that are compatible with my userland.

In short, given a Linux kernel I have built, and a userland I know how to prepare, I want to create an EC2 bootable disk image/AMI.

As it turns out, this is possible nowadays, but the details were a bit hard to find (for me anyway). So, here is a short guide on how to create an AMI from scratch, relying only on the kernel, your distribution and a host system on which to create the image (the host can be virtualized, for example with VirtualBox). It is assumed that you’re already familiar with things like boot loaders and building a kernel.

arch4ec2

If you wish, you can look at arch4ec2 as an example of the process briefly described below. arch4ec2 is a small tool that automates the creation of an Arch Linux system (with a btrfs root fs). It must be run from a host Arch Linux system, such as one installed in VirtualBox from the Arch Linux installation ISO. Alternatively, if you just want to play, you can use one of the AMIs I built and list in the README.

Doing without the magic (almost)

EC2 supports something called user specified kernels. Without this feature, as mentioned above, you choose which kernel to boot by selecting a so-called AKI to boot your image with. That AKI is provided by Amazon or (I believe) one of a few select partners, and you have to run an image that is compatible with that kernel.

With user specified kernels, the AKI you choose is instead pv-grub (which I assume stands for “paravirtualized grub”). As a result, all you have to do is create a disk image which is accessible by grub (i.e., correct partitioning/filesystem layout) and which has a grub configuration that points to a kernel which is compiled with the necessary support for paravirtualization (i.e., it has to be Xen compatible). The only significant difference from installing grub locally is that grub itself is provided by Amazon (through the AKI chosen) rather than being installed in the boot record of your image (this is where there is still a small bit of magic).

Step 1: Selecting an AKI

In the user specified kernel documentation (NOTE: do not cut and paste the “hyphen” from this PDF, as it is not actually a hyphen and ec2-register will fail) there is a list of AKIs to use depending on whether you intend to run a 32 bit or a 64 bit kernel, which region you intend to run in, and whether your image will be based on EBS or S3. I have only tested EBS, and I don’t know what might be different for S3 based images. I have also only tested 32 bit as of this writing.

Step 2: Paravirtualization support in the kernel

In order to enable the appropriate support (for a 32 bit kernel; 64 bit is not yet tested by me), these kernel configuration options are needed:

CONFIG_HIGHMEM64G=y
CONFIG_HIGHMEM=y
CONFIG_PARAVIRT_GUEST=y
CONFIG_XEN=y
CONFIG_PARAVIRT=y
CONFIG_PARAVIRT_CLOCK=y
CONFIG_XEN_BLKDEV_FRONTEND=y
CONFIG_XEN_NETDEV_FRONTEND=y
CONFIG_HVC_XEN=y
CONFIG_XEN_BALLOON=y
CONFIG_XEN_SCRUB_PAGES=y

(It is possible some variation is acceptable.)
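To avoid discovering a missing option only after a failed boot, the list above can be checked against a kernel .config before building. The following is a minimal sketch. The function name check_xen_config is made up for illustration, and it assumes the options must be built in (=y) exactly as listed, which may be stricter than necessary.

```shell
#!/bin/sh
# Sketch: report which of the options listed above are not set to "y"
# in a kernel .config file. The function name is hypothetical.
# Usage: check_xen_config /path/to/.config

check_xen_config() {
    config_file=$1
    missing=0
    for opt in CONFIG_HIGHMEM64G CONFIG_HIGHMEM CONFIG_PARAVIRT_GUEST \
               CONFIG_XEN CONFIG_PARAVIRT CONFIG_PARAVIRT_CLOCK \
               CONFIG_XEN_BLKDEV_FRONTEND CONFIG_XEN_NETDEV_FRONTEND \
               CONFIG_HVC_XEN CONFIG_XEN_BALLOON CONFIG_XEN_SCRUB_PAGES
    do
        # -x matches whole lines only, so "CONFIG_XEN_BALLOON=y"
        # does not count as a match for "CONFIG_XEN=y".
        if ! grep -qx "${opt}=y" "$config_file"; then
            echo "missing: ${opt}"
            missing=1
        fi
    done
    # Exit status 0 if every option is present, 1 otherwise.
    return "$missing"
}
```

Running check_xen_config .config in the kernel source tree prints one line per missing option and returns a non-zero status if anything is missing.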

Step 3: Use pv-grub compatible kernel compression

If you’re using a sufficiently new kernel, the kernel build might produce a kernel compressed with XZ/LZMA2 instead of GZIP. Such a kernel will not boot on EC2, so you need to select GZIP instead:

CONFIG_KERNEL_GZIP=y
# CONFIG_KERNEL_BZIP2 is not set
# CONFIG_KERNEL_LZMA is not set
# CONFIG_KERNEL_XZ is not set
# CONFIG_KERNEL_LZO is not set

Step 4: Populate a boot partition and root file system

Your disk image should be partitioned and have its file systems initialized (in the case of arch4ec2 I use a small ext3 boot partition and a btrfs root partition). If you have a separate boot partition mounted under /boot, do not forget to create a boot directory inside that partition and make boot/grub a symlink to the partition’s grub directory.
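The symlink step can be sketched as follows. This reflects my reading of the instruction above: the grub directory sits at the top of the boot partition, and it also needs to be reachable as boot/grub relative to the partition root (presumably because that is where pv-grub looks for its configuration). The function name link_grub_dir is made up for illustration.

```shell
#!/bin/sh
# Sketch: make the grub directory of a separate boot partition reachable
# as boot/grub relative to the partition root.
# Usage: link_grub_dir /boot   (the mount point of the boot partition)

link_grub_dir() {
    boot_mount=$1
    mkdir -p "$boot_mount/boot"
    # A relative target keeps the link valid no matter where the
    # partition happens to be mounted.
    [ -L "$boot_mount/boot/grub" ] || ln -s ../grub "$boot_mount/boot/grub"
}
```

After link_grub_dir /boot, the file /boot/grub/menu.lst is also visible as /boot/boot/grub/menu.lst, that is, as boot/grub/menu.lst from the root of the boot partition.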

How best to populate a system depends mostly on which distribution you use. In the case of Arch Linux, the mkarchroot tool is helpful for scripting it (this is what arch4ec2 uses). But in most cases, if you are doing this manually as a one-off, you can just install your system as you normally would in a virtualized environment and take whatever steps are necessary to switch to a properly configured kernel.

Step 5: Make an EBS snapshot of your disk image

In order to register an AMI, you must have an EBS snapshot holding the contents to be used when spawning an instance from the AMI. If you did the original setup on EC2, you may already have an EBS volume that you can simply snapshot. Otherwise, you will have to get your disk image onto EC2 in some way. For example, you can boot a system from one of the alestic AMIs I mentioned before, attach an EBS volume you’ve created, and ‘dd’ your disk image onto it over ssh.
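The ‘dd’ over ssh approach might look like the sketch below. The function name copy_image_to_volume is made up for illustration, and the image file name, remote login and device name in the usage line are placeholders that depend on your setup. The SSH variable is an addition of mine so that the ssh command can be swapped out.

```shell
#!/bin/sh
# Sketch: stream a local disk image onto a block device on a remote host.
# Usage: copy_image_to_volume disk.img root@ec2-host /dev/sdf
# (image name, login and device are placeholders, not known values)

copy_image_to_volume() {
    image=$1
    remote=$2
    device=$3
    # 1048576 bytes = 1 MiB blocks; the numeric form is accepted by
    # both GNU and BSD dd. SSH may be set to override the ssh command.
    dd if="$image" bs=1048576 | ${SSH:-ssh} "$remote" "dd of=$device bs=1048576"
}
```

Note that this overwrites the target device without asking, so double-check which device the EBS volume is attached as before running it.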

In any case, once you have an EBS volume containing the image as you want it to appear in the AMI, snapshot it:

ec2-create-snapshot -d 'my-ami-snapshot' vol-XXXXXXX

Step 6: Register an AMI based on your snapshot

You will first have to recall the AKI you chose in step 1, and the snapshot id that was emitted by ec2-create-snapshot in step 5. Then, register the AMI:

ec2-register --debug -s snap-XXXXXXXX --root-device-name /dev/sda -n my-arch-ami --kernel AKI

The AMI registration will not succeed until the EBS snapshot has completed, so you will have to wait for that first.
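Waiting for the snapshot can be scripted by polling ec2-describe-snapshots until it reports completion. This is a minimal sketch: it assumes the status appears in the command output as the word "pending", "completed" or "error", and the function name wait_for_snapshot as well as the DESCRIBE and POLL_SECONDS variables are made up for illustration.

```shell
#!/bin/sh
# Sketch: block until an EBS snapshot is reported as completed.
# Usage: wait_for_snapshot snap-XXXXXXXX
# DESCRIBE overrides the describe command; POLL_SECONDS sets the
# polling interval (both are hypothetical knobs, defaults below).

wait_for_snapshot() {
    snapshot_id=$1
    while :; do
        status_line=$(${DESCRIBE:-ec2-describe-snapshots} "$snapshot_id") || return 1
        # Assumption: the output contains the snapshot status as a word.
        case $status_line in
            *completed*) return 0 ;;
            *error*)     return 1 ;;
        esac
        sleep "${POLL_SECONDS:-10}"
    done
}

# Example (not executed here):
#   wait_for_snapshot snap-XXXXXXXX && \
#       ec2-register -s snap-XXXXXXXX --root-device-name /dev/sda \
#           -n my-arch-ami --kernel AKI
```

Matching on a plain word is crude: a snapshot description that itself contains "completed" or "error" would confuse it, so treat this as a starting point only.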

Done

At this point your AMI is ready and you should be able to spawn instances from it.
