Instance store HVM AMIs for Amazon EC2

At SmugMug, we have a rule for EC2 instances: do not use EBS for production systems unless instance store is not an option for the instance type. EBS is networked storage, which has many useful capabilities, but in our production environment we have automated builds for our servers and have no need for the added complexity of networked storage.

We’ve been successful with this goal for the most part, but the one area where we couldn’t use instance store was instances that required HVM virtualization, such as cc2.8xlarge and i2.* instances. EC2 offers two ways to boot instances, off of an instance store (like local disk) or via EBS, and two types of virtualization, Hardware Virtual Machine (HVM) and Paravirtual (PV). Recently AWS added instance store volumes to these HVM-only instance types, so we wanted to take advantage of that. To create instance-store HVM AMIs, AWS guided us to boot an HVM AMI and convert it to instance store.

Before beginning, make sure you have both access keys and X.509 certificates for use. You must have both types of credentials for these steps to work. See bundle AMI prerequisites if you need guidance.

Here are the steps for creating instance-store HVM AMIs:

  1. Boot an EBS HVM instance. I used ami-dfa98cb6 (Ubuntu 12.04.3) on a c3.large instance (do not use 20131205 AMIs as they don’t boot on i2.* due to a bug)
  2. Set up the instance with your environment; we use masterless Puppet to configure the host with our base environment
  3. Copy the below script to the instance
  4. Copy your cert.pem and key.pem to /tmp
  5. export AWS_SECRET_KEY="foo"
  6. Run the script. The script creates the machine image, uploads it to s3, and registers it with EC2 so it is available for use
  7. Document the AMI created somewhere

The script:

set -e

if [ `whoami` != $USER ]; then
    sudo -u $USER -H AWS_SECRET_KEY=$AWS_SECRET_KEY $0 "$@"
    exit $?
fi

if [ -z "$AWS_SECRET_KEY" ] ; then
    echo "ERROR: \$AWS_SECRET_KEY not set"
    echo "    export AWS_SECRET_KEY=foo"
    exit 2
fi


STAMP=`/bin/date +%s`
export EC2_HOME=/opt/ec2-ami-tools-
export EC2_AMITOOL_HOME=/opt/ec2-ami-tools-

apt-get -y install ruby1.8 gdisk kpartx grub unzip python-pip
apt-get -y autoremove

if ! [ -d /opt/ec2-ami-tools- ] ; then
    curl -o/tmp/
    unzip /tmp/ -d /opt
fi

/bin/rm -f /var/cache/apt/archives/*deb

sed -i 's;ro console=hvc0;ro console=ttyS0 xen_emul_unplug=unnecessary;' /boot/grub/menu.lst

$EC2_HOME/bin/ec2-bundle-vol \
    --privatekey /tmp/key.pem \
    --user $AWS_ACCOUNT \
    --cert /tmp/cert.pem \
    --arch x86_64 \
    --partition mbr \
    --prefix $PREFIX-$STAMP \
    --block-device-mapping ami=sda,root=/dev/sda1,ephemeral0=sdb,ephemeral1=sdc,ephemeral2=sdd,ephemeral3=sde \
    --exclude `find /tmp | tail -n+2 | tr '\n' ','` \
    --include `find / -name '*.gpg' -o -name '*.pem' -o -name 'authorized_keys' | grep -v '^/mnt\|^/tmp' | tr '\n' ','`

$EC2_HOME/bin/ec2-upload-bundle \
    --bucket $BUCKET \
    --manifest /tmp/$PREFIX-$STAMP.manifest.xml \
    --access-key $AWS_ACCESS_KEY \
    --secret-key $AWS_SECRET_KEY \
    --batch \
    --location US

echo -e "[default]\nregion = $REGION\naws_access_key_id = $AWS_ACCESS_KEY\naws_secret_access_key = $AWS_SECRET_KEY" > /tmp/aws
export AWS_CONFIG_FILE="/tmp/aws"
pip install awscli
aws ec2 register-image \
    --image-location $BUCKET/$PREFIX-$STAMP.manifest.xml \
    --name $PREFIX-$STAMP \
    --virtualization-type hvm

Notes for the script:

  • Set the variables near the top of the script ($USER, $AWS_ACCOUNT, $AWS_ACCESS_KEY, $PREFIX, $BUCKET, and $REGION) to match your environment
  • We’re using a beta version of ec2-ami-tools (v1.4.0.10 beta) that has support for HVM on instance store; previous versions of these tools will not work
  • AWS_SECRET_KEY is passed through your shell so it doesn’t accidentally get baked into the AMI
  • You may need to adjust the include/exclude lines for ec2-bundle-vol as appropriate for your setup
  • With Ubuntu/Debian, you need to include *.gpg, *.pem, and authorized_keys files otherwise you’ll have problems connecting and performing apt-get operations
  • Adjusting menu.lst ended up being critical for getting this working, as was using an older grub (0.97). Without these two changes, our AMIs would not boot
  • AWS list of AMI types to instance types

Thanks to Joshua F. from AWS support for help getting this going.

— Shane Meyers, SmugMug Operations

Scaling Puppet in EC2

At SmugMug, we’re constantly scaling up and down the number of machines running in ec2. We’re big fans of puppet, using it to configure all of our machines from scratch. We use generic Ubuntu AMIs provided by Canonical as our base, saving ourselves the trouble of building custom AMIs every time there is a new operating system release.

To help us scale automatically without intervention (we use AutoScaling), we run puppet in a nodeless configuration. This means we do not run a puppet master on any machines in our infrastructure. All machines run puppet independently of one another, removing dependencies and improving reliability.

I will first explain nodeless puppet, then I will dive into how we use it.

Understanding Nodeless Puppet

Most instructions for setting up puppet tell you to create a puppet master instance that all puppet agents talk to. When an agent needs to apply a configuration, the master compiles a config and hands it back to the agent. With nodeless, the puppet agent compiles its own configuration and applies it to the host.

We start with a simple puppet.conf file that is pretty generic:
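The original file isn’t reproduced here; for a nodeless setup it only needs to tell puppet where things live, since no master is ever contacted. A minimal sketch (the exact paths are assumptions based on the /etc/puppet layout used in this post):

```ini
# /etc/puppet/puppet.conf — minimal config for masterless ("nodeless") runs.
# No server or certificate settings are needed, since we never talk to a master.
[main]
logdir     = /var/log/puppet
vardir     = /var/lib/puppet
modulepath = /etc/puppet/modules
```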


Then we create a top-level manifest called mainrun.pp. In our setup this manifest lives in the /etc/puppet/manifests directory. An example mainrun.pp:

include ntp
include puppet
include ssh
include sudo

if $hostname == "foo" {
    include apache2
}

There is also a modules directory, /etc/puppet/modules, that contains our puppet modules. Each include statement in the mainrun.pp manifest corresponds to a module in that directory.
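For illustration, the module behind one of those include lines might look like the following sketch (the contents of this ntp module are an assumption, not SmugMug’s actual module):

```puppet
# /etc/puppet/modules/ntp/manifests/init.pp (illustrative)
class ntp {
    package { 'ntp':
        ensure => installed,
    }
    service { 'ntp':
        ensure  => running,
        enable  => true,
        require => Package['ntp'],
    }
}
```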

Once we have all of our modules created and listed appropriately in mainrun.pp, we run puppet with the apply command: sudo puppet apply /etc/puppet/manifests/mainrun.pp. Puppet will then run and do all that our manifests tell it to.

Scaling Puppet

Upon booting, all machines download their entire puppet manifest code tree from an Amazon S3 bucket. Then puppet is run and the machine is configured. By using S3, we’re leveraging Amazon’s ability to provide highly available access to files despite server or data center outages.

To help keep our changes to puppet sane, we use a git repository. When anyone does a push to the central repository server, it copies our files to our Amazon S3 bucket. The S3 bucket has custom IAM access rules applied so puppet can only see its bucket and no other.
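The push-to-S3 step can be sketched as a git post-receive hook. Everything specific here — the checkout path, the bucket name, and the use of s3cmd — is an assumption; the post doesn’t show SmugMug’s actual hook:

```shell
#!/bin/sh
# Hypothetical post-receive hook: check out the pushed tree, then mirror it
# to the puppet S3 bucket. Paths and bucket name are illustrative defaults.
deploy_puppet_to_s3() {
    work_tree="${WORK_TREE:-/srv/puppet-checkout}"
    bucket="${BUCKET_PUPPET:-example-puppet-bucket}"
    # Update the working copy from the bare repository that received the push,
    # then mirror it to S3; --delete-removed keeps the bucket in sync.
    git --work-tree="$work_tree" checkout -f master &&
    s3cmd sync --delete-removed "$work_tree/" "s3://$bucket/"
}
# In the real hook this would simply be invoked here:
# deploy_puppet_to_s3
```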

When we launch a new instance in ec2, we use the --user-data-file option in ec2-run-instances to run a first-boot script that sets us up with puppet.

A simple first-boot script:


apt-get update
apt-get --yes --force-yes install puppet s3cmd
wget --output-document=$S3CMDCFG https://$BUCKET_PUPPET.s3.amazonaws.com/s3cmd.cfg
sed -i -e "s#__AWS_ACCESS_KEY__#$AWS_ACCESS_KEY#" \
    -e "s#__AWS_SECRET_KEY__#$AWS_SECRET_KEY#" $S3CMDCFG
chmod 400 $S3CMDCFG

until \
    s3cmd -c $S3CMDCFG sync --no-progress --delete-removed \
    s3://$BUCKET_PUPPET/ /etc/puppet/ && \
    /usr/bin/puppet apply /etc/puppet/manifests/mainrun.pp ; \
do sleep 5 ; done

s3cmd.cfg in s3 is a publicly accessible template file, containing placeholders for the AccessKey and SecretKey that are filled in by the first-boot script. As s3cmd.cfg is a publicly accessible file, do not place any real credential data in it.
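Such a template might look like the following sketch (real s3cmd.cfg files contain many more settings; only the credential lines need placeholder tokens):

```ini
# s3cmd.cfg template stored publicly in S3 — tokens, never real credentials.
[default]
access_key = __AWS_ACCESS_KEY__
secret_key = __AWS_SECRET_KEY__
```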

Puppet will install additional tools for keeping puppet running on all machines.

Keeping Puppet Running

As puppet is not running in agent mode, it does not wake up from a sleeping state to apply manifests that have changed since booting. We use cron to run puppet every 30 minutes. Our cron entry:

*/30 * * * * sleep `perl -e 'print int(rand(300));'` && /usr/local/sbin/ > /dev/null

We have the random sleep of up to 5 minutes in the cron entry to ensure that machines run puppet at staggered times. This prevents all machines from restarting a service at the same moment (Apache, for example), causing an interruption for our customers.
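The Perl one-liner simply picks a delay between 0 and 299 seconds. The same stagger can be computed with awk, shown here as a sketch that works under plain /bin/sh (which is likely why the crontab uses perl or awk rather than bash’s $RANDOM):

```shell
# Compute a 0-299 second stagger, equivalent to perl's int(rand(300)).
# awk is used so this runs under plain /bin/sh (dash has no $RANDOM).
stagger=$(awk 'BEGIN { srand(); print int(rand() * 300) }')
echo "$stagger"
```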

We have three simple scripts for puppet, themselves installed by puppet:



/usr/bin/s3cmd -c /etc/s3cmd.cfg sync --no-progress \
    --delete-removed s3://$BUCKET_PUPPET/ /etc/puppet/


/usr/bin/puppet apply /etc/puppet/manifests/mainrun.pp

We split puppet runs into three scripts to help with manual maintenance of a box. We run the first, via sudo, when we simply want puppet to run immediately. The second is handy when making manual changes to the puppet manifests and modules for testing purposes; once testing is complete we copy our changes back into the git repository. The third is infrequently run manually, mostly for resetting the puppet config after manual testing changes.

Final Thoughts

As you can imagine there is a lot more involved in our manifests. There are a large number of conditional operators that enable and disable different parts of the manifests depending on what role an instance has.

EC2 tags have proven to be invaluable for us; each machine is assigned two tags that exactly describe its role. A script that reads the ec2 tags at boot, combined with a custom fact, exposes the tags to puppet.
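One way to wire this up (a sketch, not SmugMug’s actual implementation) is a facter external fact: an executable dropped into /etc/facter/facts.d/ whose key=value stdout lines become facts. The cached tag file path and tag names here are assumptions:

```shell
#!/bin/sh
# Hypothetical /etc/facter/facts.d/ec2_tags.sh: expose EC2 tags that a
# first-boot script cached (as key=value lines) to puppet as facts.
emit_ec2_tag_facts() {
    tag_file="${EC2_TAG_FILE:-/etc/ec2_tags}"
    if [ -r "$tag_file" ]; then
        # Facter turns each key=value line on stdout into a fact.
        cat "$tag_file"
    fi
}
emit_ec2_tag_facts
```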

Future posts about puppet may include:

  • How we use EC2 tags for determining instance roles
  • Speeding up initial booting of instances
  • Using custom facts to enable one-off configurations for testing or debugging

— Shane Meyers, SmugMug Operations