Over the past month I’ve been toying with the idea of getting Opscode’s Chef to automate the installation of applications on my server stack on AWS, but it wasn’t until last weekend that I actually bit the bullet and decided to make it work. For those of you who don’t know what Chef is, it’s a set of tools designed to automate installing, upgrading and removing applications on servers. It’s mostly used in Linux environments, but it does have growing Windows support and appears (in my opinion) to be more mature in that area than its competitors (CFEngine and Puppet). Chef is open source, meaning you can run it on your own hardware, and it also has a managed version which allows up to 5 server instances (called nodes) to be registered at once for free.
During my research into Chef I noticed one downside: you still have to provision servers manually. Now if, like me, you’re on a large cloud provider (AWS), you can install the provider’s tools to spin up servers from the command line, but this hardly automates the whole process. This is where AWS does actually have a solution. It’s called CloudFormation, and it allows you to describe your whole stack (or parts of a stack) in templates. The templates can provision pretty much any of the services AWS offers and allow you to hook them together as required with other services inside or outside the stack.
Now you may be wondering how this helps. Luckily, CloudFormation allows you to configure an EC2 instance in such a way that you can bootstrap almost anything you like. The CloudFormation site has many sample templates showing how to bootstrap different applications into stacks, and even some for Chef, but I thought I’d share my experiences of getting this up and running.
When I was working out how to convert my stack to use Chef and CloudFormation, I spent a good chunk of time planning what I should do first. Should I start with Chef on my existing servers and set up the components required to use that, or start by automating the provisioning of resources? In the end I settled on creating the CloudFormation template first, as I already had scripts to set up the servers once they were provisioned, and it also means that Chef is on your servers and ready to go once you’ve set up your Chef environments.
Setting up the CloudFormation Template
The first thing to do before creating a template is to decide exactly what your stack requires and work from there. My stack must include a database server with a static IP and at least one web server behind a load balancer, with the web servers auto scaling when demand picks up (as you can see, my blog doesn’t get that many hits). Once you’ve identified what you require, the next task is to find the appropriate AWS components to create the stack. The basic CloudFormation resources needed to create my stack are as follows:
- One database EC2 instance
- One Elastic IP on the database EC2 instance
- One web server group with a default of one server
- One web server launch configuration containing information about what type of instances to create
- One web server group trigger to scale up and down the web server group as required
- One database security group
- One web server security group with access to the database security group
- One Elastic Load Balancer for the web server group
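To make the list above a little more concrete, here is a minimal sketch of how those eight resources might be laid out in a template, built as a Python dictionary and dumped to JSON. The logical names (DatabaseServer, WebServerGroup and so on) and the maximum group size are my own assumptions for illustration, and most properties are left out, so this is an outline rather than something you could deploy as-is. The Trigger type is the one available at the time of writing and may not be what you would use today.

```python
import json


def resource(resource_type, properties=None):
    """Return one entry for the template's Resources section."""
    entry = {"Type": resource_type}
    if properties is not None:
        entry["Properties"] = properties
    return entry


# Logical names are hypothetical; most required properties are omitted.
resources = {
    "DatabaseServer": resource("AWS::EC2::Instance"),
    "DatabaseIP": resource(
        "AWS::EC2::EIP", {"InstanceId": {"Ref": "DatabaseServer"}}
    ),
    "WebServerGroup": resource(
        "AWS::AutoScaling::AutoScalingGroup",
        {
            "AvailabilityZones": {"Fn::GetAZs": ""},
            "LaunchConfigurationName": {"Ref": "WebServerLaunchConfig"},
            "LoadBalancerNames": [{"Ref": "LoadBalancer"}],
            "MinSize": "1",  # default of one web server
            "MaxSize": "3",  # assumed upper bound, not from the post
        },
    ),
    "WebServerLaunchConfig": resource("AWS::AutoScaling::LaunchConfiguration"),
    "WebServerTrigger": resource("AWS::AutoScaling::Trigger"),
    "DatabaseSecurityGroup": resource("AWS::EC2::SecurityGroup"),
    "WebServerSecurityGroup": resource("AWS::EC2::SecurityGroup"),
    "LoadBalancer": resource("AWS::ElasticLoadBalancing::LoadBalancer"),
}

template = {"AWSTemplateFormatVersion": "2010-09-09", "Resources": resources}
template_json = json.dumps(template, indent=2)
print(template_json)
```

Building the template in code like this is only a convenience for the example; writing the JSON by hand works just as well.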
As you can see, there are a fair few resources required to get a simple stack created. A few things are worth noting. Firstly, the database server can’t scale. This is because you can’t assign an Elastic IP to a server in a server group, and the static IP is required to generate a static public DNS record in AWS. Secondly, you need to manually specify a trigger to increase or decrease the size of a server group. By default a server group will make sure you have the required number of servers running (and replace ones that fail), but it won’t scale by itself. A Trigger is relatively simple and can only monitor the same metric to scale up and down, but you can have more complex rules for scaling using a Scaling Policy.
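As a rough illustration of the Scaling Policy approach, the sketch below pairs each policy with a CloudWatch alarm that fires it, so scaling up and scaling down can each watch their own metric and threshold. The logical names, the thresholds (75% and 25% CPU), the periods and the cooldown are all assumptions of mine rather than values from my template.

```python
import json


def scaling_rule(name, adjustment, comparison, threshold,
                 metric="CPUUtilization"):
    """Return a scaling policy plus the CloudWatch alarm that fires it."""
    policy_name = name + "Policy"
    policy = {
        "Type": "AWS::AutoScaling::ScalingPolicy",
        "Properties": {
            "AutoScalingGroupName": {"Ref": "WebServerGroup"},
            "AdjustmentType": "ChangeInCapacity",
            "ScalingAdjustment": str(adjustment),
            "Cooldown": "300",  # assumed cooldown in seconds
        },
    }
    alarm = {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "Namespace": "AWS/EC2",
            "MetricName": metric,
            "Statistic": "Average",
            "Period": "300",
            "EvaluationPeriods": "2",
            "Threshold": str(threshold),
            "ComparisonOperator": comparison,
            "Dimensions": [
                {
                    "Name": "AutoScalingGroupName",
                    "Value": {"Ref": "WebServerGroup"},
                }
            ],
            # Ref on a scaling policy resolves to the policy's ARN.
            "AlarmActions": [{"Ref": policy_name}],
        },
    }
    return {policy_name: policy, name + "Alarm": alarm}


resources = {}
# Thresholds are hypothetical examples.
resources.update(scaling_rule("ScaleUp", 1, "GreaterThanThreshold", 75))
resources.update(scaling_rule("ScaleDown", -1, "LessThanThreshold", 25))
print(json.dumps(resources, indent=2))
```

Both rules watch CPU here, but because each alarm is separate, the scale-down rule could just as easily watch a different metric.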
This is all well and good, but I would like to know when something goes wrong, so I will add a bunch of alarms to tell me. This means I also require the following resources:
- SNS Topic with an email subscription
- CPU alarm on the database server
- CPU alarms on the web servers
- Unhealthy instance alarm on the Elastic Load Balancer
- Request latency alarm on the load balancer
Now I will be alerted whenever there are problems. An added benefit is that I can use the same SNS topic to notify me when auto scaling of the web server group takes place.
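The alarm resources can be sketched along the following lines. The logical names, the OperatorEmail template parameter and every threshold are hypothetical, and I have used a single CPU alarm across the web server group rather than one per server to keep the example short. The last fragment shows how the same topic might be reused for auto scaling notifications on the server group.

```python
import json


def alarm(namespace, metric, threshold, dimension_name, dimension_ref):
    """Return a CloudWatch alarm that notifies the shared SNS topic."""
    return {
        "Type": "AWS::CloudWatch::Alarm",
        "Properties": {
            "Namespace": namespace,
            "MetricName": metric,
            "Statistic": "Average",
            "Period": "300",
            "EvaluationPeriods": "1",
            "Threshold": str(threshold),
            "ComparisonOperator": "GreaterThanThreshold",
            "Dimensions": [
                {"Name": dimension_name, "Value": {"Ref": dimension_ref}}
            ],
            "AlarmActions": [{"Ref": "AlarmTopic"}],
        },
    }


resources = {
    "AlarmTopic": {
        "Type": "AWS::SNS::Topic",
        "Properties": {
            # OperatorEmail is a hypothetical template parameter.
            "Subscription": [
                {"Protocol": "email", "Endpoint": {"Ref": "OperatorEmail"}}
            ]
        },
    },
    # All thresholds below are assumed example values.
    "DatabaseCPUAlarm": alarm(
        "AWS/EC2", "CPUUtilization", 90, "InstanceId", "DatabaseServer"
    ),
    "WebServerCPUAlarm": alarm(
        "AWS/EC2", "CPUUtilization", 90,
        "AutoScalingGroupName", "WebServerGroup",
    ),
    "UnhealthyHostAlarm": alarm(
        "AWS/ELB", "UnHealthyHostCount", 0,
        "LoadBalancerName", "LoadBalancer",
    ),
    "LatencyAlarm": alarm(
        "AWS/ELB", "Latency", 1, "LoadBalancerName", "LoadBalancer"
    ),
}

# Fragment for the server group's NotificationConfiguration property.
scaling_notifications = {
    "TopicARN": {"Ref": "AlarmTopic"},
    "NotificationTypes": [
        "autoscaling:EC2_INSTANCE_LAUNCH",
        "autoscaling:EC2_INSTANCE_TERMINATE",
    ],
}

print(json.dumps(resources, indent=2))
```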
An example of the template created can be found here.
Adding Chef to the CloudFormation Template
Before we get on with modifying the template to bootstrap Chef, you need to manually create an S3 bucket to contain the validation PEM that servers require to register themselves with Chef. The permissions on this bucket will be modified by the template, so you only need to drop the PEM into it and do nothing else. Once the bucket is created, a few resources are required to allow the bucket to be accessed by the template:
- An IAM user
- Host Keys (an access key and secret key) for the IAM user
The reason you create an IAM user is so that you can assign it a policy that allows access to the S3 bucket you’ve created. As the bucket and its contents are private, this is the only useful way to access the PEM inside the bucket. The host keys for the user are created so that any S3 tools run on the server can authenticate as that user and access the bucket.
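A minimal sketch of these two resources is shown below. The logical names, the policy name and the ChefBucket template parameter are assumptions for illustration, and I have attached the read permission directly to the user here; a template could equally grant access through a policy on the bucket itself.

```python
import json

resources = {
    "ChefUser": {
        "Type": "AWS::IAM::User",
        "Properties": {
            "Path": "/",
            "Policies": [
                {
                    "PolicyName": "ReadValidationPem",  # hypothetical name
                    "PolicyDocument": {
                        "Statement": [
                            {
                                "Effect": "Allow",
                                "Action": ["s3:GetObject"],
                                # ChefBucket is a hypothetical parameter
                                # holding the name of the private bucket.
                                "Resource": {
                                    "Fn::Join": [
                                        "",
                                        [
                                            "arn:aws:s3:::",
                                            {"Ref": "ChefBucket"},
                                            "/*",
                                        ],
                                    ]
                                },
                            }
                        ]
                    },
                }
            ],
        },
    },
    "HostKeys": {
        "Type": "AWS::IAM::AccessKey",
        "Properties": {"UserName": {"Ref": "ChefUser"}},
    },
}

# How other parts of the template can refer to the generated credentials.
access_key_id = {"Ref": "HostKeys"}
secret_access_key = {"Fn::GetAtt": ["HostKeys", "SecretAccessKey"]}

print(json.dumps(resources, indent=2))
```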
The final things to add as resources, before modifying existing resources, are a Wait Handle and a Wait Condition for each of the database server and the web server group. These allow the EC2 servers being configured to signal the success of their configuration, thus preventing the stack from being fully created if there are problems. An added benefit is that you can make the web server group initialise only if the database server has successfully initialised. The Wait Handle is an address that the EC2 server can hit to tell CloudFormation whether its configuration has been successful, and the Wait Condition allows you to set a timeout for the configuration of the server to take place.
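Here is a small sketch of how the handles and conditions might fit together, including the dependency that holds the web server group back until the database server has signalled success. The logical names and the 20 minute timeout are assumptions on my part.

```python
import json


def wait_pair(prefix, depends_on, timeout_seconds):
    """Return a wait handle and a wait condition bound to that handle."""
    handle_name = prefix + "WaitHandle"
    return {
        handle_name: {"Type": "AWS::CloudFormation::WaitConditionHandle"},
        prefix + "WaitCondition": {
            "Type": "AWS::CloudFormation::WaitCondition",
            # Start waiting once the server resource has been created.
            "DependsOn": depends_on,
            "Properties": {
                "Handle": {"Ref": handle_name},
                "Timeout": str(timeout_seconds),
            },
        },
    }


resources = {}
# 1200 seconds is an assumed timeout, not a recommendation.
resources.update(wait_pair("Database", "DatabaseServer", 1200))
resources.update(wait_pair("WebServer", "WebServerGroup", 1200))

# Fragment of the web server group: it is only created after the
# database server has reported a successful configuration.
web_server_group = {
    "Type": "AWS::AutoScaling::AutoScalingGroup",
    "DependsOn": "DatabaseWaitCondition",
}

print(json.dumps(resources, indent=2))
```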
Now we are ready to modify the actual configuration of the EC2 instances we will be spinning up. There are two parts to add to each of the declarations of the database server and the web server launch configuration. First, under the properties section, a new property called UserData is going to be added. This creates a script that will run on start up of the instance and effectively does the following:
- Installs the AWS configuration tools
- Runs the AWS configuration tools initialisation command
- Runs Chef Solo to install Chef Client
- Retrieves the validation PEM from the S3 bucket
- Registers the node with Chef Managed Service
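A rough sketch of what such a UserData property might look like is below. It assumes an Amazon Linux image where the AWS configuration tools come from the aws-cfn-bootstrap package and live under /opt/aws/bin, and the resource and wait handle names are hypothetical. The Chef and S3 steps are left as shell comments because the exact commands depend on your setup.

```python
import json


def user_data(resource_name, wait_handle):
    """Build a UserData property that bootstraps an instance on first boot."""
    script = [
        "#!/bin/bash\n",
        # Assumes Amazon Linux, where this package provides cfn-init.
        "yum update -y aws-cfn-bootstrap\n",
        # Process the Metadata section attached to this resource.
        "/opt/aws/bin/cfn-init -s ", {"Ref": "AWS::StackName"},
        " -r ", resource_name,
        " --region ", {"Ref": "AWS::Region"}, "\n",
        "# Run Chef Solo here to install Chef Client\n",
        "# Retrieve the validation PEM from the S3 bucket here\n",
        "# Register the node with the Chef managed service here\n",
        # Report the last exit code back through the wait handle.
        "/opt/aws/bin/cfn-signal -e $? '", {"Ref": wait_handle}, "'\n",
    ]
    return {"Fn::Base64": {"Fn::Join": ["", script]}}


database_user_data = user_data("DatabaseServer", "DatabaseWaitHandle")
print(json.dumps(database_user_data, indent=2))
```

Fn::Join is used so that values only known at stack creation time, such as the stack name and the wait handle address, can be spliced into the script.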
You may be thinking, “well how can it run Chef Solo when we haven’t installed it?”. That’s because a second section, called Metadata, needs to be added to the declaration of a server, and this sits at the same level as the properties of the EC2 Instance or Launch Configuration. This section basically holds a bunch of information about what packages to install using a given package management system, as well as what files to add to the server and where (for more information see here). This information is processed when the AWS configuration tools initialisation command is run.
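To give a feel for the shape of the Metadata section, here is a minimal sketch. The package names, the file path and the file content are assumptions chosen for illustration and will not match my template exactly; the point is where the section sits and how packages and files are described.

```python
import json

metadata = {
    "AWS::CloudFormation::Init": {
        "config": {
            # Packages grouped by package manager; an empty list means
            # no particular version is requested. Names are assumed.
            "packages": {
                "yum": {"ruby-devel": [], "gcc-c++": [], "make": []},
                "rubygems": {"chef": []},
            },
            # Files to write to the server; path and content are
            # hypothetical examples.
            "files": {
                "/etc/chef/solo.rb": {
                    "content": 'cookbook_path "/var/chef-solo/cookbooks"\n',
                    "mode": "000644",
                    "owner": "root",
                    "group": "root",
                }
            },
        }
    }
}

# Metadata sits alongside Properties, not inside it.
database_server = {
    "Type": "AWS::EC2::Instance",
    "Metadata": metadata,
    "Properties": {},  # ImageId, UserData and so on omitted
}

print(json.dumps(database_server, indent=2))
```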
So once all the extra Chef configuration is added you should get a file that looks very similar to my template.
I’ve tried not to delve too deep into the specifics of setting up the template or what each bit of code does, but I hope I’ve given an overview of how you would go about setting up a CloudFormation template that bootstraps Chef. There is a set of tutorials on how to use CloudFormation with Chef or Puppet, but my experience is that these are a little out of date, hence the need to walk through some of the steps here.
I hope you find this useful or interesting.