
Configuration Management, pt.2 - Setting up Chef

If you have read my previous post, you know a little about how I have configured servers for web hosting, which probably mirrors much of what was being done across the industry. Now I'd like to take a look at how we implemented Chef.

We started with Chef about the time that version 11 was becoming the standard. There are two main ways to implement Chef. The most popular way is using what is often called "Chef Server", meaning a central server holds the cookbooks and node data, and each "node" (web host / server) checks in with it to ensure it is configured correctly. The other way to use Chef is via "Chef Solo", which is a mechanism for running Chef on the node itself without a server. This likely involves figuring out how to move at least a portion of your Chef code base to the node via SFTP, SCM checkout, etc.

Chef Server is nice, but it is also costly if you are managing a large number of web servers. Hosted Chef is charged per node, which makes it affordable for a few instances but relatively expensive for many servers, even though managing many nodes is where Chef really excels. The core of Chef Server can be hosted on your own server for free, but beyond a few nodes it lacks the nice administrative interface and seems, to me, more like a trial experience. Because of the cost, and our philosophy of keeping overhead to a minimum, we decided to pursue Chef Solo, but we did so in a way that allowed us to manage all of our hosts from a single repository using a utility called Knife Solo.

Knife Solo

Knife Solo allows us to maintain all of our Chef code in one location. Then, as required, the libraries and code that Chef Solo needs on a node are transferred and executed remotely over an SSH connection. One downside worth mentioning about this approach is that each "DevOps Admin" must have a fully functional local setup of Chef, with common versions of Ruby and the other libraries and frameworks.

Chef Librarian

Another utility we adopted is Chef Librarian (librarian-chef), which fetches the correct cookbook dependencies for our Chef repository and helps separate our custom cookbooks from those contributed by the Chef community.

I should mention that since we adopted Chef Librarian, a similar tool called Berkshelf has become very popular and seems to be the most widely adopted at this time. Berkshelf and Librarian both accomplish the same goal in very similar ways.

Librarian uses a Cheffile to determine what dependencies are needed. The format looks something like this:

$ cat Cheffile
  cookbook 'rbenv'
  cookbook 'apt'
  cookbook 'chef-client'
  cookbook 'apache2'
  cookbook 'mysql', '~>5.0'
  cookbook 'php' #, '1.1.4' # we need to use an older version because of pear installation issue
  cookbook 'database', '~>2.3.1'
  cookbook 'git'

The Cheffile lists every dependency needed by any of the nodes we will be managing. We may not be using PHP on every host, for example, but this file establishes the requirements for the whole project, which configures every node in our portfolio.
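The `~>` operator in the Cheffile above is worth a closer look. The following is a minimal sketch that assumes Librarian follows RubyGems-style constraint semantics, so it uses Ruby's own Gem::Requirement to show which versions a constraint would accept. The helper name `satisfies?` and the candidate version numbers are hypothetical and chosen only for illustration.

```ruby
require 'rubygems' # ships with Ruby; provides Gem::Requirement and Gem::Version

# Returns true when `version` satisfies the RubyGems-style `constraint`.
# Hypothetical helper, not part of Librarian or Chef.
def satisfies?(constraint, version)
  Gem::Requirement.new(constraint).satisfied_by?(Gem::Version.new(version))
end

# '~> 5.0' accepts any 5.x release but not the next major version.
puts satisfies?('~> 5.0', '5.6.1')   # => true
puts satisfies?('~> 5.0', '6.0.0')   # => false

# '~> 2.3.1' is tighter: it accepts 2.3.x patch releases only.
puts satisfies?('~> 2.3.1', '2.3.9') # => true
puts satisfies?('~> 2.3.1', '2.4.0') # => false
```

The more digits a constraint spells out, the narrower the range of upgrades it allows, which is handy when a newer cookbook release is known to cause trouble.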

Project Setup

knife solo init sets up the project structure. Some important things to note:

  • librarian-chef will take over the cookbooks directory, so any custom cookbooks should be placed in another location. I've chosen the directory site-cookbooks.
  • A new directory called nodes is created. This is where the specific configuration for each host is stored, in JSON format. We'll look at this more closely later.

Adding Nodes

As soon as your new host instance is provisioned, make sure you can SSH into it, most likely as root. It is best to use SSH key-based authentication so that you avoid the password prompts. Make any necessary .ssh/config changes to ensure that you are connecting to the host with the correct user, etc.

knife solo prepare [host] creates our JSON file in the nodes directory, named after the host. You must use this same hostname each time you run knife solo, as opposed to the IP address, for example. This command also sets up the remote host with the bare necessities to run Chef. The JSON file created is simply:

{
  "run_list": [

  ],
  "automatic": {
    "ipaddress": "chef-demo.rapiddg.com"
  }
}

The run-list is the important part: it tells chef-solo what recipes to "cook". If you are using the timezone-ii cookbook (which sets the server timezone), for example, you can specify the parameters / variables you want to pass to that cookbook's recipes in the JSON alongside the run-list, and then put the actual recipes you want to run in the run-list.

{
  "tz": "America/Detroit",
  "run_list": [
     "recipe[apt]", 
     "recipe[timezone-ii]"
  ],
  "automatic": {
    "ipaddress": "chef-demo.rapiddg.com"
  }
}
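To make the split between attributes and the run-list concrete, here is a small sketch that parses the node file above with Ruby's standard JSON library and separates the two. The helper name `split_node_config` is hypothetical, and treating every key other than run_list and automatic as an attribute is an assumption made for this example, not a description of how Chef itself reads the file.

```ruby
require 'json'

# Splits a node file into [attributes, recipe names].
# Hypothetical helper for illustration only.
def split_node_config(json_text)
  node = JSON.parse(json_text)
  # Everything except the run list and the generated "automatic" block
  # is treated here as an attribute passed to the recipes.
  attributes = node.reject { |key, _| %w[run_list automatic].include?(key) }
  # Pull the cookbook name out of each "recipe[name]" entry.
  recipes = node.fetch('run_list').map { |item| item[/\Arecipe\[(.+)\]\z/, 1] }
  [attributes, recipes]
end

node_json = <<~NODE
  {
    "tz": "America/Detroit",
    "run_list": ["recipe[apt]", "recipe[timezone-ii]"],
    "automatic": { "ipaddress": "chef-demo.rapiddg.com" }
  }
NODE

attributes, recipes = split_node_config(node_json)
puts attributes.inspect
puts recipes.inspect # => ["apt", "timezone-ii"]
```

Since JSON objects are unordered, listing the attributes ahead of the run-list is a readability choice; the parsed result is the same either way.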

Cooking

Finally, to actually run chef-solo on the node with the specified run-list and variables, you run knife solo cook [host].
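Because each node file is named after its host, the nodes directory doubles as an inventory. The sketch below assumes that nodes/<host>.json layout and only builds the command strings; it does not run them. The helper name `cook_commands` and the second hostname are hypothetical.

```ruby
# Derives one `knife solo cook` command per node file.
# Assumes one file per host, named nodes/<host>.json.
def cook_commands(node_files)
  node_files.sort.map do |path|
    host = File.basename(path, '.json') # strip the directory and .json suffix
    "knife solo cook #{host}"
  end
end

# Hypothetical file list; in a real repository this could come from
# Dir.glob('nodes/*.json') instead.
files = ['nodes/staging.example.com.json', 'nodes/chef-demo.rapiddg.com.json']
puts cook_commands(files)
```

Printing the commands first, rather than executing them, gives you a chance to review which hosts would be touched before anything changes.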

In part III, I'll show what kinds of things you can do with Chef.