Puppet: System Administration Automated

Closures, Element Normalization, and Hell


While I have never exactly been strict about it, I've been trying to apply some of Alva Couch's closure theory to Puppet. The process has not been easy, partially because no one really seems to understand how to apply this to anything practical, but mainly because I have never been able to cleanly figure out whether the closures are demarcated in the language or the library. This has resulted in a kind of twisted dual nature where I partially did both but neither was good enough.

Instead of being able to accomplish what Alva is looking for, I seem to have devolved into a lesser goal. Rather than seeking to manage through closures, I am merely seeking element normalization. That is, rather than creating autonomous collections of elements managed through defined interfaces, I am settling for the lesser ability to create a configuration in which all configurable elements are normalized.

Closures, Opacity, and Parsing

One of the biggest influences that seeking closure had on Puppet's design was that it forced me to realize that most configuration files cannot be managed as opaque collections of text. Any configuration file that contains directives relating to multiple services, such as inetd.conf's collection of services or a crontab's collection of unrelated cron jobs, needs to be managed in a way that allows each of those directives to be managed through the closure that maps to the service that needs the directive.

For instance, suppose you have two services in /etc/inetd.conf and each of those services requires a cron job. If you cannot manage the four elements (two cron jobs and two services) independently, then you have to join the services into one shared space:

class server {
    file { "/etc/inetd.conf":
        source => "puppet://server/files/inetd.conf"
    }
    file { "/var/spool/cron/crontabs":
        source => "puppet://server/files/rootcron"
    }
}

This configuration completely hides what is inside the files, and makes no mention of the services in the files or why those files are being downloaded. Most likely, these files would be named in a way that implied what they did:

class serviceAandserviceB {
    file { "/etc/inetd.conf":
        source => "puppet://server/files/AplusBinetd.conf"
    }
    file { "/var/spool/cron/crontabs":
        source => "puppet://server/files/AplusBrootcron"
    }
}

It's immediately obvious that this is not scalable. You need a new file for every unique combination of services (with n services sharing a file, that is up to 2^n - 1 distinct files), and you are also trusting people to name every file clearly enough that the humans maintaining the configurations can understand what is going on.

I figured the correct design was to enable these elements to be managed according to need, not according to whether they happen to reside together in a file. Puppet is designed to make it easy to parse files like crontab and inetd.conf and turn their contents into separately manageable elements:

class serviceA {
    cron { serviceA: hour => [0, 11], user => root, command => "..." }
    inetdsvc { serviceA: user => root, binary => "..." }
}

class serviceB {
    cron { serviceB: minute => [0, 29], user => root, command => "..." }
    inetdsvc { serviceB: user => root, binary => "..." }
}

Now you can easily separate out the services, and apply them a la carte to any machines that need them. Additionally, you are not hiding the configuration detail in opaque files managed far from the server definitions.
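As a rough sketch of what applying the classes a la carte might look like, the fragment below assigns serviceA and serviceB to machines independently. The host names are hypothetical, and the node and include syntax is an assumption about how this would be written, not something shown in this post.

```puppet
# Hypothetical hosts; serviceA and serviceB are the classes defined above.
node "web1.example.com" {
    # This machine only needs the first service.
    include serviceA
}

node "mail1.example.com" {
    # This machine needs both, with no combined "AplusB" files required.
    include serviceA
    include serviceB
}
```

Adding a third service means writing one more class, not a new file for each combination of services.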

I figured this separation would provide me the flexibility and power to create simple closures that I could then use to manage large networks with ease. The downside of this design choice, of course, is that Puppet needs to support every little discrete type of element on every system, which is a real bear given how little consistency there is even across different Linux distributions, much less across different Unix operating systems or entirely unrelated systems.

The truth is, though, that you want this separate management anyway -- this level of separation will allow you to manage entire networks entirely through object-like interfaces, instead of collecting strings together into a file or creating unique files for every unique combination of services on your network. I am quite excited at the progress, and even though things have not progressed as smoothly as I could have hoped, I'm still convinced it was the right move. It also happens to be a design that I don't see any other configuration management tools following, so it nicely sets Puppet apart.

From Closures to Normalization

In order to try to collect these separate elements into closures, I came up with components, server classes, and nodes as an attempt to provide closure-like structures. You would have independent collections of objects that would be configured through passed-in parameters.
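A component configured through passed-in parameters could be sketched roughly as follows. This is an illustration under assumptions, not the design described here: the component name managed_service, its parameters, and the file paths are hypothetical, the define syntax follows later Puppet conventions, and inetdsvc is the type used in the examples above.

```puppet
# Hypothetical component bundling the two elements one service needs.
# $name is the title given when the component is declared.
define managed_service($command, $binary) {
    cron { $name: user => root, command => $command, hour => 0 }
    inetdsvc { $name: user => root, binary => $binary }
}

# A server class that configures the component by passing parameters in.
class serviceA {
    managed_service { "serviceA":
        command => "/usr/local/bin/serviceA-cleanup",   # hypothetical path
        binary  => "/usr/local/sbin/serviceA"           # hypothetical path
    }
}
```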

The problem was that I couldn't easily map configurations to this style of thinking, and as I modified the structures more to provide the ability to configure networks like I wanted, the connection between these structures and Alva's purer concept of closurehood seemed to lessen. Now, I find that thinking of them as closures does not provide me with much direction, and I instead get my direction from thinking about normalizing elements.

Normalization is normally (heh) used in the database world to talk about reducing the number of times a given object is repeated in a database, thus decreasing storage requirements and maintenance costs. Because we're building hierarchical configurations rather than collections of tables, the term cannot be applied in quite the same way. When I say that my goal is to normalize the configurable elements throughout my configuration, I mean that I want to develop my configuration so that any given configurable element (e.g., a file, service, package, or cron job) is only referred to once in the entire configuration.

The reason I like this goal is that it is a kind of softened combination of abstraction and seeking closure. It encourages you to write abstract configurations, pulling common elements into abstract classes which then get applied as necessary, and it provides closure-like behaviour because if a configuration is entirely normalized then any given element is only being managed through its enclosing classes.
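To make the idea concrete, here is a minimal sketch that assumes both services happen to need the same spool directory. Rather than declaring the directory in each service class, it is declared once in a shared class that each service includes. The class name spooldir, the path, and the attribute values are illustrative assumptions.

```puppet
# Hypothetical shared element: declared exactly once in the configuration.
class spooldir {
    file { "/var/spool/services":
        ensure => directory,
        owner  => root,
        mode   => "0755"
    }
}

class serviceA {
    include spooldir
    # ... serviceA's own cron job and inetd entry, as above ...
}

class serviceB {
    include spooldir
    # ... serviceB's own cron job and inetd entry, as above ...
}
```

The directory is referred to in only one place, so a change to its ownership or mode is made once, and it is managed only through its enclosing class.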

This immediately brings up sticky issues like what constitutes a separate element. For instance, are Apache versions 1 and 2 separate elements or the same element? In the short term, this kind of issue will be handled separately at each Puppet installation, but over time there should be enough configuration sharing that some kind of de facto standard can be achieved.

Enforcing Normalization

Is normalization such a high goal that it is worthy of enforcement? That is, should Puppet allow users to create non-normalized configurations? If they are allowed, then some mechanism must be developed for resolving which specification should be applied. Additionally, any hope of moving towards supporting real closures pretty much gets tossed right out the window. So, Puppet was (poorly) designed from the beginning to enforce normalized configurations -- if you attempt to create a Puppet configuration that tries to manage the same object from different classes, you will encounter an error.
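A minimal sketch of the kind of non-normalized configuration this rule is meant to reject, assuming two hypothetical classes that both try to manage /etc/motd. The class names, file contents, and node name are made up for illustration.

```puppet
# Two hypothetical classes that both claim the same file.
class banner_a {
    file { "/etc/motd": content => "managed by A\n" }
}

class banner_b {
    file { "/etc/motd": content => "managed by B\n" }
}

node "host.example.com" {
    include banner_a
    include banner_b   # same element managed from two classes: should be an error
}
```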

Well, theoretically. The implementation is a lot stickier than that. Here's where we delve into the practical aspects, though, and out of theory, and since this post is long enough as it is, I'll finish this and continue in another.


Wed, 26 Apr 2006