Puppet: System Administration Automated

Redesigning the Parser


It's amazing how translated intent tends to take on a life of its own. I apologize for how scattered this post is, but it's being written on a plane while the thoughts are developing. I'd try harder to make it clear, but the whole reason for the existence of this blog is for working through ideas like this.

Configuration Context

When I initially designed Puppet, one of my goals was to retain the tree structure created by the classes and definitions in the original manifest. In looking back, the only reason that I can find for that goal is to provide meaningful feedback to the user -- "object A in class Y had a problem" or some such.

I think that reason is important, but I did not do a good job of stating the problem, which meant my solution would tend to be just as muddled. In this case, the real problem is how to provide contextual information to the user, and the solution I chose was to retain context throughout the entire process.

This solution seems to have required some really bad design, though. In particular, if I wanted to retain the class structure as originally designed, objects would have to remain in the scope that defined them. Take the following class structure:

class unix {
    file { "/etc/passwd":
        owner => root,
        group => root
    }
}

class webserver {
    service { httpd:
        ensure => running
    }
}

include unix
include webserver

This would result in a scope tree with three individual scopes -- the top-level scope and one for each class -- and the two lower-level scopes would each contain one object.
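
As a rough illustration of that shape, here is a minimal sketch in Python. It is not Puppet's actual code; the Scope class and its attribute names are hypothetical stand-ins for whatever the parser really uses.

```python
class Scope:
    """Hypothetical stand-in for one node of the scope tree."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        self.objects = []  # (type, title) pairs declared in this scope

    def new_child(self, name):
        child = Scope(name, parent=self)
        self.children.append(child)
        return child


def count_scopes(scope):
    """Count this scope plus every scope below it."""
    return 1 + sum(count_scopes(child) for child in scope.children)


top = Scope("top")

unix = top.new_child("unix")              # include unix
unix.objects.append(("file", "/etc/passwd"))

webserver = top.new_child("webserver")    # include webserver
webserver.objects.append(("service", "httpd"))

print(count_scopes(top))
```

Each object lives in exactly one scope, which is what makes "object A in class Y" style feedback possible, and also what ties every later feature to the tree.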

This keeps the structure in a way that satisfies my initial goal. Stupidly, though, it complicates the language considerably.

Closurehood

One of the primary design goals of Puppet is that it only allow a given element to be managed in one part of a given host's configuration. In other words, two classes can't both try to manage a given file or a given package. This is heavily based on Alva Couch's Closures paper, but it's also based on a lot of work I've done, and I think it's an important part of making it easy to build consistent, maintainable configurations.

However, it's just not possible to reduce all elements to one syntactical statement, because some elements vary so dramatically and in such unforeseen ways that their configuration might be composed from many statements, rather than just one. So, the Puppet language needs a way to distinguish between intentionally modifying the configuration of an existing element and two portions of a configuration trying to manage the same element.

My initial solution to this involved scoping -- if a specified element already existed in a parent scope, then overriding would be allowed, else it would be considered an error. However, after creating this system, I realized it worked fine for variables (although it's still just as confusing) but it didn't work at all for elements. The following code is very hackish:

class unix {
    file { "/etc/passwd":
        owner => root,
        group => root
    }
    include bsd
}

class bsd {
    file { "/etc/passwd":
        group => wheel
    }
}

The include puts the bsd class in a scope below the unix class, but it's not maintainable -- the bsd class should be specifying a relationship to the unix class, not the other way around.

So, this led me to create class inheritance, so you could redo the code this way:

class unix {
    file { "/etc/passwd":
        owner => root,
        group => root
    }
}

class bsd inherits unix {
    file { "/etc/passwd":
        group => wheel
    }
}

I had to hack inheritance so that the scope of the parent class was created as the parent scope of the child class, but after that, any elements in the child class naturally override the same elements specified in the parent class.

The problem is that I didn't realize that this is the extent of overriding that Puppet should support. I created inheritance something like 4 months after I'd put the rest of the system in place, so I just added it, without reanalyzing the whole system. This parent-child relationship is the only one that should support overriding, and scopes are merely a hack to support it. In truth, scopes should have nothing to do with overriding, and it should be based entirely on how the two managing containers are related -- if one container is a subclass of the other container, then overriding should be allowed, and not otherwise.
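
The rule in that last sentence can be sketched roughly as follows. This is Python for illustration only, not Puppet's implementation; Container, ConflictError, declare, and the shape of the elements dict are all assumed names.

```python
class Container:
    """Hypothetical class/definition/node that declares elements."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent  # the class named after 'inherits', if any

    def inherits_from(self, other):
        node = self.parent
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False


class ConflictError(Exception):
    pass


def declare(elements, container, type_, title, **params):
    """Add an element, or override it only from a subclass of its owner."""
    key = (type_, title)
    if key not in elements:
        elements[key] = {"owner": container, "params": dict(params)}
        return
    existing = elements[key]
    if not container.inherits_from(existing["owner"]):
        raise ConflictError(
            f"{type_}[{title}] is already managed by {existing['owner'].name}"
        )
    existing["params"].update(params)
    existing["owner"] = container  # a further subclass may override again


elements = {}
unix = Container("unix")
bsd = Container("bsd", parent=unix)   # class bsd inherits unix
solaris = Container("solaris")        # unrelated to unix

declare(elements, unix, "file", "/etc/passwd", owner="root", group="root")
declare(elements, bsd, "file", "/etc/passwd", group="wheel")
```

Here the second declaration is accepted because bsd is a subclass of unix, while the same declaration coming from solaris would raise ConflictError. No scope lookup is involved at all.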

As it is, stupid code like this overrides instead of conflicting:

class solaris {
    file { "/etc/ssh/sshd_config":
        owner => bin,
        group => bin
    }
}

class ssh {
    file { "/etc/ssh/sshd_config":
        owner => root,
        group => root
    }
}

file { "/etc/ssh/sshd_config":
    source => "/nfs/apps/ssh/sshd_config"
}

include solaris, ssh

It overrides because the top scope mentions sshd_config, so each of the child scopes overrides that parent object. This is obviously dumb -- each of these scopes has unrelated intent; they just happen to mention the same object (the conflict is clear in such a small snippet, but gets confusing quickly in a larger configuration).
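
To make the failure concrete, here is a small Python sketch of scope-based overriding as described above. It is a simplification under assumptions: the Scope class and its methods are hypothetical, and it leaves out the check that reports a conflict when the element exists only in a non-ancestor scope.

```python
class Scope:
    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.elements = {}  # (type, title) -> dict of parameters

    def lookup(self, key):
        """Search this scope and then its ancestors for an element."""
        scope = self
        while scope is not None:
            if key in scope.elements:
                return scope.elements[key]
            scope = scope.parent
        return None

    def declare(self, type_, title, **params):
        key = (type_, title)
        inherited = self.parent.lookup(key) if self.parent else None
        if inherited is not None:
            inherited.update(params)      # silently treated as an override
        else:
            self.elements[key] = dict(params)


key = ("file", "/etc/ssh/sshd_config")

top = Scope("top")
top.declare(*key, source="/nfs/apps/ssh/sshd_config")

solaris = Scope("solaris", parent=top)    # include solaris
solaris.declare(*key, owner="bin", group="bin")

ssh = Scope("ssh", parent=top)            # include ssh
ssh.declare(*key, owner="root", group="root")

print(top.elements[key])
```

Both classes end up modifying the single top-level file, and whichever is included last wins, with no error to flag that solaris and ssh disagree.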

Overriding Definitions

A further complication is that the current mechanism does not even allow overriding definitions:

define remotefile(source) {
    file { $name:
        source => "/nfs/apps/config/$source"
    }
}

class base {
    remotefile { "/etc/ssh/sshd_config":
        source => "ssh/sshd_config.base"
    }
}

class sub {
    remotefile { "/etc/ssh/sshd_config":
        source => "ssh/sshd_config.sub"
    }
}

This won't work, because the definition gets evaluated, instead of getting stored in the current scope, so it's not available for overriding. This is a huge oversight, because it severely limits the ability to build truly abstract configurations.

Rethinking the Language

So, I set out to come up with a solution to this problem. How do I keep the features that I want but get rid of the confusing scope-based overrides and also support definition overrides?

At first I thought about adding on -- what additional code do I have to write to make this work? I came up with a few workable solutions, but they were generally pretty hackish and resulted in an even more complicated system internally. Finally I began rethinking the whole thing, taking it down to the basic questions -- what do I really need, and what's the best way to accomplish it?

I began to rethink the way that elements are stored in the given scope. It's critical to how overrides work right now (scopes are searched for a given element name and type, and if found are overridden), but it also makes the "right" way to do things far more complicated. This got me thinking about the fact that element statements are currently strange, linguistically, in that they don't have values -- that is, they result in a modification to the scope tree, but they don't return a value. Assigning an element to a variable isn't even syntactically legal in Puppet because it would only make sense if elements returned a value, and it doesn't make sense for elements to return a value and be stored in the scope tree, so I just store them in the scope tree.

But what if elements did return a value? That is, what if an element statement's value was the element itself? That would require that classes and definitions and nodes all return, as their values, the collected list of elements, which means that I lose the class structure that I've been using to retain configuration context.

Except that, really, I don't need it any more -- the context is already being provided by two attributes, tags (which are essentially the names of every class, node, and definition containing the element) and the path (which is a string that maps directly to the class path to the element). I can't come up with a use for configuration context that these two features don't provide, so it seems that this entire structure is unnecessary.

Collapsing the scope tree using values (element statements return the specified element as a value, and enclosing structures return the list of specified elements as values) immediately provides inroads into a much, much, much better language. For one, you just throw away all scope-based object overrides -- you need to provide a special system for subclasses overriding parent classes, but that should be relatively clear and very specific. For another, it makes the collection process recently discussed on the list (where one host can collect specified elements from other hosts) much more straightforward -- you need only provide a syntax for specifying an element that stores it centrally rather than returning itself as a value, and then a syntax for retrieving centrally specified elements. It's a much more consistent feature of the language when values are used.

The Results

The biggest change to the language is that evaluation of the parse tree should return a flat list of elements. With this change, the parser will have to treat inheritance specially, in that overriding elements within a class tree must still be supported, but this should be relatively straightforward, at least compared to how overriding is done now. Context will be retained by adding tags and paths to elements as they are created.
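
A minimal sketch of value-returning evaluation, in Python and under assumptions: the element and container helpers, the dict layout, and the "main" name for the top level are all made up for illustration, and inheritance overrides are left out.

```python
def element(type_, title, **params):
    """An element statement evaluates to a one-item list holding itself."""
    return [{"type": type_, "title": title, "params": params,
             "tags": [], "path": []}]


def container(name, *bodies):
    """A class, definition, or node evaluates to the flat list of its
    elements, stamping each one with its own name as a tag and as a
    path component."""
    result = []
    for body in bodies:
        for elem in body:
            elem["tags"].insert(0, name)
            elem["path"].insert(0, name)
            result.append(elem)
    return result


config = container(
    "main",
    container("unix",
              element("file", "/etc/passwd", owner="root", group="root")),
    container("webserver",
              element("service", "httpd", ensure="running")),
)

for elem in config:
    print("/".join(elem["path"]), elem["type"], elem["title"], elem["tags"])
```

The result is a flat list with no tree to walk, yet each element still knows which classes it came from.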

This makes the 'component' subclass of Puppet::Type entirely unnecessary -- its only purpose before was to provide a container that mapped to the class tree. It should also make the whole process much clearer, from parse tree evaluation to configuration application.

Wrinkles

The only problem I can't yet think of how to solve is that of component dependencies:

define remotefile(source) {
    file { $name: source => "/nfs/config/$source" }
}

remotefile { "/etc/ssh/sshd_config":
    source => "ssh/sshd_config"
}

service { sshd:
    ensure => running,
    require => remotefile["/etc/ssh/sshd_config"]
}

This type of requirement would no longer be allowed, because the 'remotefile' object would not even exist on the client, so there would be no way to look it up using the current mechanisms.

This is a somewhat contrived example, because the service should probably actually require 'file["/etc/ssh/sshd_config"]', but there will certainly be cases where the elements in a given component will need to be applied before other elements can be, so the requirements subsystem needs to be able to take component and class membership into account for both sorting and notification (e.g., "if anything in component A changes notify service X").

At least the sorting aspects are relatively easily solvable -- sorting is currently done in Puppet::Type::Component, and it is being deprecated, so sorting will need to be redone anyway; I just need to redo it in a way that accounts for either direct dependency or container-level dependency.

Notification is already complicated, but it's also cleanly in one area of the code. In fact, $10 says it already doesn't behave exactly as I'd like, so it probably would just continue to be broken in the same way.

Generally, there would be two generic ways to solve these problems -- either make the dependency system able to take advantage of the remaining contextual information (tags or paths), or move dependencies into the language.
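
The first option might look something like the Python sketch below. It assumes, hypothetically, that each element carries a tag naming the component instance that produced it; the dict layout and the expand helper are invented for the example, and notification is not covered.

```python
from graphlib import TopologicalSorter

# Flat element list after evaluation; names and tags are illustrative.
elements = {
    'file["/etc/ssh/sshd_config"]': {
        "tags": ["remotefile", 'remotefile["/etc/ssh/sshd_config"]'],
        "require": [],
    },
    'service["sshd"]': {
        "tags": [],
        "require": ['remotefile["/etc/ssh/sshd_config"]'],
    },
}


def expand(requirement, elements):
    """Resolve a requirement to element names: a direct name matches
    itself, anything else is treated as a tag naming a container."""
    if requirement in elements:
        return [requirement]
    return [name for name, e in elements.items() if requirement in e["tags"]]


# Map each element to the set of elements that must be applied first.
graph = {
    name: {dep for req in e["require"] for dep in expand(req, elements)}
    for name, e in elements.items()
}

order = list(TopologicalSorter(graph).static_order())
print(order)
```

The service never needs the 'remotefile' object to exist on the client; the requirement is rewritten into dependencies on whatever elements carry the matching tag.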

Hmmm.

I'll continue thinking about it.


Sat, 22 Apr 2006

