Caching and REST
(by Luke)
One of the things I was supposed to write about last week was how I'm rethinking some of Puppet's internal caching. This rethinking is a direct result of listening to ThoughtWorks' IT Matters Podcast on REST (I've only listened to part 1 so far). I actually listened to the episode three times, because it's only about 20 minutes and I listened to it on a 60-minute bike ride, which worked well because it was so windy that day that I didn't hear the whole thing on any of those listenings.
I'll hopefully write later about how this podcast made me rethink how environments are used in fileserving, but for now, I'm going to focus on caching.
Indirection
For a couple of months now, Puppet has had an Indirector module that is basically useful for connecting classes with collections of instances of those classes. The only reason you'd really even bother to use it is if you had multiple collections, and needed to interact with different collections at different times, but you wanted those differences to be transparent.
For instance, when retrieving node information, you just call this code:
Puppet::Node.find("mynode")
Somewhere else, you'll have configured which collection (the word I'm currently using is terminus) this uses, and the Indirector just delegates the find call to the right collection. For nodes, you might be using the exec collection, which calls an external script, turns the resulting YAML into a Node instance, and returns it (or returns nil if nothing was found).
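A minimal sketch of that delegation, in plain Ruby. Everything here is hypothetical and for illustration only: the Indirection, MemoryTerminus, and Node classes below are not Puppet's real internals, and the terminus is backed by a Hash instead of an external script.

```ruby
# Hypothetical sketch; none of these names are Puppet's real classes.
class Indirection
  def initialize
    @termini = {}   # terminus name => object responding to find(key)
    @active = nil
  end

  def register(name, terminus)
    @termini[name] = terminus
  end

  # Choose which collection (terminus) later find calls are sent to.
  def use(name)
    raise ArgumentError, "unknown terminus #{name}" unless @termini.key?(name)
    @active = name
  end

  def find(key)
    raise "no terminus selected" if @active.nil?
    @termini[@active].find(key)
  end
end

# A stand-in terminus backed by a Hash rather than an external script.
class MemoryTerminus
  def initialize(data)
    @data = data
  end

  # Returns nil when nothing is found, like the exec collection described above.
  def find(key)
    @data[key]
  end
end

class Node
  INDIRECTION = Indirection.new

  def self.indirection
    INDIRECTION
  end

  # Callers never see which collection answers.
  def self.find(key)
    indirection.find(key)
  end
end

Node.indirection.register(:memory, MemoryTerminus.new({ "mynode" => { "classes" => ["base"] } }))
Node.indirection.use(:memory)
found = Node.find("mynode")     # the Hash stored for "mynode"
missing = Node.find("missing")  # nil
```

Swapping the collection is then a single call to use with a different name; the Node.find call sites do not change.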
I think the Indirector is pretty cool, and it's certainly simplified a lot of my modeling of interacting with different sources of information. Those who are familiar with REST, at least how it's usually done in the Ruby world, will recognize the find as one of the methods usually used for REST interfaces -- it's mapped to the HTTP verb get. One of the primary design goals of the Indirector was to facilitate REST interfaces, so the methods we're indirecting are, not coincidentally, exactly the methods you'd implement for REST support.
Caching
One of the later additions to the indirection code was support for cache collections. That is, you might have a canonical collection, and then a cache collection for speed or proximity purposes. Following our Node example above, if you were using the exec collection, you'd probably want to have the results cached in the yaml collection, so they were inexpensive to retrieve.
The critical question with any caching system is how to know when the cache is dirty. How do you know if you should use the cached node information or go back to the source?
I expect there are as many answers to this question as there are caching implementations, just about. I had never implemented a caching solution before, and I probably misinterpreted my discussions with Rick Bradley, because I ended up choosing a not-very-good system. The current cache invalidation mechanism is based on relative versions: If the version of the cached object is older than the version of the object in the other collection, then your cache is dirty.
What is a version? Well, normally it's just the timestamp of when the instance was created. This might work okay for some systems, but in general, the timestamp ends up being pretty useless. Look at our Node example -- the timestamp of the exec collection is always later, because we retrieve the cache version, then generate a new node using the exec collection, and compare. Duh. The answer's always the same.
Even worse, in most situations the cache doesn't save you any work, because you're pulling fresh data from the original source. If we have to re-execute the external node script to get the latest node version, we haven't saved any effort at all, we've just added a bunch of useless work, which is stupid.
Puppet 0.24.4 "fixed" this problem by saying that the cached node's version was the timestamp of the node's Facts cache. If the facts are updated, then the cache needs to be updated. This seems to mostly work, but it feels like a hack for something that should be easy.
TTL
So, on to the podcast. It was a good podcast in general, and they focused a good bit on caching. At first I found this pretty strange -- why is caching an important design criterion? As they talked, though, I realized that a generalized, simple caching model is useful a lot more places than I would expect, including in Puppet.
There didn't seem to be any disagreement over the best way to handle knowing when a cache is dirty -- they apparently just use time-to-live (TTL) or expiration headers. I think it was the second time listening through that I realized that the vast majority of my caching problems could be fixed with this.
Puppet has a natural TTL for most of its information -- every host runs every half an hour, so if you set a TTL of half an hour (or whatever your run interval is), then you'll get fresh data once a run, and cached data the rest of the time. In the above Node scenario, the exec collection would set the TTL of the node (so that your external node app could pick its own TTL), or Puppet would have a default TTL equal to the run interval. Then, when Puppet goes to check whether its cache is dirty, it could just compare the cached object's expiration time against the current time -- no need to hit both collections, and no arbitrary definition of "version".
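A rough sketch of TTL-based caching along these lines. The TTLCache class, its fetch method, and the RUN_INTERVAL constant are assumptions made up for this example, not anything in Puppet; the block passed to fetch stands in for the canonical collection.

```ruby
# Hypothetical sketch; the class and method names are not Puppet's.
class TTLCache
  Entry = Struct.new(:value, :expires_at)

  def initialize(default_ttl)
    @default_ttl = default_ttl   # seconds
    @entries = {}
  end

  # Return the cached value while it is fresh; otherwise call the block
  # (the canonical source) and cache its result with a new expiry.
  def fetch(key, now: Time.now, ttl: @default_ttl)
    entry = @entries[key]
    return entry.value if entry && now < entry.expires_at

    value = yield(key)
    @entries[key] = Entry.new(value, now + ttl)
    value
  end
end

RUN_INTERVAL = 30 * 60   # half an hour, in seconds
cache = TTLCache.new(RUN_INTERVAL)
lookups = 0
source = lambda do |name|
  lookups += 1
  "node data for #{name}"
end

start = Time.at(0)
cache.fetch("mynode", now: start, &source)                 # hits the source
cache.fetch("mynode", now: start + 60, &source)            # served from the cache
cache.fetch("mynode", now: start + RUN_INTERVAL, &source)  # expired, hits the source again
```

Only the cached entry is consulted to decide freshness, so the expensive source is touched once per run interval. The ttl keyword lets an individual lookup override the default, which is one possible answer to who sets the TTL.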
This actually makes even more sense with the current problem I'm trying to solve. I'm trying to remodel the SSL certificate signing process, and it's gotten pretty messy. With this, though, you just set the TTL of the certificate to its own internal TTL, and you use the local system as the cache and the CA server as the ultimate source. If there is a local cert and it's still valid, use it; if there's a local cert but we're past its TTL, then discard it and get a fresh cert; if there's no cert, then get one from the server and cache it locally.
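The three cases can be sketched as follows. This is only an illustration under simplifying assumptions: a certificate is reduced to a hypothetical Cert struct with a name and an expiry time, LocalCertCache is an invented name, and the CA server is stood in for by a lambda.

```ruby
# Hypothetical sketch: a certificate is reduced to a name and an expiry time.
Cert = Struct.new(:name, :not_after)

class LocalCertCache
  def initialize(ca)
    @ca = ca        # anything responding to call(name) and returning a Cert
    @local = {}
  end

  def cert_for(name, now: Time.now)
    cached = @local[name]
    return cached if cached && now < cached.not_after   # local cert, still valid

    @local.delete(name)             # expired (or absent): discard whatever is there
    @local[name] = @ca.call(name)   # get a cert from the CA and cache it locally
  end
end

issued = 0
ca = lambda do |name|
  issued += 1
  Cert.new(name, Time.at(1000 * issued))   # each new cert expires later than the last
end

store = LocalCertCache.new(ca)
first = store.cert_for("host.example.com", now: Time.at(0))     # no cert yet: asks the CA
again = store.cert_for("host.example.com", now: Time.at(500))   # still valid: same cert
fresh = store.cert_for("host.example.com", now: Time.at(1500))  # expired: asks the CA again
```

The point of the sketch is that the expiry travels with the certificate itself, so no separate version or TTL bookkeeping is needed.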
Next Steps
I don't have the whole thing figured out mentally yet, but I'm pretty close. At the least, the next step is to replace the current broken version-based cache with ttl-based caching. The two things I most need to resolve are:
- Who's responsible for the ttl? Is it the indirection (e.g., Node), or the collection (e.g., the external node script)?
- How does the user configure the ttl? Say I want the ttl for my node to be 30 seconds instead of thirty minutes, or I want to invalidate the cached values for all nodes; how would I do that?
Obviously, these two things are linked -- the user needs a complete configuration path from the command line or configuration file to the bit that actually sets the ttl.
For now, fortunately, I don't need to worry about it, because I can just stick with the run interval as the TTL for essentially everything I'm doing. As things get more interesting, though, we're going to want to configure these values, because....
TTL Can Help Provide Change Control
One of my primary goals in moving the catalog compiling process to REST is to enable a decoupling between compiling and applying. In other words, I want people to be able to apply a configuration without recompiling.
Imagine a configuration TTL of a week -- every host recompiles its configuration during some specific maintenance window, like Sunday morning between 2 and 6 am. They still apply their configurations every half an hour, but that's normally just validating that nothing has drifted.
Obviously, this wouldn't be used by most shops -- most people would still want all hosts to recompile every time. But for those shops that are highly worried about change control, or those who want to do rolling upgrades, where they upgrade 10% of a pool of servers at a time, this would help a lot. You take your pool of servers, trigger a recompile on 10%, and once you're confident they're working, you trigger a recompile on another 10%, and so on.
Once you can do that with Puppet, it'll feel almost enterprisey. :)
Wed, 02 Apr 2008 | Tags: programming, thoughtworks, podcast, rest, caching, design, ruby, api, luke
Reification in Puppet
It's apparently Ruby day in Puppetton (as in, Puppet + Hobbiton, or something like that). I've been trying to add more Ruby blogs to my blogroll, and I guess they're starting to succeed.
Piers Cawley has a post about reification:
However, in OO circles (or maybe just in my head), reification is a good thing. Its the process of taking something abstract and turning it into a real object. Usually, the word gets used for big things like turning an intractable method into an object as a step on the way to refactoring that method. I tend to use it in a slightly broader sense. For me, reification is the process of turning something (a method or a data structure usually) into a full blown object with its own behaviour.
This is something I've only recently started thinking about as a concrete process. I learned to program (according to most definitions) in the perl world, and I relied ridiculously heavily on hashes as my mechanism for managing all of my data. Even when you go OO in perl, you're still relying on hashes, so there isn't as much impetus to reify as you might find in Ruby.
That reliance on hashes has stuck with me over the years, and I'm only just starting to get over it. A heckuva lot of parts of Puppet have been modeled as hashes long past their useful lifetimes, and even when I correctly use a separate class to model something, I'll often stupidly avoid differentiating two uses of a class until it's so painful I can't deny it any longer.
Probably the biggest example of this is that I did not have a parser-specific resource class until relatively recently (sometime in 2007, anyway, I think). Before it existed, I was using the same Hash-like class I use for transferring configurations over the wire. Hey, it existed, it stores parameters, yay. The problem is that the two classes share almost no other functionality. Parser resources have tons of special behaviour -- they get to determine whether a class can override a given parameter, they have to add default values, they can be exported or virtual, and they can model types that are defined in the parser, as opposed to only types built into the RAL.
When I finally couldn't take the pain any more, I created a simple parser resource class. Then, before I knew it, I moved tons of code into that class, and all of the parser structural classes -- mainly the parser, the interpreter, and the scope classes, at the time -- became much easier.
I'm actually going through all of this again right now. As insane as it sounds, until two weeks ago, Puppet did not have a Configuration class to model a specific configuration. Well, really, a configuration is a collection of resources, and it's the resources that matter, right? Except there's so much more than that -- the resources have relationships to each other, and those relationships really matter. More importantly, there's a lot of code related to these configurations. When I created a separate Configuration class, my Interpreter class went from 708 lines to 105. That means that 6/7 of my Interpreter class was actually Configuration code, but because I hadn't reified my configuration class, I didn't realize it.
It's actually stupider than that. When I first created a Configuration class, it only modeled the configuration as it was being compiled; I still returned something that wasn't a stand-alone object that could have code attached. When I finally realized, I started really punching myself in the head. I now have a Compile class that manages interaction between the parser and the configuration, and I have a Configuration class that is the result of a compile. It really is a world of difference.
The way that I tend to think about this process is that I'm converting "else" code into "me" code. That is, the Interpreter previously had a ton of code that interacted with something that wasn't itself, and now essentially all of that code has been moved into the Compile and Configuration classes, and all of that code can pretty much be used as private methods, meaning it's all "me" code now -- it's the classes talking about themselves, instead of a class talking about something "else".
This provides what might be a good way to notice whether it's time to reify -- if you have a lot of code that uses receivers on the methods, then you're mostly interacting with something else. There's a very good chance that "else" should be its own class, and all that code should become methods on that new class.
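A toy illustration of the difference, with invented class names and a deliberately tiny configuration model. In the first class every method takes a receiver that is something "else"; in the second the same logic has moved onto the object it describes.

```ruby
# Before: the interpreter reaches into a bare Hash ("else" code).
class InterpreterBefore
  def resource_count(config)
    config[:resources].length
  end

  def add_edge(config, from, to)
    (config[:edges][from] ||= []) << to
  end
end

# After: the same logic as methods on a reified class ("me" code).
class Configuration
  def initialize
    @resources = []
    @edges = Hash.new { |hash, key| hash[key] = [] }
  end

  def add_resource(name)
    @resources << name
  end

  def add_edge(from, to)
    @edges[from] << to
  end

  def resource_count
    @resources.length
  end

  # Return a copy so callers cannot modify the internal edge list.
  def dependents_of(name)
    @edges[name].dup
  end
end

config = Configuration.new
config.add_resource("File[/etc/motd]")
config.add_resource("Service[sshd]")
config.add_edge("File[/etc/motd]", "Service[sshd]")
count = config.resource_count   # 2
```

Note how the "before" methods each need the Hash passed in and must trust its shape, while the "after" methods need no argument for the thing they operate on.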
This is something I'm still working on in Puppet. I've already created a bunch of new classes for the next release (Compile, Configuration, Node, Facts, Checksum) and I'll be shocked if I don't end up with more. The key to these classes is that I'm not adding code -- I'm taking code that already exists but talks about something "else" and making it talk about itself. In the majority of cases, this results in a significant reduction in line count, because I lose all the reference management and argument validation that has to go on when a class has to assume some hostile external force is interacting with it. I even get to make a lot of that code private, because no one needs to care about it any more.
Tue, 11 Sep 2007 | Tags: ruby, puppet, reification, refactoring, oo, design, programming
Canonical Names vs. Colloquial Names
Based on my experience with cfengine, one of my initial design requirements for Puppet was that it support two names for every object; I have been calling these names the canonical name and the colloquial name.
For instance, take the ssh daemon. Or is that the sshd daemon? Or
the openssh daemon? Unfortunately, it depends on your operating system,
and potentially even on your environment (e.g., you could create a custom
init script that called it whatever you wanted), which is exactly the
point. We humans have a single name that we use to refer to all of these
daemons -- personally, I always think of it as "the ssh daemon", or maybe just
"sshd" -- but the computers have their own names.
The first name, the name that humans apply consistently throughout a site, I have been calling the canonical name ("canonical" because it's true everywhere), and the second name, the name that varies with every platform or whatever, I have been calling the colloquial name ("colloquial" because it's based on the local dialect, so to speak (pun intended)).
However, both of those names are, um, way too long to use as method names.
Puppet always requires a colloquial name, and it uses that as the default for
the canonical name -- that is, if you do not provide a canonical name, then
the colloquial name is used. However, the difference between the two is, um,
a touch embarrassing -- the canonical name is retrieved by calling the
name method on a Puppet type, and the colloquial name is retrieved by
getting the name attribute of a Puppet type (e.g.,
name = obj[:name]).
Yes, this is hideous, and no, it's not reasonable for me to expect anyone else to understand this. But it's worse than that.
I've been planning on moving Puppet types from using the hash-style attribute
retrieval methods ([] and []=) to using standard gettor and settor
methods (e.g., name and name=). The problem is, of course, that you
would then get name clobbering -- there would be no way to distinguish the two
names.
I actually wrote up all of the code to make this change (two different ways, even) at the beginning of 2006, but this naming problem stymied me, so I left it alone (I did the work mostly on a lark, while I was travelling -- I was hoping for performance improvements, but I found them elsewhere).
Now, as I continue thinking about providers and abstraction and modeling and
those other things that keep me up at night and put the rest of you to sleep,
this problem is getting more pronounced. As I mentioned, I'm working
on potentially adding a layer above the types to handle the @is and
@should values (or, as I've been describing it, handling the three C's --
collect, compare, commit), and changing the types so that direct method calls
work. In that case, it makes much more sense to get rid of this hash syntax.
(Yes, it must be said, Puppet was initially inspired just a bit too much by
hashes.)
So, I think I'm going to start calling the canonical name the "title", and
continue using "name" for the colloquial name. The language will continue
preferring the title over the name (you probably didn't know that it did
that), and in the majority of cases this won't matter to you. But I will have
to modify the Transportable classes to use title instead of name, and
I'll have to make a few other internal changes. Once this is done, though,
and it should be pretty straightforward, I can move to
file.uid = 0 instead of file[:uid] = 0, which I think makes more
sense; certainly it will be better for users of Puppet's library interface.
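A small sketch of what the title/name split could look like with ordinary accessors. The Resource class here is hypothetical and much simpler than a real Puppet type; it only shows the title defaulting to the name and the method-style attribute access.

```ruby
# Hypothetical sketch, not Puppet's actual type classes.
class Resource
  attr_accessor :name, :uid
  attr_writer :title

  def initialize(name, title: nil)
    @name = name     # colloquial: what this platform calls the thing
    @title = title   # canonical: what humans call it everywhere
  end

  # The title falls back to the name when none was given.
  def title
    @title || @name
  end
end

sshd = Resource.new("openssh-daemon", title: "sshd")
sshd.title   # "sshd"
sshd.name    # "openssh-daemon"

file = Resource.new("/etc/motd")
file.title   # "/etc/motd", because no title was provided
file.uid = 0 # method-style assignment instead of file[:uid] = 0
```

With two distinct method names there is no clobbering: title and name can both be plain getters.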
Wed, 16 Aug 2006 | Tags: puppet, design, naming
Puppet's Two Abstraction Layers
Today I basically concluded I'd finished with providers, but I was stupid enough to spend a while on the phone with a friend who's kind enough to talk about Puppet with me.
In the course of this discussion, I realized that I've been stupidly conflating two layers within Puppet this whole time. Making providers into a separate layer is a good start toward fixing that, but it's not enough.
See, there are two aspects to the type library within Puppet: The first
aspect functions as an abstraction layer above the operating system: I say
"Create user johnny" and it says "useradd johnny" or
"niutil -create / /user/johnny", depending on the platform or my preferences
or whatever.
The second aspect involves the tool that Puppet is; the transactional configuration management tool. This tool is inherently two-phase: I must first collect the state of any given object, then compare it to the prescribed state, and then either log or fix the problem.
These layers should be entirely decoupled within Puppet, but they are not. There's no good reason for their lack of separation, I just didn't have a clear design in my head when I started oh so long ago. Now that I have providers created for a few types, I realize that these providers should probably handle validation, for instance -- clearly, anyone who ever uses the providers outside of Puppet will want the same validation that Puppet has. And they'll certainly want the same documentation that Puppet has.
But then you walk into these complicated problems -- what stays in the transactional layer? Anything? Is it even possible to define a generic API between these layers, such that they don't have to be coupled? Right now, types and their associated providers share a unique API -- the service type expects a specific set of methods on service providers, and the package type expects a completely different set of methods on the package providers. Is it possible to fix this?
I can't see the separation; I don't know exactly where it sits. It's funny, because this keeps causing flashbacks to organic chemistry lab, where the question was always, "which layer do I keep?" You always had an aqueous layer and a nonpolar layer, and your experimental result was in one layer and the other was waste. In this case, though, it feels like a gradient in which I have to draw a hard line that doesn't really exist.
It seems a lot like 99% of what's currently in what I'll call the transaction layer -- the types -- should be pushed into providers. This includes the docs, the validation, the value specification, etc. Most likely, the transactional layer can be reduced to a single class, rather than one class per type. I expect that the transactional layer will keep most of the metaparams, since scheduling clearly takes place at that layer not at the provider layer, and relationships are also at that level.
This means that the transactional layer's job is basically just to do the three-step: Retrieve, compare, sync. It shouldn't need much more than that.
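One way that three-step could look if providers shared a generic API. This is a sketch under assumptions: the get/set interface, the MemoryProvider stand-in, and the Transaction class are all invented for illustration, and whether such a generic API is actually workable is exactly the open question above.

```ruby
# Hypothetical provider: exposes current state through a uniform get/set API.
class MemoryProvider
  def initialize(state)
    @state = state
  end

  def get(property)
    @state[property]
  end

  def set(property, value)
    @state[property] = value
  end
end

# One generic transactional class rather than one class per type.
class Transaction
  def initialize(provider, should)
    @provider = provider
    @should = should   # prescribed state: property => value
  end

  # Retrieve, compare, sync. With noop the differences are only reported.
  def evaluate(noop: false)
    changes = []
    @should.each do |property, want|
      is = @provider.get(property)                # retrieve
      next if is == want                          # compare
      @provider.set(property, want) unless noop   # sync
      changes << [property, is, want]
    end
    changes
  end
end

user = MemoryProvider.new({ shell: "/bin/sh", uid: 1001 })
txn = Transaction.new(user, { shell: "/bin/bash", uid: 1001 })
reported = txn.evaluate(noop: true)   # reports the shell difference, changes nothing
applied = txn.evaluate                # fixes the shell
settled = txn.evaluate                # nothing left to do
```

The transactional class knows nothing about users or shells; validation and documentation would live with the provider.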
Ouch. This is nearly a complete rethink of the whole damn system. Fortunately, it should have almost no impact from the outside -- no language changes, and very few changes to what amounts to the external API -- but it's still, um, big.
Sun, 13 Aug 2006 | Tags: puppet, abstraction, design
Web 2.0 Is About Possibilities
I recently read Russell Beattie's Where's the Ambition? post, and I have to agree with his main premise, which is that most Web 2.0 companies are both unambitious and generally uninspiring. I'll go one further, though, and claim that I have a good idea why.
I think the main problem with Web 2.0 is that people are too focused on applying Web 2.0 principles to existing solutions, instead of trying to create solutions that weren't possible before. For instance, there's a big hullaballoo over Zimbra's web-based office productivity suite, but I think it's overrated. Sure, it looks like a great product, and they had to do some innovation to get it online, but really, office suites are not innovation. Zimbra is clearly competing against MS Office, and if your product has a clear competitor, it isn't that innovative.
This is why so many of the Bubble 1.0 companies failed miserably -- they didn't add anything to the mix other than the fact that you were doing X online. Online pet food vs. offline pet food? Who cares. Ebay is a great counterexample, because they really enabled something that wasn't possible before, as did all of the big players who made it through the bubble (and plenty of the smaller players did, too).
Look at the heroes of Web 2.0, and it should be pretty obvious that none of them had competitors when they started. Sure, you could say that Flickr competed with Ofoto, but you'd be silly for saying so, since they had completely different purposes, they just both happened to involve personal pictures. del.icio.us obviously wasn't trying to compete with anything, and 37signals has always prided itself that its software is so much simpler than any potential competitors' that they don't actually compete.
When a space is mature, then it's time to start innovating within that space, by doing things like innovating on smaller features (like Zimbra's deeper integration between the office suite and the back office), but it's much tougher to compete in a mature space, and buzzwords are never enough. It's not only tougher, though, it's stupid. There are so many areas ripe for innovation. As Paul Graham has said, find some part of someone's life that sucks and make it great, and you'll make money. The thing is, there are hundreds of specialized careers in the world, and every one of those careers has some suckage that could be massaged away with a good application. Not every app can be useful to every person, but not every app needs to be. Lawyers have tons of money, spend tons of money, have extremely specialized needs, and are barely targeted at all with online applications. If you started a great web app to help lawyers do their jobs, it'd be a good bet that you wouldn't have any competitors. Sexy? No. Lucrative, interesting, and probably actually helping the world out? You betcha.
The whole point of Web 2.0, to me, is that it makes possible things that were previously prohibitively difficult. If you are just using Web 2.0 on things you could do before, then please stop.
My main software product is about server automation and configuration management, but in studying all the innovation online recently, I'm realizing that the real innovation isn't in the automation itself, it's in providing more channels of information between the different servers and between the servers and the sysadmins. Puppet's real innovation will be in combining the semantic information derived from full configuration management with these extra channels and feedback loops to make the data much richer. You won't need to mine the data produced by your systems, because it will be labeled at the get-go.
Who is this product competing with? Well, the configuration management part is competing with cfengine, but it's not competing with anyone on the back end. Literally, there is no one trying to take the intent behind your configurations and carry that intent along the entire feedback loop. Instead, there are all these companies that take your intent, use it to produce a system, take the system's output, and try to mine your intent back out. That's just stupid. Why not close the loop in the middle, and avoid the whole mining part in the first place?
This has become Puppet's goal, even if it's taking me some time to make that transition.
Thu, 04 May 2006 | Tags: design
One Dimension Is Not Enough
As I search for a partner, doing silly things like looking for people via LinkedIn and managing my just-created calendar at Airset, I'm thinking about why these sites are never as good as you want them to be.
I think it's because they really only try to do one thing -- they have a hook, but nothing else. Sometimes this works (and it's actually working relatively well with Airset), but there are usually 10 things I would like to do with my data that I can't do, and if you're going to make me upload that data, then the least you can do is come up with tons of great things to do with the data.
For instance, LinkedIn wants a copy of my entire addressbook, for one specific purpose. Great, there's now another duplicate of that data (in addition to my Mac address book, my Pine address book on SVN but on two different machines, and my phone), with one straight benefit, and it's not even supported by iSync (not that Pine is, either).
The reason I hesitate to use this is because I want some metabenefit that I wasn't thinking of, but there isn't one. I can think of 20 ways you could make my life easier WRT my addressbook data, but this doesn't solve any of the ones I was thinking of, and it only does a mediocre job of solving the one it set out to solve.
Which brings me to the point of the post -- one dimension of usefulness is not enough. You can choose a single dimension to be core, but you need to stack on other dimensions of functionality fast and thick to really deliver value.
LinkedIn isn't the only one doing this. PubSub has some interesting ideas (although I cancelled all of my subs in about 2 days, since I got a useful hit rate of about 1%), but they don't seem to be making an effort to mash in all of the other great ideas that obviously apply, like clustering.
When I started blogging, I was livid that it wasn't easier -- I don't want to type HTML, I don't want to link every instance of a word. Shouldn't the software automatically link every unlinked instance of a word that I link once? If I create a link for "pine", shouldn't the blog software go all Wiki on me and link all of them? Shouldn't it do that for every article I ever post? I link to my own product, Puppet, constantly -- why isn't my blog software smart enough to insert that link for me?
How does this relate to Puppet? Very clearly. In using Puppet to build your server configurations, you are adding in huge amounts of semantic content that wasn't there before. It would be embarrassing if I didn't take advantage of that extra information to make your job easier. This is exactly why I'm going to start slapping Rails on the output side of Puppet. Eventually I want Puppet itself to be smart enough to take advantage of the extra data on its own, but for now having simple interfaces that make that data available and provide you some ability to manipulate it will be a big step forward.
Thu, 27 Apr 2006 | Tags: design
Old Ideas
I have been thinking about what I can do to make Puppet a bit more obviously useful, because people don't seem to think that there is a real visceral difference between it and cfengine, and the two things that keep popping up are centralization of the output of Puppet (e.g., logs, metrics) and using the semantics of the configuration to make that output easier to manage (e.g., error log messages should know what objects produced them and what services they are affecting).
In somewhat standard fashion for me, I decided to do both at once, and I decided to start with logs since there is much less expected (note I said expected not desired) from logs than from metrics. In particular, I can get away without a database, at least to start.
I had already set up tagging in Puppet, so I figured I just needed to add those tags to the log messages, and then set up another server module to accept the logs. No real problem there.
Before I got a chance to implement this, though, I spent some time talking about it with my friend Andrew Shafer, and I realized that the logs should also store the path from the objects generating them. My objects have long had a concept of a path, but I haven't used it much and so it's not very well characterized. The basic idea is that the path would map directly to the hierarchy of the configuration -- e.g., server classes would be analogous to directories in the path -- but because I've basically not used the paths much, that hasn't really been the case.
So, I went to implement what seemed a pretty easy idea, add the path and tags to the log messages. As I went about it, I realized that some refactoring was in order, because the current methods of logging had no way to link the messages with the object creating them. When I went about that refactoring, though, I was reminded that I originally wrote the Log class to accommodate this kind of link.
So, I spend tons of time and cash learning about Web 2.0 stuff, do lots of thinking, and go about starting something that I think will be cool and new, and I find that I planned it out, oh, 3 weeks into development of Puppet.
It's horribly embarrassing to have to reinvent a great idea, but at least I'm on track to do some cool stuff, even if I should have had it working 5 months ago.
I now have this all working, and the additional benefit (although planned, months ago) is that the log messages can easily include the path to the object that generated the message, or just the object name, or none. This should make parsing of log messages much easier, and should make a move to a database very easy, since the object will be a separate field, instead of sometimes-but-not-always being in the log text itself.
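To make the "separate field" idea concrete, here is a small sketch. The LogMessage struct and its field names are assumptions for illustration and do not reflect Puppet's actual Log class; the example path is also made up.

```ruby
# Hypothetical sketch; field names are illustrative, not Puppet's Log class.
LogMessage = Struct.new(:level, :message, :source, :tags) do
  # Render with the full path, just the object name, or no source at all.
  def to_s(source_style = :path)
    prefix =
      case source_style
      when :path then source
      when :name then source.split("/").last
      end
    prefix ? "#{prefix}: #{message}" : message
  end
end

msg = LogMessage.new(:err, "failed to start",
                     "webserver/apache/Service[httpd]", ["webserver", "apache"])

with_path = msg.to_s          # "webserver/apache/Service[httpd]: failed to start"
with_name = msg.to_s(:name)   # "Service[httpd]: failed to start"
bare = msg.to_s(:none)        # "failed to start"
```

Because the source and tags are fields rather than text inside the message, each one maps directly onto a database column, and nothing has to be parsed back out of the log line.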
Thu, 27 Apr 2006 | Tags: process, puppet, design
Closures, Element Normalization, and Hell
While I have never exactly been strict about it, I've been trying to apply some of Alva Couch's closure theory to Puppet. The process has not been easy, partially because no one really seems to understand how to apply this to anything practical, but mainly because I have never been able to cleanly figure out whether the closures are demarcated in the language or the library. This has resulted in a kind of twisted dual nature where I partially did both but neither was good enough.
Instead of being able to accomplish what Alva is looking for, I seem to have devolved into a lesser goal. Rather than seeking to manage through closures, I am merely seeking element normalization. That is, rather than creating autonomous collections of elements managed through defined interfaces, I am settling for the lesser ability to create a configuration in which all configurable elements are normalized.
Closures, Opacity, and Parsing
One of the biggest influences that seeking closure had on Puppet's design was
that it forced me to realize that most configuration files cannot be managed
as opaque collections of text. Any configuration file that contains
directives relating to multiple services, such as inetd.conf's collection of
services or a cron tab's collection of unrelated cron jobs, needs to be
managed in a way that allows each of those directives to be managed through
the closure that maps to the service that needs the directive.
For instance, if you have two services in /etc/inetd.conf and each of those
services requires a cron job, if you cannot manage the four elements (two cron
jobs and two services) independently, then you have to join the services into
one shared space:
class server {
    file { "/etc/inetd.conf":
        source => "puppet://server/files/inetd.conf"
    }
    file { "/var/spool/cron/crontabs":
        source => "puppet://server/files/rootcron"
    }
}
This configuration completely hides what is inside the files, and makes no mention of the services in the files or why those files are being downloaded. Most likely, these files would be named in a way that implied what they did:
class serviceAandserviceB {
    file { "/etc/inetd.conf":
        source => "puppet://server/files/AplusBinetd.conf"
    }
    file { "/var/spool/cron/crontabs":
        source => "puppet://server/files/AplusBrootcron"
    }
}
It's immediately obvious that this is not scalable. You need a new file for every unique combination of services, and you are also trusting people to name every file in a way that lets the humans maintaining the configurations understand what is going on.
I figured the correct design was to enable these elements to be managed
according to need, not according to whether they happen to reside together in
a file. Puppet is designed to make it easy to parse files like crontab and
inetd.conf and turn their contents into separately manageable elements:
class serviceA {
    cron { serviceA: hour => [0, 11], user => root, command => "..." }
    inetdsvc { serviceA: user => root, binary => "..." }
}
class serviceB {
    cron { serviceB: minute => [0, 29], user => root, command => "..." }
    inetdsvc { serviceB: user => root, binary => "..." }
}
Now you can easily separate out the services, and apply them à la carte to any machines that need them. Additionally, you are not hiding the configuration detail in opaque files managed far from the server definitions.
I figured this separation would provide me the flexibility and power to create simple closures that I could then use to manage large networks with ease. The downside of this design choice, of course, is that Puppet needs to support every little discrete type of element on every system, which is a real bear given how little consistency there is even across different Linux distributions, much less across different Unix operating systems or entirely unrelated systems.
The truth is, though, that you want this separate management anyway -- this level of separation will allow you to manage entire networks entirely through object-like interfaces, instead of collecting strings together into a file or creating unique files for every unique combination of services on your network. I am quite excited at the progress, and even though things have not progressed as smoothly as I could have hoped, I'm still convinced it was the right move. It also happens to be a design that I don't see any other configuration management tools following, so it nicely sets Puppet apart.
From Closures to Normalization
In order to try to collect these separate elements into closures, I came up with components, server classes, and nodes as an attempt to provide closure-like structures. You would have independent collections of objects that would be configured through passed-in parameters.
The problem was that I couldn't easily map configurations to this style of thinking, and as I modified the structures more to provide the ability to configure networks like I wanted, the connection between these structures and Alva's purer concept of closurehood seemed to lessen. Now, I find that thinking of them as closures does not provide me with much direction, and I instead get my direction from thinking about normalizing elements.
Normalization is normally (heh) used in the database world to talk about reducing the number of times a given object is repeated in a database, thus decreasing storage requirements and maintenance costs. Because we're building hierarchical configurations rather than collections of tables, the term cannot be applied in quite the same way. When I say that my goal is to normalize the configurable elements throughout my configuration, I mean that I want to develop my configuration so that any given configurable element (e.g., a file, service, package, or cron job) is only referred to once in the entire configuration.
The reason I like this goal is that it is a kind of softened combination of abstraction and seeking closure. It encourages you to write abstract configurations, pulling common elements into abstract classes which then get applied as necessary, and it provides closure-like behaviour because if a configuration is entirely normalized then any given element is only being managed through its enclosing classes.
This immediately brings up sticky issues like what constitutes a separate element. For instance, are Apache versions 1 and 2 separate elements or the same element? In the short term, this kind of issue will be handled separately at each Puppet installation, but over time there should be enough configuration sharing that some kind of de facto standard can be achieved.
Enforcing Normalization
Is normalization such a high goal that it is worthy of enforcement? That is, should Puppet allow users to create non-normalized configurations? If they are allowed, then some mechanism must be developed for resolving which specification should be applied. Additionally, any hope of moving towards supporting real closures pretty much gets tossed right out the window. So, Puppet was (poorly) designed from the beginning to enforce normalized configurations -- if you attempt to create a Puppet configuration that tries to manage the same object from different classes, you will encounter an error.
Well, theoretically. The implementation is a lot stickier than that. Here's where we delve into the practical aspects, though, and out of theory, and since this post is long enough as it is, I'll finish this and continue in another.
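The enforcement rule described above (an error when two classes try to manage the same object) can be sketched in a few lines of Ruby. This is only an illustration under my own assumptions, not Puppet's actual implementation: the Element struct, DuplicateElementError, and check_normalized are all hypothetical names.

```ruby
# Hypothetical sketch: none of these names come from Puppet itself.
Element = Struct.new(:type, :title, :declared_in)

class DuplicateElementError < StandardError; end

# Returns the elements unchanged if every (type, title) pair is declared
# exactly once; raises DuplicateElementError on the first repeat.
def check_normalized(elements)
  seen = {}
  elements.each do |element|
    key = [element.type, element.title]
    if seen.key?(key)
      raise DuplicateElementError,
            "#{element.type}[#{element.title}] is declared in both " \
            "#{seen[key]} and #{element.declared_in}"
    end
    seen[key] = element.declared_in
  end
  elements
end

elements = [
  Element.new("file", "/etc/passwd", "unix"),
  Element.new("service", "httpd", "webserver")
]
check_normalized(elements) # passes: each element is declared once
```

Adding a second file element titled /etc/passwd from another class would make check_normalized raise, which is the behaviour a strictly normalized configuration calls for.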
Wed, 26 Apr 2006 | Tags: puppet, closures.lisa, alvacouch, design
Redesigning the Parser
It's amazing how translated intent tends to take on a life of its own. I apologize for how scattered this post is, but it's being written on a plane while the thoughts are developing. I'd try harder to make it clear, but the whole reason for the existence of this blog is for working through ideas like this.
Configuration Context
When I initially designed Puppet, one of my goals was to retain the tree structure created by the classes and definitions in the original manifest. In looking back, the only reason that I can find for that goal is to provide meaningful feedback to the user -- "object A in class Y had a problem" or some such.
I think that reason is important, but I did not do a good job of stating the problem, which meant my solution would tend to be just as muddled. In this case, the real problem is how to provide contextual information to the user, and the solution I chose was to retain context throughout the entire process.
This solution seems to have required some really bad design, though. In particular, if I wanted to retain the class structure as originally designed, objects would have to remain in the scope that defined them. Take the following class structure:
class unix {
    file { "/etc/passwd":
        owner => root,
        group => root
    }
}
class webserver {
    service { httpd:
        ensure => running
    }
}
include unix
include webserver
This would result in a scope tree with three individual scopes -- the top-level scope and one for each class -- and the two lower-level scopes would each contain one object.
This keeps the structure in a way that satisfies my initial goal. Stupidly, though, it complicates the language considerably.
Closurehood
One of the primary design goals of Puppet is that it only allow a given element to be managed in one part of a given host's configuration. In other words, two classes can't both try to manage a given file or a given package. This is heavily based on Alva Couch's Closures paper, but it's also based on a lot of work I've done, and I think it's an important part of making it easy to build consistent, maintainable configurations.
However, it's just not possible to reduce all elements to one syntactical statement, because some elements vary so dramatically and in such unforeseen ways that their configuration might be composed from many statements, rather than just one. So, the Puppet language needs a way to distinguish between intentionally modifying the configuration of an existing element and two portions of a configuration trying to manage the same element.
My initial solution to this involved scoping -- if a specified element already existed in a parent scope, then overriding would be allowed, else it would be considered an error. However, after creating this system, I realized it worked fine for variables (although it's still just as confusing) but it didn't work at all for elements. The following code is very hackish:
class unix {
    file { "/etc/passwd":
        owner => root,
        group => root
    }
    include bsd
}
class bsd {
    file { "/etc/passwd":
        group => wheel
    }
}
The include puts the bsd class in a scope below the unix class, but
it's not maintainable -- the bsd class should be specifying a relationship
to the unix class, not the other way around.
So, this led me to create class inheritance, so you could redo the code this way:
class unix {
    file { "/etc/passwd":
        owner => root,
        group => root
    }
}
class bsd inherits unix {
    file { "/etc/passwd":
        group => wheel
    }
}
I had to hack inheritance so that the scope of the parent class was created as the parent scope of the child class, but after that, any elements in the child class naturally override the same elements specified in the parent class.
The problem is that I didn't realize that this is the extent of overriding that Puppet should support. I created inheritance something like 4 months after I'd put the rest of the system in place, so I just added it, without reanalyzing the whole system. This parent-child relationship is the only one that should support overriding, and scopes are merely a hack to support it. In truth, scopes should have nothing to do with overriding, and it should be based entirely on how the two managing containers are related -- if one container is a subclass of the other container, then overriding should be allowed, and not otherwise.
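To make that rule concrete, here is a minimal Ruby sketch of deciding between an override and a conflict purely from the inheritance relationship. It is an assumption-laden illustration rather than Puppet code: the PARENTS table and the subclass_of? and winner helpers are hypothetical.

```ruby
# Hypothetical sketch; PARENTS and the helper names are illustrative only.
# Maps each class name to its parent class (nil when it has none).
PARENTS = { "unix" => nil, "bsd" => "unix", "solaris" => nil, "ssh" => nil }

# True when `child` inherits, directly or indirectly, from `ancestor`.
def subclass_of?(child, ancestor)
  current = PARENTS[child]
  until current.nil?
    return true if current == ancestor
    current = PARENTS[current]
  end
  false
end

# Given two classes that declare the same element, return the class whose
# declaration wins, or nil when the classes are unrelated (a conflict).
def winner(class_a, class_b)
  return class_a if subclass_of?(class_a, class_b)
  return class_b if subclass_of?(class_b, class_a)
  nil
end
```

With this table, winner("unix", "bsd") picks the subclass bsd regardless of argument order, while winner("solaris", "ssh") returns nil because neither class inherits from the other. No scope lookup is involved at all.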
As it is, stupid code like this overrides instead of conflicting:
class solaris {
    file { "/etc/ssh/sshd_config":
        owner => bin,
        group => bin
    }
}
class ssh {
    file { "/etc/ssh/sshd_config":
        owner => root,
        group => root
    }
}
file { "/etc/ssh/sshd_config":
    source => "/nfs/apps/ssh/sshd_config"
}
include solaris, ssh
It overrides because the top scope mentions sshd_config, so each of the
child scopes overrides that parent object. This is obviously dumb -- each of
these scopes has unrelated intent, they just happen to mention the same object
(the conflict is clear in such a small snippet, but gets confusing quickly in
a larger configuration).
Overriding Definitions
A further complication is that the current mechanism does not even allow overriding definitions:
define remotefile(source) {
    file { $name:
        source => "/nfs/apps/config/$source"
    }
}
class base {
    remotefile { "/etc/ssh/sshd_config":
        source => "ssh/sshd_config.base"
    }
}
class sub {
    remotefile { "/etc/ssh/sshd_config":
        source => "ssh/sshd_config.sub"
    }
}
This won't work, because the definition gets evaluated, instead of getting stored in the current scope, so it's not available for overriding. This is a huge oversight, because it severely limits the ability to build truly abstract configurations.
Rethinking the Language
So, I set out to come up with a solution to this problem. How do I keep the features that I want but get rid of the confusing scope-based overrides and also support definition overrides?
At first I thought about adding on -- what additional code do I have to write to make this work? I came up with a few workable solutions, but they were generally pretty hackish and resulted in an even more complicated system internally. Finally I began rethinking the whole thing, taking it down to the basic questions -- what do I really need, and what's the best way to accomplish it?
I began to rethink the way that elements are stored in the given scope. It's critical to how overrides work right now (scopes are searched for a given element name and type, and if found are overridden), but it also makes the "right" way to do things far more complicated. This got me thinking about the fact that element statements are currently strange, linguistically, in that they don't have values -- that is, they result in a modification to the scope tree, but they don't return a value. Assigning an element to a variable isn't even syntactically legal in Puppet because it would only make sense if elements returned a value, and it doesn't make sense for elements to return a value and be stored in the scope tree, so I just store them in the scope tree.
But what if elements did return a value? That is, what if an element statement's value was the element itself? That would require that classes and definitions and nodes all return, as their values, the collected list of elements, which means that I lose the class structure that I've been using to retain configuration context.
Except that, really, I don't need it any more -- the context is already being provided by two attributes, tags (which are essentially the names of every class, node, and definition containing the element) and the path (which is a string that maps directly to the class path to the element). I can't come up with a use for configuration context that these two features don't provide, so it seems that this entire structure is unnecessary.
Collapsing the scope tree using values (element statements return the specified element as a value, and enclosing structures return the list of specified elements as values) immediately provides inroads into a much, much, much better language. For one, you just throw away all scope-based object overrides -- you need to provide a special system for subclasses overriding parent classes, but that should be relatively clear and very specific. For another, it makes the collection process recently discussed on the list (where one host can collect specified elements from other hosts) much more straightforward -- you need only provide a syntax for specifying an element that stores it centrally rather than returning itself as a value, and then a syntax for retrieving centrally specified elements. It's a much more consistent feature of the language when values are used.
The Results
The biggest change to the language is that evaluation of the parse tree should return a flat list of elements. With this change, the parser will have to treat inheritance specially, in that overriding elements within a class tree must still be supported, but this should be relatively straightforward, at least compared to how overriding is done now. Context will be retained by adding tags and paths to elements as they are created.
This makes the 'component' subclass of Puppet::Type entirely unnecessary -- its only purpose before was to provide a container that mapped to the class tree. It should also make the whole process much clearer, from parse tree evaluation to configuration application.
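As a rough illustration of evaluation returning a flat list, the Ruby sketch below walks a toy parse tree and returns every element tagged with its containing classes and a path string. The tree layout, the FlatElement struct, and the path format are all assumptions made for the example; they are not Puppet's real data structures.

```ruby
# Hypothetical sketch; the node layout and all names are assumptions.
FlatElement = Struct.new(:type, :title, :tags, :path)

TREE = {
  kind: :top, name: "top", children: [
    { kind: :class, name: "unix", children: [
      { kind: :element, type: "file", title: "/etc/passwd" }
    ] },
    { kind: :class, name: "webserver", children: [
      { kind: :element, type: "service", title: "httpd" }
    ] }
  ]
}

# An element evaluates to itself (as a one-item list); a container
# evaluates to the concatenated lists of its children. The names of the
# enclosing containers travel along as tags and as a path string.
def evaluate(node, containers = [])
  if node[:kind] == :element
    path = (containers + ["#{node[:type]}[#{node[:title]}]"]).join("/")
    return [FlatElement.new(node[:type], node[:title], containers.dup, path)]
  end
  inner = node[:kind] == :top ? containers : containers + [node[:name]]
  node[:children].flat_map { |child| evaluate(child, inner) }
end

flat = evaluate(TREE)
flat.each { |e| puts "#{e.path} tags=#{e.tags.join(',')}" }
```

The scope tree disappears from the result, but each element still carries enough context to report which class it came from.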
Wrinkles
The only problem I can't yet think of how to solve is that of component dependencies:
define remotefile(source) {
    file { $name: source => "/nfs/config/$source" }
}
remotefile { "/etc/ssh/sshd_config":
    source => "ssh/sshd_config"
}
service { sshd:
    ensure => running,
    require => remotefile["/etc/ssh/sshd_config"]
}
This type of requirement would no longer be allowed, because the 'remotefile' object would not even exist on the client, so there would be no way to look it up using the current mechanisms.
This is a somewhat contrived example, because the service should probably actually require 'file["/etc/ssh/sshd_config"]', but there will certainly be cases where the elements in a given component will need to be applied before other elements can be, so the requirements subsystem needs to be able to take component and class membership into account for both sorting and notification (e.g., "if anything in component A changes notify service X").
At least the sorting aspects are relatively easily solvable -- sorting is currently done in Puppet::Type::Component, and it is being deprecated, so sorting will need to be redone anyway; I just need to redo it in a way that accounts for either direct dependency or container-level dependency.
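One way the sorting could account for container-level dependencies is to expand each container reference into its member elements before doing an ordinary topological sort. The Ruby sketch below shows that idea only; the MEMBERS and REQUIRES tables and the helper names are hypothetical, and nothing here reflects how Puppet actually sorts.

```ruby
# Hypothetical data: the elements each container expanded into.
MEMBERS = {
  "remotefile[/etc/ssh/sshd_config]" => ["file[/etc/ssh/sshd_config]"]
}

# Requirements as written in the manifest: each element lists what it
# requires, which may be another element or a container.
REQUIRES = {
  "service[sshd]" => ["remotefile[/etc/ssh/sshd_config]"],
  "file[/etc/ssh/sshd_config]" => []
}

# Replace a container reference with its member elements; a plain
# element reference is returned as-is.
def expand(ref)
  MEMBERS.fetch(ref, [ref])
end

# Depth-first topological sort: dependencies come before dependents.
def apply_order
  order = []
  state = {}
  visit = lambda do |name|
    return if state[name] == :done
    raise "dependency cycle at #{name}" if state[name] == :visiting
    state[name] = :visiting
    deps = REQUIRES.fetch(name, []).flat_map { |ref| expand(ref) }
    deps.each { |dep| visit.call(dep) }
    state[name] = :done
    order << name
  end
  REQUIRES.each_key { |name| visit.call(name) }
  order
end

puts apply_order
```

Here the service's requirement on the remotefile component becomes a requirement on the file that the component produced, so the file is ordered before the service even though the component itself no longer exists.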
Notification is already complicated, but it's also cleanly in one area of the code. In fact, $10 says it already doesn't behave exactly as I'd like, so it probably would just continue to be broken in the same way.
Generally, there would be two generic ways to solve these problems -- either make the dependency system able to take advantage of the remaining contextual information (tags or paths), or move dependencies into the language.
Hmmm.
I'll continue thinking about it.
Sat, 22 Apr 2006 | Tags: design
Error Handling
I need to implement error handling within Puppet. At this point, errors are largely handled internally and always the same way, which is probably not what most people want. Even if it is okay to make everything consistent, it is probably a good idea to rethink and revamp error handling as it exists now.
In particular, I don't have a clear idea of how to deal with part of a system failing. If a single state fails, either at initialization or run time, should the whole object fail? If a single object in a dependency tree fails, what should the rest of the tree do?
I've been thinking about these questions for a while, and I don't really have a clear answer. It seems like errors are generally an indication of a misconfiguration, so it would make sense to fail the whole object and require a reconfiguration, but I'm not sure that's really the best option.
Fri, 21 Apr 2006 | Tags: puppet, errors, design