Puppet: System Administration Automated

Reification in Puppet


It's apparently Ruby day in Puppetton (as in, Puppet + Hobbiton, or something like that). I've been trying to add more Ruby blogs to my blogroll, and I guess the effort is starting to pay off.

Piers Cawley has a post about reification:

"However, in OO circles (or maybe just in my head), reification is a good thing. It's the process of taking something abstract and turning it into a real object. Usually, the word gets used for big things like turning an intractable method into an object as a step on the way to refactoring that method. I tend to use it in a slightly broader sense. For me, reification is the process of turning something (a method or a data structure usually) into a full blown object with its own behaviour."

This is something I've only recently started thinking about as a concrete process. I learned to program (according to most definitions) in the Perl world, and I relied ridiculously heavily on hashes as my mechanism for managing all of my data. Even when you go OO in Perl, you're still relying on hashes, so there isn't as much impetus to reify as you might find in Ruby.

That reliance on hashes has stuck with me over the years, and I'm only just starting to get over it. A heckuva lot of parts of Puppet have been modeled as hashes long past their useful lifetimes, and even when I correctly use a separate class to model something, I'll often stupidly avoid differentiating two uses of a class until it's so painful I can't deny it any longer.

Probably the biggest example of this is that I did not have a parser-specific resource class until relatively recently (sometime in 2007, anyway, I think). Before it existed, I was using the same Hash-like class I use for transferring configurations over the wire. Hey, it existed, it stored parameters, yay. The problem is that, beyond storing parameters, the two classes share almost no functionality. Parser resources have tons of special behaviour -- they get to determine whether a class can override a given parameter, they have to add default values, they can be exported or virtual, and they can model types that are defined in the parser, as opposed to only types built into the RAL.
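To make the contrast concrete, here is a minimal Ruby sketch of a bare parameter bag next to a parser-side resource. Every name in it (TransferResource, ParserResource, add_defaults, override, to_transfer) is hypothetical and chosen for illustration; none of it is Puppet's actual code, and the override rule (only the declaring class or a class that lists it as an ancestor may override) is a simplifying assumption.

```ruby
# Hypothetical names throughout; these are not Puppet's actual classes.

# A bare parameter bag, roughly what a Hash-like transfer object offers.
class TransferResource
  attr_reader :type, :title, :params

  def initialize(type, title, params = {})
    @type = type
    @title = title
    @params = params.dup
  end
end

# A parser-side resource: the same data, plus behaviour only the parser needs.
class ParserResource
  attr_reader :type, :title, :source
  attr_accessor :virtual, :exported

  def initialize(type, title, source:, params: {})
    @type = type
    @title = title
    @source = source      # name of the class that declared the resource
    @params = params.dup
    @virtual = false
    @exported = false
  end

  def [](name)
    @params[name]
  end

  # Fill in defaults without clobbering explicitly set values.
  def add_defaults(defaults)
    defaults.each { |name, value| @params[name] = value unless @params.key?(name) }
    self
  end

  # Assumed rule: only the declaring class, or a class that lists it
  # among its ancestors, may override a parameter.
  def override(name, value, by:, ancestors: [])
    unless by == @source || ancestors.include?(@source)
      raise ArgumentError, "#{by} may not override #{name} on #{type}[#{title}]"
    end
    @params[name] = value
  end

  # Drop the parser-only state before sending the resource over the wire.
  def to_transfer
    TransferResource.new(type, title, @params)
  end
end
```

The point of the sketch is that the defaulting and override logic has an obvious home once the parser resource is its own class, while the transfer object stays a dumb container.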

When I finally couldn't take the pain any more, I created a simple parser resource class. Then, before I knew it, I had moved tons of code into that class, and all of the parser structural classes -- mainly the parser, the interpreter, and the scope classes, at the time -- became much simpler.

I'm actually going through all of this again right now. As insane as it sounds, until two weeks ago, Puppet did not have a Configuration class to model a specific configuration. Well, really, a configuration is a collection of resources, and it's the resources that matter, right? Except there's so much more than that -- the resources have relationships to each other, and those relationships really matter. More importantly, there's a lot of code related to these configurations. When I created a separate Configuration class, my Interpreter class went from 708 lines to 105. That means that roughly 6/7 of my Interpreter class was actually Configuration code, but because I hadn't reified my configuration class, I didn't realize it.
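As a rough illustration of why a configuration is more than a list of resources, here is a small Ruby sketch in which the relationships live on the configuration object itself. The class and method names (Configuration, add_resource, add_dependency, ordered) are assumptions made for this example, not Puppet's real API, and the ordering is a plain depth-first topological sort rather than whatever Puppet actually does.

```ruby
# Hypothetical sketch; not Puppet's actual Configuration class.
class Configuration
  def initialize
    @resources = {}                          # ref => parameter hash
    @edges = Hash.new { |h, k| h[k] = [] }   # ref => refs it depends on
  end

  def add_resource(ref, params = {})
    raise ArgumentError, "duplicate resource #{ref}" if @resources.key?(ref)
    @resources[ref] = params
    self
  end

  def add_dependency(ref, on:)
    [ref, on].each do |r|
      raise ArgumentError, "unknown resource #{r}" unless @resources.key?(r)
    end
    @edges[ref] << on
    self
  end

  # Resource refs in an order where each one follows its dependencies.
  def ordered
    seen = {}
    result = []
    @resources.each_key { |ref| visit(ref, seen, result) }
    result
  end

  private

  # Depth-first walk; the :visiting marker detects dependency cycles.
  def visit(ref, seen, result)
    return if seen[ref] == :done
    raise "dependency cycle at #{ref}" if seen[ref] == :visiting

    seen[ref] = :visiting
    @edges[ref].each { |dep| visit(dep, seen, result) }
    seen[ref] = :done
    result << ref
  end
end

config = Configuration.new
config.add_resource("Package[openssh]")
config.add_resource("Service[sshd]")
config.add_dependency("Service[sshd]", on: "Package[openssh]")
config.ordered   # the package ref comes before the service ref
```

Once the graph walk is a private method on the configuration, nothing outside the class needs to know how relationships are stored.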

It's actually stupider than that. When I first created a Configuration class, it only modeled the configuration as it was being compiled; I still returned something that wasn't a stand-alone object that could have code attached. When I finally realized this, I started really punching myself in the head. I now have a Compile class that manages interaction between the parser and the configuration, and I have a Configuration class that is the result of a compile. It really is a world of difference.

The way that I tend to think about this process is that I'm converting "else" code into "me" code. That is, the Interpreter previously had a ton of code that interacted with something that wasn't itself, and now essentially all of that code has been moved into the Compile and Configuration classes, and all of that code can pretty much be used as private methods, meaning it's all "me" code now -- it's the classes talking about themselves, instead of a class talking about something "else".

This provides what might be a good way to notice whether it's time to reify -- if you have a lot of code that calls methods on an explicit receiver rather than on self, then you're mostly interacting with something else. There's a very good chance that "else" should be its own class, and all that code should become methods on that new class.
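Here is a small before-and-after sketch of that "else" versus "me" distinction in Ruby. The classes and methods (Interpreter#names_tagged, TaggedConfiguration, tagged?) are invented for the example and are not taken from Puppet. In the first version every line reaches into a hash the class does not own and has to defend against bad input; in the second the same logic talks about its own state and the helper can be private.

```ruby
# "Else" code: the interpreter pokes at a plain hash it does not own,
# so it has to validate the argument and guard against missing keys.
class Interpreter
  def names_tagged(config, tag)
    raise ArgumentError, "config must be a Hash" unless config.is_a?(Hash)
    resources = config[:resources] || []
    resources.select { |r| (r[:tags] || []).include?(tag) }.map { |r| r[:name] }
  end
end

# "Me" code: the same logic moved onto the object it describes.
# The object controls its own state, so the defensive checks disappear.
class TaggedConfiguration
  def initialize
    @resources = []
  end

  def add(name, tags: [])
    @resources << { name: name, tags: tags }
    self
  end

  def names_tagged(tag)
    @resources.select { |r| tagged?(r, tag) }.map { |r| r[:name] }
  end

  private

  # Nobody outside the class needs this, so it stays private.
  def tagged?(resource, tag)
    resource[:tags].include?(tag)
  end
end
```

Both versions answer the same question; the difference is where the knowledge about the data's shape lives.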

This is something I'm still working on in Puppet. I've already created a bunch of new classes for the next release (Compile, Configuration, Node, Facts, Checksum) and I'll be shocked if I don't end up with more. The key to these classes is that I'm not adding code -- I'm taking code that already exists but talks about something "else" and making it talk about itself. In the majority of cases, this results in a significant reduction in line count, because I lose all the reference management and argument validation that has to go on when a class has to assume some hostile external force is interacting with it. I even get to make a lot of that code private, because no one needs to care about it any more.


Tue, 11 Sep 2007