Puppet: System Administration Automated

Content-Type Negotiation and REST


I'm working on a few enhancements for the development branch in Puppet, and in particular trying to add support for negotiating content-type. Strangely, I can't find much evidence of people talking about doing this. Or rather, I can find lots of people talking about it, but not many real implementations.

Most people, including Rails, seem to take the easy way out of just using file extensions. I don't want to do this in Puppet because I think file extensions are essentially evil, since you're embedding metadata into your file name, but also, it doesn't work because we need to be able to transfer files directly -- if the file ends in .pdf, we want to be able to transfer it even though we don't necessarily know anything about PDFs.

So, I want to use the Content-Type header (and apparently the Accept header, for clients). I was hoping to just be able to crib from someone else's work, but I guess I can't.

Most of it's going to be pretty easy -- do a bit of metaprogramming to convert the type to methods (e.g., to_xml and from_xml). However, we need to be able to list supported types, so I've been trying to decide how to provide a list of supported types. I guess the easiest way is to continue with the metaprogramming and just convert back from those method names to a type name. E.g., if your class has to_xml, from_xml, to_json, and from_json methods, then it supports the json and xml formats.
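Here's a minimal sketch of that idea, assuming a hypothetical FormatSupport mixin and Report class (neither is Puppet code). A format only counts as supported when the class has both the to_ instance method and the from_ class method:

```ruby
# Hypothetical mixin: derives the supported formats from method names.
module FormatSupport
  # A format is supported only when the class can both render it
  # (instance method to_<format>) and parse it (class method from_<format>).
  def supported_formats
    renderers = instance_methods.map { |m| m.to_s[/\Ato_(\w+)\z/, 1] }.compact
    parsers   = singleton_methods.map { |m| m.to_s[/\Afrom_(\w+)\z/, 1] }.compact
    (renderers & parsers).sort
  end
end

# Hypothetical class used only to exercise the mixin.
class Report
  extend FormatSupport
  attr_reader :name

  def initialize(name)
    @name = name
  end

  def to_json
    %({"name":"#{name}"})
  end

  def self.from_json(text)
    new(text[/"name":"([^"]*)"/, 1])
  end

  def to_xml
    %(<report name="#{name}"/>)
  end

  def self.from_xml(text)
    new(text[/name="([^"]*)"/, 1])
  end
end

puts Report.supported_formats.inspect
```

Inherited methods like to_s get picked up by the first scan, but they drop out again because there's no matching from_s class method.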

Also, I should probably be using the word representation instead of format, but, well, that's a lot of typing. :)

Mon, 21 Jul 2008


I can't test like I'm supposed to


So, I'm toward the end of converting my certificate handling in Puppet to using my nifty new indirection features, which will handle both reading and writing certs to disk and to remote servers (for signing). Yee-haw. The only bit left (I hope) is the certificate authority itself.

In the course of this work, I've created an 'SSL::Host' class that is basically a composite of a key, a certificate request, and a certificate, so it makes sense to treat a CA as a special case of that: Its various files are stored in different places, and it can sign certificates, but it's otherwise the same.

So, I need this class to read and write to different locations (easy, although not yet tested). I also want the CA to initialize itself completely upon creation -- that is, when I call CertificateAuthority.new, I want it to create and write to disk all of the files it needs. I think this is reasonable: I don't want to require the caller to call a setup method of some kind, and I don't want to do the setup late-binding either, since initialization failures would then show up during a call to sign, which is stupid.

Okay. Seems easy. Here's the kicker: I want to test that the CA always chooses its name as the value of the certname Puppet setting. It's pretty easy to write that basic test:

it "should always set its name to the value of :certname" do
    Puppet.settings.expects(:value).with(:certname).returns("whatever")
    Puppet::SSL::CertificateAuthority.new.name.should == "whatever"
end

Except... all that initialization stuff is happening (by design). So now I'm in the position of using this kind of hack:

Puppet::SSL::CertificateAuthority.any_instance.stubs(:setup_ca)

Which is mostly a hack just because that method is private -- you never need to call it, yet here we need to know it exists just for testing.
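The same trade-off shows up in plain Ruby without any mocking library. This is only a sketch: SketchCA and QuietCA are hypothetical stand-ins, not the real Puppet classes, and the settings hash is an assumption made to keep the example self-contained:

```ruby
# Hypothetical stand-in for the real CA class; names are illustrative only.
class SketchCA
  attr_reader :name

  def initialize(settings)
    @name = settings.fetch(:certname)
    setup_ca # by design, everything is created at construction time
  end

  private

  # In a real class this would generate keys and write files to disk.
  def setup_ca
    raise "would touch the filesystem"
  end
end

# Test-only subclass: it has to know that the private setup_ca method exists,
# which is exactly the knowledge the stub above also relies on.
class QuietCA < SketchCA
  private

  def setup_ca; end
end

ca = QuietCA.new(certname: "whatever")
puts ca.name
```

Either way, the test ends up coupled to a private method name just to keep the constructor from doing its real work.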

The only other option is literally about 15 lines of stubs, since the CA uses a bunch of other settings, which now all need to be stubbed because of my initial expectation, plus needing to stub out any file reading or writing.

I keep thinking that I'm just crazy, and that at some point I'll see the light and be able to write single-expectation tests with no setup code like Jay Fields recommends, but in reality, I think Jay is crazy. I can't fight that feeling, and I want to less and less.

Wed, 12 Mar 2008


A first pass at DTrace


I've never really spent much time optimizing Puppet except in those areas that get particular complaints (and not always then), but now that I'm forced to run Leopard I figured I should see if I can put DTrace to use.

The first pass used the functime.d script, which tells me how long Puppet spends in each function. I couldn't get the file to execute directly, and I also couldn't get it to execute my script for me (which is a pretty good indication that I don't really know how to use DTrace), so I added the ability to pause my test script, giving me time to start dtrace. So, I run my test script, which I'm using to test parse time:

~/puppet/ext/puppet-test --modulepath /Users/luke/Desktop/puppet-stanford/modules/ -s parser -t parse --manifest ~/Desktop/puppet-stanford/master/manifests/site.pp -p

Then I run the dtrace script:

sudo dtrace -s ./functime.d -p 45847 2>&1 | tee functimes.log

This takes a heckuva long time to run (380 seconds or so, vs. about 6 normally), but in the end I get a big file that has histograms for all of the classes and methods, along with a sorted list of how long Puppet spends in each method. E.g., here's a histogram:

Puppet::Parser::Parser                              parse
         value  ------------- Distribution ------------- count
       8388608 |                                         0
      16777216 |                                         1
      33554432 |                                         0
      67108864 |                                         1
     134217728 |@@@@@@@                                  30
     268435456 |@@@@@@@@@                                39
     536870912 |@@@@@@@@@@@                              49
    1073741824 |@@@@@                                    21
    2147483648 |@@@                                      14
    4294967296 |@@                                       10
    8589934592 |@                                        6
   17179869184 |                                         2
   34359738368 |                                         0

And here are a few of the methods (the columns are class, method, call count, average time per call, and total time):

Puppet::Parser::Parser                   ast                      25881       1090     28219110
NilClass                                 nil?                    1982238         19     38713614
StringScanner                            check                   2044008         24     50380323
Hash                                     each                     84949       2789    236945697
Puppet::Parser::Parser                   import                       9   29288385    263595467
Puppet::Parser::Parser                   _reduce_132                  9   29289048    263601440
Object                                   catch                    56018       5403    302711262
Puppet::Parser::Lexer                    scan                       173    1752645    303207689
Racc::Parser                             _racc_yyparse_c            173    1752730    303222439
Object                                   __send__                   173    1752803    303234951
Racc::Parser                             yyparse                    173    1753138    303292971
Puppet::Parser::Parser                   parse                      173    1753536    303361785
Array                                    collect                    331     925551    306357434
Array                                    each                     26303      11912    313340970

The first annoying thing to notice about this is that the script is clearly collecting the total elapsed time between method entry and exit (which includes time spent in any methods it calls), not the time spent in the method itself, which makes it a bit less useful for testing.

The next thing to notice is that we're calling nil? and check a ton of times, which adds up even though they're individually very cheap.

If we add up all of the calls to check and nil?, we get a bit less than a third of the total run time of the parse method (which is the entry point to all of this code), which means they're having a big impact.
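As a rough check, here's that sum worked out from the listing above. It assumes the last column is total elapsed time and that all three rows use the same unit:

```ruby
# Total-time figures copied from the last column of the method listing.
totals = {
  "NilClass#nil?"                => 38_713_614,
  "StringScanner#check"          => 50_380_323,
  "Puppet::Parser::Parser#parse" => 303_361_785
}

cheap_calls = totals["NilClass#nil?"] + totals["StringScanner#check"]
share = cheap_calls.to_f / totals["Puppet::Parser::Parser#parse"]

# Report the combined total and its share of the parse entry point.
puts format("nil? + check: %d (%.1f%% of parse)", cheap_calls, share * 100)
```

Since the times are measured from entry to exit, this is only a ballpark figure.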

This really isn't anything I couldn't get from normal Ruby profiling, but based on my experience working with Brendan a bit at OSCON last year, I know there's much more available.

My next post on DTrace will hopefully include me covering how I used it to drill down a bit further.

Mon, 28 Jan 2008


Ten Challenges by Yegge


In following Steve Yegge's comments on code size, I noticed an old post of his about ten challenges, which is really about books:

These are books that are important to me. Not in the Lewis Carroll or Herman Melville sense; they're not cherished fictional works, or even fictional works that are just thick enough to prop up the couch. For the most part they're technical books. But each of them is a book that I return to regularly as I try to figure out well, how stuff "works".

There are more than ten, obviously, but I decided to cap this list at ten books just to have a shot at finishing this essay before the end of the year.

I've read his first book, GEB, (all the way through, really!) and it's basically the most important book in my life. It helped pull me out of a post-college intellectual funk, and as Yegge says, it's hugely informative about intelligence. It provides great insight on recursion, self-reference, and much more. In fact, I did a talk (note that's a 16MB download) at RubyConf this year that was basically things I've learned from trying to apply this book to programming.

His other books are all directly programming books, which is slightly disappointing, but I've added them all to my Amazon Wish List since I've been recently feeling a bit like I've let my book learning fall away and I need to get back into it. If you're a Puppet user who's been looking for a way to show your appreciation, buying one of these books would be a great way to do so, hint hint.

Yegge's also got a Ten Great Books post, which ends up looking pretty similar but with different actual books.

Mon, 24 Dec 2007


Ohloh is sweet


The more I play with Ohloh, the more I like it. I've added Facter and ldapsh to the site (Puppet was already there), and I've played a bit with how to record contributions and how code is tracked. I have to say, I've been waiting for a site like this for ages. It's very developer-centric, in that it's not necessarily something that a user would want to use to find and download projects, but I'm perfectly comfortable with that. Frankly, though, I think it could quickly overtake sites like Freshmeat and SourceForge, at least partially because they don't seem to have changed in years, even though there are plenty of opportunities for them to get better.

I love that I can get a great idea of what's actually happening in a project, rather than some mystical 'activity' metric, and people can claim their work across multiple projects. I think a site like this could really encourage open source development because people will be able to point to a central metric of what they're doing across the different projects they're on.

You can check my stats on the site, although anyone who follows my work won't exactly be surprised by where my time goes. You can also compare Puppet, Bcfg2, and Cfengine, which I think is interesting (it's especially interesting to check the number of contributors to Cfengine).

Fri, 21 Dec 2007


New Development and Release Practices


(Posted to the list earlier, duplicated here for easy linking.)

As mentioned, I'm making some changes to how I manage development around releases, since 0.23.2 still had too many bugs even though it took me four months to get the release out. This is obviously unacceptable, regardless of how good my excuses are. :)

The major change is that I'm always going to maintain a stable branch, named after the current stable minor release. For example, I created a new branch today called '0.24.x' (yes, with a literal 'x'). We'll do all bug fixes against that branch and all development against the master branch, with frequent merges from the stable branch to the master branch, but never the other direction.

This will guarantee that we'll always have a stable branch that's ready to release, and it will mean that each stable branch will only have bug fixes in it with no new development. It will also likely mean an acceleration in new minor release versions, since we won't be doing new development in that branch (I've traditionally done new small features in the minor-minor releases sometimes).

I'm also hoping to find someone willing to maintain this stable branch, including deciding when to make a new release, so that there's a clear separation in duties between maintaining development and maintaining stable. I've got one potential volunteer for this, but if you're interested, please let me know.

Related to this change, I'm planning to stick to doing any significant development in feature branches which will only be merged with master upon completion -- this was prohibitively difficult with SVN, since merging is so painful with it, but with git it's easy to keep your feature branches up to date with the master branch. I'm not entirely sure how this is going to work yet, whether I'll be publicly publishing all of those branches or what, but we'll see, I guess.

These changes are all based on the recommendations of a community member, and I need more help. If you know of a better way to do things, or just of a couple additional changes I should make, then please, please let me know -- I'm learning a lot about project management all the time, but I'd much rather learn by being told than by making mistakes.

I hope to have a page up describing these practices, but if you're willing to document them and help maintain the page, please do so -- I'd love the help.

Thu, 13 Dec 2007


Recent Lessons on Managing Open Source Projects


Ok, so it wasn't the next day, but, well, what can I say, I got sick, and this release is just taking forever. And my wife defended her dissertation last week, so things kind of got put on hold for that. Anyway, back to the program.

As I mentioned previously, I've learned a lot in the last few months, mostly by making mistakes.

Release Management

One of the biggest mistakes I've made recently is not putting out bug-fix releases in the last few months. The current release, 0.23.2, has some sticky bugs in it, and I've gotten so much development done in the meantime that it won't necessarily be a pain-free upgrade for everyone.

I let this happen because I haven't normally done large chunks of development. It's been safe to release after each significant feature, since the development periods were always short, which meant there was no real need to keep a stable branch separate from the development branch. This batch of development has been varied and very large, enough work that I should have been maintaining a stable branch this whole time.

So, once I finally get 0.24.0 out (which should be tomorrow), I'm going to announce a new policy of always having a branch for every point release; in this case, there'll be a 0.24.x branch. All bugs will get fixed ASAP against that branch, and then merged over to the master branch if appropriate. This way I've always got a stable branch I can release if there are important bugs to fix, and I've always got any fixes in both the stable and development branches.

It looked for a while like I had someone to manage the point-point releases (e.g., 0.24.1 and 0.24.2), but he changed his mind at the last minute. It was an interesting idea, though, and got me pretty excited, so I'm probably going to try to find someone who will handle this role after all.

Community Involvement

While no one can argue I haven't built a pretty successful community, I think I haven't done quite enough to get them involved in the core of the project. Part of it is just the fact that most community members are sysadmins not developers, so it's reasonable that they wouldn't be mucking in the guts, but part of it is that I never quite know where to draw the line between the interests of the project and the interests of my company, which pays my salary and allows me to eat.

I think I need to work more toward getting a better balance of community involvement, which means finding more people to hand responsibility over to. A stable release manager would obviously be great, but I expect there are other areas we could find for people to take over. I've tried this some by defining community roles, but that was largely a failure. I'm still the only person really doing any ticket triage, even though we've had a published policy for months, for instance.

So it looks like I'll continue to make mistakes in this area, rather than starting to make the right decisions, but hopefully we'll eventually fumble toward a better balance.

This is a pretty rambling post, partially because I haven't really had time to lift my head much in the run-up to 0.24.0, but hopefully things will get better all around in the next few months.

Wed, 12 Dec 2007


Things I Have Learned Recently About Development


This is the second post in the series Things I Have Learned Recently

Development

I'd say there are three big things I've learned recently about coding:

Testing

In terms of tools, I've started using RSpec instead of test/unit, but what really matters is that I think about testing much differently now. I don't know that I'm testing "well" yet, but I'm at least testing much, much better. In particular, I'm doing my best to focus on the behaviour of the code that I'm testing, rather than internal state or unexposed behaviour, and I'm using a lot more mocking and stubbing to avoid setup/teardown costs and force simplification of interfaces. I'm also writing much shorter tests although many more of them.

A lot of my older tests were classes with a few, very long tests, each of which exercised a specific aspect of the given class's functionality. Newer tests, though, will treat that functionality as a context, which will itself have many specific behaviours, and each of those behaviours will get its own test.

I think I'm still making some mistakes here, as my contexts are generally per-method, which I think is a mistake. For instance, I have a Collector class whose tests I recently decided to rewrite. This class can be used to find either virtual or exported resources, and it can be used with either a query or by specifying individual resources. These are really the main contexts I should be testing, but I didn't really do that. One of the issues is that RSpec itself doesn't do much to make this easier, but it's not really fair to blame the tools.

Design

Amazingly, Puppet hasn't had a class to model configurations until recently, nor has it had classes to model nodes or the different aspects of file serving. It's recently been made clear to me how important it is that I draw classes out of my code, and it's had a huge impact on clarity and maintainability. It's tough to describe exactly what I mean here or how to find these classes that aren't yet classes.

The main reason Puppet never had a Configuration class is that configurations show up in so many forms -- at the least, they exist during compilation, transfer, and execution. To see them as a common class, I had to step up a bit, and I'm still not entirely satisfied with the abstraction, but man, it's a lot better than it was.

The biggest change is that we're moving to REST from XMLRPC, and REST enforces a standardization that is entirely absent from my shoddy XMLRPC implementation. REST only has a few verbs (methods) -- find, destroy, and save, mostly -- whereas with XMLRPC I can make as many verbs as I want. This limitation enforces a consistency across all of my network-enabled classes, and at the same time requires that I expose those classes that maybe I didn't see clearly before. In order to find a Node, you must first have a Node class to call find on, for instance.

Where previously I had ten different classes, each with a unique set of methods that worked over the network and each of which didn't clearly model anything explicitly -- because they were more focused on providing network transparency than on classes or objects -- I now have a roughly equivalent number of classes but each modeling specific objects that I want to pass around the network.
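To make the find/save/destroy point concrete, here's a minimal sketch. The RestStyle module and its in-memory hash are hypothetical, not Puppet's actual implementation, and this Node class is a toy. The point is just that every model gets the same three verbs and nothing else:

```ruby
# Hypothetical mixin: gives any model class the same small set of verbs.
# An in-memory hash stands in for whatever would really sit behind the network.
module RestStyle
  def self.extended(base)
    base.instance_variable_set(:@store, {})
  end

  # Look up an instance by its key; returns nil when nothing is stored.
  def find(key)
    @store[key]
  end

  # Store an instance under its own name.
  def save(instance)
    @store[instance.name] = instance
  end

  # Remove an instance; returns true only if something was actually removed.
  def destroy(key)
    !@store.delete(key).nil?
  end
end

# Toy model class: you need a Node class before you can call Node.find.
class Node
  extend RestStyle
  attr_reader :name, :environment

  def initialize(name, environment = "production")
    @name = name
    @environment = environment
  end
end

Node.save(Node.new("web01.example.com"))
puts Node.find("web01.example.com").environment
Node.destroy("web01.example.com")
```

Any other class that extends the same module picks up the identical interface, which is where the consistency comes from.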

Development Process

First of all, distributed development is awesome. I kinda regret choosing git, because its usability is as bad as everyone says, but it really has made it easier for people to contribute patches and code. Its in-repository branching has made it much easier for others to create and publish their own branches and then for me to integrate their work. I don't know that I have any more people contributing, but the contributions seem more meaningful.

Mainly, though, I've learned that feature branches really are a good idea. I tried them about a year ago but I found them to be just overhead, given how few people were participating and how short my development cycles were. Recently, though, I've been doing larger chunks of work and there have been more people involved, so feature branches make much more sense. In particular, I haven't had a clean master branch in a while because I've been doing all of this development in it, but at the same time many bugs have been getting filed and fixed. I couldn't release because of the lack of cleanliness of the main branch, all because I wasn't using feature branches.

Conclusion

Mainly, it's clear that I still don't really know what I'm doing. I like to think I've become a much better developer in the last three months, but it's been three months of little apparent progress from the outside and a lot of pain and frustration on my part, so it's hard to see the positive right now. Hopefully we'll all reap the benefits in the near future.

Check back tomorrow for a post on managing open source projects.

Mon, 26 Nov 2007


Things I Have Learned Recently


It's difficult to enumerate exactly what I've learned in the last few months. In fact, it's difficult to enumerate the type of things I've learned.

I've certainly learned a lot about development. Rick Bradley has introduced many better ways of thinking about the world of development, albeit at the cost of much mental stability and clarity, and I've learned a lot about the value of distributed development and feature branches. I've learned a lot about managing a project, mostly through making mistakes. I've learned a lot about business, again mostly through mistakes rather than success (is it even possible to learn through success?).

Really, you could say that the majority of what I've learned recently is by making mistakes. As Barry LePatner says, though:

Good judgment comes from experience, and experience comes from bad judgment.

I've got lots of bad judgement in the recent past, so I guess I'm much more experienced.

I'm going to do a series of posts about what I've learned, because I think one post would just be too damn long. First up will be development.

Sun, 25 Nov 2007


State of Development, October 2007


(A copy of a post sent to the users list.)

REST Development

First, I've found the REST work to be significantly more complicated than I'd feared. The plumbing is nearly all done and the majority of the functionality is now available, but there's still the painful and lengthy process of converting the internals from using the old xmlrpc-style classes to the newer and much cleaner REST-style classes.

Release Status

Given how long I've been out in the wilderness on this, the fact that I don't know how long this conversion will take, and the rate at which tickets have been piling up, I've decided to put off this conversion and do a release instead.

Starting today, I'm refocusing on getting 0.24.0 out the door as soon as possible. I'll have a more complete idea of what'll be in it by next week, but at this point it'll include the environment work I did a while back plus as many tickets as I can reasonably fix in the next week or two. This will be the misspiggy release, so by around Monday you should have some idea of what tickets I'm planning on fixing in this release. I'll probably also create another intermediate release and start assigning tickets to that, which will also include the REST work when I finally get it done.

I was initially planning on rolling back to the last known-good state, but in assessing the current state, I don't think that's necessary. We've mostly gotten plumbing done without hooking it into anything, which means that we would just be releasing the old functionality alongside the new plumbing, which shouldn't have any effect at all, other than making my life easier.

I really hope to get this release out in less than two weeks: One week of ticket-closing, and one week of testing. Apparently the git repo is kinda hosed right now, but it'll be working again by Monday. If you're in a position to help test, please start testing on Monday.

Ticket Management

I've mentioned this a few times, but I'm still looking for help from the community to manage tickets. I appear to have really let the open tickets pile up, and I'm having a hard time keeping on top of the list of unreviewed tickets. If anyone is interested in helping out, this would be a great place to start, and there are tickets ranging from trivially easy to fantastically complicated, so we can find the right challenge for just about anyone.

I've also been considering putting a bounty on some of the tickets. I'm a bit broke at the moment, but some of the tickets have the right combination of annoyance and simplicity that it might be worth some money to get rid of them. Is anyone interested in this? If so, please email me personally, and, I guess, let me know what it would actually take to get you interested.

Would others be interested in putting their own bounty on tickets? Do you have a feature request or bug that's just killing you that you'd be willing to pay a little or a lot of money to have fixed? Hopefully the same bounty system would work for you.

Errata

This is more Reductive Labs than Puppet, but it's at least worth pointing out. I've recently been joined by two partners, Andrew Shafer and Shane Olson. Neither of them is at the company full time, but hopefully I'll be able to afford to bring them on full-time soon. My big hope is that their help will allow me to make the product even better for all of you and to develop both Puppet and the tools around it, like PuppetShow and Runnels, with a little more vigour.

Conclusion

The summary here, of course, is that REST is delayed for a while, but I'm hoping to get a release out relatively soon with the features I've already developed (including support for multiple environments) along with any critical open tickets. I'm also still looking for community help in managing tickets, and I'd love someone to help document features as I add them (James Turnbull has been doing a great job of picking my brain and documenting the results, as an example).

I should be posting more in the coming days, covering functionality you should expect to see in this release.

Feel free to contact me directly if you have any concerns.

Thu, 25 Oct 2007