Puppet: System Administration Automated

Closing the first DTrace loop


I ended up spending what time I had for a couple of days at LCA optimizing Puppet's lexer. As I mentioned in my first DTrace post, I'm mostly just trying to use existing scripts for now and worry about understanding the darn thing later.

It turns out that the Swiss Army Knife of available scripts for Ruby is the rb_calltime.d script in the DTraceToolkit. This one gives us pretty much everything we might want to know in a first pass of debugging a Ruby program, including (most importantly for me) the inclusive and exclusive elapsed times for each method in the system (DTrace insists upon calling them functions, but it's Ruby, so we know better).

Unfortunately, I'd already hacked out most of the optimization before I discovered this script, and it doesn't seem to want to run at the moment, so all I can do is show the current data. Here are the methods that take the most time inclusively (meaning that it counts the time from entry to exit, and thus just tells us where the time is being spent in the overall program, not where the actual problems might lie):

lexer.rb             func       Puppet::Parser::Lexer::munge_token 12062622
lexer.rb             func       Puppet::Parser::Lexer::TokenList::lookup 12410291
branch.rb            func       Puppet::Parser::AST::Branch::initialize 19084001
methodhelper.rb      func       Hash::each                       19144097
methodhelper.rb      func       Object::set_options              20712067
ast.rb               func       Puppet::Parser::AST::initialize  22431107
lexer.rb             func       Array::each                      23868735
parser_support.rb    func       Class::new                       28520157
lexer.rb             func       Puppet::Parser::Lexer::find_string_token 29610463
lexer.rb             func       Puppet::Parser::Lexer::find_regex_token 29844180
parser_support.rb    func       Puppet::Parser::Parser::ast      36132924
lexer.rb             func       Puppet::Parser::Lexer::find_token 44581509
lexer.rb             func       Object::catch                    46187059

And here are the exclusive times (i.e., only counting the time spent in each method, not the time between entry and exit):

ast.rb               func       Puppet::Parser::AST::initialize   1719039
lexer.rb             func       Puppet::Parser::Lexer::Token::convert  1920189
branch.rb            func       Puppet::Parser::AST::Branch::initialize  2062436
lexer.rb             func       Object::catch                     2262789
lexer.rb             func       StringScanner::match?             2471109
parser_support.rb    func       Class::new                        3112051
lexer.rb             func       Puppet::Parser::Lexer::skip       3282333
lexer.rb             func       Puppet::Parser::Lexer::find_token  4930508
lexer.rb             func       Puppet::Parser::Lexer::munge_token  5232851
lexer.rb             func       Puppet::Parser::Lexer::find_regex_token  5572052
lexer.rb             func       Puppet::Parser::Lexer::TokenList::lookup  6659161
parser_support.rb    func       Puppet::Parser::Parser::ast       7144763
lexer.rb             func       Hash::[]                          7179650
methodhelper.rb      func       Hash::each                       14954253
lexer.rb             func       Puppet::Parser::Lexer::find_string_token 17045065
lexer.rb             func       Array::each                      20768233
-                    total      -                                132806477

Given this data, we're spending about 1/7th of the total parsing time just in the find_string_token method, which currently looks like this:

def find_string_token
    matched_token = value = nil

    # We know our longest string token is three chars, so try each size in turn
    # until we either match or run out of chars.  This way our worst-case is three
    # tries, where it is otherwise the number of string chars we have.  Also,
    # the lookups are optimized hash lookups, instead of regex scans.
    [3, 2, 1].each do |i|
        str = @scanner.peek(i)
        if matched_token = TOKENS.lookup(str)
            value = @scanner.scan(matched_token.regex)
            break
        end
    end

    return matched_token, value
end

The method is responsible for determining if the next token is a simple string-based token. I've optimized it by taking the fact that the longest string-based tokens are three characters (the <<| and |>> tokens), so I look for three character matches, then two, then one. If I don't get a match by then, then we don't have a match. I could probably optimize further, since these three character tokens are pretty darn rare, especially compared to the one character tokens, but I'd need to hard-code a lot more knowledge about the token list, and really, this iteration should be delimited by an automatic determination of the longest token, rather than hard-coding it.

This really isn't a very good write-up of what I did with DTrace or how it was helpful, other than showing the interesting differences between the exclusive and inclusive data, and letting you know that the rb_calltimes.d script is the one to start with, but hey, that's more than I could find when I started looking, so hopefully this will get you somewhere.

I expect to continue spending more time using DTrace for optimizations, and I'll hopefully start uploading my data so I don't have to worry about taking these snapshots. Graphs would sure be nice....

add to del.icio.us Add to Blinkslist add to furl Digg it add to ma.gnolia Stumble It! add to simpy seed the vine TailRank post to facebook

Tue, 05 Feb 2008 | Tags: , , , , , ,


A bit more DTrace


(This should have been posted a while ago, but I guess I had a problem and it's been sitting uncommitted for a while.)

After pulling apart the skip method in the lexer, so that the various parts are in separate methods, I get this as my count:

Puppet::Parser::Lexer                    munge_token              56778        358     20335592
Class                                    new                      28242        889     25132822
Puppet::Parser::Parser                   ast                      25881       1147     29695496
Fixnum                                   <                       1817071         16     30723097
StringScanner                            check                   1829886         26     48732560
String                                   length                  3757782         20     78611361
Puppet::Parser::Lexer::TokenList         each                     56778       6618    375813485
Puppet::Parser::Lexer                    find_token               56778       6714    381227038
Hash                                     each                     84949       4563    387630769
Puppet::Parser::Parser                   import                       9   45754308    411788774
Puppet::Parser::Parser                   _reduce_132                  9   45755009    411795083
Object                                   catch                    56018       8086    452970031
Puppet::Parser::Lexer                    scan                       173    2751816    476064309
Racc::Parser                             _racc_yyparse_c            173    2751907    476080064
Object                                   __send__                   173    2751984    476093248
Racc::Parser                             yyparse                    173    2752322    476151712
Puppet::Parser::Parser                   parse                      173    2752742    476224530
Array                                    collect                    331    1446548    478807659
Array                                    each                     26303      18476    485983221

The interesting one there is the Lexer.find_token method -- I just created that, and it looks like it's taking 38/48 of the total parse time, which is a helluva lot.

This method is responsible for picking the token to return, and the complicated aspect of the method is that it has to return the longest match, which is currently done by matching each token in turn (skipping those that don't match), and picking the longest match. This is expensive, because it means that every token is iterated over for every returned token, which means it scales at O(N^2), which is bad.

add to del.icio.us Add to Blinkslist add to furl Digg it add to ma.gnolia Stumble It! add to simpy seed the vine TailRank post to facebook

Mon, 28 Jan 2008 | Tags: , , ,


A first pass at DTrace


I've never really spent much time optimizing Puppet except in those areas that get particular complaints (and not always then), but now that I'm forced to run Leopard I figured I should see if I can put DTrace to use.

The first pass used the functime.d script, which tells me how long Puppet spends in each function. I couldn't get the file to execute directly, and I also couldn't get it to execute my script for me (which is a pretty good indication that I don't really know how to use DTrace), so I added the ability to pause my test script, giving me time to start dtrace. So, I run my test script, which I'm using to test parse time:

~/puppet/ext/puppet-test --modulepath /Users/luke/Desktop/puppet-stanford/modules/ -s parser -t parse --manifest ~/Desktop/puppet-stanford/master/manifests/site.pp -p

Then I run the dtrace script:

sudo dtrace -s ./functime.d -p 45847 2>&1 | tee functimes.log

This takes a heckuva long time to run (380 seconds or so, vs. about 6 normally), but in the end I get a big file that has histograms for all of the classes and methods, along with a sorted list of how long Puppet spends in each method. E.g., here's a histogram:

Puppet::Parser::Parser                              parse
         value  ------------- Distribution ------------- count
       8388608 |                                         0
      16777216 |                                         1
      33554432 |                                         0
      67108864 |                                         1
     134217728 |@@@@@@@                                  30
     268435456 |@@@@@@@@@                                39
     536870912 |@@@@@@@@@@@                              49
    1073741824 |@@@@@                                    21
    2147483648 |@@@                                      14
    4294967296 |@@                                       10
    8589934592 |@                                        6
   17179869184 |                                         2
   34359738368 |                                         0

And here's a few of the methods:

Puppet::Parser::Parser                   ast                      25881       1090     28219110
NilClass                                 nil?                    1982238         19     38713614
StringScanner                            check                   2044008         24     50380323
Hash                                     each                     84949       2789    236945697
Puppet::Parser::Parser                   import                       9   29288385    263595467
Puppet::Parser::Parser                   _reduce_132                  9   29289048    263601440
Object                                   catch                    56018       5403    302711262
Puppet::Parser::Lexer                    scan                       173    1752645    303207689
Racc::Parser                             _racc_yyparse_c            173    1752730    303222439
Object                                   __send__                   173    1752803    303234951
Racc::Parser                             yyparse                    173    1753138    303292971
Puppet::Parser::Parser                   parse                      173    1753536    303361785
Array                                    collect                    331     925551    306357434
Array                                    each                     26303      11912    313340970

The first annoying thing to notice about this is that this test is clearly collecting total time between method entry and exit, not the total time that we're in a method, which makes it a bit less useful for testing.

The next thing to notice is that we're calling nil? and check a ton of times, which adds up even though they're individually very cheap.

If we add up all of the calls to check and nil?, we get a bit less than half of the total run time of the parse method (which is the entry point to all of this code), which means they're having a big impact.

This really isn't anything I couldn't get from normal Ruby profiling, but based on my experience working with Brendan a bit at OSCON last year, I know there's much more available.

My next post on DTrace will hopefully include me covering how I used it to drill down a bit further.

add to del.icio.us Add to Blinkslist add to furl Digg it add to ma.gnolia Stumble It! add to simpy seed the vine TailRank post to facebook

Mon, 28 Jan 2008 | Tags: , ,