Puppet: System Administration Automated

A bit more DTrace


(This should have been posted a while ago, but I guess I had a problem and it's been sitting uncommitted for a while.)

After pulling apart the skip method in the lexer, so that the various parts are in separate methods, I get this as my count:

Puppet::Parser::Lexer                    munge_token              56778        358     20335592
Class                                    new                      28242        889     25132822
Puppet::Parser::Parser                   ast                      25881       1147     29695496
Fixnum                                   <                       1817071         16     30723097
StringScanner                            check                   1829886         26     48732560
String                                   length                  3757782         20     78611361
Puppet::Parser::Lexer::TokenList         each                     56778       6618    375813485
Puppet::Parser::Lexer                    find_token               56778       6714    381227038
Hash                                     each                     84949       4563    387630769
Puppet::Parser::Parser                   import                       9   45754308    411788774
Puppet::Parser::Parser                   _reduce_132                  9   45755009    411795083
Object                                   catch                    56018       8086    452970031
Puppet::Parser::Lexer                    scan                       173    2751816    476064309
Racc::Parser                             _racc_yyparse_c            173    2751907    476080064
Object                                   __send__                   173    2751984    476093248
Racc::Parser                             yyparse                    173    2752322    476151712
Puppet::Parser::Parser                   parse                      173    2752742    476224530
Array                                    collect                    331    1446548    478807659
Array                                    each                     26303      18476    485983221

The interesting one there is the Lexer.find_token method -- I just created that, and it looks like it's taking 38/48 of the total parse time, which is a helluva lot.

This method is responsible for picking the token to return, and the complicated aspect of the method is that it has to return the longest match, which is currently done by matching each token in turn (skipping those that don't match), and picking the longest match. This is expensive, because it means that every token is iterated over for every returned token, which means it scales at O(N^2), which is bad.

add to del.icio.us Add to Blinkslist add to furl Digg it add to ma.gnolia Stumble It! add to simpy seed the vine TailRank post to facebook

Mon, 28 Jan 2008 | Tags: , , ,


Name:


E-mail:


URL:


Comment: