Puppet: System Administration Automated

Testing Cached Values


I'm currently in the middle of the largest refactoring effort I've ever done, while simultaneously learning tons about how to be a better developer. I'm constantly feeling a bit overwhelmed, a bit behind the curve, and like someone's going to look at my code one day and say, "hey nitwit, you just pull this string here and suddenly 2/3 of your code just goes away."

However, one thing I've really grokked recently is that if you're fighting your tools too much, you are on the wrong track. One place where I seem to be constantly fighting my tools is the combination of cached values and testing.

For instance, I have an HttpPool class that knows how to set up Net::HTTP instances with all of the SSL information they need. This class caches SSL information like the certificate and key, so that each connection doesn't hit the disk for this info, which is obviously a pretty decent use of the cache. This caching generally takes this simple form:

def ssl_host
    unless defined?(@ssl_host)
        @ssl_host = Puppet::SSL::Host.new
    end
    @ssl_host
end

Yes, you could just do @ssl_host ||= Puppet::SSL::Host.new, but I've gotten some weird behaviour out of that in the past, and it also throws a warning for undefined variables.
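The post doesn't say what the weird behaviour was. One well-known candidate, offered here only as an assumption, is that ||= treats a cached nil or false as "no value yet" and runs the expensive code again on every call, while the defined? form caches whatever was assigned. A minimal sketch with a hypothetical Lookup class:

```ruby
# Hypothetical class used only to compare the two memoization styles.
class Lookup
  attr_reader :calls

  def initialize
    @calls = 0
  end

  # Stand-in for an expensive lookup that legitimately returns nil.
  def expensive
    @calls += 1
    nil
  end

  # ||= re-evaluates the right-hand side whenever @a is nil or false.
  def with_or_equals
    @a ||= expensive
  end

  # defined? only checks whether @b was ever assigned, so nil is cached too.
  def with_defined
    unless defined?(@b)
      @b = expensive
    end
    @b
  end
end

or_equals = Lookup.new
2.times { or_equals.with_or_equals }

with_defined = Lookup.new
2.times { with_defined.with_defined }

puts "||= ran the lookup #{or_equals.calls} times"
puts "defined? ran the lookup #{with_defined.calls} times"
```

For a value like Puppet::SSL::Host.new, which is never nil or false, the two forms cache identically.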

So anyway, this works fine in unit tests; I test the caching, and I use mocks everywhere else. When I get to integration tests, though, it starts to really hurt, especially since I try not to do much mocking in my integration tests.

For instance, say I've got two unrelated tests that each make an SSL connection. They each create some certificates, start a daemon, and try to connect. In this situation, the first one caches the SSL information, and the second one uses the cached values instead of its own new certificate, and you get invalid certificates.
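The failure mode can be reproduced without any real SSL. The sketch below assumes the cache lives at the class level (which is what lets it survive from one test to the next); FakePool and current_certificate_on_disk are hypothetical stand-ins, not Puppet code:

```ruby
# Hypothetical pool that caches its SSL data for the life of the process.
class FakePool
  class << self
    # Stand-in for "whatever certificate is on disk right now".
    attr_accessor :current_certificate_on_disk

    def ssl_host
      unless defined?(@ssl_host)
        @ssl_host = current_certificate_on_disk
      end
      @ssl_host
    end
  end
end

# "Test 1" generates its certificate and connects.
FakePool.current_certificate_on_disk = "cert-for-test-1"
first = FakePool.ssl_host

# "Test 2" generates a fresh certificate, but the pool never looks at it.
FakePool.current_certificate_on_disk = "cert-for-test-2"
second = FakePool.ssl_host

puts "test 1 saw #{first}, test 2 saw #{second}"
```

Nothing in the second test tells the pool that its cached value is stale, which is the gap a central cache-clearing hook is meant to close.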

After talking to Rick Bradley on #nashdl on IRC (gotta represent!), I've decided on at least an initial course of action. I'm going to create a module that provides both a caching and cache-clearing interface; anyone using cached values would use this caching interface instead of caching the values themselves, and the module itself would give you a single point of entry for clearing all caches on the system.

My first instinct was to create a Cache class or struct and keep a list of them in the caching module, so they can be cleared as necessary, but my recent work with TTLs has made me realize that time-based concepts of cache dirtiness are much better than actively cleaning.

So now, I'm thinking that the caching module will just have a timestamp, and only cached values created after that time will be valid. Before the Cache struct returns any values, it will always check that time, and it will know whether the value it has is still good or should be discarded.

This keeps us from maintaining global lists of caches, and it also makes clearing caches insanely cheap -- just reset a timestamp. Given that Puppet already sometimes has onerous memory requirements, I also like that it makes the caches themselves more likely to get garbage collected, since only the caching instance ever knows about the actual cache itself.

So my new method would look something like this:

def ssl_host
    cache(:ssl_host) { Puppet::SSL::Host.new }
end

That bit with the block is something I just thought of; if we've got a value, it's not called, but if we need a value, it's there for us. Pretty sweet.
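Here is a minimal sketch of how the timestamp idea and the block-taking cache method might fit together. Everything in it is an assumption made for illustration: the Cacher module, its expire method, the Entry struct, and the Pool class are hypothetical names, and Object.new stands in for Puppet::SSL::Host.new so the example runs on its own.

```ruby
# Hypothetical caching module; the names are illustrative, not Puppet's API.
module Cacher
  # Values created at or before this time are considered stale.
  @timestamp = Time.now

  class << self
    attr_reader :timestamp

    # Invalidate every cached value in the process by moving the cutoff.
    def expire
      @timestamp = Time.now
    end
  end

  # A cached value plus the time it was created.
  Entry = Struct.new(:value, :created_at)

  # Return the cached value for +name+. The block is only called when there
  # is no entry yet or the entry predates the last call to Cacher.expire.
  def cache(name)
    @cached_entries ||= {}
    entry = @cached_entries[name]
    if entry.nil? || entry.created_at <= Cacher.timestamp
      entry = Entry.new(yield, Time.now)
      @cached_entries[name] = entry
    end
    entry.value
  end
end

class Pool
  include Cacher

  def ssl_host
    # Object.new stands in for Puppet::SSL::Host.new.
    cache(:ssl_host) { Object.new }
  end
end
```

An integration test could then call Cacher.expire in its setup, and every object that includes the module would rebuild its values on the next access. The comparison is deliberately strict: on a coarse clock, a value created in the same tick as the cutoff gets rebuilt rather than trusted.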

Now just to implement it.


Wed, 07 May 2008


Posted by vinbarnes at Thu May 8 03:27:04 2008
Looks good. Taking it a step further, you could get some inspiration from attr_accessor and have it define the method as well. And shave it down to just one line:

cache(:ssl_host) { Puppet::SSL::Host.new }

Posted by Luke Kanies at Thu May 8 03:46:43 2008
Yeah, that's basically what I have::

  def cached_attr(name, &block)
    define_method(name) do
      attr_cache(name, &block)
    end
  end

I ended up not being able to use ``cache`` as the method, because I was already using it in some places.

Posted by Yossef at Thu May 8 04:59:17 2008
Yeah, the block thing is seriously cool in Ruby. It's obvious that's never called because you can have some extremely wrong code in there and if there's no yield or block.call, nothing happens (assuming the parser can handle it, of course).
