Twiiter

Twitter Updates

    follow me on Twitter
    Loading..
    Loading..
    « Networked Time Machine Best Practices | Main | Sun and Oracle, a tale of two filesystems »
    Thursday
    Apr232009

    Good programs are lazy

    Once upon a time my job title was "Performance Engineer." Back then my job was to make things go faster. Now for every piece of code the particulars of how you do that varies, but at the end of the day almost all code optimization is fundamentally about figuring out how to do less.

    It seems fairly basic, things run faster when you are doing less, when you need to send less data, when you need to walk through smaller structure. Except for hiding latency (making a long process appear look shorter actually without making it shorter, incrementally displaying results for instance) almost all optimization is about figuring out how to do less.

    That is why code being lazy tends to be a big win. By lazy I mean procrastinating, putting off anything it can do until the last moment. If you need an array for some operation why should you ever initialize it until the first time you try to do that operation. Of course that is just a rule of thumb, in some cases there may be responsiveness or latency concerns that require you to build some structures ahead of time, but people tend to default to precalculating things, and special casing lazy initialization when their profiling tells them something.

    The problem with that is that individually most of the time non-lazy initialization is not a huge a deal, but as Jamie points[Jamie] out, small performance issues everywhere do add up. Not only do you burn fewer cycles, but I have found that if I am aggressively making my code use lazy initialization techniques it tends to reduce ram footprint and heap fragmentation, because any time some precalculated object turns out not to be used it is never allocated in the first place. On small devices with no paging (like an iPhone) that can be be a significant improvement.

    So why don't people write lazy code? Well, it is harder. That's right, being being lazy can be more work. The reason it is harder is more of a mindset and tool issue than a fundamental one. It is a systemic problem, and moving to being consistently lazy code requires a systemic solution. After I found myself doing the same pattern repeatedly I decided to codify it.

    Imagine you had an Objective C class with an init like this:

    - (id) initWithString:(NSString *)string_ {
      if ((self = [super init])) {
        cache = [[NSMutableDictionary alloc] init];
        derivedStringValue = [[self expensiveCalculation: string_] retain];
      }
    
      return self;
    }
    

    In that code we are allocating a cache array and doing some expensive calculation at object creation, but we don't know for sure the user is going to require us to use either of those. So, how would we rewrite that lazily? Well, here is what I used to do:

    - (id) initWithString:(NSString *)string_ {
      if ((self = [super init])) {
        string = [string_ retain];
      }
    
      return self;
    }
    
    @synthesize derivedStringValue;
    
    - (NSString *) derivedStringValue {
      if (!derivedStringValue) {
        derivedStringValue = [[self expensiveCalculation: string_] retain];
      }
    
      return derivedStringValue;
    }
    
    @synthesize cache;
    
    - (NSMutableDictionary *)cache {
      if (!cache) {
        cache = [[NSMutableDictionary alloc] init];
      }
    
      return cache;
    }
    

    Okay, that code doesn't allocate anything until we need it. In exchange it is no longer safe to directly access the ivar, you always need to go through the getters (i.e. [self derivedStringValue] or self.derivedStringValue). There are also added complications if you have some dependent values, you need to be careful their initializers don't form a loop. We don't have to worry about initializing the ivars to nil since the Objective C runtime zeros out newly allocated objects. Generally this works well, but it is more code and it is no longer clear what is initialization code. Since we use synthesized properties, if we call the generated setter it will bypass the initializer, which is equivalent to manually setting the ivar. So the question is, can we do better? Since I am writing a blog post about it, I must think the answer is yes.

    First, lets look at what is common between the two initializers, and try to isolate it:

    #define LG_SYNTHESIZE(name, type, initializer)  \
    @synthesize name;     \     
                                             \
    - (type) name {              \
      if (!name) {                   \
        name = (initializer);  \
      }                                     \
                                            \
      return name;               \
    }
    

    Now, using that macro lets rewrite the lazy version:

    - (id) initWithString:(NSString *)string_ {
      if ((self = [super init])) {
        string = [string_ retain];
      }
    
      return self;
    }
    
    LG_SYNTHESIZE(derivedStringValue, NSString *, [self expensiveCalculation: string_]);
    LG_SYNTHESIZE(cache, NSMutableDictionary *, [[NSMutableDictionary alloc] init]);
    

    Now that is much better. Ignoring the macro, it is about the same length as the original, and it is clear what is initialization code. One interesting thing to point out is that in many cases doing this allows you to eliminate the -init method. This example requires it because we need to stash an a parameter, but in general any class that can be initialized with -init (as opposed to -initWithFoo:) can be written with lazy initialization so that you don't have an explicit -init (of course some super class will implement -init).

    So, by coming up with a few macros, working out a few rules, and always using accessors over ivars there is a lazy init pattern for Cocoa that works fairly well. I have found that there is a need for several different synthesis macros, basicly corresponding to Objective C property attributes (LGSYNTHESIZEATOMIC, etc) as well as different macros if you to initializer a value that defined in a super via chained accessors. In some cases it is also necessary to implement the setter as part of macro. Personally I have only bothered to implement this and refactor my code for nonatomic retained synthesis, since that hit ~95% of the cases in my code.

    This could be made a lot nicer by implementing a source->source transform in clang and extending the existing synthesizer syntax, but I have found that adding custom extensions to your compiler to make your source code nicer tends to carry a very high cost in the long run.

    Update: changed source code highlighting to Syntax Highlighter 2.0.

    Reader Comments (2)

    Very interesting post. This is something that should ideally be a lot closer to the language level, but exactly how it would best be implemented will require some careful thought.

    Have you played around much with source->source transforms in clang? At a casual glance, it seems as though if you were careful enough with the syntax, you could get away with covering 95% of cases and retain a simple, robust syntax extension.

    April 23, 2009 | Unregistered CommenterNick Forge

    I have played with clang, though not for source->source transforms. Specifically I wrote some custom analysis passes to find places where I was breaking rules in my code. Similiar to the current leak finding stuff in checker, but specific to semantics of my code.

    I have some ideas for how to do the syntax pretty nicely, basically just like you can do something like:

    @synthesize myProp (ivar=_myOtherProp); you could do something like:

    @synthesize myProp (initializer = myInitializerMethod);

    and then have the synthesizer generate ASTs equivalent to my macros. It is actually exactly how getter synthesis works right now, just with one extra named parameter and some extra predefined synthesized forms. The reason I am hesitent to do it is that I like it when I can compile my code on an our of box dev tools installation, or on someoneelse's computer.

    If I did the source->source transform as a preprocessing step, and could include the preprocessors source as part of the project I would be more likely to do it. I don't want to include all of the LLVM and CLANG sources as part of my project to just bootstrap a custom preprocessor, so I definitely won't do that until they are both as dylibs in a standard development install.

    April 23, 2009 | Unregistered CommenterLouis Gerbarg
    Comments for this entry have been disabled. Additional comments may not be added to this entry at this time.