    Wednesday
    28Oct2009

    New theme

    I have changed this blog's theme since a number of people complained that it was hard to read light text on a dark background. I am still fiddling around with the template, so if something is broken, that is probably why.

     

    Saturday
    24Oct2009

    The loss of ZFS

    Well, in case you haven't read any of the myriad stories about it, it appears that Apple has decided not to use ZFS on Mac OS X. Gruber has sources that say it was primarily licensing concerns, which is consistent with what people have implied to me, both recently, and around WWDC (although at that time I think there was probably still hope of resolving the issues).

    Now, some people may comment that it couldn't be licensing issues, since ZFS is opensource (under the CDDL), and that Apple already uses CDDL software (DTrace). That may be true, but often in deals that involve large companies there is more to it than that. Apple may have wanted guarantees of indemnification in the NetApp lawsuit. Maybe it wanted guarantees that certain modifications it wanted to make would be accepted upstream, or even to get Sun to make certain changes. It also might have wanted additional distribution rights that were not granted under the CDDL. It is typical for companies to negotiate custom agreements in such cases (and for some money to change hands), so the idea that licensing issues are why it fell through is entirely reasonable, even though it is an opensource product. Obviously Sun's steady decline in the marketplace, and the uncertainty caused by the Oracle acquisition, may have greatly complicated any such negotiations.

    Why not do a new filesystem?

    Apple has a lot of talented filesystem engineers. They are certainly capable of doing something comparable to ZFS, at least for their target market. The problem with developing a new modern filesystem is that it generally takes longer than a single OS release cycle. Most companies are really bad at having large teams focused on projects that will not ship in the next version of the project they are working on.

    This is a particularly acute problem at Apple, which traditionally has done things with very few engineers. I don't want to get into exact numbers, but I recall having a discussion with the head of a university FS team who was discussing the FS he was working on. He was pitching it to a group of Apple engineers. It was some interesting work, but there were some unsolved problems. When he was asked about them he commented that they didn't have enough people to deal with them, but he had some ideas and it shouldn't be an issue for a company with a real FS team. It turned out his research team had about the same number of people working on their FS as Apple had working on HFS, HFS+, UFS, NFS, WebDAV, FAT, and NTFS combined. I think people don't appreciate how productive Apple is on a per-engineer basis. The downside of that is that sometimes it is hard to find the resources to do something large and time consuming, particularly when it is not something that most users will notice in a direct sense. That is especially true if senior management is not excited about the idea.

    Because of that, I was fairly convinced ZFS was a credible future primary FS for Apple. Not because it was an optimal design for them (it isn't), but because it was a lot less work than doing a new design from scratch. The fact that its fundamental architecture is 20 years newer than HFS meant it would still be better than HFS+ in almost all respects even if it was not designed for Apple's exact needs. Clearly I was wrong, since Apple has stopped the ZFS project.

    What changed?

    Well, a couple of things have happened. The first is that Mac OS X has gotten more mature. They no longer need to port all of those FSes, they already have them working, and in most cases they work fairly well. That frees up some engineers. Apple has also greatly expanded the number of people working on their kernel since it is amortized over many different products (Mac OS X, iPhone, AppleTV, etc).

    Suddenly the notion of doing a new filesystem seems doable, so long as it is a real priority and the FS team doesn't get pulled to keep adding features or doing major work to legacy FSes. That is still a lot of work when Apple had ZFS approaching production quality on OS X.

    Apple can do better than ZFS

    Sun calls ZFS "The Last Word in Filesystems", but that is hyperbole. ZFS is one of the first widely deployed copy on write FSes. That certainly makes it a tremendous improvement over existing FSes, but pioneers are the ones with arrows in their back. By looking at ZFS's development it is certainly possible to identify mistakes that they made, and ways to do things better if one were to start from scratch. From where I sit, there are 3 obvious ways doing a new FS will be better for Apple than ZFS:

    1. There has been new fundamental research since ZFS was designed that simplifies many of the issues involved with it. In particular, the paper "B-trees, Shadowing, and Clones" (PDF). That paper is the basis for the design of BtrFS, which has a very similar feature set to ZFS, but internally is entirely different. LWN has an article about BtrFS that explains the significance in some detail (it is written by Valerie Aurora, who worked on ZFS at Sun).

    2. ZFS was designed for the storage interfaces available a decade ago. Spinning disks are going to be with us for a long time, especially for bulk storage in data centers and on backup devices, but the future is all about solid state. Flash SSDs have significantly different performance characteristics than spinning media, and there may be FS design decisions one could make that would benefit from that. Now, any FS Apple designs will have to work acceptably on traditional drives, but if they are designing for the future then flash is what to target.

      ZFS has had some optimization work for flash, but it is all in terms of using flash as part of a storage hierarchy. That makes complete sense, since ZFS's primary deployment targets are high-end systems and data center storage. Those systems have multiple drives, so the idea of separate flash drives for a ZIL and L2ARC is completely reasonable. Most consumers have one drive in their system, and maybe an external drive for bulk data, data exchange, and backup.

    3. That brings up the last point. ZFS is designed for big systems. It works on small systems, but most of the tradeoffs favor very large computers, with lots of drives. This shows up in a number of ways. The first is that ZFS is not currently capable of adding single drives to an existing vdev or migrating vdevs between various types (mirror, raidz, raidz2). This is a major feature for smaller users who might want to add a single drive, but is a non-issue for data center users who tend to add large numbers of drives all at once, since they will add whole vdevs. Another issue is that ZFS assumes you have a lot of RAM. NEC has been doing a port of OpenSolaris to ARM, and they determined they could not get ZFS to use less than 8 megabytes of RAM without making incompatible format changes (Compacted ZFS). With those changes they could squeeze it into a more reasonable 2 megabytes. On a desktop that doesn't seem like a big deal, but on an iPhone 3G or a Time Capsule 8MB of wired memory is an enormous issue.
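    For readers who have not run into copy on write or "shadowing" before, here is a minimal sketch of the idea in Python. It uses a plain binary search tree rather than a real B-tree, and the names (Node, insert, lookup) are made up for illustration; it is not how ZFS or BtrFS are actually implemented. An update never modifies a node in place. It copies the path from the root down to the changed node, and the old root keeps working as a snapshot.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Node:
    """An immutable tree node; frozen so a published node is never edited in place."""
    key: int
    value: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(root: Optional[Node], key: int, value: str) -> Node:
    """Return a new root with key set to value, copying only the root-to-target path."""
    if root is None:
        return Node(key, value)
    if key < root.key:
        return Node(root.key, root.value, insert(root.left, key, value), root.right)
    if key > root.key:
        return Node(root.key, root.value, root.left, insert(root.right, key, value))
    # Overwriting an existing key still produces a fresh copy of this node.
    return Node(key, value, root.left, root.right)


def lookup(root: Optional[Node], key: int) -> Optional[str]:
    """Ordinary binary search tree lookup starting from a given root."""
    while root is not None:
        if key == root.key:
            return root.value
        root = root.left if key < root.key else root.right
    return None


# Build a small tree, keep the old root around as a "snapshot", then modify.
v1 = None
for k, v in [(50, "a"), (30, "b"), (70, "c")]:
    v1 = insert(v1, k, v)
v2 = insert(v1, 70, "c-modified")

print(lookup(v1, 70), lookup(v2, 70))  # the snapshot still sees the old value
print(v2.left is v1.left)              # the untouched subtree is shared, not copied
```

    The snapshot costs nothing beyond the copied path, which is the property that makes cheap snapshots and clones natural in this family of filesystems.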

    The only major downside is that if Apple is just starting on a next generation FS now it could be a long time before we get our hands on it.

    But now we are going to have another incompatible next generation filesystem

    Wolf brought this point up during some of the ZFS talk on Twitter yesterday. My general opinion is that it doesn't matter. People use drives for two largely unrelated tasks. One is running their computers. This is fixed storage. The other is for data exchange. In the old days people used floppies for their sneakernet media, which made the situation much simpler to understand. In recent years the market realities have caused people to move to using SD cards, thumbdrives, and hard drives as the exchange medium of sneakernet.

    The important point to understand is that while the physical devices may be the same, the use model is different, just as using a floppy disk and an internal hard drive were different. Nobody would balk at the notion that floppies should use different FSes than internal drives. Likewise, most people shouldn't care that their external drives are formatted differently than their internal drives.

    There are complicated features you want for your boot drives and system disks. Ideally you could have them on your interchange disks, but there are other features that are more important, particularly interoperability and simplicity. ZFS didn't bring either of those. There might have been a few people who were psyched to be able to use ZFS to share disks between a Mac and a Solaris or FreeBSD box, but honestly those people are few and far between. Whether Apple used ZFS or something else it is just as interoperable with Linux and Windows (which is to say, not at all). So the fact that Apple looks to be doing a new FS does not impact interoperability in any real sense.

    The other feature you really want for an interchange FS is simplicity. There are a lot of devices out there that use an FS to communicate with a computer. The simplest example is a digital camera via its media cards, but there are many others. Something like ZFS is way too complex for those devices, and honestly most of the features of ZFS like multiple drive support and snapshots are useless since the devices don't have the physical interconnects or user interfaces to expose those features. There is certainly an argument to be made that we could use something a bit better than FAT32 or exFAT as that format, but ZFS was not the right solution for that.

    In other words, for that disk you want to use as an external drive to drag between computers you don't want something like ZFS, you want something that is simple enough that a firmware engineer can write a read-only implementation from the specs in less than a week. For the disk embedded in your computer (operationally or literally) you want something like ZFS, but it doesn't matter if it is interoperable with anything else because you won't be moving it between systems.
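    To give a feel for what "simple enough" means, here is a rough Python sketch that decodes the core fields of a FAT12/FAT16 boot sector and works out where file data begins. The field offsets follow the published FAT layout, but the function name and the synthetic floppy sector are made up for this example, and it ignores FAT32 and everything past the boot sector. Treat it as an illustration of scale, not a usable driver.

```python
import struct


def parse_fat_bpb(sector: bytes) -> dict:
    """Decode the core BIOS Parameter Block fields of a FAT12/FAT16 boot sector."""
    if len(sector) < 512 or sector[510:512] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot sector signature")
    # Little-endian fields starting at byte offset 11 of the boot sector.
    (bytes_per_sector, sectors_per_cluster, reserved_sectors, num_fats,
     root_entries, total_sectors, media, sectors_per_fat) = struct.unpack_from(
        "<HBHBHHBH", sector, 11)
    # The fixed root directory holds 32-byte entries; round up to whole sectors.
    root_dir_sectors = (root_entries * 32 + bytes_per_sector - 1) // bytes_per_sector
    first_data_sector = reserved_sectors + num_fats * sectors_per_fat + root_dir_sectors
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "num_fats": num_fats,
        "root_entries": root_entries,
        "total_sectors": total_sectors,
        "media": media,
        "first_data_sector": first_data_sector,
    }


# Synthetic boot sector using the usual 1.44 MB floppy geometry.
sector = bytearray(512)
struct.pack_into("<HBHBHHBH", sector, 11, 512, 1, 1, 2, 224, 2880, 0xF0, 9)
sector[510:512] = b"\x55\xaa"
info = parse_fat_bpb(bytes(sector))
print(info["first_data_sector"])
```

    Everything a read-only implementation needs to locate the FATs, the root directory, and the data area falls out of a handful of integers, which is the kind of simplicity an interchange format should aim for.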

    This is basically how Windows works. Microsoft generally uses NTFS for internal drives, but FAT for external drives. Ultimately somebody should design a filesystem explicitly for use as an interchange format and license it for free, then everyone can deal with their internal FSes and do what makes the most sense for their OSes and markets.

    Thursday
    08Oct2009

    Flash on the iPhone

    Update: One of the developers of Trading Stuff posted a comment below, and has a great blog post about his experience getting it running on the iPhone.

    There has been a lot of discussion about running Flash apps on the iPhone over
    the last few days. It was precipitated by Adobe's announcement that Flash
    Professional CS5 would have support for publishing apps as iPhone native
    executables. They went into a little more detail, saying that they were going to
    use an Ahead of Time (AOT) compiler backend based on LLVM, and that
    there are already several apps on the store using it. This generated a large
    number of responses from various people, some knee-jerk, some well
    reasoned out. Of course, the fact that there are samples we can dissect means
    that it is possible to make some informed analysis about them.

     

    Personal views

     

    Before we get into the technical details of this, let me go into my background a
    bit, lest I be accused of being biased or having an agenda. I have around 10
    years of Objective C development experience, and almost no experience writing
    anything substantial with Flash or ActionScript. I am also primarily a Macintosh
    user, where the Flash experience (even in the browser) is often less than ideal.
    I am told it is a better experience on Windows. I do use the Hulu Desktop app, which is written in Flash, and think it is pretty nice (though
    they should make cut and paste work in their text fields, grrr). Of course,
    something like Hulu is an immersive app that has no need to integrate with the
    native OS experience, but neither do most games.

    There are a lot of neat little web games and what not that are written in Flash,
    that I would like to run. If that was the tool the author felt it best to
    express their ideas and it worked for them then great. In particular, for
    software that I don't feel needs to interface with the OS and which can use a
    completely custom UI (in other words, games) I think there is no difference to
    the user. So long as the environment generates good code that can run at full
    frame rate without killing the battery I am in favor of getting Flash apps to
    run on the iPhone.

     

    Warm up

     

    Okay, so first off let's look at what we know. Adobe is using the LLVM arm
    backend to generate code. Right off the bat that gets me a bit worried. Don't get
    me wrong, I love LLVM. Hell, I was one of the people who wrote
    the LLVM ppc backend. Having said that, I wouldn't use the LLVM arm backend at
    this time. The reason I wouldn't is because every time I have asked the
    people who are actively working on it they tell me it is not ready for prime
    time. That is the reason why LLVM-gcc and clang are not supported compiler
    targets for iPhone, despite them being supported (and encouraged) for development
    on Snow Leopard. Apple has a lot of compiler engineers and has basically stated
    that LLVM is their future compiler direction, so if those guys are telling us to
    hold back, then how is it good enough for AOT Flash?

    Now, Adobe could potentially have another arm backend they developed, or maybe
    they branched off a particular build and have fixed whatever bugs would impact
    them while ignoring stuff they didn't need. It is entirely possible that they
    could have reasonable code coming out of this thing, and I don't have access to
    the toolchain itself to inspect it, but it definitely gets me nervous. This may
    also be one of the reasons why they are not ready to widely release the tools
    yet.

     

    Let's get to it

     

    Okay, so I downloaded one of the games that was available, Trading Stuff,
    decompressed its IPA and had a look inside. At first
    glance it looked like a pretty normal iPhone app. Then I noticed there were no
    resources besides a basic MainWindow.nib. No images, no sounds, no
    localizations. The next thing I noticed was that the binary was ~13 megabytes,
    or ~95% the size of the entire app. That is enormous for a binary.
    For reference, compare that to a normal iPhone game, like The Oregon Trail,
    which is a ~106 megabyte game with a ~1 megabyte executable, or
    about 1% the size of the app.
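    If you want to repeat this measurement on an unpacked app bundle, something like the following Python sketch would do it. The helper names are my own, and the demo bundle is a fake one with made-up file names and sizes (scaled down to bytes), not the actual contents of either game.

```python
import os
import tempfile


def bundle_size(app_dir: str) -> int:
    """Total size in bytes of every regular file under app_dir."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(app_dir):
        for name in filenames:
            total += os.path.getsize(os.path.join(dirpath, name))
    return total


def executable_fraction(app_dir: str, executable_name: str) -> float:
    """Share of the bundle's bytes taken up by the main executable."""
    executable_size = os.path.getsize(os.path.join(app_dir, executable_name))
    return executable_size / bundle_size(app_dir)


# Demo with a fake bundle: a 950 byte "executable" and a 50 byte nib.
with tempfile.TemporaryDirectory() as app_dir:
    with open(os.path.join(app_dir, "FakeGame"), "wb") as f:
        f.write(b"\0" * 950)
    with open(os.path.join(app_dir, "MainWindow.nib"), "wb") as f:
        f.write(b"\0" * 50)
    demo_fraction = executable_fraction(app_dir, "FakeGame")

print(f"executable is {demo_fraction:.0%} of the bundle")
```

    Run against a real unpacked .app directory, a ratio near 1.0 is a strong hint that assets have been baked into the binary instead of shipped as resources.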

    What is going on is that the Flash build environment is not using any of the
    standard Mac OS X/iPhone OS bundling or localization mechanisms. Instead they
    are transforming all their assets into embeddable objects and shoving them
    directly into their application's TEXT section. At first glance that might not
    seem so bad, but it has a bunch of consequences. It defeats almost any sort of
    caching or prefetch logic the OS has for specific data types (like images), and
    instead places all of the pressure directly on the VM and paging subsystems.

    To be clear this is in no way a violation of the SDK agreement, and embedding objects
    into an app in this way is occasionally appropriate, but the degree to which it
    is happening with these cross compiled apps is different, and likely will have a
    number of significant (negative) performance implications.

    Now, moving on from the obvious, it's time to actually start poking at the
    generated code. If we just take a cursory look at the linkage, we can see some
    bad stuff going on here. Why are they calling dlopen, dlclose, and dlsym? The
    only reason to use those is to load in frameworks and resolve symbols after launch,
    something that is strictly verboten. In the best case that might be some dead
    code they use for debugging that should have been stripped out. In the worst
    case, they depend on them to get access to symbols they are
    not supposed to use. I want to be clear about this: There is no legitimate
    reason why any app that follows the terms of the SDK agreement should use these
    functions, and I find it shocking that Apple lets apps that link against them
    on the store at all. I should also note this is not exhaustive, those are just the most obvious things, but in my cursory inspection I saw a dozen or so Objective C selectors that I believe are private.
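    For anyone unfamiliar with these calls, here is a small sketch of what run time symbol resolution looks like, using Python's ctypes as a stand-in for the C API (on POSIX systems ctypes is built on the platform's dlopen and dlsym). This is only an illustration of the mechanism, assuming a Unix-like system; it says nothing about what Adobe's generated code actually looks up.

```python
import ctypes
import os

# Passing None asks the dynamic loader for a handle to the running process
# itself, the equivalent of dlopen(NULL, ...) in C on POSIX systems.
process = ctypes.CDLL(None)

# The symbol is found by its string name at run time (the dlsym step), so
# nothing about this call shows up as an ordinary link-time dependency.
symbol_name = "getpid"
getpid = getattr(process, symbol_name)
getpid.restype = ctypes.c_int

pid_via_dlsym = getpid()
print(pid_via_dlsym == os.getpid())
```

    Because the name is just a string, it can be assembled at run time, which is why a static look at an app's imports cannot tell you everything it ends up calling.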

    Actually that brings up another point. Despite the app developer doing nothing
    wrong, one of their toolchain or middleware vendors is doing something that could
    be an issue. When I write apps for the store I might choose to play fast and
    loose with something if there is a compelling reason, but if I am providing a
    library for someone else I never do. Using private APIs is putting your
    customers' apps at risk. Not only do I find that unacceptable,
    but I think any vendor who does that is generally irresponsible and makes me
    hesitate to use any of their other products because I feel it shows a certain
    casualness about how they treat their customers. If there is a legitimate reason
    you need to do something that risks your customer's products, then as a company
    you need to disclose it so your customers can make an informed decision.

    I should note this is not an intrinsic issue with Flash, I know for a fact
    certain major vendors ship iPhone libraries that call APIs that can get your app
    rejected from the store without informing developers. For instance, various analytics companies
    really shouldn't be poking at private APIs to try to find cached location
    framework data. It isn't just a privacy breach, it places your clients' apps at risk.

     

    So what. How does it run?

     

    On my iPhone 3G it runs really choppy, on my 3GS it runs acceptably, but it
    still isn't smooth. Given the OpenGL performance people have seen on the 3GS
    that is still pretty bad. I have not done any invasive tests by instrumenting
    the binary, that is just what I can get via basic usage. The sad thing is that
    there is no reason it has to have performance like this. This is not an inherent
    issue with the ActionScript used in this app (though that may have issues), it
    is that what is coming out of the toolchain is a huge, monstrous binary that stresses
    the runtime and has performance characteristics completely different than
    anything the iPhone is currently setup for.

    Also, remember, the slower the frame rate the more work the phone is doing per frame, and the
    more battery the app is using. When you see an app that can do 120 FPS in its
    demo loop, that means that when it is running at 30 FPS it is using ~25% of the
    CPU/GPU assets. When you see one that can only get 20 FPS that means it cannot
    even reach the 30 FPS clamp despite maxing out some of the costly (in terms of battery)
    system assets.
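    The arithmetic behind that is simple enough to write down. This is a back-of-the-envelope Python sketch, assuming the cost of rendering scales linearly with the number of frames drawn (real hardware is messier); the function name is mine.

```python
def utilization(max_fps: float, target_fps: float) -> float:
    """Rough fraction of CPU/GPU time needed to hold target_fps.

    Assumes cost scales linearly with frames rendered and caps at 1.0,
    since an app cannot use more than all of the hardware.
    """
    return min(1.0, target_fps / max_fps)


fast_app = utilization(max_fps=120, target_fps=30)
slow_app = utilization(max_fps=20, target_fps=30)

print(fast_app)  # 0.25
print(slow_app)  # 1.0
```

    The app that can reach 120 FPS spends about three quarters of its time idle when clamped to 30, while the one that tops out at 20 is pegged the whole time.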

     

    Punchline

     

    Technically speaking, these do appear to basically be within the letter of the SDK
    agreement, modulo the fact that Adobe appears to be making private API calls. They
    should be able to do what they need to without making those calls, so ultimately
    that should be a non-issue.

    Now, the notion that what this thing emits is indistinguishable from
    something Xcode emits is laughable. They are very different, and not in a good
    way. While the apps may get acceptable frame rates on an iPhone 3GS, they don't
    on earlier hardware, and they almost certainly use substantially more
    battery power than native games.

    I want to be excited about this thing, both because it is a seriously cool
    piece of tech, and because there are Flash games I would like to run on my
    phone, but looking at what this thing is spitting out I think the apps it will
    generate will perpetuate the stereotypes about Flash (especially on cell
    phones), and give Objective C programmers a (somewhat misplaced) sense of
    vindication about their views on Flash.

    This is all still in beta, it could end up a lot better than it currently is. It
    could be something that can make some great games available on the iPhone.
    Unfortunately, looking at it right now I am very skeptical, and I think that is the
    right position to have given Flash's performance elsewhere. Yes, this is
    entirely new technology, but it comes from the same company with the same
    priorities. Given the product they have delivered to me on my desktop for the
    last 5 years they don't get the benefit of the doubt, they have to pull themselves out
    of the doghouse as far as I am concerned. Come on Adobe, prove me wrong!

    Sunday
    27Sep2009

    C4[3]

    C4[3] has just ended, and I am sitting in the hotel lobby right now. It was a great show this year, a lot of interesting talks, I hope everyone had as much fun as I did. I am posting the slides from my Blitz Talk (How to become a compiler engineer in 5 minutes) here (PDF).

    Thursday
    10Sep2009

    Apple has opensourced libdispatch

    Apple has opensourced the libdispatch Mac OS X implementation. This is excellent news, and hopefully we will quickly see ports to other platforms. This is the last core piece of GCD that was proprietary, so it is a very exciting piece of news.