    Entries in iPhone (2)

    Friday
    Apr 23, 2010

    iPhone, iPad, Security, and Privacy. Oh my!

    Intro

    Plenty of people are debating the significance of the iPad. Opinions range from those who believe it represents the future of computing, to those who think it is just a toy or a diversion. There are lots of aspects to these new types of devices, and any discussion about their impact would necessarily have to look at a lot of different issues.

    Personally, I find the iPad very interesting for many reasons, though most of them are probably different from what excites the average consumer. One of the aspects I find most exciting is that the iPad (and iPhone) represent a new platform that has been designed from the ground up in such a way that it can avoid a number of the security problems that have plagued computers in the past, problems that cannot be fixed because legacy operating systems need to support legacy applications that cannot be made to work securely. This is a key advantage of these new platforms, but it is also one that could easily evaporate if Apple is not careful as it designs and implements new APIs.

    In this post I am going to walk through a brief history of computer security, explain why iPhone OS can be more secure than Mac OS X or Windows, describe some API flaws in iPhone OS 2.x/3.x that reduce user security, and explain potential exploit vectors in iPhone OS and ways they can be fixed. Finally, I will take a quick look at some of the features announced at the iPhone OS 4.0 event, and their potential security and privacy implications, though I will not discuss their actual APIs or any specific analysis I have done.

    It should be noted that I do not work, and have never worked, professionally in software security. Having said that, everyone who writes software needs to be concerned with these sorts of issues and have some expertise with them, especially people who deal with any potentially sensitive data. It should probably also be noted that most of the issues I am about to raise are not security issues in the traditional sense (they are technically privacy issues), but to the end user they are the same thing. It doesn't matter whether a nefarious application gets personal information by exploiting a bug, or because the system was designed to let applications get that data.

    In the beginning

    When the first computers were built, security was a non-issue. Early computers took teams of people to operate, could only input and output through switches and lights (and slightly later terminals), could only run a single program at a time, and that program was generally written by the people operating the machine.

    Over time the machines were enhanced, but for decades they were very expensive. As any business person will tell you, if you have an expensive fixed asset sitting idle you are wasting money, so people needed to find ways to keep those machines utilized. The first common way to do this was through batch processing. With batch processing you would write a program, and send it to the computer. When the computer finished one program the operators would take the next program from the queue and run it immediately. This had the benefit of keeping the computer busy, but it had some pretty substantial downsides as well. The most obvious one (in hindsight) was that it was impossible to deal with the computer interactively. If your program had a bug, you couldn't just fix your program while it was on the computer and rerun it. You would wait until you got the batch results, fix the bug, put it back in the queue, and wait another day until it was scheduled to run again.

    The benefits of interactive computing were pretty obvious once it was technologically feasible, but that still didn't solve the cost issues. A new way of sharing computers had to be invented: time-sharing. In a time-sharing system, the computer runs multiple programs at once, and keeps switching between them. If a program isn't using its timeslice (because it is waiting for the user to type something, for instance), it can yield its time to other programs. One of the first such systems was MIT CTSS. CTSS was arguably the first OS to resemble what we now call a modern OS, and among other significant achievements, it hosted the first known electronic mail implementation and the first interactive shell, and was the first system to run background-only daemon processes and the first to use a virtual machine to support legacy applications. In fact, you can track iPhone OS's lineage back to it: iPhone OS -> Mac OS X -> BSD Unix -> AT&T Unix -> Multics -> CTSS.

    One of the other things that became obvious once time-sharing came into being was that it changed fundamental assumptions about how computers, and the programs they ran, worked. In order to even demo the first time-sharing code, the IBM 704 at MIT required hardware changes to support interrupts. Before CTSS was implemented several other modifications were made to allow for memory relocation (early virtual memory), and memory protection. This changed what had previously been invariant assumptions about the environment in which a program ran, and resulted in CTSS having a slightly more dubious distinction. CTSS was the first operating system to have a known software security issue.

    Enter the sandbox

    Until that point, all software had been written assuming it was the only software running, and programmers had never had to consider the issues involved when multiple programs ran. It was clear that some sort of mechanism to isolate users would be advantageous, but since the segmentation and protection mechanisms were retrofitted into an existing piece of hardware they were not necessarily as flexible or well thought out as one would hope.

    The relocation and protection features IBM added to the 704 allowed CTSS to implement the first ever sandbox. A sandbox is a small virtual world within a computer. The important thing to understand about a sandbox is that once something is in a sandbox, it cannot get out of the sandbox (absent a bug or design flaw in the OS). Unfortunately, since those features were retrofitted into an existing (deployed!) piece of hardware they were not as flexible or well thought out as they might have been, which meant there were ways in which to breach the sandbox. Despite those design flaws, the security issue that occurred was not due to breaching a sandbox. It was instead due to a race condition that existed because of the assumption that only a single program would run at a time, and an administrative decision allowing two users to play in the same sandbox.

    What happened was that there was a single system account that multiple people needed to log in to in order to do system maintenance. When two admins both logged in, they were both inside the system sandbox. In addition to that, the editor software assumed only one copy of itself could be running at a time, so it stored its temporary data in a fixed location. One day two people were logged in doing two unrelated system tasks: one was editing the message of the day, the other was modifying user account information by editing the system password file. Since both of their editors were using the same temp file, the system password file ended up being written out as the message of the day, and everyone who logged in could see everyone else's passwords. This incident also resulted in the idea of using hashed passwords.
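
    The collision described above can be simulated in a few lines. This is only a sketch: the dictionary stands in for the file system, and the names (FIXED_SCRATCH, open_for_edit, save) and the sample file contents are invented for illustration, not taken from CTSS.

```python
# Hypothetical simulation of two editor sessions sharing one scratch file.
FIXED_SCRATCH = "scratch"  # every editor instance uses this same name


def open_for_edit(fs, path):
    """Copy the file being edited into the shared scratch slot."""
    fs[FIXED_SCRATCH] = fs[path]


def save(fs, path):
    """Write whatever is currently in the scratch slot back to path."""
    fs[path] = fs[FIXED_SCRATCH]


fs = {
    "motd": "Welcome to the system",
    "passwords": "alice:hunter2\nbob:swordfish",
}

open_for_edit(fs, "motd")       # admin A starts editing the message of the day
open_for_edit(fs, "passwords")  # admin B starts editing the password file
save(fs, "motd")                # admin A saves, but the scratch slot holds B's data

leaked = fs["motd"] == fs["passwords"]  # the motd now contains the password file
```

    Giving each session its own scratch name would be enough to avoid this particular collision, which is exactly the assumption the original editor never had to make.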

    Evolution

    After CTSS, a number of time-sharing systems came into existence, but given iPhone OS's lineage, it is best to look at how Unix handled this situation. Unix was an OS inspired by Multics. By the time it came into existence hardware had evolved enough to support isolating user processes in a fairly robust manner. The world was still pretty different back then. Computers were still expensive, they still tended to have dedicated administrators, and the amount of software available for them was still pretty small compared to today.

    While the notion of a hostile user certainly existed, most of the software you were running was likely to have been provided as part of the AT&T Unix distribution (and later the BSD Unix distribution), or written in house. Networking still wasn't prevalent. As a consequence, when Unix was designed the goal was to sandbox off the users from each other, but no effort was made to isolate any of a user's data from a program they themselves ran. This is completely different from the environment of the average user today, where most systems only have one user accessing them at a time, and are running lots of third party code of unknown provenance.

    Mac OS X is a modern version of Unix, and as such it inherits Unix's basic security design. The result is we have a system which has great support for isolating things into separate sandboxes, but for the most part everything runs in a single sandbox (the user's account). The unfortunate thing is that most programs need to run in the same sandbox, because the way the Cocoa APIs are designed they won't function properly if they do not. In many cases the necessary API changes to secure things are difficult and incompatible with existing applications. Since Apple has no way to analyze all existing software, force developers to update their apps, or provide compatibility without opening up the exact same holes they would want to close, they can't fix them without long, difficult transitions.

    The situation on the iPhone is far better, but there are still some very serious issues. iPhone OS shares much of OS X's code, but it is a new platform that does not need to maintain compatibility with all the software on Mac OS X that depends on now invalid security assumptions. iPhone OS can fix them without the pain of breaking old software, but if Apple is not careful it can just as easily make the same mistakes again.

    The CoreLocation and Address Book APIs

    So let's look at two APIs available on iPhone, the CoreLocation API and the Address Book API, as examples of where iPhone has gotten its security right and where it has screwed it up, respectively.

    CoreLocation

    There are obvious privacy concerns with applications being able to determine a user's location. Apple handles this from a user perspective by asking the user if an application should be allowed to access their current location. What happens under the hood is that if CoreLocation access is approved by the user, CoreLocation signals over to a background service (running in its own sandbox) asking for the location information. That service checks the system preference, confirms that the app asking for the information is allowed to have it, and sends it back.

    That is not to say that CoreLocation was perfect in iPhone OS 2.0. While the basic design was right, it turned out that Apple actually cached location data inside the application's sandbox in a way that allowed it to be accessed using private APIs. As a result, it was possible to get a (sometimes stale) location without the user knowing. Apple stopped caching the data, and now all CoreLocation access is gated by messaging the background service. Fixing this particular bug did not break any apps that were using the CoreLocation API, since the API has always been able to return an error saying there was no access, so all apps using CoreLocation have always had to deal with the fact that sometimes the user does not give them access. Obviously the apps using the private APIs broke, but that is a separate issue.
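
    The brokered design described above can be sketched roughly as follows. This is not Apple's implementation: LocationService, LocationDenied, lookup, the bundle identifiers and the coordinates are all hypothetical names and values chosen for illustration. The point is only that the data and the permission check live outside the app, and that denial is an ordinary result the app has to handle.

```python
class LocationDenied(Exception):
    """Raised when the user has not approved location access for an app."""


class LocationService:
    """Stands in for the background service running in its own sandbox."""

    def __init__(self, approved_apps, current_fix):
        self._approved = set(approved_apps)  # the user's per-app preference
        self._fix = current_fix              # never cached inside an app's sandbox

    def request_location(self, app_id):
        # Every request is checked against the preference before data is returned.
        if app_id not in self._approved:
            raise LocationDenied(app_id)
        return self._fix


service = LocationService({"com.example.maps"}, (37.33, -122.03))


def lookup(app_id):
    """What an app-side API call might look like: denial is a normal outcome."""
    try:
        return service.request_location(app_id)
    except LocationDenied:
        return None
```

    Because the error path has been part of the contract from the start, tightening the service later does not break well-behaved callers.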

    In iPhone OS 4.0 this is being greatly improved, so much so that the improvements were demoed in the keynote. In 4.0 there will be a status bar indicator showing that an app has recently used CoreLocation, and it will be possible to look at the list of apps to see what has used your location in the last day, and turn their access on and off. All of this works without changes to the existing APIs, so all existing apps that use CoreLocation will be affected, resulting in much better security of the user's location information, and the ability to notice and identify if something is using the data without your consent. For more details people can watch the event stream.

    AddressBook

    Now let's look at the address book. The AddressBook API is basically the opposite of CoreLocation, in that Apple completely botched the security of the API, and the implementation, on multiple levels. The mistakes are so deeply ingrained that they cannot be fixed in any meaningful way without breaking a ton of apps. For that reason, the only viable options are to leave it broken, or fix it incrementally over several releases.

    The most obvious failure is that there are no controls on what can access your address book. Any app can read the address book, and there is no visible indication to the user, no log of the events, and no way to turn it off. In addition to that, all software currently using the address book has an implicit presumption it can read and write to the address book. This has actually caused fairly public incidents, such as the original Aurora Feint incident. In that case it was noticed by people monitoring network traffic of the app after it had been approved for the store, and the app wasn't actually trying to do anything malicious. An app actually trying to steal that data could encrypt its transmissions, and hide what it is doing in such a way that Apple would not notice it during the approval process. Fixing the API is not simple, because if the existing API were changed such that it could return an error most apps would be unable to cope with it, and either crash or behave unexpectedly. An option might be for the OS to return an empty address book to such apps, but that could cause some serious problems for users as well.

    Even ignoring the API related issues, the infrastructure for restricting access to the address book doesn't exist. Rather than keeping the address book database in its own sandbox with a daemon to arbitrate access, Apple exposes the address book database (/var/mobile/Library/AddressBook/AddressBook.sqlitedb) into all sandboxes and the API they provide directly accesses it. That sounds similar to the above story about CTSS and the motd, though we have 40 years more experience and the address book handles concurrent access to the database just fine. In an environment with no malicious apps that would be fine, but that is not the environment we live in. By exposing the address book database in this way not only has Apple introduced a way for applications to read all of the address book data without going through the AddressBook API, they have also exposed an attack vector for applications to insert poisoned address data that can be synced back to other devices, or even attack other applications inside their own sandboxes.

    By carefully constructing a malformed address book database file it may be possible to exploit bugs in either sqlite3, the AddressBook API, or the code of a specific application in order to corrupt its stack. At that point it is possible to run arbitrary code using a return-to-libc payload, despite countermeasures like a non-executable stack. All the app has to do is overwrite the user's address book database with the new one, then the next time the user runs a targeted app that accesses the database the exploit can occur. To be fair, sqlite3 is heavily vetted against malformed files, but that is no guarantee that this is not possible, and by directly accessing the database Apple has added a potential exploit vector that just doesn't need to be there.

    What I want to see in the future

    Well, the first thing I want is for Apple to expose the same sort of UI for managing AddressBook as exists for CoreLocation. I want notification when things are accessing it (especially when apps are writing to it). Furthermore, I want it to be possible to turn off an app's access to the address book, despite the potential for current apps to break, since I would rather risk the app crashing than it stealing information or inserting information. I want the API expanded to support read only and read/write access. As a compatibility measure Apple should default to permissive access to the address book for 4.0 using the existing APIs (that way no apps would break unless the user explicitly denied access). In addition, Apple should start testing all 4.0 apps submitted to make sure they operate correctly when address book access is denied. That way, when 5.0 comes out all apps submitted in the previous year will work correctly if they change the default to asking the user for access and the user tends to deny access.
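
    To make that wish list concrete, here is a minimal sketch of an arbitrating service with separate read and read/write levels, a permissive default, per-app overrides, and an access log. Everything here (Access, AddressBookBroker, the app identifiers, the contact names) is hypothetical and only illustrates the shape of the idea, not any existing API.

```python
from enum import Enum


class Access(Enum):
    NONE = 0
    READ = 1
    READ_WRITE = 2


class AddressBookBroker:
    """Hypothetical daemon that owns the contacts and arbitrates every access."""

    def __init__(self, contacts, default=Access.READ_WRITE):
        self._contacts = list(contacts)
        self._default = default  # permissive default keeps existing apps working
        self._overrides = {}     # per-app choices made by the user
        self.log = []            # (app_id, operation, allowed), for later review

    def set_access(self, app_id, level):
        self._overrides[app_id] = level

    def _check(self, app_id, operation, needed):
        level = self._overrides.get(app_id, self._default)
        allowed = level.value >= needed.value
        self.log.append((app_id, operation, allowed))
        if not allowed:
            raise PermissionError(f"{app_id} may not {operation} the address book")

    def read(self, app_id):
        self._check(app_id, "read", Access.READ)
        return list(self._contacts)  # hand back a copy, never the store itself

    def add(self, app_id, contact):
        self._check(app_id, "write", Access.READ_WRITE)
        self._contacts.append(contact)


broker = AddressBookBroker(["Alice", "Bob"])
broker.set_access("com.example.game", Access.READ)  # the user restricts one app

names = broker.read("com.example.game")        # allowed: read-only is enough
try:
    broker.add("com.example.game", "Mallory")  # denied: needs read/write
    write_blocked = False
except PermissionError:
    write_blocked = True
```

    Flipping the default from READ_WRITE to NONE is then a one-line policy change, which is roughly the 4.0 to 5.0 transition suggested above.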

    I also think there is room for some significant improvements to the UI features added for CoreLocation privacy in 4.0. Aside from the basic UI for notifying users about access, the system should log accesses somewhere that is accessible via iPhone Configuration Utility. Also, if Apple adds the ability to control address book read and write access there may be a need to rethink how to lay out the UI in the Settings app, as having separate table views for Push Notifications, Location, Address Book (read), and Address Book (read/write) is excessive and unintuitive. It would probably be better to have a single app security table where each app showed a synopsis of what it was trying to use, and could be expanded in order to inspect the entries more closely and edit them. Ultimately I trust Apple's ability to create a UI that can handle presenting the necessary information to the user; that is the sort of thing they excel at.

    While I am at it, I also want applications to state up front what they want access to, rather than asking me as they access them. This could be handled by embedding it as metadata in the application's plist. That way the first time an application attempts to access anything that requires permission the OS could bring up an interface for approving everything at once (instead of multiple dialogs as accesses to different services happen).
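
    As a rough illustration of the plist idea, the sketch below parses a declaration and works out what a single combined prompt would need to ask about. The RequestedUserData key and its values are invented for this example; no such key exists in any shipping Info.plist schema that I know of.

```python
import plistlib

# Hypothetical plist fragment; the RequestedUserData key is an assumption.
INFO_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<plist version="1.0">
<dict>
    <key>CFBundleIdentifier</key>
    <string>com.example.game</string>
    <key>RequestedUserData</key>
    <array>
        <string>Location</string>
        <string>AddressBookRead</string>
    </array>
</dict>
</plist>
"""


def pending_requests(plist_bytes, already_decided):
    """Return the declared data types the user has not yet been asked about."""
    info = plistlib.loads(plist_bytes)
    declared = info.get("RequestedUserData", [])
    return [item for item in declared if item not in already_decided]


# First launch: everything the app declared goes into one approval dialog.
first_prompt = pending_requests(INFO_PLIST, already_decided=set())

# Later launches: nothing is left to ask about.
later_prompt = pending_requests(
    INFO_PLIST, already_decided={"Location", "AddressBookRead"}
)
```

    A declaration like this would also give the store review process something concrete to compare against what the binary actually does.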

    Why did I decide to write this blog post

    I decided to write this blog post because I think iPhone OS is at a crossroads, where it is either going to end up as a much more secure platform than we currently have, or it could end up throwing away all of that progress. This is especially important if the iPhone OS ends up being the basis for most of my computer usage through products like the iPad. Its fundamental sandboxing design allows it to be much more secure, but in order for that to remain the case Apple has to carefully vet every API that allows access to user data, and provide the user ways to control that access. At this point there is a fairly limited amount of data exposed to applications:

    • Location
      • Well secured
    • Address Book
      • Insecure
      • No worse than on Windows or Mac OS X, but it could be better
    • iTunes database metadata
      • Insecure
      • Not particularly sensitive
    • Camera access
      • Currently limited to Apple provided UI (see below)
      • Can be globally turned off via iPCU

    The thing is, Steve announced some very exciting features for 4.0 that have potential privacy concerns. As you can see from my artistic rendition below, there are 3 features I want to call out. They are:

    iPhone 4.0 Privacy Risks

    • Calendar
      • If knowing where I am is a privacy concern, then knowing where I am going to be certainly is.
    • Photo Library Access
      • Apps stealing your photos and uploading them to a server is a very real privacy issue; just wait until apps can read all the photos and movies you have taken without going through the OS's UI.
    • Raw access to the camera data
      • This is a particularly nasty one, since iPhones do not include visible indicators that the camera is turned on. A similar issue has been the subject of a recent lawsuit in the Lower Merion School District. In that sort of case the situation is even worse, because companies and schools can create their own applications under the iPhone Enterprise Program that don't have to conform to the public API or be vetted by Apple, eliminating any safety derived from Apple's approval process.

    It is important that all of these APIs are designed in such a way that they can expose restricted access to apps, since if apps can simply pull all of this data with no restrictions then we have effectively given up a lot of the value of the sandboxes. After all, who cares that apps can't get out of their sandbox when all the user's personal data is right there for them to play with? Apple needs to use their sandbox not just as a way to protect the integrity of the OS on the phone, but also to protect the privacy of the user's data.

    (For any Apple readers, please check out these 6 radars).

    Thursday
    Oct 8, 2009

    Flash on the iPhone

    Update: One of the developers of Trading Stuff posted a comment below, and has a great blog post about his experience getting it running on the iPhone.

    There has been a lot of discussion about running Flash apps on the iPhone over
    the last few days. It was precipitated by Adobe's announcement that Flash
    Professional CS5 would have support for publishing apps as iPhone native
    executables. They went into a little more detail, saying that they were going to
    use an Ahead of Time (AOT) compiler backend based on LLVM, and that
    there are already several apps on the store using it. This generated a large
    number of responses from various people, some knee-jerk, some well
    reasoned out. Of course, the fact that there are samples we can dissect means
    that it is possible to make some informed analysis about them.

    Personal views

    Before we get into the technical details of this, let me go into my background a
    bit, lest I be accused of being biased or having an agenda. I have around 10
    years of Objective C development experience, and almost no experience writing
    anything substantial with Flash or ActionScript. I am also primarily a Macintosh
    user, where the Flash experience (even in the browser) is often less than ideal.
    I am told it is a better experience on Windows. I do use the Hulu Desktop app, which is written in Flash, and think it is pretty nice (though
    they should make cut and paste work in their text fields, grrr). Of course,
    something like Hulu is an immersive app that has no need to integrate with the
    native OS experience, but neither do most games.

    There are a lot of neat little web games and whatnot that are written in Flash
    that I would like to run. If that was the tool the author felt was best to
    express their ideas and it worked for them, then great. In particular, for
    software that I don't feel needs to interface with the OS and which can use a
    completely custom UI (in other words, games) I think there is no difference to
    the user. So long as the environment generates good code that can run at full
    frame rate without killing the battery I am in favor of getting Flash apps to
    run on the iPhone.

    Warm up

    Okay, so first off let's look at what we know. Adobe is using the LLVM ARM
    backend to generate code. Right off the bat that gets me a bit worried. Don't get
    me wrong, I love LLVM. Hell, I was one of the people who wrote
    the LLVM PPC backend. Having said that, I wouldn't use the LLVM ARM backend at
    this time. The reason I wouldn't is that every time I have asked the
    people who are actively working on it they tell me it is not ready for prime
    time. That is the reason why LLVM-gcc and clang are not supported compiler
    targets for iPhone, despite them being supported (and encouraged) for development
    on Snow Leopard. Apple has a lot of compiler engineers and has basically stated
    that LLVM is their future compiler direction, so if those guys are telling us to
    hold back, then how is it good enough for AOT Flash?

    Now, Adobe could potentially have another ARM backend they developed, or maybe
    they branched off a particular build and have fixed whatever bugs would impact
    them while ignoring stuff they didn't need. It is entirely possible that they
    could have reasonable code coming out of this thing, and I don't have access to
    the toolchain itself to inspect it, but it definitely gets me nervous. This may
    also be one of the reasons why they are not ready to widely release the tools
    yet.

    Let's get to it

    Okay, so I downloaded one of the games that was available, Trading Stuff,
    decompressed its IPA and had a look inside. At first glance it looked like a
    pretty normal iPhone app. Then I noticed there were no resources besides a
    basic MainWindow.nib. No images, no sounds, no localizations. The next thing I
    noticed was that the binary was ~13 megabytes, or ~95% the size of the entire
    app. That is enormous for a binary. For reference, compare that to a normal
    iPhone game, like The Oregon Trail, which is a ~106 megabyte game with a
    ~1 megabyte executable, or about 1% the size of the app.

    What is going on is that the Flash build environment is not using any of the
    standard Mac OS X/iPhone OS bundling or localization mechanisms. Instead they
    are transforming all their assets into embeddable objects and shoving them
    directly into their application's TEXT section. At first glance that might not
    seem so bad, but it has a bunch of consequences. It defeats almost any sort of
    caching or prefetch logic the OS has for specific data types (like images), and
    instead places all of the pressure directly on the VM and paging subsystems.

    To be clear, this is in no way a violation of the SDK agreement, and embedding
    objects into an app in this way is occasionally appropriate, but the degree to
    which it is happening with these cross compiled apps is different, and likely
    will have a number of significant (negative) performance implications.

    Now, moving on from the obvious, it's time to actually start poking at the
    generated code. If we just take a cursory look at the linkage, we can see some
    bad stuff going on here. Why are they calling dlopen, dlclose, and dlsym? The
    only reason to use those is to load in frameworks and resolve symbols after
    launch, something that is strictly verboten. In the best case that might be
    some dead code they use for debugging that should have been stripped out. In
    the worst case, they depend on them to get access to symbols they are not
    supposed to use. I want to be clear about this: There is no legitimate reason
    why any app that follows the terms of the SDK agreement should use these
    functions, and I find it shocking that Apple lets apps that link against them
    onto the store at all. I should also note this is not exhaustive, those are
    just the most obvious things, but in my cursory inspection I saw a dozen or so
    Objective-C selectors that I believe are private.
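
    The kind of cursory linkage check described above is easy to automate. The
    sketch below assumes you already have a binary's undefined (imported) symbols
    as a list, for example from `nm -u`; the sample list is made up for
    illustration and is not the actual import table of any app.

```python
# Loader entry points as they appear in a Mach-O symbol table
# (C symbols carry a leading underscore).
RUNTIME_LOADER_SYMBOLS = {"_dlopen", "_dlsym", "_dlclose"}


def flag_runtime_loading(undefined_symbols):
    """Return the runtime loader functions found among a binary's imports."""
    return sorted(RUNTIME_LOADER_SYMBOLS & set(undefined_symbols))


# Hypothetical import list, standing in for the output of `nm -u`.
sample_imports = ["_objc_msgSend", "_dlopen", "_malloc", "_dlsym", "_dlclose"]
flagged = flag_runtime_loading(sample_imports)
```

    A hit here only shows that the binary links against the loader functions, not
    what it does with them.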

    Actually that brings up another point. Despite the app developer doing nothing
    wrong, one of their toolchain or middleware vendors is doing something that could
    be an issue. When I write apps for the store I might choose to play fast and
    loose with something if there is a compelling reason, but if I am providing a
    library for someone else I never do. Using private APIs is putting your
    customers' apps at risk. Not only do I find that unacceptable,
    but I think any vendor who does that is generally irresponsible, and it makes me
    hesitate to use any of their other products because I feel it shows a certain
    casualness about how they treat their customers. If there is a legitimate reason
    you need to do something that risks your customers' products, then as a company
    you need to disclose it so your customers can make an informed decision.

    I should note this is not an intrinsic issue with Flash. I know for a fact
    certain major vendors ship iPhone libraries that call APIs that can get your app
    rejected from the store without informing developers. For instance, various
    analytics companies really shouldn't be poking at private APIs to try to find
    cached location framework data. It isn't just a privacy breach, it places your
    clients' apps at risk.

    So what. How does it run?

    On my iPhone 3G it runs really choppy, on my 3GS it runs acceptably, but it
    still isn't smooth. Given the OpenGL performance people have seen on the 3GS
    that is still pretty bad. I have not done any invasive tests by instrumenting
    the binary, that is just what I can get via basic usage. The sad thing is that
    there is no reason it has to have performance like this. This is not an inherent
    issue with the ActionScript used in this app (though that may have issues), it
    is that what is coming out of the toolchain is a huge, monstrous binary that stresses
    the runtime and has performance characteristics completely different than
    anything the iPhone is currently set up for.

    Also, remember, the slower the frame rate, the more work the phone is doing per
    frame, and the more battery the app is using. When you see an app that can do
    120 FPS in its demo loop, that means that when it is running at 30 FPS it is
    using ~25% of the CPU/GPU resources. When you see one that can only get 20 FPS,
    that means it cannot hit the 30 FPS it would clamp at despite maxing out some
    of the costly (in terms of battery) system resources.
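
    The arithmetic behind those numbers is simple enough to write down. This is a
    back-of-the-envelope sketch that assumes the cost per frame is constant, so an
    app that tops out at max_fps spends 1/max_fps seconds of work on each frame;
    the function names are mine.

```python
def utilization_at(target_fps, max_fps):
    """Fraction of the CPU/GPU budget used when rendering is clamped to target_fps."""
    return min(1.0, target_fps / max_fps)


def achievable_fps(target_fps, max_fps):
    """Frame rate actually delivered when the app tries to clamp at target_fps."""
    return min(target_fps, max_fps)


fast_app = utilization_at(30, 120)  # 0.25: plenty of headroom, less battery
slow_app = utilization_at(30, 20)   # 1.0: fully loaded the whole time
slow_rate = achievable_fps(30, 20)  # 20: still short of the 30 FPS target
```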

     

    Punchline

    Technically speaking, these do appear to basically be within the letter of the
    SDK agreement, modulo the fact that Adobe appears to be making private API
    calls. They should be able to do what they need to without making those calls,
    so ultimately that should be a non-issue.

    Now, the notion that what this thing emits is indistinguishable from
    something Xcode emits is laughable. They are very different, and not in a good
    way. While the apps may get acceptable frame rates on an iPhone 3GS, they don't
    on earlier hardware, and they almost certainly use substantially more
    battery power than native games.

    I want to be excited about this thing, both because it is a seriously cool
    piece of tech, and because there are Flash games I would like to run on my
    phone, but looking at what this thing is spitting out I think the apps it will
    generate will perpetuate the stereotypes about Flash (especially on cell
    phones), and give Objective C programmers a (somewhat misplaced) sense of
    vindication about their views on Flash.

    This is all still in beta, it could end up a lot better than it currently is. It
    could be something that can make some great games available on the iPhone.
    Unfortunately, looking at it right now I am very skeptical, and I think that is
    the right position to have given Flash's performance elsewhere. Yes, this is
    entirely new technology, but it comes from the same company with the same
    priorities. Given the product they have delivered to me on my desktop for the
    last 5 years they don't get the benefit of the doubt, they have to pull
    themselves out of the doghouse as far as I am concerned. Come on Adobe, prove
    me wrong!