Intro
Plenty of people are debating the significance of the iPad. Opinions range from
those who believe it represents the future of computing, to those who think it
is just a toy or a diversion. There are lots of aspects to these new types of
devices, and any discussion about their impact would necessarily look at a
lot of different issues.
Personally, I find the iPad very interesting for many reasons, though most of
them are probably different from what excites the average consumer. One of the
aspects I find most exciting is that the iPad (and iPhone) represent a new
platform that has been designed from the ground up in such a way that it can
avoid a number of the security problems that have plagued computers in the past,
problems that cannot be fixed because legacy operating systems need to support legacy
applications that cannot be made to work securely. This is a key advantage of
these new platforms, but it is also one that could easily evaporate if Apple is
not careful as it designs and implements new APIs.
In this post I am going to walk through a brief history of computer security,
explain why iPhone OS can be more secure than Mac OS X or Windows,
describe some API flaws in iPhone OS 2.x/3.x that reduce user security, and
explain potential exploit vectors in iPhone OS and ways they can be fixed.
Finally, I will take a quick look at some of the features announced at the
iPhone OS 4.0 event, and their potential security and privacy implications,
though I will not discuss their actual APIs or any specific analysis I have done.
It should be noted that I do not work, and have never worked, professionally in
software security. Having said that, everyone who writes software needs to be
concerned with these sorts of issues and have some expertise with them,
especially people who deal with any potentially sensitive data. It should
probably also be noted that most of the issues I am about to raise are not
security issues in the traditional sense (they are technically privacy issues),
but to the end user they are the same thing. It doesn't matter whether a
nefarious application gets personal information by exploiting a bug, or because
the system was designed to let applications get that data.
In the beginning
When the first computers were built, security was a non-issue. Early computers
took teams of people to operate, could only input and output through switches
and lights (and slightly later terminals), could only run a single program at a
time, and that program was generally written by the people operating the
machine.
Over time the machines were enhanced, but for decades they were very expensive.
As any business person will tell you, if you have an expensive fixed asset
sitting idle you are wasting money, so people needed to find ways to keep those
machines utilized. The first common way to do this was through batch processing.
With batch processing you would write a program, and send it to the computer.
When the computer finished one program the operators would take the next program
from the queue and run it immediately. This had the benefit of keeping the
computer busy, but it had some pretty substantial downsides as well. The most
obvious one (in hindsight) was that it was impossible to deal with the computer
interactively. If your program had a bug, you couldn't just fix your program
while it was on the computer and rerun it. You would wait until you got the
batch results, fix the bug, put it back in the queue, and wait another day
until it was scheduled to run again.
The benefits of interactive computing were pretty obvious once it was
technologically feasible, but that still didn't solve the cost issues. A new way
of sharing computers had to be invented: time-sharing. In a
time-sharing system, the computer runs multiple programs at once, and keeps
switching between them. If a program isn't using its timeslice (because it is
waiting for the user to type something, for instance), it can yield its time to
other programs. One of the first such systems was MIT's CTSS. CTSS was
arguably the first OS to resemble what we now call a modern OS, and among other
significant achievements, it hosted the first known electronic mail implementation
and the first interactive shell, was the first system to run background-only
daemon processes, and was the first system to use a virtual machine to support
legacy applications. In fact, you can track iPhone OS's lineage back to it:
iPhone OS -> Mac OS X -> BSD Unix -> AT&T
Unix -> Multics -> CTSS.
One of the other things that became obvious once time-sharing came into being
was that it changed fundamental assumptions about how computers, and the programs
they ran, worked. In order to even demo the first time-sharing code the IBM 704
at MIT required hardware changes to support interrupts. Before CTSS was
implemented several other modifications were made to allow for memory relocation
(early virtual memory), and memory protection. This changed what had previously
been invariant assumptions about the environment in which a program ran, and
resulted in CTSS having a slightly more dubious distinction. CTSS was the first
operating system to have a known software security issue.
Enter the sandbox
Until that point, all software had been written assuming it was the only
software running, and programmers had never had to consider the issues involved
when multiple programs ran. It was clear that some sort of mechanism to isolate
users would be advantageous, but since the segmentation and protection
mechanisms were retrofitted into an existing piece of hardware they were not
necessarily as flexible or well thought out as one would hope.
The relocation and protection features IBM added to the 704 allowed CTSS to
implement the first ever sandbox. A sandbox is a small virtual world
within a computer. The important thing to understand about a sandbox is that
once something is in a sandbox, it cannot get out of the sandbox (absent a bug
or design flaw in the OS). Unfortunately, since those features were retrofitted
into an existing (deployed!) piece of hardware they were not as flexible or well
thought out as they might have been, which meant there were ways in which to
breach the sandbox. Despite those design flaws, the security issue that occurred
was not due to breaching a sandbox. It was instead due to a race
condition that existed because of the assumption that only a
single program would run at a time, and an administrative decision
allowing two users to play in the same sandbox.
What happened was that there was a single system account that multiple people
needed to log in to in order to do system maintenance. When two admins both logged
in, they were both inside the system sandbox. In addition to that, the editor
software assumed only one copy of itself could be running at a time, so it
stored its temporary data in a fixed location. One day two people were logged in
doing two unrelated system tasks: one was editing the message of the
day, the other was modifying user account information by editing the
system password file. Since both of their editors were using the same temp
file, the system password file ended up being written out as the message
of the day, and everyone who logged in could see everyone else's passwords. This
incident also resulted in the idea of using hashed passwords.
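To make the failure mode concrete, here is a minimal Python sketch of the same class of bug. It is not CTSS's editor; the file names, contents, and helper functions are all invented for illustration. The first pair of edits shares one fixed scratch file, and the second pair gives each editing session its own scratch file via `tempfile.mkstemp`.

```python
import os
import tempfile

# Hypothetical stand-in for the editor's single, fixed scratch location.
SCRATCH_DIR = tempfile.mkdtemp()
FIXED_SCRATCH = os.path.join(SCRATCH_DIR, "editor.tmp")


def edit_with_fixed_scratch(contents):
    """Stage an edit in the one shared scratch file (the flawed design)."""
    with open(FIXED_SCRATCH, "w") as f:
        f.write(contents)
    return FIXED_SCRATCH


def edit_with_private_scratch(contents):
    """Stage an edit in a scratch file unique to this editing session."""
    fd, path = tempfile.mkstemp(dir=SCRATCH_DIR, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        f.write(contents)
    return path


def save(scratch_path):
    """Return whatever the scratch file holds at the moment the user saves."""
    with open(scratch_path) as f:
        return f.read()


# Flawed design: the second session overwrites the first session's scratch data.
motd_scratch = edit_with_fixed_scratch("Welcome to the system\n")
passwd_scratch = edit_with_fixed_scratch("alice:secret\nbob:hunter2\n")
leaked_motd = save(motd_scratch)  # the "message of the day" is now the password data

# Safer design: each session has its own scratch file, so nothing is clobbered.
motd_private = edit_with_private_scratch("Welcome to the system\n")
passwd_private = edit_with_private_scratch("alice:secret\nbob:hunter2\n")
safe_motd = save(motd_private)
```

In the flawed half, saving the message of the day publishes the password data because both sessions wrote to the same path. Nothing was "broken into"; the program simply assumed it was alone.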
Evolution
After CTSS, a number of time-sharing systems came into existence, but given
iPhone OS's lineage, it is best to look at how Unix handled this situation. Unix
was an OS inspired by Multics. By the time it came into existence hardware had
evolved enough to support isolating user processes in a fairly robust manner.
The world was still pretty different back then. Computers were still expensive,
they still tended to have dedicated administrators, and the amount of software
available for them was still pretty small compared to today.
While the notion of a hostile user certainly existed, most of the software you
were running was likely to have been provided as part of the AT&T Unix
distribution (and later the BSD Unix distribution), or written in house.
Networking still wasn't prevalent. As a consequence, when Unix was designed the
goal was to sandbox off the users from each other, but no effort was made to isolate
any of a user's data from a program they themselves ran. This is completely
different from the environment of the average user today, where most systems only
have one user accessing them at a time, and are running lots of third party code of
unknown provenance.
Mac OS X is a modern version of Unix, and as such it inherits Unix's basic
security design. The result is we have a system which has great support for
isolating things into separate sandboxes, but for the most part everything runs
in a single sandbox (the user's account). The unfortunate thing is that most
programs need to run in the same sandbox, because the way the Cocoa
APIs are designed they won't function properly if they are not. In many cases
the necessary API changes to secure things are difficult and incompatible with
existing applications. Since Apple has no way to analyze all existing software,
force developers to update their apps, or provide compatibility without opening
up the exact same holes they would want to close, they can't fix them without long,
difficult transitions.
The situation on the iPhone is far better, but there are still some very serious
issues. iPhone OS shares much of OS X's code, but it is a new platform that does
not need to maintain compatibility with all the software on Mac OS X that
depends on now invalid security assumptions. iPhone OS can fix them without the
pain of breaking old software, but if Apple is not careful it can just as easily
make the same mistakes again.
The CoreLocation and Address Book APIs
So let's look at two APIs available on iPhone, the CoreLocation API and the
Address Book API, as examples of where iPhone has gotten its security right and
where it has screwed it up.
CoreLocation
There are obvious privacy concerns with applications being able to determine a
user's location. Apple handles this from a user perspective by asking the user
if an application should be allowed to access their current location. What
happens under the hood is that if CoreLocation access is approved by the user,
CoreLocation signals over to a background service (running in its own sandbox)
asking for the location information. That service checks the system
preference, confirms the app asking for information is allowed to have it, and
sends it back.
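The shape of that design can be sketched in a few lines of Python. This is only a model of the idea, not Apple's implementation: the `LocationService` class, its method names, the app identifiers, and the coordinates are all made up. The point is that the data and the permission check both live in the broker, outside any app's sandbox, so an app can only ever ask.

```python
class LocationService:
    """Hypothetical broker that runs outside every app's sandbox."""

    def __init__(self):
        self._approved = set()                 # app identifiers the user approved
        self._current_fix = (37.33, -122.03)   # made-up coordinates

    def set_user_approval(self, app_id, allowed):
        """Record the user's choice, as the system preference would."""
        if allowed:
            self._approved.add(app_id)
        else:
            self._approved.discard(app_id)

    def request_location(self, app_id):
        """Answer a request only if the user approved the requesting app."""
        if app_id not in self._approved:
            raise PermissionError(app_id + " is not allowed to read location")
        return self._current_fix


service = LocationService()
service.set_user_approval("com.example.maps", True)
approved_fix = service.request_location("com.example.maps")

try:
    service.request_location("com.example.game")
    game_was_denied = False
except PermissionError:
    game_was_denied = True
```

Because a request can always fail, every caller has to handle the denied case from day one, which is what makes it possible to tighten the policy later without breaking apps.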
That is not to say that CoreLocation was perfect in iPhone OS 2.0. While the
basic design was right, it turned out that Apple actually cached location data
inside the application's sandbox in a way that allowed it to be accessed using
private APIs. As a result, it was possible to get a (sometimes stale) location
without the user knowing. Apple stopped caching the data, and now all
CoreLocation access is gated by messaging the background service. Fixing this
particular bug did not break any apps that were using the CoreLocation API,
since the API has always been able to return an error saying there was no
access, so all apps using CoreLocation have always had to deal with the fact
that sometimes the user does not give them access. Obviously the apps using the
private APIs broke, but that is a separate issue.
In iPhone OS 4.0 this is being greatly improved, so much so that the
improvements were demoed in the keynote. In 4.0 there will be a status bar
indicator showing that an app has recently used CoreLocation, it will be possible to
look at the list of apps to see what has used your location in the last day, and
turn on and off their access. All of this works without changes to the existing
APIs, so all existing apps that use CoreLocation will be affected, resulting in much
better security of the user's location information, and the ability to notice and
identify if something is using the data without your consent. For more details
people can watch the event stream.
AddressBook
Now let's look at the address book. The AddressBook API is basically the opposite
of CoreLocation, in that Apple completely botched the security of the API, and
the implementation, on multiple levels. The mistakes are so deeply ingrained
that they cannot be fixed in any meaningful way without breaking a ton of apps. For
that reason, the only viable options are to leave it broken, or fix it
incrementally over several releases.
The most obvious failure is that there are no controls on what can access your
address book. Any app can read the address book, and there is no visible
indication to the user, no log of the events, and no way to turn it off. In
addition to that, all software currently using the address book has an implicit
presumption it can read and write to the address book. This has actually caused
fairly public incidents, such as the original Aurora Feint incident. In that case it was noticed by people monitoring network traffic
of the app after it had been approved for the store, and the app wasn't actually
trying to do anything malicious. An app actually trying to steal that data could
encrypt its transmissions, and hide what it is doing in such a way that Apple
would not notice it during the approval process. Fixing the API is not simple,
because if the existing API were changed such that it could return an error most
apps would be unable to cope with it, and either crash or behave unexpectedly.
An option might be for the OS to return an empty address book to such apps, but
that could cause some serious problems for users as well.
Even ignoring the API related issues, the infrastructure for restricting access
to the address book doesn't exist. Rather than keeping the address book database in
its own sandbox with a daemon to arbitrate access, Apple exposes the address
book database (/var/mobile/Library/AddressBook/AddressBook.sqlitedb) into all
sandboxes and the API they provide directly accesses it. That sounds similar to
the above story about CTSS and the motd, though we have 40 years more experience
and the address book handles concurrent access to the database just fine. In an
environment with no malicious apps that would be fine, but that is not
the environment we live in. By exposing the address book database in this way
not only has Apple introduced a way for applications to read all of the address
book data without going through the AddressBook API, they have also exposed an
attack vector for applications to insert poisoned address data that can be
synced back to other devices, or even attack other applications inside their own
sandboxes.
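Here is a small Python sketch of why a permission check inside the API does not help when the file itself is reachable. Everything in it is a stand-in: the database is a throwaway temp file, and the `contacts` table and function names are invented rather than taken from the real AddressBook schema or API.

```python
import os
import sqlite3
import tempfile

# Stand-in for a contacts database that is visible inside every app's sandbox.
# The file name and schema are invented for this example.
DB_PATH = os.path.join(tempfile.mkdtemp(), "contacts.sqlitedb")


def create_database(path):
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE contacts (name TEXT, phone TEXT)")
    conn.execute("INSERT INTO contacts VALUES ('Alice', '555-0100')")
    conn.commit()
    conn.close()


def raw_read_contacts(path):
    """The bypass: open the file directly, so no permission check ever runs."""
    conn = sqlite3.connect(path)
    rows = conn.execute(
        "SELECT name, phone FROM contacts ORDER BY name"
    ).fetchall()
    conn.close()
    return rows


def api_read_contacts(path, app_allowed):
    """The 'official' route: a library call that honors a permission flag."""
    if not app_allowed:
        raise PermissionError("user denied address book access")
    return raw_read_contacts(path)


def raw_insert_contact(path, name, phone):
    """Direct write access also lets an app plant data for others to read."""
    conn = sqlite3.connect(path)
    conn.execute("INSERT INTO contacts VALUES (?, ?)", (name, phone))
    conn.commit()
    conn.close()


create_database(DB_PATH)
raw_insert_contact(DB_PATH, "Mallory", "555-0199")
contacts_seen_by_any_app = raw_read_contacts(DB_PATH)
```

Even if `api_read_contacts` refuses an app, the same app can call `sqlite3.connect` on the path and get everything, or write whatever it likes. The check only means something if the file lives where apps cannot open it and a separate process answers requests.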
By carefully constructing a malformed address book database file it may be
possible to exploit bugs in either sqlite3, the AddressBook API, or the code
of a specific application in order to corrupt its stack. At that
point it is possible to run arbitrary code using a return-to-libc
payload, despite countermeasures like a non-executable stack. All the app has to do
is overwrite the user's address book database with the new one, then the next time the
user runs a targeted app that accesses the database the exploit can occur. To
be fair, sqlite3 is heavily vetted against malformed files, but that is no
guarantee that this is not possible, and by directly accessing the database
Apple has added a potential exploit vector that just doesn't need to be there.
What I want to see in the future
Well, the first thing I want is for Apple to expose the same sort of UI for
managing AddressBook as exists for CoreLocation. I want notification when things
are accessing it (especially when apps are writing to it). Furthermore, I want
it to be possible to turn off an app's access to the address book, despite the
potential for current apps to break, since I would rather risk the app crashing
than have it steal or insert information. I want the API expanded
to support read only and read/write access. As a compatibility measure Apple
should default to permissive access to the address book for 4.0 using the
existing APIs (that way no apps would break unless the user explicitly denied
access). In addition, Apple should start testing all 4.0 apps submitted to make
sure they operate correctly when address book access is denied. That way, when
5.0 comes out, all apps submitted in the previous year will work correctly if
Apple changes the default to asking the user for access and the user tends to deny
access.
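A rough Python sketch of the policy I have in mind follows. None of these names exist in any Apple API; `Access`, `AddressBookPolicy`, and the app identifiers are hypothetical. It models three access levels, a system-wide default, and per-app overrides set by the user, so the same code covers both the permissive 4.0 default and a stricter default in a later release.

```python
from enum import IntEnum


class Access(IntEnum):
    """Hypothetical access levels for the address book."""
    DENIED = 0
    READ_ONLY = 1
    READ_WRITE = 2


class AddressBookPolicy:
    """Hypothetical per-app policy with a system-wide default."""

    def __init__(self, default):
        self.default = default
        self.overrides = {}   # app identifier -> Access chosen by the user

    def set_override(self, app_id, access):
        self.overrides[app_id] = access

    def access_for(self, app_id):
        return self.overrides.get(app_id, self.default)

    def can_read(self, app_id):
        return self.access_for(app_id) >= Access.READ_ONLY

    def can_write(self, app_id):
        return self.access_for(app_id) == Access.READ_WRITE


# Permissive default: existing apps keep working unless the user says no.
permissive = AddressBookPolicy(default=Access.READ_WRITE)
permissive.set_override("com.example.game", Access.DENIED)
permissive.set_override("com.example.dialer", Access.READ_ONLY)

# Stricter default for a later release: nothing is granted until approved.
strict = AddressBookPolicy(default=Access.DENIED)
strict.set_override("com.example.dialer", Access.READ_ONLY)
```

The only difference between the two releases is the `default` value, which is why the transition can be staged: ship the controls first, then flip the default once apps have had a year to cope with being denied.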
I also think there is room for some significant improvements to the UI features
added for CoreLocation privacy in 4.0. Aside from the basic UI for notifying
users about access, the system should log accesses somewhere that is accessible
via iPhone Configuration Utility. Also, if Apple
adds the ability to control address book read and write access there may be a
need to rethink how to lay out the UI in the Settings app, as having separate
table views for Push Notifications, Location, Address Book (read), and Address
Book (read/write) is excessive and unintuitive. It would probably be better to have
a single app security table where each app showed a synopsis of what it was
trying to use, and could be expanded in order to inspect them more closely and
edit them. Ultimately I trust Apple's abilities to create a UI that can handle
presenting the necessary information to the user; that is the sort of thing they
excel at.
While I am at it, I also want applications to state up front what they want
access to, rather than asking me as they access them. This could be handled by
embedding it as metadata in the application's plist. That way the first time an
application attempts to access anything that requires permission the OS could
bring up an interface for approving everything at once (instead of multiple dialogs as
accesses to different services happen).
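As a sketch of what that metadata could look like, here is a short Python example using the standard `plistlib` module. The `RequestedUserData` key and its values are invented for this post and are not a real Info.plist key; `CFBundleIdentifier` is the usual bundle identifier key. The helper functions are also hypothetical.

```python
import plistlib

# "RequestedUserData" is a hypothetical key invented for this sketch.
HYPOTHETICAL_KEY = "RequestedUserData"

example_plist = plistlib.dumps({
    "CFBundleIdentifier": "com.example.game",
    HYPOTHETICAL_KEY: ["Location", "AddressBookRead"],
})


def requested_user_data(plist_bytes):
    """List the kinds of user data an app declares up front."""
    info = plistlib.loads(plist_bytes)
    return list(info.get(HYPOTHETICAL_KEY, []))


def approval_prompt(plist_bytes):
    """Build one prompt covering everything the app declared, or None."""
    info = plistlib.loads(plist_bytes)
    wanted = requested_user_data(plist_bytes)
    if not wanted:
        return None
    return "{} would like access to: {}".format(
        info["CFBundleIdentifier"], ", ".join(wanted)
    )


prompt = approval_prompt(example_plist)
```

A declaration like this could also be checked at submission time, and the OS could refuse any access the app never declared.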
Why did I decide to write this blog post
I decided to write this blog post because I think iPhone OS is at a crossroads,
where it is either going to end up as a much more secure platform than
we currently have, or it could end up throwing away all of that progress. This
is especially important if the iPhone OS ends up being the basis for most of my
computer usage through products like the iPad. Its fundamental sandboxing design
allows it to be much more secure, but in order for that to remain the case Apple
has to carefully vet every API that allows access to user data, and provide the
user ways to control that access. At this point there is a fairly limited amount
of data exposed to applications:
- Location
- Address Book
  - Insecure
  - No worse than on Windows or Mac OS X, but it could be better
- iTunes database metadata
  - Insecure
  - Not particularly sensitive
- Camera access
  - Currently limited to Apple provided UI (see below)
  - Can be globally turned off via iPCU
The thing is, Steve announced some very exciting features for 4.0 that have
potential privacy concerns. As you can see from my artistic rendition below,
there are 3 features I want to call out. They are:

- Calendar
  - If knowing where I am is a privacy concern, then knowing where I am
    going to be certainly is
- Photo Library Access
  - Apps stealing your photos and uploading them to a server is a very
    real privacy issue; just wait until apps can read all the photos and
    movies you have taken without going through the OS's UI.
- Raw access to the camera data
  - This is a particularly nasty one, since iPhones do not include visible
    indicators that the camera is turned on. A similar issue has been the
    subject of a recent lawsuit in the Lower Merion School
    District. In that sort of case the situation is even worse, because
    companies and schools can create their own applications under the
    iPhone Enterprise Program that don't have to conform to
    the public API or be vetted by Apple, eliminating any safety derived from
    Apple's approval process.
It is important that all of these APIs are designed in such a way that they can
expose restricted access to apps, since if apps can simply pull all of this data
with no restrictions then we have effectively given up a lot of the value of the
sandboxes. After all, who cares that apps can't get out of their sandbox when
all the user's personal data is right there for them to play with? Apple needs
to use their sandbox not just as a way to protect the integrity of the OS on the
phone, but also to protect the privacy of the user's data.
(For any Apple readers, please check out
these 6 radars).