To build Android applications that run smoothly and responsively in these resource-constrained environments, we need to arm ourselves with an understanding of the options available, and how, when, and why to use them—this is the essence of this book.
However, before we do that, we'll briefly consider why we need to concern ourselves at all. We'll see how serious Google is about the efficiency of the platform, explore the Android process model and its implications for programmers and end users, and examine some of the measures that the Android team have put in place to protect users from apps that behave badly.
To conclude, we'll discuss the general approach used throughout the rest of the book to keep applications responsive using asynchronous programming and concurrency, and its associated challenges and benefits.
In this chapter, we will cover the following topics:
Introducing the Dalvik Virtual Machine
Memory sharing and the Zygote
Understanding the Android thread model
The main thread
Unresponsive apps and the ANR dialog
Concurrency in Android
Android applications are typically programmed using the Java language, but the virtual machines in the Android stack are not instances of the Java Virtual Machine (JVM). Instead, the Java source is compiled to Java byte-code and translated into a Dalvik executable file (DEX) for execution on a Dalvik Virtual Machine (DVM).
It is no accident that Google chose Java as the primary language, allowing a vast pool of developer talent to quickly get to work on building apps, but why not simply run Android applications directly on a JVM?
Dalvik was created specifically for Android, and as such, was designed to operate in environments where memory, processor, and electrical power are limited, such as mobile devices. Satisfying these design constraints resulted in a very different virtual machine from the typical JVMs that we know from desktop and server environments.
Dalvik goes to great lengths to improve on the efficiency of the JVM, applying a range of optimizations to simplify and speed up interpretation and to reduce the memory footprint of a running program. The most fundamental difference between the two VM architectures is that the JVM is a stack-based machine, whereas the DVM is register-based.
A stack-based virtual machine must copy values onto the operand stack before it can manipulate them. In contrast, a register-based VM operates directly on virtual registers. This increases the average size of individual instructions, because each must specify which registers to use, but reduces the total number of instructions that must be executed to achieve the same result.
Dalvik's creators claim that the net result is in Dalvik's favour and that the DVM is on average around 30 percent more efficient than the JVM. Clearly, Google has gone to great lengths to squeeze every last drop of performance out of each mobile device to help developers build responsive applications!
A special process called the Zygote is launched when Android initially boots. The Zygote starts up a virtual machine, preloads the core libraries, and initializes various shared structures. It then waits for instructions by listening on a socket.
When a new Android application is launched, the Zygote receives a command to create a virtual machine to run the application on. It does this by forking its prewarmed VM process and creating a new child process that shares its memory with the parent, using a technique called Copy-On-Write. This has some fantastic benefits:
First, the virtual machine and core libraries are already loaded into the memory. Not having to read this significant chunk of data to initialize the virtual machine drastically reduces the startup overhead.
Second, the memory in which these core libraries and common structures reside is shared by the Zygote with all other applications, resulting in saving a lot of memory when the user is running multiple apps.
Each forked application process runs independently and is scheduled frequent, small amounts of CPU time by the operating system. This time-slicing approach means that even a single-processor device can appear to be actively working in more than one application at the same time, when in fact, each application is taking very short turns on the CPU.
Within a process, there may be many threads of execution. Each thread is a separate sequential flow of control within the overall program—it executes its instructions in order, one after the other. These threads are also allocated slices of CPU time by the operating system.
While the application process is started by the system and prevented from directly interfering with data in the memory address space of other processes, threads may be started by application code and can communicate and share data with other threads within the same process.
Within each DVM process, the system starts a number of threads to perform important duties such as garbage collection, but of particular importance to application developers is the single thread of execution known as the main or UI thread. By default, any code that we write in our applications will be executed by the main thread.
For example, when we write code in an onCreate method of the Activity class, it will be executed on the main thread. Likewise, when we attach listeners to user-interface components to handle taps and other user-input gestures, the listener callback executes on the main thread.
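To make this concrete, here is a minimal, non-runnable sketch (the class name, layout, and view IDs are illustrative, not from the book); every line in both callbacks executes on the main thread:

```java
public class MainActivity extends Activity {

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);      // invoked on the main thread
        setContentView(R.layout.activity_main);  // layout id is illustrative

        Button button = (Button) findViewById(R.id.my_button);
        button.setOnClickListener(new View.OnClickListener() {
            @Override
            public void onClick(View view) {
                // Also invoked on the main thread, once per tap.
            }
        });
    }
}
```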
For apps that do little I/O or processing, this single thread model is fine. However, if we need to do CPU-intensive calculations, read or write files from permanent storage, or talk to a web service, any further events that arrive while we're doing this work will be blocked until we're finished.
An app that doesn't respond quickly to user interaction will feel unresponsive—anything more than a couple of hundred milliseconds delay is noticeable. This is such a pernicious problem that the Android platform protects users from applications that do too much on the main thread.
Android works hard to synchronize user interface redraws with the hardware refresh rate. This means that it aims to redraw at 60 frames per second—that's just 16.67 ms per frame. If we do work on the main thread that takes anywhere near 16 ms, we risk affecting the frame rate, resulting in jank—stuttering animations, jerky scrolling, and so on.
At API level 16, Android introduced a new entity, the Choreographer, to oversee timing issues. It will start issuing dropped-frame warnings in the log if you drop more than 30 consecutive frames.
Ideally, of course, we don't want to drop a single frame. Jank, unresponsiveness, and especially the ANR (Application Not Responding) dialog offer a very poor user experience and translate into bad reviews and an unpopular application. A rule to live by when building Android applications is: do not block the main thread!
Android provides a helpful strict mode setting in Developer Options on each device, which will flash the screen when applications perform long-running operations on the main thread.
Further protection was added to the platform in Honeycomb (API level 11) with the introduction of NetworkOnMainThreadException, a subclass of RuntimeException that is thrown if the system detects network activity initiated on the main thread.
Ideally then, we want to offload long-running operations from the main thread so that they can be handled in the background, and the main thread can continue to process user interface updates smoothly and respond in a timely fashion to user interaction.
For this to be useful, we must be able to coordinate work and safely pass data between cooperating threads—especially between background threads and the main thread.
We also want to execute many background tasks at the same time and take advantage of additional CPU cores to churn through heavy processing tasks quickly.
Higher-level mechanisms introduced in Java 5's java.util.concurrent package, such as Executors, atomic wrapper classes, locking constructs, and concurrent collections, are also available for use in Android applications.
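As a brief illustration of one of these mechanisms, the following plain-Java sketch (the pool size, task split, and names are illustrative choices, not from the book) uses an ExecutorService to divide a computation across a fixed pool of threads:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ExecutorDemo {
    // Splits the sum 1..100 into ten Callable tasks and runs them
    // on a fixed pool of four threads.
    static int parallelSum() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            List<Future<Integer>> results = new ArrayList<Future<Integer>>();
            for (int i = 0; i < 10; i++) {
                final int start = i * 10 + 1;  // 1, 11, 21, ..., 91
                results.add(pool.submit(new Callable<Integer>() {
                    @Override
                    public Integer call() {
                        int sum = 0;
                        for (int n = start; n < start + 10; n++) {
                            sum += n;
                        }
                        return sum;
                    }
                }));
            }
            int total = 0;
            for (Future<Integer> f : results) {
                total += f.get();  // blocks until that task completes
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(parallelSum());  // prints 5050
    }
}
```

The pool reuses its four threads across all ten tasks, which is exactly the kind of managed worker-thread arrangement the rest of the book builds upon.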
We can start new threads of execution in our Android applications just as we would in any other Java application, and the operating system will schedule some CPU time for those threads.
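For instance, here is a minimal runnable sketch of starting a plain Java thread (the thread and method names are illustrative):

```java
public class ThreadDemo {
    // Starts a named background thread, waits for it to finish, and
    // returns the thread name the worker observed while running.
    static String runWorker() throws InterruptedException {
        final String[] observed = new String[1];
        Thread worker = new Thread(new Runnable() {
            @Override
            public void run() {
                // This body executes on the new thread, which the
                // operating system schedules alongside the caller.
                observed[0] = Thread.currentThread().getName();
            }
        }, "background-worker");
        worker.start();
        worker.join();  // block the calling thread until the worker finishes
        return observed[0];
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("worker ran as: " + runWorker());
    }
}
```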
While starting new threads is easy, concurrency is actually a very difficult thing to do well. Concurrent software faces many issues that fall into two broad categories: correctness (producing consistent and correct results) and liveness (making progress towards completion).
A common example of a correctness problem occurs when two threads need to modify the value of the same variable based on its current value. Let's imagine that we have an integer variable myInt with the current value of 2.

In order to increment myInt, we first need to read its current value and then add 1 to it. In a single-threaded world, two increments would happen in strict sequence: we read the initial value 2, add 1 to it, set the new value back to the variable, and then repeat the sequence. After the two increments, myInt holds the value 4.
In a multithreaded environment, we run into potential timing issues. It is possible that two threads trying to increment the variable would both read the same initial value (2), add 1 to it, and set the result (in both cases, 3) back to the variable.
Both threads have behaved correctly in their localized view of the world, but in terms of the overall program, we clearly have a correctness problem; incrementing 2 twice should yield 4, not 3! This kind of timing issue is known as a race condition.
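The lost-update scenario described above is easy to reproduce. In this runnable sketch (the iteration counts are illustrative), two threads each attempt 100,000 unsynchronized increments; because the read-add-write sequence is not atomic, updates can be lost and the final value usually falls short of 200,000:

```java
public class RaceDemo {
    static int myInt = 0;  // shared, with no synchronization

    // Two threads race to increment the shared variable; interleaved
    // reads can observe the same initial value, losing updates.
    static int raceyIncrements() throws InterruptedException {
        myInt = 0;
        Runnable task = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < 100000; i++) {
                    myInt = myInt + 1;  // read, add 1, write back: not atomic
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return myInt;
    }

    public static void main(String[] args) throws InterruptedException {
        // Typically prints a value below 200000, varying from run to run.
        System.out.println("final value: " + raceyIncrements());
    }
}
```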
A common solution to correctness problems such as race conditions is mutual exclusion—preventing multiple threads from accessing certain resources at the same time. Typically, this is achieved by ensuring that threads acquire an exclusive lock before reading or updating shared data.
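In Java, the simplest form of such a lock is a synchronized block. This runnable sketch (names and counts are illustrative) guards the read-add-write sequence so that the two threads can no longer interleave within it:

```java
public class SafeCounter {
    private int myInt = 0;
    private final Object lock = new Object();

    void increment() {
        synchronized (lock) {
            // Only one thread at a time may execute this block, so the
            // read-add-write sequence is atomic with respect to the lock.
            myInt = myInt + 1;
        }
    }

    int value() {
        synchronized (lock) {
            return myInt;
        }
    }

    // With the lock in place, two threads of 100,000 increments each
    // always total exactly 200,000.
    static int guardedIncrements() throws InterruptedException {
        final SafeCounter counter = new SafeCounter();
        Runnable task = new Runnable() {
            @Override
            public void run() {
                for (int i = 0; i < 100000; i++) {
                    counter.increment();
                }
            }
        };
        Thread t1 = new Thread(task);
        Thread t2 = new Thread(task);
        t1.start();
        t2.start();
        t1.join();
        t2.join();
        return counter.value();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(guardedIncrements());  // always prints 200000
    }
}
```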
Liveness can be thought of as the ability of the application to do useful work and make progress towards goals. Liveness problems tend to be an unfortunate side effect of the solution to correctness problems. By locking access to data or system resources, it is possible to create bottlenecks where many threads are contending for access to a single lock, leading to potentially significant delays.
Worse, where multiple locks are used, it is possible to create a situation where no thread can make progress because each requires exclusive access to a lock that another thread currently owns—a situation known as a deadlock.
Android applications are typically composed of one or more subclasses of Activity. Each Activity instance has a very well-defined lifecycle that the system manages through the execution of lifecycle method callbacks, all of which are executed on the main thread.

An Activity instance that has finished should be eligible for garbage collection, but background threads that refer to the Activity or part of its view hierarchy can prevent garbage collection and create a memory leak.
Similarly, it is easy to waste CPU cycles (and battery life) by continuing to do background work when the result can never be displayed because the Activity has finished.
Finally, the Android platform is free at any time to kill processes that are not the user's current focus. This means that if we have long-running operations to complete, we need some way of letting the system know not to kill our process yet!
All of this complicates the do-not-block-the-main-thread rule because we need to worry about canceling background work in a timely fashion or decoupling it from the Activity lifecycle where appropriate.
A further complication is that the user-interface toolkit is not thread-safe, that is, accessing it from multiple threads may cause correctness problems. In fact, the user-interface toolkit protects itself from potential problems by actively denying access to user-interface components from threads other than the one that originally created those components.
The final challenge then lies in safely synchronizing background threads with the main thread so that the main thread can update the user interface with the results of the background work.
There are constructs that allow us to defer tasks to run later on the main thread, to communicate easily between cooperating threads, and to issue work to managed pools of worker threads and re-integrate the results.
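One such construct is a Handler attached to the main thread's message loop. The non-runnable sketch below (the class, field, and method names are all illustrative, not from the book) shows a background thread doing slow work and then posting a Runnable back to the main thread before touching a view:

```java
public class DownloadActivity extends Activity {

    private TextView textView;  // assigned in onCreate (omitted here)

    // A Handler bound to the main thread's Looper delivers posted
    // Runnables on the main thread.
    private final Handler mainHandler = new Handler(Looper.getMainLooper());

    private void loadInBackground() {
        new Thread(new Runnable() {
            @Override
            public void run() {
                final String result = doSlowWork();  // off the main thread
                mainHandler.post(new Runnable() {
                    @Override
                    public void run() {
                        // Back on the main thread: safe to touch views here.
                        textView.setText(result);
                    }
                });
            }
        }).start();
    }

    private String doSlowWork() {
        return "done";  // stands in for I/O or heavy computation
    }
}
```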
There are solutions to the constraints of the Activity lifecycle, both for medium-term operations that closely involve the user interface and for longer-term work that must be completed even if the user leaves the application.
While some of these constructs were only introduced with newer releases of the Android platform, all are available through the support libraries and, with a few exceptions, the examples in this book target devices that run API level 7 (Android 2.1) and above.
The rest of this book discusses these Android-specific constructs and their usage and applications.
In this chapter, we learned that Google takes the efficiency of the Android platform very seriously. We also looked at the extraordinary lengths they go to in order to ensure a smooth user experience, evidencing the importance of building responsive applications.
We discussed the Android thread model and the measures that the platform may take to protect the user from apps that misbehave or are not sufficiently responsive.
Finally, we gained an overview of the general approach to building responsive apps through concurrency, and learned some of the issues faced by developers of concurrent software in general and Android applications in particular.
In the next chapter, we'll start to build responsive applications by applying the AsyncTask class to execute work in the background using pools of threads, and to return progress updates and results to the main thread.