Mastering High Performance with Kotlin

By Igor Kucherenko

About this book

The ease with which we write applications has been increasing, but with it comes the need to address their performance. A balancing act between easily implementing complex applications and keeping their performance optimal is a present-day requirement. In this book, we explore how to achieve this crucial balance while developing and deploying applications with Kotlin.

The book starts by analyzing various Kotlin specifications to identify those that have a potentially adverse effect on performance. Then, we move on to monitoring techniques that enable us to identify performance bottlenecks and optimize performance metrics. Next, we look at techniques that help us to achieve high performance: memory optimization, concurrency, multithreading, scaling, and caching. We also look at fault tolerance solutions and the importance of logging. We'll also cover best practices of Kotlin programming that will help you to improve the quality of your code base.

By the end of the book, you will have gained some insight into various techniques and solutions that will help to create high-performance applications in the Kotlin environment.

Publication date:
June 2018
Publisher
Packt
Pages
316
ISBN
9781788996648

 

Chapter 1. Identifying Performance Bottlenecks

How well does it work? How fast is it? These questions mean essentially the same thing if we're talking about software. Although the question of saving technical resources isn't as relevant as it was in the early years of the computer industry, developers should still be careful about the efficiency of their systems. Even though efficiency starts with hardware, modern computers have such large instruction sets that they can be used in almost any manner.

Engineers spend a lot of time and effort to avoid a drain on the Central Processing Unit (CPU), to save battery life, or to make animation of the user interface smoother. So the question of performance remains relevant today, and software engineers should be careful with system resources.

Before we begin, let's review the topics we will be looking at:

  • Reasons for performance issues
  • Memory model
  • Slow rendering
 

Reasons for performance issues


Performance is a complicated term that can include response time, the speed of data transmission, availability, and utilization of computer resources. First of all, we should remember that we develop software for users, and so we should concentrate on factors that affect their experience.

Different issues can influence overall system performance differently. In one case, we can have a slow rendering speed; in another case, the response time can be slow. Poor performance decreases productivity, damages the loyalty of customers, and costs the software industry millions of dollars annually. So it would be better to identify bottlenecks before they begin to have a negative influence on the user experience.

Today's customers have applications with legacy code that require improved throughput and response times. Java is one of the most popular languages in the world; a lot of server-side applications, mobile applications, and software for SIM cards have been written in it. But Java isn't a modern programming language, and this is the main reason for the appearance of Kotlin, which allows you to write simpler and more reliable code. Since Kotlin compiles to the same bytecode as Java, applications written in these two languages can have the same performance. That's why the question of migrating from Java to Kotlin is relevant nowadays, and developers should be prepared for it. We're going to uncover the main reasons for performance issues that relate to all applications based on the Java Virtual Machine (JVM), and consequently to Kotlin.

Memory management

Memory is one of the essential resources of a computer, and it's essential to manage it properly. Failure to do so can lead to slow performance and bugs such as arithmetic overflow, memory leaks, segmentation faults, and buffer overflows.

The primary purpose of a memory management system is to provide the ability to dynamically allocate the requested size of memory to programs and to release it for reuse when no longer needed. These systems perform management on two levels:

  • Operating-system level
  • Application level

We'll concentrate on the application level because it's the responsibility of an application software developer; the operating-system level is handled by the operating system itself.

There are two types of application-level management systems:

  • Automatic memory management
  • Manual memory management

Manual memory management assumes that the programmer explicitly releases unused memory. It's relevant to languages (still in wide use today) such as C and C++. The JVM uses automatic memory management that involves garbage collection.

Garbage collection

Garbage collection is a strategy for automatically detecting memory allocated to objects that are no longer usable in a program and returning that allocated memory to the pool of free memory locations. All memory management techniques, including garbage collection, take a significant proportion of a program's total processing time and, as a result, can greatly influence performance. With modern, optimized garbage collection algorithms, memory can be released faster than with manual memory management. But depending on the application, the opposite can also be true, and many developers prefer to deallocate memory themselves. One of the biggest advantages that manual memory management has is the ability to reclaim resources before an object is destroyed. This process is referred to as finalization, and we'll touch on it further because it can also be a performance issue.

Memory management is an essential process that's applied to the computer memory. Since the JVM uses automatic memory management with the garbage collection strategy we should know what it is and how it works.

Working principles of the garbage collector

The garbage collection strategy assumes that the developer doesn't explicitly release memory. Instead, the garbage collector (GC) finds objects that aren't being used anymore and destroys them. As GC sees it, there are two types of objects—reachable and unreachable. This principle is based on a set of root objects that are always reachable. An object is a root if it satisfies one of the following criteria:

  • Local variables: They are stored in the stack of a thread. They are created when a method, constructor, or initialization block is entered, and the objects they reference become eligible for the GC once execution exits that scope.
  • Active threads: These are objects that hold other ones from the GC's point of view. So all these objects are a reference tree that will not be destroyed until the thread is terminated.
  • Static variables: They are referenced by instances of the Class type where they're defined. The metadata of classes is kept in the Metaspace section of memory. This makes static variables de facto roots. When a ClassLoader loads and instantiates a new object of the Class type, static variables are created, and they can be destroyed during major garbage collection.
  • Java native interface references: They are references to objects that are held in native code. These objects aren't available to the GC because they can be used outside the JVM environment. These references require manual management in native code. That's why they often become the reason for memory leaks and performance issues.

The following diagram illustrates a simplified schematic of reference trees:

An object is reachable if it's a leaf from a reference tree that's reachable from the root object. If an object is unreachable, then it's available for the GC. Since the GC starts collecting at unpredictable times, it's hard to tell when the memory space will be deallocated.
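The reachable/unreachable split can be observed with a java.lang.ref.WeakReference, which doesn't keep its referent alive. This is only a sketch of the idea, not from the book's source: System.gc() is merely a request, so we deliberately assert nothing about whether the object is actually collected.

```kotlin
import java.lang.ref.WeakReference

// Sketch: while 'strong' (a local variable, hence a root) references the
// object, it's reachable; after dropping the reference, it's eligible for
// collection, but exactly when the GC reclaims it is unpredictable.
fun isStillReachable(): Boolean {
    var strong: Any? = Any()            // reachable through a local variable
    val weak = WeakReference(strong)    // a weak reference doesn't count as a root
    val before = weak.get() != null     // true: the object is strongly reachable
    strong = null                       // drop the only strong reference
    System.gc()                         // only a hint that a collection may run
    return before
}
```

After `System.gc()`, `weak.get()` will likely return null, but the JVM makes no timing guarantee, which is exactly the unpredictability described above.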

To perform garbage collection, the JVM needs to stop the world. This means that the JVM stops all threads except those that are needed for garbage collection. This procedure guarantees that new objects aren't created and that old objects don't suddenly become unreachable while the GC is working. Modern GC implementations do as much work as possible in the background thread. For instance, the mark and sweep algorithm marks reachable objects while the application continues to run, and parallel collectors split a space of the heap into small sections, and a separate thread works on each one.

The memory space is divided into several primary generations—young, old, and permanent. The permanent generation contains static members and the metadata about classes and methods. Newly created objects belong to the young generation; once an object there has no remaining references, it becomes available for minor garbage collection. After this, the surviving objects are moved to the old generation and become available only for major garbage collection.

Impacts of garbage collection

Depending on the algorithm, the performance of garbage collection can depend on the number of objects or on the size of the heap. The GC needs time to detect reachable and unreachable objects. During this step, automatic memory management might lose out to manual memory management, because a developer may already know which objects should be destroyed. After detection, the stop the world event, also known as the GC pause, is invoked: the GC suspends execution of all threads to ensure the integrity of the reference trees.

Heap fragmentation

When the JVM starts, it allocates heap memory from the operating system and then manages that memory. Whenever an application creates a new object, the JVM automatically allocates a block of memory with a size that's big enough to fit the new object on the heap. After sweeping, in most cases, memory becomes fragmented. Memory fragmentation leads to two problems:

  • Allocation operations become more time consuming, because it's hard to find the next free block of sufficient size
  • The unused space between blocks can become so great that the JVM won't be able to create a new object

The following diagram illustrates a fragmented memory heap:

To avoid these problems after each GC cycle, the JVM executes a compaction step. Compacting moves all reachable objects to one end of the heap and, in this way, closes all holes. The heap after compacting looks as follows:

These diagrams show how blocks are located before and after compacting. The drawback is that an application must also be suspended during this process.

Finalization

Finalization is a process of releasing resources. It's executed with a finalizer method that's invoked after an object becomes unreachable, but before its memory is deallocated. Finalization is a non-deterministic process because it's not known when garbage collection will occur, and it might never happen. This is in contrast to a destructor, which is the method used for finalization in languages with manual memory management.

The following diagram illustrates the simplified life cycle of an object:

A destructor, in most languages, is a language-level term that means a method defined in a class by a programmer. A finalizer is an implementation-level term that means a method called by the system during object destruction. Finalizers are needed to perform object-specific operations, cleaning or releasing resources that were used with an object. That's why they're most frequently instance methods.

Finalizers have several drawbacks:

  • It may never be called, or not be called promptly, so a software engineer cannot rely on it to do something important, such as persisting a state or releasing scarce resources.
  • The invoking order of finalizers isn't specified.
  • Garbage collection, and consequently the finalizer, runs when memory is running low, but not when it's time to release other scarce resources. So it's not a good idea to use it to release limited resources.
  • If too much work is performed in one finalizer, another one may start with a delay. And this may increase the total time of the garbage collection pause.
  • A finalizer may cause synchronization issues as well because it can use shared variables.
  • A finalizer in a superclass can also slow down garbage collection in a subclass because it can refer to the same fields.

To implement a finalizer in Java, a developer has to override the finalize() method of the Object class, which has the following empty implementation: 

protected void finalize() throws Throwable { }

This method is well documented, and two points are worth noting. The first is that the Java programming language doesn't guarantee which thread will invoke the finalize() method for any given object; it's guaranteed, however, that the thread that invokes finalize() will not be holding any user-visible synchronization locks when it's invoked. The second is that any exception thrown by the finalize() method halts the finalization of that object but is otherwise ignored.

A sample of overriding can be found, for instance, in the source code of the FileInputStream class:

@Override protected void finalize() throws IOException {
    try {
        if (guard != null) {
            guard.warnIfOpen();
        }
        close();
    } finally {
        try {
            super.finalize();
        } catch (Throwable t) {
            // for consistency with the RI, we must override Object.finalize() to
            // remove the 'throws Throwable' clause.
            throw new AssertionError(t);
        }
    }
}

This implementation ensures that all resources for this stream are released when it's about to be garbage collected.

But in Kotlin, the root of the class hierarchy is Any, which does not have a finalize() method:

public open class Any {

    public open operator fun equals(other: Any?): Boolean

    public open fun hashCode(): Int

    public open fun toString(): String
}

But according to the Kotlin documentation (https://kotlinlang.org/docs/reference/java-interop.html#finalize), to override finalize(), all you need to do is declare it without the override keyword (and it can't be private):

class C {
    protected fun finalize() {
        // finalization logic
    }
}

If you have read the Avoid finalizers and cleaners item of the Effective Java book, you know that using finalizers to release resources is a common anti-pattern. Acquiring resources in the constructor or an initialization block and releasing them in the finalizer isn't a good approach; it's better to acquire resources only when needed and release them once they're no longer needed. Otherwise, using the finalize() method to release resources can cause resource and memory leaks.

Resource leaks

An operating system has several resources that are limited in number, for instance, files or internet sockets. A resource leak is a situation where a computer program doesn't release the resources it has acquired. The most common example is a case where files have been opened but haven't been closed:

fun readFirstLine(): String {
    val fileInputStream = FileInputStream("input.txt")
    val inputStreamReader = InputStreamReader(fileInputStream)
    val bufferedReader = BufferedReader(inputStreamReader)
    return bufferedReader.readLine()
}

In the preceding code snippet, the input.txt file hasn't been closed after being acquired and used. InputStream is an abstract superclass of all classes representing an input stream of bytes. It implements the Closeable single-method interface with a close() method. The subclasses of InputStream override this method to provide the ability to release the input stream, and in our case the file, correctly. So a correct version of the readFirstLine() method would look like this:

fun readFirstLine(): String? {
    var fileInputStream: FileInputStream? = null
    var inputStreamReader: InputStreamReader? = null
    var bufferedReader: BufferedReader? = null
    return try {
        fileInputStream = FileInputStream("input.txt")
        inputStreamReader = InputStreamReader(fileInputStream)
        bufferedReader = BufferedReader(inputStreamReader)
        bufferedReader.readLine()
    } catch (e: Exception) {
        null
    } finally {
        // Close in reverse order of acquisition; closing the outermost
        // wrapper first also flushes and closes the streams it wraps
        bufferedReader?.close()
        inputStreamReader?.close()
        fileInputStream?.close()
    }
}

Note

It's important to close a stream inside a finally section because if you do it at the end of the try section and an exception is thrown, then you'll have a file handle leak.

In this example, we can see how the dispose pattern is used with the try-finally special language construction. It's a design pattern for resource management that assumes use of the method usually called close(), dispose(), or release() to free the resources once they aren't needed. But since Kotlin 1.2, thanks to extension functions, we can write something like this:

fun readFirstLine(): String? = File("input.txt")
    .inputStream()
    .bufferedReader()
    .use { it.readLine() }

The use or useLines function executes the given block function on this resource and then closes it down correctly, whether or not an exception is thrown.

Note

The use and useLines functions return the result of the block, which is very convenient, especially in our case.
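As a sketch of how useLines composes with sequence operations (the helper name and the counting logic are our own illustration, not from the book), consider:

```kotlin
import java.io.File

// useLines hands the file's lines to the lambda as a lazy Sequence and
// closes the underlying reader afterwards, even if the lambda throws.
fun countNonBlankLines(file: File): Int =
    file.useLines { lines -> lines.count { it.isNotBlank() } }
```

Because the sequence is lazy, the file is read line by line and never loaded into memory as a whole, while the resource is still released deterministically.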

The source code of the use function also uses the try-finally construction to ensure resources will be closed:

public inline fun <T : Closeable?, R> T.use(block: (T) -> R): R {
    var exception: Throwable? = null
    try {
        return block(this)
    } catch (e: Throwable) {
        exception = e
        throw e
    } finally {
        when {
            apiVersionIsAtLeast(1, 1, 0) -> this.closeFinally(exception)
            this == null -> {}
            exception == null -> close()
            else ->
                try {
                    close()
                } catch (closeException: Throwable) {
                    // cause.addSuppressed(closeException) // ignored here
                }
        }
    }
}

So scarce resources that have been acquired must be released. Otherwise, an application will suffer from a resource leak, for example, a file handle leak like the one we've just described. Another common reason for slow performance is a memory leak.

Memory leaks

The situation in which memory that's no longer needed isn't released is referred to as a memory leak: an object can't be collected even though running code no longer uses it. In an environment with a GC, such as the JVM, a memory leak may happen when a reference to an object that's no longer needed is still stored in another object. This happens due to logical errors in program code, when one object holds a reference to another even though the latter is no longer used and no longer accessible. The following diagram represents this case:

The GC cares about unreachable, also known as unreferenced, objects, but handling unused referenced objects depends on application logic. Leaked objects consume memory, which means that less space is available for new objects. So if there's a memory leak, the GC will run more frequently and the risk of an OutOfMemoryError increases.

Let's look at an example, written in Kotlin, that uses the popular RxJava 2 library:

fun main(vars: Array<String>) {
    var memoryLeak: MemoryLeak? = MemoryLeak()
    memoryLeak?.start()
    memoryLeak = null
    memoryLeak = MemoryLeak()
    memoryLeak.start()
    Thread.currentThread().join()
}

class MemoryLeak {

    init {
        objectNumber++
    }

    private val currentObjectNumber = objectNumber

    fun start() {
        Observable.interval(1, TimeUnit.SECONDS)
                .subscribe { println(currentObjectNumber) }
    }

    companion object {
        @JvmField
        var objectNumber = 0
    }
}

In this example, the join() method of the main thread is used to prevent the application from exiting while other threads are running. The objectNumber field of the MemoryLeak class counts created instances. Whenever a new instance of the MemoryLeak class is created, the value of objectNumber is incremented and copied to the currentObjectNumber property.

The MemoryLeak class also has the start() method. This method contains an instance of Observable that emits an incremented number every second. Observable is the multi-valued base reactive class that offers factory methods, intermediate operators, and the ability to consume synchronous and/or asynchronous reactive data flows. Observable has many factory functions that create new instances to perform different actions. In our case, we use the interval function, which takes two arguments: the sampling rate and an instance of the TimeUnit enum (https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/TimeUnit.html), the time unit in which the sampling rate is defined. The subscribe method takes an instance of a class of the Consumer type; the most common approach is to pass a lambda to handle emitted values.

The main function is the starting point of our application. In this function, we create a new instance of the MemoryLeak class, then invoke the start() method. After this, we assign null to the memoryLeak reference and repeat the previous step.

This is the most common issue when using RxJava. The first instance of the MemoryLeak class cannot be collected because the passed Consumer holds a reference to it. Hence one of the active threads, which is a root object, references the first instance of MemoryLeak. Since we don't have a reference to this object, it's unused, but it can't be collected. The output of the application looks as follows:

1
2
1
2
2
1
1
2
1
2
1

As you can see, both instances of Observable run and use the currentObjectNumber property, and consequently both instances of the MemoryLeak class keep allocating memory. That's why we should release resources when an object is no longer needed. To deal with this issue, we have to rewrite the code as follows:

fun main(vars: Array<String>) {
    var memoryLeak: NoMemoryLeak? = NoMemoryLeak()
    memoryLeak?.start()
    memoryLeak?.disposable?.dispose() // added: release the subscription before dropping the reference
    memoryLeak = NoMemoryLeak()
    memoryLeak.start()
    Thread.currentThread().join()
}

class NoMemoryLeak {

    init {
        objectNumber++
    }

    private val currentObjectNumber = objectNumber

    var disposable: Disposable? = null // added

    fun start() {
        disposable = Observable.interval(1, TimeUnit.SECONDS) // added: keep the Disposable
                .subscribe { println(currentObjectNumber) }
    }

    companion object {
        @JvmField
        var objectNumber = 0
    }
}

And now the output looks like this:

2
2
2
2
2
2

The subscribe() method returns an instance of the Disposable type, which has the dispose() method. Using this approach, we can prevent the memory leak.

Using instances of mutable classes without overriding the equals() and hashCode() methods as keys for Map can also lead to a memory leak. Let's look at the following example:

class MutableKey(var name: String? = null)

fun main(vars: Array<String>) {
    val map = HashMap<MutableKey, Int>()
    map.put(MutableKey("someName"), 2)
    print(map[MutableKey("someName")])
}

The output will be the following:

null

The get method of HashMap uses the hashCode() and equals() methods of a key to find and return a value. The current implementation of the MutableKey class doesn't override these methods. That's why, if you lose the reference to the original key instance, you won't be able to retrieve or remove the value. It's definitely a memory leak, because map is a local variable and consequently a root object.
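A manual alternative is to override equals() and hashCode() yourself so that keys with equal names are interchangeable. The following sketch is our own illustration (the FixedKey class name is hypothetical), not the book's source:

```kotlin
// Manual equals()/hashCode() based on the 'name' property: two keys with
// the same name now produce the same hash bucket and compare equal
class FixedKey(var name: String? = null) {
    override fun equals(other: Any?): Boolean =
        other is FixedKey && other.name == name

    override fun hashCode(): Int = name?.hashCode() ?: 0
}

fun lookup(): Int? {
    val map = HashMap<FixedKey, Int>()
    map.put(FixedKey("someName"), 2)
    return map[FixedKey("someName")]   // found, even via a fresh key instance
}
```

Writing these methods by hand is error-prone, which is exactly why a data class, shown next, is usually preferable.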

We can remedy the situation by making MutableKey a data class. If a class is marked as data, the compiler automatically derives the equals() and hashCode() methods from all properties declared in the primary constructor. So the MutableKey class will look as follows:

data class MutableKey(var name: String? = null)

And now the output will be:

2

Now, this class works as expected. But we can face another issue with the MutableKey class. Let's rewrite main as follows:

fun main(vars: Array<String>) {
    val key = MutableKey("someName")

    val map = HashMap<MutableKey, Int>()
    map.put(key, 2)

    key.name = "anotherName"

    print(map[key])
}

Now, the output will be:

null

Because the hash, after re-assigning the name property, isn't the same as it was before:

fun main(vars: Array<String>) {
    val key = MutableKey("someName")

    println(key.hashCode())

    val map = HashMap<MutableKey, Int>()
    map.put(key, 2)

    key.name = "anotherName"

    println(key.hashCode())

    print(map[key])
}

The output will now be:

1504659871
-298337234
null

This means that our code isn't simple and reliable. And we can still have the memory leak. The concept of an immutable object is extremely helpful in this case. Using this concept, we can protect objects from corruption, which is exactly the issue we need to prevent.

A strategy for creating immutable classes in Java is complex and includes the following key points:

  • Do not provide setters
  • All fields have to be marked with the final and private modifiers
  • Mark a class with the final modifier
  • References that are held by fields of an immutable class should refer to immutable objects
  • Objects that are composed by an immutable class have to also be immutable

An immutable class created according to this strategy may look as follows:

public final class ImmutableKey {

    private final String name;

    public ImmutableKey(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }
}

This is all very easy in Kotlin:

data class ImmutableKey(val name: String? = null)

All we need to do is define all properties with val in the primary constructor. We'll get a compiler error if we try to assign a new value to the name property. Immutability is an extremely powerful concept that enables mechanisms such as the String pool.
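Immutability doesn't mean you can't derive changed values: a data class also generates a copy() method, which produces a modified copy while leaving the original, and any map entry keyed by it, intact. A short sketch of our own:

```kotlin
data class ImmutableKey(val name: String? = null)

// "Modifying" an immutable key yields a new instance; the original is
// untouched, so its hashCode() can never change behind a map's back
fun renamedKeepsOriginal(): Pair<String?, String?> {
    val key = ImmutableKey("someName")
    val renamed = key.copy(name = "anotherName")
    return key.name to renamed.name
}
```

This is the idiomatic Kotlin answer to the corrupted-key problem shown earlier.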

String pool

The String pool is a set of String objects stored in the Permanent Generation section of the heap. Under the hood, an instance of the String class is an array of chars. Each char allocates two bytes. The String class also has a cached hash that allocates four bytes, and each object has housekeeping information that allocates about eight bytes. And if we're talking about Java Development Kit 7 (http://grepcode.com/file/repository.grepcode.com/java/root/jdk/openjdk/7-b147/java/lang/String.java?av=f) or lower, the String class also has offset and length fields. Since String is the most used type, the instances of the String class allocate a significant part of the heap. 
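These per-object figures can be turned into a back-of-the-envelope estimate. The helper below is purely our illustration of the arithmetic above (2 bytes per char, 4 bytes for the cached hash, about 8 bytes of housekeeping); real footprints vary by JVM version, pointer compression, and alignment:

```kotlin
// Rough, illustrative estimate of a String's heap footprint in bytes,
// using the per-object figures quoted above; not an exact measurement
fun approxStringBytes(s: String): Int = 2 * s.length + 4 + 8
```

For a three-character string such as "Cat", this gives 18 bytes, which shows why millions of small strings add up quickly.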

To reduce the load on memory, the JVM has the String pool as the implementation of the Flyweight Design Pattern because memory space can be crucial for low-memory devices such as mobile devices.

Whenever double quotes are used to create a new instance of the String class, the JVM first looks for an existing instance with the same value in the String pool. If an existing instance is found, a reference to it is returned. Otherwise, a new instance is created in the String pool and then the reference to it is returned. When we use a constructor, we force the creation of a new instance of the String class in the heap.

This technique is called copy-on-write (COW). The point is that when a copy of the object is requested, the reference to the existing object is returned instead of creating a new one. In code, it may look like this:

fun main(vars: Array<String>) {
    val cat1 = "Cat"
    val cat2 = "Cat"
    val cat3 = String("Cat".toCharArray())
    println(cat1 === cat2)
    println(cat1 === cat3)
}

The output:

true
false

Note

Kotlin has its own kotlin.String class. It's not the same as the java.lang.String class. And kotlin.String doesn't have a constructor that takes another instance of the String class. 

With the COW, when trying to modify an object through a particular reference, a real copy is created, the change is applied to it, and then the reference to the newly created object is returned. The following diagram illustrates this:

In code, it may look like this:

fun main(vars: Array<String>) {
    val cat1 = "Cat"
    val cat2 = cat1.plus("Dog")
    println(cat1)
    println(cat2)
    println(cat1 === cat2)
}

And here's the output: 

Cat
CatDog
false

This technique is good for creating simple and reliable code and can be very useful in a concurrent application, as you can be sure that your object won't be corrupted by another thread.

Let's look at the following example:

class User(val id: Int = 0, val firstName: String = "", val lastName: String = "")

fun main(vars: Array<String>) {
    val user = User()
    val building = "304a"

    val query = "SELECT id, firstName, lastName FROM Building " + building + " WHERE firstName = " + user.firstName
}

Each concatenation creates a new instance of String, so many unnecessary objects are created in this code. Instead of concatenation, we should use StringBuilder or string templates (https://kotlinlang.org/docs/reference/basic-types.html#string-templates), which use StringBuilder under the hood but are much simpler to use:

val query = "SELECT id, firstName, lastName FROM Building $building WHERE firstName = ${user.firstName}"

But how can we put a String object into the String pool if we receive it from the outside? Here is how:

val firstLine: String
    get() = File("input.txt")
        .inputStream()
        .bufferedReader()
        .use { it.readLine() }

fun main(vars: Array<String>) {
    println(firstLine === firstLine)
}

This is the output:

false

To put the value of the firstLine variable in the String pool, we have to use the intern() method. When this method is invoked, if the pool already contains a string equal to the value of the object, then the reference to the String from the pool is returned. Otherwise, this object is added to the pool and a reference to this object is returned. The intern() method is an implementation of interning. It's a method for storing only one copy of each distinct value:

fun main(vars: Array<String>) {
    println(firstLine.intern() === firstLine.intern())
}

Here's the output: 

true

You shouldn't abuse this method because the String pool is stored in the Permanent Generation section of the heap. And it can be collected only during major garbage collection.

 

Memory model


The memory model describes how the JVM interacts with a computer's memory. By computer memory, we mean not only Random Access Memory (RAM) but also registers and cache memory of the CPU. So we consider the memory model as a simplified abstraction of the hardware memory architecture.

We can consider the whole JVM as a model of a computer that provides the ability to run a program on a wide range of processors and operating systems.

An understanding of the Java Memory Model is important because it specifies how different threads interact in memory. Concurrent programming involves plenty of different pitfalls in synchronization between threads that have shared variables and compliance with the consistency of a sequence of operations.

The problem of concurrency and parallelism

While concurrency is executing independent subtasks out of order without affecting the final result, parallelism is executing subtasks simultaneously. Parallelism involves concurrency, but concurrency doesn't necessarily execute in parallel.
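The distinction can be sketched with a small thread-pool example. The function name and pool size below are our own illustration: the two workers may run subtasks at the same time (parallelism), and their completion order is unspecified (concurrency), yet the combined result is deterministic.

```kotlin
import java.util.concurrent.Executors
import java.util.concurrent.TimeUnit

// Four independent subtasks submitted to two workers: the order in which
// they finish varies run to run, but the set of results never does
fun squaresInAnyOrder(): List<Int> {
    val results = java.util.Collections.synchronizedList(mutableListOf<Int>())
    val pool = Executors.newFixedThreadPool(2)
    (1..4).forEach { n -> pool.submit { results.add(n * n) } }
    pool.shutdown()
    pool.awaitTermination(5, TimeUnit.SECONDS)
    return results.sorted()   // sort to make the result order-independent
}
```

Sorting at the end is what makes the out-of-order execution invisible to the caller, which is exactly the "without affecting the final result" property above.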

The compiler feels free to reorder instructions to perform optimization. This means that there are cases in which accesses to variables, during the execution of a program, may differ from the order specified in the code. Data is moved between registers, caches, and RAM all the time. There are no requirements for the compiler to perform synchronization between threads perfectly because this would cost too much from the performance point of view. This leads to cases when different threads may read different values from the same shared variable. A simplified example of the case described here may look like this:

import java.util.concurrent.Executors

fun main(vars: Array<String>) {
    var sharedVariableA = 0
    var sharedVariableB = 0
    val threadPool = Executors.newFixedThreadPool(10)
    val threadA = Runnable {
        sharedVariableA = 3
        sharedVariableB = 4
    }
    val threadB = Runnable {
        val localA = sharedVariableA
        val localB = sharedVariableB
    }
    threadPool.submit(threadA)
    threadPool.submit(threadB)
}

If the threadB thread runs after threadA has finished, the value of the localA variable is 3 and the value of the localB variable is 4. But if the compiler or the CPU reorders the write operations, threadB may observe, for instance, localB == 4 while localA is still 0. To get a better understanding of this issue, we need some knowledge of the internals of the Java Memory Model.

Java Memory Model (JMM)

The JMM divides the memory space between thread stacks and the heap. Each application has at least one thread, which is referred to as the main thread. When a new thread starts, a new stack is created. If an application has several threads, the simplified memory model may look like this:

The thread stack is a stack of blocks. Whenever a thread calls a method, a new block is created and added to the stack. This is also referred to as the call stack. This block contains all the local variables that were created in its scope. The local variables cannot be shared between threads, even if threads are executing the same method. A block fully stores all local variables of primitive types and references to objects. One thread can only pass copies of local primitive variables or references to another thread:

Kotlin doesn't have primitive types, in contrast to Java, but it compiles to the same bytecode as Java. As long as you don't use a variable as an object (for example, by making it nullable), the generated bytecode will contain a variable of a primitive type:

fun main(vars: Array<String>) {
    val localVariable = 0
}

The simplified generated bytecode will look like this:

public final static main([Ljava/lang/String;)V
    LOCALVARIABLE localVariable I L2 L3 1

But if you specify the type of the localVariable as Nullable, as follows:

val localVariable: Int? = null

Then this variable will be represented as an object in the bytecode:

LOCALVARIABLE localVariable Ljava/lang/Integer; L2 L3 1

All objects are contained in the heap, even if they're local variables. Local primitive variables are destroyed automatically when the execution point of a program leaves the scope of the method, whereas an object can be destroyed only by the garbage collector (GC). So the use of local primitive variables is preferable. Since the Kotlin compiler applies optimizations to variables that can be primitive, in most cases the bytecode will contain variables of primitive types.

This diagram illustrates how two threads can share the same object:

Synchronization

As we already know, the JMM is a simplified model of the hardware memory architecture. If a variable is used frequently, it can be copied to the CPU cache. If several threads have a shared variable, then several CPU caches have their own duplicate of this variable. This is needed to increase access speed to variables. The hardware memory architecture has a hierarchy that is illustrated in the following diagram:

When several caches hold duplicates of a variable that's stored in the main memory, a problem with the visibility of shared data may occur. Combined with unsynchronized writes, this is referred to as a data race: two or more threads change the values that were copied to their caches, but each thread doesn't see the changes the other applied to its copy. When a thread then flushes its copy back to the main memory, it can erase the value that another thread assigned to the shared object.

The following example clarifies the described case. Two threads run on two CPUs at the same time. And they have a shared object with the count variable that's been copied to caches of both CPUs. Both threads increment the copied values at the same time. But these changes aren't visible to each other because the updates haven't been flushed back to the main memory yet. The following diagram illustrates this:
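This lost-update scenario can be reproduced with a small sketch (the racyCount function, thread count, and iteration counts below are illustrative assumptions, not values from the text):

```kotlin
import kotlin.concurrent.thread

// Two threads increment an unsynchronized shared counter.
// Each increment is a read-modify-write, so concurrent updates can be lost.
fun racyCount(threads: Int, perThread: Int): Int {
    var counter = 0
    val workers = List(threads) {
        thread { repeat(perThread) { counter++ } }
    }
    workers.forEach { it.join() }
    return counter
}

fun main() {
    // Often prints less than 200000 because some increments are lost
    println(racyCount(threads = 2, perThread = 100_000))
}
```

On most machines, several runs produce different totals, which is exactly the non-determinism the diagram describes.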

To solve the problem with synchronization, you can use the volatile keyword, synchronized methods, or blocks, and so on. But all of these approaches bring overhead and make your code complex. It's better just to avoid shared mutable objects and use only immutable objects in a multithreading environment. This strategy helps keep your code simple and reliable.
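As a sketch of one such approach (using java.util.concurrent rather than anything specific to this chapter), the same counter becomes correct when the read-modify-write is performed atomically:

```kotlin
import java.util.concurrent.atomic.AtomicInteger
import kotlin.concurrent.thread

// AtomicInteger performs each increment as a single atomic operation,
// so no updates are lost and no explicit locking is needed.
fun atomicCount(threads: Int, perThread: Int): Int {
    val counter = AtomicInteger(0)
    val workers = List(threads) {
        thread { repeat(perThread) { counter.incrementAndGet() } }
    }
    workers.forEach { it.join() }
    return counter.get()
}

fun main() {
    // Always prints 200000
    println(atomicCount(threads = 2, perThread = 100_000))
}
```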

 

Slow rendering


Slow rendering is another performance issue that powerfully influences the user experience. Users expect interactive and smooth user interfaces, and that's where a development team needs to increasingly focus their efforts. It's not enough to make loading and the first rendering fast; the application also needs to perform well. Response time should be instantaneous, animations should be smooth, and scrolling should be stick-to-finger-fast. To create an application with an excellent experience, a developer needs to understand how to write code that runs as efficiently as possible. 

Device refresh rate

Users interact with an application through the display, seeing how it reacts and updates the image, which is why the display is essential to an excellent user experience. Manufacturers continue to improve display hardware, so it's important to understand some display characteristics.

The refresh rate of a display is how many times per second the image on the screen can be refreshed. It's measured in hertz (Hz), so if the refresh rate of a display is 60 Hz, then the displayed picture cannot be updated more than 60 times per second.

The refresh rate that leads to a good experience depends on the purpose. Movies in movie theaters run at just 24 Hz, while the old TV standards were 50 Hz and 60 Hz. A typical monitor for a personal computer has a 60 Hz refresh rate, but the latest gaming displays can reach 240 Hz.

Since the device refresh rate is a hardware characteristic that a software developer can't influence, the frame rate is of more interest. The following table (from https://developer.apple.com/library/content/documentation/DeviceInformation/Reference/iOSDeviceCompatibility/Displays/Displays.html) shows the refresh rates and recommended frame rates for popular devices:

Device                               Refresh rate      Recommended frame rates
iPhone X                             60 Hz             60, 30, 20
iPhone 8 Plus                        60 Hz             60, 30, 20
iPhone 8                             60 Hz             60, 30, 20
iPhone 7 Plus                        60 Hz             60, 30, 20
iPhone 7                             60 Hz             60, 30, 20
iPhone 6s Plus                       60 Hz             60, 30, 20
iPhone 6s                            60 Hz             60, 30, 20
iPhone SE                            60 Hz             60, 30, 20
iPhone 6 Plus                        60 Hz             60, 30, 20
iPhone 6                             60 Hz             60, 30, 20
iPad Pro 12.9-inch (2nd generation)  120 Hz maximum    120, 60, 40, 30, 24, 20
iPad Pro 10.5-inch                   120 Hz maximum    120, 60, 40, 30, 24, 20
iPad Pro (12.9-inch)                 60 Hz             60, 30, 20
iPad Pro (9.7-inch)                  60 Hz             60, 30, 20
iPad Mini 4                          60 Hz             60, 30, 20

Frame rate

The human brain receives and processes visual information continuously. This can be used to create the illusion of motion when images follow each other fast enough. When an animation or transition runs or the user scrolls the page, the application needs to put up a new picture for each screen refresh. How many images software shows per second is the frame rate, and it's measured in frames per second (FPS):

  • A rate of 10 to 12 frames per second is perceived as motion, but the user remains aware of the individual frames.
  • A rate of 24 frames per second, combined with motion-blurring technology, is enough to be perceived as fluid movement; it's the standard for the film industry.
  • A rate of 30 frames per second is sufficient for movies, even without special effects.
  • A rate of 60 or more frames per second is what most people perceive as high-quality, smooth motion:

The act of generating a frame from an application and displaying it is referred to as user interface rendering. According to the preceding table of recommended frame rates, to ensure that a user interacts with an application smoothly, the application should render one frame every 16.6 ms to display 60 frames per second. The developer has to take into account that the system also needs some time to draw the frame, so it's not a good idea to plan on owning the entire 16 ms; it's better to count on around 10 ms. When an application fails to meet this budget, the frame rate drops and the content stutters on the screen.
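The per-frame budget is simple arithmetic; a small sketch (the frameBudgetMs function name is ours, not an API) makes the numbers explicit:

```kotlin
// Time budget per frame, in milliseconds, for a target frame rate
fun frameBudgetMs(fps: Int): Double = 1000.0 / fps

fun main() {
    println(frameBudgetMs(60))  // ~16.67 ms, the budget discussed above
    println(frameBudgetMs(120)) // ~8.33 ms for 120 Hz iPad Pro displays
    println(frameBudgetMs(30))  // ~33.33 ms
}
```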

It's essential to understand how to get smooth motion with a high frame rate. The human eye is extremely sensitive to motion inconsistencies. An application can display on average 60 frames per second, but it's enough to have only one frame that takes more than 16 ms to render, for the user to see something that we call hitching, lag, or jank. If the device's refresh rate is higher than the frame rate, the monitor displays repeated renderings of identical frames. This diagram illustrates a simplified view of jank:

 

Summary


In this chapter, we presented the most common causes of performance issues; we'll discuss them in more depth in the following chapters. We compared several examples in Java and Kotlin to see how the features of these languages can be used to prevent performance bottlenecks. We also introduced the most common factors that influence the user experience of an application.

In the next chapter, we'll introduce different tools and practices for identifying performance issues. 

About the Author

  • Igor Kucherenko

    Igor Kucherenko is an Android developer at Techery, a software development company that uses Kotlin as the main language for Android development. Currently, he lives in Ukraine, where he is a speaker in the Kotlin Dnipro community, which promotes Kotlin, and shares his knowledge with audiences at meetups. You can find his articles concerning Kotlin and Android development on Medium and in a blog for Yalantis, where he worked previously.
