
Unity 2017 Game Optimization - Second Edition

By Chris Dickinson
About this book
Unity is an awesome game development engine. Through its massive feature set and ease of use, Unity helps put some of the best processing and rendering technology in the hands of hobbyists and professionals alike. This book shows you how to make your games fly with the latest version of Unity 2017, and demonstrates that high performance does not need to be limited to games with the biggest teams and budgets. Since nothing turns gamers away from a game faster than a poor user experience, the book starts by explaining how to use the Unity Profiler to detect problems. You will learn how to use stopwatches, timers, and logging methods to diagnose problems. You will then explore techniques to improve performance through better programming practices. Moving on, you will learn about Unity's built-in batching processes: when they can be used to improve performance, and their limitations. Next, you will import your art assets using minimal space, CPU, and memory at runtime, and discover some underused features and approaches for managing asset data. You will also improve graphics, particle system, and shader performance with a series of tips and tricks to make the most of GPU parallel processing. You will then delve into the fundamental layers of the Unity3D engine to discuss some issues that may be difficult to understand without a strong knowledge of its inner workings. The book also introduces you to the critical performance problems for VR projects and how to tackle them. By the end of the book, you will have learned to improve the development workflow by properly organizing assets, and to instantiate assets as quickly and with as little waste as possible via object pooling.
Publication date:
November 2017
Publisher
Packt
Pages
376
ISBN
9781788392365

 

Chapter 1. Pursuing Performance Problems

Performance evaluation for most software products is a very scientific process. First, we determine the maximum/minimum supported performance metrics, such as the allowed memory usage, acceptable CPU consumption, the number of concurrent users, and so on. Next, we perform load testing against a version of the application built for the target platform, running test scenarios while gathering instrumentation data. Once this data is collected, we analyze it for performance bottlenecks. If problems are discovered, we complete a root-cause analysis, make changes to the configuration or application code to fix the issue, and repeat the process.

Although game development is a very artistic process, it is still exceptionally technical, so there is a good reason to treat it in similarly objective ways. Our game should have a target audience in mind, which can tell us what hardware limitations our game might be operating under and, perhaps, tell us exactly what performance targets we need to meet (particularly in the case of console and mobile games). We can perform runtime testing on our application, gather performance data from multiple subsystems (CPU, GPU, memory, the Physics Engine, the Rendering Pipeline, and so on), and compare them against what we consider to be acceptable. We can use this data to identify bottlenecks in our application, perform additional instrumentation, and determine the root cause of the issue. Finally, depending on the type of problem, we should be capable of applying a number of fixes to improve our application's performance to bring it more in line with the intended behavior.

However, before we spend even a single moment making performance fixes, we will need to prove that a performance problem exists to begin with. It is unwise to spend time rewriting and refactoring code until there is good reason to do so, since premature optimization is rarely worth the hassle. Once we have proof of a performance issue, the next task is figuring out exactly where the bottleneck is located. It is important to ensure that we understand why the performance issue is happening; otherwise, we could waste even more time applying fixes that are little more than educated guesses. Doing so often means that we only fix a symptom of the issue, not its root cause, so we risk it manifesting itself in other ways in the future, or in ways we haven't yet detected.

In this chapter, we will explore the following:

  • How to gather profiling data using the Unity Profiler
  • How to analyze Profiler data for performance bottlenecks
  • Techniques to isolate a performance problem and determine its root cause

With a thorough understanding of a given problem, you will then be ready for the information presented in the remaining chapters, where you will learn what solutions are available for the issues we detect.

 

The Unity Profiler


The Unity Profiler is built into the Unity Editor itself and provides an expedient way of narrowing down our search for performance bottlenecks by generating usage and statistics reports on a multitude of Unity3D subsystems during runtime. The different subsystems it can gather data for are listed as follows:

  • CPU consumption (per-major subsystem)
  • Basic and detailed rendering and GPU information
  • Runtime memory allocations and overall consumption
  • Audio source/data usage
  • Physics Engine (2D and 3D) usage
  • Network messaging and operation usage
  • Video playback usage
  • Basic and detailed user interface performance (new in Unity 2017)
  • Global Illumination statistics (new in Unity 2017)

There are generally two approaches to make use of a profiling tool: instrumentation and benchmarking (although, admittedly, the two terms are often used interchangeably).

Instrumentation typically means taking a close look into the inner workings of the application by observing the behavior of targeted function calls, noting where and how much memory is being allocated, and generally building an accurate picture of what is happening in the hope of finding the root cause of a problem. However, this is normally not an efficient way to start looking for performance problems, because profiling any application comes with a performance cost of its own.

When a Unity application is compiled in Development Mode (determined by the Development Build flag in the Build Settings menu), additional compiler flags are enabled causing the application to generate special events at runtime, which get logged and stored by the Profiler. Naturally, this will cause additional CPU and memory overhead at runtime due to all of the extra workload the application takes on. Even worse, if the application is being profiled through the Unity Editor, then even more CPU and memory will be spent, ensuring that the Editor updates its interface, renders additional windows (such as the Scene window), and handles background tasks. This profiling cost is not always negligible. In excessively large projects, it can sometimes cause wildly inconsistent behavior when the Profiler is enabled. In some cases, the inconsistency is significant enough to cause completely unexpected behavior due to changes in event timings and potential race conditions in asynchronous behavior. This is a necessary price we pay for a deep analysis of our code's behavior at runtime, and we should always be aware of its presence.

Before we get ahead of ourselves and start analyzing every line of code in our application, it would be wiser to perform a surface-level measurement of the application. We should gather some rudimentary data and perform test scenarios during a runtime session of our game while it runs on the target hardware; the test case could simply be a few seconds of Gameplay, playback of a cut scene, a partial play through of a level, and so on. The idea of this activity is to get a general feel for what the user might experience and keep watching for moments when performance becomes noticeably worse. Such problems may be severe enough to warrant further analysis.

This activity is commonly known as benchmarking, and the important metrics we're interested in are often the number of frames per-second (FPS) being rendered, overall memory consumption, how CPU activity behaves (looking for large spikes in activity), and sometimes CPU/GPU temperature. These are all relatively simple metrics to collect and can be used as a best first approach to performance analysis for one important reason; it will save us an enormous amount of time in the long run, since it ensures that we only spend our time investigating problems that users would notice.
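
If all we need at this stage is a rough benchmark, we do not even need to attach the Profiler; a few lines of script running on the target device can log the frame rate for us. The following is only a minimal sketch of such a counter (the class name and the one-second reporting window are our own arbitrary choices, not part of any Unity API):

    using UnityEngine;

    // Minimal benchmarking sketch: logs a once-per-second average frame rate
    // so that sustained drops are easy to spot on the target device.
    public class SimpleFpsBenchmark : MonoBehaviour
    {
        private int frameCount;
        private float elapsed;

        private void Update()
        {
            frameCount++;
            elapsed += Time.unscaledDeltaTime;

            if (elapsed >= 1f)
            {
                Debug.LogFormat("Average FPS over the last second: {0:F1}", frameCount / elapsed);
                frameCount = 0;
                elapsed = 0f;
            }
        }
    }

Attaching this Component to any GameObject in the Scene is enough to get a continuous frame rate log in a development build, which can then be compared between hardware targets.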

We should dig deeper into instrumentation only after a benchmarking test indicates that further analysis is required. It is also very important to benchmark by simulating actual platform behavior as much as possible if we want a realistic data sample. As such, we should never accept benchmarking data that was generated through Editor Mode as representative of real gameplay, since Editor Mode comes with some additional overhead costs that might mislead us, or hide potential race conditions in a real application. Instead, we should hook the profiling tool into the application while it is running in a standalone format on the target hardware. 

Note

Many Unity developers are surprised to find that the Editor sometimes calculates the results of operations much faster than a standalone application does. This is particularly common when dealing with serialized data like audio files, Prefabs and Scriptable Objects. This is because the Editor will cache previously imported data and is able to access it much faster than a real application would.

Let's cover how to access the Unity Profiler and connect it to the target device so that we can start to make accurate benchmarking tests.

Note

Users who are already familiar with connecting the Unity Profiler to their applications can skip to the section titled The Profiler window.

Launching the Profiler

We will begin with a brief tutorial on how to connect our game to the Unity Profiler within a variety of contexts:

  • Local instances of the application, either through the Editor or a standalone instance
  • Local instances of a WebGL application running in a browser
  • Remote instances of the application on an iOS device (for example, iPhone or iPad)
  • Remote instances of the application on an Android device (for example, an Android tablet or phone)
  • Profiling the Editor itself

We will briefly cover the requirements for setting up the Profiler in each of these contexts.

Editor or standalone instances

The only way to access the Profiler is to launch it through the Unity Editor and connect it to a running instance of our application. This is the case whether we're executing our game in Play Mode within the Editor, running a standalone application on the local or remote device, or we wish to profile the Editor itself.

To open the Profiler, navigate to Window | Profiler within the Editor:

Note

If the Editor is already running in Play Mode, then we should see reporting data actively gathering in the Profiler window.

To profile standalone projects, ensure that the Development Build and Autoconnect Profiler flags are enabled when the application is built.
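
For teams that build from script rather than through the Build Settings window, the same two flags can be passed programmatically. The following is only a sketch under assumed paths and platform; BuildOptions.Development and BuildOptions.ConnectWithProfiler correspond to the Development Build and Autoconnect Profiler checkboxes:

    using UnityEditor;

    // Editor-only build script sketch (place it in an Editor folder).
    // The scene list, output path, and target platform are placeholders.
    public static class ProfilingBuild
    {
        [MenuItem("Build/Standalone (Profiling)")]
        public static void BuildForProfiling()
        {
            BuildPipeline.BuildPlayer(
                new[] { "Assets/Scenes/Main.unity" },   // scenes to include
                "Builds/ProfilingBuild.exe",            // output location
                BuildTarget.StandaloneWindows64,
                BuildOptions.Development | BuildOptions.ConnectWithProfiler);
        }
    }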

Choosing whether to profile an Editor-based instance (through the Editor's Play Mode) or a standalone instance (built and running separately from the Editor) can be achieved through the Connected Player option in the Profiler window:

Note that switching back to the Unity Editor while profiling a separate standalone project will halt all data collection since the application will not be updated while it is in the background.

Note

Note that the Development Build option is named Use Development Mode and the Connected Player option is named Active Profiler in Unity 5.

Connecting to a WebGL instance

The Profiler can also be connected to an instance of the Unity WebGL Player. This can be achieved by ensuring that the Development Build and Autoconnect Profiler flags are enabled when the WebGL application is built and run from the Editor. The application will then be launched through the Operating System's default browser. This enables us to profile our web-based application in a more real-world scenario through the target browser and test multiple browser types for inconsistencies in behavior (although this requires us to keep changing the default browser).

Unfortunately, the Profiler connection can only be established when the application is first launched from the Editor. It currently (at least in early builds of Unity 2017) cannot be connected to a standalone WebGL instance already running in a browser. This limits the accuracy of benchmarking WebGL applications since there will be some Editor-based overhead, but it’s the only option we have available for the moment.

Remote connection to an iOS device

The Profiler can also be connected to an active instance of the application running remotely on an iOS device, such as an iPad or iPhone. This can be achieved through a shared Wi-Fi connection. 

Note

Note that remote connection to an iOS device is only possible when Unity (and hence the Profiler) is running on an Apple Mac device.

Follow the given steps to connect the Profiler to an iOS device:

  1. Ensure that the Development Build and Autoconnect Profiler flags are enabled when the application is built.
  2. Connect both the iOS device and Mac device to a local Wi-Fi network, or to an ad hoc Wi-Fi network.
  3. Attach the iOS device to the Mac via the USB or Lightning cable.
  4. Begin building the application with the Build & Run option as usual.
  5. Open the Profiler window in the Unity Editor and select the device under Connected Player.

You should now see the iOS device's profiling data gathering in the Profiler window.

Note

The Profiler uses ports from 54998 to 55511 to broadcast profiling data. Ensure that these ports are available for outbound traffic if there is a firewall on the network.

For troubleshooting problems with building iOS applications and connecting the Profiler to them, consult the following documentation page: https://docs.unity3d.com/Manual/TroubleShootingIPhone.html.

Remote connection to an Android device

There are two different methods for connecting an Android device to the Unity Profiler: either through a Wi-Fi connection or using the Android Debug Bridge (ADB) tool. Either of these approaches will work from an Apple Mac, or a Windows PC.

Perform the following steps to connect an Android device over a Wi-Fi connection:

  1. Ensure that the Development Build and Autoconnect Profiler flags are enabled when the application is built.
  2. Connect both the Android and desktop devices to a local Wi-Fi network.
  3. Attach the Android device to the desktop device via the USB cable.
  4. Begin building the application with the Build & Run option as usual.
  5. Open the Profiler window in the Unity Editor and select the device under Connected Player.

The application should then be built and pushed to the Android device through the USB connection, and the Profiler should connect through the Wi-Fi connection. You should then see the Android device's profiling data gathering in the Profiler window.

The second option is to use ADB. This is a suite of debugging tools that comes bundled with the Android Software Development Kit (SDK). For ADB profiling, follow these steps:

  1. Ensure that the Android SDK is installed by following Unity's guide for Android SDK/NDK setup: https://docs.unity3d.com/Manual/android-sdksetup.html.
  2. Connect the Android device to your desktop machine via the USB cable.
  3. Ensure that the Development Build and Autoconnect Profiler flags are enabled when the application is built.
  4. Begin building the application with the Build & Run option as usual.
  5. Open the Profiler window in the Unity Editor and select the device under Connected Player.

You should now see the Android device's profiling data gathering in the Profiler window.

For troubleshooting problems with building Android applications and connecting the Profiler to them, consult the following documentation page: https://docs.unity3d.com/Manual/TroubleShootingAndroid.html.

Editor profiling

We can profile the Editor itself. This is normally used when trying to profile the performance of custom Editor Scripts. This can be achieved by enabling the Profile Editor option in the Profiler window and configuring the Connected Player option to Editor, as shown in the following screenshot:

Note that both options must be configured this way if we want to profile the Editor. Setting Connected Player to Editor without enabling the Profile Editor button is the default case, where the Profiler is collecting data for our application while it is running in Play Mode.

The Profiler window

We will now cover the essential features of the Profiler as they can be found within the interface.

The Profiler window is split into four main sections:

  • Profiler Controls
  • Timeline View
  • Breakdown View Controls
  • Breakdown View

These sections are shown in the following screenshot:

We'll cover each of these sections in detail.

Profiler controls

The top bar in the previous screenshot contains multiple drop-down and toggle buttons we can use to affect what is being profiled and how deeply data is gathered from each subsystem. They are covered in the next subsections.

Add Profiler

By default, the Profiler will collect data for several different subsystems, covering the majority of the Unity Engine's subsystems in the Timeline View. These subsystems are organized into various Areas containing relevant data. The Add Profiler option can be used to add additional Areas or restore them if they were removed. Refer to the Timeline View section for a complete list of subsystems we can profile.

Record

Enabling the Record option makes the Profiler record profiling data. This will happen continuously while this option is enabled. Note that runtime data can only be recorded if the application is actively running. For an app running in the Editor, this means that Play Mode must be enabled and it should not be paused; alternatively, for a standalone app, it must be the active window. If Profile Editor is enabled, then the data that appears will be gathered for the Editor itself.

Deep Profile

Ordinary profiling will only record the time and memory allocations made by the common Unity callback methods, such as Awake(), Start(), Update(), and FixedUpdate(). Enabling the Deep Profile option re-compiles our scripts with a much deeper level of instrumentation, allowing the Profiler to measure each and every invoked method. This causes a significantly greater instrumentation cost during runtime than normal and uses substantially more memory, since data is being collected for the entire callstack at runtime. As a consequence, Deep Profiling may not even be possible in large projects, as Unity may run out of memory before testing even begins, or the application may run so slowly as to make the test pointless.

Note

Note that toggling Deep Profile requires the entire project to be completely re-compiled before profiling can begin again, so it is best to avoid toggling the option back and forth between tests.

Since this option blindly measures the entire callstack, it would be unwise to keep it enabled during most of our profiling tests. This option is best reserved for when default profiling is not providing enough detail to figure out the root cause, or if we’re testing performance of a small test Scene, which we're using to isolate certain activities.

If Deep Profiling is required for larger projects and scenes, but the Deep Profile option is too much of a hindrance during runtime, then there are alternative approaches that can be used to perform more detailed profiling in the upcoming section titled Targeted profiling of code segments.

Profile Editor

The Profile Editor option enables Editor profiling, that is, gathering profiling data for the Unity Editor itself. This is useful in order to profile any custom Editor scripts we have developed.

Note

Remember that Connected Player must also be set to the Editor option for Editor profiling to occur.

Connected Player

The Connected Player drop-down offers choices to select the target instance of Unity we want to profile. This can be the current Editor application, a local standalone instance of our application, or an instance of our application running on a remote device.

Clear

The Clear button clears all profiling data from the Timeline View.

Load

The Load button will open up a dialog window to load in any previously-saved Profiler data (from using the Save option).

Save

The Save button saves any Profiler data currently presented in the Timeline View to a file. Only 300 frames of data can be saved in this fashion at a time, and a new file must be manually created for any more data. This is typically sufficient for most situations, since when a performance spike occurs we then have about five to ten seconds to pause the application and save the data for future analysis (such as attaching it to a bug report) before it gets pushed off the left side of the Timeline View. Any saved Profiler data can be loaded into the Profiler for future examination using the Load option.

Frame Selection

The Frame Counter shows how many frames have been profiled and which frame is currently selected in the Timeline View. There are two buttons to move the currently selected frame forward or backward by one frame and a third button (the Current button) that resets the selected frame to the most recent frame and keeps that position. This will cause the Breakdown View to always show the profiling data for the current frame during runtime profiling and will display the word Current.

Timeline View

The Timeline View reveals profiling data that has been collected during runtime, organized into a series of Areas. Each Area focuses on profiling data for a different subsystem of the Unity Engine and each is split into two sections: a graphical representation of profiling data on the right, and a series of checkboxes to enable/disable different activities/data types on the left. These colored boxes can be toggled, which changes the visibility of the corresponding data types within the graphical section of the Timeline View.

When an Area is selected in the Timeline View, more detailed information for that subsystem will be revealed in the Breakdown View (beneath the Timeline View) for the currently selected frame. The kinds of information displayed in the Breakdown View varies depending on which Area is currently selected in the Timeline View.

Areas can be removed from the Timeline View by clicking on the X at the top-right corner of an Area. Recall that Areas can be restored to the Timeline View through the Add Profiler option in the Controls bar.

At any time, we can click at a location in the graphical part of the Timeline View to reveal information about a given frame. A large vertical white bar will appear (usually with some additional information on either side coinciding with the line graphs), showing us which frame is selected.

Depending on which Area is currently selected (indicated by which Area is currently highlighted in blue), different information will be available in the Breakdown View, and different options will be available in the Breakdown View Controls. Changing the selected Area is as simple as clicking on the relevant box on the left-hand side of the Timeline View or on its graphical section. However, clicking inside the graphical section might also change which frame is selected, so be careful when clicking there if you wish to keep viewing Breakdown View information for the same frame.

Breakdown View Controls

Different drop-downs and toggle button options will appear within the Breakdown View Controls, depending on which Area is currently selected in the Timeline View. Different Areas offer different controls, and these options dictate what information is available, and how that information is presented in the Breakdown View.

Breakdown View

The information revealed in the Breakdown View will vary enormously based on which Area is currently selected and which Breakdown View Controls options are selected. For instance, some Areas offer different modes in a drop-down within the Breakdown View Controls, which can provide a simpler or detailed view of the information or even a graphical layout of the same information so that it can be parsed more easily.

Let's cover each Area and the different kinds of information and options available in the Breakdown View.

The CPU Usage Area

This Area shows data for all CPU usage and statistics. This Area is perhaps the most complex and useful since it covers a large number of Unity subsystems, such as MonoBehaviour Components, cameras, some rendering and physics processes, user interface (including the Editor's interface, if we're running through the Editor), audio processing, the Profiler itself, and more.

There are three different modes of displaying CPU usage data in the Breakdown View:

  • Hierarchy mode
  • Raw Hierarchy mode
  • Timeline mode

Hierarchy mode reveals most callstack invocations, while grouping similar data elements and global Unity function calls together for convenience. For instance, rendering delimiters, such as BeginGUI() and EndGUI() calls, are combined together in this mode. Hierarchy mode is helpful as a first step in determining which function calls cost the most CPU time to execute.

Raw Hierarchy mode is similar to Hierarchy mode, except that it separates global Unity function calls into individual entries rather than combining them into one bulk entry. This tends to make the Breakdown View more difficult to read, but it may be helpful if we're trying to count how many times a particular global method is invoked, or determining whether one of these calls is costing more CPU/memory than expected. For example, each BeginGUI() and EndGUI() call will be listed as a separate entry, making it much clearer how many times each is being called compared to the Hierarchy mode.

Perhaps the most useful mode for the CPU Usage Area is the Timeline mode option (not to be confused with the main Timeline View). This mode organizes CPU usage during the current frame by how the callstack expanded and contracted during processing.

Timeline mode organizes the Breakdown View vertically into different sections that represent different threads at runtime, such as Main Thread, Render Thread, and various background job threads called Unity Job System, used for loading activity such as scenes and other assets. The horizontal axis represents time, so wider blocks are consuming more CPU time than narrower blocks. The horizontal size also represents relative time, making it easy to compare how much time one function call took compared to another. The vertical axis represents the callstack, so deeper chains represent more calls in the callstack at that time.

Under Timeline mode, blocks at the top of the Breakdown View are functions (or, technically, callbacks) called by the Unity Engine at runtime (such as Start(), Awake(), or Update()), whereas the blocks beneath them are functions that those functions called into, which can include functions on other Components or regular C# objects.

The Timeline mode offers a very clean and organized way to determine which particular method in the callstack consumes the most time and how that processing time measures up against other methods being called during the same frame. This allows us to gauge the method that is the biggest cause of performance problems with minimal effort.

For example, let's assume that we are looking at a performance problem in the following screenshot. We can tell, with a quick glance, that there are three methods that are causing a problem, and they each consume similar amounts of processing time, due to their similar widths:

In the previous screenshot, we have exceeded our 16.667 millisecond budget with calls to three different MonoBehaviour Components. The good news is that we have three possible methods through which we can find performance improvements, which means lots of opportunities to find code that can be improved. The bad news is that improving the performance of one method will only address about one-third of the total processing for that frame. Hence, all three methods may need to be examined and optimized in order to get back under budget.

Note

It's a good idea to collapse the Unity Job System list when using Timeline mode, as it tends to obstruct the visibility of items shown in the Main Thread block, which is probably what we’re most interested in.

In general, the CPU Usage Area will be most useful for detecting issues that can be solved by solutions that will be explored in Chapter 2, Scripting Strategies.

The GPU Usage Area

The GPU Usage Area is similar to the CPU Usage Area, except that it shows method calls and processing time as it occurs on the GPU. Relevant Unity method calls in this Area will relate to cameras, drawing, opaque and transparent geometry, lighting and shadows, and so on.

The GPU Usage Area offers hierarchical information similar to the CPU Usage Area and estimates the time spent calling into various rendering functions, such as Camera.Render() (provided rendering actually occurs during the frame currently selected in the Timeline View).

The GPU Usage Area will be a useful tool to refer to when you go through Chapter 6, Dynamic Graphics.

The Rendering Area

The Rendering Area provides some generic rendering statistics that tend to focus on activities related to preparing the GPU for rendering, which is a set of activities that occur on the CPU (as opposed to the act of rendering, which is activity handled within the GPU and is detailed in the GPU Usage Area). The Breakdown View offers useful information, such as the number of SetPass calls (otherwise known as Draw Calls), the total number of batches used to render the Scene, the number of batches saved from Dynamic Batching and Static Batching and how they are being generated, as well as memory consumed for textures.

The Rendering Area also offers a button to open the Frame Debugger, which will be explored more in Chapter 3, The Benefits of Batching. The rest of this Area's information will prove useful when you go through Chapter 3, The Benefits of Batching, and Chapter 6, Dynamic Graphics.

The Memory Area

The Memory Area allows us to inspect memory usage of the application in the Breakdown View in the following two modes: 

  • Simple mode
  • Detailed mode

Simple mode provides only a high-level overview of memory consumption by subsystem. This includes Unity's low-level Engine, the Mono framework (the total heap size that is being watched by the Garbage Collector), graphical assets, audio assets and buffers, and even the memory used to store data collected by the Profiler.

Detailed mode shows memory consumption of individual GameObjects and MonoBehaviours for both their Native and Managed representations. It also has a column explaining the reason why an object may be consuming memory and when it might be deallocated.

Note

The Garbage Collector is a common feature provided by the various languages Unity supports, which automatically releases any memory we have allocated to store data, but if it is handled poorly it has the potential to stall our application for brief moments. This topic, and many more related topics such as Native and Managed memory spaces, will be explored in Chapter 8, Masterful Memory Management.

Note that information only appears in Detailed mode through manual sampling by clicking on the Take Sample: <TargetName> button. This is the only way to gather information when using Detailed mode, since performing this kind of analysis automatically for each update would be prohibitively expensive.

The Breakdown View also provides a button labelled Gather Object References, which can gather deeper memory information about some objects.

The Memory Area will be a useful tool to use when we dive into the complexities of memory management, Native versus Managed memory, and the Garbage Collector in Chapter 8, Masterful Memory Management.

The Audio Area

The Audio Area grants an overview of audio statistics and can be used both to measure CPU usage from the audio system and the total memory consumed by Audio Sources (whether playing or paused) and Audio Clips.

The Breakdown View provides lots of useful insight into how the Audio System is operating and how various audio channels and groups are being used.

The Audio Area may come in handy as we explore art assets in Chapter 4, Kickstart Your Art.

Note

Audio is often overlooked when it comes to performance optimization, but audio can become a surprisingly large source of bottlenecks if it is not managed properly due to the potential amount of hard disk access and CPU processing required. Don’t neglect it!

The Physics 3D and Physics 2D Areas

There are two different Physics Areas, one for 3D physics (Nvidia's PhysX) and another for the 2D physics system (Box2D). Each Area provides various physics statistics, such as Rigidbody, Collider, and Contact counts.

The Breakdown View for each Physics Area provides some rudimentary insight into the subsystem’s inner workings, but we can get further insight by exploring the Physics Debugger, which we will introduce in Chapter 5, Faster Physics.

The Network Messages and Network Operations Areas

These two Areas provide information about Unity's Networking System, which was introduced during the Unity 5 release cycle. The information present will depend on whether the application uses the High-Level API (HLAPI) or the Transport Layer API (TLAPI) provided by Unity. The HLAPI is an easier-to-use system for managing Player and GameObject network synchronization automatically, whereas the TLAPI is a thin layer that operates just above the socket level, allowing Unity developers to build their own networking systems.

Optimizing network traffic is a subject that fills an entire book all by itself, where the right solution is typically very dependent on the particular needs of the application. This will not be a Unity-specific problem, and as such, the topic of network traffic optimization will not be explored in this book.

The Video Area

If our application happens to make use of Unity's VideoPlayer API, then we might find this Area useful for profiling video playback behavior.

Optimization of media playback is also a complex, non-Unity-specific topic and will not be explored in this book.

The UI and UI Details Areas

These Areas are new in Unity 2017 and provide insight into applications making use of Unity's built-in User Interface System. If we're using a custom-built or third-party User Interface System (such as NGUI), then these Areas will probably provide little benefit.

A poorly optimized user interface can affect the CPU, the GPU, or both, so we will investigate some code optimization strategies for UI in Chapter 2, Scripting Strategies, and graphics-related approaches in Chapter 6, Dynamic Graphics.

The Global Illumination Area

The Global Illumination Area is another new Area in Unity 2017, and gives us a fantastic amount of detail into Unity's Global Illumination (GI) system. If our application makes use of GI, then we should refer to this Area to verify that it is performing properly.

This Area may become useful as we explore lighting and shadowing in Chapter 6, Dynamic Graphics.

 

Best approaches to performance analysis


Good coding practices and project asset management often make finding the root cause of a performance issue relatively simple, at which point the only real problem is figuring out how to improve the code. For instance, if a method only processes a single gigantic for loop, then it is a pretty safe assumption that the problem lies in how many iterations the loop performs, whether the loop is causing cache misses by reading memory in a non-sequential fashion, how much work is done in each iteration, or how much work it takes to prepare for the next iteration.

Of course, whether we're working individually or in a group setting, a lot of our code is not always written in the cleanest way possible, and we should expect to have to profile some poor coding work from time to time. Sometimes we are forced to implement a hacky solution for the sake of speed, and we don't always have the time to go back and refactor everything to keep up with our best coding practices. In fact, many code changes made in the name of performance optimization tend to appear very strange or arcane, often making our codebase more difficult to read. The common goal of software development is to make code that is clean, feature-rich, and fast. Achieving one of these is relatively easy, but the reality is that achieving two will cost significantly more time and effort, while achieving all three is a near-impossibility.

At its most basic level, performance optimization is just another form of problem-solving, and overlooking the obvious while problem-solving can be an expensive mistake. Our goal is to use benchmarking to observe our application looking for instances of problematic behavior, then use instrumentation to hunt through the code for clues about where the problem originates. Unfortunately, it's often very easy to get distracted by invalid data or jump to conclusions because we're being too impatient or missed a subtle detail. Many of us have run into occasions during software debugging where we could have found the root cause of the problem much faster if we had simply challenged and verified our earlier assumptions. Hunting down performance issues is no different.

A checklist of tasks would be helpful to keep us focused on the issue, and not waste time chasing so-called ghosts. Of course, every project is different, with its own unique challenges to overcome, but the following checklist is general enough that it should be able to apply to any Unity project:

  • Verifying that the target script is present in the Scene
  • Verifying that the script appears in the Scene the correct number of times
  • Verifying the correct order of events
  • Minimizing ongoing code changes
  • Minimizing internal distractions
  • Minimizing external distractions

Verifying script presence

Sometimes, there are things we expect to see but don't. These are usually easy to spot, because the human brain is very good at pattern recognition and at spotting differences we didn't expect. Meanwhile, there are times when we assume that something has been happening, but it hasn't. These are generally more difficult to notice, because we're often scanning for the first kind of problem, and we assume that the things we don't see are working as intended. In the context of Unity, one problem that manifests itself this way is verifying that the scripts we expect to be operating are actually present in the Scene.

Script presence can be quickly verified by typing the following into the Hierarchy window textbox:

t:<monobehaviour name>

For example, typing t:mytestmonobehaviour (note that it is not case-sensitive) into the Hierarchy textbox will show a shortlist of all GameObjects that currently have at least one MyTestMonoBehaviour script attached as a Component.

Note

Note that this shortlist feature also includes any GameObjects with Components that derive from the given script name.

We should also double-check that the GameObjects they are attached to are still enabled, since we may have disabled them during earlier testing, or someone or something may have accidentally deactivated the object.
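
If we want this check to survive future edits to the Scene, a small runtime assertion can complement the manual search. The following sketch reuses the MyTestMonoBehaviour example name from above (with a stand-in definition so the snippet compiles); FindObjectOfType() only returns active objects, so a null result also catches the deactivated-object case:

    using UnityEngine;

    // Stand-in for the real Component we expect to find in the Scene.
    public class MyTestMonoBehaviour : MonoBehaviour { }

    // Sketch of a runtime presence check for an expected Component.
    public class ScriptPresenceCheck : MonoBehaviour
    {
        private void Start()
        {
            if (FindObjectOfType<MyTestMonoBehaviour>() == null)
            {
                Debug.LogWarning("MyTestMonoBehaviour was not found on any active GameObject in the Scene.");
            }
        }
    }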

Verifying script count

If we're looking at our Profiler data and note that a certain MonoBehaviour method is being executed more times than expected, or is taking longer than expected, we might want to double-check that it only occurs as many times in the Scene as we expect it to. It's entirely possible that someone created the object more times than intended in the Scene file, or that we accidentally instantiated the object more than the expected number of times from code. If so, the problem could be due to conflicting or duplicated method invocations generating a performance bottleneck. We can verify the count using the same shortlist method shown in the Verifying script presence section.

If we expected a specific number of Components to appear in the Scene, but the shortlist revealed more (or fewer!) than this, then it might be wise to write some initialization code that prevents this from ever happening again. We could also write some custom Editor helpers to display warnings to any level designers who might be making this mistake.
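
A minimal sketch of such an initialization check might look like the following (the component and field names are our own; the idea is simply to fail loudly at load time if the count is wrong):

    using UnityEngine;

    // Sketch of an initialization guard: logs an error as soon as the Scene
    // loads if it contains an unexpected number of instances of this Component.
    public class ExpectedCountCheck : MonoBehaviour
    {
        [SerializeField] private int expectedCount = 1;

        private void Awake()
        {
            int actualCount = FindObjectsOfType<ExpectedCountCheck>().Length;
            if (actualCount != expectedCount)
            {
                Debug.LogErrorFormat("Expected {0} instance(s) of {1}, but found {2}.",
                    expectedCount, GetType().Name, actualCount);
            }
        }
    }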

Preventing casual mistakes such as this is essential for good productivity, since experience tells us that if we don't explicitly disallow something, then someone, somewhere, at some point, for whatever reason, will do it anyway. This is likely to cost us a frustrating afternoon hunting down a problem that eventually turns out to be caused by human error.

Verifying the order of events

Unity applications mostly operate as a series of callbacks from Native code to Managed code. This concept will be explained in more detail in Chapter 8, Masterful Memory Management, but, as a brief summary, Unity's main thread doesn't operate like a simple console application would. In such applications, code is executed from some obvious starting point (usually a main() function), and we then have direct control of the game engine: we initialize major subsystems, and then the game runs in a big while loop (often called the Game Loop) that checks for user input, updates the game, renders the current Scene, and repeats. This loop only exits once the player chooses to quit the game.

Instead, Unity handles the Game Loop for us, and we expect callbacks such as Awake(), Start(), Update(), and FixedUpdate() to be called at specific moments. The big difference is that we don't have fine-grained control over the order in which events of the same type are called. When a new Scene is loaded (whether it's the first Scene of the game, or a later Scene), every MonoBehaviour Component's Awake() callback gets called, but there's no way of telling which order this will happen in.

So, if we have one set of objects that configure some data in their Awake() callbacks, and another set of objects that do something with that configured data in their own Awake() callbacks, then some reorganization or recreation of Scene objects, or a seemingly random change in the codebase or compilation process (it's unclear exactly what causes it), may cause the order of these Awake() calls to change, and the dependent objects will then probably try to work with data that wasn't initialized the way we expected. The same goes for all other callbacks provided by MonoBehaviour Components, such as Start() and Update().

There's no way of telling the order in which the same type of callback gets called among a group of MonoBehaviour Components, so we should be very careful not to assume that object callbacks are happening in a specific order. In fact, it's essential practice to never write code that assumes these callbacks will be called in a certain order, because that order could change at any time.

A better place to handle late-stage initialization is in a MonoBehaviour Component's Start() callback, which is always called after every object’s Awake() is called and just before its first Update(). Late-stage updates can also be done in the LateUpdate() callback.
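
As an illustration of that split, the following sketch (class names are hypothetical, and the two classes are shown together only for brevity) configures its own data in Awake() and defers any cross-object work to Start():

    using UnityEngine;

    // One object configures its own data during Awake()...
    public class InventoryConfig : MonoBehaviour
    {
        public int slotCount;

        private void Awake()
        {
            // Touch only our own state here; other objects may not be ready yet.
            slotCount = 20;
        }
    }

    // ...and another object reads that data during Start(), which is safe
    // because every Awake() in the Scene has completed by then.
    public class InventoryConsumer : MonoBehaviour
    {
        private void Start()
        {
            var config = FindObjectOfType<InventoryConfig>();
            Debug.Log("Inventory slots available: " + config.slotCount);
        }
    }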

If we’re having trouble determining the actual order of events, then this is best handled by either step-through debugging with an IDE (MonoDevelop, Visual Studio, and so on) or by printing simple logging statements with Debug.Log().

Note

Be warned that Unity's logger is notoriously expensive. Logging is unlikely to change the order of the callbacks, but it can cause some unwanted spikes in performance if used too aggressively. Be smart and do targeted logging only on the most relevant parts of the codebase.

Coroutines are typically used to script a sequence of events, and when they are triggered depends on which yield types are being used. Perhaps the most difficult and unpredictable type to debug is the WaitForSeconds yield type. The Unity Engine is non-deterministic, meaning that you'll get slightly different behavior from one session to the next, even on the same hardware. For example, you might get 60 updates called during the first second of application runtime during one session, 59 in the next, and 62 in the one after that. In another session, you might get 61 updates in the first second, then 60, followed by 59.

A variable number of Update() callbacks will occur between when the Coroutine starts and when it ends, so if the Coroutine depends on something's Update() being called a specific number of times, we will run into problems. It's best to keep Coroutine behavior dead simple and free of dependencies on other behavior once it begins. Breaking this rule may be tempting, but it's essentially guaranteed that some future change will interact with the Coroutine in an unexpected way, leading to a long, painful debugging session for a game-breaking bug that's very hard to reproduce.
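
The following sketch makes the non-determinism easy to observe: run it a few times and the reported count of Update() calls that elapse during a one-second WaitForSeconds will vary slightly between sessions (the class name is our own):

    using System.Collections;
    using UnityEngine;

    // Counts how many Update() calls occur while a Coroutine waits for one
    // second; the result differs slightly from run to run.
    public class CoroutineTimingDemo : MonoBehaviour
    {
        private int updateCount;

        private void Update()
        {
            updateCount++;
        }

        private IEnumerator Start()
        {
            int countAtStart = updateCount;
            yield return new WaitForSeconds(1f);
            Debug.Log("Update() calls during the 1 second wait: " + (updateCount - countAtStart));
        }
    }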

Minimizing ongoing code changes

Making code changes to the application in order to hunt down performance issues is best done carefully, as the changes are easy to forget as time wears on. Adding debug logging statements to our code can be tempting, but remember that it costs us time to introduce these calls, recompile our code, and remove these calls once our analysis is complete. In addition, if we forget to remove them, then they can cost unnecessary runtime overhead in the final build since Unity's debug Console window logging can be prohibitively expensive in both CPU and memory.
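
One way to limit the damage from forgotten logging calls is to funnel them through a wrapper that compiles away outside of profiling builds. The sketch below uses the standard [Conditional] attribute with a scripting define symbol of our own choosing (PROFILING_LOGS, set under Player Settings | Scripting Define Symbols); when the symbol is absent, the calls and their arguments are stripped by the compiler:

    using System.Diagnostics;
    using Debug = UnityEngine.Debug;

    // Logging wrapper sketch: calls to DebugLog.Info() disappear entirely
    // from builds that do not define the PROFILING_LOGS symbol.
    public static class DebugLog
    {
        [Conditional("PROFILING_LOGS")]
        public static void Info(string message)
        {
            Debug.Log(message);
        }
    }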

A good way to combat this problem is to add a flag or comment, tagged with our name, everywhere we make a change, so that it's easy to find and remove later. Hopefully, we're also wise enough to use a source-control tool for our codebase, making it easy to differentiate the contents of any modified files and revert them to their original state. This is an excellent way to ensure that unnecessary changes don't make it into the final version. Of course, this is by no means a guaranteed solution if we also applied a fix at the same time and didn't double-check all of our modified files before committing the change.

Making use of breakpoints during runtime debugging is the preferred approach, as we can trace the full callstack, variable data, and conditional code paths (for example, if-else blocks), without risking any code changes or wasting time on recompilation. Of course, this is not always an option if, for example, we're trying to figure out what causes something strange to happen in one out of a thousand frames. In this case, it's better to determine a threshold value to look for and add an if statement with a breakpoint inside, which will be triggered when the value has exceeded the threshold.
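
A sketch of that threshold technique is shown below; the frame-time value and the 33-millisecond threshold are arbitrary examples, and the call to Debug.Break() (which pauses the Editor) can be replaced with an IDE breakpoint on the same line:

    using UnityEngine;

    // Pauses the Editor only on frames that blow past a chosen frame-time
    // threshold, so we can inspect state on the rare problem frame.
    public class SpikeCatcher : MonoBehaviour
    {
        private const float FrameTimeThresholdMs = 33f; // roughly two 60 FPS frames

        private void Update()
        {
            float frameTimeMs = Time.unscaledDeltaTime * 1000f;
            if (frameTimeMs > FrameTimeThresholdMs)
            {
                Debug.Break(); // or set an IDE breakpoint here instead
            }
        }
    }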

Minimizing internal distractions

The Unity Editor has its own little quirks and nuances, which can sometimes make it confusing to debug some kinds of problems.

Firstly, if a single frame takes a long time to process, such that our game noticeably freezes, then the Profiler may not be capable of picking up the results and recording them in the Profiler window. This can be especially annoying if we wish to catch data during application/Scene initialization. The upcoming section, Custom CPU profiling, will offer some alternatives to explore to solve this problem.

One common mistake (that I have admittedly fallen victim to multiple times during the writing of this book) is that if we are trying to initiate a test with a keystroke and have the Profiler open, we should not forget to click back into the Editor's Game window before triggering the keystroke. If the Profiler is the most recently clicked window, then the Editor will send keystroke events to that, instead of the runtime application, and hence no GameObject will catch the event for that keystroke. This can also apply to the GameView for rendering tasks and even Coroutines using the WaitForEndOfFrame yield type. If the Game window is not visible and active in the Editor, then nothing is being rendered to that view, and therefore, no events that rely on the Game window rendering will be triggered. Be warned!

Vertical Sync (otherwise known as VSync) is used to match the application's frame rate to the refresh rate of the device it is being displayed on. For example, a monitor may run at 60 Hertz (60 cycles per second), and if the rendering loop in our game is running faster than this, it will sit and wait until that time has elapsed before outputting the rendered frame. This feature reduces screen tearing, which occurs when a new image is pushed to the monitor before the previous image is finished, so that for a brief moment part of the new image overlaps the old image.

Executing the Profiler with VSync enabled will probably generate a lot of noisy spikes in the CPU Usage Area under the WaitForTargetFPS heading, as the application intentionally slows itself down to match the frame rate of the display. These spikes often appear very large in Editor Mode since the Editor is typically rendering to a very small window, which doesn’t take a lot of CPU or GPU work to render.

This will generate unnecessary clutter, making it harder to spot the real issue(s). We should ensure that we disable the VSync checkbox under the CPU Usage Area when we're on the lookout for CPU spikes during performance tests. We can disable the VSync feature entirely by navigating to Edit | Project Settings | Quality and then to the sub-page for the currently selected platform.
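
If we prefer to control this from code during a profiling session (leaving the Quality settings asset untouched), the same effect can be achieved with a small sketch like the following; the component name is our own:

    using UnityEngine;

    // Disables VSync for the current run so that WaitForTargetFPS noise does
    // not clutter the CPU Usage Area during profiling.
    public class ProfilingSetup : MonoBehaviour
    {
        private void Awake()
        {
            QualitySettings.vSyncCount = 0;     // 0 disables VSync
            Application.targetFrameRate = -1;   // -1 leaves the frame rate uncapped on standalone platforms
        }
    }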

We should also ensure that a drop in performance isn't a direct result of a massive number of exceptions and error messages appearing in the Editor Console window. Unity's Debug.Log() and similar methods, such as Debug.LogError() and Debug.LogWarning(), are notoriously expensive in terms of CPU usage and heap memory consumption, which can then cause garbage collection to occur and even more lost CPU cycles (refer to Chapter 8, Masterful Memory Management, for more information on these topics).

This overhead is usually unnoticeable to a human being looking at the project in Editor Mode, where most errors come from the compiler or misconfigured objects. However, these methods can be problematic during any kind of runtime process, especially during profiling, where we wish to observe how the game runs in the absence of external disruptions. For example, if an object reference that we were supposed to assign through the Editor is missing and is being used in an Update() callback, then a single MonoBehaviour could throw a new exception every single update, adding a great deal of unnecessary noise to our profiling data.
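As a hedged example of that failure mode (the class and field names here are hypothetical), a cheap guard prevents an unassigned reference from throwing an exception every frame while we profile:

// requires: using UnityEngine;
public class FollowTarget : MonoBehaviour {
  public Transform target;      // expected to be assigned in the Inspector
  private bool _warned;

  void Update() {
    if (target == null) {
      if (!_warned) {           // log once instead of throwing every frame
        Debug.LogWarning("FollowTarget: target was never assigned", this);
        _warned = true;
      }
      return;
    }
    transform.LookAt(target);
  }
}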

Note that we can hide different log level types with the buttons shown in the next screenshot. The extra logging still costs CPU and memory to execute even though the messages are not being rendered, but it does allow us to filter out the junk we don't want. That said, it is often good practice to keep all of these options enabled to verify that we're not missing anything important.

Minimizing external distractions

This one is simple but absolutely necessary. We should double-check that there are no background processes eating away at CPU cycles or consuming vast swathes of memory. Being low on available memory will generally interfere with our testing, as it can cause more cache misses, hard drive access for virtual memory page-file swapping, and generally sluggish responsiveness in the application. If our application suddenly behaves significantly worse than expected, double-check the system's task manager (or equivalent) for any CPU/memory/hard disk activity that might be causing problems.

Targeted profiling of code segments

If our performance problem isn't resolved by the checklist mentioned previously, then we probably have a real issue on our hands that demands further analysis. The Profiler window is effective at showing us a broad overview of performance; it can help us find specific frames to investigate and can quickly inform us which MonoBehaviour and/or method may be causing issues. We would then need to figure out whether the problem is reproducible, under what circumstances a performance bottleneck arises, and where exactly within the problematic code block the issue is originating from.

To accomplish these tasks, we will need to profile targeted sections of our code, and there are a handful of useful techniques we can employ for this. For Unity projects, they essentially fit into two categories:

  • Controlling the Profiler from script code
  • Custom timing and logging methods

Note

Note that the next section focuses on how to investigate Scripting bottlenecks through C# code. Detecting the source of bottlenecks in other engine subsystems will be discussed in their related chapters.

Profiler script control

The Profiler can be controlled in script code through the Profiler class. There are several useful methods in this class that we can explore within the Unity documentation, but the most important ones are the delimiter methods that activate and deactivate profiling at runtime: the BeginSample() and EndSample() methods of the UnityEngine.Profiling.Profiler class.

Note

Note that the delimiter methods, BeginSample() and EndSample(), are compiled only into development builds; in release builds, where the Development Build option is unchecked, they are compiled out and never executed. This is commonly known as non-operation, or no-op, code.

The BeginSample() method has an overload that allows a custom name for the sample to appear in the CPU Usage Area's Hierarchy mode. For example, the following code will profile invocations of this method and make the data appear in the Breakdown View under a custom heading:

// requires: using System.Collections.Generic; and using UnityEngine.Profiling;
void DoSomethingCompletelyStupid() {
  Profiler.BeginSample("My Profiler Sample");
  // build a huge List purely to burn CPU time and allocate memory
  List<int> listOfInts = new List<int>();
  for(int i = 0; i < 1000000; ++i) {
    listOfInts.Add(i);
  }
  Profiler.EndSample();
}

Note

You can download the example code files from your account at http://www.packtpub.com for all the Packt Publishing books you have purchased. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files emailed directly to you.

We should expect that invoking this poorly designed method (which generates a List containing a million integers, and then does absolutely nothing with it) will cause a huge spike in CPU usage, chew up several megabytes of memory, and appear in the Profiler's Breakdown View under the heading My Profiler Sample, as the following screenshot shows:

Custom CPU profiling

The Profiler is just one tool at our disposal. Sometimes, we may want to perform customized profiling and logging of our code. Maybe we're not confident that the Unity Profiler is giving us the right answer, maybe we consider its overhead cost too great, or maybe we just like having complete control of every single aspect of our application. Whatever our motivations, knowing some techniques to perform an independent analysis of our code is a useful skill to have. It's unlikely we'll only be working with Unity for the entirety of our game development careers, after all.

Profiling tools are generally very complex, so it's unlikely we would be able to generate a comparable solution on our own within a reasonable time frame. When it comes to testing CPU usage, all we should really need is an accurate timing system, a fast, low-cost way of logging that information, and some piece of code to test against. It just so happens that the .NET library (or, technically, the Mono framework) comes with a Stopwatch class under the System.Diagnostics namespace. We can stop and start a Stopwatch object at any time, and we can easily acquire a measure of how much time has passed since the Stopwatch was started.

Unfortunately, this class is not perfectly accurate; it is accurate only to milliseconds, or tenths of a millisecond, at best. Counting high-precision real time with a CPU clock can be a surprisingly difficult task when we start to get into it; so, in order to avoid a detailed discussion of the topic, we should try to find a way for the Stopwatch class to satisfy our needs.

If precision is important, then one effective way to increase it is by running the same test multiple times. Assuming that the test code block is both easily repeatable and not exceptionally long, we should be able to run thousands, or even millions of tests within a reasonable time frame and then divide the total elapsed time by the number of tests we just performed to get a more accurate time for a single test.

Before we get obsessed with the topic of high precision, we should first ask ourselves whether we even need it. Most games expect to run at 30 FPS or 60 FPS, which means that they only have around 33 milliseconds or 16 milliseconds, respectively, to compute everything for the entire frame. So, hypothetically, if we only need to bring the performance of a particular code block under ten milliseconds, then repeating the test thousands of times to obtain microsecond precision is overkill; that level of precision is several orders of magnitude finer than the target.

The following is a class definition for a custom timer that uses a Stopwatch to count time for a given number of tests:

using System;
using System.Diagnostics;

public class CustomTimer : IDisposable {
  private string _timerName;
  private int _numTests;
  private Stopwatch _watch;

  // give the timer a name, and a count of the
  // number of tests we're running
  public CustomTimer(string timerName, int numTests) {
    _timerName = timerName;
    _numTests = numTests;
    if (_numTests <= 0) {
      _numTests = 1;
    }
    _watch = Stopwatch.StartNew();
  }

  // automatically called when the 'using()' block ends
  public void Dispose() {
    _watch.Stop();
    // Elapsed.TotalMilliseconds preserves the fractional part, unlike
    // the integer-valued ElapsedMilliseconds property
    double ms = _watch.Elapsed.TotalMilliseconds;
    UnityEngine.Debug.Log(string.Format("{0} finished: {1:0.00} " +
        "milliseconds total, {2:0.000000} milliseconds per-test " +
        "for {3} tests", _timerName, ms, ms / _numTests, _numTests));
  }
}

Note

Adding an underscore before member variable names is a common and useful way of distinguishing a class' member variables (also known as fields) from a method's arguments and local variables.

The following is an example of the CustomTimer class usage:

const int numTests = 1000;
using (new CustomTimer("My Test", numTests)) {
  for(int i = 0; i < numTests; ++i) {
    TestFunction();
  }
} // the timer's Dispose() method is automatically called here

There are three things to note when using this approach. Firstly, we are only measuring an average across multiple method invocations. If processing time varies enormously between invocations, that variation will not be well represented in the final average.

Secondly, if memory access is common, then repeatedly requesting the same blocks of memory will result in an artificially high cache hit rate (where the CPU can find data in memory very quickly because it has accessed the same region recently), which will bring the average time down compared to a typical invocation.

Thirdly, the effects of Just-In-Time (JIT) compilation will be effectively hidden for similarly artificial reasons, as it only affects the first invocation of the method. JIT compilation is a .NET feature that will be covered in more detail in Chapter 8, Masterful Memory Management.
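One simple way to factor out the third point (a sketch under the assumption that an extra untimed call is acceptable) is to invoke the method once before starting the timer, so the one-time JIT compilation cost falls outside the measurement:

TestFunction();   // warm-up call; JIT compilation happens outside the timer
const int numTests = 1000;
using (new CustomTimer("Post-JIT Test", numTests)) {
  for(int i = 0; i < numTests; ++i) {
    TestFunction();
  }
}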

The using block is typically used to safely ensure that unmanaged resources are properly destroyed when they go out of scope. When the using block ends, it will automatically invoke the object's Dispose() method to handle any cleanup operations. In order to achieve this, the object must implement the IDisposable interface, which forces it to define the Dispose() method.

However, the same language feature can be used to create a scoped block of code around a short-lived object, which then automatically performs some useful work when the block ends; this is how it is being used in the preceding code block.

Note

Note that the using block should not be confused with the using statement, which is used at the start of a script file to pull in additional namespaces. It is rather ironic that C# reuses the keyword for managing namespaces for this entirely different resource-disposal feature.

As a result, the using block and the CustomTimer class give us a clean way of wrapping our test code in a way that makes it obvious when and where it is being used.

Another concern to worry about is application warm-up time. Unity has a significant startup cost when a Scene begins, given the amount of data that needs to be loaded from disk, the initialization of complex subsystems, such as the Physics and Rendering Systems, and the number of calls to various Awake() and Start() callbacks that need to be resolved before anything else can happen. This early overhead might only last a second, but it can have a significant effect on the results of our testing if the code is also executed during this early initialization period. If we want an accurate test, it is therefore crucial that any runtime testing begins only after the application has reached a steady state.

Ideally, we would be able to execute the target code block in its own Scene after its initialization has completed. This is not always possible, so as a backup plan, we could wrap the target code block in an Input.GetKeyDown() check in order to assume control over when it is invoked. For example, the following code will execute our test method only when the spacebar is pressed:

if (Input.GetKeyDown(KeyCode.Space)) {
  const int numTests = 1000;
  using (new CustomTimer("Controlled Test", numTests)) {
    for(int i = 0; i < numTests; ++i) {
      TestFunction();
    }
  }
}
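If keyboard input is awkward to use (for example, on a mobile device), another option is to delay the test with a coroutine. This is a sketch, not from the book, and the five-second warm-up window is an arbitrary assumption:

// requires: using System.Collections;
IEnumerator Start() {
  // let Scene loading, Awake()/Start() calls, and other initialization settle
  yield return new WaitForSeconds(5f);
  const int numTests = 1000;
  using (new CustomTimer("Steady-state Test", numTests)) {
    for(int i = 0; i < numTests; ++i) {
      TestFunction();
    }
  }
}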

As mentioned previously, Unity's Console window logging mechanism is prohibitively expensive. As a result, we should try not to use these logging methods in the middle of a profiling test (or during gameplay, for that matter). If we find ourselves absolutely needing detailed profiling data that prints out lots of individual messages (such as performing a timing test on a loop to figure out which iteration is costing more time than the rest), then it would be wiser to cache the logging data and print it all out at the end, as the CustomTimer class does. This will reduce runtime overhead, at the cost of some memory consumption. The alternative is that many milliseconds are lost to printing each Debug.Log() message in the middle of the test, which pollutes the results.
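The following is a minimal sketch of that cache-then-log idea (the method name is hypothetical, and TestFunction() is the same placeholder as before): per-iteration timings are collected in a StringBuilder and emitted in a single Debug.Log() call once the loop finishes:

// requires: using System.Diagnostics; and using System.Text;
void TimeEachIteration(int numTests) {
  StringBuilder report = new StringBuilder();
  Stopwatch watch = new Stopwatch();
  for(int i = 0; i < numTests; ++i) {
    watch.Reset();
    watch.Start();
    TestFunction();
    watch.Stop();
    report.AppendFormat("Iteration {0}: {1:0.000} ms\n",
        i, watch.Elapsed.TotalMilliseconds);
  }
  // a single log call at the end, instead of one per iteration
  UnityEngine.Debug.Log(report.ToString());
}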

The CustomTimer class also makes use of string.Format(). This will be covered in more detail in Chapter 8, Masterful Memory Management, but the short explanation is that this method is used because generating a custom string object using the + operator (for example, code such as Debug.Log("Test: " + output);) can result in a surprisingly large number of memory allocations, which attracts the attention of the garbage collector. That would conflict with our goal of achieving accurate timing and analysis, and should be avoided.
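As a brief, hedged illustration of the difference (the variable names and values are made up): the concatenation form converts each non-string value into its own temporary string before joining them, whereas string.Format() assembles the message in a single formatting pass, as CustomTimer does:

int iteration = 42;                    // hypothetical values for illustration
double elapsedMs = 16.7;

// each non-string value is boxed and converted to its own temporary string:
UnityEngine.Debug.Log("Iteration " + iteration + " took " + elapsedMs + " ms");

// the message is assembled in one formatting pass instead:
UnityEngine.Debug.Log(string.Format("Iteration {0} took {1:0.00} ms",
    iteration, elapsedMs));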

 

Final thoughts on Profiling and Analysis


One way of thinking about performance optimization is as the act of stripping away unnecessary tasks that waste valuable resources. We can do the same for our own productivity by minimizing any wasted effort. Effective use of the tools we have at our disposal is of paramount importance, and it serves us well to optimize our own workflow by staying aware of some best practices and techniques.

Most, if not all, advice for using any kind of data-gathering tool properly can be summarized into three different strategies:

  • Understanding the tool
  • Reducing noise
  • Focusing on the issue

Understanding the Profiler

The Profiler is an arguably well-designed and intuitive tool, so a solid understanding of the majority of its feature set can be gained simply by spending an hour or two exploring its options with a test project and reading its documentation. The more we know about a tool in terms of its benefits, pitfalls, features, and limitations, the more sense we can make of the information it gives us, so it is worth spending the time to use it in a playground setting. We don't want to find ourselves two weeks from release, with a hundred performance defects to fix and no idea how to carry out performance analysis efficiently.

For example, always remain aware of the relative nature of the Timeline View's graphical display. The Timeline View does not provide values on its vertical axis and automatically rescales this axis based on the content of the last 300 frames; this relative rescaling can make small spikes look like bigger problems than they really are. So, just because a spike or resting state in the timeline seems large and threatening does not necessarily mean there is a performance issue.

Several Areas in the Timeline View provide helpful benchmark bars, which appear as horizontal lines with a timing and FPS value associated with them. These should be used to determine the magnitude of the problem. Don't let the Profiler trick us into thinking that big spikes are always bad. As always, it's only important if the user will notice it.

As an example, if a large CPU usage spike does not exceed the 60 FPS or 30 FPS benchmark bar (depending on the application's target frame rate), then it would be wise to ignore it and search elsewhere for CPU performance issues; no matter how much we improve the offending piece of code, the change will probably never be noticed by the end user, so it isn't a critical issue affecting user experience.

Reducing noise

The classical definition of noise (at least in the realm of computer science) is meaningless data, and a batch of profiling data that was blindly captured with no specific target in mind is always full of information that won't interest us. More sources of data take more time to mentally process and filter, which can be very distracting. One of the best ways to avoid this is to simply reduce the amount of data we need to process by stripping away anything deemed non-vital to the current situation.

Reducing clutter in the Profiler's graphical interface will make it easier to determine which subsystems are causing a spike in resource usage. Remember to use the colored checkboxes in each Timeline View Area to narrow the search.

Note

Be warned that these settings are saved automatically by the Editor, so ensure that you re-enable them before the next profiling session; otherwise, we might miss something important next time.

Also, GameObjects can be deactivated to prevent them from generating profiling data, which will help reduce clutter further. This will naturally cause a slight performance boost for each object we deactivate; however, if we are gradually deactivating objects and performance suddenly becomes significantly more acceptable when a specific object is disabled, then that object is clearly related to the root cause of the problem.

Focusing on the issue

This category may seem redundant, given that we've already covered reducing noise. All we should have left is the issue at hand, right? Not exactly. Focus is the skill of not letting ourselves become distracted by inconsequential tasks and wild goose chases.

Recall that profiling with the Unity Profiler comes with a minor performance cost. This cost is even more severe when using the Deep Profiling option. We might even introduce more minor performance costs into our application with additional logging. It's easy to forget when and where we introduced profiling code if the hunt continues for several hours.

We are effectively changing the result by measuring it. Any changes we implement during data sampling can sometimes lead us to chase after non-existent bugs in the application, when we could have saved ourselves a lot of time by attempting to replicate the scenario without additional profiling instrumentation. If the bottleneck is reproducible and noticeable without profiling, then it's a candidate to begin an investigation. However, if new bottlenecks keep appearing in the middle of an existing investigation, then keep in mind that they could be bottlenecks we introduced with our test code and not an existing problem that's been newly exposed.

Finally, when we have finished profiling, completed our fixes, and are now ready to move on to the next investigation, we should make sure to profile the application one last time to verify that the changes have had the intended effect.

 

Summary


You learned a great deal throughout this chapter about how to detect and analyze performance issues within your application. You learned about many of the Profiler's features and secrets, explored a variety of tactics for investigating performance issues with a more hands-on approach, and were introduced to a variety of tips and strategies to follow. You can use these to improve your productivity immensely, so long as you appreciate the wisdom behind them and remember to apply them whenever the situation allows.

This chapter has introduced us to the tips, tactics, and strategies we will need to find a performance problem that needs improvement. In the remaining chapters, we will explore methods on how to fix issues and improve performance whenever possible. So, give yourself a pat on the back for getting through the boring part first, and let's move on to learning some approaches to optimize our C# scripts.

 

 

 
