Metal API: Get closer to the bare metal with Metal API

Chuck Gaffney

February 2016

The Metal framework supports GPU-accelerated 3D graphics rendering and data-parallel compute commands. In game development, Metal is used to reduce CPU overhead.

In this article we'll cover:

  • CPU/GPU framework levels
  • Graphics pipeline overview


The Apple Metal API and the graphics pipeline

One of the rules, if not the golden rule, of modern video game development is to keep our games running at a constant 60 frames per second or greater. When developing for VR devices and applications, this is even more important, as dropped frames can lead to a sickening, game-ending experience for the player.

In the past, being lean was the name of the game; hardware limitations restricted not only how much could be drawn to the screen but also how much memory a game could use. This limited the number of scenes, characters, effects, and levels. Game development back then was approached with more of an engineering mindset, so developers made things work with what little they had. Many of the games on 8-bit systems and earlier had levels and characters that differed only through elaborate sprite slicing and recoloring.

Over time, advances in hardware, particularly in GPUs, allowed for richer graphical experiences. This led to the advent of computation-heavy 3D models, real-time lighting, robust shaders, and other effects that we can use to give our games an even greater player experience, all while trying to fit everything into that precious 1/60-second window (roughly 0.0167 seconds, or 16.7 ms, per frame at 60 Hz).
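The frame window mentioned above is simple arithmetic, and it can help to see it spelled out. The following is a minimal, language-neutral sketch in Python rather than Metal or Swift code; the function names and the sample frame times are hypothetical and exist only for illustration.

```python
def frame_budget_ms(target_fps: float) -> float:
    """Return the time available to produce one frame, in milliseconds."""
    if target_fps <= 0:
        raise ValueError("target_fps must be positive")
    return 1000.0 / target_fps


def count_dropped_frames(frame_times_ms, target_fps: float = 60.0) -> int:
    """Count the frames whose processing time exceeded the per-frame budget."""
    budget = frame_budget_ms(target_fps)
    return sum(1 for t in frame_times_ms if t > budget)


if __name__ == "__main__":
    # Hypothetical measurements for four frames, in milliseconds.
    samples = [10.0, 16.0, 17.0, 33.4]
    print(f"Budget at 60 fps: {frame_budget_ms(60):.2f} ms")
    print(f"Frames over budget: {count_dropped_frames(samples)}")
```

A higher target, such as the 90 fps often quoted for VR headsets, shrinks the budget to about 11.1 ms, which is why dropped frames become an even bigger concern there.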

To get everything out of the hardware, and to ease the clash between a designer's desire for the best-looking experience and the engineering reality of hardware limitations in even today's CPUs and GPUs, Apple developed the Metal API.

CPU/GPU framework levels

Metal is what's known as a low-level GPU API. When we build our games on the iOS platform, there are different levels between the machine code in our CPU/GPU hardware and what we use to design our games. This goes for any piece of computer hardware we work with, be it Apple's or anyone else's. For example, on the CPU side of things, at the very base of it all is the machine code. The next level up is the assembly language of the chipset. Assembly language differs based on the CPU chipset and lets the programmer be as detailed as choosing the individual processor registers to move data in and out of. Just a few lines of a for-loop in C/C++ would take a decent number of lines to code in assembly. The benefit of working at the lower levels of code is that we can make our games run much faster. However, most of the mid- to upper-level languages and APIs work well enough that this is no longer a necessity.

Game developers continued to code in assembly even after the very early days of game development. In the late 1990s, the game developer Chris Sawyer created his game, RollerCoaster Tycoon™, almost entirely in x86 assembly language! Assembly can be a great challenge for any enthusiastic developer who loves to tinker with the inner workings of computer hardware.

Moving up the chain, we reach the level of C/C++ code, and just above that is where we'd find Swift and Objective-C code. Languages such as Ruby and JavaScript, which some developers can use in Xcode, are yet another level up.

That covers the CPU; now on to the GPU. The Graphics Processing Unit (GPU) is the coprocessor that works with the CPU to perform the calculations for the visuals we see on the screen. The following diagram shows the GPU, the APIs that work with the GPU, and possible iOS games that can be made based on which framework/API is chosen.

As with the CPU, the lowest level is the processor's machine code. To work as close to the GPU's machine code as possible, many developers would use the OpenGL API, originally developed by Silicon Graphics. For mobile devices, such as the iPhone and iPad, it would be the OpenGL subset, OpenGL ES. Apple provides a helper framework/library for OpenGL ES named GLKit. GLKit helps simplify some of the shader logic and lessens the manual work that goes into working with the GPU at this level. For many game developers, this was originally practically the only option for making 3D games on the iOS device family, though iOS's Core Graphics, Core Animation, and UIKit frameworks were perfectly fine for simpler games.

Not too long into the lifespan of the iOS device family, third-party frameworks aimed at game development came into play. Using OpenGL ES as its base, and thus sitting directly one level above it, is the Cocos2D framework. This was actually the framework used in the original release of Rovio's Angry Birds™ series of games back in 2009. Eventually, Apple realized how important gaming was to the success of the platform and made its own game-centric frameworks, namely SpriteKit and SceneKit. They too, like Cocos2D/3D, sat directly above OpenGL ES. When we made SKSpriteNode or SCNNode objects in our Xcode projects, up until the introduction of Metal, OpenGL operations were being used behind the scenes to draw these objects in the update/render cycle. As of iOS 9, SpriteKit and SceneKit use Metal's rendering pipeline to draw graphics to the screen. If the device is older, they revert to OpenGL ES as the underlying graphics API.

Graphics pipeline overview

Let's take a look at the graphics pipeline to get an idea, at least at a high level, of what the GPU is doing during a single rendered frame. We can imagine the graphical data of our games being divided into two main categories:

  • Vertex data: This is the position information that determines where on the screen something is rendered. Vector/vertex data can be expressed as points, lines, or triangles. Remember the old saying about video game graphics: "everything is a triangle." All of those polygons in a game are just collections of triangles defined by their point/vector positions. The GPU's Vertex Processing Unit (VPU) handles this data.
  • Rendering/pixel data: Controlled by the GPU's rasterizer, this is the data that tells the GPU how the objects positioned by the vertex data will be colored/shaded on the screen. For example, this is where color channels, such as RGB and alpha, are handled. In short, it's the pixel data: what we actually see on the screen.
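To make the two categories above concrete, here is a minimal sketch of how a single vertex might carry both a position (vertex data) and a color (rendering/pixel data), and how a four-cornered polygon reduces to two triangles. It is written in Python for illustration only and is not Metal code; the Vertex type, its field names, and quad_to_triangles are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass(frozen=True)
class Vertex:
    # Vertex data: where the point sits (x, y, z).
    position: Tuple[float, float, float]
    # Rendering/pixel data: how it should be colored (red, green, blue, alpha).
    color: Tuple[float, float, float, float]


Triangle = Tuple[Vertex, Vertex, Vertex]


def quad_to_triangles(corners: Sequence[Vertex]) -> List[Triangle]:
    """Split a quad, given as four corners in order around its edge, into two triangles."""
    if len(corners) != 4:
        raise ValueError("a quad needs exactly four corners")
    a, b, c, d = corners
    # Both triangles share the diagonal running from a to c.
    return [(a, b, c), (a, c, d)]


if __name__ == "__main__":
    white = (1.0, 1.0, 1.0, 1.0)
    quad = [
        Vertex((-1.0, -1.0, 0.0), white),
        Vertex((1.0, -1.0, 0.0), white),
        Vertex((1.0, 1.0, 0.0), white),
        Vertex((-1.0, 1.0, 0.0), white),
    ]
    for triangle in quad_to_triangles(quad):
        print([v.position for v in triangle])
```

The same idea scales up: a complex model is a long list of such vertices grouped into triangles.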

Here's a diagram showing the graphics pipeline overview:

The graphics pipeline is the sequence of steps it takes to have our data rendered to the screen. The previous diagram is a simplified example of this process. Here are the main sections that can make up the pipeline:

  • Buffer objects: These are known as Vertex Buffer Objects in OpenGL and are represented by the MTLBuffer type in the Metal API. These are the objects we create in our code that are sent from the CPU to the GPU for primitive processing. These objects contain data such as positions, normal vectors, alphas, colors, and more.
  • Primitive processing: These are the steps in the GPU that take our buffer objects, break down the various vertex and rendering data in those objects, and then draw this information to the frame buffer, which is the screen output we see on the device.
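At its core, a buffer object like those described above is a contiguous block of bytes whose layout the CPU and GPU agree on. The sketch below packs three vertices into such a block with Python's standard struct module. It is not Metal code and does not create an MTLBuffer; the layout (three position floats followed by four color floats, tightly packed, little-endian) is an assumption chosen for illustration, and a real shader's vertex structure may need different padding or alignment.

```python
import struct

# Assumed layout per vertex: x, y, z, then r, g, b, a as 32-bit floats.
VERTEX_FORMAT = "<3f4f"
VERTEX_STRIDE = struct.calcsize(VERTEX_FORMAT)  # bytes occupied by one vertex


def pack_vertices(vertices):
    """Pack (position, color) pairs into one contiguous byte string."""
    buffer = bytearray()
    for position, color in vertices:
        buffer += struct.pack(VERTEX_FORMAT, *position, *color)
    return bytes(buffer)


def unpack_vertex(buffer, index):
    """Read the vertex at the given index back out as (position, color)."""
    values = struct.unpack_from(VERTEX_FORMAT, buffer, index * VERTEX_STRIDE)
    return values[:3], values[3:]


if __name__ == "__main__":
    # A single triangle with a different color at each corner.
    triangle = [
        ((0.0, 1.0, 0.0), (1.0, 0.0, 0.0, 1.0)),
        ((1.0, -1.0, 0.0), (0.0, 1.0, 0.0, 1.0)),
        ((-1.0, -1.0, 0.0), (0.0, 0.0, 1.0, 1.0)),
    ]
    data = pack_vertices(triangle)
    print(f"{len(data)} bytes, stride {VERTEX_STRIDE}")
    print(unpack_vertex(data, 1))
```

The stride, the number of bytes from one vertex to the next, is what lets the reading side find any vertex by its index.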

Before we go over the steps of primitive processing done in Metal, we should first understand the history and basics of shaders.


This article covered the CPU/GPU framework levels and gave an overview of the graphics pipeline. We also learned that Apple developed the Metal API to help overcome the hardware limitations found in even today's CPUs and GPUs.

You've been reading an excerpt of:

iOS 9 Game Development Essentials
