In augmented reality, the real world around us is overlaid with virtual content. Whether that is an immersive 3D experience or simple text and indicators, augmented reality is both an old concept and a rising new technology. In this chapter, we will go over the concept of augmented reality in its many forms, to broaden our view of its scope and how it can be utilized. We will also go over the tools that will help us realize this concept on the explosively growing mobile platforms, particularly iOS devices. The following image shows an augmented reality game:
Augmented reality (AR), in its broadest and simplest definition, is the technology that enables the addition of virtual content to the real world. This is usually associated with the addition of 3D content to a live camera feed, though the term itself has a much broader meaning and usage.
Perhaps the simplest form of augmented reality is one that people have been using for decades: the viewfinder found in photo cameras. Many have used it, but very few realized the nature of the concept being applied. The viewfinder is the little window you look through to view the world through the camera, and it is in fact augmented reality in a very simple form. It looks at the world through the lens and adds a layer printed on glass that highlights the center of the lens and the borders of the image to be captured. In doing so, it does what augmented reality in all its forms aspires to do, which is to layer relevant information over the real world.
On mobile platforms, augmented reality works on the same principles even if the method is slightly different. The camera captures a live feed of the world around it, and computer vision systems then try to get a bearing in the visible 3D space and display the augmented content in a way that is seamless with the world. The process of calculating the position of the user relative to the surrounding reality, so that the content can be augmented correctly for the user, is called tracking.
Augmented reality can take many forms. It always depends, in one form or another, on a technique for calculating the 3D space relative to the reality around us, and many technologies can achieve that. For example, the gyroscope on an iPhone can be used to track the orientation of the phone in 3D space, which in turn can be used to track the world relative to the device as the device moves. This approach is seen in a number of augmented reality games for the device. It is certainly a form of AR, but in this book we will mainly be concerned with one form of tracking, which uses computer vision.
Computer vision tracking can be divided into two categories: marker tracking and markerless tracking. In marker tracking, there is a physical entity that the computer vision is trained to track, and it positions the camera's perspective relative to that entity. The physical object used is usually called a trackable. The trackable is usually treated internally as the origin and center of the world, to which the computer vision can orient itself. Sometimes the camera is instead considered the center of the world, and the trackable or trackables are objects that orbit it in space, so to speak.
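The two conventions just described, trackable as the center of the world versus camera as the center of the world, carry the same information and can be converted into one another. The following is a minimal sketch of that conversion, assuming a pose is represented as a 3x3 rotation matrix plus a translation vector. The function names are hypothetical and are not part of any SDK:

```python
# Hypothetical helpers: a pose is a 3x3 rotation matrix plus a translation vector.

def transpose(matrix):
    return [list(row) for row in zip(*matrix)]


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def invert_pose(rotation, translation):
    """Given the pose of A in B's frame, return the pose of B in A's frame."""
    inverse_rotation = transpose(rotation)  # a rotation's inverse is its transpose
    inverse_translation = [-x for x in mat_vec(inverse_rotation, translation)]
    return inverse_rotation, inverse_translation


# Camera-centered view: the trackable is rotated 90 degrees about the
# vertical (y) axis and sits 2 units in front of the camera (along z).
trackable_in_camera = (
    [[0.0, 0.0, 1.0],
     [0.0, 1.0, 0.0],
     [-1.0, 0.0, 0.0]],
    [0.0, 0.0, 2.0],
)

# Trackable-centered view: where is the camera relative to the trackable?
camera_in_trackable = invert_pose(*trackable_in_camera)
```

In this example, the camera ends up 2 units along the trackable's x axis. Inverting the result again returns the original camera-centered pose, which is why either convention can serve as the center of the world.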
Markerless tracking techniques are essentially similar to marker tracking in that they try to find an origin point relative to which the reality is augmented. They differ in how they find that origin point: unlike marker tracking, they do so without a predefined physical object that the computer vision is specifically trained to follow. In markerless tracking, the computer vision is mainly programmed to follow certain colors and shapes with a degree of freedom. For example, the computer can be trained to follow green objects of a certain shade and cover them completely with blue ones. In this case, it simply tracks the color; if it finds a green area in the camera feed, it augments it with a blue virtual object. Computer vision can even be trained to recognize faces, as in the popular camera apps that add animated effects around the user's head or face. Markerless tracking is definitely more versatile, but it is less reliable than marker tracking. Markerless tracking is also naturally more complicated to develop, which contributes to the popularity of marker tracking augmented reality.
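The green-to-blue example above can be sketched in a few lines. This is only an illustration of color tracking on a tiny frame of RGB pixels. The threshold values and function names are assumptions chosen for the example, and a real system would work on live camera frames with far more robust color models:

```python
def is_tracked_green(pixel, min_green=150, max_other=100):
    """Hypothetical rule for 'green of a certain shade': strong green, weak red and blue."""
    red, green, blue = pixel
    return green >= min_green and red <= max_other and blue <= max_other


def augment_green_with_blue(frame, overlay=(0, 0, 255)):
    """Return a copy of the frame with every tracked green pixel painted blue."""
    return [
        [overlay if is_tracked_green(pixel) else pixel for pixel in row]
        for row in frame
    ]


# A 2x2 "camera frame" of (red, green, blue) pixels: two gray, two green.
frame = [
    [(200, 200, 200), (20, 220, 30)],
    [(10, 180, 40), (90, 90, 90)],
]
augmented = augment_green_with_blue(frame)
```

Only the two green pixels are replaced; the gray pixels pass through untouched, which is the sense in which the virtual content is layered over the real feed rather than replacing it.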
In this book, we will use the Vuforia SDK, which uses marker tracking techniques. We will use it with the Unity 3D engine to deliver augmented reality experiences on iOS devices. In using both technologies, we will familiarize ourselves with the workflow of creating augmented reality.
As we have established, the concept of augmented reality is an old one. It is even woven into pop culture, in sci-fi movies going back as far as we can remember. As a technology, however, augmented reality did not reach the mainstream until quite recently. In the past, augmented reality was considered a niche because of the expensive setup it needed to function. Augmented reality is demanding when it comes to hardware. It needs a camera to view the world with, computational power to calculate and render the augmented content, and a way for the user to interact with the virtual content. All of this was difficult for mainstream users to attain.
Today, almost everyone is walking around with a very capable computer in their pocket, able to render graphics content with a high degree of realism. Smartphones are evolving at an unprecedented pace that makes them more and more powerful by the month. Best of all, they come with an accurate camera, fulfilling all three needs of augmented reality.
It is fair to assume that almost everyone now carries an augmented reality-capable machine with them at all times. That alone has eradicated the accessibility barrier that was present for so long. Augmented reality content can now reach millions of users and offer them an unprecedented immersive experience.
A lot of companies understood the importance of this trend in the industry and its potential. Perhaps in the lead is Qualcomm, the biggest mobile chip manufacturer in the world. Qualcomm realized the huge potential of AR on mobile phones and developed the free Vuforia SDK. Vuforia, known as QCAR in the past, was created to enable developers to tap that potential in the mobile space. Vuforia started out on Android and later expanded to include iOS devices as well. Qualcomm also includes subtle optimizations for AR on their chips to further improve the experience, which shows how much they believe in the future of the technology. Qualcomm even invested in making a more mobile-friendly computer vision SDK in the spirit of OpenCV, called FastCV. FastCV is a tool for image processing and computer vision that can further enhance the AR experience, along with other uses that involve computer vision.
Google is also heavily invested in the concept of augmented reality with their Google Glass project. Google Glass is perhaps the most ambitious augmented reality project under development right now. It promises wearable computers for the mainstream in the form of a head-mounted display equipped with a camera. The design is meant to be unobtrusive, but at the same time efficient at displaying augmented reality data based on input from the real world. Interaction will be in the form of voice commands, and the device will be able to access the Internet. The project is still in its infancy, but the fact that Google is investing so many resources clearly indicates the importance of the rising AR technology.
With the accessibility of augmented reality hardware, the support of major corporations, and the huge market available, augmented reality has everything it needs to thrive and stay for a long time. This is why it is important to familiarize ourselves with the concept and its potential.
Immersion is the degree to which the user is engrossed in the world you present to them. The more believable the world, the more immersed the user will be, and the more successfully the experience will convey its message. A successful developer will try to achieve the highest level of immersion possible.
The human mind will always try to make sense of what it is seeing; that is true for all human interactions. This fact is particularly interesting for virtual interaction, because what the human mind is trying to make sense of is not physically there. The more elaborate the lie, the more easily the mind will believe it. So the art of immersion is the art of telling the perfect lie to the mind. And as all good liars will say, if they were honest for a moment, the way to tell the perfect lie is to mix it with the truth. By that definition, augmented reality is the perfect way of telling a lie.
By mixing virtual content with the real world, the user feels connected to the content presented in a way most other virtual media fall short of. Watching a user interact with augmented reality content for the first time is always wonderful. Often, we can see the user forget for a moment that they are watching virtual content through the screen of their device and try to grab it with their hands, as if to check that it is really not there. It happens almost every time, and certainly subconsciously. This is indicative of how immersed the user is in the action.
What adds to the immersion as well is the way the user can interact with the augmented reality content. The user can view the content from almost all angles. They can walk around it, come close to it, and walk away from it. The fact that it stays consistent with the world around them maintains the connection between the user and the content. If the experience is mixed with the right audio and/or video content, it can be something that brings a smile to the user's face.
Interactivity can even come in the form of a game structure that allows the user to directly affect the content being displayed. Interactivity of this kind can be very entertaining for the user and a fresh way of playing a game.
Vuforia is a great offering from Qualcomm that gave the augmented reality industry a great boost. It has one of the fastest tracking algorithms on the market, one that copes well with partial occlusion of the trackable and even with low-light conditions. This makes apps created with the SDK easy for users to work with. Best of all, the Vuforia SDK is offered for free, which has made it widely used, with an active community on the forums tackling most issues that might arise.
The SDK is also particularly friendly to developers new to the concept. It is easy to learn, with a smooth workflow that just makes sense. Using this SDK allows developers to deploy simple AR apps in very little time, while still allowing them to develop robust and complex AR experiences.
Vuforia offers easy-to-use components that together perform the augmented reality role. For example, the SDK offers the ARCamera component. The ARCamera component automatically takes the video camera feed from the device and displays it for the user. It also detects the trackables that the developer specified for the camera. The ARCamera responds to the orientation of the user in relation to the trackable, mostly without much intervention from the developer. This greatly simplifies the process of creating an augmented reality experience.
Vuforia also offers a number of tracking solutions that cover a number of situations. The list of components offered in the SDK is as follows:
Image Target: This is the most common form of trackable offered by Vuforia. Using this component, the app can detect any suitable image it has been trained to detect and show the AR content layered on top of it. By simply adding the content to this component and setting which image it needs to track, the AR content will appear relative to the trackable image in the real world. The following image shows Image Target with a 3D object rendered:
Frame Marker: This is a square marker with a code embedded around its internal edges. Vuforia offers 100 coded Frame Markers that the app can detect by their coded number and display AR content on top of. Frame Markers can be smaller than Image Targets, and we can add any sort of image inside their borders without having to worry about how well they can be tracked. They are suitable for game pieces or playing cards. With a minimal performance hit, many of them can be tracked simultaneously. The following image shows a Frame Marker image:
Multi-Targets: Multi-Targets allow developers to track a simple physical box from any angle. The box must have suitably detailed images on it and must be of a simple shape. Multi-Targets can even allow the physical object to occlude AR content. This means that if a 3D object rotates around the box being tracked, the app can be developed so that the 3D object is occluded when it passes behind the box.
Virtual Button: Virtual Button is an interesting technology that can add to the whole AR experience. What this component does is allow the user to touch a physical part of the trackable image, and the app will respond to it. There can be more than one Virtual Button on the Image Target and all can be assigned different events. The following screenshot shows a Virtual Button affecting the color of a rendered object:
With the array of options Vuforia provides, a complete and rich AR experience can be achieved on the powerful smartphones most have with them right now.
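To make the Virtual Button idea from the list above more concrete, the following sketch models buttons as rectangular regions of a trackable image, each assigned its own event. This is not Vuforia's API. The class, the function names, and the normalized 0 to 1 image coordinates are all assumptions made for illustration:

```python
class VirtualButton:
    """A named rectangle on the target image plus the event it triggers (hypothetical)."""

    def __init__(self, name, x_min, y_min, x_max, y_max, on_pressed):
        self.name = name
        self.bounds = (x_min, y_min, x_max, y_max)
        self.on_pressed = on_pressed

    def contains(self, x, y):
        x_min, y_min, x_max, y_max = self.bounds
        return x_min <= x < x_max and y_min <= y < y_max


def dispatch_touch(buttons, x, y):
    """Fire the event of every button covering the touched point; return their names."""
    pressed = []
    for button in buttons:
        if button.contains(x, y):
            button.on_pressed()
            pressed.append(button.name)
    return pressed


# Stand-in for the rendered AR object whose color the buttons change.
rendered_object = {"color": "white"}


def set_color(color):
    def handler():
        rendered_object["color"] = color
    return handler


# Two buttons: the left half of the image turns the object red, the right half blue.
buttons = [
    VirtualButton("red", 0.0, 0.0, 0.5, 1.0, set_color("red")),
    VirtualButton("blue", 0.5, 0.0, 1.0, 1.0, set_color("blue")),
]
pressed = dispatch_touch(buttons, 0.25, 0.5)
```

Touching the left half of the image fires only the "red" button and changes the stand-in object's color accordingly. Keeping each button's event in its own handler is what lets several buttons on one Image Target trigger different behavior.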
In this book, we will be focusing on the most versatile and popular tracking technology, the Image Target. Using Image Targets, a natural experience can be delivered to the user because of how relevant the trackable image can be. For example, the trackable image can be an advertisement with information on it that, when looked at through the AR app, also displays video playback layered on the image, as if the image came to life.
The tracking data of Image Targets is stored in entities called datasets. A dataset stores data about each image, such as its edges and contrasting areas, and the ARCamera keeps processing the live video feed, looking for areas that match any of the images inside the dataset. When a match occurs, the trackable is considered found in the real world and the AR content is layered on. The app can have more than one dataset active simultaneously, and each dataset can have up to 100 images. That is a lot of data for the app to process in real time, which shows how powerful Vuforia can be.
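The kind of image data mentioned above, edges and contrasting areas, can be illustrated with a toy measurement on a small grayscale image. This is a rough sketch only and does not reflect how Vuforia actually extracts or stores its tracking data. The function names and the threshold are assumptions:

```python
def contrast(image):
    """Spread between the darkest and brightest pixel, scaled to the range 0..1."""
    pixels = [pixel for row in image for pixel in row]
    return (max(pixels) - min(pixels)) / 255


def edge_density(image, threshold=64):
    """Fraction of neighboring pixel pairs whose brightness differs sharply."""
    pairs = 0
    edges = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            if x + 1 < len(row):  # compare with the right-hand neighbor
                pairs += 1
                if abs(pixel - row[x + 1]) > threshold:
                    edges += 1
            if y + 1 < len(image):  # compare with the neighbor below
                pairs += 1
                if abs(pixel - image[y + 1][x]) > threshold:
                    edges += 1
    return edges / pairs if pairs else 0.0


flat_gray = [[128, 128], [128, 128]]   # no detail at all
checkerboard = [[0, 255], [255, 0]]    # maximum contrast, edges everywhere
```

The flat gray image yields zero for both measures, while the checkerboard yields the maximum for both. An image with little contrast and few edges gives a matcher very little to latch onto, whereas a detailed, high-contrast image gives it much more.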
The Image Target creation process is also a simple one using Vuforia Target Manager, which can create datasets from images, and even assign a score of how well that particular image can be tracked. The trackability of the image depends on many factors, mainly high contrast and well-defined edges. The following image shows the Vuforia Target Manager website:
Vuforia also offers a number of solutions for Image Target behavior. One of the services it offers is cloud-based recognition. The Cloud Recognition service provided by Qualcomm enables apps to have over one million Image Targets at the same time. It also makes it easier to manage a large number of targets. This service is well suited for large deployments of targets that are subject to change, such as retail stores creating an AR shopping experience. The service is free but limited to 1,000 total images for non-business use; for business use it is paid but unlimited.
Vuforia also allows the user to create a user-defined Image Target at runtime from a camera shot. This is a versatile tool that doesn't tie the user to a specific target image, which might not be available every time the app is needed. The following image shows the User-defined Target sample app from Vuforia:
There are many ways we can use the great tools provided in the SDK. We will try to cover the basics that will allow the creation of a well-defined AR experience that will resonate with the user.
Unity is a cross-platform game engine developed by Unity Technologies. The game engine has a built-in IDE and the ability to deploy to numerous platforms. More than one million developers are using Unity, making it the most popular game engine in the industry to date. It is designed for ease of use and high productivity. Its relatively gentle learning curve and the availability of a free version have encouraged some schools to teach Unity as an introduction to game development.
Unity's greatest strength is its ability to deploy to a large number of platforms with ease and with few changes to the project's structure. Unity's name comes from that particular strength. Unity can deploy to Windows, OS X, iOS, Android, Web plugin, Flash, Xbox 360, PlayStation 3, and Wii U. That kind of reach opens up a lot of opportunities when developing with the Unity engine.
The preceding screenshot can be intimidating for the uninitiated, but through this book we will go over a lot of the basics of the Unity engine. We will go over how to create a new project and the process needed to deploy to iOS devices. We will also cover some game development techniques by making a simple AR game of our own, and establish how the user can interact with the AR content.
Though Vuforia offers an OpenGL SDK that we can use to create AR apps natively without having to use Unity, Unity offers a lot of tools that would simply take too long to create using OpenGL. Unity is a game engine whose tools can make 3D content look incredibly good and realistic. Some of the best-looking iPhone and iPad games on the App Store are created using the Unity engine. Two of these are Dead Trigger and Shadowgun, both incredibly good-looking games on the platform.
Unity also greatly simplifies game logic with the robust structure it offers. It offers a window into exactly how the 3D graphics will look, and even how interactions will look. Unity with Vuforia can use a webcam to detect trackables and show exactly how the AR content will appear on the trackable, without having to deploy to the device first. That saves a lot of time that would otherwise be wasted deploying to devices only to find out that the 3D content doesn't look or behave correctly on the trackable.
Unity recently opened up their license options to allow anyone to deploy to iOS and Android for free. We do not need to buy a license to be able to deploy simple Vuforia apps. Although Unity Pro does offer many strong features, they are not necessary for the course of our book.
In this chapter, we were introduced to the meaning and possibilities of augmented reality, a very exciting field. We were introduced to the many forms of augmented reality and how it has found its way into the hands of users in the form of smartphones. We now also understand how powerful AR is at delivering immersive experiences for users.
We were introduced to Vuforia, the free AR SDK by Qualcomm. We understand how powerful it can be in improving the flow of creating AR apps, handling the technicalities of AR and allowing us to focus on making a better experience. We know the many tracking techniques that Vuforia offers and how different an experience each can deliver. This should allow us to make better use of them in the future.
Unity was also introduced to us, and we now have a general idea of how powerful the engine is and how it can enable us as developers to forge AR experiences as creatively as we want. In the book, we will explore Unity's power further, though we will only scratch its surface. While we won't be able to go through everything that Unity offers, we will see how basic knowledge of a few components can create impressive AR apps.
In the next chapter, we will go through the process of setting up our environment to start creating AR apps. We will set up both Unity and Vuforia to better understand how they work together. We will also deploy Vuforia sample apps on a device to see what a final app looks like.