
How-To Tutorials


A really basic guide to batch file programming

Richard Gall
31 May 2018
3 min read
Batch file programming is a way of making a computer do things simply by creating, yes, you guessed it, a batch file. It's a way of doing things you might ordinarily do in the command prompt, but it automates some tasks, which means you don't have to write so much code. If it sounds straightforward, that's because it is, generally. Which is why it's worth learning...

Batch file programming is a good place to start learning how computers work

Of course, if you already know your way around batch files, I'm sure you'll agree it's a good way for someone relatively experienced in software to get to know their machine a little better. If you know someone who you think would get a lot from learning batch file programming, share this short guide with them!

Why would I write a batch script?

There are a number of reasons you might write batch scripts. They're particularly useful for resolving network issues, installing a number of programs on different machines, and even organizing files and folders on your computer. Imagine you have a recurring issue: with a batch file you can solve it quickly and easily wherever you are, without having to write copious lines of code in the command line. Or maybe your desktop simply looks like a mess; with a little knowledge of batch file programming you can clean things up without too much effort.

How to write a batch file

Clearly, batch file programming can make your life a lot easier. Let's take a look at the key steps to begin writing batch scripts.

Step 1: Open your text editor

Batch file programming is really about writing commands, so you'll need your text editor open to begin. Notepad, WordPad, it doesn't matter!

Step 2: Begin writing code

As we've already seen, batch file programming is really about writing commands for your computer. The code is essentially the same as what you would write in the command prompt. Here are a few batch file commands you might want to know to get started:

ipconfig - presents network information such as your IP and MAC address.
start "" [website] - opens a specified website in your browser.
rem - used if you want to make a comment or remark in your code (i.e. for documentation purposes).
pause - as you'd expect, pauses the script so it can be read before it continues.
echo - displays text in the command prompt.
%%a - refers to every file in a given folder.
if - a conditional command.

The list of batch file commands is pretty long. There are plenty of other resources with an exhaustive list of commands you can use, but a good place to begin is this page on Wikipedia.

Step 3: Save your batch file

Once you've written your commands in the text editor, you'll then need to save your document as a batch file. Title it, and suffix it with the .bat extension. You'll also need to make sure "Save as type" is set to "All files".

That's basically it when it comes to batch file programming. Of course, there are some complex things you can do, but once you know the basics, getting into the code is where you can start to experiment.

Read next:
Jupyter and Python scripting
Python Scripting Essentials
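To tie those commands together, here is a small example batch file. It is only a minimal sketch: the website URL and the notes.txt file name are placeholders, so swap in whatever suits your own machine.

@echo off
rem example.bat - a tiny script using the commands covered above

echo Checking network configuration...
ipconfig

rem Open a website in the default browser (placeholder URL)
start "" https://www.packtpub.com

rem List every file in the current folder
for %%a in (*) do echo Found file: %%a

rem Only run this line if a particular file exists
if exist notes.txt echo notes.txt is present

pause

Save it as example.bat, double-click it, and each command runs in order, pausing at the end so you can read the output before the window closes.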


How to call an Azure function from an ASP.NET Core MVC application

Aaron Lazar
03 May 2018
10 min read
In this tutorial, we'll learn how to call an Azure Function from an ASP.NET Core MVC application.

This article is an extract from the book C# 7 and .NET Core Blueprints, authored by Dirk Strauss and Jas Rademeyer. This book is a step-by-step guide that will teach you essential .NET Core and C# concepts with the help of real-world projects.

We will get started with creating an ASP.NET Core MVC application that will call our Azure Function to validate an email address entered into a login screen of the application. This application does no authentication at all; all it is doing is validating the email address entered. ASP.NET Core MVC authentication is a totally different topic and not the focus of this post.

In Visual Studio 2017, create a new project and select ASP.NET Core Web Application from the project templates. Click on the OK button to create the project. This is shown in the following screenshot:

On the next screen, ensure that .NET Core and ASP.NET Core 2.0 are selected from the drop-down options on the form. Select Web Application (Model-View-Controller) as the type of application to create. Don't bother with any kind of authentication or enabling Docker support. Just click on the OK button to create your project.

After your project is created, you will see the familiar project structure in the Solution Explorer of Visual Studio.

Creating the login form

For this next part, we can create a plain and simple vanilla login form. For a little bit of fun, let's spice things up a bit and have a look on the internet for some free login form templates. I decided to use a site called Colorlib that provided 50 free HTML5 and CSS3 login forms in one of their recent blog posts. The URL to the article is: https://colorlib.com/wp/html5-and-css3-login-forms/. I decided to use Login Form 1 by Colorlib from their site.

1. Download the template to your computer and extract the ZIP file. Inside the extracted ZIP file, you will see that we have several folders. Copy all the folders in this extracted ZIP file (leave the index.html file, as we will use this in a minute).

2. Next, go to the solution for your Visual Studio application. In the wwwroot folder, move or delete the contents and paste the folders from the extracted ZIP file into the wwwroot folder of your ASP.NET Core MVC application.

3. Your wwwroot folder should now look as follows:

4. Back in Visual Studio, you will see the folders when you expand the wwwroot node in the CoreMailValidation project.

5. I also want to focus your attention on the Index.cshtml and _Layout.cshtml files. We will be modifying these files next.

Open the Index.cshtml file and remove all the markup (except the section in the curly brackets) from this file. Paste the HTML markup from the index.html file from the ZIP file we extracted earlier. Do not copy all the markup from the index.html file; only copy the markup inside the <body></body> tags.
Your Index.cshtml file should now look as follows:

@{
    ViewData["Title"] = "Login Page";
}

<div class="limiter">
    <div class="container-login100">
        <div class="wrap-login100">
            <div class="login100-pic js-tilt" data-tilt>
                <img src="images/img-01.png" alt="IMG">
            </div>
            <form class="login100-form validate-form">
                <span class="login100-form-title">
                    Member Login
                </span>
                <div class="wrap-input100 validate-input" data-validate="Valid email is required: ex@abc.xyz">
                    <input class="input100" type="text" name="email" placeholder="Email">
                    <span class="focus-input100"></span>
                    <span class="symbol-input100">
                        <i class="fa fa-envelope" aria-hidden="true"></i>
                    </span>
                </div>
                <div class="wrap-input100 validate-input" data-validate="Password is required">
                    <input class="input100" type="password" name="pass" placeholder="Password">
                    <span class="focus-input100"></span>
                    <span class="symbol-input100">
                        <i class="fa fa-lock" aria-hidden="true"></i>
                    </span>
                </div>
                <div class="container-login100-form-btn">
                    <button class="login100-form-btn">
                        Login
                    </button>
                </div>
                <div class="text-center p-t-12">
                    <span class="txt1">Forgot</span>
                    <a class="txt2" href="#">Username / Password?</a>
                </div>
                <div class="text-center p-t-136">
                    <a class="txt2" href="#">
                        Create your Account
                        <i class="fa fa-long-arrow-right m-l-5" aria-hidden="true"></i>
                    </a>
                </div>
            </form>
        </div>
    </div>
</div>

The code for this chapter is available on GitHub.

Next, open the _Layout.cshtml file and add all the links to the folders and files we copied into the wwwroot folder earlier. Use the index.html file for reference. You will notice that the _Layout.cshtml file contains the following piece of code: @RenderBody(). This is a placeholder that specifies where the Index.cshtml file content should be injected. If you are coming from ASP.NET Web Forms, think of the _Layout.cshtml page as a master page. Your _Layout.cshtml markup should look as follows:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>@ViewData["Title"] - CoreMailValidation</title>
    <link rel="icon" type="image/png" href="~/images/icons/favicon.ico" />
    <link rel="stylesheet" type="text/css" href="~/vendor/bootstrap/css/bootstrap.min.css">
    <link rel="stylesheet" type="text/css" href="~/fonts/font-awesome-4.7.0/css/font-awesome.min.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/animate/animate.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/css-hamburgers/hamburgers.min.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/select2/select2.min.css">
    <link rel="stylesheet" type="text/css" href="~/css/util.css">
    <link rel="stylesheet" type="text/css" href="~/css/main.css">
</head>
<body>
    <div class="container body-content">
        @RenderBody()
        <hr />
        <footer>
            <p>© 2018 - CoreMailValidation</p>
        </footer>
    </div>
    <script src="~/vendor/jquery/jquery-3.2.1.min.js"></script>
    <script src="~/vendor/bootstrap/js/popper.js"></script>
    <script src="~/vendor/bootstrap/js/bootstrap.min.js"></script>
    <script src="~/vendor/select2/select2.min.js"></script>
    <script src="~/vendor/tilt/tilt.jquery.min.js"></script>
    <script>
        $('.js-tilt').tilt({
            scale: 1.1
        })
    </script>
    <script src="~/js/main.js"></script>
    @RenderSection("Scripts", required: false)
</body>
</html>

If everything worked out right, you will see the following page when you run your ASP.NET Core MVC application. The login form is obviously totally non-functional; however, it is totally responsive.
If you reduce the size of your browser window, you will see the form scale as the browser size reduces. This is what you want. If you want to explore the responsive design offered by Bootstrap, head on over to https://getbootstrap.com/ and go through the examples in the documentation.

The next thing we want to do is hook this login form up to our controller and call the Azure Function we created to validate the email address we entered. Let's look at doing that next.

Hooking it all up

To simplify things, we will be creating a model to pass to our controller:

1. Create a new class in the Models folder of your application called LoginModel and click on the Add button.

2. Your project should now look as follows. You will see the model added to the Models folder:

The next thing we want to do is add some code to our model to represent the fields on our login form. Add two properties called Email and Password:

namespace CoreMailValidation.Models
{
    public class LoginModel
    {
        public string Email { get; set; }
        public string Password { get; set; }
    }
}

Back in the Index.cshtml view, add the model declaration to the top of the page. This makes the model available for use in our view. Take care to specify the correct namespace where the model exists:

@model CoreMailValidation.Models.LoginModel
@{
    ViewData["Title"] = "Login Page";
}

The next portion of code needs to be written in the HomeController.cs file. Currently, it should only have an action called Index():

public IActionResult Index()
{
    return View();
}

Add a new async function called ValidateEmail that will use the base URL and parameter string of the Azure Function URL we copied earlier and call it using an HTTP request. I will not go into much detail here, as I believe the code to be pretty straightforward. All we are doing is calling the Azure Function using the URL we copied earlier and reading the return data:

private async Task<string> ValidateEmail(string emailToValidate)
{
    string azureBaseUrl = "https://core-mail-validation.azurewebsites.net/api/HttpTriggerCSharp1";
    string urlQueryStringParams = $"?code=/IS4OJ3T46quiRzUJTxaGFenTeIVXyyOdtBFGasW9dUZ0snmoQfWoQ==&email={emailToValidate}";

    using (HttpClient client = new HttpClient())
    {
        using (HttpResponseMessage res = await client.GetAsync($"{azureBaseUrl}{urlQueryStringParams}"))
        {
            using (HttpContent content = res.Content)
            {
                string data = await content.ReadAsStringAsync();
                if (data != null)
                {
                    return data;
                }
                else
                    return "";
            }
        }
    }
}

Create another public async action called ValidateLogin. Inside the action, check to see if the ModelState is valid before continuing. For a nice explanation of what ModelState is, have a look at the following article: https://www.exceptionnotfound.net/asp-net-mvc-demystified-modelstate/. We then do an await on the ValidateEmail function, and if the return data contains the word false, we know that the email validation failed. A failure message is then passed to the TempData property on the controller.

The TempData property is a place to store data until it is read. It is exposed on the controller by ASP.NET Core MVC. The TempData property uses a cookie-based provider by default in ASP.NET Core 2.0 to store the data. To examine data inside the TempData property without deleting it, you can use the Keep and Peek methods. To read more on TempData, see the Microsoft documentation here: https://docs.microsoft.com/en-us/aspnet/core/fundamentals/app-state?tabs=aspnetcore2x.
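The book shows the full ValidateLogin implementation; as a rough sketch based only on the description above (the redirect targets and the TempData key are assumptions for illustration, not taken from the excerpt), the action inside HomeController could look something like this:

public async Task<IActionResult> ValidateLogin(LoginModel model)
{
    if (ModelState.IsValid)
    {
        // Call the Azure Function via the ValidateEmail helper we wrote above
        string azureReturn = await ValidateEmail(model.Email);

        if (azureReturn.Contains("false"))
        {
            // Validation failed - pass a message back via TempData
            TempData["message"] = "The email address entered is invalid.";
            return RedirectToAction("Index", "Home");
        }

        // The email address is valid; a real application would authenticate here
        return RedirectToAction("Index", "Home");
    }

    return View("Index", model);
}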
If the email validation passed, then we know that the email address is valid and we can do something else. Here, we are simply saying that the user is logged in; in reality, you would perform some sort of authentication here and then route to the correct controller.

So now you know how to call an Azure Function from an ASP.NET Core MVC application. If you found this tutorial helpful and you'd like to learn more, go ahead and pick up the book C# 7 and .NET Core Blueprints.

Read next:
What is ASP.NET Core?
Why ASP.NET makes building apps for mobile and web easy
How to dockerize an ASP.NET Core application


Google employees join hands with Amnesty International urging Google to drop Project Dragonfly

Sugandha Lahoti
28 Nov 2018
3 min read
Yesterday, Google employees signed a petition protesting Google's infamous Project Dragonfly. "We are Google employees and we join Amnesty International in calling on Google to cancel project Dragonfly", they wrote in a post on Medium. The petition also marks the first time that over 300 Google employees (at the time of writing this post) have used their actual names in a public document.

Project Dragonfly is the secretive search engine that Google is allegedly developing, which will comply with Chinese censorship rules. It has been on the receiving end of constant backlash from various human rights organizations and investigative reporters since it was revealed earlier this year. On Monday, it also faced criticism from the human rights organization Amnesty International. Amnesty launched a petition opposing the project and coordinated protests outside Google offices around the world, including San Francisco, Berlin, Toronto, and London.

https://twitter.com/amnesty/status/1067488964167327744

Yesterday, Google employees joined Amnesty and wrote an open letter to the firm: "We are protesting against Google's effort to create a censored search engine for the Chinese market that enables state surveillance. Our opposition to Dragonfly is not about China: we object to technologies that aid the powerful in oppressing the vulnerable, wherever they may be. Dragonfly in China would establish a dangerous precedent at a volatile political moment, one that would make it harder for Google to deny other countries similar concessions. Dragonfly would also enable censorship and government-directed disinformation, and destabilize the ground truth on which popular deliberation and dissent rely."

Employees have expressed their disdain over Google's decision, calling it a money-minting business. They have also highlighted Google's previous disappointments, including Project Maven, Dragonfly, and Google's support for abusers, and believe that "Google is no longer willing to place its values above its profits. This is why we're taking a stand."

A Google spokesperson redirected to the company's previous response on the topic: "We've been investing for many years to help Chinese users, from developing Android, through mobile apps such as Google Translate and Files Go, and our developer tools. But our work on search has been exploratory, and we are not close to launching a search product in China."

Twitterati have openly sided with Google employees in this matter.

https://twitter.com/Davidramli/status/1067582476262957057
https://twitter.com/shabirgilkar/status/1067642235724972032
https://twitter.com/nrambeck/status/1067517570276868097
https://twitter.com/kuminaidoo/status/1067468708291985408

Read next:
OK Google, why are you ok with mut(at)ing your ethos for Project DragonFly?
Amnesty International takes on Google over Chinese censored search engine, Project Dragonfly
Google's prototype Chinese search engine 'Dragonfly' reportedly links searches to phone numbers


Django 3.0 is going async!

Bhagyashree R
23 Jul 2019
4 min read
Last year, Andrew Godwin, a Django contributor, formulated a roadmap to bring async functionality into Django. After a lot of discussion and amendments, the Django Technical Board approved his DEP 0009: Async-capable Django yesterday. Godwin wrote in a Google group, "After a long and involved vote, I can announce that the Technical Board has voted in favour of DEP 0009 (Async Django), and so the DEP has been moved to the 'accepted' state."

The reason why Godwin thinks this is the right time to bring async-native support to Django is that, starting from version 2.1, it supports Python 3.5 and up. These Python versions have async def and similar native support for coroutines. Also, the web is now slowly shifting to use cases that prefer high-concurrency workloads and large parallelizable queries.

The motivation behind Async in Django

The Django Enhancement Proposal (DEP) 0009 aims to address one of the core flaws in Python: inefficient threading. Python is not considered to be a perfect asynchronous language. Its asyncio library for writing concurrent code suffers from some core design flaws. There are alternative async frameworks for Python, but they are incompatible with one another. Django Channels brought some async support to Django, but it primarily focuses on WebSocket handling.

Explaining the motivation, the DEP says, "At the same time, it's important we have a plan that delivers our users immediate benefits, rather than attempting to write a whole new Django-size framework that is natively asynchronous from the start."

Additionally, most developers are unacquainted with developing Python applications that have async support. There is also a lack of proper documentation, tutorials, and tooling to help them. Godwin believes that Django can become a "good catalyst" to help create guidance documentation.

Goals this DEP outlines to achieve

The DEP proposes to bring support for asynchronous Python into Django while maintaining synchronous Python support as well, in a backward-compatible way. Here are its end goals, as Godwin listed in his roadmap:

Making the blocking parts of Django, such as sessions, auth, the ORM, and handlers, natively asynchronous, with a synchronous wrapper exposed on top where needed to ensure backward compatibility.
Keeping the familiar models/views/templates/middleware layout intact, with very few changes.
Ensuring that these updates do not compromise speed or cause significant performance regressions at any stage of this plan.
Enabling developers to write fully-async websites if they want to, but not enforcing this as the default way of writing websites.
Welcoming new talent into the Django team to help out on large-scale features.

Timeline to achieve these goals

Godwin, in his "A Django Async Roadmap", shared the following timeline (Django version, followed by the planned updates):

2.1 - Current in-progress release. No async work.
2.2 - Initial work to add async ORM and view capability, but everything defaults to sync, and async support is mostly threadpool-based.
3.0 - Rewrite the internal request handling stack to be entirely asynchronous; add async middleware, forms, caching, sessions, and auth. Start the deprecation process for any APIs that are becoming async-only.
3.1 - Continue improving async support; potential async templating changes.
3.2 - Finish the deprecation process and have a mostly-async Django.

Godwin posted a summary of the discussion he had with the Django Technical Board in the Google Group.
Some of the queries they raised were how the team plans to distinguish async versions of functions/methods from sync ones, how this implementation will ensure that there is no performance hit if the user opts out of async mode, and more. In addition to these technical queries, the board also raised a non-technical concern: "The Django project has lost many contributors over the years, is essentially in a maintenance mode, and we likely do not have the people to staff a project like this."

Godwin sees a massive opportunity lurking in this fundamental challenge: namely, to revive the Django project. He adds, "I agree with the observation that things have substantially slowed down, but I personally believe that a project like async is exactly what Django needs to get going again. There's now a large amount of fertile ground to change and update things that aren't just fixing five-year-old bugs."

Read the DEP 0009: Async-capable Django to know more in detail. A short illustrative sketch of what async view code could eventually look like under this plan follows after the links below.

Read next:
Which Python framework is best for building RESTful APIs? Django or Flask?
Django 2.2 is now out with classes for custom database constraints
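As flagged above, here is a rough sketch of the kind of view code this roadmap points toward. It is illustrative only: when this article was written Django's async APIs were still being designed (async views shipped in a later release), so the view below leans only on Python's native async def syntax and asyncio, and the endpoint and service names are invented for the example.

import asyncio

from django.http import JsonResponse


async def health_check(request):
    # Simulate two slow, independent I/O calls (e.g. pinging backing services)
    async def ping(service, delay):
        await asyncio.sleep(delay)
        return {"service": service, "status": "ok"}

    # Run both checks concurrently instead of one after the other
    results = await asyncio.gather(ping("database", 0.1), ping("cache", 0.1))
    return JsonResponse({"services": results})

The interesting part is the last two lines: because the view itself is a coroutine, independent pieces of I/O can be awaited concurrently, which is exactly the "high concurrency workloads" use case the DEP calls out.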


Techniques and Practices of Game AI

Packt
14 Jan 2016
10 min read
In this article by Peter L Newton, author of the book Learning Unreal AI Programming, we will understand the fundamental techniques and practices of game AI. This will be the building block to developing an amazing and interesting game AI. (For more resources related to this topic, see here.) Navigation While all the following components aren't necessary to achieve AI navigation, they all contribute critical feedback that can affect navigation. Navigating within a world is limited only by the pathways within the game. Navigation for AI is built up of the following things: Path following (path nodes): Another solution similar to NavMesh, path nodes can designate the space in which the AI traverses. Navigation mesh: Using tools such as Navigation Mesh, also known as NavMesh, you can designate areas in which the AI can traverse. NavMesh generates a plot of grids that is used to calculate the path and cost during navigation. It's important to know that this is only one of several pathfinding techniques available; we use it because it works well in this demonstration. Behavior trees: Using behavior trees to influence your AI's next destination can create a more interesting player experience. It not only calculates its requested destination, but also decides whether it should enter the screen with a cartwheel double backflip, no hands or try the triple somersault to jazz hands. Steering behaviors: Steering behaviors affect the way the AI moves while navigating to avoid obstacles. This also means using steering to create formations with your fleets that you have set to attack the king's wall. Steering can be used in many ways to influence the movement of the character. Sensory systems: Sensory systems can provide critical details, such as players nearby, sound levels, cover nearby, and many other variables of the environment that can alter movement. It's critical that your AI understands the changing environment so that it doesn't break the illusion of being a real opponent. Achieving realistic movement with steering When you think of what steering does for a car, you would be right to imagine that the same idea is applied to game AI navigation. Steering influences the movement of AI elements as they traverse to their next destination. The influences can be supplied as necessary, but we will go over the most commonly used ones. Avoidance is used essentially to avoid colliding with oncoming AI. Flocking is another key factor in steering; you commonly see an example of it while watching a school of fish. This phenomenon, known as flocking, is useful in simulating interesting group movement; simulate a complete panic or a school of fish. The goal of steering behaviors is to achieve realistic movement behavior within the player's world. Creating character with randomness and probability AI with character is what randomness and probability adds to the bots decision making. If a bot attacked you the same way, always entered the scene the same way, and annoyed you with its laugh after every successful hit, it wouldn't make for a unique experience—the AI always does the same thing. By using randomness and probability, you can instead make the AI laugh based on probability or introduce randomness to the AI's skill of choice. Another great by-product of applying randomness and probability is that it allows you to introduce levels of difficulty. You can lower the chance of missing the skill cast or even allow the bots to aim more precisely. 
If you have bots who wander around looking for enemies, their next destination can be randomly chosen. Creating complex decision making with behavior trees Finite State Machines (FSM) allow your bot to perform transitions between states. This allows it to go from wandering to hunting and then to killing. Behavior trees are similar but allow more flexibility. Behavior trees allow hierarchical FSM, which introduces another layer of decisions. So, the bot decides between branches of behaviors that define the state it is in. There is a tool provided by UE4 called Behavior Tree. Its editor tool allows you to modify AI behavior quickly and with ease. The following sections show the components found within UE4's Behavior Tree. Root This node is the starting node that sends the signal to the next node in the tree. This would connect to a composite that begins your first tree. What you may notice is that you are required to use a composite first to define a tree and then create the task for that tree. This is because a hierarchical FSM creates branches of states. These states will be populated with other states or tasks. This allows easy transitions between multiple states. Decorators This node creates another task, which you can add on top of the node as a "decoration". This could be, for example, a Force Success decorator when using a sequence composite or using a loop to have a node's actions repeated a number of times. I used a decorator in the AI we will make that tells it to update to the next available route. Consider the following screenshot: In the preceding screenshot, you see the Attack & Destroy decorator at the top of the composite, which defines the state. This state includes two tasks, Attack Enemy and Move To Enemy, the latter of which also has a decorator telling it to execute only when the bot state is searching. Composites These are the starting points of the states. They define how the state will behave with returns and execution flow. There is a Selector in our example that will execute each of its children from left to right and doesn't fail but returns success when one of its children returns success. Therefore, this is good for a state that doesn't check for successfully executed nodes. The Sequence executes its children in a similar fashion to the Selector, but returns a fail message when one of its children returns fail. This means that it's required that the nodes return a success message to complete the sequence. Last but not least is Simple Parallel. This allows you to execute a task and a tree at essentially the same time. This is great for creating a state that will require another task to always be called. So, to set it up, you first need to connect it to a task that it will execute. The second task or state that is connected continues to be called with the first task until the first task returns a success message. Services Services run as long as the composite that it is added to stays activated. They tick on the intervals that you set within the properties. They have another float property that allows you to create deviations in the tick intervals. Services are used to modify the state of the AI in most cases, because it's always called. For example, in the bot that we will create, we add a service to the first branch of the tree so that it's called without interruption, thus being able to maintain the state that the bot should be in at any given movement. 
This service, called Detect Enemy, actually runs a deviating cycle that updates Blackboard variables, such as State and EnemyActor: Tasks Tasks do the dirty work and report with a success or failed message if necessary. They have two nodes, which you'll use most often when working with a task: Event Receive Execute, which receives the signal to execute the connected scripts, and Finish Execute, which sends the signal back, returning a true or false message on success. This is important when making a task meant for the Sequence composite. Blackboards Blackboards are used to store variables within the behavior tree of the AI. In our example, we store an enumeration variable, State, to store the state, TargetPoint to hold the currently targeted enemy, and Route, which stores the current route position the AI has been requested to travel to, just to name a few. Blackboards work just by setting a public variable of a node to one of the available Blackboard variables in the drop-down menu. The naming convention shown in the following screenshot makes this process streamlined: Sensory system Creating a sensory system is heavily based on the environment where the AI will be fighting the player. It will need to be able to find cover, evade the enemy, get ammo, and other features that you feel will create an immersive AI for your game. Games with AI that challenges the player create a unique individual experience. A good sensory system contributes critical information, which makes for reactive AI. In this project, we use the sensory system to detect pawns that the AI can see. We also use functions to check for the line of sight of the enemy. We check whether there is another pawn in our path. We can check for cover and other resources within the area. Machine learning Machine learning is a branch of its own. This technique allows AI to learn from situations and simulations. The inputs are from the environment, including the context in which the bot allows it to make decisive actions. In machine learning, the inputs are put within a classifier, which can predict a set of outputs with a certain level of certainty. Classifiers can be combined into ensembles to increase the accuracy of the probabilistic prediction. We don't dig heavily into this subject, but I will provide some material for those interested. Tracing Tracing allows another actor within the world to detect objects by ray tracing. A single line trace is sent out, and if it collides with an actor, the actor is returned, including the information about the impact. Tracing is used for many reasons. One way it is used in FPS games is to detect hits. Are you familiar with the hit box? When your player shoots in a game, a trace is shot out that collides with the opponent's hit box, determining the damage to your opponent and, if you're skillful enough, resulting in their death. There are other shapes available for traces, such as spheres, capsules, and boxes, which allow tracing for different situations. Recently, I used the box trace for my car in order to detect objects near it. Influence mapping Influence mapping isn't a finite approach; it's the idea that specific locations on the map would contribute information that directly influences the player or AI. An example when using influence mapping with AI is presence falloff. Say we have enemy AI in a group. Their presence map would create a radial circle around the group with an intensity based on the size of the group. 
This way, other AI elements know that on entering this area, they're entering a zone occupied by enemy AI. Practical information isn't the only thing people use this for, so just understand that it's meant to provide another level of input to help your bot make additional decisions.

Summary

In this article, we saw the fundamental techniques and practices of game AI. We saw how to implement navigation, achieve realistic movement of AI elements, and create characters with randomness in order to achieve a sense of realism. We also looked at behavior trees and all their constituent elements. Further, we touched upon some aspects related to AI, such as machine learning and tracing.

Further resources on this subject:
Overview of Unreal Engine 4
The Unreal Engine
Creating weapons for your game using UnrealScript


Implement an API Design-first approach for building APIs [Tutorial]

Packt Editorial Staff
15 Jun 2018
9 min read
The Monster Records & Associates (MRA) –a fictional music records company, having realised that its biggest asset is in fact its data, embarked on a digital transformation with the aim to offer its product and offerings completely online and via APIs. This article is an excerpt taken from the book Implementing Oracle API Platform Cloud Service, written  by Andy Bell, Sander Rensen, Luis Weir, Phil Wilkins. In this post we are is going to take  you through an interesting MRA case study who adopted an API design-first approach for building its APIs. We will go through the process and steps performed by MRA for this implementation. The Problem Scenario MRA had embarked on a digital transformation journey with the objective to become a digital organisation capable of offering tailored (à la carte) offerings to artists such as handling of an artist’s online presence to on-demand distribution of an artist's digital media to Music Streaming Services such as Spotify, Apple Music, Google Play Music, Amazon Prime Music, Pandora, Deezer to name a few. Having fully acknowledged that their most valuable asset is in fact their media data, MRA wanted to materialise in such assets and determined that the quickest and most effective way to achieve this was by exposing a public API capable of providing access, on-demand, to MRA's Media Catalogue assets such as artists, songs and albums. Figure 1: MRA's Media Catalogue API The idea being, once such assets became accessible via an API, streaming services could, on-demand and 24x7, explore MRA's repertoire, purchase rights-to-use and start streaming. In addition, the API could also open the door to a brand new global audience: millions of app developers constantly innovating. If only a fraction of such a huge audience leveraged MRA's Media Catalogue API, it would still represent a considerable success for MRA. However, as with everything, there is a challenge to realise such vision. MRA like many other organizations, had a level of experience with systems integrations and Service Oriented Architectures (SOA). One of the lessons learnt from SOA however was that the cycles for designing, building, prototyping, and testing SOAP-based Web Services could be quite lengthy and expensive. An API differentiates from a service in that the former represents the RESTful interface a consumer application interacts with, whereas the latter is the actual implementation (the code) behind an API. A HTTP endpoint exposed by a service is defined as an unmanaged API. When a service endpoint is accessed via an API Gateway where policies such as app-key validation, authentication/authorization and other policies are enforced, then it becomes a managed API. The book, Implementing Oracle API Platform Cloud Service, refers to managed APIs as simply APIs and unmanaged APIs as simply service endpoints. Especially when it came to capturing and accommodating the feedback from Client Application Developers (API consumers), MRA had very bad experiences as in the majority of occasions they came to realize very late in the software lifecycle that the Web Service developed did not meet the expectations of its consumers. Figure 2: feedback-loops in traditional web service design Refactoring web services in this approach wasn't just time consuming but also an expensive exercise as both the Service design (WSDL) and code had to be refactored and re-tested in order to accommodate the feedback received and before application developers could try a service again. 
Naturally service designers and developers avoided as much as possible making changes, thus challenging feedback received from application developers, which in turn created friction amongst both teams but in some occasions meant application developers finding alternative routes to solve their needs rather than using the web service. This was the worst possible scenario as it meant that the investments made in implementing a web service could've been wasted. API design-first process Learning from experiences and acknowledging the challenges that such waterfall like process imposed to a digital transformation initiative, MRA were quite keen to adopt a more agile, interactive but also quicker way to deliver modern RESTful based APIs. The idea was clear. By engaging application developers (API consumers) in the initial stages of the design process, feedback would be captured and reflected back in the interface design (API) early as well. Not only this would shorten feedback loops, but ensure that once the underlying services are implemented, it would expose an interface already endorsed and tested by its consumers, as opposed to risk building a service that won't satisfy the client expectations and needs late in the process. Figure 3: API design-first approach vs traditional service design The implication of this approach though, is that the tooling and notation to define the API, had to be both simple, yet rich in capability such as the task of designing and mocking API endpoints is quick and easy, given that if the process becomes cumbersome it would defeat its purpose. We elaborate on the different steps undertaken by MRA when designing its Media Catalogue API using Apiary and related tools in the book, Implementing Oracle API Platform Cloud Service. Here are the steps: Defining the API type Defining the API’s domain semantics Creating the API definition with its main resources Trying the API mock Defining MSON Data Structures Pushing the API Blueprint to GitHub Publishing the API mock in Oracle API Platform CS Setting up Dredd for continuous testing of API endpoints against the API blueprint Defining the API type: A fundamental step when designing any API is to first define what the type is. This is important as it will determine the guiding principles to consider when doing the design. We have three types of APIs: Single-Purpose APIs: These are APIs that serve a unique and specific purpose, typically derived from an unambiguous need associated with a user journey or use case. Multi-Purpose APIs: These APIs are more generic in nature and are meant to satisfy not just one but multiple use cases and scenarios. They are not bound (coupled) to a specific user journey or system of engagement (for example, a mobile app) therefore ideal for reuse enterprise-wide. MRA’s Media Catalogue API: MRA's Media Catalogue API was specifically targeted at two main audiences: Music Streaming Services and Application Developers in general. Therefore, the API had to be both Public and Multi-Purpose. Defining the API’s domain semantics: This step elaborates proper understanding of the API's bounded context, Media Catalogue. To do so, entities, key attributes, and relationships within the bounded context itself were identified and also defined using semantics appropriate for the purpose of the API. 
Creating the API definition with its main resources: This step shows how to create an API and define its main resources, parameters, and sample payloads.It involves steps followed by MRA when creating the Media Catalogue API definition and its associated API mock. Trying the API mock: This part describes how Apiary's automatically generated API mocks can be used to satisfy one of the most important steps in API design-first: try an API early in the lifecycle, before the API is actually implemented. This is a critical step as collecting feedback from API consumers early can potentially save numerous hours in code refactoring later in the project. Defining MSON Data Structures: The Markdown Syntax for Object Notation (MSON) is a plain-text syntax for the description and validation of Data Structures in API Blueprint. It provides a way to represent objects (for example, an artist) in a human-readable plain text form. This part involves steps to define the Artist, Album and Song objects using the MSON notation. Pushing the API Blueprint to GitHub: API Blueprints can be pushed into GitHub repositories, so they can be version controlled but most importantly it can follow a similar GitHub cycle as any other code asset. This step is also required in order to configure Dredd to validate API endpoints against API blueprint definitions. Publishing the API mock in Oracle API Platform CS: Although Apiary provides an API mock URL that can be can be accessed directly, it is recommended that instead, the API mock is published and accessed via the Oracle API Platform Cloud Service. Setting up Dredd for continuous testing of API endpoints against the API blueprint: The last step of the API design-first process is to configure Dredd to continuously validate that an API endpoint exposed through the API Gateway is always compliant with its corresponding API Blueprint definition. The idea is to ensure that Client Application code is not broken once an API Policy is changed to point to a Backend Service once it has been built and deployed. We discussed the API design-first approach for building its APIs. MRA's business scenario demanded the need for more efficient and leaner process for implementing APIs. We saw how an API design-first process could effectively help organizations such as MRA gain greater speed, agility, and efficiencies. Here’s a summary of the basic steps to realize such process. Choose your API type: We introduced the conceptual concepts such as Single Purpose and Multi-Purpose APIs to decide on what type of API to adopt. Define your APIs: The need for creating an API definition and an API mock in Apiary based on API Blueprints and the Markdown Syntax for Object Notation (MSON). Create & publish API: Creation and publication of an API using the Oracle API Platform Cloud Service. Continuously test: Finally, the configuration of Dredd to verify API endpoints compliance with the API definition. You just enjoyed an excerpt from the book Implementing Oracle API Platform Cloud Services. Grab the latest edition of this book to work with the newest Oracle APIs, and interface with an increasingly complex array of services your clients want. What are the best programming languages for building APIs? Glancing at the Fintech growth story – Powered by ML, AI & APIs What RESTful APIs can do for Cloud, IoT, social media and other emerging technologies  
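As a small illustration of the "Defining MSON Data Structures" step described above, a Data Structures section for the Media Catalogue could look roughly like the fragment below. The attribute names and sample values are invented for illustration and are not taken from the book; the point is simply that MSON lets you describe the Artist, Album and Song objects in plain, human-readable text inside the API Blueprint.

# Data Structures

## Artist (object)
+ id: 101 (number, required) - Internal catalogue identifier
+ name: `Sample Artist` (string, required)
+ genre: Soul (string)
+ albums (array[Album]) - Albums by this artist in the Media Catalogue

## Album (object)
+ title: `Sample Album` (string, required)
+ releaseYear: 2005 (number)
+ songs (array[Song])

## Song (object)
+ title: `Sample Song` (string, required)
+ durationSeconds: 147 (number)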

3 programming languages some people think are dead but definitely aren’t

Richard Gall
24 Oct 2019
11 min read
Recently I looked closely at what it really means when a certain programming language, tool, or trend is declared to be ‘dead’. It seems, I argued, that talking about death in respect of different aspects of the tech industry is as much a signal about one’s identity and values as a developer as it is an accurate description of a particular ‘thing’s’ reality. To focus on how these debates and conversations play out in practice I decided to take a look at 3 programming languages, each of which has been described as dead or dying at some point. What I found might not surprise you, but it nevertheless highlights that the different opinions a certain person or community has about a language reflects their needs and challenges as software engineers. Is Java dead? One of the biggest areas of debate in terms of living, thriving or dying, is Java. There are a number of reasons for this. The biggest is the simple fact that it’s so widely used. With so many developers using the language for a huge range of reasons, it’s not surprising to find such a diversity of opinion across its developer community. Another reason is that Java is so well-established as a programming language. Although it’s a matter of debate whether it’s declining or dying, it certainly can’t be said to be emerging or growing at any significant pace. Java is part of the industry mainstream now. You’d think that might mean it’s holding up. But when you consider that this is an industry that doesn’t just embrace change and innovation, but one that depends on it for its value, you can begin to see that Java has occupied a slightly odd space for some time. Why do people think Java is dead? Java has been on the decline for a number of years. If you look at the TIOBE index from the mid to late part of this decade it has been losing percentage points. From May 2016 to May 2017, for example, the language declined 6% - this indicates that it’s losing mindshare to other languages. A further reason for its decline is the rise of Kotlin. Although Java has for a long time been the defining language of Android development, in recent years its reputation has taken a hit as Kotlin has become more widely adopted. As this Medium article from 2018 argues, it’s not necessarily a great idea to start a new Android project with Java. The threat to Java isn’t only coming from Kotlin - it’s coming from Scala too. Scala is another language based on the JVM (Java Virtual Machine). It supports both object oriented and functional programming, offering many performance advantages over Java, and is being used for a wide range of use cases - from machine learning to application development. Reasons why Java isn’t dead Although the TIOBE index has shown Java to be a language in decline, it nevertheless remains comfortably at the top of the table. It might have dropped significantly between 2016 and 2017, but more recently its decline has slowed: it has dropped only 0.92% between October 2018 and October 2019. From this perspective, it’s simply bizarre to suggest that Java is ‘dead’ or ‘dying’: it’s de facto the most widely used programming language on the planet. When you factor in everything else that that entails - the massive community means more support, an extensive ecosystem of frameworks, libraries and other tools (note Spring Boot’s growth as a response to the microservice revolution). So, while Java’s age might seem like a mark against it, it’s also a reason why there’s still a lot of life in it. 
At a more basic level, Java is ubiquitous; it’s used inside a massive range of applications. Insofar as it’s inside live apps it’s alive. That means Java developers will be in demand for a long time yet. The verdict: is Java dead or alive? Java is very much alive and well. But there are caveats: ultimately, it’s not a language that’s going to help you solve problems in creative or innovative ways. It will allow you to build things and get projects off the ground, but it’s arguably a solid foundation on which you will need to build more niche expertise and specialisation to be a really successful engineer. Is JavaScript dead? Although Java might be the most widely used programming language in the world, JavaScript is another ubiquitous language that incites a diverse range of opinions and debate. One of the reasons for this is that some people seriously hate JavaScript. The consensus on Java is a low level murmur of ‘it’s fine’, but with JavaScript things are far more erratic. This is largely because of JavaScript’s evolution. For a long time it was playing second fiddle to PHP in the web development arena because it was so unstable - it was treated with a kind of stigma as if it weren’t a ‘real language.’ Over time that changed, thanks largely to HTML5 and improved ES6 standards, but there are still many quirks that developers don’t like. In particular, JavaScript isn’t a nice thing to grapple with if you’re used to, say, Java or C. Unlike those languages its an interpreted not a compiled programming language. So, why do people think it’s dead? Why do people think JavaScript is dead? There are a number of very different reasons why people argue that JavaScript is dead. On the one hand, the rise of templates, and out of the box CMS and eCommerce solutions mean the use of JavaScript for ‘traditional’ web development will become less important. Essentially, the thinking goes, the barrier to entry is lower, which means there will be fewer people using JavaScript for web development. On the other hand people look at the emergence of Web Assembly as the death knell for JavaScript. Web Assembly (or Wasm) is “a binary instruction format for a stack-based virtual machine” (that’s from the project’s website), which means that code can be compiled into a binary format that can be read by a browser. This means you can bring high level languages such as Rust to the browser. To a certain extent, then, you’d think that Web Assembly would lead to the growth of languages that at the moment feel quite niche. Read next: Introducing Woz, a Progressive WebAssembly Application (PWA + Web Assembly) generator written entirely in Rust Reasons why JavaScript isn’t dead First, let’s counter the arguments above: in the first instance, out of the box solutions are never going to replace web developers. Someone needs to build those products, and even if organizations choose to use them, JavaScript is still a valuable language for customizing and reshaping purpose-built solutions. While the barrier to entry to getting a web project up and running might be getting lower, it’s certainly not going to kill JavaScript. Indeed, you could even argue that the pool is growing as you have people starting to pick up some of the basic elements of the web. On the Web Assembly issue: this is a slightly more serious threat to JavaScript, but it’s important to remember that Web Assembly was never designed to simply ape the existing JavaScript use case. 
As this useful article explains: “...They solve two different issues: JavaScript adds basic interactivity to the web and DOM while WebAssembly adds the ability to have a robust graphical engine on the web. WebAssembly doesn’t solve the same issues that JavaScript does because it has no knowledge of the DOM. Until it does, there’s no way it could replace JavaScript.” Web Assembly might even renew faith in JavaScript. By tackling some of the problems that many developers complain about, it means the language can be used for problems it is better suited to solve. But aside from all that, there are a wealth of other reasons that JavaScript is far from dead. React continues to grow in popularity, as does Node.js - the latter in particular is influential in how it has expanded what’s possible with the language, moving from the browser to the server. The verdict: Is JavaScript dead or alive? JavaScript is very much alive and well, however much people hate it. With such a wide ecosystem of tools surrounding it, the way that it’s used might change, but the language is here to stay and has a bright future. Is C dead? C is one of the oldest programming languages around (it’s approaching its 50th birthday). It’s a language that has helped build the foundations of the software world as we know it today, including just about every operating system. But although it’s a fundamental part of the technology landscape, there are murmurs that it’s just not up to the job any more… Why do people think that C is dead? If you want to get a sense of the division of opinion around C you could do a lot worse than this article on TechCrunch. “C is no longer suitable for this world which C has built,” explains engineer Jon Evans. “C has become a monster. It gives its users far too much artillery with which to shoot their feet off. Copious experience has taught us all, the hard way, that it is very difficult, verging on ‘basically impossible,’ to write extensive amounts of C code that is not riddled with security holes.” The security concerns are reflected elsewhere, with one writer arguing that “no one is creating new unsafe languages. It’s not plausible to say that this is because C and C++ are perfect; even the staunchest proponent knows that they have many flaws. The reason that people are not creating new unsafe languages is that there is no demand. The future is safe languages.” Added to these concerns is the rise of Rust - it could, some argue, be an alternative to C (and C++) for lower level systems programming that is more modern, safer and easier to use. Reasons why C isn’t dead Perhaps the most obvious reason why C isn’t dead is the fact that it’s so integral to so much software that we use today. We’re not just talking about your standard legacy systems; C is inside the operating systems that allow us to interface with software and machines. One of the arguments often made against C is that ‘the web is taking over’, as if software in general is moving up levels of abstraction that make languages at a machine level all but redundant. Aside from that argument being plain stupid (ie. what’s the web built on?), with IoT and embedded computing growing at a rapid rate, it’s only going to make C more important. To return to our good friend the TIOBE Index: C is in second place, the same position it held in October 2018. Like Java, then, it’s holding its own in spite of rumors. Unlike Java, moreover, C’s rating has actually increased over the course of a year. 
Not a massive amount, admittedly (0.82%), but a solid performance that suggests it's a long way from dead.

Read next: Why does the C programming language refuse to die?

The verdict: Is C dead or alive?

C is very much alive and well. It's old, sure, but it's buried inside too much of our existing software infrastructure for it to simply be cast aside. This isn't to say it's without flaws; from a security and accessibility perspective, we're likely to see languages like Rust gradually grow in popularity to tackle some of the challenges that C poses. But an equally important point to consider is just how fundamental C is for people who want to really understand programming in depth. Even if it doesn't necessarily have a wide range of use cases, the fact that it can give developers and engineers an insight into how code works at various levels of the software stack means it will always remain a language that demands attention.

Conclusion: Listen to multiple perspectives on programming languages before making a judgement

The obvious conclusion to draw from all this is that people should just stop being so damn opinionated. But I don't actually think that's correct: people should keep being opinionated and argumentative. There's no place for snobbery or exclusion, but anyone who has a view on something's value should certainly express it. It helps other people understand the language in a way that's not possible through documentation or more typical learning content. What's important is that we read opinions with a critical eye: what's this person's agenda? What's their background? What are they trying to do? After all, there are things far more important than whether something is dead or alive; building great software we can be proud of is one of them.

article-image-dynamic-graphics
Packt
22 Feb 2016
64 min read
Save for later

Dynamic Graphics

There is no question that the rendering system of modern graphics devices is complicated. Even rendering a single triangle to the screen engages many of these components, since GPUs are designed for large amounts of parallelism, as opposed to CPUs, which are designed to handle virtually any computational scenario. Modern graphics rendering is a high-speed dance of processing and memory management that spans software, hardware, multiple memory spaces, multiple languages, multiple processors, multiple processor types, and a large number of special-case features that can be thrown into the mix. To make matters worse, every graphics situation we will come across is different in its own way. Running the same application against a different device, even by the same manufacturer, often results in an apples-versus-oranges comparison due to the different capabilities and functionality they provide. It can be difficult to determine where a bottleneck resides within such a complex chain of devices and systems, and it can take a lifetime of industry work in 3D graphics to have a strong intuition about the source of performance issues in modern graphics systems. Thankfully, Profiling comes to the rescue once again. If we can gather data about each component, use multiple performance metrics for comparison, and tweak our Scenes to see how different graphics features affect their behavior, then we should have sufficient evidence to find the root cause of the issue and make appropriate changes. So in this article, you will learn how to gather the right data, dig just deep enough into the graphics system to find the true source of the problem, and explore various solutions to work around a given problem. There are many more topics to cover when it comes to improving rendering performance, so in this article we will begin with some general techniques on how to determine whether our rendering is limited by the CPU or by the GPU, and what we can do about either case. We will discuss optimization techniques such as Occlusion Culling and Level of Detail (LOD) and provide some useful advice on Shader optimization, as well as large-scale rendering features such as lighting and shadows. Finally, since mobile devices are a common target for Unity projects, we will also cover some techniques that may help improve performance on limited hardware. (For more resources related to this topic, see here.) Profiling rendering issues Poor rendering performance can manifest itself in a number of ways, depending on whether the device is CPU-bound, or GPU-bound; in the latter case, the root cause could originate from a number of places within the graphics pipeline. This can make the investigatory stage rather involved, but once the source of the bottleneck is discovered and the problem is resolved, we can expect significant improvements as small fixes tend to reap big rewards when it comes to the rendering subsystem. The CPU sends rendering instructions through the graphics API, that funnel through the hardware driver to the GPU device, which results in commands entering the GPU's Command Buffer. These commands are processed by the massively parallel GPU system one by one until the buffer is empty. But there are a lot more nuances involved in this process. 
The following shows a (greatly simplified) diagram of a typical GPU pipeline (which can vary based on technology and various optimizations), and the broad rendering steps that take place during each stage: The top row represents the work that takes place on the CPU, the act of calling into the graphics API, through the hardware driver, and pushing commands into the GPU. Ergo, a CPU-bound application will be primarily limited by the complexity, or sheer number, of graphics API calls. Meanwhile, a GPU-bound application will be limited by the GPU's ability to process those calls, and empty the Command Buffer in a reasonable timeframe to allow for the intended frame rate. This is represented in the next two rows, showing the steps taking place in the GPU. But, because of the device's complexity, they are often simplified into two different sections: the front end and the back end. The front end refers to the part of the rendering process where the GPU has received mesh data, a draw call has been issued, and all of the information that was fed into the GPU is used to transform vertices and run through Vertex Shaders. Finally, the rasterizer generates a batch of fragments to be processed in the back end. The back end refers to the remainder of the GPU's processing stages, where fragments have been generated, and now they must be tested, manipulated, and drawn via Fragment Shaders onto the frame buffer in the form of pixels. Note that "Fragment Shader" is the more technically accurate term for Pixel Shaders. Fragments are generated by the rasterization stage, and only technically become pixels once they've been processed by the Shader and drawn to the Frame Buffer. There are a number of different approaches we can use to determine where the root cause of a graphics rendering issue lies: profiling the GPU with the Profiler, examining individual frames with the Frame Debugger, and brute force culling.

GPU profiling

Because graphics rendering involves both the CPU and GPU, we must examine the problem using both the CPU Usage and GPU Usage areas of the Profiler, as this can tell us which component is working hardest. For example, the following screenshot shows the Profiler data for a CPU-bound application. The test involved creating thousands of simple objects, with no batching techniques taking place. This resulted in an extremely large Draw Call count (around 15,000) for the CPU to process, while giving the GPU relatively little work to do due to the simplicity of the objects being rendered: This example shows that the CPU's "rendering" task is consuming a large number of cycles (around 30 ms per frame), while the GPU is only processing for less than 16 ms, indicating that the bottleneck resides in the CPU. Meanwhile, Profiling a GPU-bound application via the Profiler is a little trickier. This time, the test involves creating a small number of high polycount objects (for a low Draw Call per vertex ratio), with dozens of real-time point lights and an excessively complex Shader with a texture, normal texture, heightmap, emission map, occlusion map, and so on (for a high workload per pixel ratio). The following screenshot shows Profiler data for the example Scene when it is run in a standalone application: As we can see, the rendering task of the CPU Usage area matches closely with the total rendering costs of the GPU Usage area. We can also see that the CPU and GPU time costs at the bottom of the image are relatively similar (41.48 ms versus 38.95 ms).
This is very unintuitive as we would expect the GPU to be working much harder than the CPU. Be aware that the CPU/GPU millisecond cost values are not calculated or revealed unless the appropriate Usage Area has been added to the Profiler window. However, let's see what happens when we test the same exact Scene through the Editor: This is a better representation of what we would expect to see in a GPU-bound application. We can see how the CPU and GPU time costs at the bottom are closer to what we would expect to see (2.74 ms vs 64.82 ms). However, this data is highly polluted. The spikes in the CPU and GPU Usage areas are the result of the Profiler Window UI updating during testing, and the overhead cost of running through the Editor is also artificially increasing the total GPU time cost. It is unclear what causes the data to be treated this way, and this could certainly change in the future if enhancements are made to the Profiler in future versions of Unity, but it is useful to know this drawback. Trying to determine whether our application is truly GPU-bound is perhaps the only good excuse to perform a Profiler test through the Editor. The Frame Debugger A new feature in Unity 5 is the Frame Debugger, a debugging tool that can reveal how the Scene is rendered and pieced together, one Draw Call at a time. We can click through the list of Draw Calls and observe how the Scene is rendered up to that point in time. It also provides a lot of useful details for the selected Draw Call, such as the current render target (for example, the shadow map, the camera depth texture, the main camera, or other custom render targets), what the Draw Call did (drawing a mesh, drawing a static batch, drawing depth shadows, and so on), and what settings were used (texture data, vertex colors, baked lightmaps, directional lighting, and so on). The following screenshot shows a Scene that is only being partially rendered due to the currently selected Draw Call within the Frame Debugger. Note the shadows that are visible from baked lightmaps that were rendered during an earlier pass before the object itself is rendered: If we are bound by Draw Calls, then this tool can be effective in helping us figure out what the Draw Calls are being spent on, and determine whether there are any unnecessary Draw Calls that are not having an effect on the scene. This can help us come up with ways to reduce them, such as removing unnecessary objects or batching them somehow. We can also use this tool to observe how many additional Draw Calls are consumed by rendering features, such as shadows, transparent objects, and many more. This could help us, when we're creating multiple quality levels for our game, to decide what features to enable/disable under the low, medium, and high quality settings. Brute force testing If we're poring over our Profiling data, and we're still not sure we can determine the source of the problem, we can always try the brute force method: cull a specific activity from the Scene and see if it results in greatly increased performance. If a small change results in a big speed improvement, then we have a strong clue about where the bottleneck lies. There's no harm in this approach if we eliminate enough unknown variables to be sure the data is leading us in the right direction. We will cover different ways to brute force test a particular issue in each of the upcoming sections. 
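To make this kind of brute force experiment quick to repeat, it can help to wire the toggling up to a key press rather than fiddling with the Hierarchy mid-test. The following is only a minimal sketch of that idea; the class name, the key binding, and the rootToToggle field are all our own inventions, so adapt them to whatever grouping of objects is under suspicion:

using UnityEngine;

// Hypothetical brute force helper: pressing R toggles every Renderer beneath the assigned root.
public class BruteForceRenderToggle : MonoBehaviour
{
    public GameObject rootToToggle; // parent object holding the suspect geometry
    private bool _visible = true;

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.R) && rootToToggle != null)
        {
            _visible = !_visible;
            foreach (Renderer r in rootToToggle.GetComponentsInChildren<Renderer>(true))
            {
                r.enabled = _visible;
            }
        }
    }
}

If disabling a single group of renderers causes the frame time to collapse, we have found a promising place to start digging.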
CPU-bound If our application is CPU-bound, then we will observe a generally poor FPS value within the CPU Usage area of the Profiler window due to the rendering task. However, if VSync is enabled the data will often get muddied up with large spikes representing pauses as the CPU waits for the screen refresh rate to come around before pushing the current frame buffer. So, we should make sure to disable the VSync block in the CPU Usage area before deciding the CPU is the problem. Brute-forcing a test for CPU-bounding can be achieved by reducing Draw Calls. This is a little unintuitive since, presumably, we've already been reducing our Draw Calls to a minimum through techniques such as Static and Dynamic Batching, Atlasing, and so forth. This would mean we have very limited scope for reducing them further. What we can do, however, is disable the Draw-Call-saving features such as batching and observe if the situation gets significantly worse than it already is. If so, then we have evidence that we're either already, or very close to being, CPU-bound. At this point, we should see whether we can re-enable these features and disable rendering for a few choice objects (preferably those with low complexity to reduce Draw Calls without over-simplifying the rendering of our scene). If this results in a significant performance improvement then, unless we can find further opportunities for batching and mesh combining, we may be faced with the unfortunate option of removing objects from our scene as the only means of becoming performant again. There are some additional opportunities for Draw Call reduction, including Occlusion Culling, tweaking our Lighting and Shadowing, and modifying our Shaders. These will be explained in the following sections. However, Unity's rendering system can be multithreaded, depending on the targeted platform, which version of Unity we're running, and various settings, and this can affect how the graphics subsystem is being bottlenecked by the CPU, and slightly changes the definition of what being CPU-bound means. Multithreaded rendering Multithreaded rendering was first introduced in Unity v3.5 in February 2012, and enabled by default on multicore systems that could handle the workload; at the time, this was only PC, Mac, and Xbox 360. Gradually, more devices were added to this list, and since Unity v5.0, all major platforms now enable multithreaded rendering by default (and possibly some builds of Unity 4). Mobile devices were also starting to feature more powerful CPUs that could support this feature. Android multithreaded rendering (introduced in Unity v4.3) can be enabled through a checkbox under Platform Settings | Other Settings | Multithreaded Rendering. Multithreaded rendering on iOS can be enabled by configuring the application to make use of the Apple Metal API (introduced in Unity v4.6.3), under Player Settings | Other Settings | Graphics API. When multithreaded rendering is enabled, tasks that must go through the rendering API (OpenGL, DirectX, or Metal), are handed over from the main thread to a "worker thread". The worker thread's purpose is to undertake the heavy workload that it takes to push rendering commands through the graphics API and driver, to get the rendering instructions into the GPU's Command Buffer. This can save an enormous number of CPU cycles for the main thread, where the overwhelming majority of other CPU tasks take place. This means that we free up extra cycles for the majority of the engine to process physics, script code, and so on. 
Incidentally, the mechanism by which the main thread notifies the worker thread of tasks operates in a very similar way to the Command Buffer that exists on the GPU, except that the commands are much more high-level, with instructions like "render this object, with this Material, using this Shader", or "draw N instances of this piece of procedural geometry", and so on. This feature has been exposed in Unity 5 to allow developers to take direct control of the rendering subsystem from C# code. This customization is not as powerful as having direct API access, but it is a step in the right direction for Unity developers to implement unique graphical effects. Confusingly, the Unity API name for this feature is called "CommandBuffer", so be sure not to confuse it with the GPU's Command Buffer. Check the Unity documentation on CommandBuffer to make use of this feature: http://docs.unity3d.com/ScriptReference/Rendering.CommandBuffer.html. Getting back to the task at hand, when we discuss the topic of being CPU-bound in graphics rendering, we need to keep in mind whether or not the multithreaded renderer is being used, since the actual root cause of the problem will be slightly different depending on whether this feature is enabled or not. In single-threaded rendering, where all graphics API calls are handled by the main thread, and in an ideal world where both components are running at maximum capacity, our application would become bottlenecked on the CPU when 50 percent or more of the time per frame is spent handling graphics API calls. However, resolving these bottlenecks can be accomplished by freeing up work from the main thread. For example, we might find that greatly reducing the amount of work taking place in our AI subsystem will improve our rendering significantly because we've freed up more CPU cycles to handle the graphics API calls. But, when multithreaded rendering is taking place, this task is pushed onto the worker thread, which means the same thread isn't being asked to manage both engine work and graphics API calls at the same time. These processes are mostly independent, and even though additional work must still take place in the main thread to send instructions to the worker thread in the first place (via the internal CommandBuffer system), it is mostly negligible. This means that reducing the workload in the main thread will have little-to-no effect on rendering performance. Note that being GPU-bound is the same regardless of whether multithreaded rendering is taking place. GPU Skinning While we're on the subject of CPU-bounding, one task that can help reduce CPU workload, at the expense of additional GPU workload, is GPU Skinning. Skinning is the process where mesh vertices are transformed based on the current location of their animated bones. The animation system, working on the CPU, only transforms the bones, but another step in the rendering process must take care of the vertex transformations to place the vertices around those bones, performing a weighted average over the bones connected to those vertices. This vertex processing task can either take place on the CPU or within the front end of the GPU, depending on whether the GPU Skinning option is enabled. This feature can be toggled under Edit | Project Settings | Player Settings | Other Settings | GPU Skinning. Front end bottlenecks It is not uncommon to use a mesh that contains a lot of unnecessary UV and Normal vector data, so our meshes should be double-checked for this kind of superfluous fluff. 
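As a rough way of spotting that superfluous fluff, we can dump each mesh's vertex count and optional channels to the Console. This is only a sketch intended for use in the Editor (meshes in a standalone build need Read/Write enabled for their data to be accessible from script), and the MeshChannelAudit name is our own:

using UnityEngine;

// Editor-time diagnostic: logs which optional vertex channels each mesh actually carries,
// so unused normals, tangents, secondary UVs, or vertex colors are easy to spot.
public class MeshChannelAudit : MonoBehaviour
{
    void Start()
    {
        foreach (MeshFilter mf in FindObjectsOfType<MeshFilter>())
        {
            Mesh mesh = mf.sharedMesh;
            if (mesh == null) continue;
            Debug.Log(string.Format("{0}: {1} verts, normals={2}, tangents={3}, uv2={4}, colors={5}",
                mf.name, mesh.vertexCount,
                mesh.normals.Length > 0, mesh.tangents.Length > 0,
                mesh.uv2.Length > 0, mesh.colors.Length > 0));
        }
    }
}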
We should also let Unity optimize the structure for us, which minimizes cache misses as vertex data is read within the front end. We will also learn some useful Shader optimization techniques shortly, when we begin to discuss back end optimizations, since many optimization techniques apply to both Fragment and Vertex Shaders. The only attack vector left to cover is finding ways to reduce actual vertex counts. The obvious solutions are simplification and culling; either have the art team replace problematic meshes with lower polycount versions, and/or remove some objects from the scene to reduce the overall polygon count. If these approaches have already been explored, then the last approach we can take is to find some kind of middle ground between the two. Level Of Detail Since it can be difficult to tell the difference between a high quality distance object and a low quality one, there is very little reason to render the high quality version. So, why not dynamically replace distant objects with something more simplified? Level Of Detail (LOD), is a broad term referring to the dynamic replacement of features based on their distance or form factor relative to the camera. The most common implementation is mesh-based LOD: dynamically replacing a mesh with lower and lower detailed versions as the camera gets farther and farther away. Another example might be replacing animated characters with versions featuring fewer bones, or less sampling for distant objects, in order to reduce animation workload. The built-in LOD feature is available in the Unity 4 Pro Edition and all editions of Unity 5. However, it is entirely possible to implement it via Script code in Unity 4 Free Edition if desired. Making use of LOD can be achieved by placing multiple objects in the Scene and making them children of a GameObject with an attached LODGroup component. The LODGroup's purpose is to generate a bounding box from these objects, and decide which object should be rendered based on the size of the bounding box within the camera's field of view. If the object's bounding box consumes a large area of the current view, then it will enable the mesh(es) assigned to lower LOD groups, and if the bounding box is very small, it will replace the mesh(es) with those from higher LOD groups. If the mesh is too far away, it can be configured to hide all child objects. So, with the proper setup, we can have Unity replace meshes with simpler alternatives, or cull them entirely, which eases the burden on the rendering process. Check the Unity documentation for more detailed information on the LOD feature: http://docs.unity3d.com/Manual/LevelOfDetail.html. This feature can cost us a large amount of development time to fully implement; artists must generate lower polygon count versions of the same object, and level designers must generate LOD groups, configure them, and test them to ensure they don't cause jarring transitions as the camera moves closer or farther away. It also costs us in memory and runtime CPU; the alternative meshes need to be kept in memory, and the LODGroup component must routinely test whether the camera has moved to a new position that warrants a change in LOD level. In this era of graphics card capabilities, vertex processing is often the least of our concerns. Combined with the additional sacrifices needed for LOD to function, developers should avoid preoptimizing by automatically assuming LOD will help them. 
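For reference, the same LODGroup setup can be scripted rather than configured by hand in the Inspector. The sketch below assumes a parent object with two child meshes named "RockHigh" and "RockLow" (placeholder names of our own) and builds a two-level group from them; the transition percentages are arbitrary examples:

using UnityEngine;

// Builds a two-level LODGroup at runtime from two child renderers.
public class LodSetupExample : MonoBehaviour
{
    void Start()
    {
        Renderer[] highDetail = transform.Find("RockHigh").GetComponentsInChildren<Renderer>();
        Renderer[] lowDetail  = transform.Find("RockLow").GetComponentsInChildren<Renderer>();

        LODGroup group = gameObject.AddComponent<LODGroup>();
        LOD[] lods = new LOD[2];
        lods[0] = new LOD(0.5f, highDetail); // high detail while the object fills more than 50% of view height
        lods[1] = new LOD(0.1f, lowDetail);  // low detail down to 10% of view height, culled below that
        group.SetLODs(lods);
        group.RecalculateBounds();
    }
}

Scripting the setup does nothing to reduce the memory and testing costs just described; it simply makes the configuration repeatable.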
Excessive use of the feature will lead to burdening other parts of our application's performance, and chew up precious development time, all for the sake of paranoia. If it hasn't been proven to be a problem, then it's probably not a problem! Scenes that feature large, expansive views of the world, and lots of camera movement, should consider implementing this technique very early, as the added distance and massive number of visible objects will exacerbate the vertex count enormously. Scenes that are always indoors, or feature a camera with a viewpoint looking down at the world (real-time strategy and MOBA games, for example) should probably steer clear of implementing LOD from the beginning. Games somewhere between the two should avoid it until necessary. It all depends on how many vertices are expected to be visible at any given time and how much variability in camera distance there will be. Note that some game development middleware companies offer third-party tools for automated LOD mesh generation. These might be worth investigating to compare their ease of use versus quality loss versus cost effectiveness.

Disable GPU Skinning

As previously mentioned, we could enable GPU Skinning to reduce the burden on a CPU-bound application, but enabling this feature will push the same workload into the front end of the GPU. Since Skinning is one of those "embarrassingly parallel" processes that fits well with the GPU's parallel architecture, it is often a good idea to perform the task on the GPU. But this task can chew up precious time in the front end preparing the vertices for fragment generation, so disabling it is another option we can explore if we're bottlenecked in this area. Again, this feature can be toggled under Edit | Project Settings | Player Settings | Other Settings | GPU Skinning. GPU Skinning is available in Unity 4 Pro Edition, and all editions of Unity 5.

Reduce tessellation

There is one last task that takes place in the front end process and that we need to consider: tessellation. Tessellation through Geometry Shaders can be a lot of fun, as it is a relatively underused technique that can really make our graphical effects stand out from the crowd of games that only use the most common effects. But, it can contribute enormously to the amount of processing work taking place in the front end. There are no simple tricks we can exploit to improve tessellation, besides improving our tessellation algorithms, or easing the burden caused by other front end tasks to give our tessellation tasks more room to breathe. Either way, if we have a bottleneck in the front end and are making use of tessellation techniques, we should double-check that they are not consuming the lion's share of the front end's budget.

Back end bottlenecks

The back end is the more interesting part of the GPU pipeline, as many more graphical effects take place during this stage. Consequently, it is the stage that is significantly more likely to suffer from bottlenecks. There are two brute force tests we can attempt: reduce the resolution, or reduce texture quality. These changes will ease the workload during two important stages at the back end of the pipeline: fill rate and memory bandwidth, respectively. Fill rate tends to be the most common source of bottlenecks in the modern era of graphics rendering, so we will cover it first.

Fill rate

By reducing screen resolution, we have asked the rasterization system to generate significantly fewer fragments and transpose them over a smaller canvas of pixels.
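One convenient way to run the resolution half of that test is from script, so the drop can be toggled in a standalone build without rebuilding the player. The key bindings and target resolutions below are arbitrary examples, and Screen.SetResolution typically has no effect in the Editor's Game view:

using UnityEngine;

// Brute force test for fill rate: drop the output resolution and watch the frame rate.
public class ResolutionDropTest : MonoBehaviour
{
    void Update()
    {
        if (Input.GetKeyDown(KeyCode.F1))
        {
            Screen.SetResolution(800, 600, Screen.fullScreen);   // heavily reduced resolution
        }
        if (Input.GetKeyDown(KeyCode.F2))
        {
            Screen.SetResolution(2560, 1440, Screen.fullScreen); // back to the original target
        }
    }
}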
This will reduce the fill rate consumption of the application, giving a key part of the rendering pipeline some additional breathing room. Ergo, if performance suddenly improves with a screen resolution reduction, then fill rate should be our primary concern. Fill rate is a very broad term referring to the speed at which the GPU can draw fragments. But, this only includes fragments that have survived all of the various conditional tests we might have enabled within the given Shader. A fragment is merely a "potential pixel," and if it fails any of the enabled tests, then it is immediately discarded. This can be an enormous performance-saver as the pipeline can skip the costly drawing step and begin work on the next fragment instead. One such example is Z-testing, which checks whether a fragment from a closer object has already been drawn to the same pixel. If so, then the current fragment is discarded. If not, then the fragment is pushed through the Fragment Shader and drawn over the target pixel, which consumes exactly one draw from our fill rate. Now imagine multiplying this process by thousands of overlapping objects, each generating hundreds or thousands of possible fragments, with high screen resolutions causing millions, or billions, of fragments to be generated each and every frame. It should be fairly obvious that skipping as many of these draws as we can will result in big rendering cost savings. Graphics card manufacturers typically advertise a particular fill rate as a feature of the card, usually in the form of gigapixels per second, but this is a bit of a misnomer, as it would be more accurate to call it gigafragments per second; however, this argument is mostly academic. Either way, larger values tell us that the device can potentially push more fragments through the pipeline, so with a budget of 30 GPix/s and a target frame rate of 60 Hz, we can afford to process 30,000,000,000/60 = 500 million fragments per frame before being bottlenecked on fill rate. With a resolution of 2560x1440 (about 3.7 million pixels), and a best-case scenario where each pixel is only drawn over once, we could theoretically draw the entire scene about 135 times without any noticeable problems. Sadly, this is not a perfect world, and unless we take significant steps to avoid it, we will always end up with some amount of redraw over the same pixels due to the order in which objects are rendered. This is known as overdraw, and it can be very costly if we're not careful. The reason that resolution is a good attack vector to check for fill rate bounding is that it is a multiplier. A reduction from a resolution of 2560x1440 to 800x600 is an improvement factor of about eight, which could reduce fill rate costs enough to make the application perform well again.

Overdraw

Determining how much overdraw we have can be represented visually by rendering all objects with additive alpha blending and a very transparent flat color. Areas of high overdraw will show up more brightly as the same pixel is drawn over with additive blending multiple times. This is precisely how the Scene view's Overdraw shading mode reveals how much overdraw our scene is suffering. The following screenshot shows a scene with several thousand boxes drawn normally, and drawn using the Scene view's Overdraw shading mode: At the end of the day, fill rate is provided as a means of gauging the best-case behavior. In other words, it's primarily a marketing term and mostly theoretical.
But, the technical side of the industry has adopted the term as a way of describing the back end of the pipeline: the stage where fragment data is funneled through our Shaders and drawn to the screen. If every fragment required an absolute minimum level of processing (such as a Shader that returned a constant color), then we might get close to that theoretical maximum. The GPU is a complex beast, however, and things are never so simple. The nature of the device means it works best when given many small tasks to perform. But, if the tasks get too large, then fill rate is lost due to the back end not being able to push through enough fragments in time and the rest of the pipeline is left waiting for tasks to do. There are several more features that can potentially consume our theoretical fill rate maximum, including but not limited to alpha testing, alpha blending, texture sampling, the amount of fragment data being pulled through our Shaders, and even the color format of the target render texture (the final Frame Buffer in most cases). The bad news is that this gives us a lot of subsections to cover, and a lot of ways to break the process, but the good news is it gives us a lot of avenues to explore to improve our fill rate usage. Occlusion Culling One of the best ways to reduce overdraw is to make use of Unity's Occlusion Culling system. The system works by partitioning Scene space into a series of cells and flying through the world with a virtual camera making note of which cells are invisible from other cells (are occluded) based on the size and position of the objects present. Note that this is different to the technique of Frustum Culling, which culls objects not visible from the current camera view. This feature is always active in all versions, and objects culled by this process are automatically ignored by the Occlusion Culling system. Occlusion Culling is available in the Unity 4 Pro Edition and all editions of Unity 5. Occlusion Culling data can only be generated for objects properly labeled Occluder Static and Occludee Static under the StaticFlags dropdown. Occluder Static is the general setting for static objects where we want it to hide other objects, and be hidden by large objects in its way. Occludee Static is a special case for transparent objects that allows objects behind them to be rendered, but we want them to be hidden if something large blocks their visibility. Naturally, because one of the static flags must be enabled for Occlusion Culling, this feature will not work for dynamic objects. The following screenshot shows how effective Occlusion Culling can be at reducing the number of visible objects in our Scene: This feature will cost us in both application footprint and incur some runtime costs. It will cost RAM to keep the Occlusion Culling data structure in memory, and there will be a CPU processing cost to determine which objects are being occluded in each frame. The Occlusion Culling data structure must be properly configured to create cells of the appropriate size for our Scene, and the smaller the cells, the longer it takes to generate the data structure. But, if it is configured correctly for the Scene, Occlusion Culling can provide both fill rate savings through reduced overdraw, and Draw Call savings by culling non-visible objects. Shader optimization Shaders can be a significant fill rate consumer, depending on their complexity, how much texture sampling takes place, how many mathematical functions are used, and so on. 
Shaders do not directly consume fill rate, but do so indirectly because the GPU must calculate or fetch data from memory during Shader processing. The GPU's parallel nature means any bottleneck in a thread will limit how many fragments can be pushed into the thread at a later date, but parallelizing the task (sharing small pieces of the job between several agents) provides a net gain over serial processing (one agent handling each task one after another). The classic example is a vehicle assembly line. A complete vehicle requires multiple stages of manufacture to complete. The critical path to completion might involve five steps: stamping, welding, painting, assembly, and inspection, and each step is completed by a single team. For any given vehicle, no stage can begin before the previous one is finished, but whatever team handled the stamping for the last vehicle can begin stamping for the next vehicle as soon as it has finished. This organization allows each team to become masters of their particular domain, rather than trying to spread their knowledge too thin, which would likely result in less consistent quality in the batch of vehicles. We can double the overall output by doubling the number of teams, but if any team gets blocked, then precious time is lost for any given vehicle, as well as all future vehicles that would pass through the same team. If these delays are rare, then they can be negligible in the grand scheme, but if not, and one stage takes several minutes longer than normal each and every time it must complete the task, then it can become a bottleneck that threatens the release of the entire batch. The GPU parallel processors work in a similar way: each processor thread is an assembly line, each processing stage is a team, and each fragment is a vehicle. If the thread spends a long time processing a single stage, then time is lost on each fragment. This delay will multiply such that all future fragments coming through the same thread will be delayed. This is a bit of an oversimplification, but it often helps to paint a picture of how poorly optimized Shader code can chew up our fill rate, and how small improvements in Shader optimization provide big benefits in back end performance. Shader programming and optimization have become a very niche area of game development. Their abstract and highly-specialized nature requires a very different kind of thinking to generate Shader code compared to gameplay and engine code. They often feature mathematical tricks and back-door mechanisms for pulling data into the Shader, such as precomputing values in texture files. Because of this, and the importance of optimization, Shaders tend to be very difficult to read and reverse-engineer. Consequently, many developers rely on prewritten Shaders, or visual Shader creation tools from the Asset Store such as Shader Forge or Shader Sandwich. This simplifies the act of initial Shader code generation, but might not result in the most efficient form of Shaders. If we're relying on pre-written Shaders or tools, we might find it worthwhile to perform some optimization passes over them using some tried-and-true techniques. So, let's focus on some easily reachable ways of optimizing our Shaders. Consider using Shaders intended for mobile platforms The built-in mobile Shaders in Unity do not have any specific restrictions that force them to only be used on mobile devices. They are simply optimized for minimum resource usage (and tend to feature some of the other optimizations listed in this section). 
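A quick way to experiment is to swap a Material over to one of these built-in Shaders at runtime and compare both the look and the frame time. The sketch below uses the built-in "Mobile/Diffuse" Shader as a stand-in; whether it is an acceptable substitute depends entirely on what the original Shader was doing:

using UnityEngine;

// Swaps this Renderer's Material over to the built-in Mobile/Diffuse Shader for comparison.
public class MobileShaderSwapTest : MonoBehaviour
{
    private Shader _original;

    void Start()
    {
        Material mat = GetComponent<Renderer>().material; // instantiates a copy of the Material
        _original = mat.shader;                           // keep the original in case we want to restore it
        mat.shader = Shader.Find("Mobile/Diffuse");
    }
}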
Desktop applications are perfectly capable of using these Shaders, but they tend to feature a loss of graphical quality. It only becomes a question of whether the loss of graphical quality is acceptable. So, consider doing some testing with the mobile equivalents of common Shaders to see whether they are a good fit for your game.

Use small data types

GPUs can calculate with smaller data types more quickly than larger types (particularly on mobile platforms!), so the first tweak we can attempt is replacing our float data types (32-bit, floating point) with smaller versions such as half (16-bit, floating point), or even fixed (12-bit, fixed point). The size of the data types listed above will vary depending on what floating point formats the target platform prefers. The sizes listed are the most common. The importance for optimization is in the relative size between formats. Color values are good candidates for precision reduction, as we can often get away with less precise color values without any noticeable loss in coloration. However, the effects of reducing precision can be very unpredictable for graphical calculations. So, changes such as these can require some testing to verify whether the reduced precision is costing too much graphical fidelity. Note that the effects of these tweaks can vary enormously between one GPU architecture and another (for example, AMD versus Nvidia versus Intel), and even GPU brands from the same manufacturer. In some cases, we can make some decent performance gains for a trivial amount of effort. In other cases, we might see no benefit at all.

Avoid changing precision while swizzling

Swizzling is the Shader programming technique of creating a new vector (an array of values) from an existing vector by listing the components in the order in which we wish to copy them into the new structure. Here are some examples of swizzling:

float4 input = float4(1.0, 2.0, 3.0, 4.0); // initial test value
float2 val1 = input.yz;    // swizzle two components
float3 val2 = input.zyx;   // swizzle three components in a different order
float4 val3 = input.yyyy;  // swizzle the same component multiple times
float sclr = input.w;
float3 val4 = sclr.xxx;    // swizzle a scalar multiple times

We can use both the xyzw and rgba representations to refer to the same components, sequentially. It does not matter whether it is a color or vector; they just make the Shader code easier to read. We can also list components in any order we like to fill in the desired data, repeating them if necessary. Converting from one precision type to another in a Shader can be a costly operation, but converting the precision type while simultaneously swizzling can be particularly painful. If we have mathematical operations that rely on being swizzled into different precision types, it would be wiser if we simply absorbed the high-precision cost from the very beginning, or reduced precision across the board to avoid the need for changes in precision.

Use GPU-optimized helper functions

The Shader compiler often performs a good job of reducing mathematical calculations down to an optimized version for the GPU, but compiled custom code is unlikely to be as effective as both the Cg library's built-in helper functions and the additional helpers provided by the Unity Cg included files. If we are using Shaders that include custom function code, perhaps we can find an equivalent helper function within the Cg or Unity libraries that can do a better job than our custom code can.
These extra include files can be added to our Shader within the CGPROGRAM block like so:

CGPROGRAM
// other includes
#include "UnityCG.cginc"
// Shader code here
ENDCG

Example Cg library functions to use are abs() for absolute values, lerp() for linear interpolation, mul() for multiplying matrices, and step() for step functionality. Useful UnityCG.cginc functions include WorldSpaceViewDir() for calculating the direction towards the camera, and Luminance() for converting a color to grayscale. Check the following URL for a full list of Cg standard library functions: http://http.developer.nvidia.com/CgTutorial/cg_tutorial_appendix_e.html. Check the Unity documentation for a complete and up-to-date list of possible include files and their accompanying helper functions: http://docs.unity3d.com/Manual/SL-BuiltinIncludes.html.

Disable unnecessary features

Perhaps we can make savings by simply disabling Shader features that aren't vital. Does the Shader really need multiple passes, transparency, Z-writing, alpha-testing, and/or alpha blending? Will tweaking these settings or removing these features give us a good approximation of our desired effect without losing too much graphical fidelity? Making such changes is a good way of making fill rate cost savings.

Remove unnecessary input data

Sometimes the process of writing a Shader involves a lot of back and forth experimentation in editing code and viewing it in the Scene. The typical result of this is that input data that was needed when the Shader was going through early development is now surplus fluff once the desired effect has been obtained, and it's easy to forget what changes were made when/if the process drags on for a long time. But, these redundant data values can cost the GPU valuable time as they must be fetched from memory even if they are not explicitly used by the Shader. So, we should double check our Shaders to ensure all of their input geometry, vertex, and fragment data is actually being used.

Only expose necessary variables

Exposing unnecessary variables from our Shader to the accompanying Material(s) can be costly as the GPU can't assume these values are constant. This means the Shader code cannot be compiled into a more optimized form. This data must be pushed from the CPU with every pass since they can be modified at any time through the Material's methods such as SetColor(), SetFloat(), and so on. If we find that, towards the end of the project, we always use the same value for these variables, then they can be replaced with a constant in the Shader to remove such excess runtime workload. The only cost is obfuscating what could be critical graphical effect parameters, so this should be done very late in the process.

Reduce mathematical complexity

Complicated mathematics can severely bottleneck the rendering process, so we should do whatever we can to limit the damage. Complex mathematical functions could be replaced with a texture that is fed into the Shader and provides a pre-generated table for runtime lookup. We may not see any improvement with functions such as sin and cos, since they've been heavily optimized to make use of GPU architecture, but complex methods such as pow, exp, log, and other custom mathematical processes can only be optimized so much, and would be good candidates for simplification. This is assuming we only need one or two input values, which are represented through the X and Y coordinates of the texture, and mathematical accuracy isn't of paramount importance.
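As a hedged illustration of the lookup-table idea, the following sketch bakes pow(x, 2.2) into a 256x1 texture from script and hands it to a Material. The "_PowLookup" property name is hypothetical, and the Shader would need to be written to sample it in place of calling pow():

using UnityEngine;

// Bakes pow(x, 2.2) into a small lookup texture so a Shader can replace the pow() call
// with a single texture sample.
public class LookupTextureBaker : MonoBehaviour
{
    void Start()
    {
        Texture2D lookup = new Texture2D(256, 1, TextureFormat.RGBA32, false);
        lookup.wrapMode = TextureWrapMode.Clamp;
        for (int i = 0; i < 256; i++)
        {
            float x = i / 255f;
            float value = Mathf.Pow(x, 2.2f);
            lookup.SetPixel(i, 0, new Color(value, value, value, 1f));
        }
        lookup.Apply();
        GetComponent<Renderer>().material.SetTexture("_PowLookup", lookup);
    }
}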
This will cost us additional graphics memory to store the texture at runtime (more on this later), but if the Shader is already receiving a texture (which they are in most cases) and the alpha channel is not being used, then we could sneak the data in through the texture's alpha channel, costing us literally no performance, and the rest of the Shader code and graphics system would be none-the-wiser. This will involve the customization of art assets to include such data in any unused color channel(s), requiring coordination between programmers and artists, but is a very good way of saving Shader processing costs with no runtime sacrifices. In fact, Material properties and textures are both excellent entry points for pushing work from the Shader (the GPU) onto the CPU. If a complex calculation does not need to vary on a per pixel basis, then we could expose the value as a property in the Material, and modify it as needed (accepting the overhead cost of doing so from the previous section Only expose necessary variables). Alternatively, if the result varies per pixel, and does not need to change often, then we could generate a texture file from script code, containing the results of the calculations in the RGBA values, and pulling the texture into the Shader. Lots of opportunities arise when we ignore the conventional application of such systems, and remember to think of them as just raw data being transferred around. Reduce texture lookups While we're on the subject of texture lookups, they are not trivial tasks for the GPU to process and they have their own overhead costs. They are the most common cause of memory access problems within the GPU, especially if a Shader is performing samples across multiple textures, or even multiple samples across a single texture, as they will likely inflict cache misses in memory. Such situations should be simplified as much as possible to avoid severe GPU memory bottlenecking. Even worse, sampling a texture in a random order would likely result in some very costly cache misses for the GPU to suffer through, so if this is being done, then the texture should be reordered so that it can be sampled in a more sequential order. Avoid conditional statements In modern day CPU architecture, conditional statements undergo a lot of clever predictive techniques to make use of instruction-level parallelism. This is a feature where the CPU attempts to predict which direction a conditional statement will go in before it has actually been resolved, and speculatively begins processing the most likely result of the conditional using any free components that aren't being used to resolve the conditional (fetching some data from memory, copying some floats into unused registers, and so on). If it turns out that the decision is wrong, then the current result is discarded and the proper path is taken instead. So long as the cost of speculative processing and discarding false results is less than the time spent waiting to decide the correct path, and it is right more often than it is wrong, then this is a net gain for the CPU's speed. However, this feature is not possible on GPU architecture because of its parallel nature. The GPU's cores are typically managed by some higher-level construct that instructs all cores under its command to perform the same machine-code-level instruction simultaneously. So, if the Fragment Shader requires a float to be multiplied by 2, then the process will begin by having all cores copy data into the appropriate registers in one coordinated step. 
Only when all cores have finished copying to the registers will the cores be instructed to begin the second step: multiplying all registers by 2. Thus, when this system stumbles into a conditional statement, it cannot resolve the two statements independently. It must determine how many of its child cores will go down each path of the conditional, grab the list of required machine code instructions for one path, resolve them for all cores taking that path, and repeat for each path until all possible paths have been processed. So, for an if-else statement (two possibilities), it will tell one group of cores to process the "true" path, then ask the remaining cores to process the "false" path. Unless every core takes the same path, it must process both paths every time. So, we should avoid branching and conditional statements in our Shader code. Of course, this depends on how essential the conditional is to achieving the graphical effect we desire. But, if the conditional is not dependent on per pixel behavior, then we would often be better off absorbing the cost of unnecessary mathematics than inflicting a branching cost on the GPU. For example, we might be checking whether a value is non-zero before using it in a calculation, or comparing against some global flag in the Material before taking one action or another. Both of these cases would be good candidates for optimization by removing the conditional check.

Reduce data dependencies

The compiler will try its best to optimize our Shader code into the more GPU-friendly low-level language so that it is not waiting on data to be fetched when it could be processing some other task. For example, the following poorly-optimized code could be written in our Shader:

float sum = input.color1.r;
sum = sum + input.color2.g;
sum = sum + input.color3.b;
sum = sum + input.color4.a;
float result = CalculateSomething(sum);

If we were able to force the Shader compiler to compile this code into machine code instructions as it is written, then this code has a data dependency such that each calculation cannot begin until the last finishes due to the dependency on the sum variable. But, such situations are often detected by the Shader compiler and optimized into a version that uses instruction-level parallelism (the code shown next is the high-level code equivalent of the resulting machine code):

float sum1, sum2, sum3, sum4;
sum1 = input.color1.r;
sum2 = input.color2.g;
sum3 = input.color3.b;
sum4 = input.color4.a;
float sum = sum1 + sum2 + sum3 + sum4;
float result = CalculateSomething(sum);

In this case, the compiler would recognize that it can fetch the four values from memory in parallel and complete the summation once all four have been fetched independently via thread-level parallelism. This can save a lot of time, relative to performing the four fetches one after another. However, long chains of data dependency can absolutely murder Shader performance. If we create a strong data dependency in our Shader's source code, then it has been given no freedom to make such optimizations. For example, the following data dependency would be painful on performance, as one step cannot be completed without waiting on another to fetch data and perform the appropriate calculation:

float4 val1 = tex2D(_tex1, input.texcoord.xy);
float4 val2 = tex2D(_tex2, val1.yz);
float4 val3 = tex2D(_tex3, val2.zw);

Strong data dependencies such as these should be avoided whenever possible.
Surface Shaders If we're using Unity's Surface Shaders, which are a way for Unity developers to get to grips with Shader programming in a more simplified fashion, then the Unity Engine takes care of converting our Surface Shader code for us, abstracting away some of the optimization opportunities we have just covered. However, it does provide some miscellaneous values that can be used as replacements, which reduce accuracy but simplify the mathematics in the resulting code. Surface Shaders are designed to handle the general case fairly efficiently, but optimization is best achieved with a personal touch. The approxview attribute will approximate the view direction, saving costly operations. halfasview will reduce the precision of the view vector, but beware of its effect on mathematical operations involving multiple precision types. noforwardadd will limit the Shader to only considering a single directional light, reducing Draw Calls since the Shader will render in only a single pass, but reducing lighting complexity. Finally, noambient will disable ambient lighting in the Shader, removing some extra mathematical operations that we may not need. Use Shader-based LOD We can force Unity to render distant objects using simpler Shaders, which can be an effective way of saving fill rate, particularly if we're deploying our game onto multiple platforms or supporting a wide range of hardware capability. The LOD keyword can be used in the Shader to set the onscreen size factor that the Shader supports. If the current LOD level does not match this value, it will drop to the next fallback Shader and so on until it finds the Shader that supports the given size factor. We can also change a given Shader object's LOD value at runtime using the maximumLOD property. This feature is similar to the mesh-based LOD covered earlier, and uses the same LOD values for determining object form factor, so it should be configured as such. Memory bandwidth Another major component of back end processing and a potential source of bottlenecks is memory bandwidth. Memory bandwidth is consumed whenever a texture must be pulled from a section of the GPU's main video memory (also known as VRAM). The GPU contains multiple cores that each have access to the same area of VRAM, but they also each contain a much smaller, local Texture Cache that stores the current texture(s) the GPU has been most recently working with. This is similar in design to the multitude of CPU cache levels that allow memory transfer up and down the chain, as a workaround for the fact that faster memory will, invariably, be more expensive to produce, and hence smaller in capacity compared to slower memory. Whenever a Fragment Shader requests a sample from a texture that is already within the core's local Texture Cache, then it is lightning fast and barely perceivable. But, if a texture sample request is made, that does not yet exist within the Texture Cache, then it must be pulled in from VRAM before it can be sampled. This fetch request risks cache misses within VRAM as it tries to find the relevant texture. The transfer itself consumes a certain amount of memory bandwidth, specifically an amount equal to the total size of the texture file stored within VRAM (which may not be the exact size of the original file, nor the size in RAM, due to GPU-level compression). It's for this reason that, if we're bottlenecked on memory bandwidth, then performing a brute force test by reducing texture quality would suddenly result in a performance improvement. 
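That brute force test does not require touching every texture by hand; the Editor's Texture Quality setting (covered next) does it globally, and the same thing can be driven from script in a build through QualitySettings.masterTextureLimit, which skips the given number of top mipmap levels. A minimal sketch, with an arbitrary key binding:

using UnityEngine;

// Brute force test for memory bandwidth: cycle textures through full, half, and quarter resolution.
public class TextureQualityDropTest : MonoBehaviour
{
    void Update()
    {
        if (Input.GetKeyDown(KeyCode.T))
        {
            // 0 = full resolution, 1 = half, 2 = quarter
            QualitySettings.masterTextureLimit = (QualitySettings.masterTextureLimit + 1) % 3;
        }
    }
}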
We've shrunk the size of our textures, easing the burden on the GPU's memory bandwidth, allowing it to fetch the necessary textures much quicker. Globally reducing texture quality can be achieved by going to Edit | Project Settings | Quality | Texture Quality and setting the value to Half Res, Quarter Res, or Eighth Res. In the event that memory bandwidth is bottlenecked, the GPU will keep fetching the necessary texture files, but the entire process will be throttled as the Texture Cache waits for the data to appear before processing the fragment. The GPU won't be able to push data back to the Frame Buffer in time to be rendered onto the screen, blocking the whole process and culminating in a poor frame rate. Ultimately, proper usage of memory bandwidth is a budgeting concern. For example, with a memory bandwidth of 96 GB/sec per core and a target frame rate of 60 frames per second, the GPU can afford to pull 96/60 = 1.6 GB worth of texture data every frame before being bottlenecked on memory bandwidth. Memory bandwidth is often listed on a per core basis, but some GPU manufacturers may try to mislead you by multiplying memory bandwidth by the number of cores in order to list a bigger, but less practical number. Because of this, research may be necessary to confirm that the memory bandwidth limit we have for the target GPU hardware is given on a per core basis. Note that this value is not the maximum limit on the texture data that our game can contain in the project, nor in CPU RAM, nor even in VRAM. It is a metric that limits how much texture swapping can occur during one frame. The same texture could be pulled back and forth multiple times in a single frame depending on how many Shaders need to use it, the order that the objects are rendered, and how often texture sampling must occur, so rendering just a few objects could consume whole gigabytes of memory bandwidth if they all require the same high quality, massive textures, require multiple secondary texture maps (normal maps, emission maps, and so on), and are not batched together, because there simply isn't enough Texture Cache space available to keep a single texture file long enough to exploit it during the next rendering pass. There are several approaches we can take to solve bottlenecks in memory bandwidth.

Use less texture data

This approach is simple, straightforward, and always a good idea to consider. Reducing texture quality, either through resolution or bit rate, is not ideal for graphical quality, but we can sometimes get away with using 16-bit textures without any noticeable degradation. Mip Maps are another excellent way of reducing the amount of texture data being pushed back and forth between VRAM and the Texture Cache. Note that the Scene View has a Mipmaps Shading Mode, which will highlight textures in our scene blue or red depending on whether the current texture scale is appropriate for the current Scene View's camera position and orientation. This will help identify what textures are good candidates for further optimization. Mip Maps should almost always be used in 3D Scenes, unless the camera moves very little.

Test different GPU Texture Compression formats

Texture Compression techniques help reduce our application's footprint (executable file size) and runtime CPU memory usage, that is, the storage area where all texture resource data is kept until it is needed by the GPU. However, once the data reaches the GPU, it uses a different form of compression to keep texture data small.
The common formats are DXT, PVRTC, ETC, and ASTC. To make matters more confusing, each platform and GPU hardware supports different compression formats, and if the device does not support the given compression format, then it will be handled at the software level. In other words, the CPU will need to stop and recompress the texture to the desired format the GPU wants, as opposed to the GPU taking care of it with a specialized hardware chip. The compression options are only available if a texture resource has its Texture Type field set to Advanced. Using any of the other texture type settings will simplify the choices, and Unity will make a best guess when deciding which format to use for the target platform, which may not be ideal for a given piece of hardware and thus will consume more memory bandwidth than necessary. The best approach to determining the correct format is to simply test a bunch of different devices and Texture Compression techniques and find one that fits. For example, common wisdom says that ETC is the best choice for Android since more devices support it, but some developers have found their game works better with the DXT and PVRTC formats on certain devices. Beware that, if we're at the point where individually tweaking Texture Compression techniques is necessary, then hopefully we have exhausted all other options for reducing memory bandwidth. By going down this road, we could be committing to supporting many different devices each in their own specific way. Many of us would prefer to keep things simple with a general solution instead of personal customization and time-consuming handiwork to work around problems like this. Minimize texture sampling Can we modify our Shaders to remove some texture sampling overhead? Did we add some extra texture lookup files to give ourselves some fill rate savings on mathematical functions? If so, we might want to consider lowering the resolution of such textures or reverting the changes and solving our fill rate problems in other ways. Essentially, the less texture sampling we do, the less often we need to use memory bandwidth and the closer we get to resolving the bottleneck. Organize assets to reduce texture swaps This approach basically comes back to Batching and Atlasing again. Are there opportunities to batch some of our biggest texture files together? If so, then we could save the GPU from having to pull in the same texture files over and over again during the same frame. As a last resort, we could look for ways to remove some textures from the entire project and reuse similar files. For instance, if we have fill rate budget to spare, then we may be able to use some Fragment Shaders to make a handful of textures files appear in our game with different color variations. VRAM limits One last consideration related to textures is how much VRAM we have available. Most texture transfer from CPU to GPU occurs during initialization, but can also occur when a non-existent texture is first required by the current view. This process is asynchronous and will result in a blank texture being used until the full texture is ready for rendering. As such, we should avoid too much texture variation across our Scenes. Texture preloading Even though it doesn't strictly relate to graphics performance, it is worth mentioning that the blank texture that is used during asynchronous texture loading can be jarring when it comes to game quality. 
We would like a way to control and force the texture to be loaded from disk to the main memory and then to VRAM before it is actually needed. A common workaround is to create a hidden GameObject that features the texture and place it somewhere in the Scene on the route that the player will take towards the area where it is actually needed. As soon as the textured object becomes a candidate for the rendering system (even if it's technically hidden), it will begin the process of copying the data towards VRAM. This is a little clunky, but is easy to implement and works sufficiently well in most cases. We can also control such behavior via Script code by changing a hidden Material's main texture (a fuller sketch of this preloader appears below): GetComponent<Renderer>().material.mainTexture = textureToPreload;
Texture thrashing
In the rare event that too much texture data is loaded into VRAM, and the required texture is not present, the GPU will need to request it from the main memory and overwrite the existing texture data to make room. This is likely to worsen over time as the memory becomes fragmented, and it introduces a risk that the texture just flushed from VRAM needs to be pulled again within the same frame. This will result in a serious case of memory "thrashing", and should be avoided at all costs. This is less of a concern on modern consoles such as the PS4, Xbox One, and WiiU, since they share a common memory space for both CPU and GPU. This design is a hardware-level optimization given the fact that the device is always running a single application, and almost always rendering 3D graphics. But all other platforms must share time and space with multiple applications and be capable of running without a GPU. They therefore feature separate CPU and GPU memory, and we must ensure that the total texture usage at any given moment remains below the available VRAM of the target hardware. Note that this "thrashing" is not precisely the same as hard disk thrashing, where memory is copied back and forth between main memory and virtual memory (the swap file), but it is analogous. In either case, data is being unnecessarily copied back and forth between two regions of memory because too much data is being requested in too short a time period for the smaller of the two memory regions to hold it all. Thrashing such as this can be a common cause of dreadful graphics performance when games are ported from modern consoles to the desktop and should be treated with care. Avoiding this behavior may require customizing texture quality and file sizes on a per-platform and per-device basis. Be warned that some players are likely to notice these inconsistencies if we're dealing with hardware from the same console or desktop GPU generation. As many of us will know, even small differences in hardware can lead to a lot of apples-versus-oranges comparisons, but hardcore gamers will expect a similar level of quality across the board.
Lighting and Shadowing
Lighting and Shadowing can affect all parts of the graphics pipeline, and so they will be treated separately. This is perhaps one of the most important parts of game art and design to get right. Good Lighting and Shadowing can turn a mundane scene into something spectacular, as there is something magical about professional coloring that makes it visually appealing. Even the low-poly art style (think Monument Valley) relies heavily on a good lighting and shadowing profile in order to allow the player to distinguish one object from another.
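Returning to the texture preloading workaround described above, here is a fuller, hedged sketch of such a preloader component; the field name textureToPreload is our own and would be assigned in the Inspector.

using UnityEngine;

// A minimal sketch of the preloading trick: a renderer the camera can "see"
// (even if it is visually unobtrusive) whose material is assigned the texture
// we want resident in VRAM before the player reaches the area that needs it.
public class TexturePreloader : MonoBehaviour
{
    public Texture textureToPreload; // hypothetical field, assigned in the Inspector

    void Start()
    {
        // Assigning the texture to a renderable material prompts the copy
        // towards VRAM ahead of time.
        GetComponent<Renderer>().material.mainTexture = textureToPreload;
    }
}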
But, this isn't an art book, so we will focus on the performance characteristics of various Lighting and Shadowing features. Unity offers two styles of dynamic light rendering, as well as baked lighting effects through lightmaps. It also provides multiple ways of generating shadows with varying levels of complexity and runtime processing cost. Between the two, there are a lot of options to explore, and a lot of things that can trip us up if we're not careful. The Unity documentation covers all of these features in an excellent amount of detail (start with this page and work through them: http://docs.unity3d.com/Manual/Lighting.html), so we'll examine these features from a performance standpoint. Let's tackle the two main light rendering modes first. This setting can be found under Edit | Project Settings | Player | Other Settings | Rendering, and can be configured on a per-platform basis. Forward Rendering Forward Rendering is the classical form of rendering lights in our scene. Each object is likely to be rendered in multiple passes through the same Shader. How many passes are required will be based on the number, distance, and brightness of light sources. Unity will try to prioritize which directional light is affecting the object the most and render the object in a "base pass" as a starting point. It will then take up to four of the most powerful point lights nearby and re-render the same object multiple times through the same Fragment Shader. The next four point lights will then be processed on a per-vertex basis. All remaining lights are treated as a giant blob by means of a technique called spherical harmonics. Some of this behavior can be simplified by setting a light's Render Mode to values such as Not Important, and changing the value of Edit | Project Settings | Quality | Pixel Light Count. This value limits how many lights will be treated on a per pixel basis, but is overridden by any lights with a Render Mode set to Important. It is therefore up to us to use this combination of settings responsibly. As you can imagine, the design of Forward Rendering can utterly explode our Draw Call count very quickly in scenes with a lot of point lights present, due to the number of render states being configured and Shader passes being reprocessed. CPU-bound applications should avoid this rendering mode if possible. More information on Forward Rendering can be found in the Unity documentation: http://docs.unity3d.com/Manual/RenderTech-ForwardRendering.html. Deferred Shading Deferred Shading or Deferred Rendering as it is sometimes known, is only available on GPUs running at least Shader Model 3.0. In other words, any desktop graphics card made after around 2004. The technique has been around for a while, but it has not resulted in a complete replacement of the Forward Rendering method due to the caveats involved and limited support on mobile devices. Anti-aliasing, transparency, and animated characters receiving shadows are all features that cannot be managed through Deferred Shading alone and we must use the Forward Rendering technique as a fallback. Deferred Shading is so named because actual shading does not occur until much later in the process; that is, it is deferred until later. From a performance perspective, the results are quite impressive as it can generate very good per pixel lighting with surprisingly little Draw Call effort. The advantage is that a huge amount of lighting can be accomplished using only a single pass through the lighting Shader. 
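Stepping back to Forward Rendering for a moment, the Pixel Light Count and per-light Render Mode settings mentioned above can also be driven from script. The following is a hedged sketch; the field name fillLight is our own and the cap of 2 is an arbitrary example.

using UnityEngine;

// Cap how many lights are rendered per pixel globally, and demote one
// specific light so it is only ever treated per-vertex or via spherical harmonics.
public class ForwardLightTuning : MonoBehaviour
{
    public Light fillLight; // hypothetical reference, assigned in the Inspector

    void Start()
    {
        QualitySettings.pixelLightCount = 2; // global cap on per-pixel lights
        if (fillLight != null)
        {
            fillLight.renderMode = LightRenderMode.NotImportant;
        }
    }
}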
The main disadvantages include the additional costs if we wish to pile on advanced lighting features such as Shadowing and any steps that must pass through Forward Rendering in order to complete, such as transparency. The Unity documentation contains an excellent source of information on the Deferred Shading technique, its advantages, and its pitfalls: http://docs.unity3d.com/Manual/RenderTech-DeferredShading.html Vertex Lit Shading (legacy) Technically, there are more than two lighting methods. Unity allows us to use a couple of legacy lighting systems, only one of which may see actual use in the field: Vertex Lit Shading. This is a massive simplification of lighting, as lighting is only considered per vertex, and not per pixel. In other words, entire faces are colored based on the incoming light color, and not individual pixels. It is not expected that many, or really any, 3D games will make use of this legacy technique, as a lack of shadows and proper lighting make visualizations of depth very difficult. It is mostly relegated to 2D games that don't intend to make use of shadows, normal maps, and various other lighting features, but it is there if we need it. Real-time Shadows Soft Shadows are expensive, Hard Shadows are cheap, and No Shadows are free. Shadow Resolution, Shadow Projection, Shadow Distance, and Shadow Cascades are all settings we can find under Edit | Project Settings | Quality | Shadows that we can use to modify the behavior and complexity of our shadowing passes. That summarizes almost everything we need to know about Unity's real-time shadowing techniques from a high-level performance standpoint. We will cover shadows more in the following section on optimizing our lighting effects. Lighting optimization With a cursory glance at all of the relevant lighting techniques, let's run through some techniques we can use to improve lighting costs. Use the appropriate Shading Mode It is worth testing both of the main rendering modes to see which one best suits our game. Deferred Shading is often used as a backup in the event that Forward Rendering is becoming a burden on performance, but it really depends on where else we're finding bottlenecks as it is sometimes difficult to tell the difference between them. Use Culling Masks A Light Component's Culling Mask property is a layer-based mask that can be used to limit which objects will be affected by the given Light. This is an effective way of reducing lighting overhead, assuming that the layer interactions also make sense with how we are using layers for physics optimization. Objects can only be a part of a single layer, and reducing physics overhead probably trumps lighting overhead in most cases; thus, if there is a conflict, then this may not be the ideal approach. Note that there is limited support for Culling Masks when using Deferred Shading. Because of the way it treats lighting in a very global fashion, only four layers can be disabled from the mask, limiting our ability to optimize its behavior through this method. Use Baked Lightmaps Baking Lighting and Shadowing into a Scene is significantly less processor-intensive than generating them at runtime. The downside is the added application footprint, memory consumption, and potential for memory bandwidth abuse. Ultimately, unless a game's lighting effects are being handled exclusively through Legacy Vertex Lighting or a single Directional Light, then it should probably include Lightmapping to make some huge budget savings on lighting calculations. 
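As an aside on the Culling Mask idea covered above, the mask can also be set from script. Here is a small hedged sketch; the layer name "Gameplay" is our own and would need to exist in the project's layer list.

using UnityEngine;

// Restrict a light so it only affects objects on one layer, reducing how
// many objects each light needs to be considered against.
public class LightCullingExample : MonoBehaviour
{
    void Start()
    {
        Light lightComp = GetComponent<Light>();
        int gameplayLayer = LayerMask.NameToLayer("Gameplay"); // hypothetical layer name
        lightComp.cullingMask = 1 << gameplayLayer; // only illuminate that layer
    }
}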
Relying entirely on real-time lighting and shadows is a recipe for disaster unless the game is trying to win an award for the smallest application file size of all time. Optimize Shadows Shadowing passes mostly consume our Draw Calls and fill rate, but the amount of vertex position data we feed into the process and our selection for the Shadow Projection setting will affect the front end's ability to generate the required shadow casters and shadow receivers. We should already be attempting to reduce vertex counts to solve front end bottlenecking in the first place, and making this change will be an added multiplier towards that effort. Draw Calls are consumed during shadowing by rendering visible objects into a separate buffer (known as the shadow map) as either a shadow caster, a shadow receiver, or both. Each object that is rendered into this map will consume another Draw Call, which makes shadows a huge performance cost multiplier, so it is often a setting that games will expose to users via quality settings, allowing users with weaker hardware to reduce the effect or even disable it entirely. Shadow Distance is a global multiplier for runtime shadow rendering. The fewer shadows we need to draw, the happier the entire rendering process will be. There is little point in rendering shadows at a great distance from the camera, so this setting should be configured specific to our game and how much shadowing we expect to witness during gameplay. It is also a common setting that is exposed to the user to reduce the burden of rendering shadows. Higher values of Shadow Resolution and Shadow Cascades will increase our memory bandwidth and fill rate consumption. Both of these settings can help curb the effects of artefacts in shadow rendering, but at the cost of a much larger shadow map size that must be moved around and of the canvas size to draw to. The Unity documentation contains an excellent summary on the topic of the aliasing effect of shadow maps and how the Shadow Cascades feature helps to solve the problem: http://docs.unity3d.com/Manual/DirLightShadows.html. It's worth noting that Soft Shadows do not consume any more memory or CPU overhead relative to Hard Shadows, as the only difference is a more complex Shader. This means that applications with enough fill rate to spare can enjoy the improved graphical fidelity of Soft Shadows. Optimizing graphics for mobile Unity's ability to deploy to mobile devices has contributed greatly to its popularity among hobbyist, small, and mid-size development teams. As such, it would be prudent to cover some approaches that are more beneficial for mobile platforms than for desktop and other devices. Note that any, and all, of the following approaches may become obsolete soon, if they aren't already. The mobile device market is moving blazingly fast, and the following techniques as they apply to mobile devices merely reflect conventional wisdom from the last half decade. We should occasionally test the assumptions behind these approaches from time-to-time to see whether the limitations of mobile devices still fit the mobile marketplace. Minimize Draw Calls Mobile applications are more often bottlenecked on Draw Calls than on fill rate. Not that fill rate concerns should be ignored (nothing should, ever!), but this makes it almost necessary for any mobile application of reasonable quality to implement Mesh Combining, Batching, and Atlasing techniques from the very beginning. 
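Circling back to the shadow settings discussed above, Shadow Distance is also accessible from script, which makes it straightforward to expose as a user-facing quality option. The following is a hedged sketch; the distance values and the three-level scheme are arbitrary examples.

using UnityEngine;

// Expose shadow cost to the player: scale back the global Shadow Distance,
// effectively disabling shadows at the lowest level.
public class ShadowQualityOption : MonoBehaviour
{
    // 0 = off, 1 = near shadows only, 2 = full distance.
    public void ApplyShadowSetting(int level)
    {
        if (level == 0)
        {
            QualitySettings.shadowDistance = 0f;
        }
        else if (level == 1)
        {
            QualitySettings.shadowDistance = 40f;
        }
        else
        {
            QualitySettings.shadowDistance = 150f;
        }
    }
}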
Deferred Rendering is also the preferred technique as it fits well with other mobile-specific concerns, such as avoiding transparency and having too many animated characters. Minimize the Material count This concern goes hand in hand with the concepts of Batching and Atlasing. The fewer Materials we use, the fewer Draw Calls will be necessary. This strategy will also help with concerns relating to VRAM and memory bandwidth, which tend to be very limited on mobile devices. Minimize texture size and Material count Most mobile devices feature a very small Texture Cache relative to desktop GPUs. For instance, the iPhone 3G can only support a total texture size of 1024x1024 due to running OpenGLES1.1 with simple vertex rendering techniques. Meanwhile the iPhone 3GS, iPhone 4, and iPad generation run OpenGLES 2.0, which only supports textures up to 2048x2048. Later generations can support textures up to 4096x4096. Double check the device hardware we are targeting to be sure it supports the texture file sizes we wish to use (there are too many Android devices to list here).
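One hedged way to double-check this during device testing is to log the largest texture dimension the current device reports:

using UnityEngine;

// Log the maximum texture size the device claims to support, so oversized
// assets can be caught during testing rather than discovered via slow loads.
public class TextureSizeProbe : MonoBehaviour
{
    void Start()
    {
        Debug.Log("Max supported texture size: " + SystemInfo.maxTextureSize);
    }
}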
However, later-generation devices are never the most common devices in the mobile marketplace. If we wish our game to reach a wide audience (increasing its chances of success), then we must be willing to support weaker hardware. Note that textures that are too large for the GPU will be downscaled by the CPU during initialization, wasting valuable loading time, and leaving us with unintended graphical fidelity. This makes texture reuse of paramount importance for mobile devices due to the limited VRAM and Texture Cache sizes available. Make textures square and power-of-2 The GPU will find it difficult, or simply be unable to compress the texture if it is not in a square format, so make sure you stick to the common development convention and keep things square and sized to a power of 2. Use the lowest possible precision formats in Shaders Mobile GPUs are particularly sensitive to precision formats in its Shaders, so the smallest formats should be used. On a related note, format conversion should be avoided for the same reason. Avoid Alpha Testing Mobile GPUs haven't quite reached the same levels of chip optimization as desktop GPUs, and Alpha Testing remains a particularly costly task on mobile devices. In most cases it should simply be avoided in favor of Alpha Blending. Summary If you've made it this far without skipping ahead, then congratulations are in order. That was a lot of information to absorb for just one component of the Unity Engine, but then it is clearly the most complicated of them all, requiring a matching depth of explanation. Hopefully, you've learned a lot of approaches to help you improve your rendering performance and enough about the rendering pipeline to know how to use them responsibly! To learn more about Unity 5, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Unity 5 Game Optimization (https://www.packtpub.com/game-development/unity-5-game-optimization) Unity 5.x By Example (https://www.packtpub.com/game-development/unity-5x-example) Unity 5.x Cookbook (https://www.packtpub.com/game-development/unity-5x-cookbook) Unity 5 for Android Essentials (https://www.packtpub.com/game-development/unity-5-android-essentials) Resources for Article: Further resources on this subject: The Vertex Functions [article] UI elements and their implementation [article] Routing for Yii Demystified [article]

article-image-azure-stream-analytics-7-reasons-to-choose
Sugandha Lahoti
19 Apr 2018
11 min read
Save for later

How to get started with Azure Stream Analytics and 7 reasons to choose it

Sugandha Lahoti
19 Apr 2018
11 min read
In this article, we will introduce Azure Stream Analytics and show how to configure it. We will then look at some of the key advantages of the Stream Analytics platform, including how it enhances developer productivity and reduces the Total Cost of Ownership (TCO) of building and maintaining a scalable streaming solution, among other factors.
What is Azure Stream Analytics and how does it work?
Microsoft Azure Stream Analytics falls into the category of PaaS services where the customers don't need to manage the underlying infrastructure. However, they are still responsible for managing the application that they build on top of the PaaS service and, more importantly, the customer data. Azure Stream Analytics is a fully managed, serverless PaaS service that is built for real-time analytics computations on streaming data. The service can consume from a multitude of sources. Azure will take care of the hosting, scaling, and management of the underlying hardware and software ecosystem. The following are some examples of different use cases for Azure Stream Analytics. When we are designing a solution that involves streaming data, in almost every case, Azure Stream Analytics will be part of a larger solution that the customer is trying to deploy. This can be real-time dashboarding for monitoring purposes, real-time monitoring of IT infrastructure equipment, preventive maintenance (auto-manufacturing, vending machines, and so on), or fraud detection. This means that the streaming solution needs to be thoughtful about providing out-of-the-box integration with a whole plethora of services that could help build a solution in a relatively quick fashion. Let's review a usage pattern for Azure Stream Analytics using a canonical model: We can see devices and applications that generate data on the left in the preceding illustration; they can connect directly or through cloud gateways to your stream ingest sources. Azure Stream Analytics can pick up the data from these ingest sources, augment it with reference data, run the necessary analytics, gather insights, and push them downstream for action. You can trigger business processes, write the data to a database, or directly view the anomalies on a dashboard. In the previous canonical pattern, a number of streaming ingest technologies are used; let's review them in the following section:
Event Hub: A global-scale event ingestion system, where one can publish events from millions of sensors and applications. It guarantees that as soon as an event comes in, a subscriber can pick that event up within a few milliseconds. You can have one or more subscribers as well, depending on your business requirements. Typical use cases for an Event Hub are real-time financial fraud detection and social media sentiment analytics.
IoT Hub: IoT Hub is very similar to Event Hub but takes the concept a lot further forward, in that you can take bidirectional actions. It will not only ingest data from sensors in real time but can also send commands back to them. It also enables you to do things like device management. Fundamental aspects such as security, a primary need for IoT, are built into it.
Azure Blob: Azure Blob storage is massively scalable object storage for unstructured data, and is accessible through HTTP or HTTPS. Blob storage can expose data publicly to the world or store application data privately.
Reference Data: This is auxiliary data that is either static or that changes slowly.
Reference data can be used to enrich incoming data to perform correlation and lookups. On the ingress side, with a few clicks, you can connect to Event Hub, IoT Hub, or Blob storage. The streaming data can be enriched with reference data in the Blob store. Data from the ingress process will be consumed by the Azure Stream Analytics service; we can call machine learning (ML) for event scoring in real time. The data can be egressed to live dashboards in Power BI, or pushed back to Event Hub, from where dashboards and reports can pick it up. The following is a summary of the ingress, egress, and archiving options:
Ingress choices: Event Hub, IoT Hub, Blob storage
Egress choices:
Live Dashboards: Power BI, Event Hub
Driving workflows: Event Hubs, Service Bus
Archiving and post analysis: Blob storage, Document DB, Data Lake, SQL Server, Table storage, Azure Functions
One key point to note is that a number of customers push data from Stream Analytics processing (the egress point) to Event Hub and then add an Azure-hosted website as their own custom dashboard. One can also drive workflows by pushing the events to Azure Service Bus and Power BI. For example, a customer can build an IoT support solution that detects an anomaly in connected appliances and pushes the result into Azure Service Bus. A worker role can run as a daemon to pull the messages and create support tickets using the Dynamics CRM API. The ticket can then be archived and analyzed later in Power BI. This solution eliminates the need for the customer to log a ticket manually; the system does it automatically based on predefined anomaly thresholds. This is just one sample of a real-time connected solution. There are a number of use cases that don't even involve real-time alerts. You can also use it to aggregate data, filter data, and store it in Blob storage, Azure Data Lake (ADL), Document DB, or SQL, and then run U-SQL in Azure Data Lake Analytics (ADLA) or HDInsight, or even call ML models for things like predictive maintenance.
Configuring Azure Stream Analytics
Azure Stream Analytics (ASA) is a fully managed, cost-effective real-time event processing engine. Stream Analytics makes it easy to set up real-time analytic computations on data streaming from devices, sensors, websites, social media, applications, infrastructure systems, and more. The service can be hosted with a few clicks in the Azure portal; users can author a Stream Analytics job specifying the input source of the streaming data, the output sink for the results of the job, and a data transformation expressed in a SQL-like language. The jobs can be monitored, and you can adjust the scale/speed of the job in the Azure portal to scale from a few kilobytes to a gigabyte or more of events processed per second. Let's review how to configure Azure Stream Analytics step by step:
1. Log in to the Azure portal using your Azure credentials, click on New, and search for Stream Analytics job:
2. Click on Create to create an Azure Stream Analytics instance:
3. Provide a Job Name and Resource group name for the Azure Stream Analytics job deployment:
4. After a few minutes, the deployment will be complete:
5. Review the following in the deployment, an audit trail of the creation:
6. Ability to scale up and down using a simple UI:
7. Built-in Query interface to run queries:
8.
Run Queries using a SQL-like interface, with the ability to accept late-arriving events with simple GUI-based configuration: Key advantages of Azure Stream Analytics Let's quickly review how traditional streaming solutions are built; the core deployment starts with procuring and setting up the basic infrastructure necessary to host the streaming solution. Once this is done, we can then build the ingress and egress solution on top of the deployed infrastructure. Once the core infrastructure is built, customer tools will be used to build business intelligence (BI) or machine-learning integration. After the system goes into production, scaling during runtime needs to be taken care of by capturing the telemetry and building and configuration of HW/SW resources as necessary. As business needs ramp up, so does the monitoring and troubleshooting. Security Azure Stream Analytics provides a number of inbuilt security mechanics in areas such as authentication, authorization, auditing, segmentation, and data protection. Let's quickly review them. Authentication support: Authentication support in Azure Stream Analytics is done at portal level. Users should have a valid subscription ID and password to access the Azure Stream Analytics job. Authorization: Authorization is the process during login where users provide their credentials (for example, user account name and password, smart card and PIN, Secure ID and PIN, and so on) to prove their Microsoft identity so that they can retrieve their access token from the authentication server. Authorization is supported by Azure Stream Analytics. Only authenticated/authorized users can access the Azure Stream Analytics job. Support for encryption: Data-at-rest using client-side encryption and TDE. Support for key management: Key management is supported through ingress and egress points. Programmer productivity One of the key features of Azure Stream Analytics is developer productivity, and it is driven a lot by the query language that is based on SQL constructs. It provides a wide array of functions for analytics on streaming data, all the way from simple data manipulation functions, data and time functions, temporal functions, mathematical, string, scaling, and much more. It provides two features natively out of the box. Let's review the features in detail in the next section Declarative SQL constructs Built-in temporal semantics Declarative SQL constructs A simple-to-use UI is provided and queries can be constructed using the provided user interface. 
The following is the feature set of the declarative SQL constructs:
Filters (Where)
Projections (Select)
Time-window and property-based aggregates (Group By)
Time-shifted joins (specifying time bounds within which the joining events must occur)
All combinations thereof
The following is a summary of the different constructs used to manipulate streaming data:
Data manipulation: SELECT, FROM, WHERE, GROUP BY, HAVING, CASE WHEN THEN ELSE, INNER/LEFT OUTER JOIN, UNION, CROSS/OUTER APPLY, CAST, INTO, ORDER BY ASC/DESC
Date and time functions: DateName, DatePart, Day, Month, Year, DateDiff, DateTimeFromParts, DateAdd
Temporal functions: Lag, IsFirst, Last, CollectTop
Aggregate functions: SUM, COUNT, AVG, MIN, MAX, STDEV, STDEVP, VAR, VARP, TopOne
Mathematical functions: ABS, CEILING, EXP, FLOOR, POWER, SIGN, SQUARE, SQRT
String functions: Len, Concat, CharIndex, Substring, Lower, Upper, PatIndex
Scaling extensions: WITH, PARTITION BY, OVER
Geospatial: CreatePoint, CreatePolygon, CreateLineString, ST_DISTANCE, ST_WITHIN, ST_OVERLAPS, ST_INTERSECTS
Built-in temporal semantics
Azure Stream Analytics provides prebuilt temporal semantics to query time-based information and merge streams with multiple timelines. Here is a list of the temporal semantics:
Application or ingest timestamp
Windowing functions
Policies for event ordering
Policies to manage latencies between ingress sources
Manage streams with multiple timelines
Join multiple streams of temporal windows
Join streaming data with data-at-rest
Lowest total cost of ownership
Azure Stream Analytics is a fully managed PaaS service on Azure. There are no upfront costs or costs involved in setting up compute clusters and complex hardware wiring like you would have with an on-premises solution. It's a simple job service where there is no cluster provisioning, and customers pay for what they use. A key consideration is variable workloads. With Azure Stream Analytics, you do not need to design your system for peak throughput and can add more compute footprint as you go. If you have scenarios where data comes in spurts, you do not want to design a system for peak usage and leave it underutilized the rest of the time. Let's say you are building a traffic monitoring solution; naturally, there is the expectation that peaks will show up during morning and evening rush hours. However, you would not want to design your system or investments to cater to these extremes. The cloud elasticity that Azure offers is a perfect fit here. Azure Stream Analytics also offers fast recovery through checkpointing and at-least-once event delivery.
Mission-critical and enterprise-grade scalability and availability
Azure Stream Analytics is available across multiple worldwide data centers and sovereign clouds. Azure Stream Analytics promises three-nines (99.9%) availability that is financially guaranteed, with built-in auto recovery so that you will never lose data. The good thing is that customers do not need to write a single line of code to achieve this. The bottom line is that enterprise readiness is built into the platform. Here is a summary of the enterprise-ready features:
Distributed scale-out architecture
Ingests millions of events per second
Accommodates variable loads
Easily adds incremental resources to scale
Available across multiple data centers and sovereign clouds
Global compliance
In addition, Azure Stream Analytics is compliant with many industry and government certifications. It is HIPAA-compliant out of the box and suitable for hosting healthcare applications.
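To give a feel for how these constructs combine in practice, here is a small, hedged illustration of a job query that aggregates sensor events over a one-minute tumbling window. The input, output, and field names (SensorInput, AlertOutput, DeviceId, Temperature, EventTime) are assumptions made for this example only.

SELECT
    DeviceId,
    COUNT(*) AS EventCount,
    AVG(Temperature) AS AvgTemperature
INTO
    AlertOutput
FROM
    SensorInput TIMESTAMP BY EventTime
GROUP BY
    DeviceId,
    TumblingWindow(second, 60)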
This compliance coverage is how customers can scale up their businesses confidently. Here is a summary of global compliance:
ISO 27001
ISO 27018
SOC 1 Type 2
SOC 2 Type 2
SOC 3 Type 2
HIPAA/HITECH
PCI DSS Level 1
European Union Model Clauses
China GB 18030
Thus, we reviewed Azure Stream Analytics and understood its key advantages. These advantages include:
Ease of development and developer productivity
A reduced total cost of ownership
Global compliance certifications
The value of a PaaS-based streaming solution for hosting mission-critical applications securely
This post is taken from the book Stream Analytics with Microsoft Azure, written by Anindita Basak, Krishna Venkataraman, Ryan Murphy, and Manpreet Singh. This book will help you to understand Azure Stream Analytics so that you can develop efficient analytics solutions that can work with any type of data.
Say hello to Streaming Analytics
How to build a live interactive visual dashboard in Power BI with Azure Stream
Performing Vehicle Telemetry job analysis with Azure Stream Analytics tools

article-image-an-introduction-to-typescript-types-for-asp-net-core-tutorial
Prasad Ramesh
23 Feb 2019
9 min read
Save for later

An introduction to TypeScript types for ASP.NET core [Tutorial]

Prasad Ramesh
23 Feb 2019
9 min read
JavaScript, being a flexible scripting language along with its dynamic type system, can become harder to maintain, the more a project scales up and as the team's staff changes. There are many tools and languages that can assist with this situation, one of which is TypeScript. This article is an excerpt from a book written by Tamir Dresher, Amir Zuker, and Shay Friedman titled Hands-On Full-Stack Web Development with ASP.NET Core. In this book, you will learn how to build web applications using Angular, React, and Vue. Back in the day when JavaScript ES5 was the active version, writing JavaScript code that adhered to encapsulation and modularity was not a trivial task. Things such as prototypes, namespaces, and self-executing functions could certainly improve your code; unfortunately, many JavaScript developers did not invest the necessary effort to improve the manageability of their code using those constructs. Consequently, new technologies surfaced to help with this matter; a popular example relevant at that time was CoffeeScript. Then, JavaScript ES6 was released. ES6, with its great added features, such as modules, classes, and more specific variable scoping, basically overturned CoffeeScript and similar alternatives at that time. Evidently, JavaScript is still missing a key feature, and that is type information. JavaScript's dynamic type system is a strong feature at times; unfortunately, the manageability and reusability of existing code suffers due to this fact. For that key purpose, every large-scale project should examine the use of tools that compensate the situation; those leading the field now are TypeScript and Flow. TypeScript, an open source scripting language, is developed and maintained by Microsoft. Its killer feature is bringing static typing to JavaScript-based systems and allowing you to annotate the code with type information. Additionally, it supports augmenting existing JavaScript code with external type information, similar to the header files pattern, as it's commonly known in C++. TypeScript is a superset of JavaScript, that meaning it has seamless integration with JavaScript, and makes JavaScript developers feel right at home. Being a JavaScript developer makes you a TypeScript developer as well since JavaScript code is actually valid TypeScript. Importantly, this makes the gradual upgrade of existing JavaScript code fairly easy. Another example of how great TypeScript is is the fact that Angular, a framework built by Google, adopts it extensively, and is actually implemented in TypeScript internally. The TypeScript compiler TypeScript is a superset of JavaScript, yet browsers and other platforms don't recognize it, nor are they able to execute it. For that purpose, TypeScript is processed by the TypeScript compiler, which transpiles your TypeScript code to JavaScript. Then, you can run the transpiled code basically everywhere JavaScript is supported. The tsc is available as an npm package that you can install globally. Open your favorite terminal and install it using the following command: npm install -g typescript Afterward, you can use the command tsc to execute the tsc: TypeScript files have a .ts file extension; this is where you write your TypeScript code. 
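For instance, a minimal file might look like the following; the file name greeter.ts and its contents are our own illustrative example.

// greeter.ts - a tiny TypeScript file used only for illustration
function greet(name: string): string {
    return "Hello, " + name;
}
console.log(greet("TypeScript"));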
When needed, you use the TypeScript compiler to transpile your TypeScript code, which creates the JavaScript implementation of the code; you do that by executing the following command: tsc <typescript_filename>
Types in TypeScript
One of the key features of TypeScript is bringing static typing to JavaScript. In TypeScript, you annotate your code with type information; every declaration can be defined with its associated typing. In the declaration, you do that by adding a colon followed by the type. For example, the following statement defines a variable of type number: let productsCount: number; Annotating your code with type information enables validation. The TypeScript compiler should throw errors if you try to do anything that is not supported by the specified type, thus productivity and discoverability benefit substantially through type safety. Additionally, with proper tooling support, you get IntelliSense support and automatic code completion. TypeScript supports a handful of types to reflect most constructs in JavaScript. We will cover these next, starting with basic types.
Basic types
TypeScript supports a handful of built-in types; the following are examples of the most common ones: const fullName: string = "John Doe"; const age: number = 6; const isDone: boolean = false; const d: Date = new Date(); const canBeAnything: any = 6; As you can see, TypeScript supports basic primitive types such as string, number, boolean, and date. Another thing to notice is the type any. The any type is a valid TypeScript type that indicates that the declaration can be of any given type and all operations on it should be allowed and considered safe.
Arrays
TypeScript arrays can be written in one of two ways. You can specify the item type followed by square brackets, or you can use the generic Array type, as follows: const list: number[] = [1, 2, 3]; const list: Array<number> = [1, 2, 3];
Enums
Enums allow you to specify named values that correspond to numeric or string values. If you don't manually set the corresponding value, the named values in the enum are numbered, starting at 0 by default. Enums are extremely useful, and are commonly used to represent a set of predefined options or lookup values. In TypeScript, you define enums by using the enum keyword as follows: enum Color {Red, Green, Blue} const c: Color = Color.Red;
Objects
In JavaScript, objects contain key/value pairs that form the shape of the object. TypeScript supports object types as well, very similar to how you write plain JavaScript objects. Consider the following plain JavaScript object: const obj = { x: 5, y: 6 } In the preceding code, obj is a plain object defined with two keys, x and y, both with number values. To define an object type in TypeScript, you use a similar format: curly braces to represent an object, with the keys and their type information inside: { x:number, y:number } Together, this is how you associate a type to an object: const obj: { x:number, y:number } = { x: 5, y: 6 } In the preceding code, the obj object is initialized the same as before, only now along with its type information.
Functions
Functions have type information as well. Functions are composed of parameters and a return value: function buildName(firstName: string, lastName: string): string { return firstName + " " + lastName; } Each parameter is annotated with its type, and then the return value's type is added at the end of the function signature.
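The same annotations work for arrow functions too; a quick, hedged aside (the names multiply and area are ours):

// Arrow functions carry the same parameter and return type annotations.
const multiply = (x: number, y: number): number => x * y;
const area: number = multiply(3, 4); // 12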
Unlike JavaScript, TypeScript enforces that calls to functions adhere to the defined signature. For example, if you try to call buildName with anything other than two string parameters, the TypeScript compiler will not allow it. You can define more flexible function signatures if you like. TypeScript supports defining optional parameters by using the question mark symbol, ?: function buildName(firstName: string, lastName: string, title?: string): string { return title + " " + firstName + " " + lastName; } const name = buildName('John', 'Doe'); // valid call The preceding function can now be called with either 2 or 3 string parameters. Everything in JavaScript is still supported, of course. You can still use ES6 default parameter values and rest parameters as well.
Type inference
Until now, the illustrated code included the type information in the same statement where the actual value assignment took place. TypeScript supports type inference, meaning that it tries to resolve the relevant type information if you don't specify any. Let's look at some examples: const count = 2; In the preceding simple example, TypeScript is smart enough to realize that the type of count is a number without you having to explicitly specify that: function sum(x1: number, x2: number) { return x1 + x2; } const mySum = sum(1, 2); The preceding example demonstrates the power of type inference in TypeScript. If you pay close attention, the code doesn't specify the return value type of the sum function. TypeScript evaluates the code and determines that the return type is number in this case. As a result, the type of the variable mySum is also a number. Type inference is an extremely useful feature, since it saves the considerable time that would otherwise be spent adding endless type information.
Type casting
Having the code statically typed leads to type safety. TypeScript does not let you perform actions that are not considered safe and supported. In the following example, the loadProducts function is defined with a return value of type any, thus the type of products is inferred as any: function loadProducts(): any { return [{name: 'Book1'}, {name: 'Book2'}]; } const products = loadProducts(); // products is of type 'any' In some cases, you may actually know the expected type. You can instruct TypeScript of the type by using a cast. Typecasting in TypeScript is supported by using the as operator or angle brackets, as follows: const products = loadProducts() as {name: string}[]; const products = <{name: string}[]>loadProducts(); Using the preceding cast, you have full support and recognition by TypeScript that products is now an array of objects with a name key.
Type aliasing
TypeScript supports defining new types via type aliases. You can use type aliases to reuse type information in multiple declarations, for example: type person = {name: string}; let employee: person; let contact: person; Given the preceding example, the code defines a type alias named person as an object with a name key of type string. Afterwards, the rest of the code can reference this type where needed. Type aliases are somewhat similar to interfaces (covered next), but can name primitives, unions, and tuples. Unlike interfaces, type aliases cannot be extended or implemented from, and so interfaces are generally preferred. Unions and intersections allow you to construct a type from multiple existing types, while tuples express arrays with a fixed number of known elements.
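As a brief, hedged illustration of those last two concepts (the names Id, userId, and point are ours):

// A union type: a value that may be either a number or a string.
type Id = number | string;
let userId: Id = 42;
userId = "abc-42"; // also valid

// A tuple: an array with a fixed number of known element types.
let point: [number, number] = [3, 5];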
You can read more about these at https://www.typescriptlang.org/docs/handbook/basic-types.html and https://www.typescriptlang.org/docs/handbook/advanced-types.html. TypeScript is one of the most popular scripting languages in respect to JavaScript-based code bases. At its very core, it brings static typing and assists tremendously with maintainability and productivity. In this article, we covered TypeScript types. TypeScript advances at a steady pace and shows great promise for using it in large projects. It has more features that weren't addressed in this tutorial, such as generics, mapped types, abstract classes, project references, and much more that will help you for ASP.NET core development. To know more about interfaces and compilers of TypeScript, check out the book Hands-On Full-Stack Web Development with ASP.NET Core. Typescript 3.3 is finally released! Future of ESLint support in TypeScript Introducing ReX.js v1.0.0 a companion library for RegEx written in TypeScript
article-image-numpy-array-object
Packt
03 Mar 2017
18 min read
Save for later

The NumPy array object

Packt
03 Mar 2017
18 min read
In this article by Armando Fandango, author of the book Python Data Analysis - Second Edition, we discuss how NumPy provides a multidimensional array object called ndarray. NumPy arrays are typed arrays of fixed size. Python lists are heterogeneous, and thus the elements of a list may contain any object type, while NumPy arrays are homogeneous and can contain objects of only one type. An ndarray consists of two parts, which are as follows:
The actual data that is stored in a contiguous block of memory
The metadata describing the actual data
Since the actual data is stored in a contiguous block of memory, loading a large data set as an ndarray depends on the availability of a large enough contiguous block of memory. Most of the array methods and functions in NumPy leave the actual data unaffected and only modify the metadata. Actually, we made a one-dimensional array that held a set of numbers. The ndarray can have more than a single dimension. (For more resources related to this topic, see here.)
Advantages of NumPy arrays
The NumPy array is, in general, homogeneous (there is a particular record array type that is heterogeneous); the items in the array have to be of the same type. The advantage is that if we know that the items in an array are of the same type, it is easy to ascertain the storage size needed for the array. NumPy arrays can execute vectorized operations, processing a complete array, in contrast to Python lists, where you usually have to loop through the list and execute the operation on each element. NumPy arrays are indexed from 0, just like lists in Python. NumPy utilizes an optimized C API to make the array operations particularly quick. We will make an array with the arange() subroutine again. You will see snippets from Jupyter Notebook sessions where NumPy is already imported with the instruction import numpy as np. Here's how to get the data type of an array: In: a = np.arange(5) In: a.dtype Out: dtype('int64') The data type of the array a is int64 (at least on my computer), but you may get int32 as the output if you are using 32-bit Python. In both cases, we are dealing with integers (64 bit or 32 bit). Besides the data type of an array, it is crucial to know its shape. A vector is commonly used in mathematics, but most of the time we need higher-dimensional objects. Let's find out the shape of the vector we produced a few minutes ago: In: a Out: array([0, 1, 2, 3, 4]) In: a.shape Out: (5,) As you can see, the vector has five components with values ranging from 0 to 4. The shape property of the array is a tuple; in this instance, a tuple of 1 element, which holds the length in each dimension.
Creating a multidimensional array
Now that we know how to create a vector, we are set to create a multidimensional NumPy array. After we produce the matrix, we will again need to show its shape, as demonstrated in the following code snippets: Create a multidimensional array as follows: In: m = np.array([np.arange(2), np.arange(2)]) In: m Out: array([[0, 1], [0, 1]]) We can show the array shape as follows: In: m.shape Out: (2, 2) We made a 2 x 2 array with the arange() subroutine. The array() function creates an array from an object that you pass to it. The object has to be array-like, for example, a Python list. In the previous example, we passed a list of arrays. The object is the only required parameter of the array() function. NumPy functions tend to have a heap of optional arguments with predefined default options.
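As a short, hedged aside, NumPy ships several other array-creation functions whose optional arguments (such as dtype) fall back to sensible defaults; the calls below are illustrative:
In: np.zeros((2, 3))   # floats by default
Out: array([[ 0., 0., 0.], [ 0., 0., 0.]])
In: np.ones((2, 2), dtype=np.int32)   # explicitly request 32-bit integers
Out: array([[1, 1], [1, 1]], dtype=int32)
In: np.full((2, 2), 7)   # fill every element with the given value
Out: array([[7, 7], [7, 7]])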
Selecting NumPy array elements From time to time, we will wish to select a specific constituent of an array. We will take a look at how to do this, but to kick off, let's make a 2 x 2 matrix again: In: a = np.array([[1,2],[3,4]]) In: a Out: array([[1, 2], [3, 4]]) The matrix was made this time by giving the array() function a list of lists. We will now choose each item of the matrix one at a time, as shown in the following code snippet. Recall that the index numbers begin from 0: In: a[0,0] Out: 1 In: a[0,1] Out: 2 In: a[1,0] Out: 3 In: a[1,1] Out: 4 As you can see, choosing elements of an array is fairly simple. For the array a, we just employ the notation a[m,n], where m and n are the indices of the item in the array. Have a look at the following figure for your reference: NumPy numerical types Python has an integer type, a float type, and complex type; nonetheless, this is not sufficient for scientific calculations. In practice, we still demand more data types with varying precisions and, consequently, different storage sizes of the type. For this reason, NumPy has many more data types. The bulk of the NumPy mathematical types ends with a number. This number designates the count of bits related to the type. The following table (adapted from the NumPy user guide) presents an overview of NumPy numerical types: Type Description bool Boolean (True or False) stored as a bit inti Platform integer (normally either int32 or int64) int8 Byte (-128 to 127) int16 Integer (-32768 to 32767) int32 Integer (-2 ** 31 to 2 ** 31 -1) int64 Integer (-2 ** 63 to 2 ** 63 -1) uint8 Unsigned integer (0 to 255) uint16 Unsigned integer (0 to 65535) uint32 Unsigned integer (0 to 2 ** 32 - 1) uint64 Unsigned integer (0 to 2 ** 64 - 1) float16 Half precision float: sign bit, 5 bits exponent, and 10 bits mantissa float32 Single precision float: sign bit, 8 bits exponent, and 23 bits mantissa float64 or float Double precision float: sign bit, 11 bits exponent, and 52 bits mantissa complex64 Complex number, represented by two 32-bit floats (real and imaginary components) complex128 or complex Complex number, represented by two 64-bit floats (real and imaginary components) For each data type, there exists a matching conversion function: In: np.float64(42) Out: 42.0 In: np.int8(42.0) Out: 42 In: np.bool(42) Out: True In: np.bool(0) Out: False In: np.bool(42.0) Out: True In: np.float(True) Out: 1.0 In: np.float(False) Out: 0.0 Many functions have a data type argument, which is frequently optional: In: np.arange(7, dtype= np.uint16) Out: array([0, 1, 2, 3, 4, 5, 6], dtype=uint16) It is important to be aware that you are not allowed to change a complex number into an integer. Attempting to do that sparks off a TypeError: In: np.int(42.0 + 1.j) Traceback (most recent call last): <ipython-input-24-5c1cd108488d> in <module>() ----> 1 np.int(42.0 + 1.j) TypeError: can't convert complex to int The same goes for conversion of a complex number into a floating-point number. By the way, the j component is the imaginary coefficient of a complex number. Even so, you can convert a floating-point number to a complex number, for example, complex(1.0). The real and imaginary pieces of a complex number can be pulled out with the real() and imag() functions, respectively. Data type objects Data type objects are instances of the numpy.dtype class. Once again, arrays have a data type. To be exact, each element in a NumPy array has the same data type. The data type object can tell you the size of the data in bytes. 
The size in bytes is given by the itemsize property of the dtype class : In: a.dtype.itemsize Out: 8 Character codes Character codes are included for backward compatibility with Numeric. Numeric is the predecessor of NumPy. Its use is not recommended, but the code is supplied here because it pops up in various locations. You should use the dtype object instead. The following table lists several different data types and character codes related to them: Type Character code integer i Unsigned integer u Single precision float f Double precision float d bool b complex D string S unicode U Void V Take a look at the following code to produce an array of single precision floats: In: arange(7, dtype='f') Out: array([ 0., 1., 2., 3., 4., 5., 6.], dtype=float32) Likewise, the following code creates an array of complex numbers: In: arange(7, dtype='D') In: arange(7, dtype='D') Out: array([ 0.+0.j, 1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 5.+0.j, 6.+0.j]) The dtype constructors We have a variety of means to create data types. Take the case of floating-point data (have a look at dtypeconstructors.py in this book's code bundle): We can use the general Python float, as shown in the following lines of code: In: np.dtype(float) Out: dtype('float64') We can specify a single precision float with a character code: In: np.dtype('f') Out: dtype('float32') We can use a double precision float with a character code: In: np.dtype('d') Out: dtype('float64') We can pass the dtype constructor a two-character code. The first character stands for the type; the second character is a number specifying the number of bytes in the type (the numbers 2, 4, and 8 correspond to floats of 16, 32, and 64 bits, respectively): In: np.dtype('f8') Out: dtype('float64') A (truncated) list of all the full data type codes can be found by applying sctypeDict.keys(): In: np.sctypeDict.keys() In: np.sctypeDict.keys() Out: dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'bool8', 'Bool', 'b1', 'float16', 'Float16', 'f2', 'float32', 'Float32', 'f4', 'float64', ' Float64', 'f8', 'float128', 'Float128', 'f16', 'complex64', 'Complex32', 'c8', 'complex128', 'Complex64', 'c16', 'complex256', 'Complex128', 'c32', 'object0', 'Object0', 'bytes0', 'Bytes0', 'str0', 'Str0', 'void0', 'Void0', 'datetime64', 'Datetime64', 'M8', 'timedelta64', 'Timedelta64', 'm8', 'int64', 'uint64', 'Int64', 'UInt64', 'i8', 'u8', 'int32', 'uint32', 'Int32', 'UInt32', 'i4', 'u4', 'int16', 'uint16', 'Int16', 'UInt16', 'i2', 'u2', 'int8', 'uint8', 'Int8', 'UInt8', 'i1', 'u1', 'complex_', 'int0', 'uint0', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'unicode_', 'object_', 'bytes_', 'str_', 'string_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a']) The dtype attributes The dtype class has a number of useful properties. 
For instance, we can get information about the character code of a data type through the properties of dtype: In: t = np.dtype('Float64') In: t.char Out: 'd' The type attribute corresponds to the type of object of the array elements: In: t.type Out: numpy.float64 The str attribute of dtype gives a string representation of a data type. It begins with a character representing endianness, if appropriate, then a character code, succeeded by a number corresponding to the number of bytes that each array item needs. Endianness, here, entails the way bytes are ordered inside a 32- or 64-bit word. In the big-endian order, the most significant byte is stored first, indicated by >. In the little-endian order, the least significant byte is stored first, indicated by <, as exemplified in the following lines of code: In: t.str Out: '<f8' One-dimensional slicing and indexing Slicing of one-dimensional NumPy arrays works just like the slicing of standard Python lists. Let's define an array containing the numbers 0, 1, 2, and so on up to and including 8. We can select a part of the array from indexes 3 to 7, which extracts the elements of the arrays 3 through 6: In: a = np.arange(9) In: a[3:7] Out: array([3, 4, 5, 6]) We can choose elements from indexes the 0 to 7 with an increment of 2: In: a[:7:2] Out: array([0, 2, 4, 6]) Just as in Python, we can use negative indices and reverse the array: In: a[::-1] Out: array([8, 7, 6, 5, 4, 3, 2, 1, 0]) Manipulating array shapes We have already learned about the reshape() function. Another repeating chore is the flattening of arrays. Flattening in this setting entails transforming a multidimensional array into a one-dimensional array. Let us create an array b that we shall use for practicing the further examples: In: b = np.arange(24).reshape(2,3,4) In: print(b) Out: [[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]) We can manipulate array shapes using the following functions: Ravel: We can accomplish this with the ravel() function as follows: In: b Out: array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]) In: b.ravel() Out: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]) Flatten: The appropriately named function, flatten(), does the same as ravel(). However, flatten() always allocates new memory, whereas ravel gives back a view of the array. This means that we can directly manipulate the array as follows: In: b.flatten() Out: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]) Setting the shape with a tuple: Besides the reshape() function, we can also define the shape straightaway with a tuple, which is exhibited as follows: In: b.shape = (6,4) In: b Out: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]) As you can understand, the preceding code alters the array immediately. Now, we have a 6 x 4 array. Transpose: In linear algebra, it is common to transpose matrices. Transposing is a way to transform data. For a two-dimensional table, transposing means that rows become columns and columns become rows. 
We can do this too by using the following code: In: b.transpose() Out: array([[ 0, 4, 8, 12, 16, 20], [ 1, 5, 9, 13, 17, 21], [ 2, 6, 10, 14, 18, 22], [ 3, 7, 11, 15, 19, 23]]) Resize: The resize() method works just like the reshape() method, In: b.resize((2,12)) In: b Out: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]]) Stacking arrays Arrays can be stacked horizontally, depth wise, or vertically. We can use, for this goal, the vstack(), dstack(), hstack(), column_stack(), row_stack(), and concatenate() functions. To start with, let's set up some arrays: In: a = np.arange(9).reshape(3,3) In: a Out: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In: b = 2 * a In: b Out: array([[ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) As mentioned previously, we can stack arrays using the following techniques: Horizontal stacking: Beginning with horizontal stacking, we will shape a tuple of ndarrays and hand it to the hstack() function to stack the arrays. This is shown as follows: In: np.hstack((a, b)) Out: array([[ 0, 1, 2, 0, 2, 4], [ 3, 4, 5, 6, 8, 10], [ 6, 7, 8, 12, 14, 16]]) We can attain the same thing with the concatenate() function, which is shown as follows: In: np.concatenate((a, b), axis=1) Out: array([[ 0, 1, 2, 0, 2, 4], [ 3, 4, 5, 6, 8, 10], [ 6, 7, 8, 12, 14, 16]]) The following diagram depicts horizontal stacking: Vertical stacking: With vertical stacking, a tuple is formed again. This time it is given to the vstack() function to stack the arrays. This can be seen as follows: In: np.vstack((a, b)) Out: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) The concatenate() function gives the same outcome with the axis parameter fixed to 0. This is the default value for the axis parameter, as portrayed in the following code: In: np.concatenate((a, b), axis=0) Out: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) Refer to the following figure for vertical stacking: Depth stacking: To boot, there is the depth-wise stacking employing dstack() and a tuple, of course. This entails stacking a list of arrays along the third axis (depth). For example, we could stack 2D arrays of image data on top of each other as follows: In: np.dstack((a, b)) Out: array([[[ 0, 0], [ 1, 2], [ 2, 4]], [[ 3, 6], [ 4, 8], [ 5, 10]], [[ 6, 12], [ 7, 14], [ 8, 16]]]) Column stacking: The column_stack() function stacks 1D arrays column-wise. This is shown as follows: In: oned = np.arange(2) In: oned Out: array([0, 1]) In: twice_oned = 2 * oned In: twice_oned Out: array([0, 2]) In: np.column_stack((oned, twice_oned)) Out: array([[0, 0], [1, 2]]) 2D arrays are stacked the way the hstack() function stacks them, as demonstrated in the following lines of code: In: np.column_stack((a, b)) Out: array([[ 0, 1, 2, 0, 2, 4], [ 3, 4, 5, 6, 8, 10], [ 6, 7, 8, 12, 14, 16]]) In: np.column_stack((a, b)) == np.hstack((a, b)) Out: array([[ True, True, True, True, True, True], [ True, True, True, True, True, True], [ True, True, True, True, True, True]], dtype=bool) Yes, you guessed it right! We compared two arrays with the == operator. Row stacking: NumPy, naturally, also has a function that does row-wise stacking. 
It is named row_stack() and for 1D arrays, it just stacks the arrays in rows into a 2D array: In: np.row_stack((oned, twice_oned)) Out: array([[0, 1], [0, 2]]) The row_stack() function results for 2D arrays are equal to the vstack() function results: In: np.row_stack((a, b)) Out: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) In: np.row_stack((a,b)) == np.vstack((a, b)) Out: array([[ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]], dtype=bool) Splitting NumPy arrays Arrays can be split vertically, horizontally, or depth wise. The functions involved are hsplit(), vsplit(), dsplit(), and split(). We can split arrays either into arrays of the same shape or indicate the location after which the split should happen. Let's look at each of the functions in detail: Horizontal splitting: The following code splits a 3 x 3 array on its horizontal axis into three parts of the same size and shape (see splitting.py in this book's code bundle): In: a Out: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In: np.hsplit(a, 3) Out: [array([[0], [3], [6]]), array([[1], [4], [7]]), array([[2], [5], [8]])] Liken it with a call of the split() function, with an additional argument, axis=1: In: np.split(a, 3, axis=1) Out: [array([[0], [3], [6]]), array([[1], [4], [7]]), array([[2], [5], [8]])] Vertical splitting: vsplit() splits along the vertical axis: In: np.vsplit(a, 3) Out: [array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])] The split() function, with axis=0, also splits along the vertical axis: In: np.split(a, 3, axis=0) Out: [array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])] Depth-wise splitting: The dsplit() function, unsurprisingly, splits depth-wise. We will require an array of rank 3 to begin with: In: c = np.arange(27).reshape(3, 3, 3) In: c Out: array([[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]], [[ 9, 10, 11], [12, 13, 14], [15, 16, 17]], [[18, 19, 20], [21, 22, 23], [24, 25, 26]]]) In: np.dsplit(c, 3) Out: [array([[[ 0], [ 3], [ 6]], [[ 9], [12], [15]], [[18], [21], [24]]]), array([[[ 1], [ 4], [ 7]], [[10], [13], [16]], [[19], [22], [25]]]), array([[[ 2], [ 5], [ 8]], [[11], [14], [17]], [[20], [23], [26]]])] NumPy array attributes Let's learn more about the NumPy array attributes with the help of an example. Let us create an array b that we shall use for practicing the further examples: In: b = np.arange(24).reshape(2, 12) In: b Out: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]]) Besides the shape and dtype attributes, ndarray has a number of other properties, as shown in the following list: ndim gives the number of dimensions, as shown in the following code snippet: In: b.ndim Out: 2 size holds the count of elements. This is shown as follows: In: b.size Out: 24 itemsize returns the count of bytes for each element in the array, as shown in the following code snippet: In: b.itemsize Out: 8 If you require the full count of bytes the array needs, you can have a look at nbytes. 
This is simply the product of the itemsize and size properties: In: b.nbytes Out: 192 In: b.size * b.itemsize Out: 192 The T property gives the same result as the transpose() function, which is shown as follows: In: b.resize(6,4) In: b Out: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]) In: b.T Out: array([[ 0, 4, 8, 12, 16, 20], [ 1, 5, 9, 13, 17, 21], [ 2, 6, 10, 14, 18, 22], [ 3, 7, 11, 15, 19, 23]]) If the array has a rank lower than 2, transposing it just gives us back a view of the same array (here, b is a one-dimensional array such as np.arange(5)): In: b.ndim Out: 1 In: b.T Out: array([0, 1, 2, 3, 4]) Complex numbers in NumPy are represented by j. For instance, we can produce an array with complex numbers as follows: In: b = np.array([1.j + 1, 2.j + 3]) In: b Out: array([ 1.+1.j, 3.+2.j]) The real property gives us the real part of the array, or the array itself if it only holds real numbers: In: b.real Out: array([ 1., 3.]) The imag property holds the imaginary part of the array: In: b.imag Out: array([ 1., 2.]) If the array holds complex numbers, then the data type will automatically be complex as well: In: b.dtype Out: dtype('complex128') In: b.dtype.str Out: '<c16' The flat property gives back a numpy.flatiter object. This is the only means to get a flatiter object; we do not have access to a flatiter constructor. The flat iterator enables us to loop through an array as if it were a flat array, as shown in the following code snippet: In: b = np.arange(4).reshape(2,2) In: b Out: array([[0, 1], [2, 3]]) In: f = b.flat In: f Out: <numpy.flatiter object at 0x103013e00> In: for item in f: print(item) Out: 0 1 2 3 It is possible to obtain an element directly with the flatiter object: In: b.flat[2] Out: 2 Also, you can obtain multiple elements as follows: In: b.flat[[1,3]] Out: array([1, 3]) The flat property can be set. Setting the value of the flat property leads to overwriting the values of the entire array: In: b.flat = 7 In: b Out: array([[7, 7], [7, 7]]) We can also overwrite only selected elements as follows: In: b.flat[[1,3]] = 1 In: b Out: array([[7, 1], [7, 1]]) Converting arrays We can convert a NumPy array to a Python list with the tolist() function. The following is a brief demonstration (here, b is again the complex array from earlier): Convert to a list: In: b Out: array([ 1.+1.j, 3.+2.j]) In: b.tolist() Out: [(1+1j), (3+2j)] The astype() function transforms the array to an array of the specified data type: In: b Out: array([ 1.+1.j, 3.+2.j]) In: b.astype(int) /usr/local/lib/python3.5/site-packages/ipykernel/__main__.py:1: ComplexWarning: Casting complex values to real discards the imaginary part … Out: array([1, 3]) In: b.astype('complex') Out: array([ 1.+1.j, 3.+2.j]) We drop the imaginary part when casting from the complex type to int. The astype() function also takes the name of a data type as a string. The preceding code won't display a warning this time because we used the right data type. Summary In this article, we covered a great deal of the NumPy basics: data types and arrays. Arrays have various properties that describe them. You learned that one of these properties is the data type, which, in NumPy, is represented by a full-fledged object. NumPy arrays can be sliced and indexed efficiently, compared with standard Python lists. NumPy arrays have the extra ability to work with multiple dimensions. The shape of an array can be modified in multiple ways, such as stacking, resizing, reshaping, and splitting. 
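To consolidate the shape-manipulation operations recapped in this summary, here is a minimal, illustrative sketch; the array and variable names are arbitrary, and NumPy is assumed to be imported as np:

import numpy as np

a = np.arange(24).reshape(6, 4)        # build a 6 x 4 array
flat_copy = a.flatten()                # always a new copy
flat_view = a.ravel()                  # may be a view of the same data
stacked = np.hstack((a, 2 * a))        # shape (6, 8)
left, right = np.hsplit(stacked, 2)    # back to two (6, 4) arrays
print(a.ndim, a.size, a.itemsize, a.nbytes, a.T.shape)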
Resources for Article: Further resources on this subject: Big Data Analytics [article] Python Data Science Up and Running [article] R and its Diverse Possibilities [article]

Installing Arch Linux using the official ISO

Packt
19 Feb 2013
7 min read
(For more resources related to this topic, see here.) Getting ready You can get the official ISO image file from https://www.archlinux.org/download/. On this page you will find a download link to the latest release. Depending on your preference, download the torrent file or the ISO image file immediately. The following list describes the main tasks that we will perform in this recipe: Preparing, booting, and setting keyboard layout: We are going to get the ISO file from the download page of the Arch Linux website and store it on the preferred media of our choice. At the time of writing this article, there is a dual ISO image file that contains both i686 and x86-64 architectures on one disk. Start your PC with your preferred installation media (CD or USB stick). On most PC systems, you can access the boot menu by pressing one of the function keys, usually between F8 and F12 depending on the motherboard manufacturer. On older machines where you do not yet have a boot menu, you might need to change the boot order in the BIOS where the CD-ROM (or DVD/Blu-ray) has to be chosen as the first device to try booting from. We'll also explain how to use a different keyboard layout than the default one in this recipe. Creating, formatting, and mounting partitions: You can partition the disks the way you want using cfdisk (for MBR disk partitioning) or cgdisk (for GUID disk partitioning). After creating the partitions, we can choose to format our created partitions with specific filesystems. When all partitions are formatted, we need to mount the partitions. First we will mount the root partition to /mnt. The other partitions will be mounted later on after you have created the specific folders. We'll designate our device with /dev/sdX; in your case this can be /dev/sda, and so on. Connecting to the Internet: To be able to continue installing the ISO you need to connect to the Internet, because there are no packages available for installation on the ISO. For a wireless network you will need to use netcfg. When connected to a wired network, just use dhcpcd or dhclient. Installing the base system and boot loader: These days the base system gets installed by running a simple script pacstrap. Pacstrap takes multiple parameters, the target location, and the packages or groups you want to install. For people who want to develop on their machines, the best base install is adding base-devel to the default installation. For normal end users, just base will be sufficient to start. Configuring the system: In this recipe, we'll describe the flow of what to do during the configuration. How to do it... The following steps will guide you in preparing, booting, and setting keyboard layout: Once you have downloaded the ISO image file, you should also verify its integrity by downloading the sha1sums.txt file from the download page. These days you can also check if the ISO is completely valid by verifying the signature of the ISO. Verify the integrity by issuing the sha1sum -c sha1sums.txt command and you'll see whether your download was successful or not. Also check if the signature of the ISO is correct by running gpg -v archlinux-...iso.sig: sha1sum -c sha1sums.txt gpg -v archlinux-2012-08-04-dual.iso.sig The following screenshot shows the execution of this step: As you can see in the previous screenshot, the ISO's checksum is ok and the signature is valid. Now that we are sure our ISO is ok, we can burn this to a CD with our favorite burning program. 
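If you would rather use a USB stick than a CD, one common approach is to write the image with dd. This is only a sketch: the device name /dev/sdY is an assumption and must be replaced with your actual USB device, and the command will overwrite everything on that device:

# run as root; double-check the target device before executing
dd bs=4M if=archlinux-2012-08-04-dual.iso of=/dev/sdY && sync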
Insert the CD into the drive, or insert the USB stick into the USB port of your PC. Enter the boot menu, or let your computer automatically boot from the inserted installation media. If the previous steps are performed correctly, you will see the following screenshot: Select the architecture you want and press Enter, and we'll be on our way. Search the keyboard layout desired for your region. The available keyboard layouts can be found at /usr/share/kbd/keymaps/. Set the desired keyboard layout with loadkeys keyboardlayout. Now let's perform the following steps to create, format, and mount partitions: Start cfdisk or cgdisk, having the first parameter as the device you want to partition: cfdisk /dev/sdX cgdisk /dev/sdX Create your partition scheme. Store the partition scheme. Use the mkfs command to create a filesystem on a specific partition: mkfs -t vfat /dev/sdX mkfs.ext4 -L root /dev/sdX Mount your root partition to /mnt: mount /dev/sdX3 /mnt Make directories under mount for your other partitions: mkdir -p /mnt/boot Mount the other partitions: mount /dev/sdX1 /mnt/boot The following steps are needed to connect to the Internet: When we need a wireless network, create a netcfg profile and run netcfg mywireless. Use dhclient or dhcpcd to get an IP address. The following steps should be performed for installing the base system and boot loader: Run pacstrap with the desired parameters: pacstrap /mnt base base-devel Install the desired boot loader: the best choice at this moment is Syslinux. The final installation of the boot loader will be done in a chroot during the initial configuration. We'll now list the steps to do during the configuration: Generate fstab with genfstab: genfstab -p /mnt >> /mnt/etc/fstab Change the root into the system location: arch-chroot /mnt Set your hostname in /etc/hostname. Create /etc/localtime symlink. Set your locale in /etc/locale.conf. Uncomment the configured locale in /etc/locale.gen. Run locale-gen. Configure /etc/mkinitcpio.conf. Generate your initial ramdisk: mkinitcpio -p linux Finish installation of your boot loader. Set the root password with passwd. Leave the chroot environment (exit). How it works... We downloaded the ISO image file via torrent, or via HTTP from the mirror sites listed on the download page. The sha1sum command lets us verify the integrity of the downloaded ISO. On top of the checksum, we can also check the integrity by verifying the signature available for the ISO. So now, we can rest assured that the downloaded file is the real one. The ISO contains a fully working operating system. It also contains all the necessary tools to perform system recovery and installation. The keyboard configuration set with loadkeys will make sure that the key you press on your keyboard will be translated to the correct letter on your screen. Using a different keyboard layout from the one on your physical keyboard might be confusing. We then created a partition scheme on the selected disk with the appropriate tool (cfdisk or cgdisk). Make Filesystem (mkfs) is a unified frontend to create a filesystem. Using it we created our filesystem layout manually under/mnt by creating our default partition layout in our root, and mounting the specific partitions accordingly. You can make a connection with your wireless network (if needed), and then use dhcpcd or dhclient to obtain an IP address that enables you to access the Internet. Pacstrap will run pacman with a modified root location to install the desired packages into the newly created system. 
For example, installing Syslinux: pacstrap /mnt syslinux The specific configuration files will ensure we don't have to do all those steps over and over again on every boot. Summary This article explained the procedure to get Arch Linux installed on your system using the official installation media. Resources for Article : Further resources on this subject: Compression Formats in Linux Shell Script [Article] Making a Complete yet Small Linux Distribution [Article] Linux Shell Script: Tips and Tricks [Article]

AI_Distilled #32: Navigating Industry Updates and Innovations

Merlyn Shelley
12 Jan 2024
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!👋 Hello ,“There is not going to be one model to rule them all. You need to be trying out different models, you need a real choice of model providers.”  -Adam Selipsky, CEO, AWS. There’s no one-size-fits-all approach in AI development. When you embrace diversity in AI, that’s when it truly shines. There’s also a different side to the coin — the infinitely scalable adaptability of AI to revolutionize field after field, such as when it can help discover promising new sustainable battery materials to potentially reduce reliance on Lithium. Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across different industries and sectors: AI Launches & Industry Updates: Explore the GPT MarketplaceNVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024 Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos OpenAI Set to Launch GPT Store for AI Models and Apps Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S. Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models AI in Healthcare: Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant AI in Business: Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move Walmart Revolutionizes Shopping with Generative AI Innovations AI in Science & Technology: Microsoft and PNNL Harness AI to Discover Promising Battery Material German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience AI in Finance: Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks AI in Supply Chain Management: Warehousing Industry Leverages Machine Learning to Tackle Disruptions We’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge: Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024 Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage A Comprehensive Guide to Merging LLMs AI Drift in Retrieval Augmented Generation Finally, don’t forget to check-out our hands-on tips and strategies from the AI community for you to use on your own projects: Creating Your Own AI Image Generator App with Generative AI Optimizing Code Output with CodeWhisperer Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta How to Craft an Open Source Multi-Modal RAG System Looking for some inspiration? Here are some GitHub repositories to get your projects going!gxnu-zhonglab/odtrack DLYuanGod/TinyGPT-V intel/intel-extension-for-transformers CambioML/pykoi  📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book!  Share your thoughts and opinions here! Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & AnalysisAI Launches & Industry Updates: ⭐ Explore the GPT Marketplace: Just two months in, 3 million custom ChatGPTs are already out there! 
The GPT Store is now open to ChatGPT Plus, Team, and Enterprise users, offering a variety of handy GPTs. Get in on the action at chat.openai.com/gpts! ⭐ NVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024: NVIDIA unveiled impressive CES 2024 innovations: GeForce RTX 40 SUPER GPUs, AI laptops, generative AI tools. They highlighted RTX GPUs' influence on generative AI, introduced TensorRT acceleration for Stable Diffusion XL and SDXL Turbo, and NVIDIA Avatar Cloud Engine (ACE) Microservices for digital avatars. Getty Images and Nvidia introduced Generative AI by iStock, a text-to-image platform for customized stock photos. ⭐ Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos: San Francisco's Perplexity AI secures $73.6 million in funding led by IVP, with Nvidia and Jeff Bezos participating, valuing the company at $520 million. Despite serving 500 million queries in 2023, profitability remains elusive, as it competes with Google in the search market. The funds will be used for hiring and product development. ⭐ OpenAI Set to Launch GPT Store for AI Models and Apps: OpenAI is set to launch the GPT Store, where developers can present custom GPT model applications, following updated policies. The launch, previously delayed, offers diverse, code-free applications. Revenue-sharing details await clarification. ⭐ Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S.: Google is facing a federal jury trial in Boston as Singular Computing alleges patent infringement in its AI processors. Singular seeks up to $7 billion in damages, while Google argues independent development. The trial may last two to three weeks. ⭐ Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models: DeepMind Robotics unveils AutoRT, a system enhancing robot understanding of human intentions using Visual Language Models. It orchestrates 20 robots, suggesting tasks via LLMs and introduces RT-Trajectory with 63% success in 41 tasks using video input. AI in Healthcare: ⭐ Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis: London-based Isomorphic, a DeepMind spin-out, forms strategic alliances with Eli Lilly and Novartis, valued at $3 billion. Utilizing AlphaFold 2 AI technology, Isomorphic focuses on accurate protein predictions for innovative drug discovery. ⭐ Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant: Paris startup Nabla secures $24 million in a Series B funding round led by Cathay Innovation and ZEBOX Ventures. Nabla develops an AI copilot for doctors, streamlining administrative tasks while collaborating with physicians. AI in Business: ⭐ Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move: Deloitte is using a chatbot called PairD to help 75,000 employees in Europe and the Middle East with everyday tasks. While it's convenient, there are concerns about its accuracy, so employees still check its work. Deloitte is also sharing PairD with 800 workers at the charity Scope as part of its AI strategy. ⭐ Walmart Revolutionizes Shopping with Generative AI Innovations: Walmart introduces generative AI-powered features on iOS, Android, and its website to improve the digital shopping experience. These features provide personalized responses and recommendations, shifting from scrolling to goal-oriented searching for a smoother shopping journey. 
AI in Science & Technology: ⭐ Microsoft and PNNL Harness AI to Discover Promising Battery Material: Microsoft and PNNL used AI and cloud computing to speed up battery innovation, identifying a safer, efficient solid-state electrolyte with less lithium. Azure Quantum Elements platform screened 32 million candidates in 80 hours, highlighting a material with potential for a 70% reduction in sodium use, advancing sustainable energy solutions. ⭐ German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience: Leading German automakers like Volkswagen and Mercedes-Benz are revolutionizing the automotive industry with advanced AI integration. Volkswagen unveiled ChatGPT technology, enhancing the driving experience with AI-powered chatbots and IDA voice assistants, while Mercedes-Benz introduced a sophisticated virtual assistant for context-based suggestions, marking a significant leap in interactive AI utilization at CES 2024. AI in Finance: ⭐ Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks: The finance sector's growing use of generative AI is transforming services but raises concerns of misinformation. A study by PYMNTS Intelligence and AI-ID shows 80% of consumers worry about generative AI's misinformation risk. Regulatory guidelines, model explainability tools, and industry cooperation are essential for responsible AI adoption in finance. AI in Supply Chain Management: ⭐ Warehousing Industry Leverages Machine Learning to Tackle Disruptions: Zebra Technologies Corporation's research highlights the warehousing industry's adoption of AI, particularly machine learning (ML), amid challenges like inflation and labor shortages. The report predicts ML, predictive analytics, and mobile dimensioning will dominate by 2028, aiding historical analysis, demand prediction, and automation. Decision-makers aim to boost resilience with 94% planning ML integration within five years.  🔮 Expert Insights from Packt Community The Handbook of NLP with Gensim - By Chris Kuo Gensim and its NLP modeling techniques Gensim is actively maintained and supported by a community of developers and is widely used in academic research and industry applications. It covers many important NLP techniques that make up the workforce of today’s NLP. Last year, I was at a company’s year-end party. The ballroom was filled with people standing in groups with their drinks. I walked around and listened for conversation topics where I could chime in. I heard one group talking about the FIFA World Cup 2022 and another group talking about stock markets. I joined the stock markets conversation. In that short moment, my mind had performed “word extractions,” “text summarization,” and “topic classifications.” These tasks are the core tasks of NLP and what Gensim is designed to do. We perform serious text analyses in professional fields including legal, medical, and business. We organize similar documents into topics. Such work also demands “word extractions,” “text summarization,” and “topic classifications.” In the following sections, I will give you a brief introduction to the key models that Gensim offers so you will have a good overview. These models include the following: BoW and TF-IDF Latent semantic analysis/indexing (LSA/LSI) Word2Vec Doc2Vec Text summarization LDA Ensemble LDA  BoW and TF-IDF Texts can be represented as a bag of words, which is the count frequency of a word. BoW uses the word count to reflect the significance of a word. However, this is not very intuitive. 
Frequent words may not carry special meanings depending on the type of document. LSA/LSI Latent semantic analysis (LSA) was developed in the 1990s. It's an NLP solution that far surpasses naïve keyword matching and has become an important search engine algorithm. Prior to that, in 1988, an LSA-based information retrieval system was patented (US Patent #4839853, now expired) and named “latent semantic indexing,” so the technique is also called latent semantic indexing (LSI). Gensim and many other reports name LSA as LSI so as not to confuse LSA with LDA. This is an excerpt from the book The Handbook of NLP with Gensim - By Chris Kuo and published in OCT ‘23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library. To discover more, click the button below.      Read through the Chapter 1 unlocked here...  🌟 Secret Knowledge: AI/LLM Resources⭐ Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024: In this guide, you'll learn how to navigate the dynamic realm of AI APIs, uncovering the capabilities of the top 9 for 2024. Discover Google Cloud Vision AI, an unparalleled eye for accurate image analysis, IBM Watson Assistant, a conversational genius transforming virtual assistance, Amazon Lex, empowering apps with voice commands effortlessly, Azure Cognitive Services, the Swiss Army knife of AI, offering diverse tools, DeepAI, simplifying deep learning for innovation, and decode texts with MonkeyLearn, a text analysis guru, among others. Read the post to explore how these APIs can shape your tech ventures and redefine the future of AI. ⭐ Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage: Discover how Splitwise, a technique from Azure Research - Systems, boosts LLM inference efficiency. It separates prompt computation and token-generation phases, optimizing hardware use. This method enhances GPU cluster design, achieving higher throughput, lower costs, and reduced power for efficient LLM deployment. ⭐ A Comprehensive Guide to Merging LLMs: This comprehensive guide explores merging LLMs using the mergekit library without requiring a GPU. It covers four merging techniques: SLERP, TIES, DARE, and passthrough, with configuration examples. The result is Marcoro14–7B-slerp, a high-performing model featured on the Open LLM Leaderboard. ⭐ AI Drift in Retrieval Augmented Generation (RAG): This guide delves into AI drift within RAG pipelines, drawing from a real case where a customer faced declining AI responses. It covers the causes (content drift, LLM drift, pipeline algorithm changes) and strategies (content management, API upgrades, internal metrics) to control AI drift.  🔛 Masterclass: AI/LLM Tutorials⭐ Creating Your Own AI Image Generator App with Generative AI: Discover how to build a powerful Generative AI Text-to-Image application in this detailed guide. The author shares their journey of seamlessly integrating AI-generated images into a React app, using third-party APIs like SegMind. With a step-by-step walkthrough, you'll explore the code behind the app on GitHub and learn how to choose the right API, integrate it into React, and unleash AI capabilities in web development. Read on to bring dynamic, AI-generated content to your React projects and stay at the forefront of web development innovation. ⭐ Optimizing Code Output with CodeWhisperer: Unlock the full potential of Amazon CodeWhisperer with this in-depth guide on prompt engineering. 
Learn how CodeWhisperer accelerates software development by offering code recommendations based on natural language comments. The post provides step-by-step insights on effective prompt engineering in Python, emphasizing best practices such as crafting specific and concise prompts, incorporating additional context, utilizing multiple comments strategically, and understanding CodeWhisperer's capacity for cross-file context. ⭐ Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta: Discover how to leverage LLMs with traditional NLP and ML methods to create knowledge graphs from unstructured text. The author showcases the synergy of KeyBERT, HDBSCAN, and Zephyr-7B-Beta for improved keyword extraction, clustering, and refinement. The guide covers dataset prep, keyword extraction, and LLM integration. ⭐ How to Craft an Open Source Multi-Modal RAG System: Discover building a Retrieval-Augmented Generation (RAG) system with an Open Source Large Language Multi-Modal (LLMM). Learn the integration of ChromeDB and Hugging Face, covering Clip, data storage, and MLLMs for user chat sessions in a detailed, dependency-free guide.  🚀 HackHub: Trending AI Tools⭐ gxnu-zhonglab/odtrack: Efficient video-level tracking pipeline utilizing online token propagation to densely capture contextual relationships and spatio-temporal trajectories across frames.  ⭐ DLYuanGod/TinyGPT-V: Features an efficient Multimodal Large Language Model using small backbones for efficiently incorporating multimodal capabilities into language models. ⭐ intel/intel-extension-for-transformers: Toolkit to accelerate GenAI/LLM performance on Intel platforms, including Gaudi2, CPU, and GPU, seamlessly compressing Transformer-based models, accessing optimized model packages, and using NeuralChat. ⭐ CambioML/pykoi: An open-source Python library for LLMs, enhancing them with RLHF, collecting user feedback, fine-tuning with reinforcement learning, comparing models, and creating RAG chatbots efficiently.  

Use TensorFlow and NLP to detect duplicate Quora questions [Tutorial]

Sunith Shetty
21 Jun 2018
32 min read
This tutorial shows how to build an NLP project with TensorFlow that explicates the semantic similarity between sentences using the Quora dataset. It is based on the work of Abhishek Thakur, who originally developed a solution on the Keras package. This article is an excerpt from a book written by Luca Massaron, Alberto Boschetti, Alexey Grigorev, Abhishek Thakur, and Rajalingappaa Shanmugamani titled TensorFlow Deep Learning Projects. Presenting the dataset The data, made available for non-commercial purposes (https://www.quora.com/about/tos) in a Kaggle competition (https://www.kaggle.com/c/quora-question-pairs) and on Quora's blog (https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs), consists of 404,351 question pairs with 255,045 negative samples (non-duplicates) and 149,306 positive samples (duplicates). There are approximately 40% positive samples, a slight imbalance that won't need particular corrections. Actually, as reported on the Quora blog, given their original sampling strategy, the number of duplicated examples in the dataset was much higher than the non-duplicated ones. In order to set up a more balanced dataset, the negative examples were upsampled by using pairs of related questions, that is, questions about the same topic that are actually not similar. Before starting work on this project, you can simply directly download the data, which is about 55 MB, from its Amazon S3 repository at this link into our working directory. After loading it, we can start diving directly into the data by picking some example rows and examining them. The following diagram shows an actual snapshot of the few first rows from the dataset: Exploring further into the data, we can find some examples of question pairs that mean the same thing, that is, duplicates, as follows:  How does Quora quickly mark questions as needing improvement? Why does Quora mark my questions as needing improvement/clarification before I have time to give it details? Literally within seconds… Why did Trump win the Presidency? How did Donald Trump win the 2016 Presidential Election? What practical applications might evolve from the discovery of the Higgs Boson? What are some practical benefits of the discovery of the Higgs Boson? At first sight, duplicated questions have quite a few words in common, but they could be very different in length. On the other hand, examples of non-duplicate questions are as follows: Who should I address my cover letter to if I'm applying to a big company like Mozilla? Which car is better from a safety persepctive? swift or grand i10. My first priority is safety? Mr. Robot (TV series): Is Mr. Robot a good representation of real-life hacking and hacking culture? Is the depiction of hacker societies realistic? What mistakes are made when depicting hacking in Mr. Robot compared to real-life cyber security breaches or just a regular use of technologies? How can I start an online shopping (e-commerce) website? Which web technology is best suited for building a big e-commerce website? Some questions from these examples are clearly not duplicated and have few words in common, but some others are more difficult to detect as unrelated. For instance, the second pair in the example might turn to be appealing to some and leave even a human judge uncertain. The two questions might mean different things: why versus how, or they could be intended as the same from a superficial examination. 
Looking deeper, we may even find more doubtful examples and even some clear data mistakes; we surely have some anomalies in the dataset (as the Quora post on the dataset warned) but, given that the data is derived from a real-world problem, we can't do anything but deal with this kind of imperfection and strive to find a robust solution that works. At this point, our exploration becomes more quantitative than qualitative and some statistics on the question pairs are provided here:

Average number of characters in question1: 59.57
Minimum number of characters in question1: 1
Maximum number of characters in question1: 623
Average number of characters in question2: 60.14
Minimum number of characters in question2: 1
Maximum number of characters in question2: 1169

Question 1 and question 2 have roughly the same average number of characters, though question 2 shows more extreme values. There must also be some noise in the data, since a question made up of a single character does not make sense. We can even get a completely different view of our data by plotting it into a word cloud and highlighting the most common words present in the dataset:

Figure 1: A word cloud made up of the most frequent words to be found in the Quora dataset

The presence of word sequences such as Hillary Clinton and Donald Trump reminds us that the data was gathered at a certain historical moment and that many questions we can find inside it are clearly ephemeral, reasonable only at the very time the dataset was collected. Other topics, such as programming language, World War, or earn money could be longer lasting, both in terms of interest and in the validity of the answers provided. After exploring the data a bit, it is now time to decide what target metric we will strive to optimize in our project. Throughout the article, we will be using accuracy as a metric to evaluate the performance of our models. Accuracy as a measure is simply focused on the effectiveness of the prediction, and it may miss some important differences between alternative models, such as discrimination power (is the model more able to detect duplicates or not?) or the exactness of probability scores (how much margin is there between being a duplicate and not being one?). We chose accuracy based on the fact that this metric was the one decided on by Quora's engineering team to create a benchmark for this dataset (as stated in this blog post of theirs: https://engineering.quora.com/Semantic-Question-Matching-with-Deep-Learning). Using accuracy as the metric makes it easier for us to evaluate and compare our models with the one from Quora's engineering team, and also with several other research papers. In addition, in a real-world application, our work may simply be evaluated on the basis of how many times it is just right or wrong, regardless of other considerations. We can now proceed further in our project, starting with some very basic feature engineering. Starting with basic feature engineering Before starting to code, we have to load the dataset in Python and also provide Python with all the necessary packages for our project. We will need to have these packages installed on our system (the latest versions should suffice, no need for any specific package version):

Numpy
pandas
fuzzywuzzy
python-Levenshtein
scikit-learn
gensim
pyemd
NLTK

As we will be using each one of these packages in the project, we will provide specific instructions and tips to install them. For all dataset operations, we will be using pandas (and Numpy will come in handy, too). 
To install numpy and pandas:

pip install numpy
pip install pandas

The dataset can be loaded into memory easily by using pandas and a specialized data structure, the pandas dataframe (we expect the dataset to be in the same directory as your script or Jupyter notebook):

import pandas as pd
import numpy as np

data = pd.read_csv('quora_duplicate_questions.tsv', sep='\t')
data = data.drop(['id', 'qid1', 'qid2'], axis=1)

We will be using the pandas dataframe denoted by data throughout, including when we work with our TensorFlow model and provide input to it. We can now start by creating some very basic features. These basic features include length-based features and string-based features:

Length of question1
Length of question2
Difference between the two lengths
Character length of question1 without spaces
Character length of question2 without spaces
Number of words in question1
Number of words in question2
Number of common words in question1 and question2

These features are computed with one-liners that transform the original input using the pandas package in Python and its apply method:

# length based features
data['len_q1'] = data.question1.apply(lambda x: len(str(x)))
data['len_q2'] = data.question2.apply(lambda x: len(str(x)))
# difference in lengths of two questions
data['diff_len'] = data.len_q1 - data.len_q2
# character length based features
data['len_char_q1'] = data.question1.apply(lambda x: len(''.join(set(str(x).replace(' ', '')))))
data['len_char_q2'] = data.question2.apply(lambda x: len(''.join(set(str(x).replace(' ', '')))))
# word length based features
data['len_word_q1'] = data.question1.apply(lambda x: len(str(x).split()))
data['len_word_q2'] = data.question2.apply(lambda x: len(str(x).split()))
# common words in the two questions
data['common_words'] = data.apply(lambda x: len(set(str(x['question1']).lower().split()).intersection(set(str(x['question2']).lower().split()))), axis=1)

For future reference, we will mark this set of features as feature set-1 or fs_1:

fs_1 = ['len_q1', 'len_q2', 'diff_len', 'len_char_q1', 'len_char_q2', 'len_word_q1', 'len_word_q2', 'common_words']

This simple approach will help you to easily recall and combine different sets of features in the machine learning models we are going to build, making it straightforward to compare models trained on different feature sets. Creating fuzzy features The next set of features is based on fuzzy string matching. Fuzzy string matching is also known as approximate string matching and is the process of finding strings that approximately match a given pattern. The closeness of a match is defined by the number of primitive operations necessary to convert the string into an exact match. These primitive operations include insertion (to insert a character at a given position), deletion (to delete a particular character), and substitution (to replace a character with a new one). Fuzzy string matching is typically used for spell checking, plagiarism detection, DNA sequence matching, spam filtering, and so on, and it is part of the larger family of edit distances, which are based on the idea that a string can be transformed into another one. It is frequently used in natural language processing and other applications in order to ascertain the degree of difference between two strings of characters. This edit distance is also known as Levenshtein distance, after the Russian scientist Vladimir Levenshtein, who introduced it in 1965. 
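To make the idea of edit distance concrete, here is a small, unoptimized sketch of the Levenshtein distance, written for illustration only; in the project itself we rely on the fast implementations provided by the fuzzywuzzy and python-Levenshtein packages described next:

def levenshtein(s1, s2):
    # dp[i][j] holds the number of edits needed to turn s1[:i] into s2[:j]
    dp = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(len(s1) + 1):
        dp[i][0] = i                                  # i deletions
    for j in range(len(s2) + 1):
        dp[0][j] = j                                  # j insertions
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(s1)][len(s2)]

levenshtein("kitten", "sitting")   # returns 3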
These features were created using the fuzzywuzzy package available for Python (https://pypi.python.org/pypi/fuzzywuzzy). This package uses Levenshtein distance to calculate the differences in two sequences, which in our case are the pair of questions. The fuzzywuzzy package can be installed using pip3: pip install fuzzywuzzy As an important dependency, fuzzywuzzy requires the Python-Levenshtein package (https://github.com/ztane/python-Levenshtein/), which is a blazingly fast implementation of this classic algorithm, powered by compiled C code. To make the calculations much faster using fuzzywuzzy, we also need to install the Python-Levenshtein package: pip install python-Levenshtein The fuzzywuzzy package offers many different types of ratio, but we will be using only the following: QRatio WRatio Partial ratio Partial token set ratio Partial token sort ratio Token set ratio Token sort ratio Examples of fuzzywuzzy features on Quora data: from fuzzywuzzy import fuzz fuzz.QRatio("Why did Trump win the Presidency?", "How did Donald Trump win the 2016 Presidential Election") This code snippet will result in the value of 67 being returned: fuzz.QRatio("How can I start an online shopping (e-commerce) website?", "Which web technology is best suitable for building a big E-Commerce website?") In this comparison, the returned value will be 60. Given these examples, we notice that although the values of QRatio are close to each other, the value for the similar question pair from the dataset is higher than the pair with no similarity. Let's take a look at another feature from fuzzywuzzy for these same pairs of questions: fuzz.partial_ratio("Why did Trump win the Presidency?", "How did Donald Trump win the 2016 Presidential Election") In this case, the returned value is 73: fuzz.partial_ratio("How can I start an online shopping (e-commerce) website?", "Which web technology is best suitable for building a big E-Commerce website?") Now the returned value is 57. Using the partial_ratio method, we can observe how the difference in scores for these two pairs of questions increases notably, allowing an easier discrimination between being a duplicate pair or not. We assume that these features might add value to our models. By using pandas and the fuzzywuzzy package in Python, we can again apply these features as simple one-liners: data['fuzz_qratio'] = data.apply(lambda x: fuzz.QRatio( str(x['question1']), str(x['question2'])), axis=1) data['fuzz_WRatio'] = data.apply(lambda x: fuzz.WRatio( str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_ratio'] = data.apply(lambda x: fuzz.partial_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_token_set_ratio'] = data.apply(lambda x: fuzz.partial_token_set_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_token_sort_ratio'] = data.apply(lambda x: fuzz.partial_token_sort_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_token_set_ratio'] = data.apply(lambda x: fuzz.token_set_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_token_sort_ratio'] = data.apply(lambda x: fuzz.token_sort_ratio(str(x['question1']), str(x['question2'])), axis=1) This set of features are henceforth denoted as feature set-2 or fs_2: fs_2 = ['fuzz_qratio', 'fuzz_WRatio', 'fuzz_partial_ratio', 'fuzz_partial_token_set_ratio', 'fuzz_partial_token_sort_ratio', 'fuzz_token_set_ratio', 'fuzz_token_sort_ratio'] Again, we will store our work and save it for later use when modeling. 
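One simple way to store the work done so far (the file name below is arbitrary) is to serialize the enriched dataframe, so that the fs_1 and fs_2 columns can be reloaded later without recomputation:

data.to_pickle('quora_features.pkl')
# later: data = pd.read_pickle('quora_features.pkl')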
Resorting to TF-IDF and SVD features The next few sets of features are based on TF-IDF and SVD. Term Frequency-Inverse Document Frequency (TF-IDF) is one of the algorithms at the foundation of information retrieval. The algorithm can be summarized with the following notation: C(t) is the number of times a term t appears in a document and N is the total number of terms in the document; these give the Term Frequency (TF). ND is the total number of documents and NDt is the number of documents containing the term t; these provide the Inverse Document Frequency (IDF). TF-IDF for a term t is the product of Term Frequency and Inverse Document Frequency for that term:

TF(t) = C(t) / N
IDF(t) = log(ND / NDt)
TF-IDF(t) = TF(t) * IDF(t)

Without any prior knowledge, other than about the documents themselves, such a score will highlight all the terms that could easily discriminate a document from the others, down-weighting the common words that won't tell you much, such as the common parts of speech (articles, for instance). If you need a more hands-on explanation of TFIDF, this great online tutorial will help you try coding the algorithm yourself and testing it on some text data: https://stevenloria.com/tf-idf/ For convenience and speed of execution, we resorted to the scikit-learn implementation of TFIDF. If you don't already have scikit-learn installed, you can install it using pip:

pip install -U scikit-learn

We create TFIDF features for both question1 and question2 separately (in order to type less, we just deep copy the question1 TfidfVectorizer):

from sklearn.feature_extraction.text import TfidfVectorizer
from copy import deepcopy

tfv_q1 = TfidfVectorizer(min_df=3, max_features=None, strip_accents='unicode', analyzer='word', token_pattern=r'\w{1,}', ngram_range=(1, 2), use_idf=1, smooth_idf=1, sublinear_tf=1, stop_words='english')
tfv_q2 = deepcopy(tfv_q1)

It must be noted that the parameters shown here have been selected after quite a lot of experiments. These parameters generally work pretty well with other problems concerning natural language processing, specifically text classification. One might need to change the stop word list to the language in question. We can now obtain the TFIDF matrices for question1 and question2 separately:

q1_tfidf = tfv_q1.fit_transform(data.question1.fillna(""))
q2_tfidf = tfv_q2.fit_transform(data.question2.fillna(""))

In our TFIDF processing, we computed the TFIDF matrices based on all the data available (we used the fit_transform method). This is quite a common approach in Kaggle competitions because it helps to score higher on the leaderboard. However, if you are working in a real setting, you may want to exclude a part of the data as a training or validation set in order to be sure that your TFIDF processing helps your model to generalize to a new, unseen dataset. After we have the TFIDF features, we move to SVD features. SVD is a feature decomposition method and it stands for singular value decomposition. It is largely used in NLP because of a technique called Latent Semantic Analysis (LSA). A detailed discussion of SVD and LSA is beyond the scope of this article, but you can get an idea of their workings by trying these two approachable and clear online tutorials: https://alyssaq.github.io/2015/singular-value-decomposition-visualisation/ and https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis-lsa-tutorial/ To create the SVD features, we again use the scikit-learn implementation. 
This implementation is a variation of traditional SVD and is known as TruncatedSVD. A TruncatedSVD is an approximate SVD method that can provide you with a reliable yet computationally fast SVD matrix decomposition. You can find more hints about how this technique works and how it can be applied by consulting this web page: http://langvillea.people.cofc.edu/DISSECTION-LAB/Emmie'sLSI-SVDModule/p5module.html

from sklearn.decomposition import TruncatedSVD

svd_q1 = TruncatedSVD(n_components=180)
svd_q2 = TruncatedSVD(n_components=180)

We chose 180 components for the SVD decomposition, and these features are calculated on the TF-IDF matrices:

question1_vectors = svd_q1.fit_transform(q1_tfidf)
question2_vectors = svd_q2.fit_transform(q2_tfidf)

Feature set-3 is derived from a combination of these TF-IDF and SVD features. For example, we can have only the TF-IDF features for the two questions separately going into the model, or we can have the TF-IDF of the two questions combined with an SVD on top of them, and then the model kicks in, and so on. These features are explained as follows. Feature set-3(1), or fs3_1, is created using two different TF-IDFs for the two questions, which are then stacked together horizontally and passed to a machine learning model. This can be coded as:

from scipy import sparse
# obtain features by stacking the sparse matrices together
fs3_1 = sparse.hstack((q1_tfidf, q2_tfidf))

Feature set-3(2), or fs3_2, is created by combining the two questions and using a single TF-IDF:

tfv = TfidfVectorizer(min_df=3, max_features=None, strip_accents='unicode', analyzer='word', token_pattern=r'\w{1,}', ngram_range=(1, 2), use_idf=1, smooth_idf=1, sublinear_tf=1, stop_words='english')
# combine questions and calculate tf-idf
q1q2 = data.question1.fillna("")
q1q2 += " " + data.question2.fillna("")
fs3_2 = tfv.fit_transform(q1q2)

The next subset of features in this feature set, feature set-3(3) or fs3_3, consists of separate TF-IDFs and SVDs for both questions. This can be coded as follows:

# obtain features by stacking the matrices together
fs3_3 = np.hstack((question1_vectors, question2_vectors))

We can similarly create a couple more combinations using TF-IDF and SVD, and call them fs3-4 and fs3-5, respectively (feature set-3(4) and feature set-3(5)); the code for these is left as an exercise for the reader. After the basic feature set and some TF-IDF and SVD features, we can now move to more complicated features before diving into the machine learning and deep learning models. Mapping with Word2vec embeddings Very broadly, Word2vec models are two-layer neural networks that take a text corpus as input and output a vector for every word in that corpus. After fitting, the words with similar meaning have their vectors close to each other, that is, the distance between them is small compared to the distance between the vectors for words that have very different meanings. Nowadays, Word2vec has become a standard in natural language processing problems and often it provides very useful insights into information retrieval tasks. For this particular problem, we will be using the Google news vectors. This is a pretrained Word2vec model trained on the Google News corpus. 
Every word, when represented by its Word2vec vector, gets a position in space, as depicted in the following diagram: All the words in this example, such as Germany, Berlin, France, and Paris, can be represented by a 300-dimensional vector, if we are using the pretrained vectors from the Google news corpus. When we use Word2vec representations for these words and we subtract the vector of Germany from the vector of Berlin and add the vector of France to it, we will get a vector that is very similar to the vector of Paris. The Word2vec model thus carries the meaning of words in the vectors. The information carried by these vectors constitutes a very useful feature for our task. For a user-friendly, yet more in-depth, explanation and description of possible applications of Word2vec, we suggest reading https://www.distilled.net/resources/a-beginners-guide-to-Word2vec-aka-whats-the-opposite-of-canada/, or if you need a more mathematically defined explanation, we recommend reading this paper: http://www.1-4-5.net/~dmm/ml/how_does_Word2vec_work.pdf To load the Word2vec features, we will be using Gensim. If you don't have Gensim, you can install it easily using pip. At this time, it is suggested you also install the pyemd package, which will be used by the WMD distance function, a function that will help us to relate two Word2vec vectors: pip install gensim pip install pyemd To load the Word2vec model, we download the GoogleNews-vectors-negative300.bin.gz binary and use Gensim's load_Word2vec_format function to load it into memory. You can easily download the binary from an Amazon AWS repository using the wget command from a shell: wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz" After downloading and decompressing the file, you can use it with the Gensim KeyedVectors functions: import gensim model = gensim.models.KeyedVectors.load_word2vec_format( 'GoogleNews-vectors-negative300.bin.gz', binary=True) Now, we can easily get the vector of a word by calling model[word]. However, a problem arises when we are dealing with sentences instead of individual words. In our case, we need vectors for all of question1 and question2 in order to come up with some kind of comparison. For this, we can use the following code snippet. The snippet basically adds the vectors for all words in a sentence that are available in the Google news vectors and gives a normalized vector at the end. We can call this sentence to vector, or Sent2Vec. Make sure that you have Natural Language Tool Kit (NLTK) installed before running the preceding function: $ pip install nltk It is also suggested that you download the punkt and stopwords packages, as they are part of NLTK: import nltk nltk.download('punkt') nltk.download('stopwords') If NLTK is now available, you just have to run the following snippet and define the sent2vec function: from nltk.corpus import stopwords from nltk import word_tokenize stop_words = set(stopwords.words('english')) def sent2vec(s, model): M = [] words = word_tokenize(str(s).lower()) for word in words: #It shouldn't be a stopword if word not in stop_words: #nor contain numbers if word.isalpha(): #and be part of word2vec if word in model: M.append(model[word]) M = np.array(M) if len(M) > 0: v = M.sum(axis=0) return v / np.sqrt((v ** 2).sum()) else: return np.zeros(300) When the phrase is null, we arbitrarily decide to give back a standard vector of zero values. 
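As a quick sanity check of the sent2vec function, we can turn a pair of questions into vectors and compare them; this snippet is purely illustrative and assumes the model loaded above is still in memory. Because sent2vec returns L2-normalized vectors, their dot product is directly the cosine similarity:

v1 = sent2vec("Why did Trump win the Presidency?", model)
v2 = sent2vec("How did Donald Trump win the 2016 Presidential Election", model)
print(np.dot(v1, v2))   # closer to 1 means more similar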
To calculate the similarity between the questions, another feature that we created was word mover's distance. Word mover's distance uses Word2vec embeddings and works on a principle similar to that of earth mover's distance to give a distance between two text documents. Simply put, word mover's distance provides the minimum distance needed to move all the words from one document to another document. The WMD has been introduced by this paper: KUSNER, Matt, et al. From word embeddings to document distances. In: International Conference on Machine Learning. 2015. p. 957-966 which can be found at http://proceedings.mlr.press/v37/kusnerb15.pdf. For a hands-on tutorial on the distance, you can also refer to this tutorial based on the Gensim implementation of the distance: https://markroxor.github.io/gensim/static/notebooks/WMD_tutorial.html Final Word2vec (w2v) features also include other distances, more usual ones such as the Euclidean or cosine distance. We complete the sequence of features with some measurement of the distribution of the two document vectors: Word mover distance Normalized word mover distance Cosine distance between vectors of question1 and question2 Manhattan distance between vectors of question1 and question2 Jaccard similarity between vectors of question1 and question2 Canberra distance between vectors of question1 and question2 Euclidean distance between vectors of question1 and question2 Minkowski distance between vectors of question1 and question2 Braycurtis distance between vectors of question1 and question2 The skew of the vector for question1 The skew of the vector for question2 The kurtosis of the vector for question1 The kurtosis of the vector for question2 All the Word2vec features are denoted by fs4. A separate set of w2v features consists in the matrices of Word2vec vectors themselves: Word2vec vector for question1 Word2vec vector for question2 These will be represented by fs5: w2v_q1 = np.array([sent2vec(q, model) for q in data.question1]) w2v_q2 = np.array([sent2vec(q, model) for q in data.question2]) In order to easily implement all the different distance measures between the vectors of the Word2vec embeddings of the Quora questions, we use the implementations found in the scipy.spatial.distance module: from scipy.spatial.distance import cosine, cityblock, jaccard, canberra, euclidean, minkowski, braycurtis data['cosine_distance'] = [cosine(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['cityblock_distance'] = [cityblock(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['jaccard_distance'] = [jaccard(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['canberra_distance'] = [canberra(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['euclidean_distance'] = [euclidean(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['minkowski_distance'] = [minkowski(x,y,3) for (x,y) in zip(w2v_q1, w2v_q2)] data['braycurtis_distance'] = [braycurtis(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] All the features names related to distances are gathered under the list fs4_1: fs4_1 = ['cosine_distance', 'cityblock_distance', 'jaccard_distance', 'canberra_distance', 'euclidean_distance', 'minkowski_distance', 'braycurtis_distance'] The Word2vec matrices for the two questions are instead horizontally stacked and stored away in the w2v variable for later usage: w2v = np.hstack((w2v_q1, w2v_q2)) The Word Mover's Distance is implemented using a function that returns the distance between two questions, after having transformed them into lowercase and after removing any stopwords. 
Moreover, we also calculate a normalized version of the distance, after transforming all the Word2vec vectors into L2-normalized vectors (each vector is transformed to unit norm, that is, if we squared each element in the vector and summed all of them, the result would equal one) using the init_sims method:

def wmd(s1, s2, model):
    s1 = str(s1).lower().split()
    s2 = str(s2).lower().split()
    stop_words = stopwords.words('english')
    s1 = [w for w in s1 if w not in stop_words]
    s2 = [w for w in s2 if w not in stop_words]
    return model.wmdistance(s1, s2)

data['wmd'] = data.apply(lambda x: wmd(x['question1'], x['question2'], model), axis=1)
model.init_sims(replace=True)
data['norm_wmd'] = data.apply(lambda x: wmd(x['question1'], x['question2'], model), axis=1)
fs4_2 = ['wmd', 'norm_wmd']

After these last computations, we now have most of the important features that are needed to create some basic machine learning models, which will serve as a benchmark for our deep learning models. The following table displays a snapshot of the available features:

Let's train some machine learning models on these and other Word2vec-based features.

Testing machine learning models

Before proceeding, depending on your system, you may need to clean up the memory a bit and free space for the machine learning models by removing previously used data structures. This is done using gc.collect, after deleting any past variables that are not required anymore, and then checking the available memory by exact reporting from the psutil.virtual_memory function:

import gc
import psutil

del([tfv_q1, tfv_q2, tfv, q1q2, question1_vectors, question2_vectors, svd_q1, svd_q2, q1_tfidf, q2_tfidf])
del([w2v_q1, w2v_q2])
del([model])
gc.collect()
psutil.virtual_memory()

At this point, we simply recap the different features created up to now, and their meaning in terms of generated features:

fs_1: List of basic features
fs_2: List of fuzzy features
fs3_1: Sparse data matrix of TFIDF for separated questions
fs3_2: Sparse data matrix of TFIDF for combined questions
fs3_3: Sparse data matrix of SVD
fs3_4: List of SVD statistics
fs4_1: List of Word2vec distances
fs4_2: List of WMD distances
w2v: A matrix of sentence vectors obtained by transforming each question with the Sent2Vec function

We evaluate two basic and very popular machine learning models, namely logistic regression and gradient boosting, the latter using the xgboost package in Python. The following table provides the performance of the logistic regression and xgboost algorithms on the different sets of features created earlier, as obtained during the Kaggle competition:

Feature set | Logistic regression accuracy | xgboost accuracy
Basic features (fs1) | 0.658 | 0.721
Basic features + fuzzy features (fs1 + fs2) | 0.660 | 0.738
Basic features + fuzzy features + w2v features (fs1 + fs2 + fs4) | 0.676 | 0.766
W2v vector features (fs5) | * | 0.78
Basic features + fuzzy features + w2v features + w2v vector features (fs1 + fs2 + fs4 + fs5) | * | 0.814
TFIDF-SVD features (fs3-1) | 0.777 | 0.749
TFIDF-SVD features (fs3-2) | 0.804 | 0.748
TFIDF-SVD features (fs3-3) | 0.706 | 0.763
TFIDF-SVD features (fs3-4) | 0.700 | 0.753
TFIDF-SVD features (fs3-5) | 0.714 | 0.759

* = These models were not trained due to high memory requirements.

We can treat the performances achieved as benchmarks or baseline numbers before starting with deep learning models, but we won't limit ourselves to that, and we will be trying to replicate some of them.

We are going to start by importing all the necessary packages. As for the logistic regression, we will be using the scikit-learn implementation.
xgboost is a scalable, portable, and distributed gradient boosting library (a tree ensemble machine learning algorithm). Initially created by Tianqi Chen at the University of Washington, it has been enriched with a Python wrapper by Bing Xu and an R interface by Tong He (you can read the story behind xgboost directly from its principal creator at homes.cs.washington.edu/~tqchen/2016/03/10/story-and-lessons-behind-the-evolution-of-xgboost.html). xgboost is available for Python, R, Java, Scala, Julia, and C++, and it can work both on a single machine (leveraging multithreading) and in Hadoop and Spark clusters.

Detailed instructions for installing xgboost on your system can be found on this page: github.com/dmlc/xgboost/blob/master/doc/build.md

The installation of xgboost on both Linux and macOS is quite straightforward, whereas it is a little bit trickier for Windows users. For this reason, we provide specific installation steps for getting xgboost working on Windows:

First, download and install Git for Windows (git-for-windows.github.io).

Then, you need a MinGW compiler present on your system. You can download it from www.mingw.org according to the characteristics of your system.

From the command line, execute:

$> git clone --recursive https://github.com/dmlc/xgboost
$> cd xgboost
$> git submodule init
$> git submodule update

Then, still from the command line, copy the configuration for 64-bit systems to be the default one:

$> copy make\mingw64.mk config.mk

Alternatively, you can just copy the plain 32-bit version:

$> copy make\mingw.mk config.mk

After copying the configuration file, you can run the compiler, setting it to use four threads in order to speed up the compiling process:

$> mingw32-make -j4

In MinGW, the make command comes with the name mingw32-make; if you are using a different compiler, the previous command may not work, but you can simply try:

$> make -j4

Finally, if the compiler completed its work without errors, you can install the package in Python with:

$> cd python-package
$> python setup.py install

If xgboost has been properly installed on your system, you can proceed with importing both machine learning algorithms:

from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import xgboost as xgb

Since we will be using a logistic regression solver that is sensitive to the scale of the features (it is the sag solver from https://github.com/EpistasisLab/tpot/issues/292, which requires computational time that is linear with respect to the size of the data), we will start by standardizing the data using the StandardScaler in scikit-learn:

scaler = StandardScaler()
y = data.is_duplicate.values
y = y.astype('float32').reshape(-1, 1)
X = data[fs_1 + fs_2 + fs3_4 + fs4_1 + fs4_2]
X = X.replace([np.inf, -np.inf], np.nan).fillna(0).values
X = scaler.fit_transform(X)
X = np.hstack((X, fs3_3))

We select the data for training by first filtering the fs_1, fs_2, fs3_4, fs4_1, and fs4_2 sets of variables, and then stacking the fs3_3 SVD data matrix.
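A small caveat about the final stacking line: np.hstack only works correctly on dense NumPy arrays. If fs3_3 is actually held as a scipy sparse matrix (as its description in the feature recap might suggest), a sparse-aware stacking such as the following sketch would be needed instead of the last line above; treat this as a hedged alternative for that case, not a step of the original recipe:

from scipy import sparse
import numpy as np

# Only needed if fs3_3 is a scipy sparse matrix; np.hstack does not handle sparse inputs.
if sparse.issparse(fs3_3):
    X = sparse.hstack((sparse.csr_matrix(X), fs3_3)).toarray()
else:
    X = np.hstack((X, fs3_3))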
We also provide a random split, setting aside 1/10 of the data for validation purposes (in order to effectively assess the quality of the created model):

np.random.seed(42)
n_all, _ = y.shape
idx = np.arange(n_all)
np.random.shuffle(idx)
n_split = n_all // 10
idx_val = idx[:n_split]
idx_train = idx[n_split:]
x_train = X[idx_train]
y_train = np.ravel(y[idx_train])
x_val = X[idx_val]
y_val = np.ravel(y[idx_val])

As a first model, we try logistic regression, setting the L2 regularization parameter C to 0.1 (modest regularization). Once the model is ready, we test its efficacy on the validation set (x_val for the validation features, y_val for the correct answers). The results are assessed on accuracy, that is, the proportion of exact guesses on the validation set:

logres = linear_model.LogisticRegression(C=0.1, solver='sag', max_iter=1000)
logres.fit(x_train, y_train)
lr_preds = logres.predict(x_val)
log_res_accuracy = np.sum(lr_preds == y_val) / len(y_val)
print("Logistic regr accuracy: %0.3f" % log_res_accuracy)

After a while (the solver has a maximum of 1,000 iterations before giving up on converging), the resulting accuracy on the validation set will be 0.743, which will be our starting baseline.

Now, we try to predict using the xgboost algorithm. Being a gradient boosting algorithm, this learning algorithm has more variance (the ability to fit complex predictive functions, but also to overfit) than a simple logistic regression, which is afflicted by greater bias (in the end, it is a summation of coefficients), so we expect much better results. We fix the maximum depth of its decision trees to 4 (a shallow value, which should prevent overfitting) and we use an eta of 0.02 (it will need to grow many trees because the learning rate is low). We also set up a watchlist, keeping an eye on the validation set for an early stop if the expected error on the validation set doesn't decrease for over 50 steps.

It is not best practice to stop early on the same set (the validation set, in our case) that we use for reporting the final results. In a real-world setting, ideally, we should set up a validation set for tuning operations, such as early stopping, and a test set for reporting the expected results when generalizing to new data.

After setting all this up, we run the algorithm. This time, we will have to wait longer than when we ran the logistic regression:

params = dict()
params['objective'] = 'binary:logistic'
params['eval_metric'] = ['logloss', 'error']
params['eta'] = 0.02
params['max_depth'] = 4

d_train = xgb.DMatrix(x_train, label=y_train)
d_valid = xgb.DMatrix(x_val, label=y_val)
watchlist = [(d_train, 'train'), (d_valid, 'valid')]
bst = xgb.train(params, d_train, 5000, watchlist, early_stopping_rounds=50, verbose_eval=100)
xgb_preds = (bst.predict(d_valid) >= 0.5).astype(int)
xgb_accuracy = np.sum(xgb_preds == y_val) / len(y_val)
print("Xgb accuracy: %0.3f" % xgb_accuracy)

The final result reported by xgboost is 0.803 accuracy on the validation set.

Building the TensorFlow model

The deep learning models in this article are built using TensorFlow, based on the original script written by Abhishek Thakur using Keras (you can read the original code at https://github.com/abhishekkrthakur/is_that_a_duplicate_quora_question). Keras is a Python library that provides an easy interface to TensorFlow. TensorFlow has official support for Keras, and the models trained using Keras can easily be converted to TensorFlow models. Keras enables the very fast prototyping and testing of deep learning models.
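To give a flavour of that prototyping convenience, the toy snippet below declares a small fully connected classifier over our dense feature matrix using the tf.keras API in just a few lines. It is purely illustrative: the layer sizes are arbitrary, it is not the model built in this project, and it assumes the X matrix from the earlier steps and a TensorFlow version that ships tf.keras:

import tensorflow as tf

# A minimal two-layer binary classifier over the engineered features, declared with Keras.
toy_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
toy_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
toy_model.summary()  # a quick look at the layers and parameter counts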
In our project, however, we rewrote the solution entirely in TensorFlow from scratch. To start, let's import the necessary libraries, in particular TensorFlow, and check its version by printing it:

import zipfile
from tqdm import tqdm_notebook as tqdm
import tensorflow as tf

print("TensorFlow version %s" % tf.__version__)

At this point, we simply load the data into the df pandas DataFrame, or we load it from disk. We replace the missing values with an empty string and we set the y variable containing the target answer, encoded as 1 (duplicated) or 0 (not duplicated):

try:
    df = data[['question1', 'question2', 'is_duplicate']]
except:
    df = pd.read_csv('data/quora_duplicate_questions.tsv', sep='\t')
    df = df.drop(['id', 'qid1', 'qid2'], axis=1)

df = df.fillna('')
y = df.is_duplicate.values
y = y.astype('float32').reshape(-1, 1)

To summarize, we built a model with the help of TensorFlow in order to detect duplicated questions from the Quora dataset. To know more about how to build and train your own deep learning models with TensorFlow confidently, do check out the book TensorFlow Deep Learning Projects.

TensorFlow 1.9.0-rc0 release announced
Implementing feedforward networks with TensorFlow
How TFLearn makes building TensorFlow models easier
npm JavaScript predictions for 2019: React, GraphQL, and TypeScript are three technologies to learn

Bhagyashree R
10 Dec 2018
3 min read
Based on Laurie Voss' talk at Node+JS Interactive 2018, npm shared on Friday some insights and predictions about JavaScript for 2019. These predictions are aimed at helping developers make better technical choices in 2019. Here are the four predictions npm has made:

"You will abandon one of your current tools."

In JavaScript, frameworks and tools don't last; they generally enjoy a phase of peak popularity of 3-5 years, followed by a slow decline as developers have to maintain legacy applications but move to newer frameworks for new work. Mr. Voss said in his talk, "Nothing lasts forever!.. Any framework that we see today will have its heyday and then it will have an after-life where it will slowly slowly degrade." For developers, this essentially means that it is better to keep learning new frameworks instead of holding on to their current tools too tightly.

"Despite a slowdown in growth, React will be the dominant framework in 2019."

Though React's growth slowed down in 2018 compared to 2017, it still continues to dominate the web scene: 60% of npm survey respondents said they are using React. In 2019, npm says that more people will use React for building web applications, and as the number of people using it grows, so will the tutorials, advice, and bug fixes around it.

"You'll need to learn GraphQL."

GraphQL client libraries are showing tremendous growth in popularity, and as per npm, GraphQL is going to be a "technical force to reckon with in 2019." It was first publicly released in 2015 and it is still too early to put it into production, but going by its growing popularity, developers are recommended to learn its concepts in 2019. npm also predicts that developers will find themselves using GraphQL in new projects later in the year and in 2020.

"Somebody on your team will bring in TypeScript."

npm's survey uncovered that 46% of the respondents were using Microsoft's TypeScript, a typed superset of JavaScript that compiles to plain JavaScript. One of the reasons for this major adoption by enthusiasts could be the extra safety TypeScript provides through type-checking. Adopting TypeScript in 2019 could prove really useful, especially if you're a member of a larger team.

Read the detailed report and predictions on npm's website.

4 key findings from The State of JavaScript 2018 developer survey
TypeScript 3.2 released with configuration inheritance and more
7 reasons to choose GraphQL APIs over REST for building your APIs