
How-To Tutorials


A really basic guide to batch file programming

Richard Gall
31 May 2018
3 min read
Batch file programming is a way of making a computer do things simply by creating, yes, you guessed it, a batch file. It's a way of doing things you might ordinarily do in the command prompt, but it automates some tasks, which means you don't have to write so much code. If it sounds straightforward, that's because it is, generally. Which is why it's worth learning...

Batch file programming is a good place to start learning how computers work

Of course, if you already know your way around batch files, I'm sure you'll agree it's a good way for someone relatively experienced in software to get to know their machine a little better. If you know someone who you think would get a lot from learning batch file programming, share this short guide with them!

Why would I write a batch script?

There are a number of reasons you might write batch scripts. They're particularly useful for resolving network issues, installing a number of programs on different machines, and even organizing files and folders on your computer. Imagine you have a recurring issue: with a batch file you can solve it quickly and easily wherever you are, without having to write copious lines of code in the command line. Or maybe your desktop simply looks like a mess; with a little knowledge of batch file programming you can clean things up without too much effort.

How to write a batch file

Clearly, batch file programming can make your life a lot easier. Let's take a look at the key steps to begin writing batch scripts.

Step 1: Open your text editor

Batch file programming is really about writing commands, so you'll need your text editor open to begin. Notepad, WordPad, it doesn't matter!

Step 2: Begin writing code

As we've already seen, batch file programming is really about writing commands for your computer. The code is essentially the same as what you would write in the command prompt. Here are a few batch file commands you might want to know to get started:

ipconfig - presents network information such as your IP and MAC address.
start "" [website] - opens a specified website in your browser.
rem - used if you want to make a comment or remark in your code (i.e. for documentation purposes).
pause - as you'd expect, pauses the script so it can be read before it continues.
echo - displays text in the command prompt.
%%a - refers to every file in a given folder.
if - a conditional command.

The list of batch file commands is pretty long. There are plenty of other resources with an exhaustive list of commands you can use, but a good place to begin is this page on Wikipedia.

Step 3: Save your batch file

Once you've written your commands in the text editor, you'll then need to save your document as a batch file. Title it, and suffix it with the .bat extension. You'll also need to make sure "Save as type" is set to "All files".

That's basically it when it comes to batch file programming. Of course, there are some complex things you can do, but once you know the basics, getting into the code is where you can start to experiment.

Read next:
Jupyter and Python scripting
Python Scripting Essentials
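To tie those commands together, here is a small example batch file. It is only a minimal sketch: the website URL and the notes.txt file name are placeholders, so swap in whatever suits your own machine.

@echo off
rem example.bat - a tiny script using the commands covered above

echo Checking network configuration...
ipconfig

rem Open a website in the default browser (placeholder URL)
start "" https://www.packtpub.com

rem List every file in the current folder
for %%a in (*) do echo Found file: %%a

rem Only run this line if a particular file exists
if exist notes.txt echo notes.txt is present

pause

Save it as example.bat, double-click it, and each command runs in order, pausing at the end so you can read the output before the window closes.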


How to call an Azure function from an ASP.NET Core MVC application

Aaron Lazar
03 May 2018
10 min read
In this tutorial, we'll learn how to call an Azure Function from an ASP.NET Core MVC application.

This article is an extract from the book C# 7 and .NET Core Blueprints, authored by Dirk Strauss and Jas Rademeyer. This book is a step-by-step guide that will teach you essential .NET Core and C# concepts with the help of real-world projects.

We will get started with creating an ASP.NET Core MVC application that will call our Azure Function to validate an email address entered into a login screen of the application. This application does no authentication at all; all it is doing is validating the email address entered. ASP.NET Core MVC authentication is a totally different topic and not the focus of this post.

In Visual Studio 2017, create a new project and select ASP.NET Core Web Application from the project templates. Click on the OK button to create the project. This is shown in the following screenshot:

On the next screen, ensure that .NET Core and ASP.NET Core 2.0 are selected from the drop-down options on the form. Select Web Application (Model-View-Controller) as the type of application to create. Don't bother with any kind of authentication or enabling Docker support. Just click on the OK button to create your project.

After your project is created, you will see the familiar project structure in the Solution Explorer of Visual Studio.

Creating the login form

For this next part, we can create a plain and simple vanilla login form. For a little bit of fun, let's spice things up a bit and have a look on the internet for some free login form templates. I decided to use a site called Colorlib that provided 50 free HTML5 and CSS3 login forms in one of their recent blog posts. The URL to the article is: https://colorlib.com/wp/html5-and-css3-login-forms/. I decided to use Login Form 1 by Colorlib from their site.

1. Download the template to your computer and extract the ZIP file. Inside the extracted ZIP file, you will see that we have several folders. Copy all the folders in this extracted ZIP file (leave the index.html file, as we will use this in a minute).

2. Next, go to the solution for your Visual Studio application. In the wwwroot folder, move or delete the contents and paste the folders from the extracted ZIP file into the wwwroot folder of your ASP.NET Core MVC application.

3. Your wwwroot folder should now look as follows:

4. Back in Visual Studio, you will see the folders when you expand the wwwroot node in the CoreMailValidation project.

5. I also want to focus your attention on the Index.cshtml and _Layout.cshtml files. We will be modifying these files next.

Open the Index.cshtml file and remove all the markup (except the section in the curly brackets) from this file. Paste the HTML markup from the index.html file from the ZIP file we extracted earlier. Do not copy all the markup from the index.html file; only copy the markup inside the <body></body> tags.
Your Index.cshtml file should now look as follows:

@{
    ViewData["Title"] = "Login Page";
}

<div class="limiter">
    <div class="container-login100">
        <div class="wrap-login100">
            <div class="login100-pic js-tilt" data-tilt>
                <img src="images/img-01.png" alt="IMG">
            </div>
            <form class="login100-form validate-form">
                <span class="login100-form-title">
                    Member Login
                </span>
                <div class="wrap-input100 validate-input" data-validate="Valid email is required: ex@abc.xyz">
                    <input class="input100" type="text" name="email" placeholder="Email">
                    <span class="focus-input100"></span>
                    <span class="symbol-input100">
                        <i class="fa fa-envelope" aria-hidden="true"></i>
                    </span>
                </div>
                <div class="wrap-input100 validate-input" data-validate="Password is required">
                    <input class="input100" type="password" name="pass" placeholder="Password">
                    <span class="focus-input100"></span>
                    <span class="symbol-input100">
                        <i class="fa fa-lock" aria-hidden="true"></i>
                    </span>
                </div>
                <div class="container-login100-form-btn">
                    <button class="login100-form-btn">
                        Login
                    </button>
                </div>
                <div class="text-center p-t-12">
                    <span class="txt1">Forgot</span>
                    <a class="txt2" href="#">Username / Password?</a>
                </div>
                <div class="text-center p-t-136">
                    <a class="txt2" href="#">
                        Create your Account
                        <i class="fa fa-long-arrow-right m-l-5" aria-hidden="true"></i>
                    </a>
                </div>
            </form>
        </div>
    </div>
</div>

The code for this chapter is available on GitHub.

Next, open the _Layout.cshtml file and add all the links to the folders and files we copied into the wwwroot folder earlier. Use the index.html file for reference. You will notice that the _Layout.cshtml file contains the following piece of code: @RenderBody(). This is a placeholder that specifies where the Index.cshtml file content should be injected. If you are coming from ASP.NET Web Forms, think of the _Layout.cshtml page as a master page. Your _Layout.cshtml markup should look as follows:

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>@ViewData["Title"] - CoreMailValidation</title>
    <link rel="icon" type="image/png" href="~/images/icons/favicon.ico" />
    <link rel="stylesheet" type="text/css" href="~/vendor/bootstrap/css/bootstrap.min.css">
    <link rel="stylesheet" type="text/css" href="~/fonts/font-awesome-4.7.0/css/font-awesome.min.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/animate/animate.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/css-hamburgers/hamburgers.min.css">
    <link rel="stylesheet" type="text/css" href="~/vendor/select2/select2.min.css">
    <link rel="stylesheet" type="text/css" href="~/css/util.css">
    <link rel="stylesheet" type="text/css" href="~/css/main.css">
</head>
<body>
    <div class="container body-content">
        @RenderBody()
        <hr />
        <footer>
            <p>© 2018 - CoreMailValidation</p>
        </footer>
    </div>
    <script src="~/vendor/jquery/jquery-3.2.1.min.js"></script>
    <script src="~/vendor/bootstrap/js/popper.js"></script>
    <script src="~/vendor/bootstrap/js/bootstrap.min.js"></script>
    <script src="~/vendor/select2/select2.min.js"></script>
    <script src="~/vendor/tilt/tilt.jquery.min.js"></script>
    <script>
        $('.js-tilt').tilt({
            scale: 1.1
        })
    </script>
    <script src="~/js/main.js"></script>
    @RenderSection("Scripts", required: false)
</body>
</html>

If everything worked out right, you will see the following page when you run your ASP.NET Core MVC application. The login form is obviously totally non-functional; however, it is totally responsive.
If you reduce the size of your browser window, you will see the form scale as the browser size reduces. This is what you want. If you want to explore the responsive design offered by Bootstrap, head on over to https://getbootstrap.com/ and go through the examples in the documentation.

The next thing we want to do is hook this login form up to our controller and call the Azure Function we created to validate the email address we entered. Let's look at doing that next.

Hooking it all up

To simplify things, we will be creating a model to pass to our controller:

1. Create a new class in the Models folder of your application called LoginModel and click on the Add button.

2. Your project should now look as follows. You will see the model added to the Models folder:

The next thing we want to do is add some code to our model to represent the fields on our login form. Add two properties called Email and Password:

namespace CoreMailValidation.Models
{
    public class LoginModel
    {
        public string Email { get; set; }
        public string Password { get; set; }
    }
}

Back in the Index.cshtml view, add the model declaration to the top of the page. This makes the model available for use in our view. Take care to specify the correct namespace where the model exists:

@model CoreMailValidation.Models.LoginModel
@{
    ViewData["Title"] = "Login Page";
}

The next portion of code needs to be written in the HomeController.cs file. Currently, it should only have an action called Index():

public IActionResult Index()
{
    return View();
}

Add a new async function called ValidateEmail that will use the base URL and parameter string of the Azure Function URL we copied earlier and call it using an HTTP request. I will not go into much detail here, as I believe the code to be pretty straightforward. All we are doing is calling the Azure Function using the URL we copied earlier and reading the return data:

private async Task<string> ValidateEmail(string emailToValidate)
{
    string azureBaseUrl = "https://core-mail-validation.azurewebsites.net/api/HttpTriggerCSharp1";
    string urlQueryStringParams = $"?code=/IS4OJ3T46quiRzUJTxaGFenTeIVXyyOdtBFGasW9dUZ0snmoQfWoQ==&email={emailToValidate}";

    using (HttpClient client = new HttpClient())
    {
        using (HttpResponseMessage res = await client.GetAsync($"{azureBaseUrl}{urlQueryStringParams}"))
        {
            using (HttpContent content = res.Content)
            {
                string data = await content.ReadAsStringAsync();
                if (data != null)
                {
                    return data;
                }
                else
                    return "";
            }
        }
    }
}

Create another public async action called ValidateLogin. Inside the action, check to see if the ModelState is valid before continuing. For a nice explanation of what ModelState is, have a look at the following article: https://www.exceptionnotfound.net/asp-net-mvc-demystified-modelstate/. We then do an await on the ValidateEmail function, and if the return data contains the word false, we know that the email validation failed. A failure message is then passed to the TempData property on the controller.

The TempData property is a place to store data until it is read. It is exposed on the controller by ASP.NET Core MVC. The TempData property uses a cookie-based provider by default in ASP.NET Core 2.0 to store the data. To examine data inside the TempData property without deleting it, you can use the Keep and Peek methods. To read more on TempData, see the Microsoft documentation here: https://docs.microsoft.com/en-us/aspnet/core/fundamentals/app-state?tabs=aspnetcore2x.
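The book shows the full ValidateLogin implementation; as a rough sketch based only on the description above (the redirect targets and the TempData key are assumptions for illustration, not taken from the excerpt), the action inside HomeController could look something like this:

public async Task<IActionResult> ValidateLogin(LoginModel model)
{
    if (ModelState.IsValid)
    {
        // Call the Azure Function via the ValidateEmail helper we wrote above
        string azureReturn = await ValidateEmail(model.Email);

        if (azureReturn.Contains("false"))
        {
            // Validation failed - pass a message back via TempData
            TempData["message"] = "The email address entered is invalid.";
            return RedirectToAction("Index", "Home");
        }

        // The email address is valid; a real application would authenticate here
        return RedirectToAction("Index", "Home");
    }

    return View("Index", model);
}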
If the email validation passed, then we know that the email address is valid and we can do something else. Here, we are simply saying that the user is logged in; in reality, you would perform some sort of authentication here and then route to the correct controller.

So now you know how to call an Azure Function from an ASP.NET Core MVC application. If you found this tutorial helpful and you'd like to learn more, go ahead and pick up the book C# 7 and .NET Core Blueprints.

Read next:
What is ASP.NET Core?
Why ASP.NET makes building apps for mobile and web easy
How to dockerize an ASP.NET Core application


Google employees join hands with Amnesty International urging Google to drop Project Dragonfly

Sugandha Lahoti
28 Nov 2018
3 min read
Yesterday, Google employees signed a petition protesting Google's infamous Project Dragonfly. "We are Google employees and we join Amnesty International in calling on Google to cancel project Dragonfly", they wrote in a post on Medium. The petition also marks the first time that over 300 Google employees (at the time of writing this post) have used their actual names in a public document.

Project Dragonfly is the secretive search engine that Google is allegedly developing, which will comply with Chinese censorship rules. It has been on the receiving end of constant backlash from various human rights organizations and investigative reporters since it was revealed earlier this year. On Monday, it also faced criticism from the human rights organization Amnesty International. Amnesty launched a petition opposing the project and coordinated protests outside Google offices around the world, including San Francisco, Berlin, Toronto, and London.

https://twitter.com/amnesty/status/1067488964167327744

Yesterday, Google employees joined Amnesty and wrote an open letter to the firm: "We are protesting against Google's effort to create a censored search engine for the Chinese market that enables state surveillance. Our opposition to Dragonfly is not about China: we object to technologies that aid the powerful in oppressing the vulnerable, wherever they may be. Dragonfly in China would establish a dangerous precedent at a volatile political moment, one that would make it harder for Google to deny other countries similar concessions. Dragonfly would also enable censorship and government-directed disinformation, and destabilize the ground truth on which popular deliberation and dissent rely."

Employees have expressed their disdain over Google's decision, calling it a money-minting business. They have also highlighted Google's previous disappointments, including Project Maven, Dragonfly, and Google's support for abusers, and believe that "Google is no longer willing to place its values above its profits. This is why we're taking a stand."

A Google spokesperson redirected to the company's previous response on the topic: "We've been investing for many years to help Chinese users, from developing Android, through mobile apps such as Google Translate and Files Go, and our developer tools. But our work on search has been exploratory, and we are not close to launching a search product in China."

Twitterati have openly sided with Google employees in this matter.

https://twitter.com/Davidramli/status/1067582476262957057
https://twitter.com/shabirgilkar/status/1067642235724972032
https://twitter.com/nrambeck/status/1067517570276868097
https://twitter.com/kuminaidoo/status/1067468708291985408

Read next:
OK Google, why are you ok with mut(at)ing your ethos for Project DragonFly?
Amnesty International takes on Google over Chinese censored search engine, Project Dragonfly
Google's prototype Chinese search engine 'Dragonfly' reportedly links searches to phone numbers


Django 3.0 is going async!

Bhagyashree R
23 Jul 2019
4 min read
Last year, Andrew Godwin, a Django contributor, formulated a roadmap to bring async functionality into Django. After a lot of discussion and amendments, the Django Technical Board approved his DEP 0009: Async-capable Django yesterday. Godwin wrote in a Google group, "After a long and involved vote, I can announce that the Technical Board has voted in favour of DEP 0009 (Async Django), and so the DEP has been moved to the 'accepted' state."

The reason why Godwin thinks this is the right time to bring async-native support to Django is that, starting from version 2.1, it supports Python 3.5 and up. These Python versions have async def and similar native support for coroutines. Also, the web is now slowly shifting to use cases that prefer high-concurrency workloads and large parallelizable queries.

The motivation behind Async in Django

The Django Enhancement Proposal (DEP) 0009 aims to address one of the core flaws in Python: inefficient threading. Python is not considered to be a perfect asynchronous language. Its asyncio library for writing concurrent code suffers from some core design flaws. There are alternative async frameworks for Python, but they are incompatible with one another. Django Channels brought some async support to Django, but it primarily focuses on WebSocket handling.

Explaining the motivation, the DEP says, "At the same time, it's important we have a plan that delivers our users immediate benefits, rather than attempting to write a whole new Django-size framework that is natively asynchronous from the start."

Additionally, most developers are unacquainted with developing Python applications that have async support. There is also a lack of proper documentation, tutorials, and tooling to help them. Godwin believes that Django can become a "good catalyst" to help create guidance documentation.

Goals this DEP outlines to achieve

The DEP proposes to bring support for asynchronous Python into Django while maintaining synchronous Python support as well, in a backward-compatible way. Here are its end goals, as Godwin listed in his roadmap:

Making the blocking parts of Django, such as sessions, auth, the ORM, and handlers, natively asynchronous, with a synchronous wrapper exposed on top where needed to ensure backward compatibility.
Keeping the familiar models/views/templates/middleware layout intact, with very few changes.
Ensuring that these updates do not compromise speed or cause significant performance regressions at any stage of this plan.
Enabling developers to write fully-async websites if they want to, but not enforcing this as the default way of writing websites.
Welcoming new talent into the Django team to help out on large-scale features.

Timeline to achieve these goals

Godwin, in his "A Django Async Roadmap", shared the following timeline (Django version, followed by the planned updates):

2.1 - Current in-progress release. No async work.
2.2 - Initial work to add async ORM and view capability, but everything defaults to sync, and async support is mostly threadpool-based.
3.0 - Rewrite the internal request handling stack to be entirely asynchronous; add async middleware, forms, caching, sessions, and auth. Start the deprecation process for any APIs that are becoming async-only.
3.1 - Continue improving async support; potential async templating changes.
3.2 - Finish the deprecation process and have a mostly-async Django.

Godwin posted a summary of the discussion he had with the Django Technical Board in the Google Group.
Some of the queries they raised were how the team plans to distinguish async versions of functions/methods from sync ones, how this implementation will ensure that there is no performance hit if the user opts out of async mode, and more. In addition to these technical queries, the board also raised a non-technical concern: "The Django project has lost many contributors over the years, is essentially in a maintenance mode, and we likely do not have the people to staff a project like this."

Godwin sees a massive opportunity lurking in this fundamental challenge: namely, to revive the Django project. He adds, "I agree with the observation that things have substantially slowed down, but I personally believe that a project like async is exactly what Django needs to get going again. There's now a large amount of fertile ground to change and update things that aren't just fixing five-year-old bugs."

Read the DEP 0009: Async-capable Django to know more in detail. A short illustrative sketch of what async view code could eventually look like under this plan follows after the links below.

Read next:
Which Python framework is best for building RESTful APIs? Django or Flask?
Django 2.2 is now out with classes for custom database constraints
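As flagged above, here is a rough sketch of the kind of view code this roadmap points toward. It is illustrative only: when this article was written Django's async APIs were still being designed (async views shipped in a later release), so the view below leans only on Python's native async def syntax and asyncio, and the endpoint and service names are invented for the example.

import asyncio

from django.http import JsonResponse


async def health_check(request):
    # Simulate two slow, independent I/O calls (e.g. pinging backing services)
    async def ping(service, delay):
        await asyncio.sleep(delay)
        return {"service": service, "status": "ok"}

    # Run both checks concurrently instead of one after the other
    results = await asyncio.gather(ping("database", 0.1), ping("cache", 0.1))
    return JsonResponse({"services": results})

The interesting part is the last two lines: because the view itself is a coroutine, independent pieces of I/O can be awaited concurrently, which is exactly the "high concurrency workloads" use case the DEP calls out.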


Techniques and Practices of Game AI

Packt
14 Jan 2016
10 min read
In this article by Peter L Newton, author of the book Learning Unreal AI Programming, we will understand the fundamental techniques and practices of game AI. This will be the building block to developing an amazing and interesting game AI. (For more resources related to this topic, see here.) Navigation While all the following components aren't necessary to achieve AI navigation, they all contribute critical feedback that can affect navigation. Navigating within a world is limited only by the pathways within the game. Navigation for AI is built up of the following things: Path following (path nodes): Another solution similar to NavMesh, path nodes can designate the space in which the AI traverses. Navigation mesh: Using tools such as Navigation Mesh, also known as NavMesh, you can designate areas in which the AI can traverse. NavMesh generates a plot of grids that is used to calculate the path and cost during navigation. It's important to know that this is only one of several pathfinding techniques available; we use it because it works well in this demonstration. Behavior trees: Using behavior trees to influence your AI's next destination can create a more interesting player experience. It not only calculates its requested destination, but also decides whether it should enter the screen with a cartwheel double backflip, no hands or try the triple somersault to jazz hands. Steering behaviors: Steering behaviors affect the way the AI moves while navigating to avoid obstacles. This also means using steering to create formations with your fleets that you have set to attack the king's wall. Steering can be used in many ways to influence the movement of the character. Sensory systems: Sensory systems can provide critical details, such as players nearby, sound levels, cover nearby, and many other variables of the environment that can alter movement. It's critical that your AI understands the changing environment so that it doesn't break the illusion of being a real opponent. Achieving realistic movement with steering When you think of what steering does for a car, you would be right to imagine that the same idea is applied to game AI navigation. Steering influences the movement of AI elements as they traverse to their next destination. The influences can be supplied as necessary, but we will go over the most commonly used ones. Avoidance is used essentially to avoid colliding with oncoming AI. Flocking is another key factor in steering; you commonly see an example of it while watching a school of fish. This phenomenon, known as flocking, is useful in simulating interesting group movement; simulate a complete panic or a school of fish. The goal of steering behaviors is to achieve realistic movement behavior within the player's world. Creating character with randomness and probability AI with character is what randomness and probability adds to the bots decision making. If a bot attacked you the same way, always entered the scene the same way, and annoyed you with its laugh after every successful hit, it wouldn't make for a unique experience—the AI always does the same thing. By using randomness and probability, you can instead make the AI laugh based on probability or introduce randomness to the AI's skill of choice. Another great by-product of applying randomness and probability is that it allows you to introduce levels of difficulty. You can lower the chance of missing the skill cast or even allow the bots to aim more precisely. 
If you have bots who wander around looking for enemies, their next destination can be randomly chosen. Creating complex decision making with behavior trees Finite State Machines (FSM) allow your bot to perform transitions between states. This allows it to go from wandering to hunting and then to killing. Behavior trees are similar but allow more flexibility. Behavior trees allow hierarchical FSM, which introduces another layer of decisions. So, the bot decides between branches of behaviors that define the state it is in. There is a tool provided by UE4 called Behavior Tree. Its editor tool allows you to modify AI behavior quickly and with ease. The following sections show the components found within UE4's Behavior Tree. Root This node is the starting node that sends the signal to the next node in the tree. This would connect to a composite that begins your first tree. What you may notice is that you are required to use a composite first to define a tree and then create the task for that tree. This is because a hierarchical FSM creates branches of states. These states will be populated with other states or tasks. This allows easy transitions between multiple states. Decorators This node creates another task, which you can add on top of the node as a "decoration". This could be, for example, a Force Success decorator when using a sequence composite or using a loop to have a node's actions repeated a number of times. I used a decorator in the AI we will make that tells it to update to the next available route. Consider the following screenshot: In the preceding screenshot, you see the Attack & Destroy decorator at the top of the composite, which defines the state. This state includes two tasks, Attack Enemy and Move To Enemy, the latter of which also has a decorator telling it to execute only when the bot state is searching. Composites These are the starting points of the states. They define how the state will behave with returns and execution flow. There is a Selector in our example that will execute each of its children from left to right and doesn't fail but returns success when one of its children returns success. Therefore, this is good for a state that doesn't check for successfully executed nodes. The Sequence executes its children in a similar fashion to the Selector, but returns a fail message when one of its children returns fail. This means that it's required that the nodes return a success message to complete the sequence. Last but not least is Simple Parallel. This allows you to execute a task and a tree at essentially the same time. This is great for creating a state that will require another task to always be called. So, to set it up, you first need to connect it to a task that it will execute. The second task or state that is connected continues to be called with the first task until the first task returns a success message. Services Services run as long as the composite that it is added to stays activated. They tick on the intervals that you set within the properties. They have another float property that allows you to create deviations in the tick intervals. Services are used to modify the state of the AI in most cases, because it's always called. For example, in the bot that we will create, we add a service to the first branch of the tree so that it's called without interruption, thus being able to maintain the state that the bot should be in at any given movement. 
This service, called Detect Enemy, actually runs a deviating cycle that updates Blackboard variables, such as State and EnemyActor: Tasks Tasks do the dirty work and report with a success or failed message if necessary. They have two nodes, which you'll use most often when working with a task: Event Receive Execute, which receives the signal to execute the connected scripts, and Finish Execute, which sends the signal back, returning a true or false message on success. This is important when making a task meant for the Sequence composite. Blackboards Blackboards are used to store variables within the behavior tree of the AI. In our example, we store an enumeration variable, State, to store the state, TargetPoint to hold the currently targeted enemy, and Route, which stores the current route position the AI has been requested to travel to, just to name a few. Blackboards work just by setting a public variable of a node to one of the available Blackboard variables in the drop-down menu. The naming convention shown in the following screenshot makes this process streamlined: Sensory system Creating a sensory system is heavily based on the environment where the AI will be fighting the player. It will need to be able to find cover, evade the enemy, get ammo, and other features that you feel will create an immersive AI for your game. Games with AI that challenges the player create a unique individual experience. A good sensory system contributes critical information, which makes for reactive AI. In this project, we use the sensory system to detect pawns that the AI can see. We also use functions to check for the line of sight of the enemy. We check whether there is another pawn in our path. We can check for cover and other resources within the area. Machine learning Machine learning is a branch of its own. This technique allows AI to learn from situations and simulations. The inputs are from the environment, including the context in which the bot allows it to make decisive actions. In machine learning, the inputs are put within a classifier, which can predict a set of outputs with a certain level of certainty. Classifiers can be combined into ensembles to increase the accuracy of the probabilistic prediction. We don't dig heavily into this subject, but I will provide some material for those interested. Tracing Tracing allows another actor within the world to detect objects by ray tracing. A single line trace is sent out, and if it collides with an actor, the actor is returned, including the information about the impact. Tracing is used for many reasons. One way it is used in FPS games is to detect hits. Are you familiar with the hit box? When your player shoots in a game, a trace is shot out that collides with the opponent's hit box, determining the damage to your opponent and, if you're skillful enough, resulting in their death. There are other shapes available for traces, such as spheres, capsules, and boxes, which allow tracing for different situations. Recently, I used the box trace for my car in order to detect objects near it. Influence mapping Influence mapping isn't a finite approach; it's the idea that specific locations on the map would contribute information that directly influences the player or AI. An example when using influence mapping with AI is presence falloff. Say we have enemy AI in a group. Their presence map would create a radial circle around the group with an intensity based on the size of the group. 
This way, other AI elements know that on entering this area, they're entering a zone occupied by enemy AI. Practical information isn't the only thing people use this for, so just understand that it's meant to provide another level of input to help your bot make additional decisions.

Summary

In this article, we saw the fundamental techniques and practices of game AI. We saw how to implement navigation, achieve realistic movement of AI elements, and create characters with randomness in order to achieve a sense of realism. We also looked at behavior trees and all their constituent elements. Further, we touched upon some aspects related to AI, such as machine learning and tracing.

Further resources on this subject:
Overview of Unreal Engine 4
The Unreal Engine
Creating weapons for your game using UnrealScript


Implement an API Design-first approach for building APIs [Tutorial]

Packt Editorial Staff
15 Jun 2018
9 min read
The Monster Records & Associates (MRA) –a fictional music records company, having realised that its biggest asset is in fact its data, embarked on a digital transformation with the aim to offer its product and offerings completely online and via APIs. This article is an excerpt taken from the book Implementing Oracle API Platform Cloud Service, written  by Andy Bell, Sander Rensen, Luis Weir, Phil Wilkins. In this post we are is going to take  you through an interesting MRA case study who adopted an API design-first approach for building its APIs. We will go through the process and steps performed by MRA for this implementation. The Problem Scenario MRA had embarked on a digital transformation journey with the objective to become a digital organisation capable of offering tailored (à la carte) offerings to artists such as handling of an artist’s online presence to on-demand distribution of an artist's digital media to Music Streaming Services such as Spotify, Apple Music, Google Play Music, Amazon Prime Music, Pandora, Deezer to name a few. Having fully acknowledged that their most valuable asset is in fact their media data, MRA wanted to materialise in such assets and determined that the quickest and most effective way to achieve this was by exposing a public API capable of providing access, on-demand, to MRA's Media Catalogue assets such as artists, songs and albums. Figure 1: MRA's Media Catalogue API The idea being, once such assets became accessible via an API, streaming services could, on-demand and 24x7, explore MRA's repertoire, purchase rights-to-use and start streaming. In addition, the API could also open the door to a brand new global audience: millions of app developers constantly innovating. If only a fraction of such a huge audience leveraged MRA's Media Catalogue API, it would still represent a considerable success for MRA. However, as with everything, there is a challenge to realise such vision. MRA like many other organizations, had a level of experience with systems integrations and Service Oriented Architectures (SOA). One of the lessons learnt from SOA however was that the cycles for designing, building, prototyping, and testing SOAP-based Web Services could be quite lengthy and expensive. An API differentiates from a service in that the former represents the RESTful interface a consumer application interacts with, whereas the latter is the actual implementation (the code) behind an API. A HTTP endpoint exposed by a service is defined as an unmanaged API. When a service endpoint is accessed via an API Gateway where policies such as app-key validation, authentication/authorization and other policies are enforced, then it becomes a managed API. The book, Implementing Oracle API Platform Cloud Service, refers to managed APIs as simply APIs and unmanaged APIs as simply service endpoints. Especially when it came to capturing and accommodating the feedback from Client Application Developers (API consumers), MRA had very bad experiences as in the majority of occasions they came to realize very late in the software lifecycle that the Web Service developed did not meet the expectations of its consumers. Figure 2: feedback-loops in traditional web service design Refactoring web services in this approach wasn't just time consuming but also an expensive exercise as both the Service design (WSDL) and code had to be refactored and re-tested in order to accommodate the feedback received and before application developers could try a service again. 
Naturally service designers and developers avoided as much as possible making changes, thus challenging feedback received from application developers, which in turn created friction amongst both teams but in some occasions meant application developers finding alternative routes to solve their needs rather than using the web service. This was the worst possible scenario as it meant that the investments made in implementing a web service could've been wasted. API design-first process Learning from experiences and acknowledging the challenges that such waterfall like process imposed to a digital transformation initiative, MRA were quite keen to adopt a more agile, interactive but also quicker way to deliver modern RESTful based APIs. The idea was clear. By engaging application developers (API consumers) in the initial stages of the design process, feedback would be captured and reflected back in the interface design (API) early as well. Not only this would shorten feedback loops, but ensure that once the underlying services are implemented, it would expose an interface already endorsed and tested by its consumers, as opposed to risk building a service that won't satisfy the client expectations and needs late in the process. Figure 3: API design-first approach vs traditional service design The implication of this approach though, is that the tooling and notation to define the API, had to be both simple, yet rich in capability such as the task of designing and mocking API endpoints is quick and easy, given that if the process becomes cumbersome it would defeat its purpose. We elaborate on the different steps undertaken by MRA when designing its Media Catalogue API using Apiary and related tools in the book, Implementing Oracle API Platform Cloud Service. Here are the steps: Defining the API type Defining the API’s domain semantics Creating the API definition with its main resources Trying the API mock Defining MSON Data Structures Pushing the API Blueprint to GitHub Publishing the API mock in Oracle API Platform CS Setting up Dredd for continuous testing of API endpoints against the API blueprint Defining the API type: A fundamental step when designing any API is to first define what the type is. This is important as it will determine the guiding principles to consider when doing the design. We have three types of APIs: Single-Purpose APIs: These are APIs that serve a unique and specific purpose, typically derived from an unambiguous need associated with a user journey or use case. Multi-Purpose APIs: These APIs are more generic in nature and are meant to satisfy not just one but multiple use cases and scenarios. They are not bound (coupled) to a specific user journey or system of engagement (for example, a mobile app) therefore ideal for reuse enterprise-wide. MRA’s Media Catalogue API: MRA's Media Catalogue API was specifically targeted at two main audiences: Music Streaming Services and Application Developers in general. Therefore, the API had to be both Public and Multi-Purpose. Defining the API’s domain semantics: This step elaborates proper understanding of the API's bounded context, Media Catalogue. To do so, entities, key attributes, and relationships within the bounded context itself were identified and also defined using semantics appropriate for the purpose of the API. 
Creating the API definition with its main resources: This step shows how to create an API and define its main resources, parameters, and sample payloads.It involves steps followed by MRA when creating the Media Catalogue API definition and its associated API mock. Trying the API mock: This part describes how Apiary's automatically generated API mocks can be used to satisfy one of the most important steps in API design-first: try an API early in the lifecycle, before the API is actually implemented. This is a critical step as collecting feedback from API consumers early can potentially save numerous hours in code refactoring later in the project. Defining MSON Data Structures: The Markdown Syntax for Object Notation (MSON) is a plain-text syntax for the description and validation of Data Structures in API Blueprint. It provides a way to represent objects (for example, an artist) in a human-readable plain text form. This part involves steps to define the Artist, Album and Song objects using the MSON notation. Pushing the API Blueprint to GitHub: API Blueprints can be pushed into GitHub repositories, so they can be version controlled but most importantly it can follow a similar GitHub cycle as any other code asset. This step is also required in order to configure Dredd to validate API endpoints against API blueprint definitions. Publishing the API mock in Oracle API Platform CS: Although Apiary provides an API mock URL that can be can be accessed directly, it is recommended that instead, the API mock is published and accessed via the Oracle API Platform Cloud Service. Setting up Dredd for continuous testing of API endpoints against the API blueprint: The last step of the API design-first process is to configure Dredd to continuously validate that an API endpoint exposed through the API Gateway is always compliant with its corresponding API Blueprint definition. The idea is to ensure that Client Application code is not broken once an API Policy is changed to point to a Backend Service once it has been built and deployed. We discussed the API design-first approach for building its APIs. MRA's business scenario demanded the need for more efficient and leaner process for implementing APIs. We saw how an API design-first process could effectively help organizations such as MRA gain greater speed, agility, and efficiencies. Here’s a summary of the basic steps to realize such process. Choose your API type: We introduced the conceptual concepts such as Single Purpose and Multi-Purpose APIs to decide on what type of API to adopt. Define your APIs: The need for creating an API definition and an API mock in Apiary based on API Blueprints and the Markdown Syntax for Object Notation (MSON). Create & publish API: Creation and publication of an API using the Oracle API Platform Cloud Service. Continuously test: Finally, the configuration of Dredd to verify API endpoints compliance with the API definition. You just enjoyed an excerpt from the book Implementing Oracle API Platform Cloud Services. Grab the latest edition of this book to work with the newest Oracle APIs, and interface with an increasingly complex array of services your clients want. What are the best programming languages for building APIs? Glancing at the Fintech growth story – Powered by ML, AI & APIs What RESTful APIs can do for Cloud, IoT, social media and other emerging technologies  
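As a small illustration of the "Defining MSON Data Structures" step described above, a Data Structures section for the Media Catalogue could look roughly like the fragment below. The attribute names and sample values are invented for illustration and are not taken from the book; the point is simply that MSON lets you describe the Artist, Album and Song objects in plain, human-readable text inside the API Blueprint.

# Data Structures

## Artist (object)
+ id: 101 (number, required) - Internal catalogue identifier
+ name: `Sample Artist` (string, required)
+ genre: Soul (string)
+ albums (array[Album]) - Albums by this artist in the Media Catalogue

## Album (object)
+ title: `Sample Album` (string, required)
+ releaseYear: 2005 (number)
+ songs (array[Song])

## Song (object)
+ title: `Sample Song` (string, required)
+ durationSeconds: 147 (number)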

3 programming languages some people think are dead but definitely aren’t

Richard Gall
24 Oct 2019
11 min read
Recently I looked closely at what it really means when a certain programming language, tool, or trend is declared to be ‘dead’. It seems, I argued, that talking about death in respect of different aspects of the tech industry is as much a signal about one’s identity and values as a developer as it is an accurate description of a particular ‘thing’s’ reality. To focus on how these debates and conversations play out in practice I decided to take a look at 3 programming languages, each of which has been described as dead or dying at some point. What I found might not surprise you, but it nevertheless highlights that the different opinions a certain person or community has about a language reflects their needs and challenges as software engineers. Is Java dead? One of the biggest areas of debate in terms of living, thriving or dying, is Java. There are a number of reasons for this. The biggest is the simple fact that it’s so widely used. With so many developers using the language for a huge range of reasons, it’s not surprising to find such a diversity of opinion across its developer community. Another reason is that Java is so well-established as a programming language. Although it’s a matter of debate whether it’s declining or dying, it certainly can’t be said to be emerging or growing at any significant pace. Java is part of the industry mainstream now. You’d think that might mean it’s holding up. But when you consider that this is an industry that doesn’t just embrace change and innovation, but one that depends on it for its value, you can begin to see that Java has occupied a slightly odd space for some time. Why do people think Java is dead? Java has been on the decline for a number of years. If you look at the TIOBE index from the mid to late part of this decade it has been losing percentage points. From May 2016 to May 2017, for example, the language declined 6% - this indicates that it’s losing mindshare to other languages. A further reason for its decline is the rise of Kotlin. Although Java has for a long time been the defining language of Android development, in recent years its reputation has taken a hit as Kotlin has become more widely adopted. As this Medium article from 2018 argues, it’s not necessarily a great idea to start a new Android project with Java. The threat to Java isn’t only coming from Kotlin - it’s coming from Scala too. Scala is another language based on the JVM (Java Virtual Machine). It supports both object oriented and functional programming, offering many performance advantages over Java, and is being used for a wide range of use cases - from machine learning to application development. Reasons why Java isn’t dead Although the TIOBE index has shown Java to be a language in decline, it nevertheless remains comfortably at the top of the table. It might have dropped significantly between 2016 and 2017, but more recently its decline has slowed: it has dropped only 0.92% between October 2018 and October 2019. From this perspective, it’s simply bizarre to suggest that Java is ‘dead’ or ‘dying’: it’s de facto the most widely used programming language on the planet. When you factor in everything else that that entails - the massive community means more support, an extensive ecosystem of frameworks, libraries and other tools (note Spring Boot’s growth as a response to the microservice revolution). So, while Java’s age might seem like a mark against it, it’s also a reason why there’s still a lot of life in it. 
At a more basic level, Java is ubiquitous; it’s used inside a massive range of applications. Insofar as it’s inside live apps it’s alive. That means Java developers will be in demand for a long time yet. The verdict: is Java dead or alive? Java is very much alive and well. But there are caveats: ultimately, it’s not a language that’s going to help you solve problems in creative or innovative ways. It will allow you to build things and get projects off the ground, but it’s arguably a solid foundation on which you will need to build more niche expertise and specialisation to be a really successful engineer. Is JavaScript dead? Although Java might be the most widely used programming language in the world, JavaScript is another ubiquitous language that incites a diverse range of opinions and debate. One of the reasons for this is that some people seriously hate JavaScript. The consensus on Java is a low level murmur of ‘it’s fine’, but with JavaScript things are far more erratic. This is largely because of JavaScript’s evolution. For a long time it was playing second fiddle to PHP in the web development arena because it was so unstable - it was treated with a kind of stigma as if it weren’t a ‘real language.’ Over time that changed, thanks largely to HTML5 and improved ES6 standards, but there are still many quirks that developers don’t like. In particular, JavaScript isn’t a nice thing to grapple with if you’re used to, say, Java or C. Unlike those languages its an interpreted not a compiled programming language. So, why do people think it’s dead? Why do people think JavaScript is dead? There are a number of very different reasons why people argue that JavaScript is dead. On the one hand, the rise of templates, and out of the box CMS and eCommerce solutions mean the use of JavaScript for ‘traditional’ web development will become less important. Essentially, the thinking goes, the barrier to entry is lower, which means there will be fewer people using JavaScript for web development. On the other hand people look at the emergence of Web Assembly as the death knell for JavaScript. Web Assembly (or Wasm) is “a binary instruction format for a stack-based virtual machine” (that’s from the project’s website), which means that code can be compiled into a binary format that can be read by a browser. This means you can bring high level languages such as Rust to the browser. To a certain extent, then, you’d think that Web Assembly would lead to the growth of languages that at the moment feel quite niche. Read next: Introducing Woz, a Progressive WebAssembly Application (PWA + Web Assembly) generator written entirely in Rust Reasons why JavaScript isn’t dead First, let’s counter the arguments above: in the first instance, out of the box solutions are never going to replace web developers. Someone needs to build those products, and even if organizations choose to use them, JavaScript is still a valuable language for customizing and reshaping purpose-built solutions. While the barrier to entry to getting a web project up and running might be getting lower, it’s certainly not going to kill JavaScript. Indeed, you could even argue that the pool is growing as you have people starting to pick up some of the basic elements of the web. On the Web Assembly issue: this is a slightly more serious threat to JavaScript, but it’s important to remember that Web Assembly was never designed to simply ape the existing JavaScript use case. 
As this useful article explains: “...They solve two different issues: JavaScript adds basic interactivity to the web and DOM while WebAssembly adds the ability to have a robust graphical engine on the web. WebAssembly doesn’t solve the same issues that JavaScript does because it has no knowledge of the DOM. Until it does, there’s no way it could replace JavaScript.” Web Assembly might even renew faith in JavaScript. By tackling some of the problems that many developers complain about, it means the language can be used for problems it is better suited to solve. But aside from all that, there are a wealth of other reasons that JavaScript is far from dead. React continues to grow in popularity, as does Node.js - the latter in particular is influential in how it has expanded what’s possible with the language, moving from the browser to the server. The verdict: Is JavaScript dead or alive? JavaScript is very much alive and well, however much people hate it. With such a wide ecosystem of tools surrounding it, the way that it’s used might change, but the language is here to stay and has a bright future. Is C dead? C is one of the oldest programming languages around (it’s approaching its 50th birthday). It’s a language that has helped build the foundations of the software world as we know it today, including just about every operating system. But although it’s a fundamental part of the technology landscape, there are murmurs that it’s just not up to the job any more… Why do people think that C is dead? If you want to get a sense of the division of opinion around C you could do a lot worse than this article on TechCrunch. “C is no longer suitable for this world which C has built,” explains engineer Jon Evans. “C has become a monster. It gives its users far too much artillery with which to shoot their feet off. Copious experience has taught us all, the hard way, that it is very difficult, verging on ‘basically impossible,’ to write extensive amounts of C code that is not riddled with security holes.” The security concerns are reflected elsewhere, with one writer arguing that “no one is creating new unsafe languages. It’s not plausible to say that this is because C and C++ are perfect; even the staunchest proponent knows that they have many flaws. The reason that people are not creating new unsafe languages is that there is no demand. The future is safe languages.” Added to these concerns is the rise of Rust - it could, some argue, be an alternative to C (and C++) for lower level systems programming that is more modern, safer and easier to use. Reasons why C isn’t dead Perhaps the most obvious reason why C isn’t dead is the fact that it’s so integral to so much software that we use today. We’re not just talking about your standard legacy systems; C is inside the operating systems that allow us to interface with software and machines. One of the arguments often made against C is that ‘the web is taking over’, as if software in general is moving up levels of abstraction that make languages at a machine level all but redundant. Aside from that argument being plain stupid (ie. what’s the web built on?), with IoT and embedded computing growing at a rapid rate, it’s only going to make C more important. To return to our good friend the TIOBE Index: C is in second place, the same position it held in October 2018. Like Java, then, it’s holding its own in spite of rumors. Unlike Java, moreover, C’s rating has actually increased over the course of a year. 
Not a massive amount, admittedly (0.82%), but a solid performance that suggests it's a long way from dead.

Read next: Why does the C programming language refuse to die?

The verdict: Is C dead or alive?

C is very much alive and well. It's old, sure, but it's buried inside too much of our existing software infrastructure for it to simply be cast aside. This isn't to say it's without flaws; from a security and accessibility perspective, we're likely to see languages like Rust gradually grow in popularity to tackle some of the challenges that C poses. But an equally important point to consider is just how fundamental C is for people who want to really understand programming in depth. Even if it doesn't necessarily have a wide range of use cases, the fact that it can give developers and engineers an insight into how code works at various levels of the software stack means it will always remain a language that demands attention.

Conclusion: Listen to multiple perspectives on programming languages before making a judgement

The obvious conclusion to draw from all this is that people should just stop being so damn opinionated. But I don't actually think that's correct: people should keep being opinionated and argumentative. There's no place for snobbery or exclusion, but anyone who has a view on something's value should certainly express it. It helps other people understand the language in a way that's not possible through documentation or more typical learning content. What's important is that we read opinions with a critical eye: what's this person's agenda? What's their background? What are they trying to do? After all, there are things far more important than whether something is dead or alive; building great software we can be proud of is one of them.

article-image-dynamic-graphics
Packt
22 Feb 2016
64 min read
Save for later

Dynamic Graphics

There is no question that the rendering system of modern graphics devices is complicated. Even rendering a single triangle to the screen engages many of these components, since GPUs are designed for large amounts of parallelism, as opposed to CPUs, which are designed to handle virtually any computational scenario. Modern graphics rendering is a high-speed dance of processing and memory management that spans software, hardware, multiple memory spaces, multiple languages, multiple processors, multiple processor types, and a large number of special-case features that can be thrown into the mix. To make matters worse, every graphics situation we will come across is different in its own way. Running the same application against a different device, even by the same manufacturer, often results in an apples-versus-oranges comparison due to the different capabilities and functionality they provide. It can be difficult to determine where a bottleneck resides within such a complex chain of devices and systems, and it can take a lifetime of industry work in 3D graphics to have a strong intuition about the source of performance issues in modern graphics systems. Thankfully, Profiling comes to the rescue once again. If we can gather data about each component, use multiple performance metrics for comparison, and tweak our Scenes to see how different graphics features affect their behavior, then we should have sufficient evidence to find the root cause of the issue and make appropriate changes. So in this article, you will learn how to gather the right data, dig just deep enough into the graphics system to find the true source of the problem, and explore various solutions to work around a given problem. There are many more topics to cover when it comes to improving rendering performance, so in this article we will begin with some general techniques on how to determine whether our rendering is limited by the CPU or by the GPU, and what we can do about either case. We will discuss optimization techniques such as Occlusion Culling and Level of Detail (LOD) and provide some useful advice on Shader optimization, as well as large-scale rendering features such as lighting and shadows. Finally, since mobile devices are a common target for Unity projects, we will also cover some techniques that may help improve performance on limited hardware. (For more resources related to this topic, see here.) Profiling rendering issues Poor rendering performance can manifest itself in a number of ways, depending on whether the device is CPU-bound, or GPU-bound; in the latter case, the root cause could originate from a number of places within the graphics pipeline. This can make the investigatory stage rather involved, but once the source of the bottleneck is discovered and the problem is resolved, we can expect significant improvements as small fixes tend to reap big rewards when it comes to the rendering subsystem. The CPU sends rendering instructions through the graphics API, that funnel through the hardware driver to the GPU device, which results in commands entering the GPU's Command Buffer. These commands are processed by the massively parallel GPU system one by one until the buffer is empty. But there are a lot more nuances involved in this process. 
The following shows a (greatly simplified) diagram of a typical GPU pipeline (which can vary based on technology and various optimizations), and the broad rendering steps that take place during each stage: The top row represents the work that takes place on the CPU, the act of calling into the graphics API, through the hardware driver, and pushing commands into the GPU. Ergo, a CPU-bound application will be primarily limited by the complexity, or sheer number, of graphics API calls. Meanwhile, a GPU-bound application will be limited by the GPU's ability to process those calls, and empty the Command Buffer in a reasonable timeframe to allow for the intended frame rate. This is represented in the next two rows, showing the steps taking place in the GPU. But, because of the device's complexity, they are often simplified into two different sections: the front end and the back end. The front end refers to the part of the rendering process where the GPU has received mesh data, a draw call has been issued, and all of the information that was fed into the GPU is used to transform vertices and run through Vertex Shaders. Finally, the rasterizer generates a batch of fragments to be processed in the back end. The back end refers to the remainder of the GPU's processing stages, where fragments have been generated, and now they must be tested, manipulated, and drawn via Fragment Shaders onto the frame buffer in the form of pixels. Note that "Fragment Shader" is the more technically accurate term for Pixel Shaders. Fragments are generated by the rasterization stage, and only technically become pixels once they've been processed by the Shader and drawn to the Frame Buffer. There are a number of different approaches we can use to determine where the root cause of a graphics rendering issue lies: profiling the GPU with the Profiler, examining individual frames with the Frame Debugger, and brute force culling.

GPU profiling

Because graphics rendering involves both the CPU and GPU, we must examine the problem using both the CPU Usage and GPU Usage areas of the Profiler, as this can tell us which component is working hardest. For example, the following screenshot shows the Profiler data for a CPU-bound application. The test involved creating thousands of simple objects, with no batching techniques taking place. This resulted in an extremely large Draw Call count (around 15,000) for the CPU to process, while giving the GPU relatively little work to do due to the simplicity of the objects being rendered: This example shows that the CPU's "rendering" task is consuming a large number of cycles (around 30 ms per frame), while the GPU is only processing for less than 16 ms, indicating that the bottleneck resides in the CPU. Meanwhile, Profiling a GPU-bound application via the Profiler is a little trickier. This time, the test involves creating a small number of high polycount objects (for a low Draw Call per vertex ratio), with dozens of real-time point lights and an excessively complex Shader with a texture, normal texture, heightmap, emission map, occlusion map, and so on (for a high workload per pixel ratio). The following screenshot shows Profiler data for the example Scene when it is run in a standalone application: As we can see, the rendering task of the CPU Usage area matches closely with the total rendering costs of the GPU Usage area. We can also see that the CPU and GPU time costs at the bottom of the image are relatively similar (41.48 ms versus 38.95 ms).
This is very unintuitive as we would expect the GPU to be working much harder than the CPU. Be aware that the CPU/GPU millisecond cost values are not calculated or revealed unless the appropriate Usage Area has been added to the Profiler window. However, let's see what happens when we test the same exact Scene through the Editor: This is a better representation of what we would expect to see in a GPU-bound application. We can see how the CPU and GPU time costs at the bottom are closer to what we would expect to see (2.74 ms vs 64.82 ms). However, this data is highly polluted. The spikes in the CPU and GPU Usage areas are the result of the Profiler Window UI updating during testing, and the overhead cost of running through the Editor is also artificially increasing the total GPU time cost. It is unclear what causes the data to be treated this way, and this could certainly change in the future if enhancements are made to the Profiler in future versions of Unity, but it is useful to know this drawback. Trying to determine whether our application is truly GPU-bound is perhaps the only good excuse to perform a Profiler test through the Editor. The Frame Debugger A new feature in Unity 5 is the Frame Debugger, a debugging tool that can reveal how the Scene is rendered and pieced together, one Draw Call at a time. We can click through the list of Draw Calls and observe how the Scene is rendered up to that point in time. It also provides a lot of useful details for the selected Draw Call, such as the current render target (for example, the shadow map, the camera depth texture, the main camera, or other custom render targets), what the Draw Call did (drawing a mesh, drawing a static batch, drawing depth shadows, and so on), and what settings were used (texture data, vertex colors, baked lightmaps, directional lighting, and so on). The following screenshot shows a Scene that is only being partially rendered due to the currently selected Draw Call within the Frame Debugger. Note the shadows that are visible from baked lightmaps that were rendered during an earlier pass before the object itself is rendered: If we are bound by Draw Calls, then this tool can be effective in helping us figure out what the Draw Calls are being spent on, and determine whether there are any unnecessary Draw Calls that are not having an effect on the scene. This can help us come up with ways to reduce them, such as removing unnecessary objects or batching them somehow. We can also use this tool to observe how many additional Draw Calls are consumed by rendering features, such as shadows, transparent objects, and many more. This could help us, when we're creating multiple quality levels for our game, to decide what features to enable/disable under the low, medium, and high quality settings. Brute force testing If we're poring over our Profiling data, and we're still not sure we can determine the source of the problem, we can always try the brute force method: cull a specific activity from the Scene and see if it results in greatly increased performance. If a small change results in a big speed improvement, then we have a strong clue about where the bottleneck lies. There's no harm in this approach if we eliminate enough unknown variables to be sure the data is leading us in the right direction. We will cover different ways to brute force test a particular issue in each of the upcoming sections. 
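To make this kind of brute force experiment quick to repeat, it can help to wire the toggling up to a key press rather than fiddling with the Hierarchy mid-test. The following is only a minimal sketch of that idea; the class name, the key binding, and the rootToToggle field are all our own inventions, so adapt them to whatever grouping of objects is under suspicion:

using UnityEngine;

// Hypothetical brute force helper: pressing R toggles every Renderer beneath the assigned root.
public class BruteForceRenderToggle : MonoBehaviour
{
    public GameObject rootToToggle; // parent object holding the suspect geometry
    private bool _visible = true;

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.R) && rootToToggle != null)
        {
            _visible = !_visible;
            foreach (Renderer r in rootToToggle.GetComponentsInChildren<Renderer>(true))
            {
                r.enabled = _visible;
            }
        }
    }
}

If disabling a single group of renderers causes the frame time to collapse, we have found a promising place to start digging.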
CPU-bound If our application is CPU-bound, then we will observe a generally poor FPS value within the CPU Usage area of the Profiler window due to the rendering task. However, if VSync is enabled the data will often get muddied up with large spikes representing pauses as the CPU waits for the screen refresh rate to come around before pushing the current frame buffer. So, we should make sure to disable the VSync block in the CPU Usage area before deciding the CPU is the problem. Brute-forcing a test for CPU-bounding can be achieved by reducing Draw Calls. This is a little unintuitive since, presumably, we've already been reducing our Draw Calls to a minimum through techniques such as Static and Dynamic Batching, Atlasing, and so forth. This would mean we have very limited scope for reducing them further. What we can do, however, is disable the Draw-Call-saving features such as batching and observe if the situation gets significantly worse than it already is. If so, then we have evidence that we're either already, or very close to being, CPU-bound. At this point, we should see whether we can re-enable these features and disable rendering for a few choice objects (preferably those with low complexity to reduce Draw Calls without over-simplifying the rendering of our scene). If this results in a significant performance improvement then, unless we can find further opportunities for batching and mesh combining, we may be faced with the unfortunate option of removing objects from our scene as the only means of becoming performant again. There are some additional opportunities for Draw Call reduction, including Occlusion Culling, tweaking our Lighting and Shadowing, and modifying our Shaders. These will be explained in the following sections. However, Unity's rendering system can be multithreaded, depending on the targeted platform, which version of Unity we're running, and various settings, and this can affect how the graphics subsystem is being bottlenecked by the CPU, and slightly changes the definition of what being CPU-bound means. Multithreaded rendering Multithreaded rendering was first introduced in Unity v3.5 in February 2012, and enabled by default on multicore systems that could handle the workload; at the time, this was only PC, Mac, and Xbox 360. Gradually, more devices were added to this list, and since Unity v5.0, all major platforms now enable multithreaded rendering by default (and possibly some builds of Unity 4). Mobile devices were also starting to feature more powerful CPUs that could support this feature. Android multithreaded rendering (introduced in Unity v4.3) can be enabled through a checkbox under Platform Settings | Other Settings | Multithreaded Rendering. Multithreaded rendering on iOS can be enabled by configuring the application to make use of the Apple Metal API (introduced in Unity v4.6.3), under Player Settings | Other Settings | Graphics API. When multithreaded rendering is enabled, tasks that must go through the rendering API (OpenGL, DirectX, or Metal), are handed over from the main thread to a "worker thread". The worker thread's purpose is to undertake the heavy workload that it takes to push rendering commands through the graphics API and driver, to get the rendering instructions into the GPU's Command Buffer. This can save an enormous number of CPU cycles for the main thread, where the overwhelming majority of other CPU tasks take place. This means that we free up extra cycles for the majority of the engine to process physics, script code, and so on. 
Incidentally, the mechanism by which the main thread notifies the worker thread of tasks operates in a very similar way to the Command Buffer that exists on the GPU, except that the commands are much more high-level, with instructions like "render this object, with this Material, using this Shader", or "draw N instances of this piece of procedural geometry", and so on. This feature has been exposed in Unity 5 to allow developers to take direct control of the rendering subsystem from C# code. This customization is not as powerful as having direct API access, but it is a step in the right direction for Unity developers to implement unique graphical effects. Confusingly, the Unity API name for this feature is called "CommandBuffer", so be sure not to confuse it with the GPU's Command Buffer. Check the Unity documentation on CommandBuffer to make use of this feature: http://docs.unity3d.com/ScriptReference/Rendering.CommandBuffer.html. Getting back to the task at hand, when we discuss the topic of being CPU-bound in graphics rendering, we need to keep in mind whether or not the multithreaded renderer is being used, since the actual root cause of the problem will be slightly different depending on whether this feature is enabled or not. In single-threaded rendering, where all graphics API calls are handled by the main thread, and in an ideal world where both components are running at maximum capacity, our application would become bottlenecked on the CPU when 50 percent or more of the time per frame is spent handling graphics API calls. However, resolving these bottlenecks can be accomplished by freeing up work from the main thread. For example, we might find that greatly reducing the amount of work taking place in our AI subsystem will improve our rendering significantly because we've freed up more CPU cycles to handle the graphics API calls. But, when multithreaded rendering is taking place, this task is pushed onto the worker thread, which means the same thread isn't being asked to manage both engine work and graphics API calls at the same time. These processes are mostly independent, and even though additional work must still take place in the main thread to send instructions to the worker thread in the first place (via the internal CommandBuffer system), it is mostly negligible. This means that reducing the workload in the main thread will have little-to-no effect on rendering performance. Note that being GPU-bound is the same regardless of whether multithreaded rendering is taking place. GPU Skinning While we're on the subject of CPU-bounding, one task that can help reduce CPU workload, at the expense of additional GPU workload, is GPU Skinning. Skinning is the process where mesh vertices are transformed based on the current location of their animated bones. The animation system, working on the CPU, only transforms the bones, but another step in the rendering process must take care of the vertex transformations to place the vertices around those bones, performing a weighted average over the bones connected to those vertices. This vertex processing task can either take place on the CPU or within the front end of the GPU, depending on whether the GPU Skinning option is enabled. This feature can be toggled under Edit | Project Settings | Player Settings | Other Settings | GPU Skinning. Front end bottlenecks It is not uncommon to use a mesh that contains a lot of unnecessary UV and Normal vector data, so our meshes should be double-checked for this kind of superfluous fluff. 
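As a rough way of spotting that superfluous fluff, we can dump each mesh's vertex count and optional channels to the Console. This is only a sketch intended for use in the Editor (meshes in a standalone build need Read/Write enabled for their data to be accessible from script), and the MeshChannelAudit name is our own:

using UnityEngine;

// Editor-time diagnostic: logs which optional vertex channels each mesh actually carries,
// so unused normals, tangents, secondary UVs, or vertex colors are easy to spot.
public class MeshChannelAudit : MonoBehaviour
{
    void Start()
    {
        foreach (MeshFilter mf in FindObjectsOfType<MeshFilter>())
        {
            Mesh mesh = mf.sharedMesh;
            if (mesh == null) continue;
            Debug.Log(string.Format("{0}: {1} verts, normals={2}, tangents={3}, uv2={4}, colors={5}",
                mf.name, mesh.vertexCount,
                mesh.normals.Length > 0, mesh.tangents.Length > 0,
                mesh.uv2.Length > 0, mesh.colors.Length > 0));
        }
    }
}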
We should also let Unity optimize the structure for us, which minimizes cache misses as vertex data is read within the front end. We will also learn some useful Shader optimization techniques shortly, when we begin to discuss back end optimizations, since many optimization techniques apply to both Fragment and Vertex Shaders. The only attack vector left to cover is finding ways to reduce actual vertex counts. The obvious solutions are simplification and culling; either have the art team replace problematic meshes with lower polycount versions, and/or remove some objects from the scene to reduce the overall polygon count. If these approaches have already been explored, then the last approach we can take is to find some kind of middle ground between the two. Level Of Detail Since it can be difficult to tell the difference between a high quality distance object and a low quality one, there is very little reason to render the high quality version. So, why not dynamically replace distant objects with something more simplified? Level Of Detail (LOD), is a broad term referring to the dynamic replacement of features based on their distance or form factor relative to the camera. The most common implementation is mesh-based LOD: dynamically replacing a mesh with lower and lower detailed versions as the camera gets farther and farther away. Another example might be replacing animated characters with versions featuring fewer bones, or less sampling for distant objects, in order to reduce animation workload. The built-in LOD feature is available in the Unity 4 Pro Edition and all editions of Unity 5. However, it is entirely possible to implement it via Script code in Unity 4 Free Edition if desired. Making use of LOD can be achieved by placing multiple objects in the Scene and making them children of a GameObject with an attached LODGroup component. The LODGroup's purpose is to generate a bounding box from these objects, and decide which object should be rendered based on the size of the bounding box within the camera's field of view. If the object's bounding box consumes a large area of the current view, then it will enable the mesh(es) assigned to lower LOD groups, and if the bounding box is very small, it will replace the mesh(es) with those from higher LOD groups. If the mesh is too far away, it can be configured to hide all child objects. So, with the proper setup, we can have Unity replace meshes with simpler alternatives, or cull them entirely, which eases the burden on the rendering process. Check the Unity documentation for more detailed information on the LOD feature: http://docs.unity3d.com/Manual/LevelOfDetail.html. This feature can cost us a large amount of development time to fully implement; artists must generate lower polygon count versions of the same object, and level designers must generate LOD groups, configure them, and test them to ensure they don't cause jarring transitions as the camera moves closer or farther away. It also costs us in memory and runtime CPU; the alternative meshes need to be kept in memory, and the LODGroup component must routinely test whether the camera has moved to a new position that warrants a change in LOD level. In this era of graphics card capabilities, vertex processing is often the least of our concerns. Combined with the additional sacrifices needed for LOD to function, developers should avoid preoptimizing by automatically assuming LOD will help them. 
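For reference, the same LODGroup setup can be scripted rather than configured by hand in the Inspector. The sketch below assumes a parent object with two child meshes named "RockHigh" and "RockLow" (placeholder names of our own) and builds a two-level group from them; the transition percentages are arbitrary examples:

using UnityEngine;

// Builds a two-level LODGroup at runtime from two child renderers.
public class LodSetupExample : MonoBehaviour
{
    void Start()
    {
        Renderer[] highDetail = transform.Find("RockHigh").GetComponentsInChildren<Renderer>();
        Renderer[] lowDetail  = transform.Find("RockLow").GetComponentsInChildren<Renderer>();

        LODGroup group = gameObject.AddComponent<LODGroup>();
        LOD[] lods = new LOD[2];
        lods[0] = new LOD(0.5f, highDetail); // high detail while the object fills more than 50% of view height
        lods[1] = new LOD(0.1f, lowDetail);  // low detail down to 10% of view height, culled below that
        group.SetLODs(lods);
        group.RecalculateBounds();
    }
}

Scripting the setup does nothing to reduce the memory and testing costs just described; it simply makes the configuration repeatable.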
Excessive use of the feature will lead to burdening other parts of our application's performance, and chew up precious development time, all for the sake of paranoia. If it hasn't been proven to be a problem, then it's probably not a problem! Scenes that feature large, expansive views of the world, and lots of camera movement, should consider implementing this technique very early, as the added distance and massive number of visible objects will exacerbate the vertex count enormously. Scenes that are always indoors, or feature a camera with a viewpoint looking down at the world (real-time strategy and MOBA games, for example) should probably steer clear of implementing LOD from the beginning. Games somewhere between the two should avoid it until necessary. It all depends on how many vertices are expected to be visible at any given time and how much variability in camera distance there will be. Note that some game development middleware companies offer third-party tools for automated LOD mesh generation. These might be worth investigating to compare their ease of use versus quality loss versus cost effectiveness.

Disable GPU Skinning

As previously mentioned, we could enable GPU Skinning to reduce the burden on a CPU-bound application, but enabling this feature will push the same workload into the front end of the GPU. Since Skinning is one of those "embarrassingly parallel" processes that fits well with the GPU's parallel architecture, it is often a good idea to perform the task on the GPU. But this task can chew up precious time in the front end preparing the vertices for fragment generation, so disabling it is another option we can explore if we're bottlenecked in this area. Again, this feature can be toggled under Edit | Project Settings | Player Settings | Other Settings | GPU Skinning. GPU Skinning is available in Unity 4 Pro Edition, and all editions of Unity 5.

Reduce tessellation

There is one last task that takes place in the front end process and that we need to consider: tessellation. Tessellation through Geometry Shaders can be a lot of fun, as it is a relatively underused technique that can really make our graphical effects stand out from the crowd of games that only use the most common effects. But, it can contribute enormously to the amount of processing work taking place in the front end. There are no simple tricks we can exploit to improve tessellation, besides improving our tessellation algorithms, or easing the burden caused by other front end tasks to give our tessellation tasks more room to breathe. Either way, if we have a bottleneck in the front end and are making use of tessellation techniques, we should double-check that they are not consuming the lion's share of the front end's budget.

Back end bottlenecks

The back end is the more interesting part of the GPU pipeline, as many more graphical effects take place during this stage. Consequently, it is the stage that is significantly more likely to suffer from bottlenecks. There are two brute force tests we can attempt: reduce the resolution, or reduce texture quality. These changes will ease the workload during two important stages at the back end of the pipeline: fill rate and memory bandwidth, respectively. Fill rate tends to be the most common source of bottlenecks in the modern era of graphics rendering, so we will cover it first.

Fill rate

By reducing screen resolution, we have asked the rasterization system to generate significantly fewer fragments and transpose them over a smaller canvas of pixels.
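One convenient way to run the resolution half of that test is from script, so the drop can be toggled in a standalone build without rebuilding the player. The key bindings and target resolutions below are arbitrary examples, and Screen.SetResolution typically has no effect in the Editor's Game view:

using UnityEngine;

// Brute force test for fill rate: drop the output resolution and watch the frame rate.
public class ResolutionDropTest : MonoBehaviour
{
    void Update()
    {
        if (Input.GetKeyDown(KeyCode.F1))
        {
            Screen.SetResolution(800, 600, Screen.fullScreen);   // heavily reduced resolution
        }
        if (Input.GetKeyDown(KeyCode.F2))
        {
            Screen.SetResolution(2560, 1440, Screen.fullScreen); // back to the original target
        }
    }
}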
This will reduce the fill rate consumption of the application, giving a key part of the rendering pipeline some additional breathing room. Ergo, if performance suddenly improves with a screen resolution reduction, then fill rate should be our primary concern. Fill rate is a very broad term referring to the speed at which the GPU can draw fragments. But, this only includes fragments that have survived all of the various conditional tests we might have enabled within the given Shader. A fragment is merely a "potential pixel," and if it fails any of the enabled tests, then it is immediately discarded. This can be an enormous performance-saver as the pipeline can skip the costly drawing step and begin work on the next fragment instead. One such example is Z-testing, which checks whether a fragment from a closer object has already been drawn to the same pixel. If so, then the current fragment is discarded. If not, then the fragment is pushed through the Fragment Shader and drawn over the target pixel, which consumes exactly one draw from our fill rate. Now imagine multiplying this process by thousands of overlapping objects, each generating hundreds or thousands of possible fragments, with high screen resolutions causing millions, or billions, of fragments to be generated each and every frame. It should be fairly obvious that skipping as many of these draws as we can will result in big rendering cost savings. Graphics card manufacturers typically advertise a particular fill rate as a feature of the card, usually in the form of gigapixels per second, but this is a bit of a misnomer, as it would be more accurate to call it gigafragments per second; however, this argument is mostly academic. Either way, larger values tell us that the device can potentially push more fragments through the pipeline, so with a budget of 30 GPix/s and a target frame rate of 60 Hz, we can afford to process 30,000,000,000/60 = 500 million fragments per frame before being bottlenecked on fill rate. With a resolution of 2560x1440 (about 3.7 million pixels), and a best-case scenario where each pixel is only drawn over once, we could theoretically draw the entire scene about 135 times without any noticeable problems. Sadly, this is not a perfect world, and unless we take significant steps to avoid it, we will always end up with some amount of redraw over the same pixels due to the order in which objects are rendered. This is known as overdraw, and it can be very costly if we're not careful. The reason that resolution is a good attack vector to check for fill rate bounding is that it is a multiplier. A reduction from a resolution of 2560x1440 to 800x600 is an improvement factor of about eight, which could reduce fill rate costs enough to make the application perform well again.

Overdraw

Determining how much overdraw we have can be represented visually by rendering all objects with additive alpha blending and a very transparent flat color. Areas of high overdraw will show up more brightly as the same pixel is drawn over with additive blending multiple times. This is precisely how the Scene view's Overdraw shading mode reveals how much overdraw our scene is suffering. The following screenshot shows a scene with several thousand boxes drawn normally, and drawn using the Scene view's Overdraw shading mode: At the end of the day, fill rate is provided as a means of gauging the best-case behavior. In other words, it's primarily a marketing term and mostly theoretical.
But, the technical side of the industry has adopted the term as a way of describing the back end of the pipeline: the stage where fragment data is funneled through our Shaders and drawn to the screen. If every fragment required an absolute minimum level of processing (such as a Shader that returned a constant color), then we might get close to that theoretical maximum. The GPU is a complex beast, however, and things are never so simple. The nature of the device means it works best when given many small tasks to perform. But, if the tasks get too large, then fill rate is lost due to the back end not being able to push through enough fragments in time and the rest of the pipeline is left waiting for tasks to do. There are several more features that can potentially consume our theoretical fill rate maximum, including but not limited to alpha testing, alpha blending, texture sampling, the amount of fragment data being pulled through our Shaders, and even the color format of the target render texture (the final Frame Buffer in most cases). The bad news is that this gives us a lot of subsections to cover, and a lot of ways to break the process, but the good news is it gives us a lot of avenues to explore to improve our fill rate usage. Occlusion Culling One of the best ways to reduce overdraw is to make use of Unity's Occlusion Culling system. The system works by partitioning Scene space into a series of cells and flying through the world with a virtual camera making note of which cells are invisible from other cells (are occluded) based on the size and position of the objects present. Note that this is different to the technique of Frustum Culling, which culls objects not visible from the current camera view. This feature is always active in all versions, and objects culled by this process are automatically ignored by the Occlusion Culling system. Occlusion Culling is available in the Unity 4 Pro Edition and all editions of Unity 5. Occlusion Culling data can only be generated for objects properly labeled Occluder Static and Occludee Static under the StaticFlags dropdown. Occluder Static is the general setting for static objects where we want it to hide other objects, and be hidden by large objects in its way. Occludee Static is a special case for transparent objects that allows objects behind them to be rendered, but we want them to be hidden if something large blocks their visibility. Naturally, because one of the static flags must be enabled for Occlusion Culling, this feature will not work for dynamic objects. The following screenshot shows how effective Occlusion Culling can be at reducing the number of visible objects in our Scene: This feature will cost us in both application footprint and incur some runtime costs. It will cost RAM to keep the Occlusion Culling data structure in memory, and there will be a CPU processing cost to determine which objects are being occluded in each frame. The Occlusion Culling data structure must be properly configured to create cells of the appropriate size for our Scene, and the smaller the cells, the longer it takes to generate the data structure. But, if it is configured correctly for the Scene, Occlusion Culling can provide both fill rate savings through reduced overdraw, and Draw Call savings by culling non-visible objects. Shader optimization Shaders can be a significant fill rate consumer, depending on their complexity, how much texture sampling takes place, how many mathematical functions are used, and so on. 
Shaders do not directly consume fill rate, but do so indirectly because the GPU must calculate or fetch data from memory during Shader processing. The GPU's parallel nature means any bottleneck in a thread will limit how many fragments can be pushed into the thread at a later date, but parallelizing the task (sharing small pieces of the job between several agents) provides a net gain over serial processing (one agent handling each task one after another). The classic example is a vehicle assembly line. A complete vehicle requires multiple stages of manufacture to complete. The critical path to completion might involve five steps: stamping, welding, painting, assembly, and inspection, and each step is completed by a single team. For any given vehicle, no stage can begin before the previous one is finished, but whatever team handled the stamping for the last vehicle can begin stamping for the next vehicle as soon as it has finished. This organization allows each team to become masters of their particular domain, rather than trying to spread their knowledge too thin, which would likely result in less consistent quality in the batch of vehicles. We can double the overall output by doubling the number of teams, but if any team gets blocked, then precious time is lost for any given vehicle, as well as all future vehicles that would pass through the same team. If these delays are rare, then they can be negligible in the grand scheme, but if not, and one stage takes several minutes longer than normal each and every time it must complete the task, then it can become a bottleneck that threatens the release of the entire batch. The GPU parallel processors work in a similar way: each processor thread is an assembly line, each processing stage is a team, and each fragment is a vehicle. If the thread spends a long time processing a single stage, then time is lost on each fragment. This delay will multiply such that all future fragments coming through the same thread will be delayed. This is a bit of an oversimplification, but it often helps to paint a picture of how poorly optimized Shader code can chew up our fill rate, and how small improvements in Shader optimization provide big benefits in back end performance. Shader programming and optimization have become a very niche area of game development. Their abstract and highly-specialized nature requires a very different kind of thinking to generate Shader code compared to gameplay and engine code. They often feature mathematical tricks and back-door mechanisms for pulling data into the Shader, such as precomputing values in texture files. Because of this, and the importance of optimization, Shaders tend to be very difficult to read and reverse-engineer. Consequently, many developers rely on prewritten Shaders, or visual Shader creation tools from the Asset Store such as Shader Forge or Shader Sandwich. This simplifies the act of initial Shader code generation, but might not result in the most efficient form of Shaders. If we're relying on pre-written Shaders or tools, we might find it worthwhile to perform some optimization passes over them using some tried-and-true techniques. So, let's focus on some easily reachable ways of optimizing our Shaders. Consider using Shaders intended for mobile platforms The built-in mobile Shaders in Unity do not have any specific restrictions that force them to only be used on mobile devices. They are simply optimized for minimum resource usage (and tend to feature some of the other optimizations listed in this section). 
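A quick way to experiment is to swap a Material over to one of these built-in Shaders at runtime and compare both the look and the frame time. The sketch below uses the built-in "Mobile/Diffuse" Shader as a stand-in; whether it is an acceptable substitute depends entirely on what the original Shader was doing:

using UnityEngine;

// Swaps this Renderer's Material over to the built-in Mobile/Diffuse Shader for comparison.
public class MobileShaderSwapTest : MonoBehaviour
{
    private Shader _original;

    void Start()
    {
        Material mat = GetComponent<Renderer>().material; // instantiates a copy of the Material
        _original = mat.shader;                           // keep the original in case we want to restore it
        mat.shader = Shader.Find("Mobile/Diffuse");
    }
}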
Desktop applications are perfectly capable of using these Shaders, but they tend to feature a loss of graphical quality. It only becomes a question of whether the loss of graphical quality is acceptable. So, consider doing some testing with the mobile equivalents of common Shaders to see whether they are a good fit for your game.

Use small data types

GPUs can calculate with smaller data types more quickly than larger types (particularly on mobile platforms!), so the first tweak we can attempt is replacing our float data types (32-bit, floating point) with smaller versions such as half (16-bit, floating point), or even fixed (12-bit, fixed point). The size of the data types listed above will vary depending on what floating point formats the target platform prefers. The sizes listed are the most common. The importance for optimization is in the relative size between formats. Color values are good candidates for precision reduction, as we can often get away with less precise color values without any noticeable loss in coloration. However, the effects of reducing precision can be very unpredictable for graphical calculations. So, changes such as these can require some testing to verify whether the reduced precision is costing too much graphical fidelity. Note that the effects of these tweaks can vary enormously between one GPU architecture and another (for example, AMD versus Nvidia versus Intel), and even GPU brands from the same manufacturer. In some cases, we can make some decent performance gains for a trivial amount of effort. In other cases, we might see no benefit at all.

Avoid changing precision while swizzling

Swizzling is the Shader programming technique of creating a new vector (an array of values) from an existing vector by listing the components in the order in which we wish to copy them into the new structure. Here are some examples of swizzling:

float4 input = float4(1.0, 2.0, 3.0, 4.0); // initial test value
float2 val1 = input.yz;    // swizzle two components
float3 val2 = input.zyx;   // swizzle three components in a different order
float4 val3 = input.yyyy;  // swizzle the same component multiple times
float sclr = input.w;
float3 val4 = sclr.xxx;    // swizzle a scalar multiple times

We can use both the xyzw and rgba representations to refer to the same components, sequentially. It does not matter whether it is a color or vector; they just make the Shader code easier to read. We can also list components in any order we like to fill in the desired data, repeating them if necessary. Converting from one precision type to another in a Shader can be a costly operation, but converting the precision type while simultaneously swizzling can be particularly painful. If we have mathematical operations that rely on being swizzled into different precision types, it would be wiser if we simply absorbed the high-precision cost from the very beginning, or reduced precision across the board to avoid the need for changes in precision.

Use GPU-optimized helper functions

The Shader compiler often performs a good job of reducing mathematical calculations down to an optimized version for the GPU, but compiled custom code is unlikely to be as effective as both the Cg library's built-in helper functions and the additional helpers provided by the Unity Cg included files. If we are using Shaders that include custom function code, perhaps we can find an equivalent helper function within the Cg or Unity libraries that can do a better job than our custom code can.
These extra include files can be added to our Shader within the CGPROGRAM block like so:

CGPROGRAM
// other includes
#include "UnityCG.cginc"
// Shader code here
ENDCG

Example Cg library functions to use are abs() for absolute values, lerp() for linear interpolation, mul() for multiplying matrices, and step() for step functionality. Useful UnityCG.cginc functions include WorldSpaceViewDir() for calculating the direction towards the camera, and Luminance() for converting a color to grayscale. Check the following URL for a full list of Cg standard library functions: http://http.developer.nvidia.com/CgTutorial/cg_tutorial_appendix_e.html. Check the Unity documentation for a complete and up-to-date list of possible include files and their accompanying helper functions: http://docs.unity3d.com/Manual/SL-BuiltinIncludes.html.

Disable unnecessary features

Perhaps we can make savings by simply disabling Shader features that aren't vital. Does the Shader really need multiple passes, transparency, Z-writing, alpha-testing, and/or alpha blending? Will tweaking these settings or removing these features give us a good approximation of our desired effect without losing too much graphical fidelity? Making such changes is a good way of making fill rate cost savings.

Remove unnecessary input data

Sometimes the process of writing a Shader involves a lot of back and forth experimentation in editing code and viewing it in the Scene. The typical result of this is that input data that was needed when the Shader was going through early development is now surplus fluff once the desired effect has been obtained, and it's easy to forget what changes were made when/if the process drags on for a long time. But, these redundant data values can cost the GPU valuable time as they must be fetched from memory even if they are not explicitly used by the Shader. So, we should double check our Shaders to ensure all of their input geometry, vertex, and fragment data is actually being used.

Only expose necessary variables

Exposing unnecessary variables from our Shader to the accompanying Material(s) can be costly as the GPU can't assume these values are constant. This means the Shader code cannot be compiled into a more optimized form. This data must be pushed from the CPU with every pass since they can be modified at any time through the Material's methods such as SetColor(), SetFloat(), and so on. If we find that, towards the end of the project, we always use the same value for these variables, then they can be replaced with a constant in the Shader to remove such excess runtime workload. The only cost is obfuscating what could be critical graphical effect parameters, so this should be done very late in the process.

Reduce mathematical complexity

Complicated mathematics can severely bottleneck the rendering process, so we should do whatever we can to limit the damage. Complex mathematical functions could be replaced with a texture that is fed into the Shader and provides a pre-generated table for runtime lookup. We may not see any improvement with functions such as sin and cos, since they've been heavily optimized to make use of GPU architecture, but complex methods such as pow, exp, log, and other custom mathematical processes can only be optimized so much, and would be good candidates for simplification. This is assuming we only need one or two input values, which are represented through the X and Y coordinates of the texture, and mathematical accuracy isn't of paramount importance.
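As a hedged illustration of the lookup-table idea, the following sketch bakes pow(x, 2.2) into a 256x1 texture from script and hands it to a Material. The "_PowLookup" property name is hypothetical, and the Shader would need to be written to sample it in place of calling pow():

using UnityEngine;

// Bakes pow(x, 2.2) into a small lookup texture so a Shader can replace the pow() call
// with a single texture sample.
public class LookupTextureBaker : MonoBehaviour
{
    void Start()
    {
        Texture2D lookup = new Texture2D(256, 1, TextureFormat.RGBA32, false);
        lookup.wrapMode = TextureWrapMode.Clamp;
        for (int i = 0; i < 256; i++)
        {
            float x = i / 255f;
            float value = Mathf.Pow(x, 2.2f);
            lookup.SetPixel(i, 0, new Color(value, value, value, 1f));
        }
        lookup.Apply();
        GetComponent<Renderer>().material.SetTexture("_PowLookup", lookup);
    }
}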
This will cost us additional graphics memory to store the texture at runtime (more on this later), but if the Shader is already receiving a texture (which they are in most cases) and the alpha channel is not being used, then we could sneak the data in through the texture's alpha channel, costing us literally no performance, and the rest of the Shader code and graphics system would be none-the-wiser. This will involve the customization of art assets to include such data in any unused color channel(s), requiring coordination between programmers and artists, but is a very good way of saving Shader processing costs with no runtime sacrifices. In fact, Material properties and textures are both excellent entry points for pushing work from the Shader (the GPU) onto the CPU. If a complex calculation does not need to vary on a per pixel basis, then we could expose the value as a property in the Material, and modify it as needed (accepting the overhead cost of doing so from the previous section Only expose necessary variables). Alternatively, if the result varies per pixel, and does not need to change often, then we could generate a texture file from script code, containing the results of the calculations in the RGBA values, and pulling the texture into the Shader. Lots of opportunities arise when we ignore the conventional application of such systems, and remember to think of them as just raw data being transferred around. Reduce texture lookups While we're on the subject of texture lookups, they are not trivial tasks for the GPU to process and they have their own overhead costs. They are the most common cause of memory access problems within the GPU, especially if a Shader is performing samples across multiple textures, or even multiple samples across a single texture, as they will likely inflict cache misses in memory. Such situations should be simplified as much as possible to avoid severe GPU memory bottlenecking. Even worse, sampling a texture in a random order would likely result in some very costly cache misses for the GPU to suffer through, so if this is being done, then the texture should be reordered so that it can be sampled in a more sequential order. Avoid conditional statements In modern day CPU architecture, conditional statements undergo a lot of clever predictive techniques to make use of instruction-level parallelism. This is a feature where the CPU attempts to predict which direction a conditional statement will go in before it has actually been resolved, and speculatively begins processing the most likely result of the conditional using any free components that aren't being used to resolve the conditional (fetching some data from memory, copying some floats into unused registers, and so on). If it turns out that the decision is wrong, then the current result is discarded and the proper path is taken instead. So long as the cost of speculative processing and discarding false results is less than the time spent waiting to decide the correct path, and it is right more often than it is wrong, then this is a net gain for the CPU's speed. However, this feature is not possible on GPU architecture because of its parallel nature. The GPU's cores are typically managed by some higher-level construct that instructs all cores under its command to perform the same machine-code-level instruction simultaneously. So, if the Fragment Shader requires a float to be multiplied by 2, then the process will begin by having all cores copy data into the appropriate registers in one coordinated step. 
Only when all cores have finished copying to the registers will the cores be instructed to begin the second step: multiplying all registers by 2. Thus, when this system stumbles into a conditional statement, it cannot resolve the two statements independently. It must determine how many of its child cores will go down each path of the conditional, grab the list of required machine code instructions for one path, resolve them for all cores taking that path, and repeat for each path until all possible paths have been processed. So, for an if-else statement (two possibilities), it will tell one group of cores to process the "true" path, then ask the remaining cores to process the "false" path. Unless every core takes the same path, it must process both paths every time. So, we should avoid branching and conditional statements in our Shader code. Of course, this depends on how essential the conditional is to achieving the graphical effect we desire. But, if the conditional is not dependent on per pixel behavior, then we would often be better off absorbing the cost of unnecessary mathematics than inflicting a branching cost on the GPU. For example, we might be checking whether a value is non-zero before using it in a calculation, or comparing against some global flag in the Material before taking one action or another. Both of these cases would be good candidates for optimization by removing the conditional check.

Reduce data dependencies

The compiler will try its best to optimize our Shader code into the more GPU-friendly low-level language so that it is not waiting on data to be fetched when it could be processing some other task. For example, the following poorly-optimized code could be written in our Shader:

float sum = input.color1.r;
sum = sum + input.color2.g;
sum = sum + input.color3.b;
sum = sum + input.color4.a;
float result = CalculateSomething(sum);

If we were able to force the Shader compiler to compile this code into machine code instructions as it is written, then this code has a data dependency such that each calculation cannot begin until the last finishes due to the dependency on the sum variable. But, such situations are often detected by the Shader compiler and optimized into a version that uses instruction-level parallelism (the code shown next is the high-level code equivalent of the resulting machine code):

float sum1, sum2, sum3, sum4;
sum1 = input.color1.r;
sum2 = input.color2.g;
sum3 = input.color3.b;
sum4 = input.color4.a;
float sum = sum1 + sum2 + sum3 + sum4;
float result = CalculateSomething(sum);

In this case, the compiler would recognize that it can fetch the four values from memory in parallel and complete the summation once all four have been fetched independently via thread-level parallelism. This can save a lot of time, relative to performing the four fetches one after another. However, long chains of data dependency can absolutely murder Shader performance. If we create a strong data dependency in our Shader's source code, then it has been given no freedom to make such optimizations. For example, the following data dependency would be painful on performance, as one step cannot be completed without waiting on another to fetch data and perform the appropriate calculation:

float4 val1 = tex2D(_tex1, input.texcoord.xy);
float4 val2 = tex2D(_tex2, val1.yz);
float4 val3 = tex2D(_tex3, val2.zw);

Strong data dependencies such as these should be avoided whenever possible.
Surface Shaders If we're using Unity's Surface Shaders, which are a way for Unity developers to get to grips with Shader programming in a more simplified fashion, then the Unity Engine takes care of converting our Surface Shader code for us, abstracting away some of the optimization opportunities we have just covered. However, it does provide some miscellaneous values that can be used as replacements, which reduce accuracy but simplify the mathematics in the resulting code. Surface Shaders are designed to handle the general case fairly efficiently, but optimization is best achieved with a personal touch. The approxview attribute will approximate the view direction, saving costly operations. halfasview will reduce the precision of the view vector, but beware of its effect on mathematical operations involving multiple precision types. noforwardadd will limit the Shader to only considering a single directional light, reducing Draw Calls since the Shader will render in only a single pass, but reducing lighting complexity. Finally, noambient will disable ambient lighting in the Shader, removing some extra mathematical operations that we may not need. Use Shader-based LOD We can force Unity to render distant objects using simpler Shaders, which can be an effective way of saving fill rate, particularly if we're deploying our game onto multiple platforms or supporting a wide range of hardware capability. The LOD keyword can be used in the Shader to set the onscreen size factor that the Shader supports. If the current LOD level does not match this value, it will drop to the next fallback Shader and so on until it finds the Shader that supports the given size factor. We can also change a given Shader object's LOD value at runtime using the maximumLOD property. This feature is similar to the mesh-based LOD covered earlier, and uses the same LOD values for determining object form factor, so it should be configured as such. Memory bandwidth Another major component of back end processing and a potential source of bottlenecks is memory bandwidth. Memory bandwidth is consumed whenever a texture must be pulled from a section of the GPU's main video memory (also known as VRAM). The GPU contains multiple cores that each have access to the same area of VRAM, but they also each contain a much smaller, local Texture Cache that stores the current texture(s) the GPU has been most recently working with. This is similar in design to the multitude of CPU cache levels that allow memory transfer up and down the chain, as a workaround for the fact that faster memory will, invariably, be more expensive to produce, and hence smaller in capacity compared to slower memory. Whenever a Fragment Shader requests a sample from a texture that is already within the core's local Texture Cache, then it is lightning fast and barely perceivable. But, if a texture sample request is made, that does not yet exist within the Texture Cache, then it must be pulled in from VRAM before it can be sampled. This fetch request risks cache misses within VRAM as it tries to find the relevant texture. The transfer itself consumes a certain amount of memory bandwidth, specifically an amount equal to the total size of the texture file stored within VRAM (which may not be the exact size of the original file, nor the size in RAM, due to GPU-level compression). It's for this reason that, if we're bottlenecked on memory bandwidth, then performing a brute force test by reducing texture quality would suddenly result in a performance improvement. 
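That brute force test does not require touching every texture by hand; the Editor's Texture Quality setting (covered next) does it globally, and the same thing can be driven from script in a build through QualitySettings.masterTextureLimit, which skips the given number of top mipmap levels. A minimal sketch, with an arbitrary key binding:

using UnityEngine;

// Brute force test for memory bandwidth: cycle textures through full, half, and quarter resolution.
public class TextureQualityDropTest : MonoBehaviour
{
    void Update()
    {
        if (Input.GetKeyDown(KeyCode.T))
        {
            // 0 = full resolution, 1 = half, 2 = quarter
            QualitySettings.masterTextureLimit = (QualitySettings.masterTextureLimit + 1) % 3;
        }
    }
}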
We've shrunk the size of our textures, easing the burden on the GPU's memory bandwidth, allowing it to fetch the necessary textures much quicker. Globally reducing texture quality can be achieved by going to Edit | Project Settings | Quality | Texture Quality and setting the value to Half Res, Quarter Res, or Eighth Res. In the event that memory bandwidth is bottlenecked, the GPU will keep fetching the necessary texture files, but the entire process will be throttled as the Texture Cache waits for the data to appear before processing the fragment. The GPU won't be able to push data back to the Frame Buffer in time to be rendered onto the screen, blocking the whole process and culminating in a poor frame rate. Ultimately, proper usage of memory bandwidth is a budgeting concern. For example, with a memory bandwidth of 96 GB/sec per core and a target frame rate of 60 frames per second, the GPU can afford to pull 96/60 = 1.6 GB worth of texture data every frame before being bottlenecked on memory bandwidth. Memory bandwidth is often listed on a per core basis, but some GPU manufacturers may try to mislead you by multiplying memory bandwidth by the number of cores in order to list a bigger, but less practical number. Because of this, research may be necessary to confirm that the memory bandwidth limit we have for the target GPU hardware is given on a per core basis. Note that this value is not the maximum limit on the texture data that our game can contain in the project, nor in CPU RAM, nor even in VRAM. It is a metric that limits how much texture swapping can occur during one frame. The same texture could be pulled back and forth multiple times in a single frame depending on how many Shaders need to use it, the order that the objects are rendered, and how often texture sampling must occur, so rendering just a few objects could consume whole gigabytes of memory bandwidth if they all require the same high quality, massive textures, require multiple secondary texture maps (normal maps, emission maps, and so on), and are not batched together, because there simply isn't enough Texture Cache space available to keep a single texture file long enough to exploit it during the next rendering pass. There are several approaches we can take to solve bottlenecks in memory bandwidth.

Use less texture data

This approach is simple, straightforward, and always a good idea to consider. Reducing texture quality, either through resolution or bit rate, is not ideal for graphical quality, but we can sometimes get away with using 16-bit textures without any noticeable degradation. Mip Maps are another excellent way of reducing the amount of texture data being pushed back and forth between VRAM and the Texture Cache. Note that the Scene View has a Mipmaps Shading Mode, which will highlight textures in our scene blue or red depending on whether the current texture scale is appropriate for the current Scene View's camera position and orientation. This will help identify what textures are good candidates for further optimization. Mip Maps should almost always be used in 3D Scenes, unless the camera moves very little.

Test different GPU Texture Compression formats

Texture Compression techniques help reduce our application's footprint (executable file size) and runtime CPU memory usage, that is, the storage area where all texture resource data is kept until it is needed by the GPU. However, once the data reaches the GPU, it uses a different form of compression to keep texture data small.
The common formats are DXT, PVRTC, ETC, and ASTC. To make matters more confusing, each platform and GPU hardware supports different compression formats, and if the device does not support the given compression format, then it will be handled at the software level. In other words, the CPU will need to stop and recompress the texture to the desired format the GPU wants, as opposed to the GPU taking care of it with a specialized hardware chip. The compression options are only available if a texture resource has its Texture Type field set to Advanced. Using any of the other texture type settings will simplify the choices, and Unity will make a best guess when deciding which format to use for the target platform, which may not be ideal for a given piece of hardware and thus will consume more memory bandwidth than necessary. The best approach to determining the correct format is to simply test a bunch of different devices and Texture Compression techniques and find one that fits. For example, common wisdom says that ETC is the best choice for Android since more devices support it, but some developers have found their game works better with the DXT and PVRTC formats on certain devices. Beware that, if we're at the point where individually tweaking Texture Compression techniques is necessary, then hopefully we have exhausted all other options for reducing memory bandwidth. By going down this road, we could be committing to supporting many different devices each in their own specific way. Many of us would prefer to keep things simple with a general solution instead of personal customization and time-consuming handiwork to work around problems like this. Minimize texture sampling Can we modify our Shaders to remove some texture sampling overhead? Did we add some extra texture lookup files to give ourselves some fill rate savings on mathematical functions? If so, we might want to consider lowering the resolution of such textures or reverting the changes and solving our fill rate problems in other ways. Essentially, the less texture sampling we do, the less often we need to use memory bandwidth and the closer we get to resolving the bottleneck. Organize assets to reduce texture swaps This approach basically comes back to Batching and Atlasing again. Are there opportunities to batch some of our biggest texture files together? If so, then we could save the GPU from having to pull in the same texture files over and over again during the same frame. As a last resort, we could look for ways to remove some textures from the entire project and reuse similar files. For instance, if we have fill rate budget to spare, then we may be able to use some Fragment Shaders to make a handful of textures files appear in our game with different color variations. VRAM limits One last consideration related to textures is how much VRAM we have available. Most texture transfer from CPU to GPU occurs during initialization, but can also occur when a non-existent texture is first required by the current view. This process is asynchronous and will result in a blank texture being used until the full texture is ready for rendering. As such, we should avoid too much texture variation across our Scenes. Texture preloading Even though it doesn't strictly relate to graphics performance, it is worth mentioning that the blank texture that is used during asynchronous texture loading can be jarring when it comes to game quality. 
We would like a way to control and force the texture to be loaded from disk to the main memory and then to VRAM before it is actually needed. A common workaround is to create a hidden GameObject that features the texture and place it somewhere in the Scene on the route that the player will take towards the area where it is actually needed. As soon as the textured object becomes a candidate for the rendering system (even if it's technically hidden), it will begin the process of copying the data towards VRAM. This is a little clunky, but is easy to implement and works sufficiently well in most cases. We can also control such behavior via Script code by changing a hidden Material's main texture (a fuller sketch of this preloader appears below): GetComponent<Renderer>().material.mainTexture = textureToPreload;
Texture thrashing
In the rare event that too much texture data is loaded into VRAM, and the required texture is not present, the GPU will need to request it from the main memory and overwrite the existing texture data to make room. This is likely to worsen over time as the memory becomes fragmented, and it introduces a risk that the texture just flushed from VRAM needs to be pulled again within the same frame. This will result in a serious case of memory "thrashing", and should be avoided at all costs. This is less of a concern on modern consoles such as the PS4, Xbox One, and WiiU, since they share a common memory space for both CPU and GPU. This design is a hardware-level optimization given the fact that the device is always running a single application, and almost always rendering 3D graphics. But all other platforms must share time and space with multiple applications and be capable of running without a GPU. They therefore feature separate CPU and GPU memory, and we must ensure that the total texture usage at any given moment remains below the available VRAM of the target hardware. Note that this "thrashing" is not precisely the same as hard disk thrashing, where memory is copied back and forth between main memory and virtual memory (the swap file), but it is analogous. In either case, data is being unnecessarily copied back and forth between two regions of memory because too much data is being requested in too short a time period for the smaller of the two memory regions to hold it all. Thrashing such as this can be a common cause of dreadful graphics performance when games are ported from modern consoles to the desktop and should be treated with care. Avoiding this behavior may require customizing texture quality and file sizes on a per-platform and per-device basis. Be warned that some players are likely to notice these inconsistencies if we're dealing with hardware from the same console or desktop GPU generation. As many of us will know, even small differences in hardware can lead to a lot of apples-versus-oranges comparisons, but hardcore gamers will expect a similar level of quality across the board.
Lighting and Shadowing
Lighting and Shadowing can affect all parts of the graphics pipeline, and so they will be treated separately. This is perhaps one of the most important parts of game art and design to get right. Good Lighting and Shadowing can turn a mundane scene into something spectacular, as there is something magical about professional coloring that makes it visually appealing. Even the low-poly art style (think Monument Valley) relies heavily on a good lighting and shadowing profile in order to allow the player to distinguish one object from another.
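Returning to the texture preloading workaround described above, here is a fuller, hedged sketch of such a preloader component; the field name textureToPreload is our own and would be assigned in the Inspector.

using UnityEngine;

// A minimal sketch of the preloading trick: a renderer the camera can "see"
// (even if it is visually unobtrusive) whose material is assigned the texture
// we want resident in VRAM before the player reaches the area that needs it.
public class TexturePreloader : MonoBehaviour
{
    public Texture textureToPreload; // hypothetical field, assigned in the Inspector

    void Start()
    {
        // Assigning the texture to a renderable material prompts the copy
        // towards VRAM ahead of time.
        GetComponent<Renderer>().material.mainTexture = textureToPreload;
    }
}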
But, this isn't an art book, so we will focus on the performance characteristics of various Lighting and Shadowing features. Unity offers two styles of dynamic light rendering, as well as baked lighting effects through lightmaps. It also provides multiple ways of generating shadows with varying levels of complexity and runtime processing cost. Between the two, there are a lot of options to explore, and a lot of things that can trip us up if we're not careful. The Unity documentation covers all of these features in an excellent amount of detail (start with this page and work through them: http://docs.unity3d.com/Manual/Lighting.html), so we'll examine these features from a performance standpoint. Let's tackle the two main light rendering modes first. This setting can be found under Edit | Project Settings | Player | Other Settings | Rendering, and can be configured on a per-platform basis. Forward Rendering Forward Rendering is the classical form of rendering lights in our scene. Each object is likely to be rendered in multiple passes through the same Shader. How many passes are required will be based on the number, distance, and brightness of light sources. Unity will try to prioritize which directional light is affecting the object the most and render the object in a "base pass" as a starting point. It will then take up to four of the most powerful point lights nearby and re-render the same object multiple times through the same Fragment Shader. The next four point lights will then be processed on a per-vertex basis. All remaining lights are treated as a giant blob by means of a technique called spherical harmonics. Some of this behavior can be simplified by setting a light's Render Mode to values such as Not Important, and changing the value of Edit | Project Settings | Quality | Pixel Light Count. This value limits how many lights will be treated on a per pixel basis, but is overridden by any lights with a Render Mode set to Important. It is therefore up to us to use this combination of settings responsibly. As you can imagine, the design of Forward Rendering can utterly explode our Draw Call count very quickly in scenes with a lot of point lights present, due to the number of render states being configured and Shader passes being reprocessed. CPU-bound applications should avoid this rendering mode if possible. More information on Forward Rendering can be found in the Unity documentation: http://docs.unity3d.com/Manual/RenderTech-ForwardRendering.html. Deferred Shading Deferred Shading or Deferred Rendering as it is sometimes known, is only available on GPUs running at least Shader Model 3.0. In other words, any desktop graphics card made after around 2004. The technique has been around for a while, but it has not resulted in a complete replacement of the Forward Rendering method due to the caveats involved and limited support on mobile devices. Anti-aliasing, transparency, and animated characters receiving shadows are all features that cannot be managed through Deferred Shading alone and we must use the Forward Rendering technique as a fallback. Deferred Shading is so named because actual shading does not occur until much later in the process; that is, it is deferred until later. From a performance perspective, the results are quite impressive as it can generate very good per pixel lighting with surprisingly little Draw Call effort. The advantage is that a huge amount of lighting can be accomplished using only a single pass through the lighting Shader. 
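Stepping back to Forward Rendering for a moment, the Pixel Light Count and per-light Render Mode settings mentioned above can also be driven from script. The following is a hedged sketch; the field name fillLight is our own and the cap of 2 is an arbitrary example.

using UnityEngine;

// Cap how many lights are rendered per pixel globally, and demote one
// specific light so it is only ever treated per-vertex or via spherical harmonics.
public class ForwardLightTuning : MonoBehaviour
{
    public Light fillLight; // hypothetical reference, assigned in the Inspector

    void Start()
    {
        QualitySettings.pixelLightCount = 2; // global cap on per-pixel lights
        if (fillLight != null)
        {
            fillLight.renderMode = LightRenderMode.NotImportant;
        }
    }
}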
The main disadvantages include the additional costs if we wish to pile on advanced lighting features such as Shadowing and any steps that must pass through Forward Rendering in order to complete, such as transparency. The Unity documentation contains an excellent source of information on the Deferred Shading technique, its advantages, and its pitfalls: http://docs.unity3d.com/Manual/RenderTech-DeferredShading.html Vertex Lit Shading (legacy) Technically, there are more than two lighting methods. Unity allows us to use a couple of legacy lighting systems, only one of which may see actual use in the field: Vertex Lit Shading. This is a massive simplification of lighting, as lighting is only considered per vertex, and not per pixel. In other words, entire faces are colored based on the incoming light color, and not individual pixels. It is not expected that many, or really any, 3D games will make use of this legacy technique, as a lack of shadows and proper lighting make visualizations of depth very difficult. It is mostly relegated to 2D games that don't intend to make use of shadows, normal maps, and various other lighting features, but it is there if we need it. Real-time Shadows Soft Shadows are expensive, Hard Shadows are cheap, and No Shadows are free. Shadow Resolution, Shadow Projection, Shadow Distance, and Shadow Cascades are all settings we can find under Edit | Project Settings | Quality | Shadows that we can use to modify the behavior and complexity of our shadowing passes. That summarizes almost everything we need to know about Unity's real-time shadowing techniques from a high-level performance standpoint. We will cover shadows more in the following section on optimizing our lighting effects. Lighting optimization With a cursory glance at all of the relevant lighting techniques, let's run through some techniques we can use to improve lighting costs. Use the appropriate Shading Mode It is worth testing both of the main rendering modes to see which one best suits our game. Deferred Shading is often used as a backup in the event that Forward Rendering is becoming a burden on performance, but it really depends on where else we're finding bottlenecks as it is sometimes difficult to tell the difference between them. Use Culling Masks A Light Component's Culling Mask property is a layer-based mask that can be used to limit which objects will be affected by the given Light. This is an effective way of reducing lighting overhead, assuming that the layer interactions also make sense with how we are using layers for physics optimization. Objects can only be a part of a single layer, and reducing physics overhead probably trumps lighting overhead in most cases; thus, if there is a conflict, then this may not be the ideal approach. Note that there is limited support for Culling Masks when using Deferred Shading. Because of the way it treats lighting in a very global fashion, only four layers can be disabled from the mask, limiting our ability to optimize its behavior through this method. Use Baked Lightmaps Baking Lighting and Shadowing into a Scene is significantly less processor-intensive than generating them at runtime. The downside is the added application footprint, memory consumption, and potential for memory bandwidth abuse. Ultimately, unless a game's lighting effects are being handled exclusively through Legacy Vertex Lighting or a single Directional Light, then it should probably include Lightmapping to make some huge budget savings on lighting calculations. 
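As an aside on the Culling Mask idea covered above, the mask can also be set from script. Here is a small hedged sketch; the layer name "Gameplay" is our own and would need to exist in the project's layer list.

using UnityEngine;

// Restrict a light so it only affects objects on one layer, reducing how
// many objects each light needs to be considered against.
public class LightCullingExample : MonoBehaviour
{
    void Start()
    {
        Light lightComp = GetComponent<Light>();
        int gameplayLayer = LayerMask.NameToLayer("Gameplay"); // hypothetical layer name
        lightComp.cullingMask = 1 << gameplayLayer; // only illuminate that layer
    }
}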
Relying entirely on real-time lighting and shadows is a recipe for disaster unless the game is trying to win an award for the smallest application file size of all time. Optimize Shadows Shadowing passes mostly consume our Draw Calls and fill rate, but the amount of vertex position data we feed into the process and our selection for the Shadow Projection setting will affect the front end's ability to generate the required shadow casters and shadow receivers. We should already be attempting to reduce vertex counts to solve front end bottlenecking in the first place, and making this change will be an added multiplier towards that effort. Draw Calls are consumed during shadowing by rendering visible objects into a separate buffer (known as the shadow map) as either a shadow caster, a shadow receiver, or both. Each object that is rendered into this map will consume another Draw Call, which makes shadows a huge performance cost multiplier, so it is often a setting that games will expose to users via quality settings, allowing users with weaker hardware to reduce the effect or even disable it entirely. Shadow Distance is a global multiplier for runtime shadow rendering. The fewer shadows we need to draw, the happier the entire rendering process will be. There is little point in rendering shadows at a great distance from the camera, so this setting should be configured specific to our game and how much shadowing we expect to witness during gameplay. It is also a common setting that is exposed to the user to reduce the burden of rendering shadows. Higher values of Shadow Resolution and Shadow Cascades will increase our memory bandwidth and fill rate consumption. Both of these settings can help curb the effects of artefacts in shadow rendering, but at the cost of a much larger shadow map size that must be moved around and of the canvas size to draw to. The Unity documentation contains an excellent summary on the topic of the aliasing effect of shadow maps and how the Shadow Cascades feature helps to solve the problem: http://docs.unity3d.com/Manual/DirLightShadows.html. It's worth noting that Soft Shadows do not consume any more memory or CPU overhead relative to Hard Shadows, as the only difference is a more complex Shader. This means that applications with enough fill rate to spare can enjoy the improved graphical fidelity of Soft Shadows. Optimizing graphics for mobile Unity's ability to deploy to mobile devices has contributed greatly to its popularity among hobbyist, small, and mid-size development teams. As such, it would be prudent to cover some approaches that are more beneficial for mobile platforms than for desktop and other devices. Note that any, and all, of the following approaches may become obsolete soon, if they aren't already. The mobile device market is moving blazingly fast, and the following techniques as they apply to mobile devices merely reflect conventional wisdom from the last half decade. We should occasionally test the assumptions behind these approaches from time-to-time to see whether the limitations of mobile devices still fit the mobile marketplace. Minimize Draw Calls Mobile applications are more often bottlenecked on Draw Calls than on fill rate. Not that fill rate concerns should be ignored (nothing should, ever!), but this makes it almost necessary for any mobile application of reasonable quality to implement Mesh Combining, Batching, and Atlasing techniques from the very beginning. 
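Circling back to the shadow settings discussed above, Shadow Distance is also accessible from script, which makes it straightforward to expose as a user-facing quality option. The following is a hedged sketch; the distance values and the three-level scheme are arbitrary examples.

using UnityEngine;

// Expose shadow cost to the player: scale back the global Shadow Distance,
// effectively disabling shadows at the lowest level.
public class ShadowQualityOption : MonoBehaviour
{
    // 0 = off, 1 = near shadows only, 2 = full distance.
    public void ApplyShadowSetting(int level)
    {
        if (level == 0)
        {
            QualitySettings.shadowDistance = 0f;
        }
        else if (level == 1)
        {
            QualitySettings.shadowDistance = 40f;
        }
        else
        {
            QualitySettings.shadowDistance = 150f;
        }
    }
}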
Deferred Rendering is also the preferred technique as it fits well with other mobile-specific concerns, such as avoiding transparency and having too many animated characters. Minimize the Material count This concern goes hand in hand with the concepts of Batching and Atlasing. The fewer Materials we use, the fewer Draw Calls will be necessary. This strategy will also help with concerns relating to VRAM and memory bandwidth, which tend to be very limited on mobile devices. Minimize texture size and Material count Most mobile devices feature a very small Texture Cache relative to desktop GPUs. For instance, the iPhone 3G can only support a total texture size of 1024x1024 due to running OpenGLES1.1 with simple vertex rendering techniques. Meanwhile the iPhone 3GS, iPhone 4, and iPad generation run OpenGLES 2.0, which only supports textures up to 2048x2048. Later generations can support textures up to 4096x4096. Double check the device hardware we are targeting to be sure it supports the texture file sizes we wish to use (there are too many Android devices to list here).
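One hedged way to double-check this during device testing is to log the largest texture dimension the current device reports:

using UnityEngine;

// Log the maximum texture size the device claims to support, so oversized
// assets can be caught during testing rather than discovered via slow loads.
public class TextureSizeProbe : MonoBehaviour
{
    void Start()
    {
        Debug.Log("Max supported texture size: " + SystemInfo.maxTextureSize);
    }
}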
However, later-generation devices are never the most common devices in the mobile marketplace. If we wish our game to reach a wide audience (increasing its chances of success), then we must be willing to support weaker hardware. Note that textures that are too large for the GPU will be downscaled by the CPU during initialization, wasting valuable loading time, and leaving us with unintended graphical fidelity. This makes texture reuse of paramount importance for mobile devices due to the limited VRAM and Texture Cache sizes available. Make textures square and power-of-2 The GPU will find it difficult, or simply be unable to compress the texture if it is not in a square format, so make sure you stick to the common development convention and keep things square and sized to a power of 2. Use the lowest possible precision formats in Shaders Mobile GPUs are particularly sensitive to precision formats in its Shaders, so the smallest formats should be used. On a related note, format conversion should be avoided for the same reason. Avoid Alpha Testing Mobile GPUs haven't quite reached the same levels of chip optimization as desktop GPUs, and Alpha Testing remains a particularly costly task on mobile devices. In most cases it should simply be avoided in favor of Alpha Blending. Summary If you've made it this far without skipping ahead, then congratulations are in order. That was a lot of information to absorb for just one component of the Unity Engine, but then it is clearly the most complicated of them all, requiring a matching depth of explanation. Hopefully, you've learned a lot of approaches to help you improve your rendering performance and enough about the rendering pipeline to know how to use them responsibly! To learn more about Unity 5, the following books published by Packt Publishing (https://www.packtpub.com/) are recommended: Unity 5 Game Optimization (https://www.packtpub.com/game-development/unity-5-game-optimization) Unity 5.x By Example (https://www.packtpub.com/game-development/unity-5x-example) Unity 5.x Cookbook (https://www.packtpub.com/game-development/unity-5x-cookbook) Unity 5 for Android Essentials (https://www.packtpub.com/game-development/unity-5-android-essentials) Resources for Article: Further resources on this subject: The Vertex Functions [article] UI elements and their implementation [article] Routing for Yii Demystified [article]

article-image-azure-stream-analytics-7-reasons-to-choose
Sugandha Lahoti
19 Apr 2018
11 min read
Save for later

How to get started with Azure Stream Analytics and 7 reasons to choose it

Sugandha Lahoti
19 Apr 2018
11 min read
In this article, we will introduce Azure Stream Analytics and show how to configure it. We will then look at some of the key advantages of the Stream Analytics platform, including how it enhances developer productivity and reduces the Total Cost of Ownership (TCO) of building and maintaining a scalable streaming solution, among other factors.
What is Azure Stream Analytics and how does it work?
Microsoft Azure Stream Analytics falls into the category of PaaS services where the customers don't need to manage the underlying infrastructure. However, they are still responsible for managing the application that they build on top of the PaaS service and, more importantly, the customer data. Azure Stream Analytics is a fully managed, serverless PaaS service that is built for real-time analytics computations on streaming data. The service can consume from a multitude of sources. Azure will take care of the hosting, scaling, and management of the underlying hardware and software ecosystem. The following are some examples of different use cases for Azure Stream Analytics. When we are designing a solution that involves streaming data, in almost every case, Azure Stream Analytics will be part of a larger solution that the customer is trying to deploy. This can be real-time dashboarding for monitoring purposes, real-time monitoring of IT infrastructure equipment, preventive maintenance (auto-manufacturing, vending machines, and so on), or fraud detection. This means that the streaming solution needs to be thoughtful about providing out-of-the-box integration with a whole plethora of services that could help build a solution in a relatively quick fashion. Let's review a usage pattern for Azure Stream Analytics using a canonical model: We can see devices and applications that generate data on the left in the preceding illustration; they can connect directly or through cloud gateways to your stream ingest sources. Azure Stream Analytics can pick up the data from these ingest sources, augment it with reference data, run the necessary analytics, gather insights, and push them downstream for action. You can trigger business processes, write the data to a database, or directly view the anomalies on a dashboard. In the previous canonical pattern, a number of streaming ingest technologies are used; let's review them in the following section:
Event Hub: A global-scale event ingestion system, where one can publish events from millions of sensors and applications. It guarantees that as soon as an event comes in, a subscriber can pick that event up within a few milliseconds. You can have one or more subscribers as well, depending on your business requirements. Typical use cases for an Event Hub are real-time financial fraud detection and social media sentiment analytics.
IoT Hub: IoT Hub is very similar to Event Hub but takes the concept a lot further forward, in that you can take bidirectional actions. It will not only ingest data from sensors in real time but can also send commands back to them. It also enables you to do things like device management. Fundamental aspects such as security, a primary need for IoT, are built into it.
Azure Blob: Azure Blob storage is massively scalable object storage for unstructured data, and is accessible through HTTP or HTTPS. Blob storage can expose data publicly to the world or store application data privately.
Reference Data: This is auxiliary data that is either static or that changes slowly.
Reference data can be used to enrich incoming data to perform correlation and lookups. On the ingress side, with a few clicks, you can connect to Event Hub, IoT Hub, or Blob storage. The streaming data can be enriched with reference data in the Blob store. Data from the ingress process will be consumed by the Azure Stream Analytics service; we can call machine learning (ML) for event scoring in real time. The data can be egressed to live dashboards in Power BI, or pushed back to Event Hub, from where dashboards and reports can pick it up. The following is a summary of the ingress, egress, and archiving options:
Ingress choices: Event Hub, IoT Hub, Blob storage
Egress choices:
Live Dashboards: Power BI, Event Hub
Driving workflows: Event Hubs, Service Bus
Archiving and post analysis: Blob storage, Document DB, Data Lake, SQL Server, Table storage, Azure Functions
One key point to note is that a number of customers push data from Stream Analytics processing (the egress point) to Event Hub and then add an Azure-hosted website as their own custom dashboard. One can also drive workflows by pushing the events to Azure Service Bus and Power BI. For example, a customer can build an IoT support solution that detects an anomaly in connected appliances and pushes the result into Azure Service Bus. A worker role can run as a daemon to pull the messages and create support tickets using the Dynamics CRM API. The ticket can then be archived and analyzed later in Power BI. This solution eliminates the need for the customer to log a ticket manually; the system does it automatically based on predefined anomaly thresholds. This is just one sample of a real-time connected solution. There are a number of use cases that don't even involve real-time alerts. You can also use it to aggregate data, filter data, and store it in Blob storage, Azure Data Lake (ADL), Document DB, or SQL, and then run U-SQL in Azure Data Lake Analytics (ADLA) or HDInsight, or even call ML models for things like predictive maintenance.
Configuring Azure Stream Analytics
Azure Stream Analytics (ASA) is a fully managed, cost-effective real-time event processing engine. Stream Analytics makes it easy to set up real-time analytic computations on data streaming from devices, sensors, websites, social media, applications, infrastructure systems, and more. The service can be hosted with a few clicks in the Azure portal; users can author a Stream Analytics job specifying the input source of the streaming data, the output sink for the results of the job, and a data transformation expressed in a SQL-like language. The jobs can be monitored, and you can adjust the scale/speed of the job in the Azure portal to scale from a few kilobytes to a gigabyte or more of events processed per second. Let's review how to configure Azure Stream Analytics step by step:
1. Log in to the Azure portal using your Azure credentials, click on New, and search for Stream Analytics job:
2. Click on Create to create an Azure Stream Analytics instance:
3. Provide a Job Name and Resource group name for the Azure Stream Analytics job deployment:
4. After a few minutes, the deployment will be complete:
5. Review the following in the deployment, an audit trail of the creation:
6. Ability to scale up and down using a simple UI:
7. Built-in Query interface to run queries:
8.
Run Queries using a SQL-like interface, with the ability to accept late-arriving events with simple GUI-based configuration: Key advantages of Azure Stream Analytics Let's quickly review how traditional streaming solutions are built; the core deployment starts with procuring and setting up the basic infrastructure necessary to host the streaming solution. Once this is done, we can then build the ingress and egress solution on top of the deployed infrastructure. Once the core infrastructure is built, customer tools will be used to build business intelligence (BI) or machine-learning integration. After the system goes into production, scaling during runtime needs to be taken care of by capturing the telemetry and building and configuration of HW/SW resources as necessary. As business needs ramp up, so does the monitoring and troubleshooting. Security Azure Stream Analytics provides a number of inbuilt security mechanics in areas such as authentication, authorization, auditing, segmentation, and data protection. Let's quickly review them. Authentication support: Authentication support in Azure Stream Analytics is done at portal level. Users should have a valid subscription ID and password to access the Azure Stream Analytics job. Authorization: Authorization is the process during login where users provide their credentials (for example, user account name and password, smart card and PIN, Secure ID and PIN, and so on) to prove their Microsoft identity so that they can retrieve their access token from the authentication server. Authorization is supported by Azure Stream Analytics. Only authenticated/authorized users can access the Azure Stream Analytics job. Support for encryption: Data-at-rest using client-side encryption and TDE. Support for key management: Key management is supported through ingress and egress points. Programmer productivity One of the key features of Azure Stream Analytics is developer productivity, and it is driven a lot by the query language that is based on SQL constructs. It provides a wide array of functions for analytics on streaming data, all the way from simple data manipulation functions, data and time functions, temporal functions, mathematical, string, scaling, and much more. It provides two features natively out of the box. Let's review the features in detail in the next section Declarative SQL constructs Built-in temporal semantics Declarative SQL constructs A simple-to-use UI is provided and queries can be constructed using the provided user interface. 
The following is the feature set of the declarative SQL constructs:
Filters (Where)
Projections (Select)
Time-window and property-based aggregates (Group By)
Time-shifted joins (specifying time bounds within which the joining events must occur)
All combinations thereof
The following is a summary of the different constructs used to manipulate streaming data:
Data manipulation: SELECT, FROM, WHERE, GROUP BY, HAVING, CASE WHEN THEN ELSE, INNER/LEFT OUTER JOIN, UNION, CROSS/OUTER APPLY, CAST, INTO, ORDER BY ASC/DESC
Date and time functions: DateName, DatePart, Day, Month, Year, DateDiff, DateTimeFromParts, DateAdd
Temporal functions: Lag, IsFirst, Last, CollectTop
Aggregate functions: SUM, COUNT, AVG, MIN, MAX, STDEV, STDEVP, VAR, VARP, TopOne
Mathematical functions: ABS, CEILING, EXP, FLOOR, POWER, SIGN, SQUARE, SQRT
String functions: Len, Concat, CharIndex, Substring, Lower, Upper, PatIndex
Scaling extensions: WITH, PARTITION BY, OVER
Geospatial: CreatePoint, CreatePolygon, CreateLineString, ST_DISTANCE, ST_WITHIN, ST_OVERLAPS, ST_INTERSECTS
Built-in temporal semantics
Azure Stream Analytics provides prebuilt temporal semantics to query time-based information and merge streams with multiple timelines. Here is a list of the temporal semantics:
Application or ingest timestamp
Windowing functions
Policies for event ordering
Policies to manage latencies between ingress sources
Manage streams with multiple timelines
Join multiple streams of temporal windows
Join streaming data with data-at-rest
Lowest total cost of ownership
Azure Stream Analytics is a fully managed PaaS service on Azure. There are no upfront costs or costs involved in setting up compute clusters and complex hardware wiring like you would have with an on-premises solution. It's a simple job service where there is no cluster provisioning, and customers pay for what they use. A key consideration is variable workloads. With Azure Stream Analytics, you do not need to design your system for peak throughput and can add more compute footprint as you go. If you have scenarios where data comes in spurts, you do not want to design a system for peak usage and leave it underutilized the rest of the time. Let's say you are building a traffic monitoring solution; naturally, there is the expectation that peaks will show up during morning and evening rush hours. However, you would not want to design your system or investments to cater to these extremes. The cloud elasticity that Azure offers is a perfect fit here. Azure Stream Analytics also offers fast recovery through checkpointing and at-least-once event delivery.
Mission-critical and enterprise-grade scalability and availability
Azure Stream Analytics is available across multiple worldwide data centers and sovereign clouds. Azure Stream Analytics promises three-nines (99.9%) availability that is financially guaranteed, with built-in auto recovery so that you will never lose data. The good thing is that customers do not need to write a single line of code to achieve this. The bottom line is that enterprise readiness is built into the platform. Here is a summary of the enterprise-ready features:
Distributed scale-out architecture
Ingests millions of events per second
Accommodates variable loads
Easily adds incremental resources to scale
Available across multiple data centers and sovereign clouds
Global compliance
In addition, Azure Stream Analytics is compliant with many industry and government certifications. It is HIPAA-compliant out of the box and suitable for hosting healthcare applications.
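To give a feel for how these constructs combine in practice, here is a small, hedged illustration of a job query that aggregates sensor events over a one-minute tumbling window. The input, output, and field names (SensorInput, AlertOutput, DeviceId, Temperature, EventTime) are assumptions made for this example only.

SELECT
    DeviceId,
    COUNT(*) AS EventCount,
    AVG(Temperature) AS AvgTemperature
INTO
    AlertOutput
FROM
    SensorInput TIMESTAMP BY EventTime
GROUP BY
    DeviceId,
    TumblingWindow(second, 60)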
This compliance coverage is how customers can scale up their businesses confidently. Here is a summary of global compliance:
ISO 27001
ISO 27018
SOC 1 Type 2
SOC 2 Type 2
SOC 3 Type 2
HIPAA/HITECH
PCI DSS Level 1
European Union Model Clauses
China GB 18030
Thus, we reviewed Azure Stream Analytics and understood its key advantages. These advantages include:
Ease of development and developer productivity
A reduced total cost of ownership
Global compliance certifications
The value of a PaaS-based streaming solution for hosting mission-critical applications securely
This post is taken from the book Stream Analytics with Microsoft Azure, written by Anindita Basak, Krishna Venkataraman, Ryan Murphy, and Manpreet Singh. This book will help you to understand Azure Stream Analytics so that you can develop efficient analytics solutions that can work with any type of data.
Say hello to Streaming Analytics
How to build a live interactive visual dashboard in Power BI with Azure Stream
Performing Vehicle Telemetry job analysis with Azure Stream Analytics tools

article-image-an-introduction-to-typescript-types-for-asp-net-core-tutorial
Prasad Ramesh
23 Feb 2019
9 min read
Save for later

An introduction to TypeScript types for ASP.NET core [Tutorial]

Prasad Ramesh
23 Feb 2019
9 min read
JavaScript, being a flexible scripting language along with its dynamic type system, can become harder to maintain, the more a project scales up and as the team's staff changes. There are many tools and languages that can assist with this situation, one of which is TypeScript. This article is an excerpt from a book written by Tamir Dresher, Amir Zuker, and Shay Friedman titled Hands-On Full-Stack Web Development with ASP.NET Core. In this book, you will learn how to build web applications using Angular, React, and Vue. Back in the day when JavaScript ES5 was the active version, writing JavaScript code that adhered to encapsulation and modularity was not a trivial task. Things such as prototypes, namespaces, and self-executing functions could certainly improve your code; unfortunately, many JavaScript developers did not invest the necessary effort to improve the manageability of their code using those constructs. Consequently, new technologies surfaced to help with this matter; a popular example relevant at that time was CoffeeScript. Then, JavaScript ES6 was released. ES6, with its great added features, such as modules, classes, and more specific variable scoping, basically overturned CoffeeScript and similar alternatives at that time. Evidently, JavaScript is still missing a key feature, and that is type information. JavaScript's dynamic type system is a strong feature at times; unfortunately, the manageability and reusability of existing code suffers due to this fact. For that key purpose, every large-scale project should examine the use of tools that compensate the situation; those leading the field now are TypeScript and Flow. TypeScript, an open source scripting language, is developed and maintained by Microsoft. Its killer feature is bringing static typing to JavaScript-based systems and allowing you to annotate the code with type information. Additionally, it supports augmenting existing JavaScript code with external type information, similar to the header files pattern, as it's commonly known in C++. TypeScript is a superset of JavaScript, that meaning it has seamless integration with JavaScript, and makes JavaScript developers feel right at home. Being a JavaScript developer makes you a TypeScript developer as well since JavaScript code is actually valid TypeScript. Importantly, this makes the gradual upgrade of existing JavaScript code fairly easy. Another example of how great TypeScript is is the fact that Angular, a framework built by Google, adopts it extensively, and is actually implemented in TypeScript internally. The TypeScript compiler TypeScript is a superset of JavaScript, yet browsers and other platforms don't recognize it, nor are they able to execute it. For that purpose, TypeScript is processed by the TypeScript compiler, which transpiles your TypeScript code to JavaScript. Then, you can run the transpiled code basically everywhere JavaScript is supported. The tsc is available as an npm package that you can install globally. Open your favorite terminal and install it using the following command: npm install -g typescript Afterward, you can use the command tsc to execute the tsc: TypeScript files have a .ts file extension; this is where you write your TypeScript code. 
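For instance, a minimal file might look like the following; the file name greeter.ts and its contents are our own illustrative example.

// greeter.ts - a tiny TypeScript file used only for illustration
function greet(name: string): string {
    return "Hello, " + name;
}
console.log(greet("TypeScript"));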
When needed, you use the TypeScript compiler to transpile your TypeScript code, which creates the JavaScript implementation of the code; you do that by executing the following command: tsc <typescript_filename>
Types in TypeScript
One of the key features of TypeScript is bringing static typing to JavaScript. In TypeScript, you annotate your code with type information; every declaration can be defined with its associated typing. In the declaration, you do that by adding a colon followed by the type. For example, the following statement defines a variable of type number: let productsCount: number; Annotating your code with type information enables validation. The TypeScript compiler should throw errors if you try to do anything that is not supported by the specified type, thus productivity and discoverability benefit substantially through type safety. Additionally, with proper tooling support, you get IntelliSense support and automatic code completion. TypeScript supports a handful of types to reflect most constructs in JavaScript. We will cover these next, starting with basic types.
Basic types
TypeScript supports a handful of built-in types; the following are examples of the most common ones: const fullName: string = "John Doe"; const age: number = 6; const isDone: boolean = false; const d: Date = new Date(); const canBeAnything: any = 6; As you can see, TypeScript supports basic primitive types such as string, number, boolean, and date. Another thing to notice is the type any. The any type is a valid TypeScript type that indicates that the declaration can be of any given type and all operations on it should be allowed and considered safe.
Arrays
TypeScript arrays can be written in one of two ways. You can specify the item type followed by square brackets, or you can use the generic Array type, as follows: const list: number[] = [1, 2, 3]; const list: Array<number> = [1, 2, 3];
Enums
Enums allow you to specify named values that correspond to numeric or string values. If you don't manually set the corresponding value, the named values in the enum are numbered, starting at 0 by default. Enums are extremely useful, and are commonly used to represent a set of predefined options or lookup values. In TypeScript, you define enums by using the enum keyword as follows: enum Color {Red, Green, Blue} const c: Color = Color.Red;
Objects
In JavaScript, objects contain key/value pairs that form the shape of the object. TypeScript supports object types as well, very similar to how you write plain JavaScript objects. Consider the following plain JavaScript object: const obj = { x: 5, y: 6 } In the preceding code, obj is a plain object defined with two keys, x and y, both with number values. To define an object type in TypeScript, you use a similar format: curly braces to represent an object, with the keys and their type information inside: { x:number, y:number } Together, this is how you associate a type to an object: const obj: { x:number, y:number } = { x: 5, y: 6 } In the preceding code, the obj object is initialized the same as before, only now along with its type information.
Functions
Functions have type information as well. Functions are composed of parameters and a return value: function buildName(firstName: string, lastName: string): string { return firstName + " " + lastName; } Each parameter is annotated with its type, and then the return value's type is added at the end of the function signature.
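The same annotations work for arrow functions too; a quick, hedged aside (the names multiply and area are ours):

// Arrow functions carry the same parameter and return type annotations.
const multiply = (x: number, y: number): number => x * y;
const area: number = multiply(3, 4); // 12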
Unlike JavaScript, TypeScript enforces that calls to functions adhere to the defined signature. For example, if you try to call buildName with anything other than two string parameters, the TypeScript compiler will not allow it. You can define more flexible function signatures if you like. TypeScript supports defining optional parameters by using the question mark symbol, ?: function buildName(firstName: string, lastName: string, title?: string): string { return title + " " + firstName + " " + lastName; } const name = buildName('John', 'Doe'); // valid call The preceding function can now be called with either 2 or 3 string parameters. Everything in JavaScript is still supported, of course. You can still use ES6 default parameter values and rest parameters as well.
Type inference
Until now, the illustrated code included the type information in the same statement where the actual value assignment took place. TypeScript supports type inference, meaning that it tries to resolve the relevant type information if you don't specify any. Let's look at some examples: const count = 2; In the preceding simple example, TypeScript is smart enough to realize that the type of count is a number without you having to explicitly specify that: function sum(x1: number, x2: number) { return x1 + x2; } const mySum = sum(1, 2); The preceding example demonstrates the power of type inference in TypeScript. If you pay close attention, the code doesn't specify the return value type of the sum function. TypeScript evaluates the code and determines that the return type is number in this case. As a result, the type of the variable mySum is also a number. Type inference is an extremely useful feature, since it saves the considerable time that would otherwise be spent adding endless type information.
Type casting
Having the code statically typed leads to type safety. TypeScript does not let you perform actions that are not considered safe and supported. In the following example, the loadProducts function is defined with a return value of type any, thus the type of products is inferred as any: function loadProducts(): any { return [{name: 'Book1'}, {name: 'Book2'}]; } const products = loadProducts(); // products is of type 'any' In some cases, you may actually know the expected type. You can instruct TypeScript of the type by using a cast. Typecasting in TypeScript is supported by using the as operator or angle brackets, as follows: const products = loadProducts() as {name: string}[]; const products = <{name: string}[]>loadProducts(); Using the preceding cast, you have full support and recognition by TypeScript that products is now an array of objects with a name key.
Type aliasing
TypeScript supports defining new types via type aliases. You can use type aliases to reuse type information in multiple declarations, for example: type person = {name: string}; let employee: person; let contact: person; Given the preceding example, the code defines a type alias named person as an object with a name key of type string. Afterwards, the rest of the code can reference this type where needed. Type aliases are somewhat similar to interfaces (covered next), but can name primitives, unions, and tuples. Unlike interfaces, type aliases cannot be extended or implemented from, and so interfaces are generally preferred. Unions and intersections allow you to construct a type from multiple existing types, while tuples express arrays with a fixed number of known elements.
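As a brief, hedged illustration of those last two concepts (the names Id, userId, and point are ours):

// A union type: a value that may be either a number or a string.
type Id = number | string;
let userId: Id = 42;
userId = "abc-42"; // also valid

// A tuple: an array with a fixed number of known element types.
let point: [number, number] = [3, 5];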
You can read more about these at https://www.typescriptlang.org/docs/handbook/basic-types.html and https://www.typescriptlang.org/docs/handbook/advanced-types.html. TypeScript is one of the most popular scripting languages in respect to JavaScript-based code bases. At its very core, it brings static typing and assists tremendously with maintainability and productivity. In this article, we covered TypeScript types. TypeScript advances at a steady pace and shows great promise for using it in large projects. It has more features that weren't addressed in this tutorial, such as generics, mapped types, abstract classes, project references, and much more that will help you for ASP.NET core development. To know more about interfaces and compilers of TypeScript, check out the book Hands-On Full-Stack Web Development with ASP.NET Core. Typescript 3.3 is finally released! Future of ESLint support in TypeScript Introducing ReX.js v1.0.0 a companion library for RegEx written in TypeScript
article-image-numpy-array-object
Packt
03 Mar 2017
18 min read
Save for later

The NumPy array object

Packt
03 Mar 2017
18 min read
In this article by Armando Fandango, author of the book Python Data Analysis - Second Edition, we discuss how NumPy provides a multidimensional array object called ndarray. NumPy arrays are typed arrays of fixed size. Python lists are heterogeneous, and thus the elements of a list may contain any object type, while NumPy arrays are homogeneous and can contain objects of only one type. An ndarray consists of two parts, which are as follows:
The actual data that is stored in a contiguous block of memory
The metadata describing the actual data
Since the actual data is stored in a contiguous block of memory, loading a large data set as an ndarray depends on the availability of a large enough contiguous block of memory. Most of the array methods and functions in NumPy leave the actual data unaffected and only modify the metadata. Actually, we made a one-dimensional array that held a set of numbers. The ndarray can have more than a single dimension. (For more resources related to this topic, see here.)
Advantages of NumPy arrays
The NumPy array is, in general, homogeneous (there is a particular record array type that is heterogeneous); the items in the array have to be of the same type. The advantage is that if we know that the items in an array are of the same type, it is easy to ascertain the storage size needed for the array. NumPy arrays can execute vectorized operations, processing a complete array, in contrast to Python lists, where you usually have to loop through the list and execute the operation on each element. NumPy arrays are indexed from 0, just like lists in Python. NumPy utilizes an optimized C API to make the array operations particularly quick. We will make an array with the arange() subroutine again. You will see snippets from Jupyter Notebook sessions where NumPy is already imported with the instruction import numpy as np. Here's how to get the data type of an array: In: a = np.arange(5) In: a.dtype Out: dtype('int64') The data type of the array a is int64 (at least on my computer), but you may get int32 as the output if you are using 32-bit Python. In both cases, we are dealing with integers (64 bit or 32 bit). Besides the data type of an array, it is crucial to know its shape. A vector is commonly used in mathematics, but most of the time we need higher-dimensional objects. Let's find out the shape of the vector we produced a few minutes ago: In: a Out: array([0, 1, 2, 3, 4]) In: a.shape Out: (5,) As you can see, the vector has five components with values ranging from 0 to 4. The shape property of the array is a tuple; in this instance, a tuple of 1 element, which holds the length in each dimension.
Creating a multidimensional array
Now that we know how to create a vector, we are set to create a multidimensional NumPy array. After we produce the matrix, we will again need to show its shape, as demonstrated in the following code snippets: Create a multidimensional array as follows: In: m = np.array([np.arange(2), np.arange(2)]) In: m Out: array([[0, 1], [0, 1]]) We can show the array shape as follows: In: m.shape Out: (2, 2) We made a 2 x 2 array with the arange() subroutine. The array() function creates an array from an object that you pass to it. The object has to be array-like, for example, a Python list. In the previous example, we passed a list of arrays. The object is the only required parameter of the array() function. NumPy functions tend to have a heap of optional arguments with predefined default options.
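As a short, hedged aside, NumPy ships several other array-creation functions whose optional arguments (such as dtype) fall back to sensible defaults; the calls below are illustrative:
In: np.zeros((2, 3))   # floats by default
Out: array([[ 0., 0., 0.], [ 0., 0., 0.]])
In: np.ones((2, 2), dtype=np.int32)   # explicitly request 32-bit integers
Out: array([[1, 1], [1, 1]], dtype=int32)
In: np.full((2, 2), 7)   # fill every element with the given value
Out: array([[7, 7], [7, 7]])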
Selecting NumPy array elements From time to time, we will wish to select a specific constituent of an array. We will take a look at how to do this, but to kick off, let's make a 2 x 2 matrix again: In: a = np.array([[1,2],[3,4]]) In: a Out: array([[1, 2], [3, 4]]) The matrix was made this time by giving the array() function a list of lists. We will now choose each item of the matrix one at a time, as shown in the following code snippet. Recall that the index numbers begin from 0: In: a[0,0] Out: 1 In: a[0,1] Out: 2 In: a[1,0] Out: 3 In: a[1,1] Out: 4 As you can see, choosing elements of an array is fairly simple. For the array a, we just employ the notation a[m,n], where m and n are the indices of the item in the array. Have a look at the following figure for your reference: NumPy numerical types Python has an integer type, a float type, and complex type; nonetheless, this is not sufficient for scientific calculations. In practice, we still demand more data types with varying precisions and, consequently, different storage sizes of the type. For this reason, NumPy has many more data types. The bulk of the NumPy mathematical types ends with a number. This number designates the count of bits related to the type. The following table (adapted from the NumPy user guide) presents an overview of NumPy numerical types: Type Description bool Boolean (True or False) stored as a bit inti Platform integer (normally either int32 or int64) int8 Byte (-128 to 127) int16 Integer (-32768 to 32767) int32 Integer (-2 ** 31 to 2 ** 31 -1) int64 Integer (-2 ** 63 to 2 ** 63 -1) uint8 Unsigned integer (0 to 255) uint16 Unsigned integer (0 to 65535) uint32 Unsigned integer (0 to 2 ** 32 - 1) uint64 Unsigned integer (0 to 2 ** 64 - 1) float16 Half precision float: sign bit, 5 bits exponent, and 10 bits mantissa float32 Single precision float: sign bit, 8 bits exponent, and 23 bits mantissa float64 or float Double precision float: sign bit, 11 bits exponent, and 52 bits mantissa complex64 Complex number, represented by two 32-bit floats (real and imaginary components) complex128 or complex Complex number, represented by two 64-bit floats (real and imaginary components) For each data type, there exists a matching conversion function: In: np.float64(42) Out: 42.0 In: np.int8(42.0) Out: 42 In: np.bool(42) Out: True In: np.bool(0) Out: False In: np.bool(42.0) Out: True In: np.float(True) Out: 1.0 In: np.float(False) Out: 0.0 Many functions have a data type argument, which is frequently optional: In: np.arange(7, dtype= np.uint16) Out: array([0, 1, 2, 3, 4, 5, 6], dtype=uint16) It is important to be aware that you are not allowed to change a complex number into an integer. Attempting to do that sparks off a TypeError: In: np.int(42.0 + 1.j) Traceback (most recent call last): <ipython-input-24-5c1cd108488d> in <module>() ----> 1 np.int(42.0 + 1.j) TypeError: can't convert complex to int The same goes for conversion of a complex number into a floating-point number. By the way, the j component is the imaginary coefficient of a complex number. Even so, you can convert a floating-point number to a complex number, for example, complex(1.0). The real and imaginary pieces of a complex number can be pulled out with the real() and imag() functions, respectively. Data type objects Data type objects are instances of the numpy.dtype class. Once again, arrays have a data type. To be exact, each element in a NumPy array has the same data type. The data type object can tell you the size of the data in bytes. 
The size in bytes is given by the itemsize property of the dtype class : In: a.dtype.itemsize Out: 8 Character codes Character codes are included for backward compatibility with Numeric. Numeric is the predecessor of NumPy. Its use is not recommended, but the code is supplied here because it pops up in various locations. You should use the dtype object instead. The following table lists several different data types and character codes related to them: Type Character code integer i Unsigned integer u Single precision float f Double precision float d bool b complex D string S unicode U Void V Take a look at the following code to produce an array of single precision floats: In: arange(7, dtype='f') Out: array([ 0., 1., 2., 3., 4., 5., 6.], dtype=float32) Likewise, the following code creates an array of complex numbers: In: arange(7, dtype='D') In: arange(7, dtype='D') Out: array([ 0.+0.j, 1.+0.j, 2.+0.j, 3.+0.j, 4.+0.j, 5.+0.j, 6.+0.j]) The dtype constructors We have a variety of means to create data types. Take the case of floating-point data (have a look at dtypeconstructors.py in this book's code bundle): We can use the general Python float, as shown in the following lines of code: In: np.dtype(float) Out: dtype('float64') We can specify a single precision float with a character code: In: np.dtype('f') Out: dtype('float32') We can use a double precision float with a character code: In: np.dtype('d') Out: dtype('float64') We can pass the dtype constructor a two-character code. The first character stands for the type; the second character is a number specifying the number of bytes in the type (the numbers 2, 4, and 8 correspond to floats of 16, 32, and 64 bits, respectively): In: np.dtype('f8') Out: dtype('float64') A (truncated) list of all the full data type codes can be found by applying sctypeDict.keys(): In: np.sctypeDict.keys() In: np.sctypeDict.keys() Out: dict_keys(['?', 0, 'byte', 'b', 1, 'ubyte', 'B', 2, 'short', 'h', 3, 'ushort', 'H', 4, 'i', 5, 'uint', 'I', 6, 'intp', 'p', 7, 'uintp', 'P', 8, 'long', 'l', 'L', 'longlong', 'q', 9, 'ulonglong', 'Q', 10, 'half', 'e', 23, 'f', 11, 'double', 'd', 12, 'longdouble', 'g', 13, 'cfloat', 'F', 14, 'cdouble', 'D', 15, 'clongdouble', 'G', 16, 'O', 17, 'S', 18, 'unicode', 'U', 19, 'void', 'V', 20, 'M', 21, 'm', 22, 'bool8', 'Bool', 'b1', 'float16', 'Float16', 'f2', 'float32', 'Float32', 'f4', 'float64', ' Float64', 'f8', 'float128', 'Float128', 'f16', 'complex64', 'Complex32', 'c8', 'complex128', 'Complex64', 'c16', 'complex256', 'Complex128', 'c32', 'object0', 'Object0', 'bytes0', 'Bytes0', 'str0', 'Str0', 'void0', 'Void0', 'datetime64', 'Datetime64', 'M8', 'timedelta64', 'Timedelta64', 'm8', 'int64', 'uint64', 'Int64', 'UInt64', 'i8', 'u8', 'int32', 'uint32', 'Int32', 'UInt32', 'i4', 'u4', 'int16', 'uint16', 'Int16', 'UInt16', 'i2', 'u2', 'int8', 'uint8', 'Int8', 'UInt8', 'i1', 'u1', 'complex_', 'int0', 'uint0', 'single', 'csingle', 'singlecomplex', 'float_', 'intc', 'uintc', 'int_', 'longfloat', 'clongfloat', 'longcomplex', 'bool_', 'unicode_', 'object_', 'bytes_', 'str_', 'string_', 'int', 'float', 'complex', 'bool', 'object', 'str', 'bytes', 'a']) The dtype attributes The dtype class has a number of useful properties. 
For instance, we can get information about the character code of a data type through the properties of dtype: In: t = np.dtype('Float64') In: t.char Out: 'd' The type attribute corresponds to the type of object of the array elements: In: t.type Out: numpy.float64 The str attribute of dtype gives a string representation of a data type. It begins with a character representing endianness, if appropriate, then a character code, succeeded by a number corresponding to the number of bytes that each array item needs. Endianness, here, entails the way bytes are ordered inside a 32- or 64-bit word. In the big-endian order, the most significant byte is stored first, indicated by >. In the little-endian order, the least significant byte is stored first, indicated by <, as exemplified in the following lines of code: In: t.str Out: '<f8' One-dimensional slicing and indexing Slicing of one-dimensional NumPy arrays works just like the slicing of standard Python lists. Let's define an array containing the numbers 0, 1, 2, and so on up to and including 8. We can select a part of the array from indexes 3 to 7, which extracts the elements of the arrays 3 through 6: In: a = np.arange(9) In: a[3:7] Out: array([3, 4, 5, 6]) We can choose elements from indexes the 0 to 7 with an increment of 2: In: a[:7:2] Out: array([0, 2, 4, 6]) Just as in Python, we can use negative indices and reverse the array: In: a[::-1] Out: array([8, 7, 6, 5, 4, 3, 2, 1, 0]) Manipulating array shapes We have already learned about the reshape() function. Another repeating chore is the flattening of arrays. Flattening in this setting entails transforming a multidimensional array into a one-dimensional array. Let us create an array b that we shall use for practicing the further examples: In: b = np.arange(24).reshape(2,3,4) In: print(b) Out: [[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]) We can manipulate array shapes using the following functions: Ravel: We can accomplish this with the ravel() function as follows: In: b Out: array([[[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11]], [[12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]]) In: b.ravel() Out: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]) Flatten: The appropriately named function, flatten(), does the same as ravel(). However, flatten() always allocates new memory, whereas ravel gives back a view of the array. This means that we can directly manipulate the array as follows: In: b.flatten() Out: array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]) Setting the shape with a tuple: Besides the reshape() function, we can also define the shape straightaway with a tuple, which is exhibited as follows: In: b.shape = (6,4) In: b Out: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]) As you can understand, the preceding code alters the array immediately. Now, we have a 6 x 4 array. Transpose: In linear algebra, it is common to transpose matrices. Transposing is a way to transform data. For a two-dimensional table, transposing means that rows become columns and columns become rows. 
We can do this too by using the following code: In: b.transpose() Out: array([[ 0, 4, 8, 12, 16, 20], [ 1, 5, 9, 13, 17, 21], [ 2, 6, 10, 14, 18, 22], [ 3, 7, 11, 15, 19, 23]]) Resize: The resize() method works just like the reshape() method, In: b.resize((2,12)) In: b Out: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]]) Stacking arrays Arrays can be stacked horizontally, depth wise, or vertically. We can use, for this goal, the vstack(), dstack(), hstack(), column_stack(), row_stack(), and concatenate() functions. To start with, let's set up some arrays: In: a = np.arange(9).reshape(3,3) In: a Out: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In: b = 2 * a In: b Out: array([[ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) As mentioned previously, we can stack arrays using the following techniques: Horizontal stacking: Beginning with horizontal stacking, we will shape a tuple of ndarrays and hand it to the hstack() function to stack the arrays. This is shown as follows: In: np.hstack((a, b)) Out: array([[ 0, 1, 2, 0, 2, 4], [ 3, 4, 5, 6, 8, 10], [ 6, 7, 8, 12, 14, 16]]) We can attain the same thing with the concatenate() function, which is shown as follows: In: np.concatenate((a, b), axis=1) Out: array([[ 0, 1, 2, 0, 2, 4], [ 3, 4, 5, 6, 8, 10], [ 6, 7, 8, 12, 14, 16]]) The following diagram depicts horizontal stacking: Vertical stacking: With vertical stacking, a tuple is formed again. This time it is given to the vstack() function to stack the arrays. This can be seen as follows: In: np.vstack((a, b)) Out: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) The concatenate() function gives the same outcome with the axis parameter fixed to 0. This is the default value for the axis parameter, as portrayed in the following code: In: np.concatenate((a, b), axis=0) Out: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) Refer to the following figure for vertical stacking: Depth stacking: To boot, there is the depth-wise stacking employing dstack() and a tuple, of course. This entails stacking a list of arrays along the third axis (depth). For example, we could stack 2D arrays of image data on top of each other as follows: In: np.dstack((a, b)) Out: array([[[ 0, 0], [ 1, 2], [ 2, 4]], [[ 3, 6], [ 4, 8], [ 5, 10]], [[ 6, 12], [ 7, 14], [ 8, 16]]]) Column stacking: The column_stack() function stacks 1D arrays column-wise. This is shown as follows: In: oned = np.arange(2) In: oned Out: array([0, 1]) In: twice_oned = 2 * oned In: twice_oned Out: array([0, 2]) In: np.column_stack((oned, twice_oned)) Out: array([[0, 0], [1, 2]]) 2D arrays are stacked the way the hstack() function stacks them, as demonstrated in the following lines of code: In: np.column_stack((a, b)) Out: array([[ 0, 1, 2, 0, 2, 4], [ 3, 4, 5, 6, 8, 10], [ 6, 7, 8, 12, 14, 16]]) In: np.column_stack((a, b)) == np.hstack((a, b)) Out: array([[ True, True, True, True, True, True], [ True, True, True, True, True, True], [ True, True, True, True, True, True]], dtype=bool) Yes, you guessed it right! We compared two arrays with the == operator. Row stacking: NumPy, naturally, also has a function that does row-wise stacking. 
It is named row_stack() and for 1D arrays, it just stacks the arrays in rows into a 2D array: In: np.row_stack((oned, twice_oned)) Out: array([[0, 1], [0, 2]]) The row_stack() function results for 2D arrays are equal to the vstack() function results: In: np.row_stack((a, b)) Out: array([[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8], [ 0, 2, 4], [ 6, 8, 10], [12, 14, 16]]) In: np.row_stack((a,b)) == np.vstack((a, b)) Out: array([[ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True], [ True, True, True]], dtype=bool) Splitting NumPy arrays Arrays can be split vertically, horizontally, or depth wise. The functions involved are hsplit(), vsplit(), dsplit(), and split(). We can split arrays either into arrays of the same shape or indicate the location after which the split should happen. Let's look at each of the functions in detail: Horizontal splitting: The following code splits a 3 x 3 array on its horizontal axis into three parts of the same size and shape (see splitting.py in this book's code bundle): In: a Out: array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In: np.hsplit(a, 3) Out: [array([[0], [3], [6]]), array([[1], [4], [7]]), array([[2], [5], [8]])] Liken it with a call of the split() function, with an additional argument, axis=1: In: np.split(a, 3, axis=1) Out: [array([[0], [3], [6]]), array([[1], [4], [7]]), array([[2], [5], [8]])] Vertical splitting: vsplit() splits along the vertical axis: In: np.vsplit(a, 3) Out: [array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])] The split() function, with axis=0, also splits along the vertical axis: In: np.split(a, 3, axis=0) Out: [array([[0, 1, 2]]), array([[3, 4, 5]]), array([[6, 7, 8]])] Depth-wise splitting: The dsplit() function, unsurprisingly, splits depth-wise. We will require an array of rank 3 to begin with: In: c = np.arange(27).reshape(3, 3, 3) In: c Out: array([[[ 0, 1, 2], [ 3, 4, 5], [ 6, 7, 8]], [[ 9, 10, 11], [12, 13, 14], [15, 16, 17]], [[18, 19, 20], [21, 22, 23], [24, 25, 26]]]) In: np.dsplit(c, 3) Out: [array([[[ 0], [ 3], [ 6]], [[ 9], [12], [15]], [[18], [21], [24]]]), array([[[ 1], [ 4], [ 7]], [[10], [13], [16]], [[19], [22], [25]]]), array([[[ 2], [ 5], [ 8]], [[11], [14], [17]], [[20], [23], [26]]])] NumPy array attributes Let's learn more about the NumPy array attributes with the help of an example. Let us create an array b that we shall use for practicing the further examples: In: b = np.arange(24).reshape(2, 12) In: b Out: array([[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11], [12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]]) Besides the shape and dtype attributes, ndarray has a number of other properties, as shown in the following list: ndim gives the number of dimensions, as shown in the following code snippet: In: b.ndim Out: 2 size holds the count of elements. This is shown as follows: In: b.size Out: 24 itemsize returns the count of bytes for each element in the array, as shown in the following code snippet: In: b.itemsize Out: 8 If you require the full count of bytes the array needs, you can have a look at nbytes. 
This is simply the product of the itemsize and size properties: In: b.nbytes Out: 192 In: b.size * b.itemsize Out: 192 The T property gives the same result as the transpose() function, which is shown as follows: In: b.resize(6,4) In: b Out: array([[ 0, 1, 2, 3], [ 4, 5, 6, 7], [ 8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19], [20, 21, 22, 23]]) In: b.T Out: array([[ 0, 4, 8, 12, 16, 20], [ 1, 5, 9, 13, 17, 21], [ 2, 6, 10, 14, 18, 22], [ 3, 7, 11, 15, 19, 23]]) If the array has a rank lower than 2, transposing it just gives us back a view of the same array (here, b is a one-dimensional array such as np.arange(5)): In: b.ndim Out: 1 In: b.T Out: array([0, 1, 2, 3, 4]) Complex numbers in NumPy are represented by j. For instance, we can produce an array with complex numbers as follows: In: b = np.array([1.j + 1, 2.j + 3]) In: b Out: array([ 1.+1.j, 3.+2.j]) The real property gives us the real part of the array, or the array itself if it only holds real numbers: In: b.real Out: array([ 1., 3.]) The imag property holds the imaginary part of the array: In: b.imag Out: array([ 1., 2.]) If the array holds complex numbers, then the data type will automatically be complex as well: In: b.dtype Out: dtype('complex128') In: b.dtype.str Out: '<c16' The flat property gives back a numpy.flatiter object. This is the only means to get a flatiter object; we do not have access to a flatiter constructor. The flat iterator enables us to loop through an array as if it were a flat array, as shown in the following code snippet: In: b = np.arange(4).reshape(2,2) In: b Out: array([[0, 1], [2, 3]]) In: f = b.flat In: f Out: <numpy.flatiter object at 0x103013e00> In: for item in f: print(item) Out: 0 1 2 3 It is possible to obtain an element directly with the flatiter object: In: b.flat[2] Out: 2 Also, you can obtain multiple elements as follows: In: b.flat[[1,3]] Out: array([1, 3]) The flat property can be set. Setting the value of the flat property leads to overwriting the values of the entire array: In: b.flat = 7 In: b Out: array([[7, 7], [7, 7]]) We can also overwrite only selected elements as follows: In: b.flat[[1,3]] = 1 In: b Out: array([[7, 1], [7, 1]]) Converting arrays We can convert a NumPy array to a Python list with the tolist() function. The following is a brief demonstration (here, b is again the complex array from earlier): Convert to a list: In: b Out: array([ 1.+1.j, 3.+2.j]) In: b.tolist() Out: [(1+1j), (3+2j)] The astype() function transforms the array to an array of the specified data type: In: b Out: array([ 1.+1.j, 3.+2.j]) In: b.astype(int) /usr/local/lib/python3.5/site-packages/ipykernel/__main__.py:1: ComplexWarning: Casting complex values to real discards the imaginary part … Out: array([1, 3]) In: b.astype('complex') Out: array([ 1.+1.j, 3.+2.j]) We drop the imaginary part when casting from the complex type to int. The astype() function also takes the name of a data type as a string. The preceding code won't display a warning this time because we used the right data type. Summary In this article, we covered a great deal of the NumPy basics: data types and arrays. Arrays have various properties that describe them. You learned that one of these properties is the data type, which, in NumPy, is represented by a full-fledged object. NumPy arrays can be sliced and indexed efficiently, compared with standard Python lists. NumPy arrays have the extra ability to work with multiple dimensions. The shape of an array can be modified in multiple ways, such as stacking, resizing, reshaping, and splitting. 
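To consolidate the shape-manipulation operations recapped in this summary, here is a minimal, illustrative sketch; the array and variable names are arbitrary, and NumPy is assumed to be imported as np:

import numpy as np

a = np.arange(24).reshape(6, 4)        # build a 6 x 4 array
flat_copy = a.flatten()                # always a new copy
flat_view = a.ravel()                  # may be a view of the same data
stacked = np.hstack((a, 2 * a))        # shape (6, 8)
left, right = np.hsplit(stacked, 2)    # back to two (6, 4) arrays
print(a.ndim, a.size, a.itemsize, a.nbytes, a.T.shape)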
Resources for Article: Further resources on this subject: Big Data Analytics [article] Python Data Science Up and Running [article] R and its Diverse Possibilities [article]

Installing Arch Linux using the official ISO

Packt
19 Feb 2013
7 min read
(For more resources related to this topic, see here.) Getting ready You can get the official ISO image file from https://www.archlinux.org/download/. On this page you will find a download link to the latest release. Depending on your preference, download the torrent file or the ISO image file immediately. The following list describes the main tasks that we will perform in this recipe: Preparing, booting, and setting keyboard layout: We are going to get the ISO file from the download page of the Arch Linux website and store it on the preferred media of our choice. At the time of writing this article, there is a dual ISO image file that contains both i686 and x86-64 architectures on one disk. Start your PC with your preferred installation media (CD or USB stick). On most PC systems, you can access the boot menu by pressing one of the function keys, usually between F8 and F12 depending on the motherboard manufacturer. On older machines where you do not yet have a boot menu, you might need to change the boot order in the BIOS where the CD-ROM (or DVD/Blu-ray) has to be chosen as the first device to try booting from. We'll also explain how to use a different keyboard layout than the default one in this recipe. Creating, formatting, and mounting partitions: You can partition the disks the way you want using cfdisk (for MBR disk partitioning) or cgdisk (for GUID disk partitioning). After creating the partitions, we can choose to format our created partitions with specific filesystems. When all partitions are formatted, we need to mount the partitions. First we will mount the root partition to /mnt. The other partitions will be mounted later on after you have created the specific folders. We'll designate our device with /dev/sdX; in your case this can be /dev/sda, and so on. Connecting to the Internet: To be able to continue installing the ISO you need to connect to the Internet, because there are no packages available for installation on the ISO. For a wireless network you will need to use netcfg. When connected to a wired network, just use dhcpcd or dhclient. Installing the base system and boot loader: These days the base system gets installed by running a simple script pacstrap. Pacstrap takes multiple parameters, the target location, and the packages or groups you want to install. For people who want to develop on their machines, the best base install is adding base-devel to the default installation. For normal end users, just base will be sufficient to start. Configuring the system: In this recipe, we'll describe the flow of what to do during the configuration. How to do it... The following steps will guide you in preparing, booting, and setting keyboard layout: Once you have downloaded the ISO image file, you should also verify its integrity by downloading the sha1sums.txt file from the download page. These days you can also check if the ISO is completely valid by verifying the signature of the ISO. Verify the integrity by issuing the sha1sum -c sha1sums.txt command and you'll see whether your download was successful or not. Also check if the signature of the ISO is correct by running gpg -v archlinux-...iso.sig: sha1sum -c sha1sums.txt gpg -v archlinux-2012-08-04-dual.iso.sig The following screenshot shows the execution of this step: As you can see in the previous screenshot, the ISO's checksum is ok and the signature is valid. Now that we are sure our ISO is ok, we can burn this to a CD with our favorite burning program. 
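If you would rather use a USB stick than a CD, one common approach is to write the image with dd. This is only a sketch: the device name /dev/sdY is an assumption and must be replaced with your actual USB device, and the command will overwrite everything on that device:

# run as root; double-check the target device before executing
dd bs=4M if=archlinux-2012-08-04-dual.iso of=/dev/sdY && sync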
Insert the CD into the drive, or insert the USB stick into the USB port of your PC. Enter the boot menu, or let your computer automatically boot from the inserted installation media. If the previous steps are performed correctly, you will see the following screenshot: Select the architecture you want and press Enter, and we'll be on our way. Search the keyboard layout desired for your region. The available keyboard layouts can be found at /usr/share/kbd/keymaps/. Set the desired keyboard layout with loadkeys keyboardlayout. Now let's perform the following steps to create, format, and mount partitions: Start cfdisk or cgdisk, having the first parameter as the device you want to partition: cfdisk /dev/sdX cgdisk /dev/sdX Create your partition scheme. Store the partition scheme. Use the mkfs command to create a filesystem on a specific partition: mkfs -t vfat /dev/sdX mkfs.ext4 -L root /dev/sdX Mount your root partition to /mnt: mount /dev/sdX3 /mnt Make directories under mount for your other partitions: mkdir -p /mnt/boot Mount the other partitions: mount /dev/sdX1 /mnt/boot The following steps are needed to connect to the Internet: When we need a wireless network, create a netcfg profile and run netcfg mywireless. Use dhclient or dhcpcd to get an IP address. The following steps should be performed for installing the base system and boot loader: Run pacstrap with the desired parameters: pacstrap /mnt base base-devel Install the desired boot loader: the best choice at this moment is Syslinux. The final installation of the boot loader will be done in a chroot during the initial configuration. We'll now list the steps to do during the configuration: Generate fstab with genfstab: genfstab -p /mnt >> /mnt/etc/fstab Change the root into the system location: arch-chroot /mnt Set your hostname in /etc/hostname. Create /etc/localtime symlink. Set your locale in /etc/locale.conf. Uncomment the configured locale in /etc/locale.gen. Run locale-gen. Configure /etc/mkinitcpio.conf. Generate your initial ramdisk: mkinitcpio -p linux Finish installation of your boot loader. Set the root password with passwd. Leave the chroot environment (exit). How it works... We downloaded the ISO image file via torrent, or via HTTP from the mirror sites listed on the download page. The sha1sum command lets us verify the integrity of the downloaded ISO. On top of the checksum, we can also check the integrity by verifying the signature available for the ISO. So now, we can rest assured that the downloaded file is the real one. The ISO contains a fully working operating system. It also contains all the necessary tools to perform system recovery and installation. The keyboard configuration set with loadkeys will make sure that the key you press on your keyboard will be translated to the correct letter on your screen. Using a different keyboard layout from the one on your physical keyboard might be confusing. We then created a partition scheme on the selected disk with the appropriate tool (cfdisk or cgdisk). Make Filesystem (mkfs) is a unified frontend to create a filesystem. Using it we created our filesystem layout manually under/mnt by creating our default partition layout in our root, and mounting the specific partitions accordingly. You can make a connection with your wireless network (if needed), and then use dhcpcd or dhclient to obtain an IP address that enables you to access the Internet. Pacstrap will run pacman with a modified root location to install the desired packages into the newly created system. 
For example, installing Syslinux: pacstrap /mnt syslinux The specific configuration files will ensure we don't have to do all those steps over and over again on every boot. Summary This article explained the procedure to get Arch Linux installed on your system using the official installation media. Resources for Article : Further resources on this subject: Compression Formats in Linux Shell Script [Article] Making a Complete yet Small Linux Distribution [Article] Linux Shell Script: Tips and Tricks [Article]

AI_Distilled #32: Navigating Industry Updates and Innovations

Merlyn Shelley
12 Jan 2024
13 min read
Dive deeper into the world of AI innovation and stay ahead of the AI curve! Subscribe to our AI_Distilled newsletter for the latest insights. Don't miss out – sign up today!👋 Hello ,“There is not going to be one model to rule them all. You need to be trying out different models, you need a real choice of model providers.”  -Adam Selipsky, CEO, AWS. There’s no one-size-fits-all approach in AI development. When you embrace diversity in AI, that’s when it truly shines. There’s also a different side to the coin — the infinitely scalable adaptability of AI to revolutionize field after field, such as when it can help discover promising new sustainable battery materials to potentially reduce reliance on Lithium. Welcome back to a new issue of AI Distilled - your one-stop destination for all things AI, ML, NLP, and Gen AI. Let’s get started with the latest news and developments across different industries and sectors: AI Launches & Industry Updates: Explore the GPT MarketplaceNVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024 Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos OpenAI Set to Launch GPT Store for AI Models and Apps Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S. Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models AI in Healthcare: Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant AI in Business: Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move Walmart Revolutionizes Shopping with Generative AI Innovations AI in Science & Technology: Microsoft and PNNL Harness AI to Discover Promising Battery Material German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience AI in Finance: Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks AI in Supply Chain Management: Warehousing Industry Leverages Machine Learning to Tackle Disruptions We’ve also curated the latest GPT and LLM resources, tutorials, and secret knowledge: Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024 Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage A Comprehensive Guide to Merging LLMs AI Drift in Retrieval Augmented Generation Finally, don’t forget to check-out our hands-on tips and strategies from the AI community for you to use on your own projects: Creating Your Own AI Image Generator App with Generative AI Optimizing Code Output with CodeWhisperer Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta How to Craft an Open Source Multi-Modal RAG System Looking for some inspiration? Here are some GitHub repositories to get your projects going!gxnu-zhonglab/odtrack DLYuanGod/TinyGPT-V intel/intel-extension-for-transformers CambioML/pykoi  📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition." We appreciate your input and hope you enjoy the book!  Share your thoughts and opinions here! Writer’s Credit: Special shout-out to Vidhu Jain for their valuable contribution to this week’s newsletter content!  Cheers,  Merlyn Shelley  Editor-in-Chief, Packt  SignUp | Advertise | Archives⚡ TechWave: AI/GPT News & AnalysisAI Launches & Industry Updates: ⭐ Explore the GPT Marketplace: Just two months in, 3 million custom ChatGPTs are already out there! 
The GPT Store is now open to ChatGPT Plus, Team, and Enterprise users, offering a variety of handy GPTs. Get in on the action at chat.openai.com/gpts! ⭐ NVIDIA Unveils Innovations in Gaming, AI, and Robotics at CES 2024: NVIDIA unveiled impressive CES 2024 innovations: GeForce RTX 40 SUPER GPUs, AI laptops, generative AI tools. They highlighted RTX GPUs' influence on generative AI, introduced TensorRT acceleration for Stable Diffusion XL and SDXL Turbo, and NVIDIA Avatar Cloud Engine (ACE) Microservices for digital avatars. Getty Images and Nvidia introduced Generative AI by iStock, a text-to-image platform for customized stock photos. ⭐ Perplexity AI Secures $73.6M Funding Led by NVIDIA and Jeff Bezos: San Francisco's Perplexity AI secures $73.6 million in funding led by IVP, with Nvidia and Jeff Bezos participating, valuing the company at $520 million. Despite serving 500 million queries in 2023, profitability remains elusive, as it competes with Google in the search market. The funds will be used for hiring and product development. ⭐ OpenAI Set to Launch GPT Store for AI Models and Apps: OpenAI is set to launch the GPT Store, where developers can present custom GPT model applications, following updated policies. The launch, previously delayed, offers diverse, code-free applications. Revenue-sharing details await clarification. ⭐ Google Faces Multibillion-Dollar Patent Trial Over AI Technology in U.S.: Google is facing a federal jury trial in Boston as Singular Computing alleges patent infringement in its AI processors. Singular seeks up to $7 billion in damages, while Google argues independent development. The trial may last two to three weeks. ⭐ Google's DeepMind Unveils Advances in Robotic Training with Video and Language Models: DeepMind Robotics unveils AutoRT, a system enhancing robot understanding of human intentions using Visual Language Models. It orchestrates 20 robots, suggesting tasks via LLMs and introduces RT-Trajectory with 63% success in 41 tasks using video input. AI in Healthcare: ⭐ Isomorphic Labs Secures $3 Billion AI-Driven Drug Discovery Deals with Eli Lilly and Novartis: London-based Isomorphic, a DeepMind spin-out, forms strategic alliances with Eli Lilly and Novartis, valued at $3 billion. Utilizing AlphaFold 2 AI technology, Isomorphic focuses on accurate protein predictions for innovative drug discovery. ⭐ Nabla Secures $24 Million in Series B Funding for AI-Powered Medical Assistant: Paris startup Nabla secures $24 million in a Series B funding round led by Cathay Innovation and ZEBOX Ventures. Nabla develops an AI copilot for doctors, streamlining administrative tasks while collaborating with physicians. AI in Business: ⭐ Deloitte Introduces PairD AI Chatbot for 75,000 Staff in Big Four's Latest Automation Move: Deloitte is using a chatbot called PairD to help 75,000 employees in Europe and the Middle East with everyday tasks. While it's convenient, there are concerns about its accuracy, so employees still check its work. Deloitte is also sharing PairD with 800 workers at the charity Scope as part of its AI strategy. ⭐ Walmart Revolutionizes Shopping with Generative AI Innovations: Walmart introduces generative AI-powered features on iOS, Android, and its website to improve the digital shopping experience. These features provide personalized responses and recommendations, shifting from scrolling to goal-oriented searching for a smoother shopping journey. 
AI in Science & Technology: ⭐ Microsoft and PNNL Harness AI to Discover Promising Battery Material: Microsoft and PNNL used AI and cloud computing to speed up battery innovation, identifying a safer, efficient solid-state electrolyte with less lithium. Azure Quantum Elements platform screened 32 million candidates in 80 hours, highlighting a material with potential for a 70% reduction in sodium use, advancing sustainable energy solutions. ⭐ German Automakers Pioneer AI Integration in Cars, Elevating Driving Experience: Leading German automakers like Volkswagen and Mercedes-Benz are revolutionizing the automotive industry with advanced AI integration. Volkswagen unveiled ChatGPT technology, enhancing the driving experience with AI-powered chatbots and IDA voice assistants, while Mercedes-Benz introduced a sophisticated virtual assistant for context-based suggestions, marking a significant leap in interactive AI utilization at CES 2024. AI in Finance: ⭐ Rising Concerns as Generative AI Use Grows in Finance, Amplifying Misinformation Risks: The finance sector's growing use of generative AI is transforming services but raises concerns of misinformation. A study by PYMNTS Intelligence and AI-ID shows 80% of consumers worry about generative AI's misinformation risk. Regulatory guidelines, model explainability tools, and industry cooperation are essential for responsible AI adoption in finance. AI in Supply Chain Management: ⭐ Warehousing Industry Leverages Machine Learning to Tackle Disruptions: Zebra Technologies Corporation's research highlights the warehousing industry's adoption of AI, particularly machine learning (ML), amid challenges like inflation and labor shortages. The report predicts ML, predictive analytics, and mobile dimensioning will dominate by 2028, aiding historical analysis, demand prediction, and automation. Decision-makers aim to boost resilience with 94% planning ML integration within five years.  🔮 Expert Insights from Packt Community The Handbook of NLP with Gensim - By Chris Kuo Gensim and its NLP modeling techniques Gensim is actively maintained and supported by a community of developers and is widely used in academic research and industry applications. It covers many important NLP techniques that make up the workforce of today’s NLP. Last year, I was at a company’s year-end party. The ballroom was filled with people standing in groups with their drinks. I walked around and listened for conversation topics where I could chime in. I heard one group talking about the FIFA World Cup 2022 and another group talking about stock markets. I joined the stock markets conversation. In that short moment, my mind had performed “word extractions,” “text summarization,” and “topic classifications.” These tasks are the core tasks of NLP and what Gensim is designed to do. We perform serious text analyses in professional fields including legal, medical, and business. We organize similar documents into topics. Such work also demands “word extractions,” “text summarization,” and “topic classifications.” In the following sections, I will give you a brief introduction to the key models that Gensim offers so you will have a good overview. These models include the following: BoW and TF-IDF Latent semantic analysis/indexing (LSA/LSI) Word2Vec Doc2Vec Text summarization LDA Ensemble LDA  BoW and TF-IDF Texts can be represented as a bag of words, which is the count frequency of a word. BoW uses the word count to reflect the significance of a word. However, this is not very intuitive. 
Frequent words may not carry special meanings depending on the type of document. LSA/LSI Latent semantic analysis (LSA) was developed in the 1990s. It's an NLP solution that far surpasses naïve keyword matching and has become an important search engine algorithm. Prior to that, in 1988, an LSA-based information retrieval system was patented (US Patent #4839853, now expired) and named “latent semantic indexing,” so the technique is also called latent semantic indexing (LSI). Gensim and many other reports name LSA as LSI so as not to confuse LSA with LDA. This is an excerpt from the book The Handbook of NLP with Gensim - By Chris Kuo and published in OCT ‘23. To see what's inside the book, read the entire chapter here or try a 7-day free trial to access the full Packt digital library. To discover more, click the button below.      Read through the Chapter 1 unlocked here...  🌟 Secret Knowledge: AI/LLM Resources⭐ Explore the Future of AI: A Guide to the Top 9 AI APIs of 2024: In this guide, you'll learn how to navigate the dynamic realm of AI APIs, uncovering the capabilities of the top 9 for 2024. Discover Google Cloud Vision AI, an unparalleled eye for accurate image analysis, IBM Watson Assistant, a conversational genius transforming virtual assistance, Amazon Lex, empowering apps with voice commands effortlessly, Azure Cognitive Services, the Swiss Army knife of AI, offering diverse tools, DeepAI, simplifying deep learning for innovation, and decode texts with MonkeyLearn, a text analysis guru, among others. Read the post to explore how these APIs can shape your tech ventures and redefine the future of AI. ⭐ Optimizing LLM Inference with Splitwise: Achieving Efficiency in GPU Usage: Discover how Splitwise, a technique from Azure Research - Systems, boosts LLM inference efficiency. It separates prompt computation and token-generation phases, optimizing hardware use. This method enhances GPU cluster design, achieving higher throughput, lower costs, and reduced power for efficient LLM deployment. ⭐ A Comprehensive Guide to Merging LLMs: This comprehensive guide explores merging LLMs using the mergekit library without requiring a GPU. It covers four merging techniques: SLERP, TIES, DARE, and passthrough, with configuration examples. The result is Marcoro14–7B-slerp, a high-performing model featured on the Open LLM Leaderboard. ⭐ AI Drift in Retrieval Augmented Generation (RAG): This guide delves into AI drift within RAG pipelines, drawing from a real case where a customer faced declining AI responses. It covers the causes (content drift, LLM drift, pipeline algorithm changes) and strategies (content management, API upgrades, internal metrics) to control AI drift.  🔛 Masterclass: AI/LLM Tutorials⭐ Creating Your Own AI Image Generator App with Generative AI: Discover how to build a powerful Generative AI Text-to-Image application in this detailed guide. The author shares their journey of seamlessly integrating AI-generated images into a React app, using third-party APIs like SegMind. With a step-by-step walkthrough, you'll explore the code behind the app on GitHub and learn how to choose the right API, integrate it into React, and unleash AI capabilities in web development. Read on to bring dynamic, AI-generated content to your React projects and stay at the forefront of web development innovation. ⭐ Optimizing Code Output with CodeWhisperer: Unlock the full potential of Amazon CodeWhisperer with this in-depth guide on prompt engineering. 
Learn how CodeWhisperer accelerates software development by offering code recommendations based on natural language comments. The post provides step-by-step insights on effective prompt engineering in Python, emphasizing best practices such as crafting specific and concise prompts, incorporating additional context, utilizing multiple comments strategically, and understanding CodeWhisperer's capacity for cross-file context. ⭐ Mastering Knowledge Graph Construction with KeyBERT, HDBSCAN, and Zephyr-7B-Beta: Discover how to leverage LLMs with traditional NLP and ML methods to create knowledge graphs from unstructured text. The author showcases the synergy of KeyBERT, HDBSCAN, and Zephyr-7B-Beta for improved keyword extraction, clustering, and refinement. The guide covers dataset prep, keyword extraction, and LLM integration. ⭐ How to Craft an Open Source Multi-Modal RAG System: Discover building a Retrieval-Augmented Generation (RAG) system with an Open Source Large Language Multi-Modal (LLMM). Learn the integration of ChromeDB and Hugging Face, covering Clip, data storage, and MLLMs for user chat sessions in a detailed, dependency-free guide.  🚀 HackHub: Trending AI Tools⭐ gxnu-zhonglab/odtrack: Efficient video-level tracking pipeline utilizing online token propagation to densely capture contextual relationships and spatio-temporal trajectories across frames.  ⭐ DLYuanGod/TinyGPT-V: Features an efficient Multimodal Large Language Model using small backbones for efficiently incorporating multimodal capabilities into language models. ⭐ intel/intel-extension-for-transformers: Toolkit to accelerate GenAI/LLM performance on Intel platforms, including Gaudi2, CPU, and GPU, seamlessly compressing Transformer-based models, accessing optimized model packages, and using NeuralChat. ⭐ CambioML/pykoi: An open-source Python library for LLMs, enhancing them with RLHF, collecting user feedback, fine-tuning with reinforcement learning, comparing models, and creating RAG chatbots efficiently.  

Use TensorFlow and NLP to detect duplicate Quora questions [Tutorial]

Sunith Shetty
21 Jun 2018
32 min read
This tutorial shows how to build an NLP project with TensorFlow that explicates the semantic similarity between sentences using the Quora dataset. It is based on the work of Abhishek Thakur, who originally developed a solution on the Keras package. This article is an excerpt from a book written by Luca Massaron, Alberto Boschetti, Alexey Grigorev, Abhishek Thakur, and Rajalingappaa Shanmugamani titled TensorFlow Deep Learning Projects. Presenting the dataset The data, made available for non-commercial purposes (https://www.quora.com/about/tos) in a Kaggle competition (https://www.kaggle.com/c/quora-question-pairs) and on Quora's blog (https://data.quora.com/First-Quora-Dataset-Release-Question-Pairs), consists of 404,351 question pairs with 255,045 negative samples (non-duplicates) and 149,306 positive samples (duplicates). There are approximately 40% positive samples, a slight imbalance that won't need particular corrections. Actually, as reported on the Quora blog, given their original sampling strategy, the number of duplicated examples in the dataset was much higher than the non-duplicated ones. In order to set up a more balanced dataset, the negative examples were upsampled by using pairs of related questions, that is, questions about the same topic that are actually not similar. Before starting work on this project, you can simply directly download the data, which is about 55 MB, from its Amazon S3 repository at this link into our working directory. After loading it, we can start diving directly into the data by picking some example rows and examining them. The following diagram shows an actual snapshot of the few first rows from the dataset: Exploring further into the data, we can find some examples of question pairs that mean the same thing, that is, duplicates, as follows:  How does Quora quickly mark questions as needing improvement? Why does Quora mark my questions as needing improvement/clarification before I have time to give it details? Literally within seconds… Why did Trump win the Presidency? How did Donald Trump win the 2016 Presidential Election? What practical applications might evolve from the discovery of the Higgs Boson? What are some practical benefits of the discovery of the Higgs Boson? At first sight, duplicated questions have quite a few words in common, but they could be very different in length. On the other hand, examples of non-duplicate questions are as follows: Who should I address my cover letter to if I'm applying to a big company like Mozilla? Which car is better from a safety persepctive? swift or grand i10. My first priority is safety? Mr. Robot (TV series): Is Mr. Robot a good representation of real-life hacking and hacking culture? Is the depiction of hacker societies realistic? What mistakes are made when depicting hacking in Mr. Robot compared to real-life cyber security breaches or just a regular use of technologies? How can I start an online shopping (e-commerce) website? Which web technology is best suited for building a big e-commerce website? Some questions from these examples are clearly not duplicated and have few words in common, but some others are more difficult to detect as unrelated. For instance, the second pair in the example might turn to be appealing to some and leave even a human judge uncertain. The two questions might mean different things: why versus how, or they could be intended as the same from a superficial examination. 
Looking deeper, we may even find more doubtful examples and even some clear data mistakes; we surely have some anomalies in the dataset (as the Quora post on the dataset warned) but, given that the data is derived from a real-world problem, we can't do anything but deal with this kind of imperfection and strive to find a robust solution that works. At this point, our exploration becomes more quantitative than qualitative and some statistics on the question pairs are provided here:

Average number of characters in question1: 59.57
Minimum number of characters in question1: 1
Maximum number of characters in question1: 623
Average number of characters in question2: 60.14
Minimum number of characters in question2: 1
Maximum number of characters in question2: 1169

Question 1 and question 2 have roughly the same average number of characters, though question 2 shows more extreme values. There must also be some noise in the data, since a question made up of a single character does not make sense. We can even get a completely different view of our data by plotting it into a word cloud and highlighting the most common words present in the dataset:

Figure 1: A word cloud made up of the most frequent words to be found in the Quora dataset

The presence of word sequences such as Hillary Clinton and Donald Trump reminds us that the data was gathered at a certain historical moment and that many questions we can find inside it are clearly ephemeral, reasonable only at the very time the dataset was collected. Other topics, such as programming language, World War, or earn money could be longer lasting, both in terms of interest and in the validity of the answers provided. After exploring the data a bit, it is now time to decide what target metric we will strive to optimize in our project. Throughout the article, we will be using accuracy as a metric to evaluate the performance of our models. Accuracy as a measure is simply focused on the effectiveness of the prediction, and it may miss some important differences between alternative models, such as discrimination power (is the model more able to detect duplicates or not?) or the exactness of probability scores (how much margin is there between being a duplicate and not being one?). We chose accuracy based on the fact that this metric was the one decided on by Quora's engineering team to create a benchmark for this dataset (as stated in this blog post of theirs: https://engineering.quora.com/Semantic-Question-Matching-with-Deep-Learning). Using accuracy as the metric makes it easier for us to evaluate and compare our models with the one from Quora's engineering team, and also with several other research papers. In addition, in a real-world application, our work may simply be evaluated on the basis of how many times it is just right or wrong, regardless of other considerations. We can now proceed further in our project, starting with some very basic feature engineering. Starting with basic feature engineering Before starting to code, we have to load the dataset in Python and also provide Python with all the necessary packages for our project. We will need to have these packages installed on our system (the latest versions should suffice, no need for any specific package version):

Numpy
pandas
fuzzywuzzy
python-Levenshtein
scikit-learn
gensim
pyemd
NLTK

As we will be using each one of these packages in the project, we will provide specific instructions and tips to install them. For all dataset operations, we will be using pandas (and Numpy will come in handy, too). 
To install numpy and pandas:

pip install numpy
pip install pandas

The dataset can be loaded into memory easily by using pandas and a specialized data structure, the pandas dataframe (we expect the dataset to be in the same directory as your script or Jupyter notebook):

import pandas as pd
import numpy as np

data = pd.read_csv('quora_duplicate_questions.tsv', sep='\t')
data = data.drop(['id', 'qid1', 'qid2'], axis=1)

We will be using the pandas dataframe denoted by data throughout, including when we work with our TensorFlow model and provide input to it. We can now start by creating some very basic features. These basic features include length-based features and string-based features:

Length of question1
Length of question2
Difference between the two lengths
Character length of question1 without spaces
Character length of question2 without spaces
Number of words in question1
Number of words in question2
Number of common words in question1 and question2

These features are computed with one-liners that transform the original input using the pandas package in Python and its apply method:

# length based features
data['len_q1'] = data.question1.apply(lambda x: len(str(x)))
data['len_q2'] = data.question2.apply(lambda x: len(str(x)))
# difference in lengths of two questions
data['diff_len'] = data.len_q1 - data.len_q2
# character length based features
data['len_char_q1'] = data.question1.apply(lambda x: len(''.join(set(str(x).replace(' ', '')))))
data['len_char_q2'] = data.question2.apply(lambda x: len(''.join(set(str(x).replace(' ', '')))))
# word length based features
data['len_word_q1'] = data.question1.apply(lambda x: len(str(x).split()))
data['len_word_q2'] = data.question2.apply(lambda x: len(str(x).split()))
# common words in the two questions
data['common_words'] = data.apply(lambda x: len(set(str(x['question1']).lower().split()).intersection(set(str(x['question2']).lower().split()))), axis=1)

For future reference, we will mark this set of features as feature set-1 or fs_1:

fs_1 = ['len_q1', 'len_q2', 'diff_len', 'len_char_q1', 'len_char_q2', 'len_word_q1', 'len_word_q2', 'common_words']

This simple approach will help you to easily recall and combine different sets of features in the machine learning models we are going to build, making it straightforward to compare models trained on different feature sets. Creating fuzzy features The next set of features is based on fuzzy string matching. Fuzzy string matching is also known as approximate string matching and is the process of finding strings that approximately match a given pattern. The closeness of a match is defined by the number of primitive operations necessary to convert the string into an exact match. These primitive operations include insertion (to insert a character at a given position), deletion (to delete a particular character), and substitution (to replace a character with a new one). Fuzzy string matching is typically used for spell checking, plagiarism detection, DNA sequence matching, spam filtering, and so on, and it is part of the larger family of edit distances, which are based on the idea that a string can be transformed into another one. It is frequently used in natural language processing and other applications in order to ascertain the degree of difference between two strings of characters. This edit distance is also known as Levenshtein distance, after the Russian scientist Vladimir Levenshtein, who introduced it in 1965. 
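To make the idea of edit distance concrete, here is a small, unoptimized sketch of the Levenshtein distance, written for illustration only; in the project itself we rely on the fast implementations provided by the fuzzywuzzy and python-Levenshtein packages described next:

def levenshtein(s1, s2):
    # dp[i][j] holds the number of edits needed to turn s1[:i] into s2[:j]
    dp = [[0] * (len(s2) + 1) for _ in range(len(s1) + 1)]
    for i in range(len(s1) + 1):
        dp[i][0] = i                                  # i deletions
    for j in range(len(s2) + 1):
        dp[0][j] = j                                  # j insertions
    for i in range(1, len(s1) + 1):
        for j in range(1, len(s2) + 1):
            cost = 0 if s1[i - 1] == s2[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[len(s1)][len(s2)]

levenshtein("kitten", "sitting")   # returns 3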
These features were created using the fuzzywuzzy package available for Python (https://pypi.python.org/pypi/fuzzywuzzy). This package uses Levenshtein distance to calculate the differences in two sequences, which in our case are the pair of questions. The fuzzywuzzy package can be installed using pip3: pip install fuzzywuzzy As an important dependency, fuzzywuzzy requires the Python-Levenshtein package (https://github.com/ztane/python-Levenshtein/), which is a blazingly fast implementation of this classic algorithm, powered by compiled C code. To make the calculations much faster using fuzzywuzzy, we also need to install the Python-Levenshtein package: pip install python-Levenshtein The fuzzywuzzy package offers many different types of ratio, but we will be using only the following: QRatio WRatio Partial ratio Partial token set ratio Partial token sort ratio Token set ratio Token sort ratio Examples of fuzzywuzzy features on Quora data: from fuzzywuzzy import fuzz fuzz.QRatio("Why did Trump win the Presidency?", "How did Donald Trump win the 2016 Presidential Election") This code snippet will result in the value of 67 being returned: fuzz.QRatio("How can I start an online shopping (e-commerce) website?", "Which web technology is best suitable for building a big E-Commerce website?") In this comparison, the returned value will be 60. Given these examples, we notice that although the values of QRatio are close to each other, the value for the similar question pair from the dataset is higher than the pair with no similarity. Let's take a look at another feature from fuzzywuzzy for these same pairs of questions: fuzz.partial_ratio("Why did Trump win the Presidency?", "How did Donald Trump win the 2016 Presidential Election") In this case, the returned value is 73: fuzz.partial_ratio("How can I start an online shopping (e-commerce) website?", "Which web technology is best suitable for building a big E-Commerce website?") Now the returned value is 57. Using the partial_ratio method, we can observe how the difference in scores for these two pairs of questions increases notably, allowing an easier discrimination between being a duplicate pair or not. We assume that these features might add value to our models. By using pandas and the fuzzywuzzy package in Python, we can again apply these features as simple one-liners: data['fuzz_qratio'] = data.apply(lambda x: fuzz.QRatio( str(x['question1']), str(x['question2'])), axis=1) data['fuzz_WRatio'] = data.apply(lambda x: fuzz.WRatio( str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_ratio'] = data.apply(lambda x: fuzz.partial_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_token_set_ratio'] = data.apply(lambda x: fuzz.partial_token_set_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_partial_token_sort_ratio'] = data.apply(lambda x: fuzz.partial_token_sort_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_token_set_ratio'] = data.apply(lambda x: fuzz.token_set_ratio(str(x['question1']), str(x['question2'])), axis=1) data['fuzz_token_sort_ratio'] = data.apply(lambda x: fuzz.token_sort_ratio(str(x['question1']), str(x['question2'])), axis=1) This set of features are henceforth denoted as feature set-2 or fs_2: fs_2 = ['fuzz_qratio', 'fuzz_WRatio', 'fuzz_partial_ratio', 'fuzz_partial_token_set_ratio', 'fuzz_partial_token_sort_ratio', 'fuzz_token_set_ratio', 'fuzz_token_sort_ratio'] Again, we will store our work and save it for later use when modeling. 
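One simple way to store the work done so far (the file name below is arbitrary) is to serialize the enriched dataframe, so that the fs_1 and fs_2 columns can be reloaded later without recomputation:

data.to_pickle('quora_features.pkl')
# later: data = pd.read_pickle('quora_features.pkl')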
Resorting to TF-IDF and SVD features The next few sets of features are based on TF-IDF and SVD. Term Frequency-Inverse Document Frequency (TF-IDF) is one of the algorithms at the foundation of information retrieval. The algorithm can be summarized with the following notation: C(t) is the number of times a term t appears in a document and N is the total number of terms in the document; these give the Term Frequency (TF). ND is the total number of documents and NDt is the number of documents containing the term t; these provide the Inverse Document Frequency (IDF). TF-IDF for a term t is the product of Term Frequency and Inverse Document Frequency for that term:

TF(t) = C(t) / N
IDF(t) = log(ND / NDt)
TF-IDF(t) = TF(t) * IDF(t)

Without any prior knowledge, other than about the documents themselves, such a score will highlight all the terms that could easily discriminate a document from the others, down-weighting the common words that won't tell you much, such as the common parts of speech (articles, for instance). If you need a more hands-on explanation of TFIDF, this great online tutorial will help you try coding the algorithm yourself and testing it on some text data: https://stevenloria.com/tf-idf/ For convenience and speed of execution, we resorted to the scikit-learn implementation of TFIDF. If you don't already have scikit-learn installed, you can install it using pip:

pip install -U scikit-learn

We create TFIDF features for both question1 and question2 separately (in order to type less, we just deep copy the question1 TfidfVectorizer):

from sklearn.feature_extraction.text import TfidfVectorizer
from copy import deepcopy

tfv_q1 = TfidfVectorizer(min_df=3, max_features=None, strip_accents='unicode', analyzer='word', token_pattern=r'\w{1,}', ngram_range=(1, 2), use_idf=1, smooth_idf=1, sublinear_tf=1, stop_words='english')
tfv_q2 = deepcopy(tfv_q1)

It must be noted that the parameters shown here have been selected after quite a lot of experiments. These parameters generally work pretty well with other problems concerning natural language processing, specifically text classification. One might need to change the stop word list to the language in question. We can now obtain the TFIDF matrices for question1 and question2 separately:

q1_tfidf = tfv_q1.fit_transform(data.question1.fillna(""))
q2_tfidf = tfv_q2.fit_transform(data.question2.fillna(""))

In our TFIDF processing, we computed the TFIDF matrices based on all the data available (we used the fit_transform method). This is quite a common approach in Kaggle competitions because it helps to score higher on the leaderboard. However, if you are working in a real setting, you may want to exclude a part of the data as a training or validation set in order to be sure that your TFIDF processing helps your model to generalize to a new, unseen dataset. After we have the TFIDF features, we move to SVD features. SVD is a feature decomposition method and it stands for singular value decomposition. It is largely used in NLP because of a technique called Latent Semantic Analysis (LSA). A detailed discussion of SVD and LSA is beyond the scope of this article, but you can get an idea of their workings by trying these two approachable and clear online tutorials: https://alyssaq.github.io/2015/singular-value-decomposition-visualisation/ and https://technowiki.wordpress.com/2011/08/27/latent-semantic-analysis-lsa-tutorial/ To create the SVD features, we again use the scikit-learn implementation. 
This implementation is a variation of traditional SVD and is known as TruncatedSVD. A TruncatedSVD is an approximate SVD method that can provide you with a reliable yet computationally fast SVD matrix decomposition. You can find more hints about how this technique works and how it can be applied by consulting this web page: http://langvillea.people.cofc.edu/DISSECTION-LAB/Emmie'sLSI-SVDModule/p5module.html

from sklearn.decomposition import TruncatedSVD

svd_q1 = TruncatedSVD(n_components=180)
svd_q2 = TruncatedSVD(n_components=180)

We chose 180 components for the SVD decomposition, and these features are calculated on the TF-IDF matrices:

question1_vectors = svd_q1.fit_transform(q1_tfidf)
question2_vectors = svd_q2.fit_transform(q2_tfidf)

Feature set-3 is derived from a combination of these TF-IDF and SVD features. For example, we can have only the TF-IDF features for the two questions separately going into the model, or we can have the TF-IDF of the two questions combined with an SVD on top of them, and then the model kicks in, and so on. These features are explained as follows. Feature set-3(1), or fs3_1, is created using two different TF-IDFs for the two questions, which are then stacked together horizontally and passed to a machine learning model. This can be coded as:

from scipy import sparse
# obtain features by stacking the sparse matrices together
fs3_1 = sparse.hstack((q1_tfidf, q2_tfidf))

Feature set-3(2), or fs3_2, is created by combining the two questions and using a single TF-IDF:

tfv = TfidfVectorizer(min_df=3, max_features=None, strip_accents='unicode', analyzer='word', token_pattern=r'\w{1,}', ngram_range=(1, 2), use_idf=1, smooth_idf=1, sublinear_tf=1, stop_words='english')
# combine questions and calculate tf-idf
q1q2 = data.question1.fillna("")
q1q2 += " " + data.question2.fillna("")
fs3_2 = tfv.fit_transform(q1q2)

The next subset of features in this feature set, feature set-3(3) or fs3_3, consists of separate TF-IDFs and SVDs for both questions. This can be coded as follows:

# obtain features by stacking the matrices together
fs3_3 = np.hstack((question1_vectors, question2_vectors))

We can similarly create a couple more combinations using TF-IDF and SVD, and call them fs3-4 and fs3-5, respectively (feature set-3(4) and feature set-3(5)); the code for these is left as an exercise for the reader. After the basic feature set and some TF-IDF and SVD features, we can now move to more complicated features before diving into the machine learning and deep learning models. Mapping with Word2vec embeddings Very broadly, Word2vec models are two-layer neural networks that take a text corpus as input and output a vector for every word in that corpus. After fitting, the words with similar meaning have their vectors close to each other, that is, the distance between them is small compared to the distance between the vectors for words that have very different meanings. Nowadays, Word2vec has become a standard in natural language processing problems and often it provides very useful insights into information retrieval tasks. For this particular problem, we will be using the Google news vectors. This is a pretrained Word2vec model trained on the Google News corpus. 
Every word, when represented by its Word2vec vector, gets a position in space, as depicted in the following diagram: All the words in this example, such as Germany, Berlin, France, and Paris, can be represented by a 300-dimensional vector, if we are using the pretrained vectors from the Google news corpus. When we use Word2vec representations for these words and we subtract the vector of Germany from the vector of Berlin and add the vector of France to it, we will get a vector that is very similar to the vector of Paris. The Word2vec model thus carries the meaning of words in the vectors. The information carried by these vectors constitutes a very useful feature for our task. For a user-friendly, yet more in-depth, explanation and description of possible applications of Word2vec, we suggest reading https://www.distilled.net/resources/a-beginners-guide-to-Word2vec-aka-whats-the-opposite-of-canada/, or if you need a more mathematically defined explanation, we recommend reading this paper: http://www.1-4-5.net/~dmm/ml/how_does_Word2vec_work.pdf To load the Word2vec features, we will be using Gensim. If you don't have Gensim, you can install it easily using pip. At this time, it is suggested you also install the pyemd package, which will be used by the WMD distance function, a function that will help us to relate two Word2vec vectors: pip install gensim pip install pyemd To load the Word2vec model, we download the GoogleNews-vectors-negative300.bin.gz binary and use Gensim's load_Word2vec_format function to load it into memory. You can easily download the binary from an Amazon AWS repository using the wget command from a shell: wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz" After downloading and decompressing the file, you can use it with the Gensim KeyedVectors functions: import gensim model = gensim.models.KeyedVectors.load_word2vec_format( 'GoogleNews-vectors-negative300.bin.gz', binary=True) Now, we can easily get the vector of a word by calling model[word]. However, a problem arises when we are dealing with sentences instead of individual words. In our case, we need vectors for all of question1 and question2 in order to come up with some kind of comparison. For this, we can use the following code snippet. The snippet basically adds the vectors for all words in a sentence that are available in the Google news vectors and gives a normalized vector at the end. We can call this sentence to vector, or Sent2Vec. Make sure that you have Natural Language Tool Kit (NLTK) installed before running the preceding function: $ pip install nltk It is also suggested that you download the punkt and stopwords packages, as they are part of NLTK: import nltk nltk.download('punkt') nltk.download('stopwords') If NLTK is now available, you just have to run the following snippet and define the sent2vec function: from nltk.corpus import stopwords from nltk import word_tokenize stop_words = set(stopwords.words('english')) def sent2vec(s, model): M = [] words = word_tokenize(str(s).lower()) for word in words: #It shouldn't be a stopword if word not in stop_words: #nor contain numbers if word.isalpha(): #and be part of word2vec if word in model: M.append(model[word]) M = np.array(M) if len(M) > 0: v = M.sum(axis=0) return v / np.sqrt((v ** 2).sum()) else: return np.zeros(300) When the phrase is null, we arbitrarily decide to give back a standard vector of zero values. 
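As a quick sanity check of the sent2vec function, we can turn a pair of questions into vectors and compare them; this snippet is purely illustrative and assumes the model loaded above is still in memory. Because sent2vec returns L2-normalized vectors, their dot product is directly the cosine similarity:

v1 = sent2vec("Why did Trump win the Presidency?", model)
v2 = sent2vec("How did Donald Trump win the 2016 Presidential Election", model)
print(np.dot(v1, v2))   # closer to 1 means more similar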
To calculate the similarity between the questions, another feature that we created was word mover's distance. Word mover's distance uses Word2vec embeddings and works on a principle similar to that of earth mover's distance to give a distance between two text documents. Simply put, word mover's distance provides the minimum distance needed to move all the words from one document to another document. The WMD has been introduced by this paper: KUSNER, Matt, et al. From word embeddings to document distances. In: International Conference on Machine Learning. 2015. p. 957-966 which can be found at http://proceedings.mlr.press/v37/kusnerb15.pdf. For a hands-on tutorial on the distance, you can also refer to this tutorial based on the Gensim implementation of the distance: https://markroxor.github.io/gensim/static/notebooks/WMD_tutorial.html Final Word2vec (w2v) features also include other distances, more usual ones such as the Euclidean or cosine distance. We complete the sequence of features with some measurement of the distribution of the two document vectors: Word mover distance Normalized word mover distance Cosine distance between vectors of question1 and question2 Manhattan distance between vectors of question1 and question2 Jaccard similarity between vectors of question1 and question2 Canberra distance between vectors of question1 and question2 Euclidean distance between vectors of question1 and question2 Minkowski distance between vectors of question1 and question2 Braycurtis distance between vectors of question1 and question2 The skew of the vector for question1 The skew of the vector for question2 The kurtosis of the vector for question1 The kurtosis of the vector for question2 All the Word2vec features are denoted by fs4. A separate set of w2v features consists in the matrices of Word2vec vectors themselves: Word2vec vector for question1 Word2vec vector for question2 These will be represented by fs5: w2v_q1 = np.array([sent2vec(q, model) for q in data.question1]) w2v_q2 = np.array([sent2vec(q, model) for q in data.question2]) In order to easily implement all the different distance measures between the vectors of the Word2vec embeddings of the Quora questions, we use the implementations found in the scipy.spatial.distance module: from scipy.spatial.distance import cosine, cityblock, jaccard, canberra, euclidean, minkowski, braycurtis data['cosine_distance'] = [cosine(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['cityblock_distance'] = [cityblock(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['jaccard_distance'] = [jaccard(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['canberra_distance'] = [canberra(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['euclidean_distance'] = [euclidean(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] data['minkowski_distance'] = [minkowski(x,y,3) for (x,y) in zip(w2v_q1, w2v_q2)] data['braycurtis_distance'] = [braycurtis(x,y) for (x,y) in zip(w2v_q1, w2v_q2)] All the features names related to distances are gathered under the list fs4_1: fs4_1 = ['cosine_distance', 'cityblock_distance', 'jaccard_distance', 'canberra_distance', 'euclidean_distance', 'minkowski_distance', 'braycurtis_distance'] The Word2vec matrices for the two questions are instead horizontally stacked and stored away in the w2v variable for later usage: w2v = np.hstack((w2v_q1, w2v_q2)) The Word Mover's Distance is implemented using a function that returns the distance between two questions, after having transformed them into lowercase and after removing any stopwords. 
Moreover, we also calculate a normalized version of the distance, after transforming all the Word2vec vectors into L2-normalized vectors (each vector is transformed to unit norm, that is, if we squared each element in the vector and summed all of them, the result would equal one) using the init_sims method:

def wmd(s1, s2, model):
    s1 = str(s1).lower().split()
    s2 = str(s2).lower().split()
    stop_words = stopwords.words('english')
    s1 = [w for w in s1 if w not in stop_words]
    s2 = [w for w in s2 if w not in stop_words]
    return model.wmdistance(s1, s2)

data['wmd'] = data.apply(lambda x: wmd(x['question1'], x['question2'], model), axis=1)
model.init_sims(replace=True)
data['norm_wmd'] = data.apply(lambda x: wmd(x['question1'], x['question2'], model), axis=1)
fs4_2 = ['wmd', 'norm_wmd']

After these last computations, we now have most of the important features that are needed to create some basic machine learning models, which will serve as a benchmark for our deep learning models. The following table displays a snapshot of the available features:

Let's train some machine learning models on these and other Word2vec-based features.

Testing machine learning models

Before proceeding, depending on your system, you may need to clean up the memory a bit and free space for the machine learning models by removing previously used data structures. This is done using gc.collect, after deleting any past variables that are not required anymore, and then checking the available memory by exact reporting from the psutil.virtual_memory function:

import gc
import psutil

del([tfv_q1, tfv_q2, tfv, q1q2, question1_vectors, question2_vectors, svd_q1, svd_q2, q1_tfidf, q2_tfidf])
del([w2v_q1, w2v_q2])
del([model])
gc.collect()
psutil.virtual_memory()

At this point, we simply recap the different features created up to now, and their meaning in terms of generated features:

fs_1: List of basic features
fs_2: List of fuzzy features
fs3_1: Sparse data matrix of TFIDF for separated questions
fs3_2: Sparse data matrix of TFIDF for combined questions
fs3_3: Sparse data matrix of SVD
fs3_4: List of SVD statistics
fs4_1: List of Word2vec distances
fs4_2: List of WMD distances
w2v: A matrix of sentence vectors obtained by transforming each question with the Sent2Vec function

We evaluate two basic and very popular machine learning models, namely logistic regression and gradient boosting, the latter using the xgboost package in Python. The following table provides the performance of the logistic regression and xgboost algorithms on the different sets of features created earlier, as obtained during the Kaggle competition:

Feature set | Logistic regression accuracy | xgboost accuracy
Basic features (fs1) | 0.658 | 0.721
Basic features + fuzzy features (fs1 + fs2) | 0.660 | 0.738
Basic features + fuzzy features + w2v features (fs1 + fs2 + fs4) | 0.676 | 0.766
W2v vector features (fs5) | * | 0.78
Basic features + fuzzy features + w2v features + w2v vector features (fs1 + fs2 + fs4 + fs5) | * | 0.814
TFIDF-SVD features (fs3-1) | 0.777 | 0.749
TFIDF-SVD features (fs3-2) | 0.804 | 0.748
TFIDF-SVD features (fs3-3) | 0.706 | 0.763
TFIDF-SVD features (fs3-4) | 0.700 | 0.753
TFIDF-SVD features (fs3-5) | 0.714 | 0.759

* = These models were not trained due to high memory requirements.

We can treat the performances achieved as benchmarks or baseline numbers before starting with deep learning models, but we won't limit ourselves to that, and we will be trying to replicate some of them.

We are going to start by importing all the necessary packages. As for the logistic regression, we will be using the scikit-learn implementation.
xgboost is a scalable, portable, and distributed gradient boosting library (a tree ensemble machine learning algorithm). Initially created by Tianqi Chen at the University of Washington, it has been enriched with a Python wrapper by Bing Xu and an R interface by Tong He (you can read the story behind xgboost directly from its principal creator at homes.cs.washington.edu/~tqchen/2016/03/10/story-and-lessons-behind-the-evolution-of-xgboost.html). xgboost is available for Python, R, Java, Scala, Julia, and C++, and it can work both on a single machine (leveraging multithreading) and in Hadoop and Spark clusters.

Detailed instructions for installing xgboost on your system can be found on this page: github.com/dmlc/xgboost/blob/master/doc/build.md

The installation of xgboost on both Linux and macOS is quite straightforward, whereas it is a little bit trickier for Windows users. For this reason, we provide specific installation steps for getting xgboost working on Windows:

First, download and install Git for Windows (git-for-windows.github.io).

Then, you need a MinGW compiler present on your system. You can download it from www.mingw.org according to the characteristics of your system.

From the command line, execute:

$> git clone --recursive https://github.com/dmlc/xgboost
$> cd xgboost
$> git submodule init
$> git submodule update

Then, still from the command line, copy the configuration for 64-bit systems to be the default one:

$> copy make\mingw64.mk config.mk

Alternatively, you can just copy the plain 32-bit version:

$> copy make\mingw.mk config.mk

After copying the configuration file, you can run the compiler, setting it to use four threads in order to speed up the compiling process:

$> mingw32-make -j4

In MinGW, the make command comes with the name mingw32-make; if you are using a different compiler, the previous command may not work, but you can simply try:

$> make -j4

Finally, if the compiler completed its work without errors, you can install the package in Python with:

$> cd python-package
$> python setup.py install

If xgboost has been properly installed on your system, you can proceed with importing both machine learning algorithms:

from sklearn import linear_model
from sklearn.preprocessing import StandardScaler
import xgboost as xgb

Since we will be using a logistic regression solver that is sensitive to the scale of the features (it is the sag solver from https://github.com/EpistasisLab/tpot/issues/292, which requires computational time that is linear with respect to the size of the data), we will start by standardizing the data using the StandardScaler in scikit-learn:

scaler = StandardScaler()
y = data.is_duplicate.values
y = y.astype('float32').reshape(-1, 1)
X = data[fs_1 + fs_2 + fs3_4 + fs4_1 + fs4_2]
X = X.replace([np.inf, -np.inf], np.nan).fillna(0).values
X = scaler.fit_transform(X)
X = np.hstack((X, fs3_3))

We select the data for training by first filtering the fs_1, fs_2, fs3_4, fs4_1, and fs4_2 sets of variables, and then stacking the fs3_3 SVD data matrix.
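A small caveat about the final stacking line: np.hstack only works correctly on dense NumPy arrays. If fs3_3 is actually held as a scipy sparse matrix (as its description in the feature recap might suggest), a sparse-aware stacking such as the following sketch would be needed instead of the last line above; treat this as a hedged alternative for that case, not a step of the original recipe:

from scipy import sparse
import numpy as np

# Only needed if fs3_3 is a scipy sparse matrix; np.hstack does not handle sparse inputs.
if sparse.issparse(fs3_3):
    X = sparse.hstack((sparse.csr_matrix(X), fs3_3)).toarray()
else:
    X = np.hstack((X, fs3_3))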
We also provide a random split, setting aside 1/10 of the data for validation purposes (in order to effectively assess the quality of the created model):

np.random.seed(42)
n_all, _ = y.shape
idx = np.arange(n_all)
np.random.shuffle(idx)
n_split = n_all // 10
idx_val = idx[:n_split]
idx_train = idx[n_split:]
x_train = X[idx_train]
y_train = np.ravel(y[idx_train])
x_val = X[idx_val]
y_val = np.ravel(y[idx_val])

As a first model, we try logistic regression, setting the L2 regularization parameter C to 0.1 (modest regularization). Once the model is ready, we test its efficacy on the validation set (x_val for the validation features, y_val for the correct answers). The results are assessed on accuracy, that is, the proportion of exact guesses on the validation set:

logres = linear_model.LogisticRegression(C=0.1, solver='sag', max_iter=1000)
logres.fit(x_train, y_train)
lr_preds = logres.predict(x_val)
log_res_accuracy = np.sum(lr_preds == y_val) / len(y_val)
print("Logistic regr accuracy: %0.3f" % log_res_accuracy)

After a while (the solver has a maximum of 1,000 iterations before giving up on converging), the resulting accuracy on the validation set will be 0.743, which will be our starting baseline.

Now, we try to predict using the xgboost algorithm. Being a gradient boosting algorithm, this learning algorithm has more variance (the ability to fit complex predictive functions, but also to overfit) than a simple logistic regression, which is afflicted by greater bias (in the end, it is a summation of coefficients), so we expect much better results. We fix the maximum depth of its decision trees to 4 (a shallow value, which should prevent overfitting) and we use an eta of 0.02 (it will need to grow many trees because the learning rate is low). We also set up a watchlist, keeping an eye on the validation set for an early stop if the expected error on the validation set doesn't decrease for over 50 steps.

It is not best practice to stop early on the same set (the validation set, in our case) that we use for reporting the final results. In a real-world setting, ideally, we should set up a validation set for tuning operations, such as early stopping, and a test set for reporting the expected results when generalizing to new data.

After setting all this up, we run the algorithm. This time, we will have to wait longer than when we ran the logistic regression:

params = dict()
params['objective'] = 'binary:logistic'
params['eval_metric'] = ['logloss', 'error']
params['eta'] = 0.02
params['max_depth'] = 4

d_train = xgb.DMatrix(x_train, label=y_train)
d_valid = xgb.DMatrix(x_val, label=y_val)
watchlist = [(d_train, 'train'), (d_valid, 'valid')]
bst = xgb.train(params, d_train, 5000, watchlist, early_stopping_rounds=50, verbose_eval=100)
xgb_preds = (bst.predict(d_valid) >= 0.5).astype(int)
xgb_accuracy = np.sum(xgb_preds == y_val) / len(y_val)
print("Xgb accuracy: %0.3f" % xgb_accuracy)

The final result reported by xgboost is 0.803 accuracy on the validation set.

Building the TensorFlow model

The deep learning models in this article are built using TensorFlow, based on the original script written by Abhishek Thakur using Keras (you can read the original code at https://github.com/abhishekkrthakur/is_that_a_duplicate_quora_question). Keras is a Python library that provides an easy interface to TensorFlow. TensorFlow has official support for Keras, and the models trained using Keras can easily be converted to TensorFlow models. Keras enables the very fast prototyping and testing of deep learning models.
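To give a flavour of that prototyping convenience, the toy snippet below declares a small fully connected classifier over our dense feature matrix using the tf.keras API in just a few lines. It is purely illustrative: the layer sizes are arbitrary, it is not the model built in this project, and it assumes the X matrix from the earlier steps and a TensorFlow version that ships tf.keras:

import tensorflow as tf

# A minimal two-layer binary classifier over the engineered features, declared with Keras.
toy_model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(X.shape[1],)),
    tf.keras.layers.Dense(1, activation='sigmoid')
])
toy_model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
toy_model.summary()  # a quick look at the layers and parameter counts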
In our project, however, we rewrote the solution entirely in TensorFlow from scratch. To start, let's import the necessary libraries, in particular TensorFlow, and check its version by printing it:

import zipfile
from tqdm import tqdm_notebook as tqdm
import tensorflow as tf

print("TensorFlow version %s" % tf.__version__)

At this point, we simply load the data into the df pandas DataFrame, or we load it from disk. We replace the missing values with an empty string and we set the y variable containing the target answer, encoded as 1 (duplicated) or 0 (not duplicated):

try:
    df = data[['question1', 'question2', 'is_duplicate']]
except:
    df = pd.read_csv('data/quora_duplicate_questions.tsv', sep='\t')
    df = df.drop(['id', 'qid1', 'qid2'], axis=1)

df = df.fillna('')
y = df.is_duplicate.values
y = y.astype('float32').reshape(-1, 1)

To summarize, we built a model with the help of TensorFlow in order to detect duplicated questions from the Quora dataset. To know more about how to build and train your own deep learning models with TensorFlow confidently, do check out the book TensorFlow Deep Learning Projects.

TensorFlow 1.9.0-rc0 release announced
Implementing feedforward networks with TensorFlow
How TFLearn makes building TensorFlow models easier
npm JavaScript predictions for 2019: React, GraphQL, and TypeScript are three technologies to learn

Bhagyashree R
10 Dec 2018
3 min read
Based on Laurie Voss' talk at Node+JS Interactive 2018, npm shared on Friday some insights and predictions about JavaScript for 2019. These predictions are aimed at helping developers make better technical choices in 2019. Here are the four predictions npm has made:

"You will abandon one of your current tools."

In JavaScript, frameworks and tools don't last; they generally enjoy a phase of peak popularity of 3-5 years, followed by a slow decline as developers have to maintain legacy applications but move to newer frameworks for new work. Mr. Voss said in his talk, "Nothing lasts forever!.. Any framework that we see today will have its heyday and then it will have an after-life where it will slowly slowly degrade." For developers, this essentially means that it is better to keep learning new frameworks instead of holding on to their current tools too tightly.

"Despite a slowdown in growth, React will be the dominant framework in 2019."

Though React's growth slowed down in 2018 compared to 2017, it still continues to dominate the web scene: 60% of npm survey respondents said they are using React. In 2019, npm says that more people will use React for building web applications, and as the number of people using it grows, so will the tutorials, advice, and bug fixes around it.

"You'll need to learn GraphQL."

GraphQL client libraries are showing tremendous growth in popularity, and as per npm, GraphQL is going to be a "technical force to reckon with in 2019." It was first publicly released in 2015 and it is still too early to put it into production, but going by its growing popularity, developers are recommended to learn its concepts in 2019. npm also predicts that developers will find themselves using GraphQL in new projects later in the year and in 2020.

"Somebody on your team will bring in TypeScript."

npm's survey uncovered that 46% of the respondents were using Microsoft's TypeScript, a typed superset of JavaScript that compiles to plain JavaScript. One of the reasons for this major adoption by enthusiasts could be the extra safety TypeScript provides through type-checking. Adopting TypeScript in 2019 could prove really useful, especially if you're a member of a larger team.

Read the detailed report and predictions on npm's website.

4 key findings from The State of JavaScript 2018 developer survey
TypeScript 3.2 released with configuration inheritance and more
7 reasons to choose GraphQL APIs over REST for building your APIs