Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7019 Articles
article-image-tools-intypescript
Packt
02 Jan 2017
14 min read
Save for later

Tools inTypeScript

Packt
02 Jan 2017
14 min read
In this article by Nathan Rozentals, author of the book Mastering TypeScript, Second Edition, you will learn how to build enterprise-ready, industrial web applications using TypeScript and leading JavaScript frameworks. In this article, we will cover the following topics: What is TypeScript? The benefits of TypeScript (For more resources related to this topic, see here.) What is TypeScript? TypeScript is both a language and a set of tools to generate JavaScript. It was designed by Anders Hejlsberg at Microsoft (the designer of C#) as an open source project to help developers write enterprise scale JavaScript. TypeScript generates JavaScript – it's as simple as that. Instead of requiring a completely new runtime environment, TypeScript generated JavaScript can reuse all of the existing JavaScript tools, frameworks, and wealth of libraries that are available for JavaScript. The TypeScript language and compiler, however, brings the development of JavaScript closer to a more traditional object-oriented experience. EcmaScript JavaScript as a language has been around for a long time, and is also governed by a language feature standard. The language defined in this standard is called ECMAScript, and each JavaScript interpreter must deliver functions and features that conform to this standard. The definition of this standard helped the growth of JavaScript and the web in general, and allowed websites to render correctly on many different browsers on many different operating systems. The ECMAScript standard was published in 1999 and is known as ECMA-262, third edition. With the popularity of the language, and the explosive growth of internet applications, the ECMAScript standard needed to be revised and updated. This process resulted in a draft specification for ECMAScript, called the fourth edition. Unfortunately, this draft suggested a complete overhaul of the language, and was not well received. Eventually, leaders from Yahoo, Google, and Microsoft tabled an alternate proposal which they called ECMAScript 3.1. This proposal was numbered 3.1, as it was a smaller feature set of the third edition, and sat between edition three and four of the standard. This proposal for language changes tabled earlier was eventually adopted as the fifth edition of the standard, and was called ECMAScript 5. The ECMAScript fourth edition was never published, but it was decided to merge the best features of both the fourth edition and the 3.1 feature set into a sixth edition named ECMAScript Harmony. The TypeScript compiler has a parameter that can switch between different versions of the ECMAScript standard. TypeScript currently supports ECMAScript 3, ECMAScript 5, and ECMAScript 6. When the compiler runs over your TypeScript, it will generate compile errors if the code you are attempting to compile is not valid for that standard. The team at Microsoft has committed to follow the ECMAScript standards in any new versions of the TypeScript compiler, so as new editions are adopted, the TypeScript language and compiler will follow suit. The benefits of TypeScript To give you a flavor of the benefits of TypeScript (and this is by no means the full list), let's have a very quick look at some of the things that TypeScript brings to the table: A compilation step Strong or static typing Type definitions for popular JavaScript libraries Encapsulation Private and public member variable decorators Compiling One of the most frustrating things about JavaScript development is the lack of a compilation step. JavaScript is an interpreted language, and therefore needs to be run in order to test that it is valid. Every JavaScript developer will tell horror stories of hours spent trying to find bugs in their code, only to find that they have missed a stray closing brace { , or a simple comma ,– or even a double quote " where there should have been a single quote ' . Even worse, the real headaches arrive when you misspell a property name, or unwittingly reassign a global variable. TypeScript will compile your code, and generate compilation errors where it finds these sorts of syntax errors. This is obviously very useful, and can help to highlight errors before the JavaScript is run. In large projects, programmers will often need to do large code merges and with today's tools doing automatic merges, it is surprising how often the compiler will pick up these types of errors. While tools to do this sort of syntax checking like JSLint have been around for years, it is obviously beneficial to have these tools integrated into your IDE. Using TypeScript in a continuous integration environment will also fail a build completely when compilation errors are found, further protecting your programmers against these types of bugs. Strong typing JavaScript is not strongly typed. It is a language that is very dynamic, as it allows objects to change their properties and behavior on the fly. As an example of this, consider the following code: var test = "this is a string"; test = 1; test = function(a, b) { return a + b; } On the first line of the preceding code, the variable test is bound to a string. It is then assigned a number, and finally is redefined to be a function that expects two parameters. Traditional object-oriented languages, however, will not allow the type of a variable to change, hence they are called strongly typed languages. While all of the preceding code is valid JavaScript and could be justified, it is quite easy to see how this could cause runtime errors during execution. Imagine that you were responsible for writing a library function to add two numbers, and then another developer inadvertently reassigned your function to subtract these numbers instead. These sorts of errors may be easy to spot in a few lines of code, but it becomes increasingly difficult to find and fix these as your code base and your development team grows. Another feature of strong typing is that the IDE you are working in understands what type of variable you are working with, and can bring better autocomplete or Intellisense options to the fore. TypeScript's syntactic sugar TypeScript introduces a very simple syntax to check the type of an object at compile time. This syntax has been referred to as syntactic sugar, or more formally, type annotations. Consider the following TypeScript code: var test: string = "this is a string"; test = 1; test = function(a, b) { return a + b; } Note on the first line of this code snippet, we have introduced a colon : and a string keyword between our variable and its assignment. This type annotation syntax means that we are setting the type of our variable to be of type string, and that any code that does not adhere to these rules will generate a compile error. Running the preceding code through the TypeScript compiler will generate two errors: hello.ts(3,1): error TS2322: Type 'number' is not assignable to type 'string'. hello.ts(4,1): error TS2322: Type '(a: any, b: any) => any' is not assignable to type 'string'. The first error is fairly obvious. We have specified that the variable test is a string, and therefore attempting to assign a number to it will generate a compile error. The second error is similar to the first, and is, in essence, saying that we cannot assign a function to a string. In this way, the TypeScript compiler introduces strong or static typing to your JavaScript code, giving you all of the benefits of a strongly typed language. TypeScript is therefore described as a superset of JavaScript. Type definitions for popular JavaScript libraries As we have seen, TypeScript has the ability to annotate JavaScript, and bring strong typing to the JavaScript development experience. But how do we strongly type existing JavaScript libraries? The answer is surprisingly simple—by creating a definition file. TypeScript uses files with a .d.ts extension as a sort of header file, similar to languages such as C++, to superimpose strongly typing on existing JavaScript libraries. These definition files hold information that describes each available function, and or variables, along with their associated type annotations. Let's have a quick look at what a definition would look like. As an example, I have lifted a function from the popular Jasmine unit testing framework, called describe: var describe = function(description, specDefinitions) { return jasmine.getEnv().describe(description, specDefinitions); }; Note that this function has two parameters—description and specDefinitions. But JavaScript does not tell us what sort of variables these are. We would need to have a look at the Jasmine documentation to figure out how to call this function. If we head over to http://jasmine.github.io/2.0/introduction.html, we will see an example of how to use this function: describe("A suite", function () { it("contains spec with an expectation", function () { expect(true).toBe(true); }); }); From the documentation, then, we can easily see that the first parameter is a string, and the second parameter is a function. But there is nothing in JavaScript that forces us to conform to this API. As mentioned before, we could easily call this function with two numbers or inadvertently switch the parameters around, sending a function first, and a string second. We will obviously start getting runtime errors if we do this, but TypeScript using a definition file can generate compile-time errors before we even attempt to run this code. Let's have a look at a piece of the jasmine.d.ts definition file: declare function describe( description: string, specDefinitions: () => void ): void; This is the TypeScript definition for the describe function. Firstly, declare function describe tells us that we can use a function called describe, but that the implementation of this function will be provided at runtime. Clearly, the description parameter is strongly typed to a string, and the specDefinitions parameter is strongly typed to be a function that returns void. TypeScript uses the double braces () syntax to declare functions, and the arrow syntax to show the return type of the function. So () => void is a function that does not return anything. Finally, the describe function itself will return void. If our code were to try and pass in a function as the first parameter, and a string as the second parameter (clearly breaking the definition of this function), as shown in the following example: describe(() => { /* function body */}, "description"); TypeScript will generate the following error: hello.ts(11,11): error TS2345: Argument of type '() => void' is not assignable to parameter of type 'string'. This error is telling us that we are attempting to call the describe function with invalid parameters,andit clearly shows that TypeScript will generate errors if we attempt to use external JavaScript libraries incorrectly. DefinitelyTyped Soon after TypeScript was released, Boris Yankov started a Git Hub repository to house definition files, called Definitely Typed (http://definitelytyped.org). This repository has now become the first port of call for integrating external libraries into TypeScript, and it currently holds definitions for over 1,600 JavaScript libraries. Encapsulation One of the fundamental principles of object-oriented programming is encapsulation—the ability to define data, as well as a set of functions that can operate on that data, into a single component. Most programming languages have the concept of a class for this purpose, providing a way to define a template for data and related functions. Let's first take a look at a simple TypeScript class definition: class MyClass { add(x, y) { return x + y; } } varclassInstance = new MyClass(); var result = classInstance.add(1,2); console.log(`add(1,2) returns ${result}`); This code is pretty simple to read and understand. We have created a class, named MyClass, with a simple add function. To use this class we simply create an instance of it, and call the add function with two arguments. JavaScript, unfortunately, does not have a class statement, but instead uses functions to reproduce the functionality of classes. Encapsulation through classes is accomplished by either using the prototype pattern, or by using the closure pattern. Understanding prototypes and the closure pattern, and using them correctly, is considered a fundamental skill when writing enterprise-scale JavaScript. A closure is essentially a function that refers to independent variables. This means that variables defined within a closure function remember the environment in which they were created. This provides JavaScript with a way to define local variables, and provide encapsulation. Writing the MyClass definition in the preceding code, using a closure in JavaScript, would look something like the following: varMyClass = (function () { // the self-invoking function is the // environment that will be remembered // by the closure function MyClass() { // MyClass is the inner function, // the closure } MyClass.prototype.add = function (x, y) { return x + y; }; return MyClass; }()); varclassInstance = new MyClass(); var result = classInstance.add(1, 2); console.log("add(1,2) returns " + result); We start with a variable called MyClass, and assign it to a function that is executed immediately note the })(); syntax near the bottom of the closure definition. This syntax is a common way to write JavaScript in order to avoid leaking variables into the global namespace. We then define a new function named MyClass, and return this new function to the outer calling function. We then use the prototype keyword to inject a new function into the MyClass definition. This function is named add and takes two parameters, returning their sum. The last few lines of the code show how to use this closure in JavaScript. Create an instance of the closure type, and then execute the add function. Running this code will log add(1,2) returns 3 to the console, as expected. Looking at the JavaScript code versus the TypeScript code, we can easily see how simple the TypeScript looks compared to the equivalent JavaScript. Remember how we mentioned that JavaScript programmers can easily misplace a brace {, or a bracket (? Have a look at the last line in the closure definition—})();. Getting one of these brackets or braces wrong can take hours of debugging to find. Public and private accessors A further object oriented principle that is used in Encapsulation is the concept of data hiding that is the ability to have public and private variables. Private variables are meant to be hidden to the user of a particular class as these variables should only be used by the class itself. Inadvertently exposing these variables can easily cause runtime errors. Unfortunately, JavaScript does not have a native way of declaring variables private. While this functionality can be emulated using closures, a lot of JavaScript programmers simply use the underscore character _ to denote a private variable. At runtime though, if you know the name of a private variable you can easily assign a value to it. Consider the following JavaScript code: varMyClass = (function() { function MyClass() { this._count = 0; } MyClass.prototype.countUp = function() { this._count ++; } MyClass.prototype.getCountUp = function() { return this._count; } return MyClass; }()); var test = new MyClass(); test._count = 17; console.log("countUp : " + test.getCountUp()); The MyClass variable is actually a closure with a constructor function, a countUp function, and a getCountUp function. The variable _count is supposed to be a private member variable that is used only within the scope of the closure. Using the underscore naming convention gives the user of this class some indication that the variable is private, but JavaScript will still allow you to manipulate the variable _count. Take a look at the second last line of the code snippet. We are explicitly setting the value of _count to 17 which is allowed by JavaScript, but not desired by the original creator of the class. The output of this code would be countUp : 17. TypeScript, however, introduces the public and private keywords that can be used on class member variables. Trying to access a class member variable that has been marked as private will generate a compile time error. As an example of this, the JavaScript code above can be written in TypeScript, as follows: class CountClass { private _count: number; constructor() { this._count = 0; } countUp() { this._count ++; } getCount() { return this._count; } } varcountInstance = new CountClass() ; countInstance._count = 17; On the second line of our code snippet, we have declared a private member variable named _count. Again, we have a constructor, a countUp, and a getCount function. If we compile this file, the compiler will generate an error: hello.ts(39,15): error TS2341: Property '_count' is private and only accessible within class 'CountClass'. This error is generated because we are trying to access the private variable _count in the last line of the code. The TypeScript compiler, therefore, is helping us to adhere to public and private accessors by generating a compile error when we inadvertently break this rule. Summary In this article, we took a quick look at what TypeScript is and what benefits it can bring to the JavaScript development experience. Resources for Article: Further resources on this subject: Introducing Object Oriented Programmng with TypeScript [article] Understanding Patterns and Architecturesin TypeScript [article] Writing SOLID JavaScript code with TypeScript [article]
Read more
  • 0
  • 0
  • 22189

article-image-linux-programmers-opposed-to-new-code-of-conduct-threaten-to-pull-code-from-project
Melisha Dsouza
25 Sep 2018
6 min read
Save for later

Linux programmers opposed to new Code of Conduct threaten to pull code from project

Melisha Dsouza
25 Sep 2018
6 min read
Facts of the Controversy at hand To “help make the kernel community a welcoming environment to participate in”, Linux accepted a new Code of Conduct earlier this month. This created conflict among the developer community because of the clauses in the CoC. The CoC is derived from a Contributor Covenant , created by Coraline Ada Ehmke, a software developer, an open-source advocate, and an LGBT activist. Just 30 minutes after signing this CoC, Principal kernel contributor- Linus Torvalds sent a mail apologizing for his past behavior and announced temporary break to improve upon his behavior. “This week people in our community confronted me about my lifetime of not understanding emotions. My flippant attacks in emails have been both unprofessional and uncalled for. Especially at times when I made it personal. In my quest for a better patch, this made sense to me. I know now this was not OK and I am truly sorry” This decision of taking a break is speculated among many as a precautionary measure to prevent Torvalds from violating the newly created code of conduct. Controversies took better shape after Caroline’s sarcastic tweet-                                               Source: Twitter The new Linux Code of Conduct is causing huge conflict Linux’s move from its Code of Conflicts to a new Code of Conduct has not been received well by many of its developers. Some have threatened to pull out their blocks of code important to the project to revolt against the change. This could have serious consequences because Linux is one of the most important pieces of open source software in the world. If threats are put into action, large parts of the internet would be left vulnerable to exploits. Applications that use Linux would be like an incomplete Jenga stack that could collapse any minute. Why some Linux developers are opposed to the new code of conduct Here is a summary of developers views on the Code of Conduct that, according to them, justifies their decision: Amending to the CoC would mean good contributors are removed over trivial matters or even events that happened a long time ago–like Larry Garfield, a prominent Drupal contributor who was asked to step down after his sex fetish was made public. There is a lack of proper definitions for punishments, time frames, and what constitutes abuse or harassment, inappropriate behavior and leaves the Code of Conduct wide open for exploitation.  It gives the people charged with enforcement immense power. It could force acceptance of contributions that wouldn’t make the cut if made by cis white males. Developers are concerned that Linux will start accepting patches from anyone and everyone just to keep par with the Code of Conduct. It will no longer depend on the ability of a person to code, but on social parameters like race, color, sex, and gender. Why some developers believe in the new Linux Code of Conduct On the other side of the argument, here are some potential reasons why the CoC will foster social justice: Encouraging an inclusive and safe space for women, LGBTQIA+, and People of Color, who in the absence of the CoC are excluded, harassed, and sometimes even raped by cis white males. The CoC aims to overcome meritocracy, which in many organizations has consistently shown itself to mainly benefit those with privilege, to the exclusion of underrepresented people in technology VA vastmajority of Linux contributors are cis white males. CC’s Code of Conduct would enable the building of a more diverse demographic array of contributors What does this mean to the developer community? Linux includes programmers who are always free to contribute to its open source platform. Contributing good code would help them climb up the ladder and become a ‘maintainer’. The greatest strength of Linux was its flexibility. Developers would contribute to the kernel and be concerned about only a single entity- their code patch. The Linux community would judge the code based on its quality. However, with the new Code of Conduct, critics say this could make passing judgement on code more challenging, For them, the Code of Conduct is a set of rules that expects everyone to be at equal levels in the community. It could mean that certain patches are accepted for fear of contravening the Code of Conduct. Here is what Caroline Ada Ehmke was forthright in her criticism of this view:   Source: Twitter   Clearly, many of the fears of the Code of Conduct’s critics haven’t yet come to pass. What they’re ultimately worried about is that there could be negative consequences. Google Developer Ted Ts’o next on the hit list Earlier this week, activist Sage Sharp, tweeted about Ted Ts'o:                                                      Source: Twitter This perhaps needs some context - the beginning of this argument dates all the way back to 2011 when Ts’o when was a member of the Linux Foundation's technical advisory board-participated in a discussion on the mailing list for the Australian national Linux conference that year, making comments that were later interpreted by Aurora as rape apologism. Using Aurora's piece as a fuse, Google employee Matthew Garrett slammed Ts'o on his beliefs. In 2017, yielding to the demands of SJWs, Google threw out James Damore, an engineer who circulated an internal creed about reverse discrimination in hiring practices. The SJW’s are coming for him and the best way to go forward would be to “take a break”, just like Linus did. As claimed by Caroline, the underlying aim of the CoC was to guide people in behaving in a respectable way and create a positive environment for people irrespective of their race, ethnicity, religion, nationality and political views. However, overlooking this aim, developers are concerned with the loopholes in the CoC. Gain more insights on this news as well as views from members of the Linux community at itfloss. Linux drops Code of Conflict and adopts new Code of Conduct Linus Torvalds is sorry for his ‘hurtful behavior’, is taking ‘a break (from the Linux community) to get help’ NSA researchers present security improvements for Zephyr and Fucshia at Linux Security Summit 2018
Read more
  • 0
  • 0
  • 22176

article-image-10-machine-learning-tools-to-look-out-for-in-2018
Amey Varangaonkar
26 Dec 2017
7 min read
Save for later

10 Machine Learning Tools to watch in 2018

Amey Varangaonkar
26 Dec 2017
7 min read
2017 has been a wonderful year for Machine Learning. Developing smart, intelligent models has now become easier than ever thanks to the extensive research into and development of newer and more efficient tools and frameworks. While the likes of Tensorflow, Keras, PyTorch and some more have ruled the roost in 2017 as the top machine learning and deep learning libraries, 2018 promises to be even more exciting with a strong line-up of open source and enterprise tools ready to take over - or at least compete with - the current lot. In this article, we take a look at 10 such tools and frameworks which are expected to make it big in 2018. Amazon Sagemaker One of the major announcements in the AWS re:Invent 2017 was the general availability of Amazon Sagemaker - a new framework that eases the building and deployment of machine learning models on the cloud. This service will be of great use to developers who don’t have a deep exposure to machine learning, by giving them a variety of pre-built development environments, based on the popular Jupyter notebook format. Data scientists looking to build effective machine learning systems on AWS and to fine-tune their performance without spending a lot of time will also find this service useful. DSSTNE Yet another offering by Amazon, DSSTNE (popularly called as Destiny) is an open source library for developing machine learning models. It’s primary strength lies in the fact that it can be used to train and deploy recommendation models which work with sparse inputs. The models developed using DSSTNE can be trained to use multiple GPUs, are scalable and are optimized for fast performance. Boasting close to 4000 stars on GitHub, this library is yet another tool to look out for in 2018! Azure Machine Learning Workbench Way back in 2014, Microsoft put Machine Learning and AI capabilities on the cloud by releasing Azure Machine Learning. However, this was strictly a cloud-only service. During the Ignite 2017 conference held in September, Microsoft announced the next generation of Machine Learning on Azure - bringing machine learning capabilities to the organizations through their Azure Machine Learning Workbench. Azure ML Workbench is a cross-platform client which can run on both Windows and Apple machines. It is tailor-made for data scientists and machine learning developers who want to perform their data manipulation and wrangling tasks. Built for scalability, users can get intuitive insights from a broad range of data sources and use them for their data modeling tasks. Neon Way back in 2016, Intel announced their intentions to become a major player in the AI market with the $350 million acquisition of Nervana, an AI startup which had been developing both hardware and software for effective machine learning. With Neon, they now have a fast, high-performance deep learning framework designed specifically to run on top of the recently announced Nervana Neural Network Processor. Designed for ease of use and supporting integration with the iPython notebook, Neon supports training of common deep learning models such as CNN, RNN, LSTM and others. The framework is showing signs of continuous improvement and with over 3000 stars on GitHub, Neon looks set to challenge the major league of deep learning libraries in the years to come. Microsoft DMLT One of the major challenges with machine learning for enterprises is the need to scale out the models quickly, without compromising on the performance while minimising significant resource consumption. Microsoft’s Distributed Machine Learning framework is designed to do just that. Open sourced by Microsoft so that it can receive a much wider support from the community, DMLT allows machine learning developers and data scientists to take their single-machine algorithms and scale them out to build high performance distributed models. DMLT mostly focuses on distributed machine learning algorithms and allows you to perform tasks such as word embedding, sampling, and gradient boosting with ease. The framework does not have support for training deep learning models yet, however, we can expect this capability to be added to the framework very soon. Google Cloud Machine Learning Engine Considered to be Google’s premium machine learning offering, the Cloud Machine Learning Engine allows you to build machine learning models on all kinds of data with relative ease. Leveraging the popular Tensorflow machine learning framework, this platform can be used to perform predictive analytics at scale. It also lets you fine-tune and optimize the performance of your machine learning models using the popular HyperTune feature. With a serverless architecture supporting automated monitoring, provisioning and scaling, the Machine Learning Engine ensures you only have to worry about the kind of machine learning models you want to train. This feature is especially useful for machine learning developers looking to build large-scale models on the go. Apple Core ML Developed by Apple to help iOS developers build smarter applications, the Core ML framework is what makes Siri smarter. It takes advantage of both CPU and GPU capabilities to allow the developers to build different kinds of machine learning and deep learning models, which can then be integrated seamlessly into the iOS applications. Core ML supports all popularly used machine learning algorithms such as decision trees, Support Vector Machines, linear models and more. Targeting a variety of real-world use-cases such as natural language processing, computer vision and more, Core ML’s capabilities make it possible to analyze data on the Apple devices on the go, without having to import to the models for learning. Apple Turi Create In many cases, the iOS developers want to customize the machine learning models they want to integrate into their apps. For this, Apple has come up with Turi Create. This library allows you to focus on the task at hand rather than deciding which algorithm to use. You can be flexible in terms of the data set, the scale at which the model needs to operate and what platform the models need to be deployed to. Turi Create comes in very handy for building custom models for recommendations, image processing, text classification and many more tasks. All you need is some knowledge of Python to get started! Convnetjs Move over supercomputers and clusters of machines, deep learning is well and truly here - on your web browsers! You can now train your advanced machine learning and deep learning models directly on your browser, without needing a CPU or a GPU, using the popular Javascript-based Convnetjs library. Originally written by Andrej Karpathy, the current director of AI at Tesla, the library has since been open sourced and extended by the contributions of the community. You can easily train deep neural networks and even reinforcement learning models on your browser directly, powered by this very unique and useful library. This library is suited for those who do not wish to purchase serious hardware for training computationally-intensive models. With close to 9000 stars on GitHub, Convnetjs has been one of the rising stars in 2017 and is quickly becoming THE go-to library for deep learning. BigML BigML is a popular machine learning company that provides an easy to use platform for developing machine learning models. Using BigML’s REST API, you can seamlessly train your machine learning models on their platform. It allows you to perform different tasks such as anomaly detection, time series forecasting, and build apps that perform real-time predictive analytics. With BigML, you can deploy your models on-premise or on the cloud, giving you the flexibility of selecting the kind of environment you need to run your machine learning models. True to their promise, BigML really do make ‘machine learning beautifully simple for everyone’. So there you have it! With Microsoft, Amazon, and Google all fighting for supremacy in the AI space, 2018 could prove to be a breakthrough year for developments in Artificial Intelligence. Add to this mix the various open source libraries that aim to simplify machine learning for the users, and you get a very interesting list of tools and frameworks to keep a tab on. The exciting thing about all this is - all of them possess the capability to become the next TensorFlow and cause the next AI disruption.  
Read more
  • 0
  • 0
  • 22150

article-image-experts-discuss-dark-patterns-and-deceptive-ui-designs-what-are-they-what-do-they-do-how-do-we-stop-them
Sugandha Lahoti
29 Jun 2019
12 min read
Save for later

Experts discuss Dark Patterns and deceptive UI designs: What are they? What do they do? How do we stop them?

Sugandha Lahoti
29 Jun 2019
12 min read
Dark patterns are often used online to deceive users into taking actions they would otherwise not take under effective, informed consent. Dark patterns are generally used by shopping websites, social media platforms, mobile apps and services as a part of their user interface design choices. Dark patterns can lead to financial loss, tricking users into giving up vast amounts of personal data, or inducing compulsive and addictive behavior in adults and children. Using dark patterns is unambiguously unlawful in the United States (under Section 5 of the Federal Trade Commission Act and similar state laws), the European Union (under the Unfair Commercial Practices Directive and similar member state laws), and numerous other jurisdictions. Earlier this week, at the Russell Senate Office Building, a panel of experts met to discuss the implications of Dark patterns in the session, Deceptive Design and Dark Patterns: What are they? What do they do? How do we stop them? The session included remarks from Senator. Mark Warner and Deb Fischer, sponsors of the DETOUR Act, and a panel of experts including Tristan Harris (Co-Founder and Executive Director, Center for Humane Technology). The entire panel of experts included: Tristan Harris (Co-Founder and Executive Director, Center for Humane Technology) Rana Foroohar (Global Business Columnist and Associate Editor, Financial Times) Amina Fazlullah (Policy Counsel, Common Sense Media) Paul Ohm (Professor of Law and Associate Dean, Georgetown Law School), also the moderator Katie McInnis (Policy Counsel, Consumer Reports) Marshall Erwin (Senior Director of Trust & Security, Mozilla) Arunesh Mathur (Dept. of Computer Science, Princeton University) Dark patterns are growing in social media platforms, video games, shopping websites, and are increasingly used to target children The expert session was inaugurated by Arunesh Mathur (Dept. of Computer Science, Princeton University) who talked about his new study by researchers from Princeton University and the University of Chicago. The study suggests that shopping websites are abundant with dark patterns that rely on consumer deception. The researchers conducted a large-scale study, analyzing almost 53K product pages from 11K shopping websites to characterize and quantify the prevalence of dark patterns. They so discovered 1,841 instances of dark patterns on shopping websites, which together represent 15 types of dark patterns. One of the dark patterns was Sneak into Website, which adds additional products to users’ shopping carts without their consent. For example, you would buy a bouquet on a website and the website without your consent would add a greeting card in the hopes that you will actually purchase it. Katie McInnis agreed and added that Dark patterns not only undermine the choices that are available to users on social media and shopping platforms but they can also cost users money. User interfaces are sometimes designed to push a user away from protecting their privacy, making it tough to evaluate them. Amina Fazlullah, Policy Counsel, Common Sense Media said that dark patterns are also being used to target children. Manipulative apps use design techniques to shame or confuse children into in-app purchases or trying to keep them on the app for longer. Children mostly are unable to discern these manipulative techniques. Sometimes the screen will have icons or buttons that will appear to be a part of game play and children will click on them not realizing that they're either being asked to make a purchase or being shown an ad or being directed onto another site. There are games which ask for payments or microtransactions to continue the game forward. Mozilla uses product transparency to curb Dark patterns Marshall Erwin, Senior Director of Trust & Security at Mozilla talked about the negative effects of dark patterns and how they make their own products at Mozilla more transparent.  They have a set of checks and principles in place to avoid dark patterns. No surprises: If users were to figure out or start to understand exactly what is happening with the browser, it should be consistent with their expectations. If the users are surprised, this means browsers need to make a change either by stopping the activity entirely or creating additional transparency that helps people understand. Anti-tracking technology: Cross-site tracking is one of the most pervasive and pernicious dark patterns across the web today that is enabled by cookies. Browsers should take action to decrease the attack surface in the browser and actively protect people from those patterns online.  Mozilla and Apple have introduced anti tracking technology to actively intervene to protect people from the diverse parties that are probably not trustworthy. Detour Act by Senators Warner and Fisher In April, Warner and Fischer had introduced the Deceptive Experiences To Online Users Reduction (DETOUR) Act, a bipartisan legislation to prohibit large online platforms from using dark patterns to trick consumers into handing over their personal data. This act focuses on the activities of large online service providers (over a hundred million users visiting in a given month). Under this act you cannot use practices that trick users into obtaining information or consenting. You will experience new controls about conducting ‘psychological experiments on your users’ and you will no longer be able to target children under 13 with the goal of hooking them into your service. It extends additional rulemaking and enforcement abilities to the Federal Trade Commission. “Protecting users personal data and user autonomy online are truly bipartisan issues”: Senator Mark Warner In his presentation, Warner talked about how 2019 is the year when we need to recognize dark patterns and their ongoing manipulation of American consumers.  While we've all celebrated the benefits that communities have brought from social media, there is also an enormous dark underbelly, he says. It is important that Congress steps up and we play a role as senators such that Americans and their private data is not misused or manipulated going forward. Protecting users personal data and user autonomy online are truly bipartisan issues. This is not a liberal versus conservative, it's much more a future versus past and how we get this future right in a way that takes advantage of social media tools but also put some of the appropriate constraints in place. He says that the driving notion behind the Detour act is that users should have the choice and autonomy when it comes to their personal data. When a company like Facebook asks you to upload your phone contacts or some other highly valuable data to their platform, you ought to have a simple choice yes or no. Companies that run experiments on you without your consent are coercive and Detour act aims to put appropriate protections in place that defend user's ability to make informed choices. In addition to prohibiting large online platforms from using dark patterns to trick consumers into handing over their personal data, the bill would also require informed consent for behavior experimentation. In the process, the bill will be sending a clear message to the platform companies and the FTC that they are now in the business of preserving user's autonomy when it comes to the use of their personal data. The goal, Warner says, is simple - to bring some transparency to what remains a very opaque market and give consumers the tools they need to make informed choices about how and when to share their personal information. “Curbing the use of dark patterns will be foundational to increasing trust online” : Senator Deb Fischer Fischer argued that tech companies are increasingly tailoring users’ online experiences in ways that are more granular. On one hand, she says, you get a more personalized user experience and platforms are more responsive, however it's this variability that allows companies to take that design just a step too far. Companies are constantly competing for users attention and this increases the motivation for a more intrusive and invasive user design. The ability for online platforms to guide the visual interfaces that billions of people view is an incredible influence. It forces us to assess the impact of design on user privacy and well-being. Fundamentally the detour act would prohibit large online platforms from purposely using deceptive user interfaces - dark patterns. The detour act would provide a better accountability system for improved transparency and autonomy online. The legislation would take an important step to restore the hidden options. It would give users a tool to get out of the maze that coaxes you to just click on ‘I agree’. A privacy framework that involves consent cannot function properly if it doesn't ensure the user interface presents fair and transparent options. The detour act would enable the creation of a professional standards body which can register with the Federal Trade Commission. This would serve as a self regulatory body to develop best practices for UI design with the FTC as a backup. She adds, “We need clarity for the enforcement of dark patterns that don't directly involve our wallets. We need policies that place value on user choice and personal data online. We need a stronger mechanism to protect the public interest when the goal for tech companies is to make people engage more and more. User consent remains weakened by the presence of dark patterns and unethical design. Curbing the use of dark patterns will be foundational to increasing trust online. The detour act does provide a key step in getting there.” “The DETOUR act is calling attention to asymmetry and preventing deceptive asymmetry”: Tristan Harris Tristan says that companies are now competing not on manipulating your immediate behavior but manipulating and predicting the future. For example, Facebook has something called loyalty prediction which allows them to sell to an advertiser the ability to predict when you're going to become disloyal to a brand. It can sell that opportunity to another advertiser before probably you know you're going to switch. The DETOUR act is a huge step in the right direction because it's about calling attention to asymmetry and preventing deceptive asymmetry. We need a new relationship for this  asymmetric power by having a duty of care. It’s about treating asymmetrically powerful technologies to be in the service of the systems that they are supposed to protect. He says, we need to switch to a regenerative energy economy that actually treats attention as sacred and not directly tying profit to user extraction. Top questions raised by the panel and online viewers Does A/B testing result in dark patterns? Dark patterns are often a result of A/B testing right where a designer may try things that lead to better engagement or maybe nudge users in a way where the company benefits. However, A/B testing isn't the problem, it’s the intention of how A/B testing is being used. Companies and other organizations should have an oversight on the different experiments that they are conducting to see if A/B testing is actually leading to some kind of concrete harm. The challenge in the space is drawing a line about A/B testing features and optimizing for engagement and decreasing friction. Are consumers smart enough to tackle dark patterns on their own or do we need a legislation? It's well established that for children whose brains are just developing, they're unable to discern these types of deceptive techniques so especially for kids, these types of practices should be banned. For vulnerable families who are juggling all sorts of concerns around income and access to jobs and transportation and health care, putting this on their plate as well is just unreasonable. Dark patterns are deployed for an array of opaque reasons the average user will never recognize. From a consumer perspective, going through and identifying dark pattern techniques--that these platform companies have spent hundreds of thousands  of dollars developing to be as opaque and as tricky as possible--is an unrealistic expectation put on consumers. This is why the DETOUR act and this type of regulation are absolutely necessary and the only way forward. What is it about the largest online providers that make us want to focus on them first or only? Is it their scale or do they have more powerful dark patterns? Is it because they're just harming more people or is it politics? Sometimes larger companies stay wary of indulging in dark patterns because they have a greater risk in terms of getting caught and the PR backlash. However, they do engage in manipulative practices and that warrants a lot of attention. Moreover, targeting bigger companies is just one part of a more comprehensive privacy enforcement environment. Hitting companies that have a large number of users is also great for consumer engagement.  Obviously there is a need to target more broadly but this is a starting point. If Facebook were to suddenly reclass itself and its advertising business model, would you still trust them? No, the leadership that's in charge now for Facebook can not be trusted, especially the organizational cultures that have been building. There are change efforts going on inside of Google and Facebook right now but it’s getting gridlocked. Even if employees want to see policies being changed, they still have bonus structures and employee culture to keep in mind. We recommend you to go through the full hearing here. You can read more about the Detour Act here. U.S. senators introduce a bipartisan bill that bans social media platforms from using ‘dark patterns’ to trick its users. How social media enabled and amplified the Christchurch terrorist attack A new study reveals how shopping websites use ‘dark patterns’ to deceive you into buying things you may not want
Read more
  • 0
  • 0
  • 22147

article-image-basics-image-histograms-opencv
Packt
12 Oct 2016
11 min read
Save for later

Basics of Image Histograms in OpenCV

Packt
12 Oct 2016
11 min read
In this article by Samyak Datta, author of the book Learning OpenCV 3 Application Development we are going to focus our attention on a different style of processing pixel values. The output of the techniques, which would comprise our study in the current article, will not be images, but other forms of representation for images, namely image histograms. We have seen that a two-dimensional grid of intensity values is one of the default forms of representing images in digital systems for processing as well as storage. However, such representations are not at all easy to scale. So, for an image with a reasonably low spatial resolution, say 512 x 512 pixels, working with a two-dimensional grid might not pose any serious issues. However, as the dimensions increase, the corresponding increase in the size of the grid may start to adversely affect the performance of the algorithms that work with the images. A primary advantage that an image histogram has to offer is that the size of a histogram is a constant that is independent of the dimensions of the image. As a consequence of this, we are guaranteed that irrespective of the spatial resolution of the images that we are dealing with, the algorithms that power our solutions will have to deal with a constant amount of data if they are working with image histograms. (For more resources related to this topic, see here.) Each descriptor captures some particular aspects or features of the image to construct its own form of representation. One of the common pitfalls of using histograms as a form of image representation as compared to its native form of using the entire two-dimensional grid of values is loss of information. A full-fledged image representation using pixel intensity values for all pixel locations naturally consists of all the information that you would need to reconstruct a digital image. However, the same cannot be said about histograms. When we study about image histograms in detail, we'll get to see exactly what information do we stand to lose. And this loss in information is prevalent across all forms of image descriptors. The basics of histograms At the outset, we will briefly explain the concept of a histogram. Most of you might already know this from your lessons on basic statistics. However, we will reiterate this for the sake of completeness. Histogram is a form of data representation technique that relies on an aggregation of data points. The data is aggregated into a set of predefined bins that are represented along the x axis, and the number of data points that fall within each of the bins make up the corresponding counts on the y axis. For example, let's assume that our data looks something like the following: D={2,7,1,5,6,9,14,11,8,10,13} If we define three bins, namely Bin_1 (1 - 5), Bin_2 (6 - 10), and Bin_3 (11 - 15), then the histogram corresponding to our data would look something like this: Bins Frequency Bin_1 (1 - 5) 3 Bin_2 (6 - 10) 5 Bin_3 (11 - 15) 3 What this histogram data tells us is that we have three values between 1 and 5, five between 6 and 10, and three again between 11 and 15. Note that it doesn't tell us what the values are, just that some n values exist in a given bin. A more familiar visual representation of the histogram in discussion is shown as follows: As you can see, the bins have been plotted along the x axis and their corresponding frequencies along the y axis. Now, in the context of images, how is a histogram computed? Well, it's not that difficult to deduce. Since the data that we have comprise pixel intensity values, an image histogram is computed by plotting a histogram using the intensity values of all its constituent pixels. What this essentially means is that the sequence of pixel intensity values in our image becomes the data. Well, this is in fact the simplest kind of histogram that you can compute using the information available to you from the image. Now, coming back to image histograms, there are some basic terminologies (pertaining to histograms in general) that you need to be aware of before you can dip your hands into code. We have explained them in detail here: Histogram size: The histogram size refers to the number of bins in the histogram. Range: The range of a histogram is the range of data that we are dealing with. The range of data as well as the histogram size are both important parameters that define a histogram. Dimensions: Simply put, dimensions refer to the number of the type of items whose values we aggregate in the histogram bins. For example, consider a grayscale image. We might want to construct a histogram using the pixel intensity values for such an image. This would be an example of a single-dimensional histogram because we are just interested in aggregating the pixel intensity values and nothing else. The data, in this case, is spread over a range of 0 to 255. On account of being one-dimensional, such histograms can be represented graphically as 2D plots—one-dimensional data (pixel intensity values) being plotted on the x axis (in the form of bins) along with the corresponding frequency counts along the y axis. We have already seen an example of this before. Now, imagine a color image with three channels: red, green, and blue. Let's say that we want to plot a histogram for the intensities in the red and green channels combined. This means that our data now becomes a pair of values (r, g). A histogram that is plotted for such data will have a dimensionality of 2. The plot for such a histogram will be a 3D plot with the data bins covering the x and y axes and the frequency counts plotted along the z axis. Now that we have discussed the theoretical aspects of image histograms in detail, let's start thinking along the lines of code. We will start with the simplest (and in fact the most ubiquitous) design of image histograms. The range of our data will be from 0 to 255 (both inclusive), which means that all our data points will be integers that fall within the specified range. Also, the number of data points will equal the number of pixels that make up our input image. The simplicity in design comes from the fact that we fix the size of the histogram (the number of bins) as 256. Now, take a moment to think about what this means. There are 256 different possible values that our data points can take and we have a separate bin corresponding to each one of those values. So such an image histogram will essentially depict the 256 possible intensity values along with the counts of the number of pixels in the image that are colored with each of the different intensities. Before taking a peek at what OpenCV has to offer, let's try to implement such a histogram on our own! We define a function named computeHistogram() that takes the grayscale image as an input argument and returns the image histogram. From our earlier discussions, it is evident that the histogram must contain 256 entries (for the 256 bins): one for each integer between 0 and 255. The value stored in the histogram corresponding to each of the 256 entries will be the count of the image pixels that have a particular intensity value. So, conceptually, we can use an array for our implementation such that the value stored in the histogram [ i ] (for 0≤i≤255) will be the count of the number of pixels in the image having the intensity of i. However, instead of using a C++ array, we will comply with the rules and standards followed by OpenCV and represent the histogram as a Mat object. We have already seen that a Mat object is nothing but a multidimensional array store. The implementation is outlined in the following code snippet: Mat computeHistogram(Mat input_image) { Mat histogram = Mat::zeros(256, 1, CV_32S); for (int i = 0; i < input_image.rows; ++i) { for (int j = 0; j < input_image.cols; ++j) { int binIdx = (int) input_image.at<uchar>(i, j); histogram.at<int>(binIdx, 0) += 1; } } return histogram; } As you can see, we have chosen to represent the histogram as a 256-element-column-vector Mat object. We iterate over all the pixels in the input image and keep on incrementing the corresponding counts in the histogram (which had been initialized to 0). As per our description of the image histogram properties, it is easy to see that the intensity value of any pixel is the same as the bin index that is used to index into the appropriate histogram bin to increment the count. Having such an implementation ready, let's test it out with the help of an actual image. The following code demonstrates a main() function that reads an input image, calls the computeHistogram() function that we have defined just now, and displays the contents of the histogram that is returned as a result: int main() { Mat input_image = imread("/home/samyak/Pictures/lena.jpg", IMREAD_GRAYSCALE); Mat histogram = computeHistogram(input_image); cout << "Histogram...n"; for (int i = 0; i < histogram.rows; ++i) cout << i << " : " << histogram.at<int>(i, 0) << "n"; return 0; } We have used the fact that the histogram that is returned from the function will be a single column Mat object. This makes the code that displays the contents of the histogram much cleaner. Histograms in OpenCV We have just seen the implementation of a very basic and minimalistic histogram using the first principles in OpenCV. The image histogram was basic in the sense that all the bins were uniform in size and comprised only a single pixel intensity. This made our lives simple when we designed our code for the implementation; there wasn't any need to explicitly check the membership of a data point (the intensity value of a pixel) with all the bins of our histograms. However, we know that a histogram can have bins whose sizes span more than one. Can you think of the changes that we might need to make in the code that we had written just now to accommodate for bin sizes larger than 1? If this change seems doable to you, try to figure out how to incorporate the possibility of non-uniform bin sizes or multidimensional histograms. By now, things might have started to get a little overwhelming to you. No need to worry. As always, OpenCV has you covered! The developers at OpenCV have provided you with a calcHist() function whose sole purpose is to calculate the histograms for a given set of arrays. By arrays, we refer to the images represented as Mat objects, and we use the term set because the function has the capability to compute multidimensional histograms from the given data: Mat computeHistogram(Mat input_image) { Mat histogram; int channels[] = { 0 }; int histSize[] = { 256 }; float range[] = { 0, 256 }; const float* ranges[] = { range }; calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges, true, false); return histogram; } Before we move on to an explanation of the different parameters involved in the calcHist() function call, I want to bring your attention to the abundant use of arrays in the preceding code snippet. Even arguments as simple as histogram sizes are passed to the function in the form of arrays rather than integer values, which at first glance seem quite unnecessary and counter-intuitive. The usage of arrays is due to the fact that the implementation of calcHist() is equipped to handle multidimensional histograms as well, and when we are dealing with such multidimensional histogram data, we require multiple parameters to be passed, one for each dimension. This would become clearer once we demonstrate an example of calculating multidimensional histograms using the calcHist() function. For the time being, we just wanted to clear the immediate confusion that might have popped up in your minds upon seeing the array parameters. Here is a detailed list of the arguments in the calcHist() function call: Source images Number of source images Channel indices Mask Dimensions (dims) Histogram size Ranges Uniform flag Accumulate flag The last couple of arguments (the uniform and accumulate flags) have default values of true and false, respectively. Hence, the function call that you have seen just now can very well be written as follows: calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges); Summary Thus in this article we have successfully studied fundamentals of using histograms in OpenCV for image processing. Resources for Article: Further resources on this subject: Remote Sensing and Histogram [article] OpenCV: Image Processing using Morphological Filters [article] Learn computer vision applications in Open CV [article]
Read more
  • 0
  • 0
  • 22139

article-image-task-execution-asio
Packt
11 Aug 2015
20 min read
Save for later

Task Execution with Asio

Packt
11 Aug 2015
20 min read
In this article by Arindam Mukherjee, the author of Learning Boost C++ Libraries, we learch how to execute a task using Boost Asio (pronounced ay-see-oh), a portable library for performing efficient network I/O using a consistent programming model. At its core, Boost Asio provides a task execution framework that you can use to perform operations of any kind. You create your tasks as function objects and post them to a task queue maintained by Boost Asio. You enlist one or more threads to pick these tasks (function objects) and invoke them. The threads keep picking up tasks, one after the other till the task queues are empty at which point the threads do not block but exit. (For more resources related to this topic, see here.) IO Service, queues, and handlers At the heart of Asio is the type boost::asio::io_service. A program uses the io_service interface to perform network I/O and manage tasks. Any program that wants to use the Asio library creates at least one instance of io_service and sometimes more than one. In this section, we will explore the task management capabilities of io_service. Here is the IO Service in action using the obligatory "hello world" example: Listing 11.1: Asio Hello World 1 #include <boost/asio.hpp> 2 #include <iostream> 3 namespace asio = boost::asio; 4 5 int main() { 6   asio::io_service service; 7 8   service.post( 9     [] { 10       std::cout << "Hello, world!" << 'n'; 11     }); 12 13   std::cout << "Greetings: n"; 14   service.run(); 15 } We include the convenience header boost/asio.hpp, which includes most of the Asio library that we need for the examples in this aritcle (line 1). All parts of the Asio library are under the namespace boost::asio, so we use a shorter alias for this (line 3). The program itself just prints Hello, world! on the console but it does so through a task. The program first creates an instance of io_service (line 6) and posts a function object to it, using the post member function of io_service. The function object, in this case defined using a lambda expression, is referred to as a handler. The call to post adds the handler to a queue inside io_service; some thread (including that which posted the handler) must dispatch them, that is, remove them off the queue and call them. The call to the run member function of io_service (line 14) does precisely this. It loops through the handlers in the queue inside io_service, removing and calling each handler. In fact, we can post more handlers to the io_service before calling run, and it would call all the posted handlers. If we did not call run, none of the handlers would be dispatched. The run function blocks until all the handlers in the queue have been dispatched and returns only when the queue is empty. By itself, a handler may be thought of as an independent, packaged task, and Boost Asio provides a great mechanism for dispatching arbitrary tasks as handlers. Note that handlers must be nullary function objects, that is, they should take no arguments. Asio is a header-only library by default, but programs using Asio need to link at least with boost_system. On Linux, we can use the following command line to build this example: $ g++ -g listing11_1.cpp -o listing11_1 -lboost_system -std=c++11 Running this program prints the following: Greetings: Hello, World! Note that Greetings: is printed from the main function (line 13) before the call to run (line 14). The call to run ends up dispatching the sole handler in the queue, which prints Hello, World!. It is also possible for multiple threads to call run on the same I/O object and dispatch handlers concurrently. We will see how this can be useful in the next section. Handler states – run_one, poll, and poll_one While the run function blocks until there are no more handlers in the queue, there are other member functions of io_service that let you process handlers with greater flexibility. But before we look at this function, we need to distinguish between pending and ready handlers. The handlers we posted to the io_service were all ready to run immediately and were invoked as soon as their turn came on the queue. In general, handlers are associated with background tasks that run in the underlying OS, for example, network I/O tasks. Such handlers are meant to be invoked only once the associated task is completed, which is why in such contexts, they are called completion handlers. These handlers are said to be pending until the associated task is awaiting completion, and once the associated task completes, they are said to be ready. The poll member function, unlike run, dispatches all the ready handlers but does not wait for any pending handler to become ready. Thus, it returns immediately if there are no ready handlers, even if there are pending handlers. The poll_one member function dispatches exactly one ready handler if there be one, but does not block waiting for pending handlers to get ready. The run_one member function blocks on a nonempty queue waiting for a handler to become ready. It returns when called on an empty queue, and otherwise, as soon as it finds and dispatches a ready handler. post versus dispatch A call to the post member function adds a handler to the task queue and returns immediately. A later call to run is responsible for dispatching the handler. There is another member function called dispatch that can be used to request the io_service to invoke a handler immediately if possible. If dispatch is invoked in a thread that has already called one of run, poll, run_one, or poll_one, then the handler will be invoked immediately. If no such thread is available, dispatch adds the handler to the queue and returns just like post would. In the following example, we invoke dispatch from the main function and from within another handler: Listing 11.2: post versus dispatch 1 #include <boost/asio.hpp> 2 #include <iostream> 3 namespace asio = boost::asio; 4 5 int main() { 6   asio::io_service service; 7   // Hello Handler – dispatch behaves like post 8   service.dispatch([]() { std::cout << "Hellon"; }); 9 10   service.post( 11     [&service] { // English Handler 12       std::cout << "Hello, world!n"; 13       service.dispatch([] { // Spanish Handler, immediate 14                         std::cout << "Hola, mundo!n"; 15                       }); 16     }); 17   // German Handler 18   service.post([&service] {std::cout << "Hallo, Welt!n"; }); 19   service.run(); 20 } Running this code produces the following output: Hello Hello, world! Hola, mundo! Hallo, Welt! The first call to dispatch (line 8) adds a handler to the queue without invoking it because run was yet to be called on io_service. We call this the Hello Handler, as it prints Hello. This is followed by the two calls to post (lines 10, 18), which add two more handlers. The first of these two handlers prints Hello, world! (line 12), and in turn, calls dispatch (line 13) to add another handler that prints the Spanish greeting, Hola, mundo! (line 14). The second of these handlers prints the German greeting, Hallo, Welt (line 18). For our convenience, let's just call them the English, Spanish, and German handlers. This creates the following entries in the queue: Hello Handler English Handler German Handler Now, when we call run on the io_service (line 19), the Hello Handler is dispatched first and prints Hello. This is followed by the English Handler, which prints Hello, World! and calls dispatch on the io_service, passing the Spanish Handler. Since this executes in the context of a thread that has already called run, the call to dispatch invokes the Spanish Handler, which prints Hola, mundo!. Following this, the German Handler is dispatched printing Hallo, Welt! before run returns. What if the English Handler called post instead of dispatch (line 13)? In that case, the Spanish Handler would not be invoked immediately but would queue up after the German Handler. The German greeting Hallo, Welt! would precede the Spanish greeting Hola, mundo!. The output would look like this: Hello Hello, world! Hallo, Welt! Hola, mundo! Concurrent execution via thread pools The io_service object is thread-safe and multiple threads can call run on it concurrently. If there are multiple handlers in the queue, they can be processed concurrently by such threads. In effect, the set of threads that call run on a given io_service form a thread pool. Successive handlers can be processed by different threads in the pool. Which thread dispatches a given handler is indeterminate, so the handler code should not make any such assumptions. In the following example, we post a bunch of handlers to the io_service and then start four threads, which all call run on it: Listing 11.3: Simple thread pools 1 #include <boost/asio.hpp> 2 #include <boost/thread.hpp> 3 #include <boost/date_time.hpp> 4 #include <iostream> 5 namespace asio = boost::asio; 6 7 #define PRINT_ARGS(msg) do { 8   boost::lock_guard<boost::mutex> lg(mtx); 9   std::cout << '[' << boost::this_thread::get_id() 10             << "] " << msg << std::endl; 11 } while (0) 12 13 int main() { 14   asio::io_service service; 15   boost::mutex mtx; 16 17   for (int i = 0; i < 20; ++i) { 18     service.post([i, &mtx]() { 19                         PRINT_ARGS("Handler[" << i << "]"); 20                         boost::this_thread::sleep( 21                               boost::posix_time::seconds(1)); 22                       }); 23   } 24 25   boost::thread_group pool; 26   for (int i = 0; i < 4; ++i) { 27     pool.create_thread([&service]() { service.run(); }); 28   } 29 30   pool.join_all(); 31 } We post twenty handlers in a loop (line 18). Each handler prints its identifier (line 19), and then sleeps for a second (lines 19-20). To run the handlers, we create a group of four threads, each of which calls run on the io_service (line 21) and wait for all the threads to finish (line 24). We define the macro PRINT_ARGS which writes output to the console in a thread-safe way, tagged with the current thread ID (line 7-10). We will use this macro in other examples too. To build this example, you must also link against libboost_thread, libboost_date_time, and in Posix environments, with libpthread too: $ g++ -g listing9_3.cpp -o listing9_3 -lboost_system -lboost_thread -lboost_date_time -pthread -std=c++11 One particular run of this program on my laptop produced the following output (with some lines snipped): [b5c15b40] Handler[0] [b6416b40] Handler[1] [b6c17b40] Handler[2] [b7418b40] Handler[3] [b5c15b40] Handler[4] [b6416b40] Handler[5] … [b6c17b40] Handler[13] [b7418b40] Handler[14] [b6416b40] Handler[15] [b5c15b40] Handler[16] [b6c17b40] Handler[17] [b7418b40] Handler[18] [b6416b40] Handler[19] You can see that the different handlers are executed by different threads (each thread ID marked differently). If any of the handlers threw an exception, it would be propagated across the call to the run function on the thread that was executing the handler. io_service::work Sometimes, it is useful to keep the thread pool started, even when there are no handlers to dispatch. Neither run nor run_one blocks on an empty queue. So in order for them to block waiting for a task, we have to indicate, in some way, that there is outstanding work to be performed. We do this by creating an instance of io_service::work, as shown in the following example: Listing 11.4: Using io_service::work to keep threads engaged 1 #include <boost/asio.hpp> 2 #include <memory> 3 #include <boost/thread.hpp> 4 #include <iostream> 5 namespace asio = boost::asio; 6 7 typedef std::unique_ptr<asio::io_service::work> work_ptr; 8 9 #define PRINT_ARGS(msg) do { … ... 14 15 int main() { 16   asio::io_service service; 17   // keep the workers occupied 18   work_ptr work(new asio::io_service::work(service)); 19   boost::mutex mtx; 20 21   // set up the worker threads in a thread group 22   boost::thread_group workers; 23   for (int i = 0; i < 3; ++i) { 24     workers.create_thread([&service, &mtx]() { 25                         PRINT_ARGS("Starting worker thread "); 26                         service.run(); 27                         PRINT_ARGS("Worker thread done"); 28                       }); 29   } 30 31   // Post work 32   for (int i = 0; i < 20; ++i) { 33     service.post( 34       [&service, &mtx]() { 35         PRINT_ARGS("Hello, world!"); 36         service.post([&mtx]() { 37                           PRINT_ARGS("Hola, mundo!"); 38                         }); 39       }); 40   } 41 42 work.reset(); // destroy work object: signals end of work 43   workers.join_all(); // wait for all worker threads to finish 44 } In this example, we create an object of io_service::work wrapped in a unique_ptr (line 18). We associate it with an io_service object by passing to the work constructor a reference to the io_service object. Note that, unlike listing 11.3, we create the worker threads first (lines 24-27) and then post the handlers (lines 33-39). However, the worker threads stay put waiting for the handlers because of the calls to run block (line 26). This happens because of the io_service::work object we created, which indicates that there is outstanding work in the io_service queue. As a result, even after all handlers are dispatched, the threads do not exit. By calling reset on the unique_ptr, wrapping the work object, its destructor is called, which notifies the io_service that all outstanding work is complete (line 42). The calls to run in the threads return and the program exits once all the threads are joined (line 43). We wrapped the work object in a unique_ptr to destroy it in an exception-safe way at a suitable point in the program. We omitted the definition of PRINT_ARGS here, refer to listing 11.3. Serialized and ordered execution via strands Thread pools allow handlers to be run concurrently. This means that handlers that access shared resources need to synchronize access to these resources. We already saw examples of this in listings 11.3 and 11.4, when we synchronized access to std::cout, which is a global object. As an alternative to writing synchronization code in handlers, which can make the handler code more complex, we can use strands. Think of a strand as a subsequence of the task queue with the constraint that no two handlers from the same strand ever run concurrently. The scheduling of other handlers in the queue, which are not in the strand, is not affected by the strand in any way. Let us look at an example of using strands: Listing 11.5: Using strands 1 #include <boost/asio.hpp> 2 #include <boost/thread.hpp> 3 #include <boost/date_time.hpp> 4 #include <cstdlib> 5 #include <iostream> 6 #include <ctime> 7 namespace asio = boost::asio; 8 #define PRINT_ARGS(msg) do { ... 13 14 int main() { 15   std::srand(std::time(0)); 16 asio::io_service service; 17   asio::io_service::strand strand(service); 18   boost::mutex mtx; 19   size_t regular = 0, on_strand = 0; 20 21 auto workFuncStrand = [&mtx, &on_strand] { 22           ++on_strand; 23           PRINT_ARGS(on_strand << ". Hello, from strand!"); 24           boost::this_thread::sleep( 25                       boost::posix_time::seconds(2)); 26         }; 27 28   auto workFunc = [&mtx, &regular] { 29                   PRINT_ARGS(++regular << ". Hello, world!"); 30                  boost::this_thread::sleep( 31                         boost::posix_time::seconds(2)); 32                 }; 33   // Post work 34   for (int i = 0; i < 15; ++i) { 35     if (rand() % 2 == 0) { 36       service.post(strand.wrap(workFuncStrand)); 37     } else { 38       service.post(workFunc); 39     } 40   } 41 42   // set up the worker threads in a thread group 43   boost::thread_group workers; 44   for (int i = 0; i < 3; ++i) { 45     workers.create_thread([&service, &mtx]() { 46                      PRINT_ARGS("Starting worker thread "); 47                       service.run(); 48                       PRINT_ARGS("Worker thread done"); 49                     }); 50   } 51 52   workers.join_all(); // wait for all worker threads to finish 53 } In this example, we create two handler functions: workFuncStrand (line 21) and workFunc (line 28). The lambda workFuncStrand captures a counter on_strand, increments it, and prints a message Hello, from strand!, prefixed with the value of the counter. The function workFunc captures another counter regular, increments it, and prints Hello, World!, prefixed with the counter. Both pause for 2 seconds before returning. To define and use a strand, we first create an object of io_service::strand associated with the io_service instance (line 17). Thereafter, we post all handlers that we want to be part of that strand by wrapping them using the wrap member function of the strand (line 36). Alternatively, we can post the handlers to the strand directly by using either the post or the dispatch member function of the strand, as shown in the following snippet: 33   for (int i = 0; i < 15; ++i) { 34     if (rand() % 2 == 0) { 35       strand.post(workFuncStrand); 37     } else { ... The wrap member function of strand returns a function object, which in turn calls dispatch on the strand to invoke the original handler. Initially, it is this function object rather than our original handler that is added to the queue. When duly dispatched, this invokes the original handler. There are no constraints on the order in which these wrapper handlers are dispatched, and therefore, the actual order in which the original handlers are invoked can be different from the order in which they were wrapped and posted. On the other hand, calling post or dispatch directly on the strand avoids an intermediate handler. Directly posting to a strand also guarantees that the handlers will be dispatched in the same order that they were posted, achieving a deterministic ordering of the handlers in the strand. The dispatch member of strand blocks until the handler is dispatched. The post member simply adds it to the strand and returns. Note that workFuncStrand increments on_strand without synchronization (line 22), while workFunc increments the counter regular within the PRINT_ARGS macro (line 29), which ensures that the increment happens in a critical section. The workFuncStrand handlers are posted to a strand and therefore are guaranteed to be serialized; hence no need for explicit synchronization. On the flip side, entire functions are serialized via strands and synchronizing smaller blocks of code is not possible. There is no serialization between the handlers running on the strand and other handlers; therefore, the access to global objects, like std::cout, must still be synchronized. The following is a sample output of running the preceding code: [b73b6b40] Starting worker thread [b73b6b40] 0. Hello, world from strand! [b6bb5b40] Starting worker thread [b6bb5b40] 1. Hello, world! [b63b4b40] Starting worker thread [b63b4b40] 2. Hello, world! [b73b6b40] 3. Hello, world from strand! [b6bb5b40] 5. Hello, world! [b63b4b40] 6. Hello, world! … [b6bb5b40] 14. Hello, world! [b63b4b40] 4. Hello, world from strand! [b63b4b40] 8. Hello, world from strand! [b63b4b40] 10. Hello, world from strand! [b63b4b40] 13. Hello, world from strand! [b6bb5b40] Worker thread done [b73b6b40] Worker thread done [b63b4b40] Worker thread done There were three distinct threads in the pool and the handlers from the strand were picked up by two of these three threads: initially, by thread ID b73b6b40, and later on, by thread ID b63b4b40. This also dispels a frequent misunderstanding that all handlers in a strand are dispatched by the same thread, which is clearly not the case. Different handlers in the same strand may be dispatched by different threads but will never run concurrently. Summary Asio is a well-designed library that can be used to write fast, nimble network servers that utilize the most optimal mechanisms for asynchronous I/O available on a system. It is an evolving library and is the basis for a Technical Specification that proposes to add a networking library to a future revision of the C++ Standard. In this article, we learned how to use the Boost Asio library as a task queue manager and leverage Asio's TCP and UDP interfaces to write programs that communicate over the network. Resources for Article: Further resources on this subject: Animation features in Unity 5 [article] Exploring and Interacting with Materials using Blueprints [article] A Simple Pathfinding Algorithm for a Maze [article]
Read more
  • 0
  • 1
  • 22134
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
Packt
30 Mar 2016
15 min read
Save for later

ALM – Developers and QA

Packt
30 Mar 2016
15 min read
This article by Can Bilgin, the author of Mastering Cross-Platform Development with Xamarin, provides an introduction to Application Lifecycle Management (ALM) and continuous integration methodologies on Xamarin cross-platform applications. As the part of the ALM process that is most relevant for developers, unit test strategies will be discussed and demonstrated, as well as automated UI testing. This article is divided into the following sections: Development pipeline Troubleshooting Unit testing UI testing (For more resources related to this topic, see here.) Development pipeline The development pipeline can be described as the virtual production line that steers a project from a mere bundle of business requirements to the consumers. Stakeholders that are part of this pipeline include, but are not limited to, business proxies, developers, the QA team, the release and configuration team, and finally the consumers themselves. Each stakeholder in this production line assumes different responsibilities, and they should all function in harmony. Hence, having an efficient, healthy, and preferably automated pipeline that is going to provide the communication and transfer of deliverables between units is vital for the success of a project. In the Agile project management framework, the development pipeline is cyclical rather than a linear delivery queue. In the application life cycle, requirements are inserted continuously into a backlog. The backlog leads to a planning and development phase, which is followed by testing and QA. Once the production-ready application is released, consumers can be made part of this cycle using live application telemetry instrumentation. Figure 1: Application life cycle management In Xamarin cross-platform application projects, development teams are blessed with various tools and frameworks that can ease the execution of ALM strategies. From sketching and mock-up tools available for early prototyping and design to source control and project management tools that make up the backbone of ALM, Xamarin projects can utilize various tools to automate and systematically analyze project timeline. The following sections of this article concentrate mainly on the lines of defense that protect the health and stability of a Xamarin cross-platform project in the timeline between the assignment of tasks to developers to the point at which the task or bug is completed/resolved and checked into a source control repository. Troubleshooting and diagnostics SDKs associated with Xamarin target platforms and development IDEs are equipped with comprehensive analytic tools. Utilizing these tools, developers can identify issues causing app freezes, crashes, slow response time, and other resource-related problems (for example, excessive battery usage). Xamarin.iOS applications are analyzed using the XCode Instruments toolset. In this toolset, there are a number of profiling templates, each used to analyze a certain perspective of application execution. Instrument templates can be executed on an application running on the iOS simulator or on an actual device. Figure 2: XCode Instruments Similarly, Android applications can be analyzed using the device monitor provided by the Android SDK. Using Android Monitor, memory profile, CPU/GPU utilization, and network usage can also be analyzed, and application-provided diagnostic information can be gathered. Android Debug Bridge (ADB) is a command-line tool that allows various manual or automated device-related operations. For Windows Phone applications, Visual Studio provides a number of analysis tools for profiling CPU usage, energy consumption, memory usage, and XAML UI responsiveness. XAML diagnostic sessions in particular can provide valuable information on problematic sections of view implementation and pinpoint possible visual and performance issues: Figure 3: Visual Studio XAML analyses Finally, Xamarin Profiler, as a maturing application (currently in preview release), can help analyze memory allocations and execution time. Xamarin Profiler can be used with iOS and Android applications. Unit testing The test-driven development (TDD) pattern dictates that the business requirements and the granular use-cases defined by these requirements should be initially reflected on unit test fixtures. This allows a mobile application to grow/evolve within the defined borders of these assertive unit test models. Whether following a TDD strategy or implementing tests to ensure the stability of the development pipeline, unit tests are fundamental components of a development project. Figure 4: Unit test project templates Xamarin Studio and Visual Studio both provide a number of test project templates targeting different areas of a cross-platform project. In Xamarin cross-platform projects, unit tests can be categorized into two groups: platform-agnostic and platform-specific testing. Platform-agnostic unit tests Platform-agnostic components, such as portable class libraries containing shared logic for Xamarin applications, can be tested using the common unit test projects targeting the .NET framework. Visual Studio Test Tools or the NUnit test framework can be used according to the development environment of choice. It is also important to note that shared projects used to create shared logic containers for Xamarin projects cannot be tested with .NET unit test fixtures. For shared projects and the referencing platform-specific projects, platform-specific unit test fixtures should be prepared. When following an MVVM pattern, view models are the focus of unit test fixtures since, as previously explained, view models can be perceived as a finite state machine where the bindable properties are used to create a certain state on which the commands are executed, simulating a specific use-case to be tested. This approach is the most convenient way to test the UI behavior of a Xamarin application without having to implement and configure automated UI tests. While implementing unit tests for such projects, a mocking framework is generally used to replace the platform-dependent sections of the business logic. Loosely coupling these dependent components makes it easier for developers to inject mocked interface implementations and increases the testability of these modules. The most popular mocking frameworks for unit testing are Moq and RhinoMocks. Both Moq and RhinoMocks utilize reflection and, more specifically, the Reflection.Emit namespace, which is used to generate types, methods, events, and other artifacts in the runtime. Aforementioned iOS restrictions on code generation make these libraries inapplicable for platform-specific testing, but they can still be included in unit test fixtures targeting the .NET framework. For platform-specific implementation, the True Fakes library provides compile time code generation and mocking features. Depending on the implementation specifics (such as namespaces used, network communication, multithreading, and so on), in some scenarios it is imperative to test the common logic implementation on specific platforms as well. For instance, some multithreading and parallel task implementations give different results on Windows Runtime, Xamarin.Android, and Xamarin.iOS. These variations generally occur because of the underlying platform's mechanism or slight differences between the .NET and Mono implementation logic. In order to ensure the integrity of these components, common unit test fixtures can be added as linked/referenced files to platform-specific test projects and executed on the test harness. Platform-specific unit tests In a Xamarin project, platform-dependent features cannot be unit tested using the conventional unit test runners available in Visual Studio Test Suite and NUnit frameworks. Platform-dependent tests are executed on empty platform-specific projects that serve as a harness for unit tests for that specific platform. Windows Runtime application projects can be tested using the Visual Studio Test Suite. However, for Android and iOS, the NUnit testing framework should be used, since Visual Studio Test Tools are not available for the Xamarin.Android and Xamarin.iOS platforms.                              Figure 5: Test harnesses The unit test runner for Windows Phone (Silverlight) and Windows Phone 8.1 applications uses a test harness integrated with the Visual Studio test explorer. The unit tests can be executed and debugged from within Visual Studio. Xamarin.Android and Xamarin.iOS test project templates use NUnitLite implementation for the respective platforms. In order to run these tests, the test application should be deployed on the simulator (or the testing device) and the application has to be manually executed. It is possible to automate the unit tests on Android and iOS platforms through instrumentation. In each Xamarin target platform, the initial application lifetime event is used to add the necessary unit tests: [Activity(Label = "Xamarin.Master.Fibonacci.Android.Tests", MainLauncher = true, Icon = "@drawable/icon")] public class MainActivity : TestSuiteActivity { protected override void OnCreate(Bundle bundle) { // tests can be inside the main assembly //AddTest(Assembly.GetExecutingAssembly()); // or in any reference assemblies AddTest(typeof(Fibonacci.Android.Tests.TestsSample).Assembly); // Once you called base.OnCreate(), you cannot add more assemblies. base.OnCreate(bundle); } } In the Xamarin.Android implementation, the MainActivity class derives from the TestSuiteActivity, which implements the necessary infrastructure to run the unit tests and the UI elements to visualize the test results. On the Xamarin.iOS platform, the test application uses the default UIApplicationDelegate, and generally, the FinishedLaunching event delegate is used to create the ViewController for the unit test run fixture: public override bool FinishedLaunching(UIApplication application, NSDictionary launchOptions) { // Override point for customization after application launch. // If not required for your application you can safely delete this method var window = new UIWindow(UIScreen.MainScreen.Bounds); var touchRunner = new TouchRunner(window); touchRunner.Add(System.Reflection.Assembly.GetExecutingAssembly()); window.RootViewController = new UINavigationController(touchRunner.GetViewController()); window.MakeKeyAndVisible(); return true; } The main shortcoming of executing unit tests this way is the fact that it is not easy to generate a code coverage report and archive the test results. Neither of these testing methods provide the ability to test the UI layer. They are simply used to test platform-dependent implementations. In order to test the interactive layer, platform-specific or cross-platform (Xamarin.Forms) coded UI tests need to be implemented. UI testing In general terms, the code coverage of the unit tests directly correlates with the amount of shared code which amounts to, at the very least, 70-80 percent of the code base in a mundane Xamarin project. One of the main driving factors of architectural patterns was to decrease the amount of logic and code in the view layer so that the testability of the project utilizing conventional unit tests reaches a satisfactory level. Coded UI (or automated UI acceptance) tests are used to test the uppermost layer of the cross-platform solution: the views. Xamarin.UITests and Xamarin Test Cloud The main UI testing framework used for Xamarin projects is the Xamarin.UITests testing framework. This testing component can be used on various platform-specific projects, varying from native mobile applications to Xamarin.Forms implementations, except for the Windows Phone platform and applications. Xamarin.UITests is an implementation based on the Calabash framework, which is an automated UI acceptance testing framework targeting mobile applications. Xamarin.UITests is introduced to the Xamarin.iOS or Xamarin.Android applications using the publicly available NuGet packages. The included framework components are used to provide an entry point to the native applications. The entry point is the Xamarin Test Cloud Agent, which is embedded into the native application during the compilation. The cloud agent is similar to a local server that allows either the Xamarin Test Cloud or the test runner to communicate with the app infrastructure and simulate user interaction with the application. Xamarin Test Cloud is a subscription-based service allowing Xamarin applications to be tested on real mobile devices using UI tests implemented via Xamarin.UITests. Xamarin Test Cloud not only provides a powerful testing infrastructure for Xamarin.iOS and Xamarin.Android applications with an abundant amount of mobile devices but can also be integrated into Continuous Integration workflows. After installing the appropriate NuGet package, the UI tests can be initialized for a specific application on a specific device. In order to initialize the interaction adapter for the application, the app package and the device should be configured. On Android, the APK package path and the device serial can be used for the initialization: IApp app = ConfigureApp.Android.ApkFile("<APK Path>/MyApplication.apk") .DeviceSerial("<DeviceID>") .StartApp(); For an iOS application, the procedure is similar: IApp app = ConfigureApp.iOS.AppBundle("<App Bundle Path>/MyApplication.app") .DeviceIdentifier("<DeviceID of Simulator") .StartApp(); Once the App handle has been created, each test written using NUnit should first create the pre-conditions for the tests, simulate the interaction, and finally test the outcome. The IApp interface provides a set of methods to select elements on the visual tree and simulate certain interactions, such as text entry and tapping. On top of the main testing functionality, screenshots can be taken to document test steps and possible bugs. Both Visual Studio and Xamarin Studio provide project templates for Xamarin.UITests. Xamarin Test Recorder Xamarin Test Recorder is an application that can ease the creation of automated UI tests. It is currently in its preview version and is only available for the Mac OS platform. Figure 6: Xamarin Test Recorder Using this application, developers can select the application in need of testing and the device/simulator that is going to run the application. Once the recording session starts, each interaction on the screen is recorded as execution steps on a separate screen, and these steps can be used to generate the preparation or testing steps for the Xamarin.UITests implementation. Coded UI tests (Windows Phone) Coded UI tests are used for automated UI testing on the Windows Phone platform. Coded UI Tests for Windows Phone and Windows Store applications are not any different than their counterparts for other .NET platforms such as Windows Forms, WPF, or ASP.Net. It is also important to note that only XAML applications support Coded UI tests. Coded UI tests are generated on a simulator and written on an Automation ID premise. The Automation ID property is an automatically generated or manually configured identifier for Windows Phone applications (only in XAML) and the UI controls used in the application. Coded UI tests depend on the UIMap created for each control on a specific screen using the Automation IDs. While creating the UIMap, a crosshair tool can be used to select the application and the controls on the simulator screen to define the interactive elements: Figure 7:- Generating coded UI accessors and tests Once the UIMap has been created and the designer files have been generated, gestures and the generated XAML accessors can be used to create testing pre-conditions and assertions. For Coded UI tests, multiple scenario-specific input values can be used and tested on a single assertion. Using the DataRow attribute, unit tests can be expanded to test multiple data-driven scenarios. The code snippet below uses multiple input values to test different incorrect input values: [DataRow(0,"Zero Value")] [DataRow(-2, "Negative Value")] [TestMethod] public void FibonnaciCalculateTest_IncorrectOrdinal(int ordinalInput) { // TODO: Check if bad values are handled correctly } Automated tests can run on available simulators and/or a real device. They can also be included in CI build workflows and made part of the automated development pipeline. Calabash Calabash is an automated UI acceptance testing framework used to execute Cucumber tests. Cucumber tests provide an assertion strategy similar to coded UI tests, only broader and behavior oriented. The Cucumber test framework supports tests written in the Gherkin language (a human-readable programming grammar description for behavior definitions). Calabash makes up the necessary infrastructure to execute these tests on various platforms and application runtimes. A simple declaration of the feature and the scenario that is previously tested on Coded UI using the data-driven model would look similar to the excerpt below. Only two of the possible test scenarios are declared in this feature for demonstration; the feature can be extended: Feature: Calculate Single Fibonacci number. Ordinal entry should greater than 0. Scenario: Ordinal is lower than 0. Given I use the native keyboard to enter "-2" into text field Ordinal And I touch the "Calculate" button Then I see the text "Ordinal cannot be a negative number." Scenario: Ordinal is 0. Given I use the native keyboard to enter "0" into text field Ordinal And I touch the "Calculate" button Then I see the text "Cannot calculate the number for the 0th ordinal." Calabash test execution is possible on Xamarin target platforms since the Ruby API exposed by the Calabash framework has a bidirectional communication line with the Xamarin Test Cloud Agent embedded in Xamarin applications with NuGet packages. Calabash/Cucumber tests can be executed on Xamarin Test Cloud on real devices since the communication between the application runtime and Calabash framework is maintained by Xamarin Test Cloud Agent, the same as Xamarin.UI tests. Summary Xamarin projects can benefit from a properly established development pipeline and the use of ALM principles. This type of approach makes it easier for teams to share responsibilities and work out business requirements in an iterative manner. In the ALM timeline, the development phase is the main domain in which most of the concrete implementation takes place. In order for the development team to provide quality code that can survive the ALM cycle, it is highly advised to analyze and test native applications using the available tooling in Xamarin development IDEs. While the common codebase for a target platform in a Xamarin project can be treated and tested as a .NET implementation using the conventional unit tests, platform-specific implementations require more particular handling. Platform-specific parts of the application need to be tested on empty shell applications, called test harnesses, on the respective platform simulators or devices. To test views, available frameworks such as Coded UI tests (for Windows Phone) and Xamarin.UITests (for Xamarin.Android and Xamarin.iOS) can be utilized to increase the test code coverage and create a stable foundation for the delivery pipeline. Most tests and analysis tools discussed in this article can be integrated into automated continuous integration processes. Resources for Article:   Further resources on this subject: A cross-platform solution with Xamarin.Forms and MVVM architecture [article] Working with Xamarin.Android [article] Application Development Workflow [article]
Read more
  • 0
  • 0
  • 22117

article-image-databricks-dbrx-stability-ais-stable-code-instruct-3b-sambanovas-samba-coe-v02-frugalgpt-advanced-rag-patterns-on-amazon-sagemaker
Merlyn Shelley
02 Apr 2024
10 min read
Save for later

Databricks' DBRX, Stability AI's Stable Code Instruct 3B, SambaNova's Samba CoE v0.2, FrugalGPT, Advanced RAG Patterns on Amazon SageMaker

Merlyn Shelley
02 Apr 2024
10 min read
Subscribe to our Data Pro newsletter for the latest insights. Don't miss out – sign up today!👋 Hello,Welcome to DataPro#87 – Your Gateway to the Cutting-Edge of Data Science & Machine Learning! 🚀 Dive into this edition to explore: ⚙️ LLMs & GPTs Unleashed Samba CoE v0.2: SambaNova's Speedy AI Models Efficient Training of Language Models with OpenAI AI21's Revolutionary SSM-Transformer Model: Jamba Databricks' DBRX: The New Open LLM Benchmark Stable Code Instruct 3B: Stability AI's Latest Offering HyperLLaVA: Boosting Multimodal Language Models ✨ What's Fresh & Exciting FrugalGPT: Cutting LLM Operating Costs Building a Reliable AI Agent from Scratch with OpenAI Tool Calling Fine-Tuning Instruct Models over Raw Text Data Crafting an OpenAI-Compatible API ⚡ Industry Pulse:  Deciphering Advanced RAG Patterns on Amazon SageMaker Unveil the Future with AutoBNN: Mastering Probabilistic Time Series Forecasting! Engaging with Microsoft Copilot (web): Learning from Interaction 📚 Packt's Latest Gem "Principles of Data Science - Third Edition" by Sinan Ozdemir DataPro Newsletter is not just a publication; it’s a comprehensive toolkit for anyone serious about mastering the ever-changing landscape of data and AI. Grab your copy and start transforming your data expertise today! 📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition."We appreciate your input and hope you enjoy the book!Share your Feedback!Cheers,Merlyn ShelleyEditor-in-Chief, PacktSign Up | Advertise | Archives🔰 GitHub Finds: Any of These Repos in Your Toolbox?🛠️ Zejun-Yang/AniPortrait: AniPortrait is a new framework for creating high-quality animations using audio input and a reference portrait image, with face reenactment capabilities.🛠️ agiresearch/AIOS: AIOS embeds large language models into operating systems, enabling smarter resource allocation, context switching, and concurrent agent execution, advancing AGI. 🛠️ lichao-sun/Mora: Mora is a multi-agent framework for video generation, enhancing OpenAI's Sora capabilities through collaborative visual agents for diverse tasks. 🛠️ jasonppy/VoiceCraft: VoiceCraft is a high-performing neural codec language model for speech editing and zero-shot text-to-speech, excelling with diverse real-world data. 🛠️ dvlab-research/MiniGemini: Mini-Gemini enhances LLMs (Large Language Models) from 2B to 34B, integrating image understanding, reasoning, and generation, inspired by LLaVA. 🛠️ Picsart-AI-Research/StreamingT2V: StreamingT2V is a technique for creating long videos with rich motion dynamics, ensuring temporal consistency and high image quality. 📚 Expert Insights from Packt Community"Principles of Data Science - Third Edition" by Sinan Ozdemir. The Five Steps of Data Science A question I’ve gotten at least once a month for the past decade is What’s the difference between data science and data analytics? One could argue that there is no difference between the two; others will argue that there are hundreds of differences! I believe that, regardless of how many differences there are between the two terms, the following applies: Data science follows a structured, step-by-step process that, when followed, preserves the integrity of the results and leads to a deeper understanding of the data and the environment the data comes from. As with any other scientific endeavor, this process must be adhered to, or else the analysis and the results are in danger of scrutiny. On a simpler level, following a strict process can make it much easier for any data scientist, hobbyist, or professional to obtain results faster than if they were exploring data with no clear vision. While these steps are a guiding lesson for amateur analysts, they also provide the foundation for all data scientists, even those in the highest levels of business and academia. Every data scientist recognizes the value of these steps and follows them in some way or another. Overview of the five steps The process of data science involves a series of steps that are essential for effectively extracting insights and knowledge from data. These steps are presented as follows: Asking an interesting question: The first step in any data science project is to identify a question or challenge that you want to address with your analysis. This involves finding a topic that is relevant, important, and that can be addressed with data. Obtaining the data: Once you have identified your question, the next step is to collect the data that you will need to answer it. This can involve sourcing data from a variety of sources, such as databases, online platforms, or through data scraping or data collection methods. Exploring the data: After you have collected your data, the next step is to explore it and get a better understanding of its characteristics and patterns. This might involve examining summary statistics, visualizing the data, or applying statistical or machine learning (ML) techniques to identify trends or relationships. Modeling the data: Once you have explored your data, the next step is to build models that can be used to make predictions or inform decision-making. This might involve applying ML algorithms, building statistical models, or using other techniques to find patterns in the data. Communicating and visualizing the results: Finally, it’s important to communicate your findings to others in a clear and effective way. This might involve creating reports, presentations, or visualizations that help to explain your results and their implications. By following these five essential steps, you can effectively use data science to solve real-world problems and extract valuable insights from data. It’s important to note that different data scientists may have different approaches to the data science process, and the steps outlined previously are just one way of organizing the process. Some data scientists might group the steps differently or include additional steps such as feature engineering or model evaluation. Despite these differences, most data scientists agree that the steps listed previously are essential to the data science process. Whether they are organized in this specific way or not, these steps are all crucial for effectively using data to solve problems and extract valuable insights. Let’s dive into these steps one by one.Discover more insights from "Principles of Data Science - Third Edition" by Sinan Ozdemir. Unlock access to the full book and a wealth of other titles with a 7-day free trial in the Packt Library. Start exploring today!    Read Here!⚡ Tech Tidbits: Stay Wired to the Latest Industry Buzz! AWS ML Made Easy 🌀 Advanced RAG patterns on Amazon SageMaker: This post discusses how customers across various industries are utilizing large language models (LLMs) like Mixtral-8x7B Instruct to build generative AI applications such as QnA chatbots and search engines. It highlights the challenges and solutions in improving the accuracy and performance of these applications, focusing on Retrieval Augmented Generation (RAG) patterns implemented with LangChain.Google Research 🌀 AutoBNN: Probabilistic time series forecasting with compositional bayesian neural networks. This research introduces AutoBNN, an open-source package for automated, interpretable time series forecasting using Bayesian neural networks (BNNs). It addresses limitations of traditional methods like Gaussian processes (GPs) and Structural Time Series by combining the interpretability of GPs with the scalability and flexibility of neural networks. AutoBNN automates model discovery, provides high-quality uncertainty estimates, and scales effectively for large datasets. Microsoft Research🌀 Learning from interaction with Microsoft Copilot (web): This research focuses on how AI systems like Bing and Microsoft Copilot learn and improve from user interactions, particularly through reinforcement learning from human feedback (RLHF). It also explores how Bing has evolved its search capabilities and how Copilot is changing user interactions to be more conversational and workflow oriented. The research introduces frameworks like TnT-LLM and SPUR to improve taxonomy generation and user satisfaction estimation in AI interactions. Email Forwarded? Join DataPro Here!🔍 From Bits to BERT: Keeping Up with LLMs & GPTs 🌀 Samba CoE v0.2 from SambaNova delivers accurate AI models at blazing speeds: This blog post highlights Samba's advancements in AI architecture, specifically focusing on the introduction of Samba-1, a CoE architecture for enterprise AI. It discusses the features and benefits of Samba-1, its performance benchmarks, and plans for future releases, emphasizing the role of RDUs in driving efficiency and speed in AI models. 🌀 OpenAI’s Efficient Training of Language Models to Fill in the Middle: OpenAI demonstrates that autoregressive language models can effectively learn to infill text by moving a span of text from the middle of a document to its end, without harming generative capability. They propose training models with this method by default and provide benchmarks and best practices. 🌀 Jamba: AI21's Groundbreaking SSM-Transformer Model. Jamba is a groundbreaking model that merges Mamba SSM with Transformer elements, offering a 256K context window and outperforming similar models. Released under Apache 2.0, it will be available in the NVIDIA API catalog. Jamba optimizes memory, throughput, and performance, delivering remarkable efficiency. 🌀 Databricks’ DBRX: A New State-of-the-Art Open LLM. Databricks introduces DBRX, an open LLM setting new benchmarks in language understanding, programming, and math. With a 256K context window, it outperforms GPT-3.5 and competes with Gemini 1.0 Pro. DBRX is 40% smaller than Grok-1, offering 2x faster inference than LLaMA2-70B. 🌀 Introducing Stable Code Instruct 3B — Stability AI: Stable Code Instruct 3B, built on Stable Code 3B, offers state-of-the-art performance in code completion and natural language interactions for programming tasks. It outperforms Codellama 7B Instruct and matches StarChat 15B, with a focus on popular languages like Python and Java. Available for commercial use with a Stability AI Membership, the model is accessible on Hugging Face. 🌀 HyperLLaVA: Enhancing Multimodal Language Models with Dynamic Visual and Language Experts. This blog explores the advancements in Multimodal Large Language Models (MLLMs) and introduces HyperLLaVA, a dynamic model that improves performance by adaptively tuning parameters for handling diverse multimodal tasks, surpassing existing benchmarks and opening new avenues for multimodal learning systems. ✨ On the Radar: Catch Up on What's Fresh🌀 FrugalGPT and Reducing LLM Operating Costs: The blog discusses the high cost of running Large Language Models (LLMs) and introduces the "FrugalGPT" framework, which reduces operating costs significantly while maintaining quality. It explains how different models cost different amounts and proposes using a cascade of LLMs to minimize costs while maximizing answer quality. 🌀 Leverage OpenAI Tool calling: Building a reliable AI Agent from Scratch. The blog discusses the future role of AI in everyday tasks, focusing on text creation, correction, and brainstorming. It highlights the importance of Retrieval-Augmented Generation (RAG) pipelines and aims to provide Large Language Models with better context to generate more valuable content. 🌀 Fine-tune an Instruct model over raw text data: The blog explores the challenges of integrating modern chatbots with large datasets, focusing on context window sizes and the use of Retrieval-Augmented Generation (RAG) techniques. It proposes a lighter approach to fine-tuning chatbots on smaller datasets, aiming to bridge the gap between the constraints of a 128K context window and the complexities of models fine-tuned on billions of tokens. The experiment involves fine-tuning a model on The Guardian's dataset and aims to provide reproducible instructions for cost-effective model training using accessible hardware. 🌀 How to build an OpenAI-compatible API: The blog discusses the dominance of OpenAI in the Gen AI market, and the reasons developers might choose alternative LLM providers. It explores implementing a Python FastAPI server compatible with the OpenAI API specs to wrap any LLM, aiming for flexibility and cost-effectiveness. See you next time!
Read more
  • 0
  • 0
  • 22112

article-image-camera-calibration
Packt
25 Aug 2014
18 min read
Save for later

Camera Calibration

Packt
25 Aug 2014
18 min read
This article by Robert Laganière, author of OpenCV Computer Vision Application Programming Cookbook Second Edition, includes that images are generally produced using a digital camera, which captures a scene by projecting light going through its lens onto an image sensor. The fact that an image is formed by the projection of a 3D scene onto a 2D plane implies the existence of important relationships between a scene and its image and between different images of the same scene. Projective geometry is the tool that is used to describe and characterize, in mathematical terms, the process of image formation. In this article, we will introduce you to some of the fundamental projective relations that exist in multiview imagery and explain how these can be used in computer vision programming. You will learn how matching can be made more accurate through the use of projective constraints and how a mosaic from multiple images can be composited using two-view relations. Before we start the recipe, let's explore the basic concepts related to scene projection and image formation. (For more resources related to this topic, see here.) Image formation Fundamentally, the process used to produce images has not changed since the beginning of photography. The light coming from an observed scene is captured by a camera through a frontal aperture; the captured light rays hit an image plane (or an image sensor) located at the back of the camera. Additionally, a lens is used to concentrate the rays coming from the different scene elements. This process is illustrated by the following figure: Here, do is the distance from the lens to the observed object, di is the distance from the lens to the image plane, and f is the focal length of the lens. These quantities are related by the so-called thin lens equation: In computer vision, this camera model can be simplified in a number of ways. First, we can neglect the effect of the lens by considering that we have a camera with an infinitesimal aperture since, in theory, this does not change the image appearance. (However, by doing so, we ignore the focusing effect by creating an image with an infinite depth of field.) In this case, therefore, only the central ray is considered. Second, since most of the time we have do>>di, we can assume that the image plane is located at the focal distance. Finally, we can note from the geometry of the system that the image on the plane is inverted. We can obtain an identical but upright image by simply positioning the image plane in front of the lens. Obviously, this is not physically feasible, but from a mathematical point of view, this is completely equivalent. This simplified model is often referred to as the pin-hole camera model, and it is represented as follows: From this model, and using the law of similar triangles, we can easily derive the basic projective equation that relates a pictured object with its image: The size (hi) of the image of an object (of height ho) is therefore inversely proportional to its distance (do) from the camera, which is naturally true. In general, this relation describes where a 3D scene point will be projected on the image plane given the geometry of the camera. Calibrating a camera From the introduction of this article, we learned that the essential parameters of a camera under the pin-hole model are its focal length and the size of the image plane (which defines the field of view of the camera). Also, since we are dealing with digital images, the number of pixels on the image plane (its resolution) is another important characteristic of a camera. Finally, in order to be able to compute the position of an image's scene point in pixel coordinates, we need one additional piece of information. Considering the line coming from the focal point that is orthogonal to the image plane, we need to know at which pixel position this line pierces the image plane. This point is called the principal point. It might be logical to assume that this principal point is at the center of the image plane, but in practice, this point might be off by a few pixels depending on the precision at which the camera has been manufactured. Camera calibration is the process by which the different camera parameters are obtained. One can obviously use the specifications provided by the camera manufacturer, but for some tasks, such as 3D reconstruction, these specifications are not accurate enough. Camera calibration will proceed by showing known patterns to the camera and analyzing the obtained images. An optimization process will then determine the optimal parameter values that explain the observations. This is a complex process that has been made easy by the availability of OpenCV calibration functions. How to do it... To calibrate a camera, the idea is to show it a set of scene points for which their 3D positions are known. Then, you need to observe where these points project on the image. With the knowledge of a sufficient number of 3D points and associated 2D image points, the exact camera parameters can be inferred from the projective equation. Obviously, for accurate results, we need to observe as many points as possible. One way to achieve this would be to take one picture of a scene with many known 3D points, but in practice, this is rarely feasible. A more convenient way is to take several images of a set of some 3D points from different viewpoints. This approach is simpler but requires you to compute the position of each camera view in addition to the computation of the internal camera parameters, which fortunately is feasible. OpenCV proposes that you use a chessboard pattern to generate the set of 3D scene points required for calibration. This pattern creates points at the corners of each square, and since this pattern is flat, we can freely assume that the board is located at Z=0, with the X and Y axes well-aligned with the grid. In this case, the calibration process simply consists of showing the chessboard pattern to the camera from different viewpoints. Here is one example of a 6x4 calibration pattern image: The good thing is that OpenCV has a function that automatically detects the corners of this chessboard pattern. You simply provide an image and the size of the chessboard used (the number of horizontal and vertical inner corner points). The function will return the position of these chessboard corners on the image. If the function fails to find the pattern, then it simply returns false: // output vectors of image points std::vector<cv::Point2f> imageCorners; // number of inner corners on the chessboard cv::Size boardSize(6,4); // Get the chessboard corners bool found = cv::findChessboardCorners(image, boardSize, imageCorners); The output parameter, imageCorners, will simply contain the pixel coordinates of the detected inner corners of the shown pattern. Note that this function accepts additional parameters if you needs to tune the algorithm, which are not discussed here. There is also a special function that draws the detected corners on the chessboard image, with lines connecting them in a sequence: //Draw the corners cv::drawChessboardCorners(image, boardSize, imageCorners, found); // corners have been found The following image is obtained: The lines that connect the points show the order in which the points are listed in the vector of detected image points. To perform a calibration, we now need to specify the corresponding 3D points. You can specify these points in the units of your choice (for example, in centimeters or in inches); however, the simplest is to assume that each square represents one unit. In that case, the coordinates of the first point would be (0,0,0) (assuming that the board is located at a depth of Z=0), the coordinates of the second point would be (1,0,0), and so on, the last point being located at (5,3,0). There are a total of 24 points in this pattern, which is too small to obtain an accurate calibration. To get more points, you need to show more images of the same calibration pattern from various points of view. To do so, you can either move the pattern in front of the camera or move the camera around the board; from a mathematical point of view, this is completely equivalent. The OpenCV calibration function assumes that the reference frame is fixed on the calibration pattern and will calculate the rotation and translation of the camera with respect to the reference frame. Let's now encapsulate the calibration process in a CameraCalibrator class. The attributes of this class are as follows: class CameraCalibrator { // input points: // the points in world coordinates std::vector<std::vector<cv::Point3f>> objectPoints; // the point positions in pixels std::vector<std::vector<cv::Point2f>> imagePoints; // output Matrices cv::Mat cameraMatrix; cv::Mat distCoeffs; // flag to specify how calibration is done int flag; Note that the input vectors of the scene and image points are in fact made of std::vector of point instances; each vector element is a vector of the points from one view. Here, we decided to add the calibration points by specifying a vector of the chessboard image filename as input: // Open chessboard images and extract corner points int CameraCalibrator::addChessboardPoints( const std::vector<std::string>& filelist, cv::Size & boardSize) { // the points on the chessboard std::vector<cv::Point2f> imageCorners; std::vector<cv::Point3f> objectCorners; // 3D Scene Points: // Initialize the chessboard corners // in the chessboard reference frame // The corners are at 3D location (X,Y,Z)= (i,j,0) for (int i=0; i<boardSize.height; i++) { for (int j=0; j<boardSize.width; j++) { objectCorners.push_back(cv::Point3f(i, j, 0.0f)); } } // 2D Image points: cv::Mat image; // to contain chessboard image int successes = 0; // for all viewpoints for (int i=0; i<filelist.size(); i++) { // Open the image image = cv::imread(filelist[i],0); // Get the chessboard corners bool found = cv::findChessboardCorners( image, boardSize, imageCorners); // Get subpixel accuracy on the corners cv::cornerSubPix(image, imageCorners, cv::Size(5,5), cv::Size(-1,-1), cv::TermCriteria(cv::TermCriteria::MAX_ITER + cv::TermCriteria::EPS, 30, // max number of iterations 0.1)); // min accuracy //If we have a good board, add it to our data if (imageCorners.size() == boardSize.area()) { // Add image and scene points from one view addPoints(imageCorners, objectCorners); successes++; } } return successes; } The first loop inputs the 3D coordinates of the chessboard, and the corresponding image points are the ones provided by the cv::findChessboardCorners function. This is done for all the available viewpoints. Moreover, in order to obtain a more accurate image point location, the cv::cornerSubPix function can be used, and as the name suggests, the image points will then be localized at a subpixel accuracy. The termination criterion that is specified by the cv::TermCriteria object defines the maximum number of iterations and the minimum accuracy in subpixel coordinates. The first of these two conditions that is reached will stop the corner refinement process. When a set of chessboard corners have been successfully detected, these points are added to our vectors of the image and scene points using our addPoints method. Once a sufficient number of chessboard images have been processed (and consequently, a large number of 3D scene point / 2D image point correspondences are available), we can initiate the computation of the calibration parameters as follows: // Calibrate the camera // returns the re-projection error double CameraCalibrator::calibrate(cv::Size &imageSize) { //Output rotations and translations std::vector<cv::Mat> rvecs, tvecs; // start calibration return calibrateCamera(objectPoints, // the 3D points imagePoints, // the image points imageSize, // image size cameraMatrix, // output camera matrix distCoeffs, // output distortion matrix rvecs, tvecs, // Rs, Ts flag); // set options } In practice, 10 to 20 chessboard images are sufficient, but these must be taken from different viewpoints at different depths. The two important outputs of this function are the camera matrix and the distortion parameters. These will be described in the next section. How it works... In order to explain the result of the calibration, we need to go back to the figure in the introduction, which describes the pin-hole camera model. More specifically, we want to demonstrate the relationship between a point in 3D at the position (X,Y,Z) and its image (x,y) on a camera specified in pixel coordinates. Let's redraw this figure by adding a reference frame that we position at the center of the projection as seen here: Note that the y axis is pointing downward to get a coordinate system compatible with the usual convention that places the image origin at the upper-left corner. We learned previously that the point (X,Y,Z) will be projected onto the image plane at (fX/Z,fY/Z). Now, if we want to translate this coordinate into pixels, we need to divide the 2D image position by the pixel's width (px) and height (py), respectively. Note that by dividing the focal length given in world units (generally given in millimeters) by px, we obtain the focal length expressed in (horizontal) pixels. Let's then define this term as fx. Similarly, fy =f/py is defined as the focal length expressed in vertical pixel units. Therefore, the complete projective equation is as follows: Recall that (u0,v0) is the principal point that is added to the result in order to move the origin to the upper-left corner of the image. These equations can be rewritten in the matrix form through the introduction of homogeneous coordinates, in which 2D points are represented by 3-vectors and 3D points are represented by 4-vectors (the extra coordinate is simply an arbitrary scale factor, S, that needs to be removed when a 2D coordinate needs to be extracted from a homogeneous 3-vector). Here is the rewritten projective equation: The second matrix is a simple projection matrix. The first matrix includes all of the camera parameters, which are called the intrinsic parameters of the camera. This 3x3 matrix is one of the output matrices returned by the cv::calibrateCamera function. There is also a function called cv::calibrationMatrixValues that returns the value of the intrinsic parameters given by a calibration matrix. More generally, when the reference frame is not at the projection center of the camera, we will need to add a rotation vector (a 3x3 matrix) and a translation vector (a 3x1 matrix). These two matrices describe the rigid transformation that must be applied to the 3D points in order to bring them back to the camera reference frame. Therefore, we can rewrite the projection equation in its most general form: Remember that in our calibration example, the reference frame was placed on the chessboard. Therefore, there is a rigid transformation (made of a rotation component represented by the matrix entries r1 to r9 and a translation represented by t1, t2, and t3) that must be computed for each view. These are in the output parameter list of the cv::calibrateCamera function. The rotation and translation components are often called the extrinsic parameters of the calibration, and they are different for each view. The intrinsic parameters remain constant for a given camera/lens system. The intrinsic parameters of our test camera obtained from a calibration based on 20 chessboard images are fx=167, fy=178, u0=156, and v0=119. These results are obtained by cv::calibrateCamera through an optimization process aimed at finding the intrinsic and extrinsic parameters that will minimize the difference between the predicted image point position, as computed from the projection of the 3D scene points, and the actual image point position, as observed on the image. The sum of this difference for all the points specified during the calibration is called the re-projection error. Let's now turn our attention to the distortion parameters. So far, we have mentioned that under the pin-hole camera model, we can neglect the effect of the lens. However, this is only possible if the lens that is used to capture an image does not introduce important optical distortions. Unfortunately, this is not the case with lower quality lenses or with lenses that have a very short focal length. You may have already noted that the chessboard pattern shown in the image that we used for our example is clearly distorted—the edges of the rectangular board are curved in the image. Also, note that this distortion becomes more important as we move away from the center of the image. This is a typical distortion observed with a fish-eye lens, and it is called radial distortion. The lenses used in common digital cameras usually do not exhibit such a high degree of distortion, but in the case of the lens used here, these distortions certainly cannot be ignored. It is possible to compensate for these deformations by introducing an appropriate distortion model. The idea is to represent the distortions induced by a lens by a set of mathematical equations. Once established, these equations can then be reverted in order to undo the distortions visible on the image. Fortunately, the exact parameters of the transformation that will correct the distortions can be obtained together with the other camera parameters during the calibration phase. Once this is done, any image from the newly calibrated camera will be undistorted. Therefore, we have added an additional method to our calibration class: // remove distortion in an image (after calibration) cv::Mat CameraCalibrator::remap(const cv::Mat &image) { cv::Mat undistorted; if (mustInitUndistort) { // called once per calibration cv::initUndistortRectifyMap( cameraMatrix, // computed camera matrix distCoeffs, // computed distortion matrix cv::Mat(), // optional rectification (none) cv::Mat(), // camera matrix to generate undistorted image.size(), // size of undistorted CV_32FC1, // type of output map map1, map2); // the x and y mapping functions mustInitUndistort= false; } // Apply mapping functions cv::remap(image, undistorted, map1, map2, cv::INTER_LINEAR); // interpolation type return undistorted; } Running this code results in the following image: As you can see, once the image is undistorted, we obtain a regular perspective image. To correct the distortion, OpenCV uses a polynomial function that is applied to the image points in order to move them at their undistorted position. By default, five coefficients are used; a model made of eight coefficients is also available. Once these coefficients are obtained, it is possible to compute two cv::Mat mapping functions (one for the x coordinate and one for the y coordinate) that will give the new undistorted position of an image point on a distorted image. This is computed by the cv::initUndistortRectifyMap function, and the cv::remap function remaps all the points of an input image to a new image. Note that because of the nonlinear transformation, some pixels of the input image now fall outside the boundary of the output image. You can expand the size of the output image to compensate for this loss of pixels, but you will now obtain output pixels that have no values in the input image (they will then be displayed as black pixels). There's more... More options are available when it comes to camera calibration. Calibration with known intrinsic parameters When a good estimate of the camera's intrinsic parameters is known, it could be advantageous to input them in the cv::calibrateCamera function. They will then be used as initial values in the optimization process. To do so, you just need to add the CV_CALIB_USE_INTRINSIC_GUESS flag and input these values in the calibration matrix parameter. It is also possible to impose a fixed value for the principal point (CV_CALIB_FIX_PRINCIPAL_POINT), which can often be assumed to be the central pixel. You can also impose a fixed ratio for the focal lengths fx and fy (CV_CALIB_FIX_RATIO); in which case, you assume the pixels of the square shape. Using a grid of circles for calibration Instead of the usual chessboard pattern, OpenCV also offers the possibility to calibrate a camera by using a grid of circles. In this case, the centers of the circles are used as calibration points. The corresponding function is very similar to the function we used to locate the chessboard corners: cv::Size boardSize(7,7); std::vector<cv::Point2f> centers; bool found = cv:: findCirclesGrid( image, boardSize, centers); See also The A flexible new technique for camera calibration article by Z. Zhang in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no 11, 2000, is a classic paper on the problem of camera calibration Summary In this article, we explored the projective relations that exist between two images of the same scene. Resources for Article: Further resources on this subject: Creating an Application from Scratch [Article] Wrapping OpenCV [Article] New functionality in OpenCV 3.0 [Article]
Read more
  • 0
  • 0
  • 22104

article-image-integrating-angular-2-react
Mary Gualtieri
01 Sep 2016
5 min read
Save for later

How to integrate Angular 2 with React

Mary Gualtieri
01 Sep 2016
5 min read
It can be overwhelming to choose which framework is the best framework to use to get the job done in JavaScript, because you have so many options out there. In previous years, we have seen two popular frameworks come to fame: React and Angular. React has gained a lot of popularity because it is what Facebook and Instagram are built on. So, which one do you use? If you ask most JavaScript developers, they advise you to use one or the other, and that comes down to personal choice. Let's go ahead and explore Angular and React, and actually explore how the two can work together A common misconception about React is that React is a full JavaScript framework in competition with Angular. But it actually is not. React is a user interface library and is just the view in an 'MVC' framework, with a little bit of a controller in it. In other words, React is a template (an Angular term for view) where you can add some controller logic.It is the same idea of integrating the jQuery UI into JavaScript. React aids in Angular, which is a frontend framework, and makes it more efficient for the user. This is because you can write a reusable component that can be plugged into an application. Angular has many pros, and that's what makes it so appealing to a lot of developers and companies. With my personal experience, Angular has been very powerful in making solid applications. However, one of the cons that bothers me about Angular is how it goes about executing a template. I always want to practice writing DRY code and that can be a problem with Angular. You can end up writing a bunch of HTML that can be complicated and difficult to read. But you can also end up causing a spaghetti effect in your CSS. One of Angular's big strengths is the watchers and the binders. When executed correctly in well-thought-out places, it can be great for fast binding and good responsiveness. But like every human, we all make errors, and when misused, you can have performance issues when binding way too many elements in your HTML. This leads to a very slow and lagged application that no one wants to use. But there is a way to rectify this. You can use React to aid in Angular's downfalls.React was designed to work really well with other libraries and makes rendering views or templates much faster. The beauty of React is how it uses a more efficient algorithm on the virtual DOM. In plain terms, it allows you to change parts of the application that need to be updated without having to touch the rest of the application. You can send a command to update the user interface and React compares these changes to the existing DOM. Instead of theorizing on how Angular and React complement one another, let's see how it looks in code. First, let's create an HMTL file that has an Angular script and a React script. *Remember, your relative path may be different from the example.  Next, let's create a React component that renders a string that is inputted by the user.  What is happening here is that we are using React to render our model. We created a component that renders the props passed to it. Then we create an Angular directive and controller to start the app. The directive is calling the React component and telling it to render. But, let's take a look at another example that integrates Angular 2. We can demonstrate how a React component can be self-contained but still be injected into the Angular 2 world. Angular 2 has an optional hook, onInit(), that takes advantage of triggering the code to render a React component. As you can see, the host component has defined implementation for the onInit handler, where you can call the static initialize function. If you noticed, the initialize method is passing a title text that is passed down from the React component as a prop. *React Component Another item to consider that unites React and Angular is TypeScript. This is a big deal because we can manage both Angular code and React code in the same compilation step. When it comes down to it, TypeScript is what gets stripped down to regular JavaScript. One thing that you have to remember to do is tell the compiler that you are using JSX by specifying the JSX flag. To conclude, Angular will always remain a very popular framework for developers. For an application to render faster, React is a great way to render templates faster and create a more efficient user experience. React is a great compliment to Angular and will only enhance your application. About the author Mary Gualtieri is a full-stack web developer and web designer who enjoys all aspects of the web and creating a pleasant user experience. Web development, specifically frontend development, is an interest of hers because it challenges her to think outside of the box and solveproblems, all while constantly learning. She can be found on GitHub at MaryGualtieri.
Read more
  • 0
  • 2
  • 22094
article-image-how-to-publish-a-microservice-as-a-service-onto-a-docker
Pravin Dhandre
12 Apr 2018
6 min read
Save for later

How to publish Microservice as a service onto a Docker

Pravin Dhandre
12 Apr 2018
6 min read
In today’s tutorial, we will show a step-by-step approach to publishing a prebuilt microservice onto a docker so you can scale and control it further. Once the swarm is ready, we can use it to create services that we can use for scaling, but first, we will create a shared registry in our swarm. Then, we will build a microservice and publish it into a Docker. Creating a registry When we create a swarm's service, we specify an image that will be used, but when we ask Docker to create the instances of the service, it will use the swarm master node to do it. If we have built our Docker images in our machine, they are not available on the master node of the swarm, so it will create a registry service that we can use to publish our images and reference when we create our own services. First, let's create the registry service: docker service create --name registry --publish 5000:5000 registry Now, if we list our services, we should see our registry created: docker service ls ID NAME MODE REPLICAS IMAGE PORTS os5j0iw1p4q1 registry replicated 1/1 registry:latest *:5000->5000/tcp And if we visit http://localhost:5000/v2/_catalog, we should just get: {"repositories":[]} This Docker registry is ephemeral, meaning that if we stop the service or start it again, our images will disappear. For that reason, we recommend to use it only for development. For a real registry, you may want to use an external service. We discussed some of them in Chapter 7, Creating Dockers. Creating a microservice In order to create a microservice, we will use Spring Initializr, as we have been doing in previous chapters. We can start by visiting the URL: https:/​/​start.​spring.​io/​: Creating a project in Spring Initializr We have chosen to create a Maven Project using Kotlin and Spring Boot 2.0.0 M7. We chose the Group to be com.Microservices and the Artifact to be chapter08. For Dependencies, we have set Web. Now, we can click Generate Project to download it as a zip file. After we unzip it, we can open it with IntelliJ IDEA to start working on our project. After a few minutes, our project will be ready and we can open the Maven window to see the different lifecycle phases and Maven plugins and their goals. We covered how to use Spring Initializr, Maven, and IntelliJ IDEA in Chapter 2, Getting Started with Spring Boot 2.0. You can check out this chapter to learn about topics not covered in this section. Now, we will add RestController to our project, in IntelliJ IDEA, in the Project window. Right-click our com.Microservices.chapter08 package and then choose New | Kotlin File/Class. In the pop-up window, we will set its name as HelloController and in the Kind drop-down, choose Class. Let's modify our newly created controller: package com.microservices.chapter08 import org.springframework.web.bind.annotation.GetMapping import org.springframework.web.bind.annotation.RestController import java.util.* import java.util.concurrent.atomic.AtomicInteger @RestController class HelloController { private val id: String = UUID.randomUUID().toString() companion object { val total: AtomicInteger = AtomicInteger() } @GetMapping("/hello") fun hello() = "Hello I'm $id and I have been called ${total.incrementAndGet()} time(s)" } This small controller will display a message with a GET request to the /hello URL. The message will display a unique id that we establish when the object is created and it will show how many times it has been called. We have used AtomicInteger to guarantee that no other requests are modified on our number concurrently. Finally, we will rename application.properties in the resources folder, using Shift + F6 to get application.yml, and edit it: logging.level.org.springframework.web: DEBUG We will establish that we have the log in the debug level for the Sprint Web Framework, so for the latter, we could view the information that will be needed in our logs. Now, we can run our microservice, either using IntelliJ IDEA Maven Project window with the springboot:run target, or by executing it on the command line: mvnw spring-boot:run Either way, when we visit the http://localhost:8080/hello URL, we should see something like: Hello I'm a0193477-b9dd-4a50-85cc-9d9405e02299 and I have been called 1 time(s) Doing consecutive calls will increase the number. Finally, we can see at the end of our log that we have several requests. The following should appear: Hello I'm a0193477-b9dd-4a50-85cc-9d9405e02299 and I have been called 1 time(s) Now, we will stop the microservice to create our Docker. Creating our Docker Now that our microservice is running, we need to create a Docker. To simplify the process, we will just use Dockerfile. To create this file on the top of the Project window, rightclick chapter08 and then select New | File and type Dockerfile in the drop-down menu. In the next window, click OK and the file will be created. Now, we can add this to our Dockerfile: FROM openjdk:8-jdk-alpine ADD target/*.jar app.jar ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar", "App.jar"] In this case, we tell the JVM to use dev/urandom instead of dev/random to generate random numbers required by our ID generation and this will make the operation much faster. You may think that urandom is less secure than random, but read this article for more information: https:/​/www.​2uo.​de/​myths-​about-​urandom/​.​ From our microservice directory, we will use the command line to create our package: mvnw package Then, we will create a Docker for it: docker build . -t hello Now, we need to tag our image in our shared registry service, using the following command: docker tag hello localhost:5000/hello Then, we will push our image to the shared registry service: docker push localhost:5000/hello Creating the service Finally, we will add the microservice as a service to our swarm, exposing the 8080 port: docker service create --name hello-service --publish 8080:8080 localhost:5000/hello Now, we can navigate to http://localhost:8888/application/default to get the same results as before, but now we run our microservice in a Docker swarm. If you found this tutorial useful, do check out the book Hands-On Microservices with Kotlin  to work with Kotlin, Spring and Spring Boot for building reactive and cloud-native microservices. Also check out other posts: How to publish Docker and integrate with Maven Understanding Microservices How to build Microservices using REST framework
Read more
  • 0
  • 0
  • 22057

article-image-managing-files-folders-and-registry-items-using-powershell
Packt
23 Apr 2015
27 min read
Save for later

Managing Files, Folders, and Registry Items Using PowerShell

Packt
23 Apr 2015
27 min read
This article is written by Brenton J.W. Blawat, the author of Mastering Windows PowerShell Scripting. When you are automating tasks on servers and workstations, you will frequently run into situations where you need to manage files, folders, and registry items. PowerShell provides a wide variety of cmdlets that enable you to create, view, modify, and delete items on a system. (For more resources related to this topic, see here.) In this article, you will learn many techniques to interact with files, folders, and registry items. These techniques and items include: Registry provider Creating files, folders, registry keys, and registry named values Adding named values to registry keys Verifying the existence of item files, folders, and registry keys Renaming files, folders, registry keys, and named values Copying and moving files and folders Deleting files, folders, registry keys, and named values To properly follow the examples in this article, you will need to sequentially execute the examples. Each example builds on the previous examples, and some of these examples may not function properly if you do not execute the previous steps. Registry provider When you're working with the registry, PowerShell interprets the registry in the same way it does files and folders. In fact, the cmdlets that you use for files and folders are the same that you would use for registry items. The only difference with the registry is the way in which you call the registry path locations. When you want to reference the registry in PowerShell, you use the [RegistryLocation]:Path syntax. This is made available through the PowerShell Windows Registry Provider. While referencing [RegistryLocation]:Path, PowerShell provides you with the ability to use registry abbreviations pertaining to registry path locations. Instead of referencing the full path of HKEY_LOCAL_MACHINE, you can use the abbreviation of HKLM. Some other abbreviations include: HKLM: Abbreviation for HKEY_LOCAL_MACHINE hive HKCU: Abbreviation for HKEY_CURRENT_USER hive HKU: Abbreviation for HKEY_USERS hive HKCR: Abbreviation for HKEY_CLASSES_ROOT hive HKCC: Abbreviation for HKEY_CURRENT_CONFIG hive For example, if you wanted to reference the named values in the Run registry key for programs that start up on boot, the command line syntax would look like this: HKLM:SoftwareMicrosoftWindowsCurrentVersionRun While it is recommended that you don't use cmdlet aliases in your scripts, it is recommended, and a common practice, to use registry abbreviations in your code. This not only reduces the amount of effort to create the scripts but also makes it easier for others to read the registry locations. Creating files, folders, and registry items with PowerShell When you want to create a new file, folder, or registry key, you will need to leverage the new-item cmdlet. The syntax of this command is new-item, calling the –path argument to specify the location, calling the -name argument to provide a name for the item, and the -ItemType argument to designate whether you want a file or a directory (folder). When you are creating a file, it has an additional argument of –value, which allows you to prepopulate data into the file after creation. When you are creating a new registry key in PowerShell, you can omit the –ItemType argument as it is not needed for registry key creation. PowerShell assumes that when you are interacting with the registry using new-item, you are creating registry keys. The new-item command accepts the -force argument in the instance that the file, folder, or key is being created in a space that is restricted by User Account Control (UAC). To create a new folder and registry item, do the following action: New-item –path "c:Program Files" -name MyCustomSoftware –ItemType Directory New-item –path HKCU:SoftwareMyCustomSoftware -force The output of this is shown in the following screenshot: The preceding example shows how you can create folders and registry keys for a custom application. You first create a new folder in c:Program Files named MyCustomSoftware. You then create a new registry key in HKEY_CURRENT_USER:Software named MyCustomSoftware. You start by issuing the new-item cmdlet followed by the –path argument to designate that the new folder should be placed in c:Program Files. You then call the –name argument to specify the name of MyCustomSoftware. Finally, you tell the cmdlet that the -ItemType argument is Directory. After executing this command you will see a new folder in c:Progam Files named MyCustomSoftware. You then create the new registry key by calling the new-item cmdlet and issuing the –path argument and then specifying the HKCU:SoftwareMyCustomSoftware key location, and you complete it with the –force argument to force the creation of the key. After executing this command, you will see a new registry key in HKEY_CURRENT_USER:Software named MyCustomSoftware. One of the main benefits of PowerShell breaking apart the -path, -name, and -values arguments is that you have the ability to customize each of the values before you use them with the new-item cmdlet. For example, if you want to name a log file with the date stamp, add that parameter into a string and set the –name value to a string. To create a log file with a date included in the filename, do the following action: $logpath = "c:Program FilesMyCustomSoftwareLogs" New-item –path $logpath –ItemType Directory | out-null $itemname = (get-date –format "yyyyMMddmmss") + "MyLogFile.txt" $itemvalue = "Starting Logging at: " + " " + (get-date) New-item –path $logpath -name $itemname –ItemType File –value $itemvalue $logfile = $logpath + $itemname $logfile The output of this is shown in the following screenshot: The content of the log file is shown in the following screenshot: The preceding example displays how you can properly create a new log file with a date time path included in the log file name. It also shows how to create a new directory for the logs. It then displays how to include text inside the log file, designating the start of a new log file. Finally, this example displays how you can save the log file name and path in a variable to use later in your scripts. You first start by declaring the path of c:Program FilesMyCustomSoftwareLogs in the $logpath variable. You then use the new-item cmdlet to create a new folder in c:Program FilesMyCustomSoftware named Logs. By piping the command to out-null, the default output of the directory creation is silenced. You then declare the name that you want the file to be by using the get-date cmdlet, with the –format argument set to yyyyMMddmmss, and by adding mylogfile.txt. This will generate a date time stamp in the format of 4 digits including year, month, day, minutes, seconds, and mylogfile.txt. You then set the name of the file to the $itemname variable. Finally, you declare the $itemvalue variable which contains Starting Log at: and the standard PowerShell date time information. After the variables are populated, you issue the new-item command, the –path argument referencing the $logpath variable, the –name argument referencing the $itemname variable, the –ItemType referencing File, and the –value argument referencing the $itemvalue variable. At the end of the script, you will take the $logpath and $itemname variables to create a new variable of $logfile, which contains the location of the log file. As you will see from this example, after you execute the script the log file is populated with the value of Starting Logging at: 03/16/2015 14:38:24. Adding named values to registry keys When you are interacting with the registry, you typically view and edit named values or properties that are contained with in the keys and sub-keys. PowerShell uses several cmdlets to interact with named values. The first is the get-itemproperty cmdlet which allows you to retrieve the properties of a named value. The proper syntax for this cmdlet is to specify get-itemproperty to use the –path argument to specify the location in the registry, and to use the –name argument to specify the named value. The second cmdlet is new-itemproperty, which allows you to create new named values. The proper syntax for this cmdlet is specifying new-itemproperty, followed by the –path argument and the location where you want to create the new named value. You then specify the –name argument and the name you want to call the named value with. Finally, you use the –PropertyType argument which allows you to specify what kind of registry named value you want to create. The PropertyType argument can be set to Binary, DWord, ExpandString, MultiString, String, and Qword, depending on what your need for the registry value is. Finally, you specifythe –value argument which enables you to place a value into that named value. You may also use the –force overload to force the creation of the key in the instance that the key may be restricted by UAC. To create a named value in the registry, do the following action: $regpath = "HKCU:SoftwareMyCustomSoftware" $regname = "BuildTime" $regvalue = "Build Started At: " + " " + (get-date) New-itemproperty –path $regpath –name $regname –PropertyType String –value $regvalue $verifyValue = Get-itemproperty –path $regpath –name $regname Write-Host "The $regName named value is set to: " $verifyValue.$regname The output of this is shown in the following screenshot:   After executing the script, the registry will look like the following screenshot:   This script displays how you can create a registry named value in a specific location. It also displays how you can retrieve a value and display it in the console. You first start by defining several variables. The first variable $regpath defines where you want to create the new named value which is in the HKCU:SoftwareMyCustomSoftware registry key. The second variable $regname defines what you want the new named value to be named, which is BuildTime. The third variable defines what you want the value of the named value to be, which is Build Started At: with the current date and time. The next step in the script is to create the new value. You first call the new-itemproperty cmdlet with the –path argument and specify the $regpath variable. You then use the –name argument and specify $regname. This is followed by specifying the –PropertyType argument and by specifying the string PropertyType. Finally, you specify the –value argument and use the $regvalue variable to fill the named value with data. Proceeding forward in the script, you verify that the named value has proper data by leveraging the get-itemproperty cmdlet. You first define the $verifyvalue variable that captures the data from the cmdlet. You then issue get-itemproperty with the –path argument of $regpath and the –name argument of $regname. You then write to the console that the $regname named value is set to $verifyvalue.$regname. When you are done with script execution, you should have a new registry named value of BuildTime in the HKEY_CURRENT_USER:SoftwareMyCustomSoftware key with a value similar to Build Started At: 03/16/2015 14:49:22. Verifying files, folders, and registry items When you are creating and modifying objects, it's important to make sure that the file, folder, and registry items don't exist prior to creating and modifying them. The test-path cmdlet allows you to test to see if a file, folder, or registry item exists prior to working with it. The proper syntax for this is first calling test-path and then specifying a file, folder, or registry location. The result of the test-path command is True if the object exists or False if the object doesn't exist. To verify if files, folders, and registry entries exist, do the following action: $testfolder = test-path "c:Program FilesMyCustomSoftwareLogs" #Update The Following Line with the Date/Timestamp of your file $testfile = test-path "c:Program FilesMyCustomSoftwareLogs201503163824MyLogFile.txt" $testreg = test-path "HKCU:SoftwareMyCustomSoftware" If ($testfolder) { write-host "Folder Found!" } If ($testfile) { write-host "File Found!" } If ($testreg) { write-host "Registry Key Found!" } The output is shown in the following screenshot: This example displays how to verify if a file, folder, and registry item exists. You first start by declaring a variable to catch the output from the test-path cmdlet. You then specify test-path, followed by a file, folder, or registry item whose existence you want to verify. In this example, you start by using the test-path cmdlet to verify if the Logs folder is located in the c:Program FilesMyCustomSoftware directory. You then store the result in the $testfolder variable. You then use the test-path cmdlet to check if the file located at c:Program FilesMyCustomSoftwareLogs201503163824MyLogFile.txt exists. You then store the result in the $testfile variable. Finally, you use the test-path cmdlet to see if the registry key of HKCU:SoftwareMyCustomSoftware exists. You then store the result in the $testreg variable. To evaluate the variables, you create IF statements to check whether the variables are True and write to the console if the items are found. After executing the script, the console will output the messages Folder Found!, File Found!, and Registry Key Found!. Copying and moving files and folders When you are working in the operating system, there may be instances where you need to copy or move files and folders around on the operating system. PowerShell provides two cmdlets to copy and move files. The copy-item cmdlet allows you to copy a file or a folder from one location to another. The proper syntax of this cmdlet is calling copy-item, followed by –path argument for the source you want to copy and the –destination argument for the destination of the file or folder. The copy-item cmdlet also has the –force argument to write over a read-only or hidden file. There are instances when read-only files cannot be overwritten, such as a lack of user permissions, which will require additional code to change the file attributes before copying over files or folders. The copy-item cmdlet also has a –recurse argument, which allows you to recursively copy the files in a folder and its subdirectories. A common trick to use with the copy-item cmdlet is to rename during the copy operation. To do this, you change the destination to the desired name you want the file or folder to be. After executing the command, the file or folder will have a new name in its destination. This reduces the number of steps required to copy and rename a file or folder. The move-item cmdlet allows you to move files from one location to another. The move-item cmdlet has the same syntax as the copy-item cmdlet. The proper syntax of this cmdlet is calling move-item, followed by the –path argument for the source you want to move and the –destination argument for the destination of the file or folder. The move-item cmdlet also has the –force overload to write over a read-only or hidden file. There are also instances when read-only files cannot be overwritten, such as a lack of user permissions, which will require additional code to change the file attributes before moving files or folders. The move-item cmdlet does not, however, have a -recurse argument. Also, it's important to remember that the move-item cmdlet requires the destination to be created prior to the move. If the destination folder is not available, it will throw an exception. It's recommended to use the test-path cmdlet in conjunction with the move-item cmdlet to verify that the destination exists prior to the move operation. PowerShell has the same file and folder limitations as the core operating system it is being run on. This means that file paths that are longer than 256 characters in length will receive an error message during the copy process. For paths that are over 256 characters in length, you need to leverage robocopy.exe or a similar file copy program to copy or move files. All move-item operations are recursive by default. You do not have to specify the –recurse argument to recursively move files. To copy files recursively, you need to specify the –recurse argument. To copy and move files and folders, do the following action: New-item –path "c:Program FilesMyCustomSoftwareAppTesting" –ItemType Directory | Out-null New-item –path "c:Program FilesMyCustomSoftwareAppTestingHelp" -ItemType Directory | Out-null New-item –path "c:Program FilesMyCustomSoftwareAppTesting" –name AppTest.txt –ItemType File | out-null New-item –path "c:Program FilesMyCustomSoftwareAppTestingHelp" –name HelpInformation.txt –ItemType File | out-null New-item –path "c:Program FilesMyCustomSoftware" -name ConfigFile.txt –ItemType File | out-null move-item –path "c:Program FilesMyCustomSoftwareAppTesting" –destination "c:Program FilesMyCustomSoftwareArchive" –force copy-item –path "c:Program FilesMyCustomSoftwareConfigFile.txt" "c:Program FilesMyCustomSoftwareArchiveArchived_ConfigFile.txt" –force The output of this is shown in the following screenshot: This example displays how to properly use the copy-item and move-item cmdlets. You first start by using the new-item cmdlet with the –path argument set to c:Program FilesMyCustomSoftwareAppTesting and the –ItemType argument set to Directory. You then pipe the command to out-null to suppress the default output. This creates the AppTesting sub directory in the c:Program FilesMyCustomSoftware directory. You then create a second folder using the new-item cmdlet with the –path argument set to c:Program FilesMyCustomSoftwareAppTestingHelp and the –ItemType argument set to Directory. You then pipe the command to out-null. This creates the Help sub directory in the c:Program FilesMyCustomSoftwareAppTesting directory. After creating the directories, you create a new file using the new-item cmdlet with the path of c:Program FilesMyCustomSoftwareAppTesting, the -name argument set to AppTest.txt, the -ItemType argument set to File; you then pipe it to out-null. You create a second file by using the new-item cmdlet with the path of c:Program FilesMyCustomSoftwareAppTestingHelp, the -name argument set to HelpInformation.txt and the -ItemType argument set to File, and then piping it to out-null. Finally, you create a third file using the new-item cmdlet with the path of c:Program FilesMyCustomSoftware, the -name argument set to ConfigFile.txt and the -ItemType argument set to File, and then pipe it to out-null. After creating the files, you are ready to start copying and moving files. You first move the AppTesting directory to the Archive directory by using the move-item cmdlet and then specifying the –path argument with the value of c:Program FilesMyCustomSoftwareAppTesting as the source, the –destination argument with the value of c:Program FilesMyCustomSoftwareArchive as the destination, and the –force argument to force the move if the directory is hidden. You then copy a configuration file by using the copy-item cmdlet, using the –path argument with c:Program FilesMyCustomSoftwareConfigFile.txt as the source, and then specifying the –destination argument with c:Program FilesMyCustomSoftwareArchiveArchived_ConfigFile.txt as the new destination with a new filename;, you then leverage the –force argument to force the copy if the file is hidden. This follow screenshot displays the file folder hierarchy after executing this script: After executing this script, the file folder hierarchy should be as displayed in the preceding screenshot. This also displays that when you move the AppTesting directory to the Archive folder, it automatically performs the move recursively, keeping the file and folder structure intact. Renaming files, folders, registry keys, and named values When you are working with PowerShell scripts, you may have instances where you need to rename files, folders, and registry keys. The rename-item cmdlet can be used to perform renaming operations on a system. The syntax for this cmdlet is rename-item and specifying the –path argument with path to the original object, and then you call the –newname argument with a full path to what you want the item to be renamed to. The rename-item has a –force argument to force the rename in instances where the file or folder is hidden or restricted by UAC or to avoid prompting for the rename action. To copy and rename files and folders, do the following action: New-item –path "c:Program FilesMyCustomSoftwareOldConfigFiles" –ItemType Directory | out-null Rename-item –path "c:Program FilesMyCustomSoftwareOldConfigFiles" –newname "c:Program FilesMyCustomSoftwareConfigArchive" -force copy-item –path "c:Program FilesMyCustomSoftwareConfigFile.txt" "c:Program FilesMyCustomSoftwareConfigArchiveConfigFile.txt" –force Rename-item –path "c:Program FilesMyCustomSoftwareConfigArchiveConfigFile.txt" –newname "c:Program FilesMyCustomSoftwareConfigArchiveOld_ConfigFile.txt" –force The output of this is shown in the following screenshot: In this example, you create a script that creates a new folder and a new file, and then renames the file and the folder. To start, you leverage the new-item cmdlet which creates a new folder in c:Program FilesMyCustomSoftware named OldConfigFiles. You then pipe that command to Out-Null, which silences the standard console output of the folder creation. You proceed to rename the folder c:Program FilesMyCustomSoftwareOldConfigFiles with the rename-item cmdlet using the –newname argument to c:Program FilesMyCustomSoftwareConfigArchive. You follow the command with the –force argument to force the renaming of the folder. You leverage the copy-item cmdlet to copy the ConfigFile.txt into the ConfigArchive directory. You first start by specifying the copy-item cmdlet with the –path argument set to c:Program FilesMyCustomSoftwareConfigFile.txt and the destination set to c:Program FilesMyCustomSoftwareConfigArchiveConfigFile.txt. You include the –Force argument to force the copy. After moving the file, leverage the rename-item cmdlet with the -path argument to rename c:Program FilesMyCustomSoftwareConfigArchiveConfigFile.txt using the –newname argument to c:Program FilesMyCustomSoftwareConfigArchiveOld_ConfigFile.txt. You follow this command with the –force argument to force the renaming of the file. At the end of this script, you will have successfully renamed a folder, moved a file into that renamed folder, and renamed a file in the newly created folder. In the instance that you want to rename a registry key, do the following action: New-item –path "HKCU:SoftwareMyCustomSoftware" –name CInfo –force | out-null Rename-item –path "HKCU:SoftwareMyCustomSoftwareCInfo" –newname ConnectionInformation –force The output of this is shown in the following screenshot: After renaming the subkey, the registry will look like the following screenshot: This example displays how to create a new subkey and rename it. You first start by using the new-item cmdlet to create a new sub key with the –path argument of the HKCU:SoftwareMyCustomSoftware key and the -name argument set to CInfo. You then pipe that line to out-null in order to suppress the standard output from the script. You proceed to execute the rename-item cmdlet with the –path argument set to HKCU:SoftwareMyCustomSoftwareCInfo and the –newname argument set to ConnectionInformation. You then use the –force argument to force the renaming in instances when the subkey is restricted by UAC. After executing this command, you will see that the CInfo subkey located in HKCU:SoftwareMyCustomSoftware is now renamed to ConnectionInformation. When you want to update named values in the registry, you will not be able to use the rename-item cmdlet. This is due to the named values being properties of the keys themselves. Instead, PowerShell provides the rename-itemproperty cmdlet to rename the named values in the key. The proper syntax for this cmdlet is calling rename-itemproperty by using the –path argument, followed by the path to the key that contains the named value. You then issue the –name argument to specify the named value you want to rename. Finally, you specify the –newname argument and the name you want the named value to be renamed to. To rename the registry named value, do the following action: $regpath = "HKCU:SoftwareMyCustomSoftwareConnectionInformation" $regname = "DBServer" $regvalue = "mySQLserver.mydomain.local" New-itemproperty –path $regpath –name $regname –PropertyType String –value $regvalue | Out-null Rename-itemproperty –path $regpath –name DBServer –newname DatabaseServer The output of this is shown in the following screenshot: After updating the named value, the registry will reflect this change, and so should look like the following screenshot: The preceding script displays how to create a new named value and rename it to a different named value. You first start by defining the variables to be used with the new-itemproperty cmdlet. You define the location of the registry subkey in the $regpath variable and set it to HKCU:SoftwareMyCustomSoftwareConnectionInformation. You then specify the named value name of DBServer and store it in the $regname variable. Finally, you define the $regvalue variable and store the value of mySQLserver.mydomain.local. To create the new named value, leverage new-itemproperty, specify the –path argument with the $regpath variable, use the –name argument with the $regname variable, and use the –value argument with the $regvalue variable. You then pipe this command to out-null in order to suppress the default output of the command. This command will create the new named value of DBServer with the value of mySQLserver.mydomain.local in the HKCU:SoftwareMyCustomSoftwareConnectionInformation subkey. The last step in the script is renaming the DBServer named value to DatabaseServer. You first start by calling the rename-itemproperty cmdlet and then using the –path argument and specifying the $regpath variable which contains the HKCU:SoftwareMyCustomSoftwareConnectionInformation subkey; you then proceed by calling the –name argument and specifying DBServer and finally calling the –newname argument with the new value of DatabaseServer. After executing this command, you will see that the HKCU:SoftwareMyCustomSoftwareConnectionInformation key has a new named value of DatabaseServer containing the same value of mySQLserver.mydomain.local. Deleting files, folders, registry keys, and named values When you are creating scripts, there are instances when you need to delete items from a computer. PowerShell has the remove-item cmdlet that enables the removal of objects from a computer. The syntax of this cmdlet starts by calling the remove-item cmdlet and proceeds with specifying the –path argument with a file, folder, or registry key to delete. The remove-item cmdlet has several useful arguments that can be leveraged. The –force argument is available to delete files, folders, and registry keys that are read-only, hidden, or restricted by UAC. The –recurse argument is available to enable recursive deletion of files, folders, and registry keys on a system. The –include argument enables you to delete specific files, folders, and registry keys. The –include argument allows you to use the wildcard character of an asterisk (*) to search for specific values in an object name or a specific object type. The –exclude argument will exclude specific files, folders, and registry keys on a system. It also accepts the wildcard character of an asterisk (*) to search for specific values in an object name or a specific object type. The named values in the registry are properties of the key that they are contained in. As a result, you cannot use the remove-item cmdlet to remove them. Instead, PowerShell offers the remove-itemproperty cmdlet to enable the removal of the named values. The remove-itemproperty cmdlet has arguments similar to those of the remove-item cmdlet. It is important to note, however, that the –filter, -include, and –exclude arguments will not work with named values in the registry. They only work with item paths such as registry keys. To set up the system for the deletion example, you need to process the following script: # Create New Directory new-item –path "c:program filesMyCustomSoftwareGraphics" –ItemType Directory | Out-null # Create Files for This Example new-item –path "c:program filesMyCustomSoftwareGraphics" –name FirstGraphic.bmp –ItemType File | Out-Null new-item –path "c:program filesMyCustomSoftwareGraphics" –name FirstGraphic.png –ItemType File | Out-Null new-item –path "c:program filesMyCustomSoftwareGraphics" –name SecondGraphic.bmp –ItemType File | Out-Null new-item –path "c:program filesMyCustomSoftwareGraphics" –name SecondGraphic.png –ItemType File | Out-Null new-item –path "c:program filesMyCustomSoftwareLogs" –name 201301010101LogFile.txt –ItemType File | Out-Null new-item –path "c:program filesMyCustomSoftwareLogs" –name 201302010101LogFile.txt –ItemType File | Out-Null new-item –path "c:program filesMyCustomSoftwareLogs" –name 201303010101LogFile.txt –ItemType File | Out-Null # Create New Registry Keys and Named Values New-item –path "HKCU:SoftwareMyCustomSoftwareAppSettings" | Out-null New-item –path "HKCU:SoftwareMyCustomSoftwareApplicationSettings" | Out-null New-itemproperty –path "HKCU:SoftwareMyCustomSoftwareApplicationSettings" –name AlwaysOn –PropertyType String –value True | Out-null New-itemproperty –path "HKCU:SoftwareMyCustomSoftwareApplicationSettings" –name AutoDeleteLogs –PropertyType String –value True | Out-null The output of this is shown in the following screenshot: The preceding example is designed to set up the file structure for the following example. You first use the new-item cmdlet to create a new directory called Graphics in c:program filesMyCustomSoftware. You then use the new-item cmdlet to create new files named FirstGraphic.bmp, FirstGraphic.png, SecondGraphic.bmp, and SecondGraphic.png in the c:Program FilesMyCustomSoftwareGraphics directory. You then use the new-item cmdlet to create new log files in c:Program FilesMyCustomSoftwareLogs named 201301010101LogFile.txt, 201302010101LogFile.txt, and 201303010101LogFile.txt. After creating the files, you create two new registry keys located at HKCU:SoftwareMyCustomSoftwareAppSettings and HKCU:SoftwareMyCustomSoftwareApplicationSettings. You then populate the HKCU:SoftwareMyCustomSoftwareApplicationSettings key with a named value of AlwaysOn set to True and a named value of AutoDeleteLogs set to True. To remove files, folders, and registry items from a system, do the following action: # Get Current year $currentyear = get-date –f yyyy # Build the Exclude String $exclude = "*" + $currentyear + "*" # Remove Items from System Remove-item –path "c:Program FilesMyCustomSoftwareGraphics" –include *.bmp –force -recurse Remove-item –path "c:Program FilesMyCustomSoftwareLogs" –exclude $exclude -force –recurse Remove-itemproperty –path "HKCU:SoftwareMyCustomSoftwareApplicationSettings" –Name AutoDeleteLogs Remove-item –path "HKCU:SoftwareMyCustomSoftwareApplicationSettings" The output of this is shown in the following screenshot: This script displays how you can leverage PowerShell to clean up files and folders with the remove-item cmdlet and the –exclude and –include arguments. You first start by building the exclusion string for the remove-item cmdlet. You retrieve the current year by using the get-date cmdlet with the –f parameter set to yyyy. You save the output into the $currentyear variable. You then create a $exclude variable that appends asterisks on each end of the $currentyear variable, which contains the current date. This will allow the exclusion filter to find the year anywhere in the file or folder names. The first command is that you use the remove-item cmdlet and call the –path argument with the path of c:Program FilesMyCustomSoftwareGraphics. You then specify the –include argument with the value of *.bmp. This tells the remove-item cmdlet to delete all files that end in .bmp. You then specify –force to force the deletion of the files and –recurse to search the entire Graphics directory to delete the files that meet the *.bmp inclusion criteria but leaves the other files you created with the *.png extension. The second command leverages the remove-item cmdlet with the –path argument set to c:Program FilesMyCustomSoftwareLogs. You use the –exclude argument with the value of $exclude to exclude files that contain the current year. You then specify –force to force the deletion of the files and –recurse to search the entire logs directory to delete the files and folders that do not meet the exclusion criteria. The third command leverages the remove-itemproperty cmdlet with the –path argument set to HKCU:SoftwareMyCustomSoftwareApplicationSettings and the –name argument set to AutoDeleteLogs. After execution, the AutoDeleteLogs named path is deleted from the registry. The last command leverages the remote-item cmdlet with the –path argument set to HKCU:SoftwareMyCustomSoftwareApplicationSettings. After running this last command, the entire subkey of ApplicationSettings is removed from HKCU:SoftwareMyCustomSoftware. After executing this script, you will see that the script deletes the .BMP files in the c:Program FilesMyCustomSoftwareGraphics directory, but it leaves the .PNG files. You will also see that the script deletes all of the log files except the ones that had the current year contained in them. Last, you will see that the ApplicationSettings sub key that was created in the previous step is successfully deleted from HKCU:SoftwareMyCustomSoftware. When you use the remove-item and -recurse parameters together, it is important to note that if remote-item cmdlet deletes all the files and folders in a directory, the –recurse parameter will also delete the empty folder and subfolders that contained those files. This is only true when there are no remaining files in the folders in a particular directory. This may create undesirable results on your system, and so you should use caution while performing this combination. Summary This article thoroughly explained the interaction of PowerShell with the files, folders, and registry objects. It began by displaying how to create a folder and a registry key by leveraging the new-item cmdlet. It also displayed the additional arguments that can be used with the new-item cmdlet to create a log file with the date time integrated in the filename. The article proceeded to display how to create and view a registry key property using the get-itemproperty and new-itemproperty cmdlets This article then moved to verification of files, folder, and registry items through the test-path cmdlet. By using this cmdlet, you can test to see if the object exists prior to interacting with it. You also learned how to interact with copying and moving files and folders by leveraging the copy-item and move-item cmdlets. You also learned how to rename files, folders, registry keys and registry properties with the use of the rename-item and rename-itemproperty cmdlets. This article ends with learning how to delete files, folders, and registry items by leveraging the remove-item and remove-itemproperty cmdlets. Resources for Article: Further resources on this subject: Azure Storage [article] Hyper-V Basics [article] Unleashing Your Development Skills with PowerShell [article]
Read more
  • 0
  • 0
  • 22055

article-image-speech2face-a-neural-network-that-imagines-faces-from-hearing-voices-is-it-too-soon-to-worry-about-ethnic-profiling
Savia Lobo
28 May 2019
8 min read
Save for later

Speech2Face: A neural network that “imagines” faces from hearing voices. Is it too soon to worry about ethnic profiling?

Savia Lobo
28 May 2019
8 min read
Last week, a few researchers from the MIT CSAIL and Google AI published their research study of reconstructing a facial image of a person from a short audio recording of that person speaking, in their paper titled, “Speech2Face: Learning the Face Behind a Voice”. The researchers designed and trained a neural network which uses millions of natural Internet/YouTube videos of people speaking. During training, they demonstrated that the model learns voice-face correlations that allows it to produce images that capture various physical attributes of the speakers such as age, gender, and ethnicity. The entire training was done in a self-supervised manner, by utilizing the natural co-occurrence of faces and speech in Internet videos, without the need to model attributes explicitly. They said they further evaluated and numerically quantified how their Speech2Face reconstructs, obtains results directly from audio, and how it resembles the true face images of the speakers. For this, they tested their model both qualitatively and quantitatively on the AVSpeech dataset and the VoxCeleb dataset. The Speech2Face model The researchers utilized the VGG-Face model, a face recognition model pre-trained on a large-scale face dataset called DeepFace and extracted a 4096-D face feature from the penultimate layer (fc7) of the network. These face features were shown to contain enough information to reconstruct the corresponding face images while being robust to many of the aforementioned variations. The Speech2Face pipeline consists of two main components: 1) a voice encoder, which takes a complex spectrogram of speech as input, and predicts a low-dimensional face feature that would correspond to the associated face; and 2) a face decoder, which takes as input the face feature and produces an image of the face in a canonical form (frontal-facing and with neutral expression). During training, the face decoder is fixed, and only the voice encoder is trained which further predicts the face feature. How were the facial features evaluated? To quantify how well different facial attributes are being captured in Speech2Face reconstructions, the researchers tested different aspects of the model. Demographic attributes Researchers used Face++, a leading commercial service for computing facial attributes. They evaluated and compared age, gender, and ethnicity, by running the Face++ classifiers on the original images and our Speech2Face reconstructions. The Face++ classifiers return either “male” or “female” for gender, a continuous number for age, and one of the four values, “Asian”, “black”, “India”, or “white”, for ethnicity. Source: Arxiv.org Craniofacial attributes Source: Arxiv.org The researchers evaluated craniofacial measurements commonly used in the literature, for capturing ratios and distances in the face. They computed the correlation between F2F and the corresponding S2F reconstructions. Face landmarks were computed using the DEST library. As can be seen, there is statistically significant (i.e., p < 0.001) positive correlation for several measurements. In particular, the highest correlation is measured for the nasal index (0.38) and nose width (0.35), the features indicative of nose structures that may affect a speaker’s voice. Feature similarity The researchers further test how well a person can be recognized from on the face features predicted from speech. They, first directly measured the cosine distance between the predicted features and the true ones obtained from the original face image of the speaker. The table above shows the average error over 5,000 test images, for the predictions using 3s and 6s audio segments. The use of longer audio clips exhibits consistent improvement in all error metrics; this further evidences the qualitative improvement observed in the image below. They further evaluated how accurately they could retrieve the true speaker from a database of face images. To do so, they took the speech of a person to predict the feature using the Speech2Face model and query it by computing its distances to the face features of all face images in the database. Ethical considerations with Speech2Face model Researchers said that the training data used is a collection of educational videos from YouTube and that it does not represent equally the entire world population. Hence, the model may be affected by the uneven distribution of data. They have also highlighted that “ if a certain language does not appear in the training data, our reconstructions will not capture well the facial attributes that may be correlated with that language”. “In our experimental section, we mention inferred demographic categories such as “White” and “Asian”. These are categories defined and used by a commercial face attribute classifier and were only used for evaluation in this paper. Our model is not supplied with and does not make use of this information at any stage”, the paper mentions. They also warn that any further investigation or practical use of this technology would be carefully tested to ensure that the training data is representative of the intended user population. “If that is not the case, more representative data should be broadly collected”, the researchers state. Limitations of the Speech2Face model In order to test the stability of the Speech2Face reconstruction, the researchers used faces from different speech segments of the same person, taken from different parts within the same video, and from a different video. The reconstructed face images were consistent within and between the videos. They further probed the model with an Asian male example speaking the same sentence in English and Chinese to qualitatively test the effect of language and accent. While having the same reconstructed face in both cases would be ideal, the model inferred different faces based on the spoken language. In other examples, the model was able to successfully factor out the language, reconstructing a face with Asian features even though the girl was speaking in English with no apparent accent. “In general, we observed mixed behaviors and a more thorough examination is needed to determine to which extent the model relies on language. More generally, the ability to capture the latent attributes from speech, such as age, gender, and ethnicity, depends on several factors such as accent, spoken language, or voice pitch. Clearly, in some cases, these vocal attributes would not match the person’s appearance”, the researchers state in the paper. Speech2Cartoon: Converting generated image into cartoon faces The face images reconstructed from speech may also be used for generating personalized cartoons of speakers from their voices. The researchers have used Gboard, the keyboard app available on Android phones, which is also capable of analyzing a selfie image to produce a cartoon-like version of the face. Such cartoon re-rendering of the face may be useful as a visual representation of a person during a phone or a video conferencing call when the person’s identity is unknown or the person prefers not to share his/her picture. The reconstructed faces may also be used directly, to assign faces to machine-generated voices used in home devices and virtual assistants. https://twitter.com/NirantK/status/1132880233017761792 A user on HackerNews commented, “This paper is a neat idea, and the results are interesting, but not in the way I'd expected. I had hoped it would the domain of how much person-specific information this can deduce from a voice, e.g. lip aperture, overbite, size of the vocal tract, openness of the nares. This is interesting from a speech perception standpoint. Instead, it's interesting more in the domain of how much social information it can deduce from a voice. This appears to be a relatively efficient classifier for gender, race, and age, taking voice as input.” “I'm sure this isn't the first time it's been done, but it's pretty neat to see it in action, and it's a worthwhile reminder: If a neural net is this good at inferring social, racial, and gender information from audio, humans are even better. And the idea of speech as a social construct becomes even more relevant”, he further added. This recent study is interesting considering the fact that it is taking AI to another level wherein we are able to predict the face just by using audio recordings and even without the need for a DNA. However, there can be certain repercussions, especially when it comes to security. One can easily misuse such technology by impersonating someone else and can cause trouble. It would be interesting to see how this study turns out to be in the near future. To more about the Speech2Face model in detail, head over to the research paper. OpenAI introduces MuseNet: A deep neural network for generating musical compositions An unsupervised deep neural network cracks 250 million protein sequences to reveal biological structures and functions OpenAI researchers have developed Sparse Transformers, a neural network which can predict what comes next in a sequence
Read more
  • 0
  • 0
  • 22010
article-image-cross-validation-r-predictive-models
Pravin Dhandre
17 Apr 2018
8 min read
Save for later

Cross-validation in R for predictive models

Pravin Dhandre
17 Apr 2018
8 min read
In today’s tutorial, we will efficiently train our first predictive model, we will use Cross-validation in R as the basis of our modeling process. We will build the corresponding confusion matrix. Most of the functionality comes from the excellent caret package. You can find more information on the vast features of caret package that we will not explore in this tutorial. Before moving to the training tutorial, lets understand what a confusion matrix is. A confusion matrix is a summary of prediction results on a classification problem. The number of correct and incorrect predictions are summarized with count values and broken down by each class. This is the key to the confusion matrix. The confusion matrix shows the ways in which your classification model is confused when it makes predictions. It gives you insight not only into the errors being made by your classifier but more importantly the types of errors that are being made. Training our first predictive model Following best practices, we will use Cross Validation (CV) as the basis of our modeling process. Using CV we can create estimates of how well our model will do with unseen data. CV is powerful, but the downside is that it requires more processing and therefore more time. If you can take the computational complexity, you should definitely take advantage of it in your projects. Going into the mathematics behind CV is outside of the scope of this tutorial. If interested, you can find out more information on cross validation on Wikipedia . The basic idea is that the training data will be split into various parts, and each of these parts will be taken out of the rest of the training data one at a time, keeping all remaining parts together. The parts that are kept together will be used to train the model, while the part that was taken out will be used for testing, and this will be repeated by rotating the parts such that every part is taken out once. This allows you to test the training procedure more thoroughly, before doing the final testing with the testing data. We use the trainControl() function to set our repeated CV mechanism with five splits and two repeats. This object will be passed to our predictive models, created with the caret package, to automatically apply this control mechanism within them: cv.control <- trainControl(method = "repeatedcv", number = 5, repeats = 2) Our predictive models pick for this example are Random Forests (RF). We will very briefly explain what RF are, but the interested reader is encouraged to look into James, Witten, Hastie, and Tibshirani's excellent "Statistical Learning" (Springer, 2013). RF are a non-linear model used to generate predictions. A tree is a structure that provides a clear path from inputs to specific outputs through a branching model. In predictive modeling they are used to find limited input-space areas that perform well when providing predictions. RF create many such trees and use a mechanism to aggregate the predictions provided by this trees into a single prediction. They are a very powerful and popular Machine Learning model. Let's have a look at the random forests example: Random forests aggregate trees To train our model, we use the train() function passing a formula that signals R to use MULT_PURCHASES as the dependent variable and everything else (~ .) as the independent variables, which are the token frequencies. It also specifies the data, the method ("rf" stands for random forests), the control mechanism we just created, and the number of tuning scenarios to use: model.1 <- train( MULT_PURCHASES ~ ., data = train.dfm.df, method = "rf", trControl = cv.control, tuneLength = 5 ) Improving speed with parallelization If you actually executed the previous code in your computer before reading this, you may have found that it took a long time to finish (8.41 minutes in our case). As we mentioned earlier, text analysis suffers from very high dimensional structures which take a long time to process. Furthermore, using CV runs will take a long time to run. To cut down on the total execution time, use the doParallel package to allow for multi-core computers to do the training in parallel and substantially cut down on time. We proceed to create the train_model() function, which takes the data and the control mechanism as parameters. It then makes a cluster object with the makeCluster() function with a number of available cores (processors) equal to the number of cores in the computer, detected with the detectCores() function. Note that if you're planning on using your computer to do other tasks while you train your models, you should leave one or two cores free to avoid choking your system (you can then use makeCluster(detectCores() -2) to accomplish this). After that, we start our time measuring mechanism, train our model, print the total time, stop the cluster, and return the resulting model. train_model <- function(data, cv.control) { cluster <- makeCluster(detectCores()) registerDoParallel(cluster) start.time <- Sys.time() model <- train( MULT_PURCHASES ~ ., data = data, method = "rf", trControl = cv.control, tuneLength = 5 ) print(Sys.time() - start.time) stopCluster(cluster) return(model) } Now we can retrain the same model much faster. The time reduction will depend on your computer's available resources. In the case of an 8-core system with 32 GB of memory available, the total time was 3.34 minutes instead of the previous 8.41 minutes, which implies that with parallelization, it only took 39% of the original time. Not bad right? Let's have look at how the model is trained: model.1 <- train_model(train.dfm.df, cv.control) Computing predictive accuracy and confusion matrices Now that we have our trained model, we can see its results and ask it to compute some predictive accuracy metrics. We start by simply printing the object we get back from the train() function. As can be seen, we have some useful metadata, but what we are concerned with right now is the predictive accuracy, shown in the Accuracy column. From the five values we told the function to use as testing scenarios, the best model was reached when we used 356 out of the 2,007 available features (tokens). In that case, our predictive accuracy was 65.36%. If we take into account the fact that the proportions in our data were around 63% of cases with multiple purchases, we have made an improvement. This can be seen by the fact that if we just guessed the class with the most observations (MULT_PURCHASES being true) for all the observations, we would only have a 63% accuracy, but using our model we were able to improve toward 65%. This is a 3% improvement. Keep in mind that this is a randomized process, and the results will be different every time you train these models. That's why we want a repeated CV as well as various testing scenarios to make sure that our results are robust: model.1 #> Random Forest #> #> 212 samples #> 2007 predictors #> 2 classes: 'FALSE', 'TRUE' #> #> No pre-processing #> Resampling: Cross-Validated (5 fold, repeated 2 times) #> Summary of sample sizes: 170, 169, 170, 169, 170, 169, ... #> Resampling results across tuning parameters: #> #> mtry Accuracy Kappa #> 2 0.6368771 0.00000000 #> 11 0.6439092 0.03436849 #> 63 0.6462901 0.07827322 #> 356 0.6536545 0.16160573 #> 2006 0.6512735 0.16892126 #> #> Accuracy was used to select the optimal model using the largest value. #> The final value used for the model was mtry = 356. To create a confusion matrix, we can use the confusionMatrix() function and send it the model's predictions first and the real values second. This will not only create the confusion matrix for us, but also compute some useful metrics such as sensitivity and specificity. We won't go deep into what these metrics mean or how to interpret them since that's outside the scope of this tutorial, but we highly encourage the reader to study them using the resources cited in this tutorial: confusionMatrix(model.1$finalModel$predicted, train$MULT_PURCHASES) #> Confusion Matrix and Statistics #> #> Reference #> Prediction FALSE TRUE #> FALSE 18 19 #> TRUE 59 116 #> #> Accuracy : 0.6321 #> 95% CI : (0.5633, 0.6971) #> No Information Rate : 0.6368 #> P-Value [Acc > NIR] : 0.5872 #> #> Kappa : 0.1047 #> Mcnemar's Test P-Value : 1.006e-05 #> #> Sensitivity : 0.23377 #> Specificity : 0.85926 #> Pos Pred Value : 0.48649 #> Neg Pred Value : 0.66286 #> Prevalence : 0.36321 #> Detection Rate : 0.08491 #> Detection Prevalence : 0.17453 #> Balanced Accuracy : 0.54651 #> #> 'Positive' Class : FALSE You read an excerpt from R Programming By Example authored by Omar Trejo Navarro. This book gets you familiar with R’s fundamentals and its advanced features to get you hands-on experience with R’s cutting edge tools for software development. Getting Started with Predictive Analytics Here’s how you can handle the bias variance trade-off in your ML models    
Read more
  • 0
  • 0
  • 22003

article-image-amazon-splits-hq2-between-new-york-and-washington-d-c-after-a-making-200-states-compete-over-a-year-public-sentiments-largely-negative
Sugandha Lahoti
14 Nov 2018
4 min read
Save for later

Amazon splits HQ2 between New York and Washington, D.C. after a making 200+ states compete over a year; public sentiments largely negative

Sugandha Lahoti
14 Nov 2018
4 min read
Yesterday, Amazon announced that it will split its second headquarters between New York City and a suburb of Washington, D.C. Specifically into Long Island City, Queens, and Crystal City, in Arlington, Virginia. Per Amazon, the company will invest $5 billion and create more than 50,000 jobs across the two new headquarters locations, with more than 25,000 employees each in New York City and Arlington. The new locations will join Seattle as the company’s three headquarters in North America. How did it all start? The stage was set fourteen months ago when Amazon invited North American cities to apply to compete for getting Amazon’s second headquarters in their cities. Every year, American cities and states spend up to $90 billion in tax breaks and cash grants to urge companies to move among states. After considering applications from nearly 238 cities, Amazon shocked everyone by finally choosing New York and D.C., the two most sought after cities in America. And to top that, the incentives the company will receive as part of the deals are touching the sky.  Amazon announced that, in New York, it will receive up to $1.2 billion in a refundable tax credit, tied to the creation of jobs, and a $325 million cash development grant. The company will meanwhile earn up to $573 million in cash grants for the Arlington investment if it creates the promised jobs. https://twitter.com/ByRosenberg/status/1062394439102943232 What do the people think? Amazon’s decision was met with backlash and outrage by concerned citizens as well as other people on the internet. People have found Amazon’s decision extremely concerning. https://twitter.com/Ocasio2018/status/1062204614496403457 https://twitter.com/SteveCase/status/1062399854905909248 Queens residents are also “outraged” at Amazon’s plans according to Rep.-elect Alexandria Ocasio-Cortez. https://twitter.com/Ocasio2018/status/1062203458227503104 Many people have also released statements and drafted open letters to Jeff Bezos disagreeing to Amazon’s decision and highlighting the trouble associated. https://twitter.com/NYCSpeakerCoJo/status/1062384477861748736 David Heinemeier Hansson, the creator of Ruby on Rails has written an open letter addressed to Jeff Bezos terming Amazon HQ2 process demeaning if not outright cruel. “At a time when politicians are viewed as more inept, more suspicious, and more corrupt than ever, you made city after city grovel in front of your selection committee. They debased themselves in a futile attempt to appeal to your grace and mercy, and you showed them little. The losers ended up worse than where they started, and even the winners may well too.” He wrote, “At some point people are going to have had enough, and when they figure out a way to channel that discontent into political action, they’re going to come looking for the heads of those that did them the most egregious wrongs.” How to control corporate giveaways? “We need a national truce, both within states and between states,” said Amy Liu, the director of the Metropolitan Policy Program at the Brookings Institution. “There should be no more poaching of private companies with public funds.” The Atlantic highlights three ways the government could take control of corporate giveaways: First, Congress could pass a national law banning this sort of corporate bribery. Second, Congress could make corporate subsidies less valuable by threatening to tax state or local incentives as a special kind of income. Finally, the federal government could actively discourage the culture of corporate subsidies by strict law enforcement measures. In another striking move, Democratic Assemblyman Ron Kim has announced a bill to block the Amazon deal and redirect taxpayer subsidies for Jeff Bezos into reducing student debt. “Giving Jeff Bezos hundreds of millions of dollars is an immoral waste of taxpayers’ money when it’s crystal clear that the money would create more jobs and more economic growth when it is used to relieve student debt,” said Kim. Read Amazon’s official press release to know in detail about the new headquarters and Amazon’s incentives in these HQ2. Amazon addresses employees dissent regarding the company’s law enforcement policies at an all-staff meeting, in a first. Jeff Bezos: Amazon will continue to support U.S. Defense Department. Amazon increases the minimum wage of all employees in the US and UK
Read more
  • 0
  • 0
  • 22000
Modal Close icon
Modal Close icon