How-To Tutorials

02 Jan 2017

14 min read

Tools inTypeScript

02 Jan 2017

In this article by Nathan Rozentals, author of the book Mastering TypeScript, Second Edition, you will learn how to build enterprise-ready, industrial web applications using TypeScript and leading JavaScript frameworks. In this article, we will cover the following topics: What is TypeScript? The benefits of TypeScript (For more resources related to this topic, see here.) What is TypeScript? TypeScript is both a language and a set of tools to generate JavaScript. It was designed by Anders Hejlsberg at Microsoft (the designer of C#) as an open source project to help developers write enterprise scale JavaScript. TypeScript generates JavaScript – it's as simple as that. Instead of requiring a completely new runtime environment, TypeScript generated JavaScript can reuse all of the existing JavaScript tools, frameworks, and wealth of libraries that are available for JavaScript. The TypeScript language and compiler, however, brings the development of JavaScript closer to a more traditional object-oriented experience. EcmaScript JavaScript as a language has been around for a long time, and is also governed by a language feature standard. The language defined in this standard is called ECMAScript, and each JavaScript interpreter must deliver functions and features that conform to this standard. The definition of this standard helped the growth of JavaScript and the web in general, and allowed websites to render correctly on many different browsers on many different operating systems. The ECMAScript standard was published in 1999 and is known as ECMA-262, third edition. With the popularity of the language, and the explosive growth of internet applications, the ECMAScript standard needed to be revised and updated. This process resulted in a draft specification for ECMAScript, called the fourth edition. Unfortunately, this draft suggested a complete overhaul of the language, and was not well received. Eventually, leaders from Yahoo, Google, and Microsoft tabled an alternate proposal which they called ECMAScript 3.1. This proposal was numbered 3.1, as it was a smaller feature set of the third edition, and sat between edition three and four of the standard. This proposal for language changes tabled earlier was eventually adopted as the fifth edition of the standard, and was called ECMAScript 5. The ECMAScript fourth edition was never published, but it was decided to merge the best features of both the fourth edition and the 3.1 feature set into a sixth edition named ECMAScript Harmony. The TypeScript compiler has a parameter that can switch between different versions of the ECMAScript standard. TypeScript currently supports ECMAScript 3, ECMAScript 5, and ECMAScript 6. When the compiler runs over your TypeScript, it will generate compile errors if the code you are attempting to compile is not valid for that standard. The team at Microsoft has committed to follow the ECMAScript standards in any new versions of the TypeScript compiler, so as new editions are adopted, the TypeScript language and compiler will follow suit. The benefits of TypeScript To give you a flavor of the benefits of TypeScript (and this is by no means the full list), let's have a very quick look at some of the things that TypeScript brings to the table: A compilation step Strong or static typing Type definitions for popular JavaScript libraries Encapsulation Private and public member variable decorators Compiling One of the most frustrating things about JavaScript development is the lack of a compilation step. JavaScript is an interpreted language, and therefore needs to be run in order to test that it is valid. Every JavaScript developer will tell horror stories of hours spent trying to find bugs in their code, only to find that they have missed a stray closing brace { , or a simple comma ,– or even a double quote " where there should have been a single quote ' . Even worse, the real headaches arrive when you misspell a property name, or unwittingly reassign a global variable. TypeScript will compile your code, and generate compilation errors where it finds these sorts of syntax errors. This is obviously very useful, and can help to highlight errors before the JavaScript is run. In large projects, programmers will often need to do large code merges and with today's tools doing automatic merges, it is surprising how often the compiler will pick up these types of errors. While tools to do this sort of syntax checking like JSLint have been around for years, it is obviously beneficial to have these tools integrated into your IDE. Using TypeScript in a continuous integration environment will also fail a build completely when compilation errors are found, further protecting your programmers against these types of bugs. Strong typing JavaScript is not strongly typed. It is a language that is very dynamic, as it allows objects to change their properties and behavior on the fly. As an example of this, consider the following code: var test = "this is a string"; test = 1; test = function(a, b) { return a + b; } On the first line of the preceding code, the variable test is bound to a string. It is then assigned a number, and finally is redefined to be a function that expects two parameters. Traditional object-oriented languages, however, will not allow the type of a variable to change, hence they are called strongly typed languages. While all of the preceding code is valid JavaScript and could be justified, it is quite easy to see how this could cause runtime errors during execution. Imagine that you were responsible for writing a library function to add two numbers, and then another developer inadvertently reassigned your function to subtract these numbers instead. These sorts of errors may be easy to spot in a few lines of code, but it becomes increasingly difficult to find and fix these as your code base and your development team grows. Another feature of strong typing is that the IDE you are working in understands what type of variable you are working with, and can bring better autocomplete or Intellisense options to the fore. TypeScript's syntactic sugar TypeScript introduces a very simple syntax to check the type of an object at compile time. This syntax has been referred to as syntactic sugar, or more formally, type annotations. Consider the following TypeScript code: var test: string = "this is a string"; test = 1; test = function(a, b) { return a + b; } Note on the first line of this code snippet, we have introduced a colon : and a string keyword between our variable and its assignment. This type annotation syntax means that we are setting the type of our variable to be of type string, and that any code that does not adhere to these rules will generate a compile error. Running the preceding code through the TypeScript compiler will generate two errors: hello.ts(3,1): error TS2322: Type 'number' is not assignable to type 'string'. hello.ts(4,1): error TS2322: Type '(a: any, b: any) => any' is not assignable to type 'string'. The first error is fairly obvious. We have specified that the variable test is a string, and therefore attempting to assign a number to it will generate a compile error. The second error is similar to the first, and is, in essence, saying that we cannot assign a function to a string. In this way, the TypeScript compiler introduces strong or static typing to your JavaScript code, giving you all of the benefits of a strongly typed language. TypeScript is therefore described as a superset of JavaScript. Type definitions for popular JavaScript libraries As we have seen, TypeScript has the ability to annotate JavaScript, and bring strong typing to the JavaScript development experience. But how do we strongly type existing JavaScript libraries? The answer is surprisingly simple—by creating a definition file. TypeScript uses files with a .d.ts extension as a sort of header file, similar to languages such as C++, to superimpose strongly typing on existing JavaScript libraries. These definition files hold information that describes each available function, and or variables, along with their associated type annotations. Let's have a quick look at what a definition would look like. As an example, I have lifted a function from the popular Jasmine unit testing framework, called describe: var describe = function(description, specDefinitions) { return jasmine.getEnv().describe(description, specDefinitions); }; Note that this function has two parameters—description and specDefinitions. But JavaScript does not tell us what sort of variables these are. We would need to have a look at the Jasmine documentation to figure out how to call this function. If we head over to http://jasmine.github.io/2.0/introduction.html, we will see an example of how to use this function: describe("A suite", function () { it("contains spec with an expectation", function () { expect(true).toBe(true); }); }); From the documentation, then, we can easily see that the first parameter is a string, and the second parameter is a function. But there is nothing in JavaScript that forces us to conform to this API. As mentioned before, we could easily call this function with two numbers or inadvertently switch the parameters around, sending a function first, and a string second. We will obviously start getting runtime errors if we do this, but TypeScript using a definition file can generate compile-time errors before we even attempt to run this code. Let's have a look at a piece of the jasmine.d.ts definition file: declare function describe( description: string, specDefinitions: () => void ): void; This is the TypeScript definition for the describe function. Firstly, declare function describe tells us that we can use a function called describe, but that the implementation of this function will be provided at runtime. Clearly, the description parameter is strongly typed to a string, and the specDefinitions parameter is strongly typed to be a function that returns void. TypeScript uses the double braces () syntax to declare functions, and the arrow syntax to show the return type of the function. So () => void is a function that does not return anything. Finally, the describe function itself will return void. If our code were to try and pass in a function as the first parameter, and a string as the second parameter (clearly breaking the definition of this function), as shown in the following example: describe(() => { /* function body */}, "description"); TypeScript will generate the following error: hello.ts(11,11): error TS2345: Argument of type '() => void' is not assignable to parameter of type 'string'. This error is telling us that we are attempting to call the describe function with invalid parameters,andit clearly shows that TypeScript will generate errors if we attempt to use external JavaScript libraries incorrectly. DefinitelyTyped Soon after TypeScript was released, Boris Yankov started a Git Hub repository to house definition files, called Definitely Typed (http://definitelytyped.org). This repository has now become the first port of call for integrating external libraries into TypeScript, and it currently holds definitions for over 1,600 JavaScript libraries. Encapsulation One of the fundamental principles of object-oriented programming is encapsulation—the ability to define data, as well as a set of functions that can operate on that data, into a single component. Most programming languages have the concept of a class for this purpose, providing a way to define a template for data and related functions. Let's first take a look at a simple TypeScript class definition: class MyClass { add(x, y) { return x + y; } } varclassInstance = new MyClass(); var result = classInstance.add(1,2); console.log(`add(1,2) returns ${result}`); This code is pretty simple to read and understand. We have created a class, named MyClass, with a simple add function. To use this class we simply create an instance of it, and call the add function with two arguments. JavaScript, unfortunately, does not have a class statement, but instead uses functions to reproduce the functionality of classes. Encapsulation through classes is accomplished by either using the prototype pattern, or by using the closure pattern. Understanding prototypes and the closure pattern, and using them correctly, is considered a fundamental skill when writing enterprise-scale JavaScript. A closure is essentially a function that refers to independent variables. This means that variables defined within a closure function remember the environment in which they were created. This provides JavaScript with a way to define local variables, and provide encapsulation. Writing the MyClass definition in the preceding code, using a closure in JavaScript, would look something like the following: varMyClass = (function () { // the self-invoking function is the // environment that will be remembered // by the closure function MyClass() { // MyClass is the inner function, // the closure } MyClass.prototype.add = function (x, y) { return x + y; }; return MyClass; }()); varclassInstance = new MyClass(); var result = classInstance.add(1, 2); console.log("add(1,2) returns " + result); We start with a variable called MyClass, and assign it to a function that is executed immediately note the })(); syntax near the bottom of the closure definition. This syntax is a common way to write JavaScript in order to avoid leaking variables into the global namespace. We then define a new function named MyClass, and return this new function to the outer calling function. We then use the prototype keyword to inject a new function into the MyClass definition. This function is named add and takes two parameters, returning their sum. The last few lines of the code show how to use this closure in JavaScript. Create an instance of the closure type, and then execute the add function. Running this code will log add(1,2) returns 3 to the console, as expected. Looking at the JavaScript code versus the TypeScript code, we can easily see how simple the TypeScript looks compared to the equivalent JavaScript. Remember how we mentioned that JavaScript programmers can easily misplace a brace {, or a bracket (? Have a look at the last line in the closure definition—})();. Getting one of these brackets or braces wrong can take hours of debugging to find. Public and private accessors A further object oriented principle that is used in Encapsulation is the concept of data hiding that is the ability to have public and private variables. Private variables are meant to be hidden to the user of a particular class as these variables should only be used by the class itself. Inadvertently exposing these variables can easily cause runtime errors. Unfortunately, JavaScript does not have a native way of declaring variables private. While this functionality can be emulated using closures, a lot of JavaScript programmers simply use the underscore character _ to denote a private variable. At runtime though, if you know the name of a private variable you can easily assign a value to it. Consider the following JavaScript code: varMyClass = (function() { function MyClass() { this._count = 0; } MyClass.prototype.countUp = function() { this._count ++; } MyClass.prototype.getCountUp = function() { return this._count; } return MyClass; }()); var test = new MyClass(); test._count = 17; console.log("countUp : " + test.getCountUp()); The MyClass variable is actually a closure with a constructor function, a countUp function, and a getCountUp function. The variable _count is supposed to be a private member variable that is used only within the scope of the closure. Using the underscore naming convention gives the user of this class some indication that the variable is private, but JavaScript will still allow you to manipulate the variable _count. Take a look at the second last line of the code snippet. We are explicitly setting the value of _count to 17 which is allowed by JavaScript, but not desired by the original creator of the class. The output of this code would be countUp : 17. TypeScript, however, introduces the public and private keywords that can be used on class member variables. Trying to access a class member variable that has been marked as private will generate a compile time error. As an example of this, the JavaScript code above can be written in TypeScript, as follows: class CountClass { private _count: number; constructor() { this._count = 0; } countUp() { this._count ++; } getCount() { return this._count; } } varcountInstance = new CountClass() ; countInstance._count = 17; On the second line of our code snippet, we have declared a private member variable named _count. Again, we have a constructor, a countUp, and a getCount function. If we compile this file, the compiler will generate an error: hello.ts(39,15): error TS2341: Property '_count' is private and only accessible within class 'CountClass'. This error is generated because we are trying to access the private variable _count in the last line of the code. The TypeScript compiler, therefore, is helping us to adhere to public and private accessors by generating a compile error when we inadvertently break this rule. Summary In this article, we took a quick look at what TypeScript is and what benefits it can bring to the JavaScript development experience. Resources for Article: Further resources on this subject: Introducing Object Oriented Programmng with TypeScript [article] Understanding Patterns and Architecturesin TypeScript [article] Writing SOLID JavaScript code with TypeScript [article]

0
0
22189

How-To Tutorials

article-image-linux-programmers-opposed-to-new-code-of-conduct-threaten-to-pull-code-from-project

Melisha Dsouza

25 Sep 2018

6 min read

Linux programmers opposed to new Code of Conduct threaten to pull code from project

Melisha Dsouza

25 Sep 2018

6 min read

Facts of the Controversy at hand To “help make the kernel community a welcoming environment to participate in”, Linux accepted a new Code of Conduct earlier this month. This created conflict among the developer community because of the clauses in the CoC. The CoC is derived from a Contributor Covenant , created by Coraline Ada Ehmke, a software developer, an open-source advocate, and an LGBT activist. Just 30 minutes after signing this CoC, Principal kernel contributor- Linus Torvalds sent a mail apologizing for his past behavior and announced temporary break to improve upon his behavior. “This week people in our community confronted me about my lifetime of not understanding emotions. My flippant attacks in emails have been both unprofessional and uncalled for. Especially at times when I made it personal. In my quest for a better patch, this made sense to me. I know now this was not OK and I am truly sorry” This decision of taking a break is speculated among many as a precautionary measure to prevent Torvalds from violating the newly created code of conduct. Controversies took better shape after Caroline’s sarcastic tweet- Source: Twitter The new Linux Code of Conduct is causing huge conflict Linux’s move from its Code of Conflicts to a new Code of Conduct has not been received well by many of its developers. Some have threatened to pull out their blocks of code important to the project to revolt against the change. This could have serious consequences because Linux is one of the most important pieces of open source software in the world. If threats are put into action, large parts of the internet would be left vulnerable to exploits. Applications that use Linux would be like an incomplete Jenga stack that could collapse any minute. Why some Linux developers are opposed to the new code of conduct Here is a summary of developers views on the Code of Conduct that, according to them, justifies their decision: Amending to the CoC would mean good contributors are removed over trivial matters or even events that happened a long time ago–like Larry Garfield, a prominent Drupal contributor who was asked to step down after his sex fetish was made public. There is a lack of proper definitions for punishments, time frames, and what constitutes abuse or harassment, inappropriate behavior and leaves the Code of Conduct wide open for exploitation. It gives the people charged with enforcement immense power. It could force acceptance of contributions that wouldn’t make the cut if made by cis white males. Developers are concerned that Linux will start accepting patches from anyone and everyone just to keep par with the Code of Conduct. It will no longer depend on the ability of a person to code, but on social parameters like race, color, sex, and gender. Why some developers believe in the new Linux Code of Conduct On the other side of the argument, here are some potential reasons why the CoC will foster social justice: Encouraging an inclusive and safe space for women, LGBTQIA+, and People of Color, who in the absence of the CoC are excluded, harassed, and sometimes even raped by cis white males. The CoC aims to overcome meritocracy, which in many organizations has consistently shown itself to mainly benefit those with privilege, to the exclusion of underrepresented people in technology VA vastmajority of Linux contributors are cis white males. CC’s Code of Conduct would enable the building of a more diverse demographic array of contributors What does this mean to the developer community? Linux includes programmers who are always free to contribute to its open source platform. Contributing good code would help them climb up the ladder and become a ‘maintainer’. The greatest strength of Linux was its flexibility. Developers would contribute to the kernel and be concerned about only a single entity- their code patch. The Linux community would judge the code based on its quality. However, with the new Code of Conduct, critics say this could make passing judgement on code more challenging, For them, the Code of Conduct is a set of rules that expects everyone to be at equal levels in the community. It could mean that certain patches are accepted for fear of contravening the Code of Conduct. Here is what Caroline Ada Ehmke was forthright in her criticism of this view: Source: Twitter Clearly, many of the fears of the Code of Conduct’s critics haven’t yet come to pass. What they’re ultimately worried about is that there could be negative consequences. Google Developer Ted Ts’o next on the hit list Earlier this week, activist Sage Sharp, tweeted about Ted Ts'o: Source: Twitter This perhaps needs some context - the beginning of this argument dates all the way back to 2011 when Ts’o when was a member of the Linux Foundation's technical advisory board-participated in a discussion on the mailing list for the Australian national Linux conference that year, making comments that were later interpreted by Aurora as rape apologism. Using Aurora's piece as a fuse, Google employee Matthew Garrett slammed Ts'o on his beliefs. In 2017, yielding to the demands of SJWs, Google threw out James Damore, an engineer who circulated an internal creed about reverse discrimination in hiring practices. The SJW’s are coming for him and the best way to go forward would be to “take a break”, just like Linus did. As claimed by Caroline, the underlying aim of the CoC was to guide people in behaving in a respectable way and create a positive environment for people irrespective of their race, ethnicity, religion, nationality and political views. However, overlooking this aim, developers are concerned with the loopholes in the CoC. Gain more insights on this news as well as views from members of the Linux community at itfloss. Linux drops Code of Conflict and adopts new Code of Conduct Linus Torvalds is sorry for his ‘hurtful behavior’, is taking ‘a break (from the Linux community) to get help’ NSA researchers present security improvements for Zephyr and Fucshia at Linux Security Summit 2018

0
0
22176

article-image-10-machine-learning-tools-to-look-out-for-in-2018

Amey Varangaonkar

26 Dec 2017

7 min read

10 Machine Learning Tools to watch in 2018

Amey Varangaonkar

26 Dec 2017

7 min read

0
0
22150

article-image-experts-discuss-dark-patterns-and-deceptive-ui-designs-what-are-they-what-do-they-do-how-do-we-stop-them

Sugandha Lahoti

29 Jun 2019

12 min read

Experts discuss Dark Patterns and deceptive UI designs: What are they? What do they do? How do we stop them?

Sugandha Lahoti

29 Jun 2019

12 min read

Dark patterns are often used online to deceive users into taking actions they would otherwise not take under effective, informed consent. Dark patterns are generally used by shopping websites, social media platforms, mobile apps and services as a part of their user interface design choices. Dark patterns can lead to financial loss, tricking users into giving up vast amounts of personal data, or inducing compulsive and addictive behavior in adults and children. Using dark patterns is unambiguously unlawful in the United States (under Section 5 of the Federal Trade Commission Act and similar state laws), the European Union (under the Unfair Commercial Practices Directive and similar member state laws), and numerous other jurisdictions. Earlier this week, at the Russell Senate Office Building, a panel of experts met to discuss the implications of Dark patterns in the session, Deceptive Design and Dark Patterns: What are they? What do they do? How do we stop them? The session included remarks from Senator. Mark Warner and Deb Fischer, sponsors of the DETOUR Act, and a panel of experts including Tristan Harris (Co-Founder and Executive Director, Center for Humane Technology). The entire panel of experts included: Tristan Harris (Co-Founder and Executive Director, Center for Humane Technology) Rana Foroohar (Global Business Columnist and Associate Editor, Financial Times) Amina Fazlullah (Policy Counsel, Common Sense Media) Paul Ohm (Professor of Law and Associate Dean, Georgetown Law School), also the moderator Katie McInnis (Policy Counsel, Consumer Reports) Marshall Erwin (Senior Director of Trust & Security, Mozilla) Arunesh Mathur (Dept. of Computer Science, Princeton University) Dark patterns are growing in social media platforms, video games, shopping websites, and are increasingly used to target children The expert session was inaugurated by Arunesh Mathur (Dept. of Computer Science, Princeton University) who talked about his new study by researchers from Princeton University and the University of Chicago. The study suggests that shopping websites are abundant with dark patterns that rely on consumer deception. The researchers conducted a large-scale study, analyzing almost 53K product pages from 11K shopping websites to characterize and quantify the prevalence of dark patterns. They so discovered 1,841 instances of dark patterns on shopping websites, which together represent 15 types of dark patterns. One of the dark patterns was Sneak into Website, which adds additional products to users’ shopping carts without their consent. For example, you would buy a bouquet on a website and the website without your consent would add a greeting card in the hopes that you will actually purchase it. Katie McInnis agreed and added that Dark patterns not only undermine the choices that are available to users on social media and shopping platforms but they can also cost users money. User interfaces are sometimes designed to push a user away from protecting their privacy, making it tough to evaluate them. Amina Fazlullah, Policy Counsel, Common Sense Media said that dark patterns are also being used to target children. Manipulative apps use design techniques to shame or confuse children into in-app purchases or trying to keep them on the app for longer. Children mostly are unable to discern these manipulative techniques. Sometimes the screen will have icons or buttons that will appear to be a part of game play and children will click on them not realizing that they're either being asked to make a purchase or being shown an ad or being directed onto another site. There are games which ask for payments or microtransactions to continue the game forward. Mozilla uses product transparency to curb Dark patterns Marshall Erwin, Senior Director of Trust & Security at Mozilla talked about the negative effects of dark patterns and how they make their own products at Mozilla more transparent. They have a set of checks and principles in place to avoid dark patterns. No surprises: If users were to figure out or start to understand exactly what is happening with the browser, it should be consistent with their expectations. If the users are surprised, this means browsers need to make a change either by stopping the activity entirely or creating additional transparency that helps people understand. Anti-tracking technology: Cross-site tracking is one of the most pervasive and pernicious dark patterns across the web today that is enabled by cookies. Browsers should take action to decrease the attack surface in the browser and actively protect people from those patterns online. Mozilla and Apple have introduced anti tracking technology to actively intervene to protect people from the diverse parties that are probably not trustworthy. Detour Act by Senators Warner and Fisher In April, Warner and Fischer had introduced the Deceptive Experiences To Online Users Reduction (DETOUR) Act, a bipartisan legislation to prohibit large online platforms from using dark patterns to trick consumers into handing over their personal data. This act focuses on the activities of large online service providers (over a hundred million users visiting in a given month). Under this act you cannot use practices that trick users into obtaining information or consenting. You will experience new controls about conducting ‘psychological experiments on your users’ and you will no longer be able to target children under 13 with the goal of hooking them into your service. It extends additional rulemaking and enforcement abilities to the Federal Trade Commission. “Protecting users personal data and user autonomy online are truly bipartisan issues”: Senator Mark Warner In his presentation, Warner talked about how 2019 is the year when we need to recognize dark patterns and their ongoing manipulation of American consumers. While we've all celebrated the benefits that communities have brought from social media, there is also an enormous dark underbelly, he says. It is important that Congress steps up and we play a role as senators such that Americans and their private data is not misused or manipulated going forward. Protecting users personal data and user autonomy online are truly bipartisan issues. This is not a liberal versus conservative, it's much more a future versus past and how we get this future right in a way that takes advantage of social media tools but also put some of the appropriate constraints in place. He says that the driving notion behind the Detour act is that users should have the choice and autonomy when it comes to their personal data. When a company like Facebook asks you to upload your phone contacts or some other highly valuable data to their platform, you ought to have a simple choice yes or no. Companies that run experiments on you without your consent are coercive and Detour act aims to put appropriate protections in place that defend user's ability to make informed choices. In addition to prohibiting large online platforms from using dark patterns to trick consumers into handing over their personal data, the bill would also require informed consent for behavior experimentation. In the process, the bill will be sending a clear message to the platform companies and the FTC that they are now in the business of preserving user's autonomy when it comes to the use of their personal data. The goal, Warner says, is simple - to bring some transparency to what remains a very opaque market and give consumers the tools they need to make informed choices about how and when to share their personal information. “Curbing the use of dark patterns will be foundational to increasing trust online” : Senator Deb Fischer Fischer argued that tech companies are increasingly tailoring users’ online experiences in ways that are more granular. On one hand, she says, you get a more personalized user experience and platforms are more responsive, however it's this variability that allows companies to take that design just a step too far. Companies are constantly competing for users attention and this increases the motivation for a more intrusive and invasive user design. The ability for online platforms to guide the visual interfaces that billions of people view is an incredible influence. It forces us to assess the impact of design on user privacy and well-being. Fundamentally the detour act would prohibit large online platforms from purposely using deceptive user interfaces - dark patterns. The detour act would provide a better accountability system for improved transparency and autonomy online. The legislation would take an important step to restore the hidden options. It would give users a tool to get out of the maze that coaxes you to just click on ‘I agree’. A privacy framework that involves consent cannot function properly if it doesn't ensure the user interface presents fair and transparent options. The detour act would enable the creation of a professional standards body which can register with the Federal Trade Commission. This would serve as a self regulatory body to develop best practices for UI design with the FTC as a backup. She adds, “We need clarity for the enforcement of dark patterns that don't directly involve our wallets. We need policies that place value on user choice and personal data online. We need a stronger mechanism to protect the public interest when the goal for tech companies is to make people engage more and more. User consent remains weakened by the presence of dark patterns and unethical design. Curbing the use of dark patterns will be foundational to increasing trust online. The detour act does provide a key step in getting there.” “The DETOUR act is calling attention to asymmetry and preventing deceptive asymmetry”: Tristan Harris Tristan says that companies are now competing not on manipulating your immediate behavior but manipulating and predicting the future. For example, Facebook has something called loyalty prediction which allows them to sell to an advertiser the ability to predict when you're going to become disloyal to a brand. It can sell that opportunity to another advertiser before probably you know you're going to switch. The DETOUR act is a huge step in the right direction because it's about calling attention to asymmetry and preventing deceptive asymmetry. We need a new relationship for this asymmetric power by having a duty of care. It’s about treating asymmetrically powerful technologies to be in the service of the systems that they are supposed to protect. He says, we need to switch to a regenerative energy economy that actually treats attention as sacred and not directly tying profit to user extraction. Top questions raised by the panel and online viewers Does A/B testing result in dark patterns? Dark patterns are often a result of A/B testing right where a designer may try things that lead to better engagement or maybe nudge users in a way where the company benefits. However, A/B testing isn't the problem, it’s the intention of how A/B testing is being used. Companies and other organizations should have an oversight on the different experiments that they are conducting to see if A/B testing is actually leading to some kind of concrete harm. The challenge in the space is drawing a line about A/B testing features and optimizing for engagement and decreasing friction. Are consumers smart enough to tackle dark patterns on their own or do we need a legislation? It's well established that for children whose brains are just developing, they're unable to discern these types of deceptive techniques so especially for kids, these types of practices should be banned. For vulnerable families who are juggling all sorts of concerns around income and access to jobs and transportation and health care, putting this on their plate as well is just unreasonable. Dark patterns are deployed for an array of opaque reasons the average user will never recognize. From a consumer perspective, going through and identifying dark pattern techniques--that these platform companies have spent hundreds of thousands of dollars developing to be as opaque and as tricky as possible--is an unrealistic expectation put on consumers. This is why the DETOUR act and this type of regulation are absolutely necessary and the only way forward. What is it about the largest online providers that make us want to focus on them first or only? Is it their scale or do they have more powerful dark patterns? Is it because they're just harming more people or is it politics? Sometimes larger companies stay wary of indulging in dark patterns because they have a greater risk in terms of getting caught and the PR backlash. However, they do engage in manipulative practices and that warrants a lot of attention. Moreover, targeting bigger companies is just one part of a more comprehensive privacy enforcement environment. Hitting companies that have a large number of users is also great for consumer engagement. Obviously there is a need to target more broadly but this is a starting point. If Facebook were to suddenly reclass itself and its advertising business model, would you still trust them? No, the leadership that's in charge now for Facebook can not be trusted, especially the organizational cultures that have been building. There are change efforts going on inside of Google and Facebook right now but it’s getting gridlocked. Even if employees want to see policies being changed, they still have bonus structures and employee culture to keep in mind. We recommend you to go through the full hearing here. You can read more about the Detour Act here. U.S. senators introduce a bipartisan bill that bans social media platforms from using ‘dark patterns’ to trick its users. How social media enabled and amplified the Christchurch terrorist attack A new study reveals how shopping websites use ‘dark patterns’ to deceive you into buying things you may not want

0
0
22147

article-image-basics-image-histograms-opencv

Packt

12 Oct 2016

11 min read

Basics of Image Histograms in OpenCV

Packt

12 Oct 2016

11 min read

In this article by Samyak Datta, author of the book Learning OpenCV 3 Application Development we are going to focus our attention on a different style of processing pixel values. The output of the techniques, which would comprise our study in the current article, will not be images, but other forms of representation for images, namely image histograms. We have seen that a two-dimensional grid of intensity values is one of the default forms of representing images in digital systems for processing as well as storage. However, such representations are not at all easy to scale. So, for an image with a reasonably low spatial resolution, say 512 x 512 pixels, working with a two-dimensional grid might not pose any serious issues. However, as the dimensions increase, the corresponding increase in the size of the grid may start to adversely affect the performance of the algorithms that work with the images. A primary advantage that an image histogram has to offer is that the size of a histogram is a constant that is independent of the dimensions of the image. As a consequence of this, we are guaranteed that irrespective of the spatial resolution of the images that we are dealing with, the algorithms that power our solutions will have to deal with a constant amount of data if they are working with image histograms. (For more resources related to this topic, see here.) Each descriptor captures some particular aspects or features of the image to construct its own form of representation. One of the common pitfalls of using histograms as a form of image representation as compared to its native form of using the entire two-dimensional grid of values is loss of information. A full-fledged image representation using pixel intensity values for all pixel locations naturally consists of all the information that you would need to reconstruct a digital image. However, the same cannot be said about histograms. When we study about image histograms in detail, we'll get to see exactly what information do we stand to lose. And this loss in information is prevalent across all forms of image descriptors. The basics of histograms At the outset, we will briefly explain the concept of a histogram. Most of you might already know this from your lessons on basic statistics. However, we will reiterate this for the sake of completeness. Histogram is a form of data representation technique that relies on an aggregation of data points. The data is aggregated into a set of predefined bins that are represented along the x axis, and the number of data points that fall within each of the bins make up the corresponding counts on the y axis. For example, let's assume that our data looks something like the following: D={2,7,1,5,6,9,14,11,8,10,13} If we define three bins, namely Bin_1 (1 - 5), Bin_2 (6 - 10), and Bin_3 (11 - 15), then the histogram corresponding to our data would look something like this: Bins Frequency Bin_1 (1 - 5) 3 Bin_2 (6 - 10) 5 Bin_3 (11 - 15) 3 What this histogram data tells us is that we have three values between 1 and 5, five between 6 and 10, and three again between 11 and 15. Note that it doesn't tell us what the values are, just that some n values exist in a given bin. A more familiar visual representation of the histogram in discussion is shown as follows: As you can see, the bins have been plotted along the x axis and their corresponding frequencies along the y axis. Now, in the context of images, how is a histogram computed? Well, it's not that difficult to deduce. Since the data that we have comprise pixel intensity values, an image histogram is computed by plotting a histogram using the intensity values of all its constituent pixels. What this essentially means is that the sequence of pixel intensity values in our image becomes the data. Well, this is in fact the simplest kind of histogram that you can compute using the information available to you from the image. Now, coming back to image histograms, there are some basic terminologies (pertaining to histograms in general) that you need to be aware of before you can dip your hands into code. We have explained them in detail here: Histogram size: The histogram size refers to the number of bins in the histogram. Range: The range of a histogram is the range of data that we are dealing with. The range of data as well as the histogram size are both important parameters that define a histogram. Dimensions: Simply put, dimensions refer to the number of the type of items whose values we aggregate in the histogram bins. For example, consider a grayscale image. We might want to construct a histogram using the pixel intensity values for such an image. This would be an example of a single-dimensional histogram because we are just interested in aggregating the pixel intensity values and nothing else. The data, in this case, is spread over a range of 0 to 255. On account of being one-dimensional, such histograms can be represented graphically as 2D plots—one-dimensional data (pixel intensity values) being plotted on the x axis (in the form of bins) along with the corresponding frequency counts along the y axis. We have already seen an example of this before. Now, imagine a color image with three channels: red, green, and blue. Let's say that we want to plot a histogram for the intensities in the red and green channels combined. This means that our data now becomes a pair of values (r, g). A histogram that is plotted for such data will have a dimensionality of 2. The plot for such a histogram will be a 3D plot with the data bins covering the x and y axes and the frequency counts plotted along the z axis. Now that we have discussed the theoretical aspects of image histograms in detail, let's start thinking along the lines of code. We will start with the simplest (and in fact the most ubiquitous) design of image histograms. The range of our data will be from 0 to 255 (both inclusive), which means that all our data points will be integers that fall within the specified range. Also, the number of data points will equal the number of pixels that make up our input image. The simplicity in design comes from the fact that we fix the size of the histogram (the number of bins) as 256. Now, take a moment to think about what this means. There are 256 different possible values that our data points can take and we have a separate bin corresponding to each one of those values. So such an image histogram will essentially depict the 256 possible intensity values along with the counts of the number of pixels in the image that are colored with each of the different intensities. Before taking a peek at what OpenCV has to offer, let's try to implement such a histogram on our own! We define a function named computeHistogram() that takes the grayscale image as an input argument and returns the image histogram. From our earlier discussions, it is evident that the histogram must contain 256 entries (for the 256 bins): one for each integer between 0 and 255. The value stored in the histogram corresponding to each of the 256 entries will be the count of the image pixels that have a particular intensity value. So, conceptually, we can use an array for our implementation such that the value stored in the histogram [ i ] (for 0≤i≤255) will be the count of the number of pixels in the image having the intensity of i. However, instead of using a C++ array, we will comply with the rules and standards followed by OpenCV and represent the histogram as a Mat object. We have already seen that a Mat object is nothing but a multidimensional array store. The implementation is outlined in the following code snippet: Mat computeHistogram(Mat input_image) { Mat histogram = Mat::zeros(256, 1, CV_32S); for (int i = 0; i < input_image.rows; ++i) { for (int j = 0; j < input_image.cols; ++j) { int binIdx = (int) input_image.at<uchar>(i, j); histogram.at<int>(binIdx, 0) += 1; } } return histogram; } As you can see, we have chosen to represent the histogram as a 256-element-column-vector Mat object. We iterate over all the pixels in the input image and keep on incrementing the corresponding counts in the histogram (which had been initialized to 0). As per our description of the image histogram properties, it is easy to see that the intensity value of any pixel is the same as the bin index that is used to index into the appropriate histogram bin to increment the count. Having such an implementation ready, let's test it out with the help of an actual image. The following code demonstrates a main() function that reads an input image, calls the computeHistogram() function that we have defined just now, and displays the contents of the histogram that is returned as a result: int main() { Mat input_image = imread("/home/samyak/Pictures/lena.jpg", IMREAD_GRAYSCALE); Mat histogram = computeHistogram(input_image); cout << "Histogram...n"; for (int i = 0; i < histogram.rows; ++i) cout << i << " : " << histogram.at<int>(i, 0) << "n"; return 0; } We have used the fact that the histogram that is returned from the function will be a single column Mat object. This makes the code that displays the contents of the histogram much cleaner. Histograms in OpenCV We have just seen the implementation of a very basic and minimalistic histogram using the first principles in OpenCV. The image histogram was basic in the sense that all the bins were uniform in size and comprised only a single pixel intensity. This made our lives simple when we designed our code for the implementation; there wasn't any need to explicitly check the membership of a data point (the intensity value of a pixel) with all the bins of our histograms. However, we know that a histogram can have bins whose sizes span more than one. Can you think of the changes that we might need to make in the code that we had written just now to accommodate for bin sizes larger than 1? If this change seems doable to you, try to figure out how to incorporate the possibility of non-uniform bin sizes or multidimensional histograms. By now, things might have started to get a little overwhelming to you. No need to worry. As always, OpenCV has you covered! The developers at OpenCV have provided you with a calcHist() function whose sole purpose is to calculate the histograms for a given set of arrays. By arrays, we refer to the images represented as Mat objects, and we use the term set because the function has the capability to compute multidimensional histograms from the given data: Mat computeHistogram(Mat input_image) { Mat histogram; int channels[] = { 0 }; int histSize[] = { 256 }; float range[] = { 0, 256 }; const float* ranges[] = { range }; calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges, true, false); return histogram; } Before we move on to an explanation of the different parameters involved in the calcHist() function call, I want to bring your attention to the abundant use of arrays in the preceding code snippet. Even arguments as simple as histogram sizes are passed to the function in the form of arrays rather than integer values, which at first glance seem quite unnecessary and counter-intuitive. The usage of arrays is due to the fact that the implementation of calcHist() is equipped to handle multidimensional histograms as well, and when we are dealing with such multidimensional histogram data, we require multiple parameters to be passed, one for each dimension. This would become clearer once we demonstrate an example of calculating multidimensional histograms using the calcHist() function. For the time being, we just wanted to clear the immediate confusion that might have popped up in your minds upon seeing the array parameters. Here is a detailed list of the arguments in the calcHist() function call: Source images Number of source images Channel indices Mask Dimensions (dims) Histogram size Ranges Uniform flag Accumulate flag The last couple of arguments (the uniform and accumulate flags) have default values of true and false, respectively. Hence, the function call that you have seen just now can very well be written as follows: calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges); Summary Thus in this article we have successfully studied fundamentals of using histograms in OpenCV for image processing. Resources for Article: Further resources on this subject: Remote Sensing and Histogram [article] OpenCV: Image Processing using Morphological Filters [article] Learn computer vision applications in Open CV [article]

0
0
22139

Packt

11 Aug 2015

20 min read

Task Execution with Asio

Packt

11 Aug 2015

20 min read

In this article by Arindam Mukherjee, the author of Learning Boost C++ Libraries, we learch how to execute a task using Boost Asio (pronounced ay-see-oh), a portable library for performing efficient network I/O using a consistent programming model. At its core, Boost Asio provides a task execution framework that you can use to perform operations of any kind. You create your tasks as function objects and post them to a task queue maintained by Boost Asio. You enlist one or more threads to pick these tasks (function objects) and invoke them. The threads keep picking up tasks, one after the other till the task queues are empty at which point the threads do not block but exit. (For more resources related to this topic, see here.) IO Service, queues, and handlers At the heart of Asio is the type boost::asio::io_service. A program uses the io_service interface to perform network I/O and manage tasks. Any program that wants to use the Asio library creates at least one instance of io_service and sometimes more than one. In this section, we will explore the task management capabilities of io_service. Here is the IO Service in action using the obligatory "hello world" example: Listing 11.1: Asio Hello World 1 #include <boost/asio.hpp> 2 #include <iostream> 3 namespace asio = boost::asio; 4 5 int main() { 6 asio::io_service service; 7 8 service.post( 9 [] { 10 std::cout << "Hello, world!" << 'n'; 11 }); 12 13 std::cout << "Greetings: n"; 14 service.run(); 15 } We include the convenience header boost/asio.hpp, which includes most of the Asio library that we need for the examples in this aritcle (line 1). All parts of the Asio library are under the namespace boost::asio, so we use a shorter alias for this (line 3). The program itself just prints Hello, world! on the console but it does so through a task. The program first creates an instance of io_service (line 6) and posts a function object to it, using the post member function of io_service. The function object, in this case defined using a lambda expression, is referred to as a handler. The call to post adds the handler to a queue inside io_service; some thread (including that which posted the handler) must dispatch them, that is, remove them off the queue and call them. The call to the run member function of io_service (line 14) does precisely this. It loops through the handlers in the queue inside io_service, removing and calling each handler. In fact, we can post more handlers to the io_service before calling run, and it would call all the posted handlers. If we did not call run, none of the handlers would be dispatched. The run function blocks until all the handlers in the queue have been dispatched and returns only when the queue is empty. By itself, a handler may be thought of as an independent, packaged task, and Boost Asio provides a great mechanism for dispatching arbitrary tasks as handlers. Note that handlers must be nullary function objects, that is, they should take no arguments. Asio is a header-only library by default, but programs using Asio need to link at least with boost_system. On Linux, we can use the following command line to build this example: $ g++ -g listing11_1.cpp -o listing11_1 -lboost_system -std=c++11 Running this program prints the following: Greetings: Hello, World! Note that Greetings: is printed from the main function (line 13) before the call to run (line 14). The call to run ends up dispatching the sole handler in the queue, which prints Hello, World!. It is also possible for multiple threads to call run on the same I/O object and dispatch handlers concurrently. We will see how this can be useful in the next section. Handler states – run_one, poll, and poll_one While the run function blocks until there are no more handlers in the queue, there are other member functions of io_service that let you process handlers with greater flexibility. But before we look at this function, we need to distinguish between pending and ready handlers. The handlers we posted to the io_service were all ready to run immediately and were invoked as soon as their turn came on the queue. In general, handlers are associated with background tasks that run in the underlying OS, for example, network I/O tasks. Such handlers are meant to be invoked only once the associated task is completed, which is why in such contexts, they are called completion handlers. These handlers are said to be pending until the associated task is awaiting completion, and once the associated task completes, they are said to be ready. The poll member function, unlike run, dispatches all the ready handlers but does not wait for any pending handler to become ready. Thus, it returns immediately if there are no ready handlers, even if there are pending handlers. The poll_one member function dispatches exactly one ready handler if there be one, but does not block waiting for pending handlers to get ready. The run_one member function blocks on a nonempty queue waiting for a handler to become ready. It returns when called on an empty queue, and otherwise, as soon as it finds and dispatches a ready handler. post versus dispatch A call to the post member function adds a handler to the task queue and returns immediately. A later call to run is responsible for dispatching the handler. There is another member function called dispatch that can be used to request the io_service to invoke a handler immediately if possible. If dispatch is invoked in a thread that has already called one of run, poll, run_one, or poll_one, then the handler will be invoked immediately. If no such thread is available, dispatch adds the handler to the queue and returns just like post would. In the following example, we invoke dispatch from the main function and from within another handler: Listing 11.2: post versus dispatch 1 #include <boost/asio.hpp> 2 #include <iostream> 3 namespace asio = boost::asio; 4 5 int main() { 6 asio::io_service service; 7 // Hello Handler – dispatch behaves like post 8 service.dispatch([]() { std::cout << "Hellon"; }); 9 10 service.post( 11 [&service] { // English Handler 12 std::cout << "Hello, world!n"; 13 service.dispatch([] { // Spanish Handler, immediate 14 std::cout << "Hola, mundo!n"; 15 }); 16 }); 17 // German Handler 18 service.post([&service] {std::cout << "Hallo, Welt!n"; }); 19 service.run(); 20 } Running this code produces the following output: Hello Hello, world! Hola, mundo! Hallo, Welt! The first call to dispatch (line 8) adds a handler to the queue without invoking it because run was yet to be called on io_service. We call this the Hello Handler, as it prints Hello. This is followed by the two calls to post (lines 10, 18), which add two more handlers. The first of these two handlers prints Hello, world! (line 12), and in turn, calls dispatch (line 13) to add another handler that prints the Spanish greeting, Hola, mundo! (line 14). The second of these handlers prints the German greeting, Hallo, Welt (line 18). For our convenience, let's just call them the English, Spanish, and German handlers. This creates the following entries in the queue: Hello Handler English Handler German Handler Now, when we call run on the io_service (line 19), the Hello Handler is dispatched first and prints Hello. This is followed by the English Handler, which prints Hello, World! and calls dispatch on the io_service, passing the Spanish Handler. Since this executes in the context of a thread that has already called run, the call to dispatch invokes the Spanish Handler, which prints Hola, mundo!. Following this, the German Handler is dispatched printing Hallo, Welt! before run returns. What if the English Handler called post instead of dispatch (line 13)? In that case, the Spanish Handler would not be invoked immediately but would queue up after the German Handler. The German greeting Hallo, Welt! would precede the Spanish greeting Hola, mundo!. The output would look like this: Hello Hello, world! Hallo, Welt! Hola, mundo! Concurrent execution via thread pools The io_service object is thread-safe and multiple threads can call run on it concurrently. If there are multiple handlers in the queue, they can be processed concurrently by such threads. In effect, the set of threads that call run on a given io_service form a thread pool. Successive handlers can be processed by different threads in the pool. Which thread dispatches a given handler is indeterminate, so the handler code should not make any such assumptions. In the following example, we post a bunch of handlers to the io_service and then start four threads, which all call run on it: Listing 11.3: Simple thread pools 1 #include <boost/asio.hpp> 2 #include <boost/thread.hpp> 3 #include <boost/date_time.hpp> 4 #include <iostream> 5 namespace asio = boost::asio; 6 7 #define PRINT_ARGS(msg) do { 8 boost::lock_guard<boost::mutex> lg(mtx); 9 std::cout << '[' << boost::this_thread::get_id() 10 << "] " << msg << std::endl; 11 } while (0) 12 13 int main() { 14 asio::io_service service; 15 boost::mutex mtx; 16 17 for (int i = 0; i < 20; ++i) { 18 service.post([i, &mtx]() { 19 PRINT_ARGS("Handler[" << i << "]"); 20 boost::this_thread::sleep( 21 boost::posix_time::seconds(1)); 22 }); 23 } 24 25 boost::thread_group pool; 26 for (int i = 0; i < 4; ++i) { 27 pool.create_thread([&service]() { service.run(); }); 28 } 29 30 pool.join_all(); 31 } We post twenty handlers in a loop (line 18). Each handler prints its identifier (line 19), and then sleeps for a second (lines 19-20). To run the handlers, we create a group of four threads, each of which calls run on the io_service (line 21) and wait for all the threads to finish (line 24). We define the macro PRINT_ARGS which writes output to the console in a thread-safe way, tagged with the current thread ID (line 7-10). We will use this macro in other examples too. To build this example, you must also link against libboost_thread, libboost_date_time, and in Posix environments, with libpthread too: $ g++ -g listing9_3.cpp -o listing9_3 -lboost_system -lboost_thread -lboost_date_time -pthread -std=c++11 One particular run of this program on my laptop produced the following output (with some lines snipped): [b5c15b40] Handler[0] [b6416b40] Handler[1] [b6c17b40] Handler[2] [b7418b40] Handler[3] [b5c15b40] Handler[4] [b6416b40] Handler[5] … [b6c17b40] Handler[13] [b7418b40] Handler[14] [b6416b40] Handler[15] [b5c15b40] Handler[16] [b6c17b40] Handler[17] [b7418b40] Handler[18] [b6416b40] Handler[19] You can see that the different handlers are executed by different threads (each thread ID marked differently). If any of the handlers threw an exception, it would be propagated across the call to the run function on the thread that was executing the handler. io_service::work Sometimes, it is useful to keep the thread pool started, even when there are no handlers to dispatch. Neither run nor run_one blocks on an empty queue. So in order for them to block waiting for a task, we have to indicate, in some way, that there is outstanding work to be performed. We do this by creating an instance of io_service::work, as shown in the following example: Listing 11.4: Using io_service::work to keep threads engaged 1 #include <boost/asio.hpp> 2 #include <memory> 3 #include <boost/thread.hpp> 4 #include <iostream> 5 namespace asio = boost::asio; 6 7 typedef std::unique_ptr<asio::io_service::work> work_ptr; 8 9 #define PRINT_ARGS(msg) do { … ... 14 15 int main() { 16 asio::io_service service; 17 // keep the workers occupied 18 work_ptr work(new asio::io_service::work(service)); 19 boost::mutex mtx; 20 21 // set up the worker threads in a thread group 22 boost::thread_group workers; 23 for (int i = 0; i < 3; ++i) { 24 workers.create_thread([&service, &mtx]() { 25 PRINT_ARGS("Starting worker thread "); 26 service.run(); 27 PRINT_ARGS("Worker thread done"); 28 }); 29 } 30 31 // Post work 32 for (int i = 0; i < 20; ++i) { 33 service.post( 34 [&service, &mtx]() { 35 PRINT_ARGS("Hello, world!"); 36 service.post([&mtx]() { 37 PRINT_ARGS("Hola, mundo!"); 38 }); 39 }); 40 } 41 42 work.reset(); // destroy work object: signals end of work 43 workers.join_all(); // wait for all worker threads to finish 44 } In this example, we create an object of io_service::work wrapped in a unique_ptr (line 18). We associate it with an io_service object by passing to the work constructor a reference to the io_service object. Note that, unlike listing 11.3, we create the worker threads first (lines 24-27) and then post the handlers (lines 33-39). However, the worker threads stay put waiting for the handlers because of the calls to run block (line 26). This happens because of the io_service::work object we created, which indicates that there is outstanding work in the io_service queue. As a result, even after all handlers are dispatched, the threads do not exit. By calling reset on the unique_ptr, wrapping the work object, its destructor is called, which notifies the io_service that all outstanding work is complete (line 42). The calls to run in the threads return and the program exits once all the threads are joined (line 43). We wrapped the work object in a unique_ptr to destroy it in an exception-safe way at a suitable point in the program. We omitted the definition of PRINT_ARGS here, refer to listing 11.3. Serialized and ordered execution via strands Thread pools allow handlers to be run concurrently. This means that handlers that access shared resources need to synchronize access to these resources. We already saw examples of this in listings 11.3 and 11.4, when we synchronized access to std::cout, which is a global object. As an alternative to writing synchronization code in handlers, which can make the handler code more complex, we can use strands. Think of a strand as a subsequence of the task queue with the constraint that no two handlers from the same strand ever run concurrently. The scheduling of other handlers in the queue, which are not in the strand, is not affected by the strand in any way. Let us look at an example of using strands: Listing 11.5: Using strands 1 #include <boost/asio.hpp> 2 #include <boost/thread.hpp> 3 #include <boost/date_time.hpp> 4 #include <cstdlib> 5 #include <iostream> 6 #include <ctime> 7 namespace asio = boost::asio; 8 #define PRINT_ARGS(msg) do { ... 13 14 int main() { 15 std::srand(std::time(0)); 16 asio::io_service service; 17 asio::io_service::strand strand(service); 18 boost::mutex mtx; 19 size_t regular = 0, on_strand = 0; 20 21 auto workFuncStrand = [&mtx, &on_strand] { 22 ++on_strand; 23 PRINT_ARGS(on_strand << ". Hello, from strand!"); 24 boost::this_thread::sleep( 25 boost::posix_time::seconds(2)); 26 }; 27 28 auto workFunc = [&mtx, &regular] { 29 PRINT_ARGS(++regular << ". Hello, world!"); 30 boost::this_thread::sleep( 31 boost::posix_time::seconds(2)); 32 }; 33 // Post work 34 for (int i = 0; i < 15; ++i) { 35 if (rand() % 2 == 0) { 36 service.post(strand.wrap(workFuncStrand)); 37 } else { 38 service.post(workFunc); 39 } 40 } 41 42 // set up the worker threads in a thread group 43 boost::thread_group workers; 44 for (int i = 0; i < 3; ++i) { 45 workers.create_thread([&service, &mtx]() { 46 PRINT_ARGS("Starting worker thread "); 47 service.run(); 48 PRINT_ARGS("Worker thread done"); 49 }); 50 } 51 52 workers.join_all(); // wait for all worker threads to finish 53 } In this example, we create two handler functions: workFuncStrand (line 21) and workFunc (line 28). The lambda workFuncStrand captures a counter on_strand, increments it, and prints a message Hello, from strand!, prefixed with the value of the counter. The function workFunc captures another counter regular, increments it, and prints Hello, World!, prefixed with the counter. Both pause for 2 seconds before returning. To define and use a strand, we first create an object of io_service::strand associated with the io_service instance (line 17). Thereafter, we post all handlers that we want to be part of that strand by wrapping them using the wrap member function of the strand (line 36). Alternatively, we can post the handlers to the strand directly by using either the post or the dispatch member function of the strand, as shown in the following snippet: 33 for (int i = 0; i < 15; ++i) { 34 if (rand() % 2 == 0) { 35 strand.post(workFuncStrand); 37 } else { ... The wrap member function of strand returns a function object, which in turn calls dispatch on the strand to invoke the original handler. Initially, it is this function object rather than our original handler that is added to the queue. When duly dispatched, this invokes the original handler. There are no constraints on the order in which these wrapper handlers are dispatched, and therefore, the actual order in which the original handlers are invoked can be different from the order in which they were wrapped and posted. On the other hand, calling post or dispatch directly on the strand avoids an intermediate handler. Directly posting to a strand also guarantees that the handlers will be dispatched in the same order that they were posted, achieving a deterministic ordering of the handlers in the strand. The dispatch member of strand blocks until the handler is dispatched. The post member simply adds it to the strand and returns. Note that workFuncStrand increments on_strand without synchronization (line 22), while workFunc increments the counter regular within the PRINT_ARGS macro (line 29), which ensures that the increment happens in a critical section. The workFuncStrand handlers are posted to a strand and therefore are guaranteed to be serialized; hence no need for explicit synchronization. On the flip side, entire functions are serialized via strands and synchronizing smaller blocks of code is not possible. There is no serialization between the handlers running on the strand and other handlers; therefore, the access to global objects, like std::cout, must still be synchronized. The following is a sample output of running the preceding code: [b73b6b40] Starting worker thread [b73b6b40] 0. Hello, world from strand! [b6bb5b40] Starting worker thread [b6bb5b40] 1. Hello, world! [b63b4b40] Starting worker thread [b63b4b40] 2. Hello, world! [b73b6b40] 3. Hello, world from strand! [b6bb5b40] 5. Hello, world! [b63b4b40] 6. Hello, world! … [b6bb5b40] 14. Hello, world! [b63b4b40] 4. Hello, world from strand! [b63b4b40] 8. Hello, world from strand! [b63b4b40] 10. Hello, world from strand! [b63b4b40] 13. Hello, world from strand! [b6bb5b40] Worker thread done [b73b6b40] Worker thread done [b63b4b40] Worker thread done There were three distinct threads in the pool and the handlers from the strand were picked up by two of these three threads: initially, by thread ID b73b6b40, and later on, by thread ID b63b4b40. This also dispels a frequent misunderstanding that all handlers in a strand are dispatched by the same thread, which is clearly not the case. Different handlers in the same strand may be dispatched by different threads but will never run concurrently. Summary Asio is a well-designed library that can be used to write fast, nimble network servers that utilize the most optimal mechanisms for asynchronous I/O available on a system. It is an evolving library and is the basis for a Technical Specification that proposes to add a networking library to a future revision of the C++ Standard. In this article, we learned how to use the Boost Asio library as a task queue manager and leverage Asio's TCP and UDP interfaces to write programs that communicate over the network. Resources for Article: Further resources on this subject: Animation features in Unity 5 [article] Exploring and Interacting with Materials using Blueprints [article] A Simple Pathfinding Algorithm for a Maze [article]

0
1
22134

Packt

30 Mar 2016

15 min read

ALM – Developers and QA

Packt

30 Mar 2016

15 min read

0
0
22117

Merlyn Shelley

02 Apr 2024

10 min read

Databricks' DBRX, Stability AI's Stable Code Instruct 3B, SambaNova's Samba CoE v0.2, FrugalGPT, Advanced RAG Patterns on Amazon SageMaker

Merlyn Shelley

02 Apr 2024

10 min read

Subscribe to our Data Pro newsletter for the latest insights. Don't miss out – sign up today!👋 Hello,Welcome to DataPro#87 – Your Gateway to the Cutting-Edge of Data Science & Machine Learning! 🚀 Dive into this edition to explore: ⚙️ LLMs & GPTs Unleashed Samba CoE v0.2: SambaNova's Speedy AI Models Efficient Training of Language Models with OpenAI AI21's Revolutionary SSM-Transformer Model: Jamba Databricks' DBRX: The New Open LLM Benchmark Stable Code Instruct 3B: Stability AI's Latest Offering HyperLLaVA: Boosting Multimodal Language Models ✨ What's Fresh & Exciting FrugalGPT: Cutting LLM Operating Costs Building a Reliable AI Agent from Scratch with OpenAI Tool Calling Fine-Tuning Instruct Models over Raw Text Data Crafting an OpenAI-Compatible API ⚡ Industry Pulse: Deciphering Advanced RAG Patterns on Amazon SageMaker Unveil the Future with AutoBNN: Mastering Probabilistic Time Series Forecasting! Engaging with Microsoft Copilot (web): Learning from Interaction 📚 Packt's Latest Gem "Principles of Data Science - Third Edition" by Sinan Ozdemir DataPro Newsletter is not just a publication; it’s a comprehensive toolkit for anyone serious about mastering the ever-changing landscape of data and AI. Grab your copy and start transforming your data expertise today! 📥 Feedback on the Weekly EditionTake our weekly survey and get a free PDF copy of our best-selling book, "Interactive Data Visualization with Python - Second Edition."We appreciate your input and hope you enjoy the book!Share your Feedback!Cheers,Merlyn ShelleyEditor-in-Chief, PacktSign Up | Advertise | Archives🔰 GitHub Finds: Any of These Repos in Your Toolbox?🛠️ Zejun-Yang/AniPortrait: AniPortrait is a new framework for creating high-quality animations using audio input and a reference portrait image, with face reenactment capabilities.🛠️ agiresearch/AIOS: AIOS embeds large language models into operating systems, enabling smarter resource allocation, context switching, and concurrent agent execution, advancing AGI. 🛠️ lichao-sun/Mora: Mora is a multi-agent framework for video generation, enhancing OpenAI's Sora capabilities through collaborative visual agents for diverse tasks. 🛠️ jasonppy/VoiceCraft: VoiceCraft is a high-performing neural codec language model for speech editing and zero-shot text-to-speech, excelling with diverse real-world data. 🛠️ dvlab-research/MiniGemini: Mini-Gemini enhances LLMs (Large Language Models) from 2B to 34B, integrating image understanding, reasoning, and generation, inspired by LLaVA. 🛠️ Picsart-AI-Research/StreamingT2V: StreamingT2V is a technique for creating long videos with rich motion dynamics, ensuring temporal consistency and high image quality. 📚 Expert Insights from Packt Community"Principles of Data Science - Third Edition" by Sinan Ozdemir. The Five Steps of Data Science A question I’ve gotten at least once a month for the past decade is What’s the difference between data science and data analytics? One could argue that there is no difference between the two; others will argue that there are hundreds of differences! I believe that, regardless of how many differences there are between the two terms, the following applies: Data science follows a structured, step-by-step process that, when followed, preserves the integrity of the results and leads to a deeper understanding of the data and the environment the data comes from. As with any other scientific endeavor, this process must be adhered to, or else the analysis and the results are in danger of scrutiny. On a simpler level, following a strict process can make it much easier for any data scientist, hobbyist, or professional to obtain results faster than if they were exploring data with no clear vision. While these steps are a guiding lesson for amateur analysts, they also provide the foundation for all data scientists, even those in the highest levels of business and academia. Every data scientist recognizes the value of these steps and follows them in some way or another. Overview of the five steps The process of data science involves a series of steps that are essential for effectively extracting insights and knowledge from data. These steps are presented as follows: Asking an interesting question: The first step in any data science project is to identify a question or challenge that you want to address with your analysis. This involves finding a topic that is relevant, important, and that can be addressed with data. Obtaining the data: Once you have identified your question, the next step is to collect the data that you will need to answer it. This can involve sourcing data from a variety of sources, such as databases, online platforms, or through data scraping or data collection methods. Exploring the data: After you have collected your data, the next step is to explore it and get a better understanding of its characteristics and patterns. This might involve examining summary statistics, visualizing the data, or applying statistical or machine learning (ML) techniques to identify trends or relationships. Modeling the data: Once you have explored your data, the next step is to build models that can be used to make predictions or inform decision-making. This might involve applying ML algorithms, building statistical models, or using other techniques to find patterns in the data. Communicating and visualizing the results: Finally, it’s important to communicate your findings to others in a clear and effective way. This might involve creating reports, presentations, or visualizations that help to explain your results and their implications. By following these five essential steps, you can effectively use data science to solve real-world problems and extract valuable insights from data. It’s important to note that different data scientists may have different approaches to the data science process, and the steps outlined previously are just one way of organizing the process. Some data scientists might group the steps differently or include additional steps such as feature engineering or model evaluation. Despite these differences, most data scientists agree that the steps listed previously are essential to the data science process. Whether they are organized in this specific way or not, these steps are all crucial for effectively using data to solve problems and extract valuable insights. Let’s dive into these steps one by one.Discover more insights from "Principles of Data Science - Third Edition" by Sinan Ozdemir. Unlock access to the full book and a wealth of other titles with a 7-day free trial in the Packt Library. Start exploring today! Read Here!⚡ Tech Tidbits: Stay Wired to the Latest Industry Buzz! AWS ML Made Easy 🌀 Advanced RAG patterns on Amazon SageMaker: This post discusses how customers across various industries are utilizing large language models (LLMs) like Mixtral-8x7B Instruct to build generative AI applications such as QnA chatbots and search engines. It highlights the challenges and solutions in improving the accuracy and performance of these applications, focusing on Retrieval Augmented Generation (RAG) patterns implemented with LangChain.Google Research 🌀 AutoBNN: Probabilistic time series forecasting with compositional bayesian neural networks. This research introduces AutoBNN, an open-source package for automated, interpretable time series forecasting using Bayesian neural networks (BNNs). It addresses limitations of traditional methods like Gaussian processes (GPs) and Structural Time Series by combining the interpretability of GPs with the scalability and flexibility of neural networks. AutoBNN automates model discovery, provides high-quality uncertainty estimates, and scales effectively for large datasets. Microsoft Research🌀 Learning from interaction with Microsoft Copilot (web): This research focuses on how AI systems like Bing and Microsoft Copilot learn and improve from user interactions, particularly through reinforcement learning from human feedback (RLHF). It also explores how Bing has evolved its search capabilities and how Copilot is changing user interactions to be more conversational and workflow oriented. The research introduces frameworks like TnT-LLM and SPUR to improve taxonomy generation and user satisfaction estimation in AI interactions. Email Forwarded? Join DataPro Here!🔍 From Bits to BERT: Keeping Up with LLMs & GPTs 🌀 Samba CoE v0.2 from SambaNova delivers accurate AI models at blazing speeds: This blog post highlights Samba's advancements in AI architecture, specifically focusing on the introduction of Samba-1, a CoE architecture for enterprise AI. It discusses the features and benefits of Samba-1, its performance benchmarks, and plans for future releases, emphasizing the role of RDUs in driving efficiency and speed in AI models. 🌀 OpenAI’s Efficient Training of Language Models to Fill in the Middle: OpenAI demonstrates that autoregressive language models can effectively learn to infill text by moving a span of text from the middle of a document to its end, without harming generative capability. They propose training models with this method by default and provide benchmarks and best practices. 🌀 Jamba: AI21's Groundbreaking SSM-Transformer Model. Jamba is a groundbreaking model that merges Mamba SSM with Transformer elements, offering a 256K context window and outperforming similar models. Released under Apache 2.0, it will be available in the NVIDIA API catalog. Jamba optimizes memory, throughput, and performance, delivering remarkable efficiency. 🌀 Databricks’ DBRX: A New State-of-the-Art Open LLM. Databricks introduces DBRX, an open LLM setting new benchmarks in language understanding, programming, and math. With a 256K context window, it outperforms GPT-3.5 and competes with Gemini 1.0 Pro. DBRX is 40% smaller than Grok-1, offering 2x faster inference than LLaMA2-70B. 🌀 Introducing Stable Code Instruct 3B — Stability AI: Stable Code Instruct 3B, built on Stable Code 3B, offers state-of-the-art performance in code completion and natural language interactions for programming tasks. It outperforms Codellama 7B Instruct and matches StarChat 15B, with a focus on popular languages like Python and Java. Available for commercial use with a Stability AI Membership, the model is accessible on Hugging Face. 🌀 HyperLLaVA: Enhancing Multimodal Language Models with Dynamic Visual and Language Experts. This blog explores the advancements in Multimodal Large Language Models (MLLMs) and introduces HyperLLaVA, a dynamic model that improves performance by adaptively tuning parameters for handling diverse multimodal tasks, surpassing existing benchmarks and opening new avenues for multimodal learning systems. ✨ On the Radar: Catch Up on What's Fresh🌀 FrugalGPT and Reducing LLM Operating Costs: The blog discusses the high cost of running Large Language Models (LLMs) and introduces the "FrugalGPT" framework, which reduces operating costs significantly while maintaining quality. It explains how different models cost different amounts and proposes using a cascade of LLMs to minimize costs while maximizing answer quality. 🌀 Leverage OpenAI Tool calling: Building a reliable AI Agent from Scratch. The blog discusses the future role of AI in everyday tasks, focusing on text creation, correction, and brainstorming. It highlights the importance of Retrieval-Augmented Generation (RAG) pipelines and aims to provide Large Language Models with better context to generate more valuable content. 🌀 Fine-tune an Instruct model over raw text data: The blog explores the challenges of integrating modern chatbots with large datasets, focusing on context window sizes and the use of Retrieval-Augmented Generation (RAG) techniques. It proposes a lighter approach to fine-tuning chatbots on smaller datasets, aiming to bridge the gap between the constraints of a 128K context window and the complexities of models fine-tuned on billions of tokens. The experiment involves fine-tuning a model on The Guardian's dataset and aims to provide reproducible instructions for cost-effective model training using accessible hardware. 🌀 How to build an OpenAI-compatible API: The blog discusses the dominance of OpenAI in the Gen AI market, and the reasons developers might choose alternative LLM providers. It explores implementing a Python FastAPI server compatible with the OpenAI API specs to wrap any LLM, aiming for flexibility and cost-effectiveness. See you next time!

0
0
22112

Packt

25 Aug 2014

18 min read

Camera Calibration

Packt

25 Aug 2014

18 min read

This article by Robert Laganière, author of OpenCV Computer Vision Application Programming Cookbook Second Edition, includes that images are generally produced using a digital camera, which captures a scene by projecting light going through its lens onto an image sensor. The fact that an image is formed by the projection of a 3D scene onto a 2D plane implies the existence of important relationships between a scene and its image and between different images of the same scene. Projective geometry is the tool that is used to describe and characterize, in mathematical terms, the process of image formation. In this article, we will introduce you to some of the fundamental projective relations that exist in multiview imagery and explain how these can be used in computer vision programming. You will learn how matching can be made more accurate through the use of projective constraints and how a mosaic from multiple images can be composited using two-view relations. Before we start the recipe, let's explore the basic concepts related to scene projection and image formation. (For more resources related to this topic, see here.) Image formation Fundamentally, the process used to produce images has not changed since the beginning of photography. The light coming from an observed scene is captured by a camera through a frontal aperture; the captured light rays hit an image plane (or an image sensor) located at the back of the camera. Additionally, a lens is used to concentrate the rays coming from the different scene elements. This process is illustrated by the following figure: Here, do is the distance from the lens to the observed object, di is the distance from the lens to the image plane, and f is the focal length of the lens. These quantities are related by the so-called thin lens equation: In computer vision, this camera model can be simplified in a number of ways. First, we can neglect the effect of the lens by considering that we have a camera with an infinitesimal aperture since, in theory, this does not change the image appearance. (However, by doing so, we ignore the focusing effect by creating an image with an infinite depth of field.) In this case, therefore, only the central ray is considered. Second, since most of the time we have do>>di, we can assume that the image plane is located at the focal distance. Finally, we can note from the geometry of the system that the image on the plane is inverted. We can obtain an identical but upright image by simply positioning the image plane in front of the lens. Obviously, this is not physically feasible, but from a mathematical point of view, this is completely equivalent. This simplified model is often referred to as the pin-hole camera model, and it is represented as follows: From this model, and using the law of similar triangles, we can easily derive the basic projective equation that relates a pictured object with its image: The size (hi) of the image of an object (of height ho) is therefore inversely proportional to its distance (do) from the camera, which is naturally true. In general, this relation describes where a 3D scene point will be projected on the image plane given the geometry of the camera. Calibrating a camera From the introduction of this article, we learned that the essential parameters of a camera under the pin-hole model are its focal length and the size of the image plane (which defines the field of view of the camera). Also, since we are dealing with digital images, the number of pixels on the image plane (its resolution) is another important characteristic of a camera. Finally, in order to be able to compute the position of an image's scene point in pixel coordinates, we need one additional piece of information. Considering the line coming from the focal point that is orthogonal to the image plane, we need to know at which pixel position this line pierces the image plane. This point is called the principal point. It might be logical to assume that this principal point is at the center of the image plane, but in practice, this point might be off by a few pixels depending on the precision at which the camera has been manufactured. Camera calibration is the process by which the different camera parameters are obtained. One can obviously use the specifications provided by the camera manufacturer, but for some tasks, such as 3D reconstruction, these specifications are not accurate enough. Camera calibration will proceed by showing known patterns to the camera and analyzing the obtained images. An optimization process will then determine the optimal parameter values that explain the observations. This is a complex process that has been made easy by the availability of OpenCV calibration functions. How to do it... To calibrate a camera, the idea is to show it a set of scene points for which their 3D positions are known. Then, you need to observe where these points project on the image. With the knowledge of a sufficient number of 3D points and associated 2D image points, the exact camera parameters can be inferred from the projective equation. Obviously, for accurate results, we need to observe as many points as possible. One way to achieve this would be to take one picture of a scene with many known 3D points, but in practice, this is rarely feasible. A more convenient way is to take several images of a set of some 3D points from different viewpoints. This approach is simpler but requires you to compute the position of each camera view in addition to the computation of the internal camera parameters, which fortunately is feasible. OpenCV proposes that you use a chessboard pattern to generate the set of 3D scene points required for calibration. This pattern creates points at the corners of each square, and since this pattern is flat, we can freely assume that the board is located at Z=0, with the X and Y axes well-aligned with the grid. In this case, the calibration process simply consists of showing the chessboard pattern to the camera from different viewpoints. Here is one example of a 6x4 calibration pattern image: The good thing is that OpenCV has a function that automatically detects the corners of this chessboard pattern. You simply provide an image and the size of the chessboard used (the number of horizontal and vertical inner corner points). The function will return the position of these chessboard corners on the image. If the function fails to find the pattern, then it simply returns false: // output vectors of image points std::vector<cv::Point2f> imageCorners; // number of inner corners on the chessboard cv::Size boardSize(6,4); // Get the chessboard corners bool found = cv::findChessboardCorners(image, boardSize, imageCorners); The output parameter, imageCorners, will simply contain the pixel coordinates of the detected inner corners of the shown pattern. Note that this function accepts additional parameters if you needs to tune the algorithm, which are not discussed here. There is also a special function that draws the detected corners on the chessboard image, with lines connecting them in a sequence: //Draw the corners cv::drawChessboardCorners(image, boardSize, imageCorners, found); // corners have been found The following image is obtained: The lines that connect the points show the order in which the points are listed in the vector of detected image points. To perform a calibration, we now need to specify the corresponding 3D points. You can specify these points in the units of your choice (for example, in centimeters or in inches); however, the simplest is to assume that each square represents one unit. In that case, the coordinates of the first point would be (0,0,0) (assuming that the board is located at a depth of Z=0), the coordinates of the second point would be (1,0,0), and so on, the last point being located at (5,3,0). There are a total of 24 points in this pattern, which is too small to obtain an accurate calibration. To get more points, you need to show more images of the same calibration pattern from various points of view. To do so, you can either move the pattern in front of the camera or move the camera around the board; from a mathematical point of view, this is completely equivalent. The OpenCV calibration function assumes that the reference frame is fixed on the calibration pattern and will calculate the rotation and translation of the camera with respect to the reference frame. Let's now encapsulate the calibration process in a CameraCalibrator class. The attributes of this class are as follows: class CameraCalibrator { // input points: // the points in world coordinates std::vector<std::vector<cv::Point3f>> objectPoints; // the point positions in pixels std::vector<std::vector<cv::Point2f>> imagePoints; // output Matrices cv::Mat cameraMatrix; cv::Mat distCoeffs; // flag to specify how calibration is done int flag; Note that the input vectors of the scene and image points are in fact made of std::vector of point instances; each vector element is a vector of the points from one view. Here, we decided to add the calibration points by specifying a vector of the chessboard image filename as input: // Open chessboard images and extract corner points int CameraCalibrator::addChessboardPoints( const std::vector<std::string>& filelist, cv::Size & boardSize) { // the points on the chessboard std::vector<cv::Point2f> imageCorners; std::vector<cv::Point3f> objectCorners; // 3D Scene Points: // Initialize the chessboard corners // in the chessboard reference frame // The corners are at 3D location (X,Y,Z)= (i,j,0) for (int i=0; i<boardSize.height; i++) { for (int j=0; j<boardSize.width; j++) { objectCorners.push_back(cv::Point3f(i, j, 0.0f)); } } // 2D Image points: cv::Mat image; // to contain chessboard image int successes = 0; // for all viewpoints for (int i=0; i<filelist.size(); i++) { // Open the image image = cv::imread(filelist[i],0); // Get the chessboard corners bool found = cv::findChessboardCorners( image, boardSize, imageCorners); // Get subpixel accuracy on the corners cv::cornerSubPix(image, imageCorners, cv::Size(5,5), cv::Size(-1,-1), cv::TermCriteria(cv::TermCriteria::MAX_ITER + cv::TermCriteria::EPS, 30, // max number of iterations 0.1)); // min accuracy //If we have a good board, add it to our data if (imageCorners.size() == boardSize.area()) { // Add image and scene points from one view addPoints(imageCorners, objectCorners); successes++; } } return successes; } The first loop inputs the 3D coordinates of the chessboard, and the corresponding image points are the ones provided by the cv::findChessboardCorners function. This is done for all the available viewpoints. Moreover, in order to obtain a more accurate image point location, the cv::cornerSubPix function can be used, and as the name suggests, the image points will then be localized at a subpixel accuracy. The termination criterion that is specified by the cv::TermCriteria object defines the maximum number of iterations and the minimum accuracy in subpixel coordinates. The first of these two conditions that is reached will stop the corner refinement process. When a set of chessboard corners have been successfully detected, these points are added to our vectors of the image and scene points using our addPoints method. Once a sufficient number of chessboard images have been processed (and consequently, a large number of 3D scene point / 2D image point correspondences are available), we can initiate the computation of the calibration parameters as follows: // Calibrate the camera // returns the re-projection error double CameraCalibrator::calibrate(cv::Size &imageSize) { //Output rotations and translations std::vector<cv::Mat> rvecs, tvecs; // start calibration return calibrateCamera(objectPoints, // the 3D points imagePoints, // the image points imageSize, // image size cameraMatrix, // output camera matrix distCoeffs, // output distortion matrix rvecs, tvecs, // Rs, Ts flag); // set options } In practice, 10 to 20 chessboard images are sufficient, but these must be taken from different viewpoints at different depths. The two important outputs of this function are the camera matrix and the distortion parameters. These will be described in the next section. How it works... In order to explain the result of the calibration, we need to go back to the figure in the introduction, which describes the pin-hole camera model. More specifically, we want to demonstrate the relationship between a point in 3D at the position (X,Y,Z) and its image (x,y) on a camera specified in pixel coordinates. Let's redraw this figure by adding a reference frame that we position at the center of the projection as seen here: Note that the y axis is pointing downward to get a coordinate system compatible with the usual convention that places the image origin at the upper-left corner. We learned previously that the point (X,Y,Z) will be projected onto the image plane at (fX/Z,fY/Z). Now, if we want to translate this coordinate into pixels, we need to divide the 2D image position by the pixel's width (px) and height (py), respectively. Note that by dividing the focal length given in world units (generally given in millimeters) by px, we obtain the focal length expressed in (horizontal) pixels. Let's then define this term as fx. Similarly, fy =f/py is defined as the focal length expressed in vertical pixel units. Therefore, the complete projective equation is as follows: Recall that (u0,v0) is the principal point that is added to the result in order to move the origin to the upper-left corner of the image. These equations can be rewritten in the matrix form through the introduction of homogeneous coordinates, in which 2D points are represented by 3-vectors and 3D points are represented by 4-vectors (the extra coordinate is simply an arbitrary scale factor, S, that needs to be removed when a 2D coordinate needs to be extracted from a homogeneous 3-vector). Here is the rewritten projective equation: The second matrix is a simple projection matrix. The first matrix includes all of the camera parameters, which are called the intrinsic parameters of the camera. This 3x3 matrix is one of the output matrices returned by the cv::calibrateCamera function. There is also a function called cv::calibrationMatrixValues that returns the value of the intrinsic parameters given by a calibration matrix. More generally, when the reference frame is not at the projection center of the camera, we will need to add a rotation vector (a 3x3 matrix) and a translation vector (a 3x1 matrix). These two matrices describe the rigid transformation that must be applied to the 3D points in order to bring them back to the camera reference frame. Therefore, we can rewrite the projection equation in its most general form: Remember that in our calibration example, the reference frame was placed on the chessboard. Therefore, there is a rigid transformation (made of a rotation component represented by the matrix entries r1 to r9 and a translation represented by t1, t2, and t3) that must be computed for each view. These are in the output parameter list of the cv::calibrateCamera function. The rotation and translation components are often called the extrinsic parameters of the calibration, and they are different for each view. The intrinsic parameters remain constant for a given camera/lens system. The intrinsic parameters of our test camera obtained from a calibration based on 20 chessboard images are fx=167, fy=178, u0=156, and v0=119. These results are obtained by cv::calibrateCamera through an optimization process aimed at finding the intrinsic and extrinsic parameters that will minimize the difference between the predicted image point position, as computed from the projection of the 3D scene points, and the actual image point position, as observed on the image. The sum of this difference for all the points specified during the calibration is called the re-projection error. Let's now turn our attention to the distortion parameters. So far, we have mentioned that under the pin-hole camera model, we can neglect the effect of the lens. However, this is only possible if the lens that is used to capture an image does not introduce important optical distortions. Unfortunately, this is not the case with lower quality lenses or with lenses that have a very short focal length. You may have already noted that the chessboard pattern shown in the image that we used for our example is clearly distorted—the edges of the rectangular board are curved in the image. Also, note that this distortion becomes more important as we move away from the center of the image. This is a typical distortion observed with a fish-eye lens, and it is called radial distortion. The lenses used in common digital cameras usually do not exhibit such a high degree of distortion, but in the case of the lens used here, these distortions certainly cannot be ignored. It is possible to compensate for these deformations by introducing an appropriate distortion model. The idea is to represent the distortions induced by a lens by a set of mathematical equations. Once established, these equations can then be reverted in order to undo the distortions visible on the image. Fortunately, the exact parameters of the transformation that will correct the distortions can be obtained together with the other camera parameters during the calibration phase. Once this is done, any image from the newly calibrated camera will be undistorted. Therefore, we have added an additional method to our calibration class: // remove distortion in an image (after calibration) cv::Mat CameraCalibrator::remap(const cv::Mat &image) { cv::Mat undistorted; if (mustInitUndistort) { // called once per calibration cv::initUndistortRectifyMap( cameraMatrix, // computed camera matrix distCoeffs, // computed distortion matrix cv::Mat(), // optional rectification (none) cv::Mat(), // camera matrix to generate undistorted image.size(), // size of undistorted CV_32FC1, // type of output map map1, map2); // the x and y mapping functions mustInitUndistort= false; } // Apply mapping functions cv::remap(image, undistorted, map1, map2, cv::INTER_LINEAR); // interpolation type return undistorted; } Running this code results in the following image: As you can see, once the image is undistorted, we obtain a regular perspective image. To correct the distortion, OpenCV uses a polynomial function that is applied to the image points in order to move them at their undistorted position. By default, five coefficients are used; a model made of eight coefficients is also available. Once these coefficients are obtained, it is possible to compute two cv::Mat mapping functions (one for the x coordinate and one for the y coordinate) that will give the new undistorted position of an image point on a distorted image. This is computed by the cv::initUndistortRectifyMap function, and the cv::remap function remaps all the points of an input image to a new image. Note that because of the nonlinear transformation, some pixels of the input image now fall outside the boundary of the output image. You can expand the size of the output image to compensate for this loss of pixels, but you will now obtain output pixels that have no values in the input image (they will then be displayed as black pixels). There's more... More options are available when it comes to camera calibration. Calibration with known intrinsic parameters When a good estimate of the camera's intrinsic parameters is known, it could be advantageous to input them in the cv::calibrateCamera function. They will then be used as initial values in the optimization process. To do so, you just need to add the CV_CALIB_USE_INTRINSIC_GUESS flag and input these values in the calibration matrix parameter. It is also possible to impose a fixed value for the principal point (CV_CALIB_FIX_PRINCIPAL_POINT), which can often be assumed to be the central pixel. You can also impose a fixed ratio for the focal lengths fx and fy (CV_CALIB_FIX_RATIO); in which case, you assume the pixels of the square shape. Using a grid of circles for calibration Instead of the usual chessboard pattern, OpenCV also offers the possibility to calibrate a camera by using a grid of circles. In this case, the centers of the circles are used as calibration points. The corresponding function is very similar to the function we used to locate the chessboard corners: cv::Size boardSize(7,7); std::vector<cv::Point2f> centers; bool found = cv:: findCirclesGrid( image, boardSize, centers); See also The A flexible new technique for camera calibration article by Z. Zhang in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no 11, 2000, is a classic paper on the problem of camera calibration Summary In this article, we explored the projective relations that exist between two images of the same scene. Resources for Article: Further resources on this subject: Creating an Application from Scratch [Article] Wrapping OpenCV [Article] New functionality in OpenCV 3.0 [Article]

0
0
22104

article-image-integrating-angular-2-react

Mary Gualtieri

01 Sep 2016

5 min read

How to integrate Angular 2 with React

Mary Gualtieri

01 Sep 2016

5 min read

It can be overwhelming to choose which framework is the best framework to use to get the job done in JavaScript, because you have so many options out there. In previous years, we have seen two popular frameworks come to fame: React and Angular. React has gained a lot of popularity because it is what Facebook and Instagram are built on. So, which one do you use? If you ask most JavaScript developers, they advise you to use one or the other, and that comes down to personal choice. Let's go ahead and explore Angular and React, and actually explore how the two can work together A common misconception about React is that React is a full JavaScript framework in competition with Angular. But it actually is not. React is a user interface library and is just the view in an 'MVC' framework, with a little bit of a controller in it. In other words, React is a template (an Angular term for view) where you can add some controller logic.It is the same idea of integrating the jQuery UI into JavaScript. React aids in Angular, which is a frontend framework, and makes it more efficient for the user. This is because you can write a reusable component that can be plugged into an application. Angular has many pros, and that's what makes it so appealing to a lot of developers and companies. With my personal experience, Angular has been very powerful in making solid applications. However, one of the cons that bothers me about Angular is how it goes about executing a template. I always want to practice writing DRY code and that can be a problem with Angular. You can end up writing a bunch of HTML that can be complicated and difficult to read. But you can also end up causing a spaghetti effect in your CSS. One of Angular's big strengths is the watchers and the binders. When executed correctly in well-thought-out places, it can be great for fast binding and good responsiveness. But like every human, we all make errors, and when misused, you can have performance issues when binding way too many elements in your HTML. This leads to a very slow and lagged application that no one wants to use. But there is a way to rectify this. You can use React to aid in Angular's downfalls.React was designed to work really well with other libraries and makes rendering views or templates much faster. The beauty of React is how it uses a more efficient algorithm on the virtual DOM. In plain terms, it allows you to change parts of the application that need to be updated without having to touch the rest of the application. You can send a command to update the user interface and React compares these changes to the existing DOM. Instead of theorizing on how Angular and React complement one another, let's see how it looks in code. First, let's create an HMTL file that has an Angular script and a React script. *Remember, your relative path may be different from the example. Next, let's create a React component that renders a string that is inputted by the user. What is happening here is that we are using React to render our model. We created a component that renders the props passed to it. Then we create an Angular directive and controller to start the app. The directive is calling the React component and telling it to render. But, let's take a look at another example that integrates Angular 2. We can demonstrate how a React component can be self-contained but still be injected into the Angular 2 world. Angular 2 has an optional hook, onInit(), that takes advantage of triggering the code to render a React component. As you can see, the host component has defined implementation for the onInit handler, where you can call the static initialize function. If you noticed, the initialize method is passing a title text that is passed down from the React component as a prop. *React Component Another item to consider that unites React and Angular is TypeScript. This is a big deal because we can manage both Angular code and React code in the same compilation step. When it comes down to it, TypeScript is what gets stripped down to regular JavaScript. One thing that you have to remember to do is tell the compiler that you are using JSX by specifying the JSX flag. To conclude, Angular will always remain a very popular framework for developers. For an application to render faster, React is a great way to render templates faster and create a more efficient user experience. React is a great compliment to Angular and will only enhance your application. About the author Mary Gualtieri is a full-stack web developer and web designer who enjoys all aspects of the web and creating a pleasant user experience. Web development, specifically frontend development, is an interest of hers because it challenges her to think outside of the box and solveproblems, all while constantly learning. She can be found on GitHub at MaryGualtieri.

0
2
22094

How-To Tutorials

article-image-how-to-publish-a-microservice-as-a-service-onto-a-docker

Pravin Dhandre

12 Apr 2018

6 min read

How to publish Microservice as a service onto a Docker

Pravin Dhandre

12 Apr 2018

6 min read

In today’s tutorial, we will show a step-by-step approach to publishing a prebuilt microservice onto a docker so you can scale and control it further. Once the swarm is ready, we can use it to create services that we can use for scaling, but first, we will create a shared registry in our swarm. Then, we will build a microservice and publish it into a Docker. Creating a registry When we create a swarm's service, we specify an image that will be used, but when we ask Docker to create the instances of the service, it will use the swarm master node to do it. If we have built our Docker images in our machine, they are not available on the master node of the swarm, so it will create a registry service that we can use to publish our images and reference when we create our own services. First, let's create the registry service: docker service create --name registry --publish 5000:5000 registry Now, if we list our services, we should see our registry created: docker service ls ID NAME MODE REPLICAS IMAGE PORTS os5j0iw1p4q1 registry replicated 1/1 registry:latest *:5000->5000/tcp And if we visit http://localhost:5000/v2/_catalog, we should just get: {"repositories":[]} This Docker registry is ephemeral, meaning that if we stop the service or start it again, our images will disappear. For that reason, we recommend to use it only for development. For a real registry, you may want to use an external service. We discussed some of them in Chapter 7, Creating Dockers. Creating a microservice In order to create a microservice, we will use Spring Initializr, as we have been doing in previous chapters. We can start by visiting the URL: https://start.spring.io/: Creating a project in Spring Initializr We have chosen to create a Maven Project using Kotlin and Spring Boot 2.0.0 M7. We chose the Group to be com.Microservices and the Artifact to be chapter08. For Dependencies, we have set Web. Now, we can click Generate Project to download it as a zip file. After we unzip it, we can open it with IntelliJ IDEA to start working on our project. After a few minutes, our project will be ready and we can open the Maven window to see the different lifecycle phases and Maven plugins and their goals. We covered how to use Spring Initializr, Maven, and IntelliJ IDEA in Chapter 2, Getting Started with Spring Boot 2.0. You can check out this chapter to learn about topics not covered in this section. Now, we will add RestController to our project, in IntelliJ IDEA, in the Project window. Right-click our com.Microservices.chapter08 package and then choose New | Kotlin File/Class. In the pop-up window, we will set its name as HelloController and in the Kind drop-down, choose Class. Let's modify our newly created controller: package com.microservices.chapter08 import org.springframework.web.bind.annotation.GetMapping import org.springframework.web.bind.annotation.RestController import java.util.* import java.util.concurrent.atomic.AtomicInteger @RestController class HelloController { private val id: String = UUID.randomUUID().toString() companion object { val total: AtomicInteger = AtomicInteger() } @GetMapping("/hello") fun hello() = "Hello I'm $id and I have been called ${total.incrementAndGet()} time(s)" } This small controller will display a message with a GET request to the /hello URL. The message will display a unique id that we establish when the object is created and it will show how many times it has been called. We have used AtomicInteger to guarantee that no other requests are modified on our number concurrently. Finally, we will rename application.properties in the resources folder, using Shift + F6 to get application.yml, and edit it: logging.level.org.springframework.web: DEBUG We will establish that we have the log in the debug level for the Sprint Web Framework, so for the latter, we could view the information that will be needed in our logs. Now, we can run our microservice, either using IntelliJ IDEA Maven Project window with the springboot:run target, or by executing it on the command line: mvnw spring-boot:run Either way, when we visit the http://localhost:8080/hello URL, we should see something like: Hello I'm a0193477-b9dd-4a50-85cc-9d9405e02299 and I have been called 1 time(s) Doing consecutive calls will increase the number. Finally, we can see at the end of our log that we have several requests. The following should appear: Hello I'm a0193477-b9dd-4a50-85cc-9d9405e02299 and I have been called 1 time(s) Now, we will stop the microservice to create our Docker. Creating our Docker Now that our microservice is running, we need to create a Docker. To simplify the process, we will just use Dockerfile. To create this file on the top of the Project window, rightclick chapter08 and then select New | File and type Dockerfile in the drop-down menu. In the next window, click OK and the file will be created. Now, we can add this to our Dockerfile: FROM openjdk:8-jdk-alpine ADD target/*.jar app.jar ENTRYPOINT ["java","-Djava.security.egd=file:/dev/./urandom","-jar", "App.jar"] In this case, we tell the JVM to use dev/urandom instead of dev/random to generate random numbers required by our ID generation and this will make the operation much faster. You may think that urandom is less secure than random, but read this article for more information: https://www.2uo.de/myths-about-urandom/. From our microservice directory, we will use the command line to create our package: mvnw package Then, we will create a Docker for it: docker build . -t hello Now, we need to tag our image in our shared registry service, using the following command: docker tag hello localhost:5000/hello Then, we will push our image to the shared registry service: docker push localhost:5000/hello Creating the service Finally, we will add the microservice as a service to our swarm, exposing the 8080 port: docker service create --name hello-service --publish 8080:8080 localhost:5000/hello Now, we can navigate to http://localhost:8888/application/default to get the same results as before, but now we run our microservice in a Docker swarm. If you found this tutorial useful, do check out the book Hands-On Microservices with Kotlin to work with Kotlin, Spring and Spring Boot for building reactive and cloud-native microservices. Also check out other posts: How to publish Docker and integrate with Maven Understanding Microservices How to build Microservices using REST framework

0
0
22057

How-To Tutorials

article-image-managing-files-folders-and-registry-items-using-powershell

Packt

23 Apr 2015

27 min read

Managing Files, Folders, and Registry Items Using PowerShell

Packt

23 Apr 2015

27 min read

0
0
22055

article-image-speech2face-a-neural-network-that-imagines-faces-from-hearing-voices-is-it-too-soon-to-worry-about-ethnic-profiling

Savia Lobo

28 May 2019

8 min read

Speech2Face: A neural network that “imagines” faces from hearing voices. Is it too soon to worry about ethnic profiling?

Savia Lobo

28 May 2019

8 min read

Last week, a few researchers from the MIT CSAIL and Google AI published their research study of reconstructing a facial image of a person from a short audio recording of that person speaking, in their paper titled, “Speech2Face: Learning the Face Behind a Voice”. The researchers designed and trained a neural network which uses millions of natural Internet/YouTube videos of people speaking. During training, they demonstrated that the model learns voice-face correlations that allows it to produce images that capture various physical attributes of the speakers such as age, gender, and ethnicity. The entire training was done in a self-supervised manner, by utilizing the natural co-occurrence of faces and speech in Internet videos, without the need to model attributes explicitly. They said they further evaluated and numerically quantified how their Speech2Face reconstructs, obtains results directly from audio, and how it resembles the true face images of the speakers. For this, they tested their model both qualitatively and quantitatively on the AVSpeech dataset and the VoxCeleb dataset. The Speech2Face model The researchers utilized the VGG-Face model, a face recognition model pre-trained on a large-scale face dataset called DeepFace and extracted a 4096-D face feature from the penultimate layer (fc7) of the network. These face features were shown to contain enough information to reconstruct the corresponding face images while being robust to many of the aforementioned variations. The Speech2Face pipeline consists of two main components: 1) a voice encoder, which takes a complex spectrogram of speech as input, and predicts a low-dimensional face feature that would correspond to the associated face; and 2) a face decoder, which takes as input the face feature and produces an image of the face in a canonical form (frontal-facing and with neutral expression). During training, the face decoder is fixed, and only the voice encoder is trained which further predicts the face feature. How were the facial features evaluated? To quantify how well different facial attributes are being captured in Speech2Face reconstructions, the researchers tested different aspects of the model. Demographic attributes Researchers used Face++, a leading commercial service for computing facial attributes. They evaluated and compared age, gender, and ethnicity, by running the Face++ classifiers on the original images and our Speech2Face reconstructions. The Face++ classifiers return either “male” or “female” for gender, a continuous number for age, and one of the four values, “Asian”, “black”, “India”, or “white”, for ethnicity. Source: Arxiv.org Craniofacial attributes Source: Arxiv.org The researchers evaluated craniofacial measurements commonly used in the literature, for capturing ratios and distances in the face. They computed the correlation between F2F and the corresponding S2F reconstructions. Face landmarks were computed using the DEST library. As can be seen, there is statistically significant (i.e., p < 0.001) positive correlation for several measurements. In particular, the highest correlation is measured for the nasal index (0.38) and nose width (0.35), the features indicative of nose structures that may affect a speaker’s voice. Feature similarity The researchers further test how well a person can be recognized from on the face features predicted from speech. They, first directly measured the cosine distance between the predicted features and the true ones obtained from the original face image of the speaker. The table above shows the average error over 5,000 test images, for the predictions using 3s and 6s audio segments. The use of longer audio clips exhibits consistent improvement in all error metrics; this further evidences the qualitative improvement observed in the image below. They further evaluated how accurately they could retrieve the true speaker from a database of face images. To do so, they took the speech of a person to predict the feature using the Speech2Face model and query it by computing its distances to the face features of all face images in the database. Ethical considerations with Speech2Face model Researchers said that the training data used is a collection of educational videos from YouTube and that it does not represent equally the entire world population. Hence, the model may be affected by the uneven distribution of data. They have also highlighted that “ if a certain language does not appear in the training data, our reconstructions will not capture well the facial attributes that may be correlated with that language”. “In our experimental section, we mention inferred demographic categories such as “White” and “Asian”. These are categories defined and used by a commercial face attribute classifier and were only used for evaluation in this paper. Our model is not supplied with and does not make use of this information at any stage”, the paper mentions. They also warn that any further investigation or practical use of this technology would be carefully tested to ensure that the training data is representative of the intended user population. “If that is not the case, more representative data should be broadly collected”, the researchers state. Limitations of the Speech2Face model In order to test the stability of the Speech2Face reconstruction, the researchers used faces from different speech segments of the same person, taken from different parts within the same video, and from a different video. The reconstructed face images were consistent within and between the videos. They further probed the model with an Asian male example speaking the same sentence in English and Chinese to qualitatively test the effect of language and accent. While having the same reconstructed face in both cases would be ideal, the model inferred different faces based on the spoken language. In other examples, the model was able to successfully factor out the language, reconstructing a face with Asian features even though the girl was speaking in English with no apparent accent. “In general, we observed mixed behaviors and a more thorough examination is needed to determine to which extent the model relies on language. More generally, the ability to capture the latent attributes from speech, such as age, gender, and ethnicity, depends on several factors such as accent, spoken language, or voice pitch. Clearly, in some cases, these vocal attributes would not match the person’s appearance”, the researchers state in the paper. Speech2Cartoon: Converting generated image into cartoon faces The face images reconstructed from speech may also be used for generating personalized cartoons of speakers from their voices. The researchers have used Gboard, the keyboard app available on Android phones, which is also capable of analyzing a selfie image to produce a cartoon-like version of the face. Such cartoon re-rendering of the face may be useful as a visual representation of a person during a phone or a video conferencing call when the person’s identity is unknown or the person prefers not to share his/her picture. The reconstructed faces may also be used directly, to assign faces to machine-generated voices used in home devices and virtual assistants. https://twitter.com/NirantK/status/1132880233017761792 A user on HackerNews commented, “This paper is a neat idea, and the results are interesting, but not in the way I'd expected. I had hoped it would the domain of how much person-specific information this can deduce from a voice, e.g. lip aperture, overbite, size of the vocal tract, openness of the nares. This is interesting from a speech perception standpoint. Instead, it's interesting more in the domain of how much social information it can deduce from a voice. This appears to be a relatively efficient classifier for gender, race, and age, taking voice as input.” “I'm sure this isn't the first time it's been done, but it's pretty neat to see it in action, and it's a worthwhile reminder: If a neural net is this good at inferring social, racial, and gender information from audio, humans are even better. And the idea of speech as a social construct becomes even more relevant”, he further added. This recent study is interesting considering the fact that it is taking AI to another level wherein we are able to predict the face just by using audio recordings and even without the need for a DNA. However, there can be certain repercussions, especially when it comes to security. One can easily misuse such technology by impersonating someone else and can cause trouble. It would be interesting to see how this study turns out to be in the near future. To more about the Speech2Face model in detail, head over to the research paper. OpenAI introduces MuseNet: A deep neural network for generating musical compositions An unsupervised deep neural network cracks 250 million protein sequences to reveal biological structures and functions OpenAI researchers have developed Sparse Transformers, a neural network which can predict what comes next in a sequence

0
0
22010

article-image-cross-validation-r-predictive-models

Pravin Dhandre

17 Apr 2018

8 min read

Cross-validation in R for predictive models

Pravin Dhandre

17 Apr 2018

8 min read

In today’s tutorial, we will efficiently train our first predictive model, we will use Cross-validation in R as the basis of our modeling process. We will build the corresponding confusion matrix. Most of the functionality comes from the excellent caret package. You can find more information on the vast features of caret package that we will not explore in this tutorial. Before moving to the training tutorial, lets understand what a confusion matrix is. A confusion matrix is a summary of prediction results on a classification problem. The number of correct and incorrect predictions are summarized with count values and broken down by each class. This is the key to the confusion matrix. The confusion matrix shows the ways in which your classification model is confused when it makes predictions. It gives you insight not only into the errors being made by your classifier but more importantly the types of errors that are being made. Training our first predictive model Following best practices, we will use Cross Validation (CV) as the basis of our modeling process. Using CV we can create estimates of how well our model will do with unseen data. CV is powerful, but the downside is that it requires more processing and therefore more time. If you can take the computational complexity, you should definitely take advantage of it in your projects. Going into the mathematics behind CV is outside of the scope of this tutorial. If interested, you can find out more information on cross validation on Wikipedia . The basic idea is that the training data will be split into various parts, and each of these parts will be taken out of the rest of the training data one at a time, keeping all remaining parts together. The parts that are kept together will be used to train the model, while the part that was taken out will be used for testing, and this will be repeated by rotating the parts such that every part is taken out once. This allows you to test the training procedure more thoroughly, before doing the final testing with the testing data. We use the trainControl() function to set our repeated CV mechanism with five splits and two repeats. This object will be passed to our predictive models, created with the caret package, to automatically apply this control mechanism within them: cv.control <- trainControl(method = "repeatedcv", number = 5, repeats = 2) Our predictive models pick for this example are Random Forests (RF). We will very briefly explain what RF are, but the interested reader is encouraged to look into James, Witten, Hastie, and Tibshirani's excellent "Statistical Learning" (Springer, 2013). RF are a non-linear model used to generate predictions. A tree is a structure that provides a clear path from inputs to specific outputs through a branching model. In predictive modeling they are used to find limited input-space areas that perform well when providing predictions. RF create many such trees and use a mechanism to aggregate the predictions provided by this trees into a single prediction. They are a very powerful and popular Machine Learning model. Let's have a look at the random forests example: Random forests aggregate trees To train our model, we use the train() function passing a formula that signals R to use MULT_PURCHASES as the dependent variable and everything else (~ .) as the independent variables, which are the token frequencies. It also specifies the data, the method ("rf" stands for random forests), the control mechanism we just created, and the number of tuning scenarios to use: model.1 <- train( MULT_PURCHASES ~ ., data = train.dfm.df, method = "rf", trControl = cv.control, tuneLength = 5 ) Improving speed with parallelization If you actually executed the previous code in your computer before reading this, you may have found that it took a long time to finish (8.41 minutes in our case). As we mentioned earlier, text analysis suffers from very high dimensional structures which take a long time to process. Furthermore, using CV runs will take a long time to run. To cut down on the total execution time, use the doParallel package to allow for multi-core computers to do the training in parallel and substantially cut down on time. We proceed to create the train_model() function, which takes the data and the control mechanism as parameters. It then makes a cluster object with the makeCluster() function with a number of available cores (processors) equal to the number of cores in the computer, detected with the detectCores() function. Note that if you're planning on using your computer to do other tasks while you train your models, you should leave one or two cores free to avoid choking your system (you can then use makeCluster(detectCores() -2) to accomplish this). After that, we start our time measuring mechanism, train our model, print the total time, stop the cluster, and return the resulting model. train_model <- function(data, cv.control) { cluster <- makeCluster(detectCores()) registerDoParallel(cluster) start.time <- Sys.time() model <- train( MULT_PURCHASES ~ ., data = data, method = "rf", trControl = cv.control, tuneLength = 5 ) print(Sys.time() - start.time) stopCluster(cluster) return(model) } Now we can retrain the same model much faster. The time reduction will depend on your computer's available resources. In the case of an 8-core system with 32 GB of memory available, the total time was 3.34 minutes instead of the previous 8.41 minutes, which implies that with parallelization, it only took 39% of the original time. Not bad right? Let's have look at how the model is trained: model.1 <- train_model(train.dfm.df, cv.control) Computing predictive accuracy and confusion matrices Now that we have our trained model, we can see its results and ask it to compute some predictive accuracy metrics. We start by simply printing the object we get back from the train() function. As can be seen, we have some useful metadata, but what we are concerned with right now is the predictive accuracy, shown in the Accuracy column. From the five values we told the function to use as testing scenarios, the best model was reached when we used 356 out of the 2,007 available features (tokens). In that case, our predictive accuracy was 65.36%. If we take into account the fact that the proportions in our data were around 63% of cases with multiple purchases, we have made an improvement. This can be seen by the fact that if we just guessed the class with the most observations (MULT_PURCHASES being true) for all the observations, we would only have a 63% accuracy, but using our model we were able to improve toward 65%. This is a 3% improvement. Keep in mind that this is a randomized process, and the results will be different every time you train these models. That's why we want a repeated CV as well as various testing scenarios to make sure that our results are robust: model.1 #> Random Forest #> #> 212 samples #> 2007 predictors #> 2 classes: 'FALSE', 'TRUE' #> #> No pre-processing #> Resampling: Cross-Validated (5 fold, repeated 2 times) #> Summary of sample sizes: 170, 169, 170, 169, 170, 169, ... #> Resampling results across tuning parameters: #> #> mtry Accuracy Kappa #> 2 0.6368771 0.00000000 #> 11 0.6439092 0.03436849 #> 63 0.6462901 0.07827322 #> 356 0.6536545 0.16160573 #> 2006 0.6512735 0.16892126 #> #> Accuracy was used to select the optimal model using the largest value. #> The final value used for the model was mtry = 356. To create a confusion matrix, we can use the confusionMatrix() function and send it the model's predictions first and the real values second. This will not only create the confusion matrix for us, but also compute some useful metrics such as sensitivity and specificity. We won't go deep into what these metrics mean or how to interpret them since that's outside the scope of this tutorial, but we highly encourage the reader to study them using the resources cited in this tutorial: confusionMatrix(model.1$finalModel$predicted, train$MULT_PURCHASES) #> Confusion Matrix and Statistics #> #> Reference #> Prediction FALSE TRUE #> FALSE 18 19 #> TRUE 59 116 #> #> Accuracy : 0.6321 #> 95% CI : (0.5633, 0.6971) #> No Information Rate : 0.6368 #> P-Value [Acc > NIR] : 0.5872 #> #> Kappa : 0.1047 #> Mcnemar's Test P-Value : 1.006e-05 #> #> Sensitivity : 0.23377 #> Specificity : 0.85926 #> Pos Pred Value : 0.48649 #> Neg Pred Value : 0.66286 #> Prevalence : 0.36321 #> Detection Rate : 0.08491 #> Detection Prevalence : 0.17453 #> Balanced Accuracy : 0.54651 #> #> 'Positive' Class : FALSE You read an excerpt from R Programming By Example authored by Omar Trejo Navarro. This book gets you familiar with R’s fundamentals and its advanced features to get you hands-on experience with R’s cutting edge tools for software development. Getting Started with Predictive Analytics Here’s how you can handle the bias variance trade-off in your ML models

0
0
22003

article-image-amazon-splits-hq2-between-new-york-and-washington-d-c-after-a-making-200-states-compete-over-a-year-public-sentiments-largely-negative

Sugandha Lahoti

14 Nov 2018

4 min read

Amazon splits HQ2 between New York and Washington, D.C. after a making 200+ states compete over a year; public sentiments largely negative

Sugandha Lahoti

14 Nov 2018

4 min read

Yesterday, Amazon announced that it will split its second headquarters between New York City and a suburb of Washington, D.C. Specifically into Long Island City, Queens, and Crystal City, in Arlington, Virginia. Per Amazon, the company will invest $5 billion and create more than 50,000 jobs across the two new headquarters locations, with more than 25,000 employees each in New York City and Arlington. The new locations will join Seattle as the company’s three headquarters in North America. How did it all start? The stage was set fourteen months ago when Amazon invited North American cities to apply to compete for getting Amazon’s second headquarters in their cities. Every year, American cities and states spend up to $90 billion in tax breaks and cash grants to urge companies to move among states. After considering applications from nearly 238 cities, Amazon shocked everyone by finally choosing New York and D.C., the two most sought after cities in America. And to top that, the incentives the company will receive as part of the deals are touching the sky. Amazon announced that, in New York, it will receive up to $1.2 billion in a refundable tax credit, tied to the creation of jobs, and a $325 million cash development grant. The company will meanwhile earn up to $573 million in cash grants for the Arlington investment if it creates the promised jobs. https://twitter.com/ByRosenberg/status/1062394439102943232 What do the people think? Amazon’s decision was met with backlash and outrage by concerned citizens as well as other people on the internet. People have found Amazon’s decision extremely concerning. https://twitter.com/Ocasio2018/status/1062204614496403457 https://twitter.com/SteveCase/status/1062399854905909248 Queens residents are also “outraged” at Amazon’s plans according to Rep.-elect Alexandria Ocasio-Cortez. https://twitter.com/Ocasio2018/status/1062203458227503104 Many people have also released statements and drafted open letters to Jeff Bezos disagreeing to Amazon’s decision and highlighting the trouble associated. https://twitter.com/NYCSpeakerCoJo/status/1062384477861748736 David Heinemeier Hansson, the creator of Ruby on Rails has written an open letter addressed to Jeff Bezos terming Amazon HQ2 process demeaning if not outright cruel. “At a time when politicians are viewed as more inept, more suspicious, and more corrupt than ever, you made city after city grovel in front of your selection committee. They debased themselves in a futile attempt to appeal to your grace and mercy, and you showed them little. The losers ended up worse than where they started, and even the winners may well too.” He wrote, “At some point people are going to have had enough, and when they figure out a way to channel that discontent into political action, they’re going to come looking for the heads of those that did them the most egregious wrongs.” How to control corporate giveaways? “We need a national truce, both within states and between states,” said Amy Liu, the director of the Metropolitan Policy Program at the Brookings Institution. “There should be no more poaching of private companies with public funds.” The Atlantic highlights three ways the government could take control of corporate giveaways: First, Congress could pass a national law banning this sort of corporate bribery. Second, Congress could make corporate subsidies less valuable by threatening to tax state or local incentives as a special kind of income. Finally, the federal government could actively discourage the culture of corporate subsidies by strict law enforcement measures. In another striking move, Democratic Assemblyman Ron Kim has announced a bill to block the Amazon deal and redirect taxpayer subsidies for Jeff Bezos into reducing student debt. “Giving Jeff Bezos hundreds of millions of dollars is an immoral waste of taxpayers’ money when it’s crystal clear that the money would create more jobs and more economic growth when it is used to relieve student debt,” said Kim. Read Amazon’s official press release to know in detail about the new headquarters and Amazon’s incentives in these HQ2. Amazon addresses employees dissent regarding the company’s law enforcement policies at an all-staff meeting, in a first. Jeff Bezos: Amazon will continue to support U.S. Defense Department. Amazon increases the minimum wage of all employees in the US and UK

0
0
22000

How-To Tutorials

Tools inTypeScript

Linux programmers opposed to new Code of Conduct threaten to pull code from project

10 Machine Learning Tools to watch in 2018

Experts discuss Dark Patterns and deceptive UI designs: What are they? What do they do? How do we stop them?

Basics of Image Histograms in OpenCV

Task Execution with Asio

ALM – Developers and QA

Databricks' DBRX, Stability AI's Stable Code Instruct 3B, SambaNova's Samba CoE v0.2, FrugalGPT, Advanced RAG Patterns on Amazon SageMaker

Camera Calibration

How to integrate Angular 2 with React

Trending Topics

How to publish Microservice as a service onto a Docker

Managing Files, Folders, and Registry Items Using PowerShell

Speech2Face: A neural network that “imagines” faces from hearing voices. Is it too soon to worry about ethnic profiling?

Cross-validation in R for predictive models

Amazon splits HQ2 between New York and Washington, D.C. after a making 200+ states compete over a year; public sentiments largely negative

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access