Data | 0 articles | Tech News, Tutorials & Expert Insights

article-image-deepfakes-house-committee-hearing-risks-vulnerabilities-and-recommendations

21 Jun 2019

16 min read

Deepfakes House Committee Hearing: Risks, Vulnerabilities and Recommendations

21 Jun 2019

Last week, the House Intelligence Committee held a hearing to examine the public risks posed by “deepfake” videos. Deepfake is identified as a technology that alters audio or video and then is passed off as true or original content. In this hearing, experts on AI and digital policy highlighted to the committee, deepfakes risk to national security, upcoming elections, public trust and the mission of journalism. They also offered potential recommendations on what Congress could do to combat deepfakes and misinformation. The chair of the committee Adam B. Schiff, initiated the hearing by stating that it is time to regulate the technology of deepfake videos as it is enabling sinister forms of deception and disinformation by malicious actors. He adds that “Advances in AI or machine learning have led to the emergence of advance digitally doctored type of media, the so-called deepfakes that enable malicious actors to foment chaos, division or crisis and have the capacity to disrupt entire campaigns including that for the Presidency.” For a quick glance, here’s a TL;DR: Jack Clerk believes that governments should be in the business of measuring and assessing deepfake threats by looking directly at the scientific literature and developing a base knowledge of it. David Doermann suggests that tools and processes which can identify fake content should be made available in the hands of individuals, rather than relying completely on the government or on social media platforms to police content. Danielle Citron warns that the phenomenon of deepfake is going to be increasingly felt by women and minorities and for people from marginalized communities. Clint Watts provides a list of recommendations which should be implemented to prohibit U.S. officials, elected representatives and agencies from creating and distributing false and manipulated content. A unified standard should be followed by all social media platforms. Also they should be pressurized to have a 10-15 seconds delay in all videos, so that they can decide, to label a particular video or not. Regarding 2020 Presidential election: State governments and social media companies should be ready with a response plan, if a fake video surfaces to cause disrupt. It was also recommended that the algorithms to make deepfakes should be open sourced. Laws should be altered, and strict actions should be awarded, to discourage deepfake videos. Being forewarned is forearmed in case of deepfake technology Jack Clerk, OpenAI Policy Director, highlighted in his testimony that he does not think A.I. is the cause of any disruption, but actually is an “accelerant to an issue which has been with us for some time.'' He adds that computer software aligned with A.I. technology has become significantly cheaper and more powerful, due to its increased accessibility. This has led to its usage in audio or video editing, which was previously very difficult. Similar technologies are being used for production of synthetic media. Also deepfakes are being used in valuable scientific research. Clerk suggests that interventions should be made to avoid its misuse. He believes that “it may be possible for large-scale technology platforms to try and develop and share tools for the detection of malicious synthetic media at both the individual account level and the platform level. We can also increase funding.” He strongly believes that governments should be in the business of measuring and assessing these threats by looking directly at the scientific literature and developing a base knowledge. Clerk concludes saying that “being forewarned is forearmed here.” Make Deepfake detector tools readily availaible David Doermann, the former Project Manager at the Defense Advanced Research Projects Agency mentions that the phrase ‘seeing is believing’ is no longer true. He states that there is nothing fundamentally wrong or evil about the technology, like basic image and video desktop editors, deepfakes is only a tool. There are a lot of positive applications of generative networks just as there are negative ones. He adds that, as of today, there are some solutions that can identify deepfakes reliably. However, Doermann fears that it’s only a matter of time before the current detection capabilities will be rendered less effective. He adds that “it's likely to get much worse before it gets much better.” Doermann suggests that tools and processes which can identify such fake content should be made available in the hands of individuals, rather than relying completely on the government or on social media platforms to police content. At the same time, there should also be ways to verify it or prove it or easily report it. He also hopes that automated detection tools will be developed, in the future, which will help in filtering and detection at the front end of the distribution pipeline. He also adds that “appropriate warning labels should be provided, which suggests that this is not real or not authentic, or not what it's purported to be. This would be independent of whether this is done and the decisions are made, by humans, machines or a combination.” Groups most vulnerable to Deepfake attacks Women and minorities Danielle Citron, a Law Professor at the University of Maryland, describes Deepfake as “particularly troubling when they're provocative and destructive.” She adds that, we as humans, tend to believe what our eyes and ears are telling us and also tend to share information that confirms our biases. It’s particularly true when that information is novel and negative, so the more salacious, we're more willing to pass it on. She also specifies that the deepfakes on social media networks are ad-driven. When all of this is put together, it turns out that the more provocative the deepfake is, the salacious will be the spread virally. She also informed the panel committee about an incident, involving an investigative journalist in India, who had her posters circulated over the internet and deepfake sex videos, with her face morphed into pornography, over a provocative article. Citron thus states that “the economic and the social and psychological harm is profound”. Also based on her work in cyber stalking, she believes that this phenomenon is going to be increasingly felt by women and minorities and for people from marginalized communities. She also shared other examples explaining the effect of deepfake on trades and businesses. Citron also highlighted that “We need a combination of law, markets and really societal resilience to get through this, but the law has a modest role to play.” She also mentioned that though there are laws to sue for defamation, intentional infliction of emotional distress, privacy torture, these procedures are quite expensive. She adds that criminal law offers very less opportunity for the public to push criminals to the next level. National security Clint Watts, a Senior Fellow at the Foreign Policy Research Institute provided insight into how such technologies can affect national security. He says that “A.I. provides purveyors of disinformation to identify psychological vulnerabilities and to create modified content digital forgeries advancing false narratives against Americans and American interests.” Watts suspects that Russia, “being an enduring purveyor of disinformation is and will continue to pursue the acquisition of synthetic media capability, and employ the output against adversaries around the world.” He also adds that China, being the U.S. rival, will join Russia “to get vast amounts of information stolen from the U.S. The country has already shown a propensity to employ synthetic media in broadcast journalism. They'll likely use it as part of disinformation campaigns to discredit foreign detractors, incite fear inside western-style democracy and then, distort the reality of audiences and the audiences of America's allies.” He also mentions that deepfake proliferation can present a danger to American constituency by demoralizing it. Watts suspects that the U.S. diplomats and military personnel deployed overseas, will be prime target for deepfake driven disinformation planted by adversaries. Watts provided a list of recommendations which should be implemented to “prohibit U.S. officials, elected representatives and agencies from creating and distributing false and manipulated content.” The U.S. government must be the sole purveyor of facts and truth to constituents, assuring the effective administration of democracy via productive policy debate from a shared basis of reality. Policy makers should work jointly with social media companies to develop standards for content and accountability. The U.S. government should partner with private sectors to implement digital verification designating a date, time and physical origination of the content. Social media companies should start labeling videos, and forward the same across all platforms. Consumers should be able to determine the source of the information and whether it's the authentic depiction of people and events. The U.S. government from a national security perspective, should maintain intelligence on capabilities of adversaries to conduct such information. The departments of defense and state should immediately develop response plans, for deepfake smear campaigns and mobilizations overseas, in an attempt to mitigate harm. Lastly he also added that public awareness of deepfakes and signatures, will assist in tamping down attempts to subvert the U.S. democracy and incite violence. Schiff asked the witnesses, if it's “time to do away with the immunity that social media platforms enjoy”, Watts replied in the affirmative and listed suggestions in three particular areas. If social media platforms see something spiking in terms of virality, it should be put in a queue for human review, linked to fact checkers, then down rate it and don't let it into news feeds. Also make the mainstream understand what is manipulated content. Anything related to outbreaks of violence and public safety should be regulated immediately. Anything related to elected officials or public institutions, should immediately be flagged and pulled down and checked and then a context should be given to it. Co-chair of the committee, Devin Nunes asked Citron what kind of filters can be placed on these tech companies, as “it's not developed by partisan left wing like it is now, where most of the time, it's conservatives who get banned and not democrats”. Citron suggested that proactive filtering won’t be possible and hence companies should react responsibly and should be bipartisan. She added that “but rather, is this a misrepresentation in a defamatory way, right, that we would say it's a falsehood that is harmful to reputation. that's an impersonation, then we should take it down. This is the default I am imagining.” How laws could be altered according to the changing times, to discourage deepfake videos Citron says that laws could be altered, like in the case of Section 230 C. It states that “No speaker or publisher -- or no online service shall be treated as a speaker or publisher of someone else's content.” This law can be altered to “No online service that engages in reasonable content moderation practices shall be treated as a speaker or publisher of somebody else's content.” Citron believes that avoiding reasonability could lead to negligence of law. She also adds that “I've been advising Twitter and Facebook all of the time. There is meaningful reasonable practices that are emerging and have emerged in the last ten years. We already have a guide, it's not as if this is a new issue in 2019. So we can come up with reasonable practices.” Also Watts added that if any adversary from big countries like China, Iran, Russia makes a deepfake video to push the US downwards, we can trace them back if we have aggressive laws at our hand. He says it could be anything from an “arrest and extradition, if the sanction permits, response should be individually, or in terms of cyber response”, could help us to discourage deepfake. How to slow down the spread of videos One of the reasons that these types of manipulated images gain traction is because it's almost instantaneous - they can be shared around the world, shared across platforms in a few seconds. Doermann says that these social media platforms must be pressurized to have a 10-15 seconds delay, so that it can be decided whether to label a particular video or not. He adds that “We've done it for child pornography, we've done it for human trafficking, they're serious about those things. This is another area that's a little bit more in the middle, but I think they can take the same effort in these areas to do that type of triage.” This delay will allow third parties or fact checkers to decide on the authenticity of videos and label them. Citron adds that this is where labelling a particular video can help, “I think it is incredibly important and there are times in which, that's the perfect rather than second best, and we should err on the side of inclusion and label it as synthetic.” The representative of Ohio, Brad Wenstrup added that we can have internal extradition laws, which can punish somebody when “something comes from some other country, maybe even a friendly country, that defames and hurts someone here”. There should be an agreement among nations that “we'll extradite those people and they can be punished in your country for what they did to one of your citizens.” Terri Sewell, the Representative of Alabama further probed about the current scenario of detecting fake videos, to which Doermann replied that currently we have enough solutions to detect a fake video, however with a constant delay of 15-20 minutes. Deepfakes and 2020 Presidential elections Watts says that he’s concerned about deepfakes acting on the eve of election day 2020. Foreign adversaries may use a standard disinformation approach by “using an organic content that suits their narrative and inject it back.” This can escalate as more people are making deepfakes each year. He also added that “Right now I would be very worried about someone making a fake video about electoral systems being out or broken down on election day 2020.” So state governments and social media companies should be ready with a response plan in the wake of such an event. Sewell then asked the witnesses for suggestions on campaigns to political parties/candidates so that they are prepared for the possibility of deepfake content. Watts replied that the most important thing to counter fake content would be a unified standard, that all the social media industries should follow. He added that “if you're a manipulator, domestic or international, and you're making deep fakes, you're going to go to whatever platform allows you to post anything from inauthentic accounts. they go to wherever the weak point is and it spreads throughout the system.” He believes that this system would help counter extremism, disinformation and political smear campaigns. Watts added any sort of lag in responding to such videos should be avoided as “any sort of lag in terms of response allows that conspiracy to grow.” Citron also pointed out that firstly all candidates should have a clear policy about deep fakes and should commit that they won’t use them or spread them. Should the algorithms to make deepfakes be open sourced? Doermann answered that the algorithms of deepfakes have to be absolutely open sourced. He says that though this might help adversaries, but they are anyway going to learn about it. He believes this is significant as, “We need to get this type of stuff out there. We need to get it into the hands of users. There are companies out there that are starting to make these types of things.” He also states that people should be able to use this technology. The more we educate them, more the tools they learn, more the correct choices people can make. On Mark Zuckerberg’s deepfake video On being asked to comment on the decision of Mark Zuckerberg to not take down his deepfake video from his own platform, Facebook, Citron replied that Mark gave a perfect example of “satire and parody”, by not taking down the video. She added that private companies can make these kinds of choices, as they have an incredible amount of power, without any liability, “it seemed to be a conversation about the choices they make and what does that mean for society. So it was incredibly productive, I think.” Watts also opined that he likes Facebook for its consistency in terms of enforcement and that they are always trying to learn better things and implement it. He adds that he really like Facebook as its always ready to hear “from legislatures about what falls inside those parameters. The one thing that I really like is that they're doing is identifying inauthentic account creation and inauthentic content generation, they are enforcing it, they have increased the scale,and it is very very good in terms of how they have scaled it up, it’s not perfect, but it is better.” Read More: Zuckberg just became the target of the world’s first high profile white hat deepfake op. Can Facebook come out unscathed? On the Nancy Pelosi doctored video Schiff asked the witnesses if there is any account on the number of millions of people who have watched the doctored video of Nancy Pelosi, and an account of how many of them ultimately got to know that it was not a real video. He said he’s asking this as according to psychologists, people never really forget their once constructed negative impression. Clarke replied that “Fact checks and clarifications tend not to travel nearly as far as the initial news.” He added that its becomes a very general thing as “If you care, you care about clarifications and fact checks. but if you're just enjoying media, you're enjoying media. You enjoy the experience of the media and the absolute minority doesn’t care whether it's true.” Schiff also recalled how in 2016, “some foreign actresses, particularly Russia had mimicked black lives matter to push out continent to racially divide people.” Such videos gave the impression of police violence, on people of colour. They “certainly push out videos that are enormously jarring and disruptive.” All the information revealed in the hearing was described as “scary and worrying”, by one of the representatives. The hearing was ended by Schiff, the chair of the committee, after thanking all the witnesses for their testimonies and recommendations. For more details, head over to the full Hearing on deepfake videos by the House Intelligence Committee. Worried about Deepfakes? Check out the new algorithm that manipulate talking-head videos by altering the transcripts Lawmakers introduce new Consumer privacy bill and Malicious Deep Fake Prohibition Act to support consumer privacy and battle deepfakes Machine generated videos like Deepfakes – Trick or Treat?

0
0
22425

article-image-julia-computing-research-team-runs-machine-learning-model-on-encrypted-data-without-decrypting-it

Fatema Patrawala

28 Nov 2019

5 min read

Julia Computing research team runs machine learning model on encrypted data without decrypting it

Fatema Patrawala

28 Nov 2019

5 min read

Last week, the team at Julia Computing published a research based on cutting edge cryptographic techniques. The research involved cryptography techniques to practically perform computation on data without ever decrypting it. For example, the user would send encrypted data (e.g. images) to the cloud API, which would run the machine learning model and then return the encrypted answer. Nowhere is the user data decrypted and in particular the cloud provider does not have access to either the original image nor is it able to decrypt the prediction it computed. The team made this possible by building a machine learning service for handwriting recognition of encrypted images (from the MNIST dataset). The ability to compute on encrypted data is generally referred to as “secure computation” and is a fairly large area of research, with many different cryptographic approaches and techniques for a plethora of different application scenarios. For their research, Julia team focused on using a technique known as “homomorphic encryption”. What is homomorphic encryption Homomorphic encryption is a form of encryption that allows computation on ciphertexts, generating an encrypted result which, when decrypted, matches the result of the operations as if they had been performed on the plaintext. This technique can be used for privacy-preserving outsourced storage and computation. It allows data to be encrypted and out-sourced to commercial cloud environments for processing, all while encrypted. In highly regulated industries, such as health care, homomorphic encryption can be used to enable new services by removing privacy barriers inhibiting data sharing. In this research, the Julia Computing team used a homomorphic encryption system which involves the following operations: pub_key, eval_key, priv_key = keygen() encrypted = encrypt(pub_key, plaintext) decrypted = decrypt(priv_key, encrypted) encrypted′ = eval(eval_key, f, encrypted) So the first three are fairly straightforward and are familiar to anyone who has used asymmetric cryptography before. The last one is important as it evaluates some function f on the encryption and returns another encrypted value corresponding to the result of evaluating f on the encrypted value. It is this property that gives homomorphic computation its name. Further the Julia Computing team talks about CKKS (Cheon-Kim-Kim-Song), a homomorphic encryption scheme that allowed homomorphic evaluation on the following primitive operations: Element-wise addition of length n vectors of complex numbers Element-wise multiplication of length n complex vectors Rotation (in the circshift sense) of elements in the vector Complex conjugation of vector elements But they also mentioned that computations using CKKS were noisy, and hence they tested to perform these operations in Julia. Which convolutional neural network did the Julia Computing team use As a starting point the Julia Computing team used the convolutional neural network example given in the Flux model zoo. They kept training the loop, prepared the data and tweaked the ML model slightly. It is essentially the same model as the one used in the paper “Secure Outsourced Matrix Computation and Application to Neural Networks”, which uses the same (CKKS) cryptographic scheme. This paper also encrypts the model, which the Julia team neglected for simplicity and they involved bias vectors after every layer (which Flux does by default). This resulted in a higher test set accuracy of the model used by Julia team which was (98.6% vs 98.1%). An unusual feature in this model are the x.^2 activation functions. More common choices here would have been tanh or relu or something more advanced. While those functions (relu in particular) are cheap to evaluate on plaintext values, they would however, be quite expensive to evaluate on encrypted values. Also, the team would have ended up evaluating a polynomial approximation had they adopted these common choices. Fortunately x.^2 worked fine for their purpose. How was the homomorphic operation carried out The team performed homomorphic operation on Convolutions and Matrix Multiply assuming a batch size of 64. They precomputed each convolution window of 7x7 extraction from the original images which gave them 64 7x7 matrices per input image. Then they collected the same position in each window into one vector and got a 64-element vector for each image, (i.e. a total of 49 64x64 matrices), and encrypted these matrices. In this way the convolution became a scalar multiplication of the whole matrix with the appropriate mask element, and by summing all 49 elements later, the team got the result of the convolution. Then the team moved to Matrix Multiply by rotating elements in the vector to effect a re-ordering of the multiplication indices. They considered a row-major ordering of matrix elements in the vector. Then shifted the vector by a multiple of the row-size, and got the effect of rotating the columns, which is a sufficient primitive for implementing matrix multiply. The team was able to get everything together and it worked. You can take a look at the official blog post to know the step by step implementation process with codes. Further they also executed the whole encryption process in Julia as it allows powerful abstractions and they could encapsulate the whole convolution extraction process as a custom array type. The Julia Computing team states, “Achieving the dream of automatically executing arbitrary computations securely is a tall order for any system, but Julia’s metaprogramming capabilities and friendly syntax make it well suited as a development platform.” Julia co-creator, Jeff Bezanson, on what’s wrong with Julialang and how to tackle issues like modularity and extension Julia v1.3 released with new multithreading features, and much more! The Julia team shares its finalized release process with the community Julia announces the preview of multi-threaded task parallelism in alpha release v1.3.0 How to make machine learning based recommendations using Julia [Tutorial]

0
0
22366

article-image-running-parallel-data-operations-using-java-streams

Pravin Dhandre

15 Jan 2018

8 min read

Running Parallel Data Operations using Java Streams

Pravin Dhandre

15 Jan 2018

8 min read

[box type="note" align="" class="" width=""]Our article is an excerpt from a book co-authored by Richard M. Reese and Jennifer L. Reese, titled Java for Data Science. This book provides in-depth understanding of important tools and techniques used across data science projects in a Java environment.[/box] This article will give you an advantage of using Java 8 for solving complex and math-intensive problems on larger datasets using Java streams and lambda expressions. You will explore short demonstrations for performing matrix multiplication and map-reduce using Java 8. The release of Java 8 came with a number of important enhancements to the language. The two enhancements of interest to us include lambda expressions and streams. A lambda expression is essentially an anonymous function that adds a functional programming dimension to Java. The concept of streams, as introduced in Java 8, does not refer to IO streams. Instead, you can think of it as a sequence of objects that can be generated and manipulated using a fluent style of programming. This style will be demonstrated shortly. As with most APIs, programmers must be careful to consider the actual execution performance of their code using realistic test cases and environments. If not used properly, streams may not actually provide performance improvements. In particular, parallel streams, if not crafted carefully, can produce incorrect results. We will start with a quick introduction to lambda expressions and streams. If you are familiar with these concepts you may want to skip over the next section. Understanding Java 8 lambda expressions and streams A lambda expression can be expressed in several different forms. The following illustrates a simple lambda expression where the symbol, ->, is the lambda operator. This will take some value, e, and return the value multiplied by two. There is nothing special about the name e. Any valid Java variable name can be used: e -> 2 * e It can also be expressed in other forms, such as the following: (int e) -> 2 * e (double e) -> 2 * e (int e) -> {return 2 * e; The form used depends on the intended value of e. Lambda expressions are frequently used as arguments to a method, as we will see shortly. A stream can be created using a number of techniques. In the following example, a stream is created from an array. The IntStream interface is a type of stream that uses integers. The Arrays class' stream method converts an array into a stream: IntStream stream = Arrays.stream(numbers); We can then apply various stream methods to perform an operation. In the following statement, the forEach method will simply display each integer in the stream: stream.forEach(e -> out.printf("%d ", e)); There are a variety of stream methods that can be applied to a stream. In the following example, the mapToDouble method will take an integer, multiply it by 2, and then return it as a double. The forEach method will then display these values: stream .mapToDouble(e-> 2 * e) .forEach(e -> out.printf("%.4f ", e)); The cascading of method invocations is referred to as fluent programing. Using Java 8 to perform matrix multiplication Here, we will illustrate how streams can be used to perform matrix multiplication. The definitions of the A, B, and C matrices are the same as declared in the Implementing basic matrix operations section. They are duplicated here for your convenience: double A[][] = { {0.1950, 0.0311}, {0.3588, 0.2203}, {0.1716, 0.5931}, {0.2105, 0.3242}}; double B[][] = { {0.0502, 0.9823, 0.9472}, {0.5732, 0.2694, 0.916}}; double C[][] = new double[n][p]; The following sequence is a stream implementation of matrix multiplication. A detailed explanation of the code follows: C = Arrays.stream(A) .parallel() .map(AMatrixRow -> IntStream.range(0, B[0].length) .mapToDouble(i -> IntStream.range(0, B.length) .mapToDouble(j -> AMatrixRow[j] * B[j][i]) .sum() ).toArray()).toArray(double[][]::new); The first map method, shown as follows, creates a stream of double vectors representing the 4 rows of the A matrix. The range method will return a list of stream elements ranging from its first argument to the second argument. .map(AMatrixRow -> IntStream.range(0, B[0].length) The variable i corresponds to the numbers generated by the second range method, which corresponds to the number of rows in the B matrix (2). The variable j corresponds to the numbers generated by the third range method, representing the number of columns of the B matrix (3). At the heart of the statement is the matrix multiplication, where the sum method calculates the sum: .mapToDouble(j -> AMatrixRow[j] * B[j][i]) .sum() The last part of the expression creates the two-dimensional array for the C matrix. The operator, ::new, is called a method reference and is a shorter way of invoking the new operator to create a new object: ).toArray()).toArray(double[][]::new); The displayResult method is as follows: public void displayResult() { out.println("Result"); for (int i = 0; i < n; i++) { for (int j = 0; j < p; j++) { out.printf("%.4f ", C[i][j]); } out.println(); } } The output of this sequence follows: Result 0.0276 0.1999 0.2132 0.1443 0.4118 0.5417 0.3486 0.3283 0.7058 0.1964 0.2941 0.4964 Using Java 8 to perform map-reduce In this section, we will use Java 8 streams to perform a map-reduce operation. In this example, we will use a Stream of Book objects. We will then demonstrate how to use the Java 8 reduce and average methods to get our total page count and average page count. Rather than begin with a text file, as we did in the Hadoop example, we have created a Book class with title, author, and page-count fields. In the main method of the driver class, we have created new instances of Book and added them to an ArrayList called books. We have also created a double value average to hold our average, and initialized our variable totalPg to zero: ArrayList<Book> books = new ArrayList<>(); double average; int totalPg = 0; books.add(new Book("Moby Dick", "Herman Melville", 822)); books.add(new Book("Charlotte's Web", "E.B. White", 189)); books.add(new Book("The Grapes of Wrath", "John Steinbeck", 212)); books.add(new Book("Jane Eyre", "Charlotte Bronte", 299)); books.add(new Book("A Tale of Two Cities", "Charles Dickens", 673)); books.add(new Book("War and Peace", "Leo Tolstoy", 1032)); books.add(new Book("The Great Gatsby", "F. Scott Fitzgerald", 275)); Next, we perform a map and reduce operation to calculate the total number of pages in our set of books. To accomplish this in a parallel manner, we use the stream and parallel methods. We then use the map method with a lambda expression to accumulate all of the page counts from each Book object. Finally, we use the reduce method to merge our page counts into one final value, which is to be assigned to totalPg: totalPg = books .stream() .parallel() .map((b) -> b.pgCnt) .reduce(totalPg, (accumulator, _item) -> { out.println(accumulator + " " +_item); return accumulator + _item; }); Notice in the preceding reduce method we have chosen to print out information about the reduction operation's cumulative value and individual items. The accumulator represents the aggregation of our page counts. The _item represents the individual task within the map-reduce process undergoing reduction at any given moment. In the output that follows, we will first see the accumulator value stay at zero as each individual book item is processed. Gradually, the accumulator value increases. The final operation is the reduction of the values 1223 and 2279. The sum of these two numbers is 3502, or the total page count for all of our books: 0 822 0 189 0 299 0 673 0 212 299 673 0 1032 0 275 1032 275 972 1307 189 212 822 401 1223 2279 Next, we will add code to calculate the average page count of our set of books. We multiply our totalPg value, determined using map-reduce, by 1.0 to prevent truncation when we divide by the integer returned by the size method. We then print out average. average = 1.0 * totalPg / books.size(); out.printf("Average Page Count: %.4fn", average); Our output is as follows: Average Page Count: 500.2857 We could have used Java 8 streams to calculate the average directly using the map method. Add the following code to the main method. We use parallelStream with our map method to simultaneously get the page count for each of our books. We then use mapToDouble to ensure our data is of the correct type to calculate our average. Finally, we use the average and getAsDouble methods to calculate our average page count: average = books .parallelStream() .map(b -> b.pgCnt) .mapToDouble(s -> s) .average() .getAsDouble(); out.printf("Average Page Count: %.4fn", average); Then we print out our average. Our output, identical to our previous example, is as follows: Average Page Count: 500.2857 The above techniques leveraged Java 8 capabilities on the map-reduce framework to solve numeric problems. This type of process can also be applied to other types of data, including text-based data. The true benefit is seen when these processes handle extremely large datasets within a significant reduction in time frame. To know various other mathematical and parallel techniques in Java for building a complete data analysis application, you may read through the book Java for Data Science to get a better integrated approach.

0
0
22238

article-image-session-4-fair-classification

Sugandha Lahoti

23 Feb 2018

7 min read

FAT Conference 2018 Session 4: Fair Classification

Sugandha Lahoti

23 Feb 2018

7 min read

As algorithms are increasingly used to make decisions of social consequence, the social values encoded in these decision-making procedures are the subject of increasing study, with fairness being a chief concern. The Conference on Fairness, Accountability, and Transparency (FAT) scheduled on Feb 23 and 24 this year in New York is an annual conference dedicated to bringing theory and practice of fair and interpretable Machine Learning, Information Retrieval, NLP, Computer Vision, Recommender systems, and other technical disciplines. This year's program includes 17 peer-reviewed papers and 6 tutorials from leading experts in the field. The conference will have three sessions. Session 4 of the two-day conference on Saturday, February 24, is in the field of fair classification. In this article, we give our readers a peek into the four papers that have been selected for presentation in Session 4. You can also check out Session 1, Session 2, and Session 3 summaries in case you’ve missed them. The cost of fairness in binary classification What is the paper about? This paper provides a simple approach to the Fairness-aware problem which involves suitably thresholding class-probability estimates. It has been awarded Best paper in Technical contribution category. The authors have studied the inherent tradeoffs in learning classifiers with a fairness constraint in the form of two questions: What is the best accuracy we can expect for a given level of fairness? What is the nature of these optimal fairness aware classifiers? The authors showed that for cost-sensitive approximate fairness measures, the optimal classifier is an instance-dependent thresholding of the class probability function. They have quantified the degradation in performance by a measure of alignment of the target and sensitive variable. This analysis is then used to derive a simple plugin approach for the fairness problem. Key takeaways For Fairness-aware learning, the authors have designed an algorithm targeting a particular measure of fairness. They have reduced two popular fairness measures (disparate impact and mean difference) to cost-sensitive risks. They show that for cost-sensitive fairness measures, the optimal Fairness-aware classifier is an instance-dependent thresholding of the class-probability function. They quantify the intrinsic, method independent impact of the fairness requirement on accuracy via a notion of alignment between the target and sensitive feature. The ability to theoretically compute the tradeoffs between fairness and utility is perhaps the most interesting aspect of their technical results. They have stressed that the tradeoff is intrinsic to the underlying data. That is, any fairness or unfairness, is a property of the data, not of any particular technique. They have theoretically computed what price one has to pay (in utility) in order to achieve a desired degree of fairness: in other words, they have computed the cost of fairness. Decoupled Classifiers for Group-Fair and Efficient Machine Learning What is the paper about? This paper considers how to use a sensitive attribute such as gender or race to maximize fairness and accuracy, assuming that it is legal and ethical. Simple linear classifiers may use the raw data, upweight/oversample data from minority groups, or employ advanced approaches to fitting linear classifiers that aim to be accurate and fair. However, an inherent tradeoff between accuracy on one group and accuracy on another still prevails. This paper defines and explores decoupled classification systems, in which a separate classifier is trained on each group. The authors present experiments on 47 datasets. The experiments are “semi-synthetic” in the sense that the first binary feature was used as a substitute sensitive feature. The authors found that on many data sets the decoupling algorithm improves performance while less often decreasing performance. Key takeaways The paper describes a simple technical approach for a practitioner using ML to incorporate sensitive attributes. This approach avoids unnecessary accuracy tradeoffs between groups and can accommodate an application-specific objective, generalizing the standard ML notion of loss. For a certain family of “weakly monotonic” fairness objectives, the authors provide a black-box reduction that can use any off-the-shelf classifier to efficiently optimize the objective. This work requires the application designer to pin down a specific loss function that trades off accuracy for fairness. Experiments demonstrate that decoupling can reduce the loss on some datasets for some potentially sensitive features A case study of algorithm-assisted decision making in child maltreatment hotline screening decisions What is the paper about? The work is based on the use of predictive analytics in the area of child welfare. It won the best paper award in the Technical and Interdisciplinary Contribution. The authors have worked on developing, validating, fairness auditing, and deploying a risk prediction model in Allegheny County, PA, USA. The authors have described competing models that are being developed in the Allegheny County as part of an ongoing redesign process in comparison to the previous models. Next, they investigate the predictive bias properties of the current tool and a Random forest model that has emerged as one of the best performing competing models. Their predictive bias assessment is motivated both by considerations of human bias and recent work on fairness criteria. They then discuss some of the challenges in incorporating algorithms into human decision-making processes and reflect on the predictive bias analysis in the context of how the model is actually being used. They also propose an “oracle test” as a tool for clarifying whether particular concerns pertain to the statistical properties of a model or if these concerns are targeted at other potential deficiencies. Key takeaways The goal in Allegheny County is to improve both the accuracy and equity of screening decisions by taking a Fairness-aware approach to incorporating prediction models into the decision-making pipeline. The paper reports on the lessons learned so far by the authors, their approaches to predictive bias assessment, and several outstanding challenges in the child maltreatment hotline context. This report contributes to the ongoing conversation concerning the use of algorithms in supporting critical decisions in government—and the importance of considering fairness and discrimination in data-driven decision making. The paper discussion and general analytic approach are also broadly applicable to other domains where predictive risk modeling may be used. Fairness in Machine Learning: Lessons from Political Philosophy What is the paper about? Plenty of moral and political philosophers have expended significant efforts in formalizing and defending the central concepts of discrimination, egalitarianism, and justice. Thus it is unsurprising to know that the attempts to formalize ‘fairness’ in machine learning contain echoes of these old philosophical debates. This paper draws on existing work in moral and political philosophy in order to elucidate emerging debates about fair machine learning. It answers the following questions: What does it mean for a machine learning model to be ‘fair’, in terms which can be operationalized? Should fairness consist of ensuring everyone has an equal probability of obtaining some benefit, or should we aim instead to minimize the harms to the least advantaged? Can the relevant ideal be determined by reference to some alternative state of affairs in which a particular social pattern of discrimination does not exist? Key takeaways This paper aims to provide an overview of some of the relevant philosophical literature on discrimination, fairness, and egalitarianism in order to clarify and situate the emerging debate within fair machine learning literature. The author addresses the conceptual distinctions drawn between terms frequently used in the fair ML literature–including ‘discrimination’ and ‘fairness’–and the use of related terms in the philosophical literature. He suggests that ‘fairness’ as used in the fair machine learning community is best understood as a placeholder term for a variety of normative egalitarian considerations. He also provides an overview of implications for the incorporation of ‘fairness’ into algorithmic decision-making systems. We hope you like the coverage of Session 4. Don’t miss our coverage on Session 5 on Fat recommenders and more.

0
0
22236

article-image-10-machine-learning-tools-to-look-out-for-in-2018

Amey Varangaonkar

26 Dec 2017

7 min read

10 Machine Learning Tools to watch in 2018

Amey Varangaonkar

26 Dec 2017

7 min read

0
0
22150

article-image-experts-discuss-dark-patterns-and-deceptive-ui-designs-what-are-they-what-do-they-do-how-do-we-stop-them

Sugandha Lahoti

29 Jun 2019

12 min read

Experts discuss Dark Patterns and deceptive UI designs: What are they? What do they do? How do we stop them?

Sugandha Lahoti

29 Jun 2019

12 min read

Dark patterns are often used online to deceive users into taking actions they would otherwise not take under effective, informed consent. Dark patterns are generally used by shopping websites, social media platforms, mobile apps and services as a part of their user interface design choices. Dark patterns can lead to financial loss, tricking users into giving up vast amounts of personal data, or inducing compulsive and addictive behavior in adults and children. Using dark patterns is unambiguously unlawful in the United States (under Section 5 of the Federal Trade Commission Act and similar state laws), the European Union (under the Unfair Commercial Practices Directive and similar member state laws), and numerous other jurisdictions. Earlier this week, at the Russell Senate Office Building, a panel of experts met to discuss the implications of Dark patterns in the session, Deceptive Design and Dark Patterns: What are they? What do they do? How do we stop them? The session included remarks from Senator. Mark Warner and Deb Fischer, sponsors of the DETOUR Act, and a panel of experts including Tristan Harris (Co-Founder and Executive Director, Center for Humane Technology). The entire panel of experts included: Tristan Harris (Co-Founder and Executive Director, Center for Humane Technology) Rana Foroohar (Global Business Columnist and Associate Editor, Financial Times) Amina Fazlullah (Policy Counsel, Common Sense Media) Paul Ohm (Professor of Law and Associate Dean, Georgetown Law School), also the moderator Katie McInnis (Policy Counsel, Consumer Reports) Marshall Erwin (Senior Director of Trust & Security, Mozilla) Arunesh Mathur (Dept. of Computer Science, Princeton University) Dark patterns are growing in social media platforms, video games, shopping websites, and are increasingly used to target children The expert session was inaugurated by Arunesh Mathur (Dept. of Computer Science, Princeton University) who talked about his new study by researchers from Princeton University and the University of Chicago. The study suggests that shopping websites are abundant with dark patterns that rely on consumer deception. The researchers conducted a large-scale study, analyzing almost 53K product pages from 11K shopping websites to characterize and quantify the prevalence of dark patterns. They so discovered 1,841 instances of dark patterns on shopping websites, which together represent 15 types of dark patterns. One of the dark patterns was Sneak into Website, which adds additional products to users’ shopping carts without their consent. For example, you would buy a bouquet on a website and the website without your consent would add a greeting card in the hopes that you will actually purchase it. Katie McInnis agreed and added that Dark patterns not only undermine the choices that are available to users on social media and shopping platforms but they can also cost users money. User interfaces are sometimes designed to push a user away from protecting their privacy, making it tough to evaluate them. Amina Fazlullah, Policy Counsel, Common Sense Media said that dark patterns are also being used to target children. Manipulative apps use design techniques to shame or confuse children into in-app purchases or trying to keep them on the app for longer. Children mostly are unable to discern these manipulative techniques. Sometimes the screen will have icons or buttons that will appear to be a part of game play and children will click on them not realizing that they're either being asked to make a purchase or being shown an ad or being directed onto another site. There are games which ask for payments or microtransactions to continue the game forward. Mozilla uses product transparency to curb Dark patterns Marshall Erwin, Senior Director of Trust & Security at Mozilla talked about the negative effects of dark patterns and how they make their own products at Mozilla more transparent. They have a set of checks and principles in place to avoid dark patterns. No surprises: If users were to figure out or start to understand exactly what is happening with the browser, it should be consistent with their expectations. If the users are surprised, this means browsers need to make a change either by stopping the activity entirely or creating additional transparency that helps people understand. Anti-tracking technology: Cross-site tracking is one of the most pervasive and pernicious dark patterns across the web today that is enabled by cookies. Browsers should take action to decrease the attack surface in the browser and actively protect people from those patterns online. Mozilla and Apple have introduced anti tracking technology to actively intervene to protect people from the diverse parties that are probably not trustworthy. Detour Act by Senators Warner and Fisher In April, Warner and Fischer had introduced the Deceptive Experiences To Online Users Reduction (DETOUR) Act, a bipartisan legislation to prohibit large online platforms from using dark patterns to trick consumers into handing over their personal data. This act focuses on the activities of large online service providers (over a hundred million users visiting in a given month). Under this act you cannot use practices that trick users into obtaining information or consenting. You will experience new controls about conducting ‘psychological experiments on your users’ and you will no longer be able to target children under 13 with the goal of hooking them into your service. It extends additional rulemaking and enforcement abilities to the Federal Trade Commission. “Protecting users personal data and user autonomy online are truly bipartisan issues”: Senator Mark Warner In his presentation, Warner talked about how 2019 is the year when we need to recognize dark patterns and their ongoing manipulation of American consumers. While we've all celebrated the benefits that communities have brought from social media, there is also an enormous dark underbelly, he says. It is important that Congress steps up and we play a role as senators such that Americans and their private data is not misused or manipulated going forward. Protecting users personal data and user autonomy online are truly bipartisan issues. This is not a liberal versus conservative, it's much more a future versus past and how we get this future right in a way that takes advantage of social media tools but also put some of the appropriate constraints in place. He says that the driving notion behind the Detour act is that users should have the choice and autonomy when it comes to their personal data. When a company like Facebook asks you to upload your phone contacts or some other highly valuable data to their platform, you ought to have a simple choice yes or no. Companies that run experiments on you without your consent are coercive and Detour act aims to put appropriate protections in place that defend user's ability to make informed choices. In addition to prohibiting large online platforms from using dark patterns to trick consumers into handing over their personal data, the bill would also require informed consent for behavior experimentation. In the process, the bill will be sending a clear message to the platform companies and the FTC that they are now in the business of preserving user's autonomy when it comes to the use of their personal data. The goal, Warner says, is simple - to bring some transparency to what remains a very opaque market and give consumers the tools they need to make informed choices about how and when to share their personal information. “Curbing the use of dark patterns will be foundational to increasing trust online” : Senator Deb Fischer Fischer argued that tech companies are increasingly tailoring users’ online experiences in ways that are more granular. On one hand, she says, you get a more personalized user experience and platforms are more responsive, however it's this variability that allows companies to take that design just a step too far. Companies are constantly competing for users attention and this increases the motivation for a more intrusive and invasive user design. The ability for online platforms to guide the visual interfaces that billions of people view is an incredible influence. It forces us to assess the impact of design on user privacy and well-being. Fundamentally the detour act would prohibit large online platforms from purposely using deceptive user interfaces - dark patterns. The detour act would provide a better accountability system for improved transparency and autonomy online. The legislation would take an important step to restore the hidden options. It would give users a tool to get out of the maze that coaxes you to just click on ‘I agree’. A privacy framework that involves consent cannot function properly if it doesn't ensure the user interface presents fair and transparent options. The detour act would enable the creation of a professional standards body which can register with the Federal Trade Commission. This would serve as a self regulatory body to develop best practices for UI design with the FTC as a backup. She adds, “We need clarity for the enforcement of dark patterns that don't directly involve our wallets. We need policies that place value on user choice and personal data online. We need a stronger mechanism to protect the public interest when the goal for tech companies is to make people engage more and more. User consent remains weakened by the presence of dark patterns and unethical design. Curbing the use of dark patterns will be foundational to increasing trust online. The detour act does provide a key step in getting there.” “The DETOUR act is calling attention to asymmetry and preventing deceptive asymmetry”: Tristan Harris Tristan says that companies are now competing not on manipulating your immediate behavior but manipulating and predicting the future. For example, Facebook has something called loyalty prediction which allows them to sell to an advertiser the ability to predict when you're going to become disloyal to a brand. It can sell that opportunity to another advertiser before probably you know you're going to switch. The DETOUR act is a huge step in the right direction because it's about calling attention to asymmetry and preventing deceptive asymmetry. We need a new relationship for this asymmetric power by having a duty of care. It’s about treating asymmetrically powerful technologies to be in the service of the systems that they are supposed to protect. He says, we need to switch to a regenerative energy economy that actually treats attention as sacred and not directly tying profit to user extraction. Top questions raised by the panel and online viewers Does A/B testing result in dark patterns? Dark patterns are often a result of A/B testing right where a designer may try things that lead to better engagement or maybe nudge users in a way where the company benefits. However, A/B testing isn't the problem, it’s the intention of how A/B testing is being used. Companies and other organizations should have an oversight on the different experiments that they are conducting to see if A/B testing is actually leading to some kind of concrete harm. The challenge in the space is drawing a line about A/B testing features and optimizing for engagement and decreasing friction. Are consumers smart enough to tackle dark patterns on their own or do we need a legislation? It's well established that for children whose brains are just developing, they're unable to discern these types of deceptive techniques so especially for kids, these types of practices should be banned. For vulnerable families who are juggling all sorts of concerns around income and access to jobs and transportation and health care, putting this on their plate as well is just unreasonable. Dark patterns are deployed for an array of opaque reasons the average user will never recognize. From a consumer perspective, going through and identifying dark pattern techniques--that these platform companies have spent hundreds of thousands of dollars developing to be as opaque and as tricky as possible--is an unrealistic expectation put on consumers. This is why the DETOUR act and this type of regulation are absolutely necessary and the only way forward. What is it about the largest online providers that make us want to focus on them first or only? Is it their scale or do they have more powerful dark patterns? Is it because they're just harming more people or is it politics? Sometimes larger companies stay wary of indulging in dark patterns because they have a greater risk in terms of getting caught and the PR backlash. However, they do engage in manipulative practices and that warrants a lot of attention. Moreover, targeting bigger companies is just one part of a more comprehensive privacy enforcement environment. Hitting companies that have a large number of users is also great for consumer engagement. Obviously there is a need to target more broadly but this is a starting point. If Facebook were to suddenly reclass itself and its advertising business model, would you still trust them? No, the leadership that's in charge now for Facebook can not be trusted, especially the organizational cultures that have been building. There are change efforts going on inside of Google and Facebook right now but it’s getting gridlocked. Even if employees want to see policies being changed, they still have bonus structures and employee culture to keep in mind. We recommend you to go through the full hearing here. You can read more about the Detour Act here. U.S. senators introduce a bipartisan bill that bans social media platforms from using ‘dark patterns’ to trick its users. How social media enabled and amplified the Christchurch terrorist attack A new study reveals how shopping websites use ‘dark patterns’ to deceive you into buying things you may not want

0
0
22147

article-image-basics-image-histograms-opencv

Packt

12 Oct 2016

11 min read

Basics of Image Histograms in OpenCV

Packt

12 Oct 2016

11 min read

In this article by Samyak Datta, author of the book Learning OpenCV 3 Application Development we are going to focus our attention on a different style of processing pixel values. The output of the techniques, which would comprise our study in the current article, will not be images, but other forms of representation for images, namely image histograms. We have seen that a two-dimensional grid of intensity values is one of the default forms of representing images in digital systems for processing as well as storage. However, such representations are not at all easy to scale. So, for an image with a reasonably low spatial resolution, say 512 x 512 pixels, working with a two-dimensional grid might not pose any serious issues. However, as the dimensions increase, the corresponding increase in the size of the grid may start to adversely affect the performance of the algorithms that work with the images. A primary advantage that an image histogram has to offer is that the size of a histogram is a constant that is independent of the dimensions of the image. As a consequence of this, we are guaranteed that irrespective of the spatial resolution of the images that we are dealing with, the algorithms that power our solutions will have to deal with a constant amount of data if they are working with image histograms. (For more resources related to this topic, see here.) Each descriptor captures some particular aspects or features of the image to construct its own form of representation. One of the common pitfalls of using histograms as a form of image representation as compared to its native form of using the entire two-dimensional grid of values is loss of information. A full-fledged image representation using pixel intensity values for all pixel locations naturally consists of all the information that you would need to reconstruct a digital image. However, the same cannot be said about histograms. When we study about image histograms in detail, we'll get to see exactly what information do we stand to lose. And this loss in information is prevalent across all forms of image descriptors. The basics of histograms At the outset, we will briefly explain the concept of a histogram. Most of you might already know this from your lessons on basic statistics. However, we will reiterate this for the sake of completeness. Histogram is a form of data representation technique that relies on an aggregation of data points. The data is aggregated into a set of predefined bins that are represented along the x axis, and the number of data points that fall within each of the bins make up the corresponding counts on the y axis. For example, let's assume that our data looks something like the following: D={2,7,1,5,6,9,14,11,8,10,13} If we define three bins, namely Bin_1 (1 - 5), Bin_2 (6 - 10), and Bin_3 (11 - 15), then the histogram corresponding to our data would look something like this: Bins Frequency Bin_1 (1 - 5) 3 Bin_2 (6 - 10) 5 Bin_3 (11 - 15) 3 What this histogram data tells us is that we have three values between 1 and 5, five between 6 and 10, and three again between 11 and 15. Note that it doesn't tell us what the values are, just that some n values exist in a given bin. A more familiar visual representation of the histogram in discussion is shown as follows: As you can see, the bins have been plotted along the x axis and their corresponding frequencies along the y axis. Now, in the context of images, how is a histogram computed? Well, it's not that difficult to deduce. Since the data that we have comprise pixel intensity values, an image histogram is computed by plotting a histogram using the intensity values of all its constituent pixels. What this essentially means is that the sequence of pixel intensity values in our image becomes the data. Well, this is in fact the simplest kind of histogram that you can compute using the information available to you from the image. Now, coming back to image histograms, there are some basic terminologies (pertaining to histograms in general) that you need to be aware of before you can dip your hands into code. We have explained them in detail here: Histogram size: The histogram size refers to the number of bins in the histogram. Range: The range of a histogram is the range of data that we are dealing with. The range of data as well as the histogram size are both important parameters that define a histogram. Dimensions: Simply put, dimensions refer to the number of the type of items whose values we aggregate in the histogram bins. For example, consider a grayscale image. We might want to construct a histogram using the pixel intensity values for such an image. This would be an example of a single-dimensional histogram because we are just interested in aggregating the pixel intensity values and nothing else. The data, in this case, is spread over a range of 0 to 255. On account of being one-dimensional, such histograms can be represented graphically as 2D plots—one-dimensional data (pixel intensity values) being plotted on the x axis (in the form of bins) along with the corresponding frequency counts along the y axis. We have already seen an example of this before. Now, imagine a color image with three channels: red, green, and blue. Let's say that we want to plot a histogram for the intensities in the red and green channels combined. This means that our data now becomes a pair of values (r, g). A histogram that is plotted for such data will have a dimensionality of 2. The plot for such a histogram will be a 3D plot with the data bins covering the x and y axes and the frequency counts plotted along the z axis. Now that we have discussed the theoretical aspects of image histograms in detail, let's start thinking along the lines of code. We will start with the simplest (and in fact the most ubiquitous) design of image histograms. The range of our data will be from 0 to 255 (both inclusive), which means that all our data points will be integers that fall within the specified range. Also, the number of data points will equal the number of pixels that make up our input image. The simplicity in design comes from the fact that we fix the size of the histogram (the number of bins) as 256. Now, take a moment to think about what this means. There are 256 different possible values that our data points can take and we have a separate bin corresponding to each one of those values. So such an image histogram will essentially depict the 256 possible intensity values along with the counts of the number of pixels in the image that are colored with each of the different intensities. Before taking a peek at what OpenCV has to offer, let's try to implement such a histogram on our own! We define a function named computeHistogram() that takes the grayscale image as an input argument and returns the image histogram. From our earlier discussions, it is evident that the histogram must contain 256 entries (for the 256 bins): one for each integer between 0 and 255. The value stored in the histogram corresponding to each of the 256 entries will be the count of the image pixels that have a particular intensity value. So, conceptually, we can use an array for our implementation such that the value stored in the histogram [ i ] (for 0≤i≤255) will be the count of the number of pixels in the image having the intensity of i. However, instead of using a C++ array, we will comply with the rules and standards followed by OpenCV and represent the histogram as a Mat object. We have already seen that a Mat object is nothing but a multidimensional array store. The implementation is outlined in the following code snippet: Mat computeHistogram(Mat input_image) { Mat histogram = Mat::zeros(256, 1, CV_32S); for (int i = 0; i < input_image.rows; ++i) { for (int j = 0; j < input_image.cols; ++j) { int binIdx = (int) input_image.at<uchar>(i, j); histogram.at<int>(binIdx, 0) += 1; } } return histogram; } As you can see, we have chosen to represent the histogram as a 256-element-column-vector Mat object. We iterate over all the pixels in the input image and keep on incrementing the corresponding counts in the histogram (which had been initialized to 0). As per our description of the image histogram properties, it is easy to see that the intensity value of any pixel is the same as the bin index that is used to index into the appropriate histogram bin to increment the count. Having such an implementation ready, let's test it out with the help of an actual image. The following code demonstrates a main() function that reads an input image, calls the computeHistogram() function that we have defined just now, and displays the contents of the histogram that is returned as a result: int main() { Mat input_image = imread("/home/samyak/Pictures/lena.jpg", IMREAD_GRAYSCALE); Mat histogram = computeHistogram(input_image); cout << "Histogram...n"; for (int i = 0; i < histogram.rows; ++i) cout << i << " : " << histogram.at<int>(i, 0) << "n"; return 0; } We have used the fact that the histogram that is returned from the function will be a single column Mat object. This makes the code that displays the contents of the histogram much cleaner. Histograms in OpenCV We have just seen the implementation of a very basic and minimalistic histogram using the first principles in OpenCV. The image histogram was basic in the sense that all the bins were uniform in size and comprised only a single pixel intensity. This made our lives simple when we designed our code for the implementation; there wasn't any need to explicitly check the membership of a data point (the intensity value of a pixel) with all the bins of our histograms. However, we know that a histogram can have bins whose sizes span more than one. Can you think of the changes that we might need to make in the code that we had written just now to accommodate for bin sizes larger than 1? If this change seems doable to you, try to figure out how to incorporate the possibility of non-uniform bin sizes or multidimensional histograms. By now, things might have started to get a little overwhelming to you. No need to worry. As always, OpenCV has you covered! The developers at OpenCV have provided you with a calcHist() function whose sole purpose is to calculate the histograms for a given set of arrays. By arrays, we refer to the images represented as Mat objects, and we use the term set because the function has the capability to compute multidimensional histograms from the given data: Mat computeHistogram(Mat input_image) { Mat histogram; int channels[] = { 0 }; int histSize[] = { 256 }; float range[] = { 0, 256 }; const float* ranges[] = { range }; calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges, true, false); return histogram; } Before we move on to an explanation of the different parameters involved in the calcHist() function call, I want to bring your attention to the abundant use of arrays in the preceding code snippet. Even arguments as simple as histogram sizes are passed to the function in the form of arrays rather than integer values, which at first glance seem quite unnecessary and counter-intuitive. The usage of arrays is due to the fact that the implementation of calcHist() is equipped to handle multidimensional histograms as well, and when we are dealing with such multidimensional histogram data, we require multiple parameters to be passed, one for each dimension. This would become clearer once we demonstrate an example of calculating multidimensional histograms using the calcHist() function. For the time being, we just wanted to clear the immediate confusion that might have popped up in your minds upon seeing the array parameters. Here is a detailed list of the arguments in the calcHist() function call: Source images Number of source images Channel indices Mask Dimensions (dims) Histogram size Ranges Uniform flag Accumulate flag The last couple of arguments (the uniform and accumulate flags) have default values of true and false, respectively. Hence, the function call that you have seen just now can very well be written as follows: calcHist(&input_image, 1, channels, Mat(), histogram, 1, histSize, ranges); Summary Thus in this article we have successfully studied fundamentals of using histograms in OpenCV for image processing. Resources for Article: Further resources on this subject: Remote Sensing and Histogram [article] OpenCV: Image Processing using Morphological Filters [article] Learn computer vision applications in Open CV [article]

0
0
22139

Packt

25 Aug 2014

18 min read

Camera Calibration

Packt

25 Aug 2014

18 min read

This article by Robert Laganière, author of OpenCV Computer Vision Application Programming Cookbook Second Edition, includes that images are generally produced using a digital camera, which captures a scene by projecting light going through its lens onto an image sensor. The fact that an image is formed by the projection of a 3D scene onto a 2D plane implies the existence of important relationships between a scene and its image and between different images of the same scene. Projective geometry is the tool that is used to describe and characterize, in mathematical terms, the process of image formation. In this article, we will introduce you to some of the fundamental projective relations that exist in multiview imagery and explain how these can be used in computer vision programming. You will learn how matching can be made more accurate through the use of projective constraints and how a mosaic from multiple images can be composited using two-view relations. Before we start the recipe, let's explore the basic concepts related to scene projection and image formation. (For more resources related to this topic, see here.) Image formation Fundamentally, the process used to produce images has not changed since the beginning of photography. The light coming from an observed scene is captured by a camera through a frontal aperture; the captured light rays hit an image plane (or an image sensor) located at the back of the camera. Additionally, a lens is used to concentrate the rays coming from the different scene elements. This process is illustrated by the following figure: Here, do is the distance from the lens to the observed object, di is the distance from the lens to the image plane, and f is the focal length of the lens. These quantities are related by the so-called thin lens equation: In computer vision, this camera model can be simplified in a number of ways. First, we can neglect the effect of the lens by considering that we have a camera with an infinitesimal aperture since, in theory, this does not change the image appearance. (However, by doing so, we ignore the focusing effect by creating an image with an infinite depth of field.) In this case, therefore, only the central ray is considered. Second, since most of the time we have do>>di, we can assume that the image plane is located at the focal distance. Finally, we can note from the geometry of the system that the image on the plane is inverted. We can obtain an identical but upright image by simply positioning the image plane in front of the lens. Obviously, this is not physically feasible, but from a mathematical point of view, this is completely equivalent. This simplified model is often referred to as the pin-hole camera model, and it is represented as follows: From this model, and using the law of similar triangles, we can easily derive the basic projective equation that relates a pictured object with its image: The size (hi) of the image of an object (of height ho) is therefore inversely proportional to its distance (do) from the camera, which is naturally true. In general, this relation describes where a 3D scene point will be projected on the image plane given the geometry of the camera. Calibrating a camera From the introduction of this article, we learned that the essential parameters of a camera under the pin-hole model are its focal length and the size of the image plane (which defines the field of view of the camera). Also, since we are dealing with digital images, the number of pixels on the image plane (its resolution) is another important characteristic of a camera. Finally, in order to be able to compute the position of an image's scene point in pixel coordinates, we need one additional piece of information. Considering the line coming from the focal point that is orthogonal to the image plane, we need to know at which pixel position this line pierces the image plane. This point is called the principal point. It might be logical to assume that this principal point is at the center of the image plane, but in practice, this point might be off by a few pixels depending on the precision at which the camera has been manufactured. Camera calibration is the process by which the different camera parameters are obtained. One can obviously use the specifications provided by the camera manufacturer, but for some tasks, such as 3D reconstruction, these specifications are not accurate enough. Camera calibration will proceed by showing known patterns to the camera and analyzing the obtained images. An optimization process will then determine the optimal parameter values that explain the observations. This is a complex process that has been made easy by the availability of OpenCV calibration functions. How to do it... To calibrate a camera, the idea is to show it a set of scene points for which their 3D positions are known. Then, you need to observe where these points project on the image. With the knowledge of a sufficient number of 3D points and associated 2D image points, the exact camera parameters can be inferred from the projective equation. Obviously, for accurate results, we need to observe as many points as possible. One way to achieve this would be to take one picture of a scene with many known 3D points, but in practice, this is rarely feasible. A more convenient way is to take several images of a set of some 3D points from different viewpoints. This approach is simpler but requires you to compute the position of each camera view in addition to the computation of the internal camera parameters, which fortunately is feasible. OpenCV proposes that you use a chessboard pattern to generate the set of 3D scene points required for calibration. This pattern creates points at the corners of each square, and since this pattern is flat, we can freely assume that the board is located at Z=0, with the X and Y axes well-aligned with the grid. In this case, the calibration process simply consists of showing the chessboard pattern to the camera from different viewpoints. Here is one example of a 6x4 calibration pattern image: The good thing is that OpenCV has a function that automatically detects the corners of this chessboard pattern. You simply provide an image and the size of the chessboard used (the number of horizontal and vertical inner corner points). The function will return the position of these chessboard corners on the image. If the function fails to find the pattern, then it simply returns false: // output vectors of image points std::vector<cv::Point2f> imageCorners; // number of inner corners on the chessboard cv::Size boardSize(6,4); // Get the chessboard corners bool found = cv::findChessboardCorners(image, boardSize, imageCorners); The output parameter, imageCorners, will simply contain the pixel coordinates of the detected inner corners of the shown pattern. Note that this function accepts additional parameters if you needs to tune the algorithm, which are not discussed here. There is also a special function that draws the detected corners on the chessboard image, with lines connecting them in a sequence: //Draw the corners cv::drawChessboardCorners(image, boardSize, imageCorners, found); // corners have been found The following image is obtained: The lines that connect the points show the order in which the points are listed in the vector of detected image points. To perform a calibration, we now need to specify the corresponding 3D points. You can specify these points in the units of your choice (for example, in centimeters or in inches); however, the simplest is to assume that each square represents one unit. In that case, the coordinates of the first point would be (0,0,0) (assuming that the board is located at a depth of Z=0), the coordinates of the second point would be (1,0,0), and so on, the last point being located at (5,3,0). There are a total of 24 points in this pattern, which is too small to obtain an accurate calibration. To get more points, you need to show more images of the same calibration pattern from various points of view. To do so, you can either move the pattern in front of the camera or move the camera around the board; from a mathematical point of view, this is completely equivalent. The OpenCV calibration function assumes that the reference frame is fixed on the calibration pattern and will calculate the rotation and translation of the camera with respect to the reference frame. Let's now encapsulate the calibration process in a CameraCalibrator class. The attributes of this class are as follows: class CameraCalibrator { // input points: // the points in world coordinates std::vector<std::vector<cv::Point3f>> objectPoints; // the point positions in pixels std::vector<std::vector<cv::Point2f>> imagePoints; // output Matrices cv::Mat cameraMatrix; cv::Mat distCoeffs; // flag to specify how calibration is done int flag; Note that the input vectors of the scene and image points are in fact made of std::vector of point instances; each vector element is a vector of the points from one view. Here, we decided to add the calibration points by specifying a vector of the chessboard image filename as input: // Open chessboard images and extract corner points int CameraCalibrator::addChessboardPoints( const std::vector<std::string>& filelist, cv::Size & boardSize) { // the points on the chessboard std::vector<cv::Point2f> imageCorners; std::vector<cv::Point3f> objectCorners; // 3D Scene Points: // Initialize the chessboard corners // in the chessboard reference frame // The corners are at 3D location (X,Y,Z)= (i,j,0) for (int i=0; i<boardSize.height; i++) { for (int j=0; j<boardSize.width; j++) { objectCorners.push_back(cv::Point3f(i, j, 0.0f)); } } // 2D Image points: cv::Mat image; // to contain chessboard image int successes = 0; // for all viewpoints for (int i=0; i<filelist.size(); i++) { // Open the image image = cv::imread(filelist[i],0); // Get the chessboard corners bool found = cv::findChessboardCorners( image, boardSize, imageCorners); // Get subpixel accuracy on the corners cv::cornerSubPix(image, imageCorners, cv::Size(5,5), cv::Size(-1,-1), cv::TermCriteria(cv::TermCriteria::MAX_ITER + cv::TermCriteria::EPS, 30, // max number of iterations 0.1)); // min accuracy //If we have a good board, add it to our data if (imageCorners.size() == boardSize.area()) { // Add image and scene points from one view addPoints(imageCorners, objectCorners); successes++; } } return successes; } The first loop inputs the 3D coordinates of the chessboard, and the corresponding image points are the ones provided by the cv::findChessboardCorners function. This is done for all the available viewpoints. Moreover, in order to obtain a more accurate image point location, the cv::cornerSubPix function can be used, and as the name suggests, the image points will then be localized at a subpixel accuracy. The termination criterion that is specified by the cv::TermCriteria object defines the maximum number of iterations and the minimum accuracy in subpixel coordinates. The first of these two conditions that is reached will stop the corner refinement process. When a set of chessboard corners have been successfully detected, these points are added to our vectors of the image and scene points using our addPoints method. Once a sufficient number of chessboard images have been processed (and consequently, a large number of 3D scene point / 2D image point correspondences are available), we can initiate the computation of the calibration parameters as follows: // Calibrate the camera // returns the re-projection error double CameraCalibrator::calibrate(cv::Size &imageSize) { //Output rotations and translations std::vector<cv::Mat> rvecs, tvecs; // start calibration return calibrateCamera(objectPoints, // the 3D points imagePoints, // the image points imageSize, // image size cameraMatrix, // output camera matrix distCoeffs, // output distortion matrix rvecs, tvecs, // Rs, Ts flag); // set options } In practice, 10 to 20 chessboard images are sufficient, but these must be taken from different viewpoints at different depths. The two important outputs of this function are the camera matrix and the distortion parameters. These will be described in the next section. How it works... In order to explain the result of the calibration, we need to go back to the figure in the introduction, which describes the pin-hole camera model. More specifically, we want to demonstrate the relationship between a point in 3D at the position (X,Y,Z) and its image (x,y) on a camera specified in pixel coordinates. Let's redraw this figure by adding a reference frame that we position at the center of the projection as seen here: Note that the y axis is pointing downward to get a coordinate system compatible with the usual convention that places the image origin at the upper-left corner. We learned previously that the point (X,Y,Z) will be projected onto the image plane at (fX/Z,fY/Z). Now, if we want to translate this coordinate into pixels, we need to divide the 2D image position by the pixel's width (px) and height (py), respectively. Note that by dividing the focal length given in world units (generally given in millimeters) by px, we obtain the focal length expressed in (horizontal) pixels. Let's then define this term as fx. Similarly, fy =f/py is defined as the focal length expressed in vertical pixel units. Therefore, the complete projective equation is as follows: Recall that (u0,v0) is the principal point that is added to the result in order to move the origin to the upper-left corner of the image. These equations can be rewritten in the matrix form through the introduction of homogeneous coordinates, in which 2D points are represented by 3-vectors and 3D points are represented by 4-vectors (the extra coordinate is simply an arbitrary scale factor, S, that needs to be removed when a 2D coordinate needs to be extracted from a homogeneous 3-vector). Here is the rewritten projective equation: The second matrix is a simple projection matrix. The first matrix includes all of the camera parameters, which are called the intrinsic parameters of the camera. This 3x3 matrix is one of the output matrices returned by the cv::calibrateCamera function. There is also a function called cv::calibrationMatrixValues that returns the value of the intrinsic parameters given by a calibration matrix. More generally, when the reference frame is not at the projection center of the camera, we will need to add a rotation vector (a 3x3 matrix) and a translation vector (a 3x1 matrix). These two matrices describe the rigid transformation that must be applied to the 3D points in order to bring them back to the camera reference frame. Therefore, we can rewrite the projection equation in its most general form: Remember that in our calibration example, the reference frame was placed on the chessboard. Therefore, there is a rigid transformation (made of a rotation component represented by the matrix entries r1 to r9 and a translation represented by t1, t2, and t3) that must be computed for each view. These are in the output parameter list of the cv::calibrateCamera function. The rotation and translation components are often called the extrinsic parameters of the calibration, and they are different for each view. The intrinsic parameters remain constant for a given camera/lens system. The intrinsic parameters of our test camera obtained from a calibration based on 20 chessboard images are fx=167, fy=178, u0=156, and v0=119. These results are obtained by cv::calibrateCamera through an optimization process aimed at finding the intrinsic and extrinsic parameters that will minimize the difference between the predicted image point position, as computed from the projection of the 3D scene points, and the actual image point position, as observed on the image. The sum of this difference for all the points specified during the calibration is called the re-projection error. Let's now turn our attention to the distortion parameters. So far, we have mentioned that under the pin-hole camera model, we can neglect the effect of the lens. However, this is only possible if the lens that is used to capture an image does not introduce important optical distortions. Unfortunately, this is not the case with lower quality lenses or with lenses that have a very short focal length. You may have already noted that the chessboard pattern shown in the image that we used for our example is clearly distorted—the edges of the rectangular board are curved in the image. Also, note that this distortion becomes more important as we move away from the center of the image. This is a typical distortion observed with a fish-eye lens, and it is called radial distortion. The lenses used in common digital cameras usually do not exhibit such a high degree of distortion, but in the case of the lens used here, these distortions certainly cannot be ignored. It is possible to compensate for these deformations by introducing an appropriate distortion model. The idea is to represent the distortions induced by a lens by a set of mathematical equations. Once established, these equations can then be reverted in order to undo the distortions visible on the image. Fortunately, the exact parameters of the transformation that will correct the distortions can be obtained together with the other camera parameters during the calibration phase. Once this is done, any image from the newly calibrated camera will be undistorted. Therefore, we have added an additional method to our calibration class: // remove distortion in an image (after calibration) cv::Mat CameraCalibrator::remap(const cv::Mat &image) { cv::Mat undistorted; if (mustInitUndistort) { // called once per calibration cv::initUndistortRectifyMap( cameraMatrix, // computed camera matrix distCoeffs, // computed distortion matrix cv::Mat(), // optional rectification (none) cv::Mat(), // camera matrix to generate undistorted image.size(), // size of undistorted CV_32FC1, // type of output map map1, map2); // the x and y mapping functions mustInitUndistort= false; } // Apply mapping functions cv::remap(image, undistorted, map1, map2, cv::INTER_LINEAR); // interpolation type return undistorted; } Running this code results in the following image: As you can see, once the image is undistorted, we obtain a regular perspective image. To correct the distortion, OpenCV uses a polynomial function that is applied to the image points in order to move them at their undistorted position. By default, five coefficients are used; a model made of eight coefficients is also available. Once these coefficients are obtained, it is possible to compute two cv::Mat mapping functions (one for the x coordinate and one for the y coordinate) that will give the new undistorted position of an image point on a distorted image. This is computed by the cv::initUndistortRectifyMap function, and the cv::remap function remaps all the points of an input image to a new image. Note that because of the nonlinear transformation, some pixels of the input image now fall outside the boundary of the output image. You can expand the size of the output image to compensate for this loss of pixels, but you will now obtain output pixels that have no values in the input image (they will then be displayed as black pixels). There's more... More options are available when it comes to camera calibration. Calibration with known intrinsic parameters When a good estimate of the camera's intrinsic parameters is known, it could be advantageous to input them in the cv::calibrateCamera function. They will then be used as initial values in the optimization process. To do so, you just need to add the CV_CALIB_USE_INTRINSIC_GUESS flag and input these values in the calibration matrix parameter. It is also possible to impose a fixed value for the principal point (CV_CALIB_FIX_PRINCIPAL_POINT), which can often be assumed to be the central pixel. You can also impose a fixed ratio for the focal lengths fx and fy (CV_CALIB_FIX_RATIO); in which case, you assume the pixels of the square shape. Using a grid of circles for calibration Instead of the usual chessboard pattern, OpenCV also offers the possibility to calibrate a camera by using a grid of circles. In this case, the centers of the circles are used as calibration points. The corresponding function is very similar to the function we used to locate the chessboard corners: cv::Size boardSize(7,7); std::vector<cv::Point2f> centers; bool found = cv:: findCirclesGrid( image, boardSize, centers); See also The A flexible new technique for camera calibration article by Z. Zhang in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 22, no 11, 2000, is a classic paper on the problem of camera calibration Summary In this article, we explored the projective relations that exist between two images of the same scene. Resources for Article: Further resources on this subject: Creating an Application from Scratch [Article] Wrapping OpenCV [Article] New functionality in OpenCV 3.0 [Article]

0
0
22104

article-image-speech2face-a-neural-network-that-imagines-faces-from-hearing-voices-is-it-too-soon-to-worry-about-ethnic-profiling

Savia Lobo

28 May 2019

8 min read

Speech2Face: A neural network that “imagines” faces from hearing voices. Is it too soon to worry about ethnic profiling?

Savia Lobo

28 May 2019

8 min read

Last week, a few researchers from the MIT CSAIL and Google AI published their research study of reconstructing a facial image of a person from a short audio recording of that person speaking, in their paper titled, “Speech2Face: Learning the Face Behind a Voice”. The researchers designed and trained a neural network which uses millions of natural Internet/YouTube videos of people speaking. During training, they demonstrated that the model learns voice-face correlations that allows it to produce images that capture various physical attributes of the speakers such as age, gender, and ethnicity. The entire training was done in a self-supervised manner, by utilizing the natural co-occurrence of faces and speech in Internet videos, without the need to model attributes explicitly. They said they further evaluated and numerically quantified how their Speech2Face reconstructs, obtains results directly from audio, and how it resembles the true face images of the speakers. For this, they tested their model both qualitatively and quantitatively on the AVSpeech dataset and the VoxCeleb dataset. The Speech2Face model The researchers utilized the VGG-Face model, a face recognition model pre-trained on a large-scale face dataset called DeepFace and extracted a 4096-D face feature from the penultimate layer (fc7) of the network. These face features were shown to contain enough information to reconstruct the corresponding face images while being robust to many of the aforementioned variations. The Speech2Face pipeline consists of two main components: 1) a voice encoder, which takes a complex spectrogram of speech as input, and predicts a low-dimensional face feature that would correspond to the associated face; and 2) a face decoder, which takes as input the face feature and produces an image of the face in a canonical form (frontal-facing and with neutral expression). During training, the face decoder is fixed, and only the voice encoder is trained which further predicts the face feature. How were the facial features evaluated? To quantify how well different facial attributes are being captured in Speech2Face reconstructions, the researchers tested different aspects of the model. Demographic attributes Researchers used Face++, a leading commercial service for computing facial attributes. They evaluated and compared age, gender, and ethnicity, by running the Face++ classifiers on the original images and our Speech2Face reconstructions. The Face++ classifiers return either “male” or “female” for gender, a continuous number for age, and one of the four values, “Asian”, “black”, “India”, or “white”, for ethnicity. Source: Arxiv.org Craniofacial attributes Source: Arxiv.org The researchers evaluated craniofacial measurements commonly used in the literature, for capturing ratios and distances in the face. They computed the correlation between F2F and the corresponding S2F reconstructions. Face landmarks were computed using the DEST library. As can be seen, there is statistically significant (i.e., p < 0.001) positive correlation for several measurements. In particular, the highest correlation is measured for the nasal index (0.38) and nose width (0.35), the features indicative of nose structures that may affect a speaker’s voice. Feature similarity The researchers further test how well a person can be recognized from on the face features predicted from speech. They, first directly measured the cosine distance between the predicted features and the true ones obtained from the original face image of the speaker. The table above shows the average error over 5,000 test images, for the predictions using 3s and 6s audio segments. The use of longer audio clips exhibits consistent improvement in all error metrics; this further evidences the qualitative improvement observed in the image below. They further evaluated how accurately they could retrieve the true speaker from a database of face images. To do so, they took the speech of a person to predict the feature using the Speech2Face model and query it by computing its distances to the face features of all face images in the database. Ethical considerations with Speech2Face model Researchers said that the training data used is a collection of educational videos from YouTube and that it does not represent equally the entire world population. Hence, the model may be affected by the uneven distribution of data. They have also highlighted that “ if a certain language does not appear in the training data, our reconstructions will not capture well the facial attributes that may be correlated with that language”. “In our experimental section, we mention inferred demographic categories such as “White” and “Asian”. These are categories defined and used by a commercial face attribute classifier and were only used for evaluation in this paper. Our model is not supplied with and does not make use of this information at any stage”, the paper mentions. They also warn that any further investigation or practical use of this technology would be carefully tested to ensure that the training data is representative of the intended user population. “If that is not the case, more representative data should be broadly collected”, the researchers state. Limitations of the Speech2Face model In order to test the stability of the Speech2Face reconstruction, the researchers used faces from different speech segments of the same person, taken from different parts within the same video, and from a different video. The reconstructed face images were consistent within and between the videos. They further probed the model with an Asian male example speaking the same sentence in English and Chinese to qualitatively test the effect of language and accent. While having the same reconstructed face in both cases would be ideal, the model inferred different faces based on the spoken language. In other examples, the model was able to successfully factor out the language, reconstructing a face with Asian features even though the girl was speaking in English with no apparent accent. “In general, we observed mixed behaviors and a more thorough examination is needed to determine to which extent the model relies on language. More generally, the ability to capture the latent attributes from speech, such as age, gender, and ethnicity, depends on several factors such as accent, spoken language, or voice pitch. Clearly, in some cases, these vocal attributes would not match the person’s appearance”, the researchers state in the paper. Speech2Cartoon: Converting generated image into cartoon faces The face images reconstructed from speech may also be used for generating personalized cartoons of speakers from their voices. The researchers have used Gboard, the keyboard app available on Android phones, which is also capable of analyzing a selfie image to produce a cartoon-like version of the face. Such cartoon re-rendering of the face may be useful as a visual representation of a person during a phone or a video conferencing call when the person’s identity is unknown or the person prefers not to share his/her picture. The reconstructed faces may also be used directly, to assign faces to machine-generated voices used in home devices and virtual assistants. https://twitter.com/NirantK/status/1132880233017761792 A user on HackerNews commented, “This paper is a neat idea, and the results are interesting, but not in the way I'd expected. I had hoped it would the domain of how much person-specific information this can deduce from a voice, e.g. lip aperture, overbite, size of the vocal tract, openness of the nares. This is interesting from a speech perception standpoint. Instead, it's interesting more in the domain of how much social information it can deduce from a voice. This appears to be a relatively efficient classifier for gender, race, and age, taking voice as input.” “I'm sure this isn't the first time it's been done, but it's pretty neat to see it in action, and it's a worthwhile reminder: If a neural net is this good at inferring social, racial, and gender information from audio, humans are even better. And the idea of speech as a social construct becomes even more relevant”, he further added. This recent study is interesting considering the fact that it is taking AI to another level wherein we are able to predict the face just by using audio recordings and even without the need for a DNA. However, there can be certain repercussions, especially when it comes to security. One can easily misuse such technology by impersonating someone else and can cause trouble. It would be interesting to see how this study turns out to be in the near future. To more about the Speech2Face model in detail, head over to the research paper. OpenAI introduces MuseNet: A deep neural network for generating musical compositions An unsupervised deep neural network cracks 250 million protein sequences to reveal biological structures and functions OpenAI researchers have developed Sparse Transformers, a neural network which can predict what comes next in a sequence

0
0
22010

article-image-how-to-perform-full-text-search-fts-in-postgresql

Sugandha Lahoti

27 Mar 2018

8 min read

How to perform full-text search (FTS) in PostgreSQL

Sugandha Lahoti

27 Mar 2018

8 min read

0
0
21944

article-image-building-classification-system-logistic-regression-opencv

Savia Lobo

28 Nov 2017

7 min read

Building a classification system with logistic regression in OpenCV

Savia Lobo

28 Nov 2017

7 min read

[box type="note" align="" class="" width=""]This article is an excerpt from a book by Michael Beyeler titled Machine Learning for OpenCV. The code and related files are available on Github here.[/box] A famous dataset in the world of machine learning is called the Iris dataset. The Iris dataset contains measurements of 150 iris flowers from three different species: setosa, versicolor, and viriginica. These measurements include the length and width of the petals, and the length and width of the sepals, all measured in centimeters: Understanding logistic regression Despite its name, logistic regression can actually be used as a model for classification. It uses a logistic function (or sigmoid) to convert any real-valued input x into a predicted output value ŷ that take values between 0 and 1, as shown in the following figure: The logistic function Rounding ŷ to the nearest integer effectively classifies the input as belonging either to class 0 or 1. Of course, most often, our problems have more than one input or feature value, x. For example, the Iris dataset provides a total of four features. For the sake of simplicity, let's focus here on the first two features, sepal length—which we will call feature f1—and sepal width—which we will call f2. Using the tricks we learned when talking about linear regression, we know we can express the input x as a linear combination of the two features, f1 and f2: However, in contrast to linear regression, we are not done yet. From the previous section, we know that the sum of products would result in a real-valued, output—but we are interested in a categorical value, zero or one. This is where the logistic function comes in: it acts as a squashing function, σ, that compresses the range of possible output values to the range [0, 1]: [box type="shadow" align="" class="" width=""]Because the output is always between 0 and 1, it can be interpreted as a probability. If we only have a single input variable x, the output value ŷ can be interpreted as the probability of x belonging to class 1.[/box] Now let's apply this knowledge to the Iris dataset! Loading the training data The Iris dataset is included with scikit-learn. We first load all the necessary modules, as we did in our earlier examples: In [1]: import numpy as np ... import cv2 ... from sklearn import datasets ... from sklearn import model_selection ... from sklearn import metrics ... import matplotlib.pyplot as plt ... %matplotlib inline In [2]: plt.style.use('ggplot') Then, loading the dataset is a one-liner: In [3]: iris = datasets.load_iris() This function returns a dictionary we call iris, which contains a bunch of different fields: In [4]: dir(iris) Out[4]: ['DESCR', 'data', 'feature_names', 'target', 'target_names'] Here, all the data points are contained in 'data'. There are 150 data points, each of which has four feature values: In [5]: iris.data.shape Out[5]: (150, 4) These four features correspond to the sepal and petal dimensions mentioned earlier: In [6]: iris.feature_names Out[6]: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)'] For every data point, we have a class label stored in target: In [7]: iris.target.shape Out[7]: (150,) We can also inspect the class labels, and find that there is a total of three classes: In [8]: np.unique(iris.target) Out[8]: array([0, 1, 2]) Making it a binary classification problem For the sake of simplicity, we want to focus on a binary classification problem for now, where we only have two classes. The easiest way to do this is to discard all data points belonging to a certain class, such as class label 2, by selecting all the rows that do not belong to class 2: In [9]: idx = iris.target != 2 ... data = iris.data[idx].astype(np.float32) ... target = iris.target[idx].astype(np.float32) Inspecting the data Before you get started with setting up a model, it is always a good idea to have a look at the data. We did this earlier for the town map example, so let's continue our streak. Using Matplotlib, we create a scatter plot where the color of each data point corresponds to the class label: In [10]: plt.scatter(data[:, 0], data[:, 1], c=target, cmap=plt.cm.Paired, s=100) ... plt.xlabel(iris.feature_names[0]) ... plt.ylabel(iris.feature_names[1]) Out[10]: <matplotlib.text.Text at 0x23bb5e03eb8> To make plotting easier, we limit ourselves to the first two features (iris.feature_names[0] being the sepal length and iris.feature_names[1] being the sepal width). We can see a nice separation of classes in the following figure: Plotting the ﬁrst two features of the Iris dataset Splitting the data into training and test sets We learned in the previous chapter that it is essential to keep training and test data separate. We can easily split the data using one of scikit-learn's many helper functions: In [11]: X_train, X_test, y_train, y_test = model_selection.train_test_split( ... data, target, test_size=0.1, random_state=42 ... ) Here we want to split the data into 90 percent training data and 10 percent test data, which we specify with test_size=0.1. By inspecting the return arguments, we note that we ended up with exactly 90 training data points and 10 test data points: In [12]: X_train.shape, y_train.shape Out[12]: ((90, 4), (90,)) In [13]: X_test.shape, y_test.shape Out[13]: ((10, 4), (10,)) Training the classifier Creating a logistic regression classifier involves pretty much the same steps as setting up k- NN: In [14]: lr = cv2.ml.LogisticRegression_create() We then have to specify the desired training method. Here, we can choose cv2.ml.LogisticRegression_BATCH or cv2.ml.LogisticRegression_MINI_BATCH. For now, all we need to know is that we want to update the model after every data point, which can be achieved with the following code: In [15]: lr.setTrainMethod(cv2.ml.LogisticRegression_MINI_BATCH) ... lr.setMiniBatchSize(1) We also want to specify the number of iterations the algorithm should run before it terminates: In [16]: lr.setIterations(100) We can then call the training method of the object (in the exact same way as we did earlier), which will return True upon success: In [17]: lr.train(X_train, cv2.ml.ROW_SAMPLE, y_train) Out[17]: True As we just saw, the goal of the training phase is to find a set of weights that best transform the feature values into an output label. A single data point is given by its four feature values (f0, f1, f2, f3). Since we have four features, we should also get four weights, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3, and ŷ=σ(x). However, as discussed previously, the algorithm adds an extra weight that acts as an offset or bias, so that x = w0 f0 + w1 f1 + w2 f2 + w3 f3 + w4. We can retrieve these weights as follows: In [18]: lr.get_learnt_thetas() Out[18]: array([[-0.04109113, -0.01968078, -0.16216497, 0.28704911, 0.11945518]], dtype=float32) This means that the input to the logistic function is x = -0.0411 f0 - 0.0197 f1 - 0.162 f2 + 0.287 f3 + 0.119. Then, when we feed in a new data point (f0, f1, f2, f3) that belongs to class 1, the output ŷ=σ(x) should be close to 1. But how well does that actually work? Testing the classifier Let's see for ourselves by calculating the accuracy score on the training set: In [19]: ret, y_pred = lr.predict(X_train) In [20]: metrics.accuracy_score(y_train, y_pred) Out[20]: 1.0 Perfect score! However, this only means that the model was able to perfectly memorize the training dataset. This does not mean that the model would be able to classify a new, unseen data point. For this, we need to check the test dataset: In [21]: ret, y_pred = lr.predict(X_test) ... metrics.accuracy_score(y_test, y_pred) Out[21]: 1.0 Luckily, we get another perfect score! Now we can be sure that the model we built is truly awesome. If you enjoyed building a classifier using logistic regression and would like to learn more machine learning tasks using OpenCV, be sure to check out the book, Machine Learning for OpenCV, where this section originally appears.

0
0
21854

article-image-generative-models-action-create-van-gogh-neural-artistic-style-transfer

Sunith Shetty

03 Apr 2018

14 min read

Generative Models in action: How to create a Van Gogh with Neural Artistic Style Transfer

Sunith Shetty

03 Apr 2018

14 min read

0
0
21816

article-image-measures-and-measure-groups-microsoft-analysis-services-part-2

Packt

15 Oct 2009

20 min read

Measures and Measure Groups in Microsoft Analysis Services: Part 2

Packt

15 Oct 2009

20 min read

Measure groups All but the simplest data warehouses will contain multiple fact tables, and Analysis Services allows you to build a single cube on top of multiple fact tables through the creation of multiple measure groups. These measure groups can contain different dimensions and be at different granularities, but so long as you model your cube correctly, your users will be able to use measures from each of these measure groups in their queries easily and without worrying about the underlying complexity. Creating multiple measure groups To create a new measure group in the Cube Editor, go to the Cube Structure tab and right-click on the cube name in the Measures pane and select 'New Measure Group'. You'll then need to select the fact table to create the measure group from and then the new measure group will be created; any columns that aren't used as foreign key columns in the DSV will automatically be created as measures, and you'll also get an extra measure of aggregation type Count. It's a good idea to delete any measures you are not going to use at this stage. Once you've created a new measure group, BIDS will try to set up relationships between it and any existing dimensions in your cube based on the relationships you've defined in your DSV. Since doing this manually can be time-consuming, this is another great reason for defining relationships in the DSV. You can check the relationships that have been created on the Dimension Usage tab of the Cube Editor: In Analysis Services 2005, it was true in some cases that query performance was better on cubes with fewer measure groups, and that breaking a large cube with many measure groups up into many smaller cubes with only one or two measure groups could result in faster queries. This is no longer the case in Analysis Services 2008. Although there are other reasons why you might want to consider creating separate cubes for each measure group, this is still something of a controversial subject amongst Analysis Services developers. The advantages of a single cube approach are: All of your data is in one place. If your users need to display measures from multiple measure groups, or you need to create calculations that span measure groups, everything is already in place. You only have one cube to manage security and calculations on; with multiple cubes the same security and calculations might have to be duplicated. The advantages of the multiple cube approach are: If you have a complex cube but have to use Standard Edition, you cannot use Perspectives to hide complexity from your users. In this case, creating multiple cubes might be a more user-friendly approach. Depending on your requirements, security might be easier to manage with multiple cubes. It's very easy to grant or deny a role access to a cube; it's much harder to use dimension security to control which measures and dimensions in a multi-measure group cube a role can access. If you have complex calculations, especially MDX Script assignments, it's too easy to write a calculation that has an effect on part of the cube you didn't want to alter. With multiple cubes, the chances of this happening are reduced. Creating measure groups from dimension tables Measure groups don't always have to be created from fact tables. In many cases, it can be useful to build measure groups from dimension tables too. One common scenario where you might want to do this is when you want to create a measure that counts the number of days in the currently selected time period, so if you had selected a year on your Time dimension's hierarchy, the measure would show the number of days in the year. You could implement this with a calculated measure in MDX, but it would be hard to write code that worked in all possible circumstances, such as when a user multi-selects time periods. In fact, it's a better idea to create a new measure group from your Time dimension table containing a new measure with AggregateFunction Count, so you're simply counting the number of days as the number of rows in the dimension table. This measure will perform faster and always return the values you expect. This post on Mosha Pasumansky's blog discusses the problem in more detail: http://tinyurl.com/moshadays MDX formulas vs pre-calculating valuesIf you can somehow model a calculation into the structure of your cube, or perform it in your ETL, you should do so in preference to doing it in MDX only so long as you do not compromise the functionality of your cube. A pure MDX approach will be the most flexible and maintainable since it only involves writing code, and if calculation logic needs to change, then you just need to redeploy your updated MDX Script; doing calculations upstream in the ETL can be much more time-consuming to implement and if you decide to change your calculation logic, then it could involve reloading one or more tables. However, an MDX calculation, even one that is properly tuned, will of course never perform as well as a pre-calculated value or a regular measure. The day count measure, discussed in the previous paragraph, is a perfect example of where a cube-modeling approach trumps MDX. If your aim was to create a measure that showed average daily sales, though, it would make no sense to try to pre-calculate all possible values since that would be far too time-consuming and would result in a non-aggregatable measure. The best solution here would be a hybrid: create real measures for sales and day count, and then create an MDX calculated measure that divided the former by the latter. However, it's always necessary to consider the type of calculation, the volume of data involved and the chances of the calculation algorithm changing in the future before you can make an informed decision on which approach to take. Handling different dimensionality When you have different measure groups in a cube, they are almost always going to have different dimensions associated with them; indeed, if you have measure groups that have identical dimensionality, you might consider combining them into a single measure group if it is convenient to do so. As we've already seen, the Dimension Usage tab shows us which dimensions have relationships with which measure groups. When a dimension has a relationship with a measure group it goes without saying that making a selection on that dimension will affect the values that are displayed for measures on that measure group. But what happens to measures when you make a selection on a dimension that has no relationship with a measure group? In fact, you have two options here, controlled by the IgnoreUnrelatedDimensions property of a measure group: IgnoreUnrelatedDimensions=False displays a null value for all members below the root (the intersection of all of the All Members or default members on every hierarchy) of the dimension, except the Unknown member, or IgnoreUnrelatedDimensions=True repeats the value displayed at the root of the dimension for every member on every hierarchy of the dimension. This is the default state. The screenshot below shows what happens for two otherwise identical measures from measure groups which have IgnoreUnrelatedDimensions set to True and to False when they're displayed next to a dimension they have no relationship with: It's usually best to keep IgnoreUnrelatedDimensions set to True since if the users are querying measures from multiple measure groups, then they don't want some of their selected measures suddenly returning null if they slice by a dimension that has a regular relationship with their other selected measures. Handling different granularities Even when measure groups share the same dimensions, they may not share the same granularity. For example, we may hold sales information in one fact table down to the day level, but also hold sales quotas in another fact table at the quarter level. If we created measure groups from both these fact tables, then they would both have regular relationships with our Time dimension but at different granularities. Normally, when you create a regular relationship between a dimension and a measure group, Analysis Services will join the columns specified in the KeyColumns property of the key attribute of the dimension with the appropriate foreign key columns of the fact table (note that during processing, Analysis Services won't usually do the join in SQL, it does it internally). However, when you have a fact table of a higher granularity, you need to change the granularity attribute property of the relationship to choose the attribute from the dimension you do want to join on instead: In the previous screenshot, we can see an amber warning triangle telling us that by selecting a non-key attribute, the server may have trouble aggregating measure values. What does this mean exactly? Let's take a look at the attribute relationships defined on our Time dimension again: If we're loading data at the Quarter level, what do we expect to see at the Month and Date level? We can only expect to see useful values at the level of the granularity attribute we've chosen, and for only those attributes whose values can be derived from that attribute; this is yet another good reason to make sure your attribute relationships have been optimized. Below the granularity attribute, we've got the same options regarding what gets displayed as we had with dimensions that have no relationship at all with a measure group: either repeated values or null values. The IgnoreUnrelatedDimensions property is again used to control this behavior. Unfortunately, the default True setting for IgnoreUnrelatedDimensions is usually not the option you want to use in this scenario (users usually prefer to see nulls below the granularity of a measure in our experience) and this may conflict with how we want to set IgnoreUnrelatedDimensions to control the behavior of dimensions which have no relationship with a measure group. There are ways of resolving this conflict such as using MDX Script assignments to set cell values to null or by using the ValidMeasure() MDX function, but none are particularly elegant. Non-aggregatable measures: a different approach We've already seen how we can use parent/child hierarchies to load non-aggregatable measure values into our cube. However, given the problems associated with using parent/child hierarchies and knowing what we now know about measure groups, let's consider a different approach to solving this problem. A non-aggregatable measure will have, by its very nature, data stored for many different granularities of a dimension. Rather than storing all of these different granularities of values in the same fact table, we could create multiple fact tables for each granularity of value. Having built measure groups from these fact tables, we would then be able to join our dimension to each of them with a regular relationship but at different granularities. We'd then be in the position of having multiple measures representing the different granularities of a single, logical measure. What we actually want is a single non-aggregatable measure, and we can get this by using MDX Script assignments to combine the different granularities. Let's say we have a regular (non-parent/child) dimension called Employee with three attributes Manager, Team Leader and Sales Person, and a logical non-aggregatable measure called Sales Quota appearing in three measure groups as three measures called Sales Amount Quota_Manager, Sales Amount Quota_TeamLead and Sales Amount Quota for each of these three granularities. Here's a screenshot showing what a query against this cube would show at this stage: We can combine the three measures into one like this: SCOPE([Measures].[Sales Amount Quota]); SCOPE([Employee].[Salesperson].[All]); THIS=[Measures].[Sales Amount Quota_TeamLead]; END SCOPE; SCOPE([Employee].[Team Lead].[All]); THIS=[Measures].[Sales Amount Quota_Manager]; END SCOPE;END SCOPE; This code takes the lowest granularity measure Sales Amount Quota, and then overwrites it twice: the first assignment replaces all of the values above the Sales Person granularity with the value of the measure containing Sales Amount Quota for Team Leaders; the second assignment then replaces all of the values above the Team Leader granularity with the value of the measure containing Sales Quotas for Managers. Once we've set Visible=False for the Sales Amount Quota_TeamLead and Sales Amount Quota_Manager measures, we're left with just the Sales Amount Quota measure visible, thus displaying the non-aggregatable values that we wanted. The user would then see this: Using linked dimensions and measure groups Creating linked dimensions and measure groups allows you to share the same dimensions and measure groups across separate Analysis Services databases, and the same measure group across multiple cubes. To do this, all you need to do is to run the 'New Linked Object' wizard from the Cube Editor, either by clicking on the button in the toolbar on the Cube Structure or Dimension Usage tabs, or by selecting it from the right-click menu in the Measures pane of the Cube Structure tab. Doing this has the advantage of reducing the amount of processing and maintenance needed: instead of having many identical dimensions and measure groups to maintain and keep synchronized, all of which need processing separately, you can have a single object which only needs to be changed and processed once. At least that's the theory—in practice, linked objects are not as widely used as they could be because there are a number of limitations in their use: Linked objects represent a static snapshot of the metadata of the source object, and any changes to the source object are not passed through to the linked object. So for example, if you create a linked dimension and then add an attribute to the source dimension, you then have to delete and recreate the linked dimension—there's no option to refresh a linked object. You can also import the calculations defined in the MDX Script of the source cube using the wizard. However, you can only import the entire script and this may include references to objects present in the source cube that aren't in the target cube, and which may need to be deleted to prevent errors. The calculations that remain will also need to be updated manually when those in the source cube are changed, and if there are a lot, this can add an unwelcome maintenance overhead. A linked measure group can only be used with dimensions from the same database as the source measure group. This isn't a problem when you're sharing measure groups between cubes in the same database, but could be if you wanted to share measure groups across databases. As you would expect, when you query a linked measure group, your query is redirected to the source measure group. If the source measure group is on a different server, this may introduce some latency and hurt query performance. Analysis Services does try to mitigate this by doing some caching on the linked measure group's database, though. By default, it will cache data on a per-query basis, but if you change the RefreshPolicy property from ByQuery to ByInterval you can specify a time limit for data to be held in cache. Linked objects can be useful when cube development is split between multiple development teams, or when you need to create multiple cubes containing some shared data, but, in general, we recommend against using them widely because of these limitations. Role-playing dimensions It's also possible to add the same dimension to a cube more than once, and give each instance a different relationship to the same measure group. For example, in our Sales fact table, we might have several different foreign key columns that join to our Time dimension table: one which holds the date an order was placed on, one which holds the date it was shipped from the warehouse, and one which holds the date the order should arrive with the customer. In Analysis Services, we can create a single physical Time dimension in our database, which is referred to as a database dimension, and then add it three times to the cube to create three 'cube dimensions', renaming each cube dimension to something like Order Date, Ship Date and Due Date. These three cube dimensions are referred to as role-playing dimensions: the same dimension is playing three different roles in the same cube. Role playing dimensions are a very useful feature. They reduce maintenance overheads because you only need to edit one dimension, and unlike linked dimensions, any changes made to the underlying database dimension are propagated to all of the cube dimensions that are based on it. They also reduce processing time because you only need to process the database dimension once. However, there is one frustrating limitation with role-playing dimensions and that is that while you can override certain properties of the database dimension on a per-cube dimension basis, you can't change the name of any of the attributes or hierarchies of a cube dimension. So if you have a user hierarchy called 'Calendar' on your database dimension, all of your cube dimensions will also have a user hierarchy called 'Calendar', and your users might find it difficult to tell which hierarchy is which in certain client tools (Excel 2003 is particularly bad in this respect) or in reports. Unfortunately, we have seen numerous cases where this problem alone meant role-playing dimensions couldn't be used. Dimension/measure group relationships So far we've seen dimensions either having no relationship with a measure group or having a regular relationship, but that's not the whole story: there are many different types of relationships that a dimension can have with a measure group. Here's the complete list: No relationship Regular Fact Referenced Many-to-Many Data Mining Fact relationships Fact or degenerate dimensions are dimensions that are built directly from columns in a fact table, not from a separate dimension table. From an Analysis Services dimension point of view, they are no different from any other kind of dimension, except that there is a special fact relationship type that a dimension can have with a measure group. There are in fact very few differences between a fact relationship and a regular relationship, and they are: A fact relationship will result in marginally more efficient SQL being generated when the fact dimension is used in ROLAP drillthrough. Fact relationships are visible to client tools in the cube's metadata, so client tools may choose to display fact dimensions differently. A fact relationship can only be defined on dimensions and measure groups that are based on the same table in the DSV. A measure group can only have a fact relationship with one database dimension. It can have more than one fact relationship, but all of them have to be with cube dimensions based on the same database dimension. It still makes sense though to define relationships as fact relationships when you can. Apart from the reasons given above, the functionality might change in future versions of Analysis Services and fact relationship types might be further optimized in some way. Referenced relationships A referenced relationship is where a dimension joins to a measure group through another dimension. For example, you might have a Customer dimension that includes geographic attributes up to and including a customer's country; also, your organization might divide the world up into international regions such as North America, Europe, Middle East and Africa (EMEA), Latin America (LATAM) and Asia-Pacific and so on for financial reporting, and you might build a dimension for this too. If your sales fact table only contained a foreign key for the Customer dimension, but you wanted to analyze sales by international region, you would be able to create a referenced relationship from the Region dimension through the Customer dimension to the Sales measure group. When setting up a referenced relationship in the Define Relationship dialog in the Dimension Usage tab, you're asked to first choose the dimension that you wish to join through and then which attribute on the reference dimension joins to which attribute on the intermediate dimension: When the join is made between the attributes you've chosen on the reference dimension, once again it's the values in the columns that are defined in the KeyColumns property of each attribute that you're in fact joining on. The Materialize checkbox is automatically checked, and this ensures maximum query performance by resolving the join between the dimensions at processing time, which can lead to a significant decrease in processing performance. Unchecking this box means that no penalty is paid at processing time but query performance may be worse. The question you may well be asking yourself at this stage is: why bother to use referenced relationships at all? It is in fact a good question to ask, because, in general, it's better to include all of the attributes you need in a single Analysis Services dimension built from multiple tables rather than use a referenced relationship. The single dimension approach will perform better and is more user-friendly: for example, you can't define user hierarchies that span a reference dimension and its intermediate dimension. That said, there are situations where referenced relationships are useful because it's simply not feasible to add all of the attributes you need to a dimension. You might have a Customer dimension, for instance, that has a number of attributes representing dates—the date of a customer's first purchase, the date of a customer's tenth purchase, the date of a customer's last purchase and so on. If you had created these attributes with keys that matched the surrogate keys of your Time dimension, you could create multiple, referenced (but not materialized) role-playing Time dimensions joined to each of these attributes that would give you the ability to analyze each of these dates. You certainly wouldn't want to duplicate all of the attributes from your Time dimension for each of these dates in your Customer dimension. Another good use for referenced relationships is when you want to create multiple parent/child hierarchies from the same dimension table Data mining relationships The data mining functionality of Analysis Services is outside the scope of this article, so we won't spend much time on the data mining relationship type. Suffice to say that when you create an Analysis Services mining structure from data sourced from a cube, you have the option of using that mining structure as the source for a special type of dimension, called a data mining dimension. The wizard will also create a new cube containing linked copies of all of the dimensions and measure groups in the source cube, plus the new data mining dimension, which then has a data mining relationships with the measure groups. Summary In this part, we focused on how to create new measure groups and handle the problems of different dimensionality and granularity, and looked at the different types of relationships that are possible between dimensions and measure groups.

0
0
21796

article-image-perform-data-partitioning-postgresql-10

Sugandha Lahoti

09 Mar 2018

11 min read

How to perform data partitioning in PostgreSQL 10

Sugandha Lahoti

09 Mar 2018

11 min read

0
0
21791

article-image-conversational-ai-in-2018-an-arms-race-of-new-products-acquisitions-and-more

Bhagyashree R

21 Jan 2019

5 min read

Conversational AI in 2018: An arms race of new products, acquisitions, and more

Bhagyashree R

21 Jan 2019

5 min read

Conversational AI is one of the most interesting applications of artificial intelligence in recent years. While the trend isn’t yet ubiquitous in the way that recommendation systems are (perhaps unsurprising), it has been successfully productized by a number of tech giants, in the form of Google Home and Amazon Echo (which is ‘powered by’ Alexa). The conversational AI arms race Arguably, 2018 has seen a bit of an arms race in conversational AI. As well as Google and Amazon, the likes of IBM, Microsoft, and Apple have wanted a piece of the action. Here are some of the new conversational AI tools and products these companies introduced this year: Google Google worked towards enhancing its conversational interface development platform, Dialogflow. In July, at the Google Cloud Next event, it announced several improvements and new capabilities to Dialogflow including Text to Speech via DeepMind's WaveNet and Dialogflow Phone Gateway for telephony integration. It also launched a new product called Contact Center AI that comes with Dialogflow Enterprise Edition and additional capabilities to assist live agents and perform analytics. Google Assistant became better in having a back-and-forth conversation with the help of Continued Conversation, which was unveiled at the Google I/O conference. The assistant became multilingual in August, which means users can speak to it in more than one language at a time, without having to adjust their language settings. Users can enable this multilingual functionality by selecting two of the supported languages. Following the footsteps of Amazon, Google also launched its own smart display named Google Home Hub at the ‘Made by Google’ event held in October. Microsoft Microsoft in 2018 introduced and improved various bot-building tools for developers. In May, at the Build conference, Microsoft announced major updates in their conversational AI tools: Azure Bot Service, Microsoft Cognitive Services Language Understanding, and QnAMaker. To enable intelligent bots to learn from example interactions and handle common small talk, it launched new experimental projects from named Conversation Learner and Personality Chat. At Microsoft Ignite, Bot Framework SDK V4.0 was made generally available. Later in November, Microsoft announced the general availability of the Bot Framework Emulator V4 and Web Chat control. In May, to drive more research and development in its conversational AI products, Microsoft acquired Semantic Machines and established conversational AI center of excellence in Berkeley. In November, the organization's acquisition of Austin-based bot startup XOXCO was a clear indication that it wants to get serious about using artificial intelligence for conversational bots. Producing guidelines on developing ‘responsible’ conversational AI further confirmed Microsoft wants to play a big part in the future evolution of the area. Microsoft were the chosen tech partner by UK based conversational AI startup ICS.ai. The team at ICS are using Azure and LUIS from Microsoft in their public sector AI chatbots, aimed at higher education, healthcare trusts and county councils. Amazon Amazon with the aims to improve Alexa’s capabilities released Alexa Skills Kit (ASK) which consists of APIs, tools, documentation, and code samples using which developers can build new skills for Alexa. In September, it announced a preview of a new design language named Alexa Presentation Language (APL). With APL, developers can build visual skills that include graphics, images, slideshows, and video, and to customize them for different device types. Amazon’s smart speaker Echo Dot saw amazing success with becoming the best seller in smart speaker category on Amazon. At its 2018 hardware event in Seattle, Amazon announced the launch of redesigned Echo Dot and a new addition to Alexa-powered A/V device called Echo Plus. As well as the continuing success of Alexa and the Amazon Echo, Amazon’s decision to launch the Alexa Fellowship at a number of leading academic institutions also highlights that for the biggest companies conversational AI is as much about research and exploration as it is products. Like Microsoft, it appears that Amazon is well aware that conversational AI is an area only in its infancy, still in development - as much as great products, it requires clear thinking and cutting-edge insight to ensure that it develops in a way that is both safe and impactful. What’s next? This huge array of products is a result of advances in deep learning researches. Now conversational AI is not just limited to small tasks like setting an alarm or searching the best restaurant. We can have a back and forth conversation with the conversational agent. But, needless to say, it still needs more work. Conversational agents are yet to meet user expectations related to sensing and responding with emotion. In the coming years, we will see these systems understand and do a good job at generating natural language. They will be able to have reasonably natural conversations with humans in certain domains, grounded in context. Also, the continuous development in IoT will provide AI systems with more context. Apple has introduced Shortcuts for iOS 12 to automate your everyday tasks Microsoft amplifies focus on conversational AI: Acquires XOXCO; shares guide to developing responsible bots Amazon is supporting research into conversational AI with Alexa fellowships

0
0
21786

How-To Tutorials - Data

Deepfakes House Committee Hearing: Risks, Vulnerabilities and Recommendations

Julia Computing research team runs machine learning model on encrypted data without decrypting it

Running Parallel Data Operations using Java Streams

FAT Conference 2018 Session 4: Fair Classification

10 Machine Learning Tools to watch in 2018

Experts discuss Dark Patterns and deceptive UI designs: What are they? What do they do? How do we stop them?

Basics of Image Histograms in OpenCV

Camera Calibration

Speech2Face: A neural network that “imagines” faces from hearing voices. Is it too soon to worry about ethnic profiling?

How to perform full-text search (FTS) in PostgreSQL

Trending Topics

Building a classification system with logistic regression in OpenCV

Generative Models in action: How to create a Van Gogh with Neural Artistic Style Transfer

Measures and Measure Groups in Microsoft Analysis Services: Part 2

How to perform data partitioning in PostgreSQL 10

Conversational AI in 2018: An arms race of new products, acquisitions, and more

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access