
How-To Tutorials - Data

1210 Articles

What the US-China tech and AI arms race means for the world - Frederick Kempe at Davos 2019

Sugandha Lahoti
24 Jan 2019
6 min read
Atlantic Council CEO Frederick Kempe spoke at the World Economic Forum (WEF) in Davos, Switzerland. In his presentation, Future Frontiers of Technology Control, he talked about the cold war between the US and China and why the two countries need to cooperate rather than compete in the tech arms race. He began by posing a question set forth by former US National Security Advisor Stephen Hadley: "Can the incumbent US and insurgent China become strategic collaborators and strategic competitors in this tech space at the same time?" Read also: The New AI Cold War Between China and the USA

Kempe's three framing arguments

Geopolitical competition
The fusion of tech breakthroughs blurring the lines between the physical, digital, and biological spheres is reaching an inflection point. It is already clear that these breakthroughs will usher in a revolution that will determine the shape of the global economy, and which nations and political constructs may assume the commanding heights of global politics in the coming decade.

Technological superiority
Over the course of history, societies that dominated economic innovation and progress have dominated international relations, from military superiority to societal progress and prosperity. On balance, technological progress has contributed to higher standards of living in most parts of the world; however, the disproportionate benefit goes to first movers.

Commanding heights
The technological arms race for supremacy in the fourth industrial revolution has essentially become a two-horse contest between the United States and China. We are in the early stages of this race, but how it unfolds and is conducted will do much to shape global human relations. The shift in 2018 in US-China relations, from a period of strategic engagement to greater strategic competition, has also significantly accelerated the tech arms race.

China vs the US: why does China have the edge?
It was Vladimir Putin, President of the Russian Federation, who said that "the one who becomes the leader in Artificial Intelligence will rule the world." In 2017, DeepMind's AlphaGo defeated a Chinese master at Go, a traditional Chinese game. Following this defeat, China launched an ambitious roadmap, the Next Generation AI Plan, with the goal of becoming the global leader in AI by 2030 in theory, technology, and application. On current trajectories, Kempe argued, China will emerge the winner of this new technology race in the four primary areas of AI over the next five years. Kempe also quoted Kai-Fu Lee, author of the book AI Superpowers, who argues that harnessing the power of AI today (the electricity of the 21st century) requires abundant data, hungry entrepreneurs, AI scientists, and an AI-friendly policy. He believes that China has the edge in all of these. AI has moved from out-of-the-box research, where the US has the expertise, to actual implementation, where China has the edge. Per Kai-Fu Lee, China already has the edge in entrepreneurship, data, and government support, and is rapidly catching up to the US in expertise. The world has moved from the age of world-leading expertise, where the US dominates, to the age of data, where China wins hands down. Economists call China the Saudi Arabia of data, and with data as the fuel for AI, it has an enormous advantage. The Chinese government, without privacy restrictions, can gather and use data in a manner that is out of reach of any democracy.
Kempe concludes that the nature of this technological arms contest may favor insurgent China rather than the incumbent US.

What are the societal implications of this tech cold war?
He also touched upon the societal implications of AI and the cold war between the US and China. A large number of jobs will be lost by 2030. Quoting from Kai-Fu Lee's book, Kempe says that job displacement caused by artificial intelligence and advanced robotics could affect up to 54 million US workers, around 30% of the US labor force, and up to 100 million Chinese workers, around 12% of the Chinese labor force. What is the way forward given the huge societal implications of this bilateral race? Kempe sees three possibilities.

A sloppy status quo
A status quo in which China and the US continue to cooperate but increasingly view each other with suspicion. They manage their rising differences and distrust imperfectly, never bridging them entirely, but also not burning bridges, whether between researchers, corporations, or others.

Techno cold war
China and the US turn the global tech contest into more of a zero-sum battle for global domination. They organize themselves in a manner that separates their tech sectors from each other and ultimately divides up the world.

Collaborative future: the one we hope for
Nicholas Thompson and Ian Bremmer argued in a Wired interview that despite the two countries' societal differences, the US should wrap China in a tech embrace. The two countries should work together to establish international standards to ensure that the algorithms governing people's lives and livelihoods are transparent and accountable. They should recognize that while the geopolitics of technological change is significant, even more important will be the challenges AI poses to all societies across the world in terms of job automation and the social disruptions that may come with it. It may sound utopian to expect the US and China to cooperate in this manner, but this is what we should hope for. To do otherwise would be self-defeating, and at the cost of the rest of the global community, which needs our best thinking to navigate the challenges of the fourth industrial revolution. Kempe concludes his presentation with a quote from Henry Kissinger, former US Secretary of State and National Security Advisor: "We're in a position in which the peace and prosperity of the world depend on whether China and the US can find a method to work together, not always in agreement, but to handle our disagreements... This is the key problem of our time."

Note: All images in this article are taken from Frederick Kempe's presentation.

We must change how we think about AI, urge AI founding fathers
Does AI deserve to be so overhyped?
Alarming ways governments are using surveillance tech to watch you


Conversational AI in 2018: An arms race of new products, acquisitions, and more

Bhagyashree R
21 Jan 2019
5 min read
Conversational AI is one of the most interesting applications of artificial intelligence in recent years. While the trend isn't yet ubiquitous in the way that recommendation systems are (perhaps unsurprisingly), it has been successfully productized by a number of tech giants, in the form of Google Home and Amazon Echo (which is 'powered by' Alexa).

The conversational AI arms race
Arguably, 2018 saw a bit of an arms race in conversational AI. As well as Google and Amazon, the likes of IBM, Microsoft, and Apple have wanted a piece of the action. Here are some of the new conversational AI tools and products these companies introduced this year.

Google
Google worked towards enhancing its conversational interface development platform, Dialogflow. In July, at the Google Cloud Next event, it announced several improvements and new capabilities for Dialogflow, including Text to Speech via DeepMind's WaveNet and Dialogflow Phone Gateway for telephony integration. It also launched a new product called Contact Center AI, which comes with Dialogflow Enterprise Edition and additional capabilities to assist live agents and perform analytics. Google Assistant became better at holding a back-and-forth conversation with the help of Continued Conversation, which was unveiled at the Google I/O conference. The assistant became multilingual in August, which means users can speak to it in more than one language at a time without having to adjust their language settings; users enable this multilingual functionality by selecting two of the supported languages. Following in the footsteps of Amazon, Google also launched its own smart display, the Google Home Hub, at the 'Made by Google' event held in October.

Microsoft
Microsoft in 2018 introduced and improved various bot-building tools for developers. In May, at the Build conference, Microsoft announced major updates to its conversational AI tools: Azure Bot Service, Microsoft Cognitive Services Language Understanding, and QnA Maker. To enable intelligent bots to learn from example interactions and handle common small talk, it launched new experimental projects named Conversation Learner and Personality Chat. At Microsoft Ignite, Bot Framework SDK V4.0 was made generally available. Later, in November, Microsoft announced the general availability of the Bot Framework Emulator V4 and the Web Chat control. In May, to drive more research and development in its conversational AI products, Microsoft acquired Semantic Machines and established a conversational AI center of excellence in Berkeley. In November, the organization's acquisition of Austin-based bot startup XOXCO was a clear indication that it wants to get serious about using artificial intelligence for conversational bots. Producing guidelines on developing 'responsible' conversational AI further confirmed that Microsoft wants to play a big part in the future evolution of the area. Microsoft was also chosen as the tech partner by UK-based conversational AI startup ICS.ai. The team at ICS are using Azure and LUIS from Microsoft in their public sector AI chatbots, aimed at higher education, healthcare trusts, and county councils.

Amazon
Aiming to improve Alexa's capabilities, Amazon released the Alexa Skills Kit (ASK), which consists of APIs, tools, documentation, and code samples that developers can use to build new skills for Alexa. In September, it announced a preview of a new design language named Alexa Presentation Language (APL).
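Before getting to APL, it helps to see what an ASK 'skill' amounts to in code. Below is a minimal sketch of a custom skill handler using the ASK SDK for Python; it is illustrative only, the intent name and response text are invented, and exact class and method names may differ between SDK versions.

```python
# Minimal Alexa skill sketch with the ASK SDK for Python (illustrative only).
from ask_sdk_core.skill_builder import SkillBuilder
from ask_sdk_core.dispatch_components import AbstractRequestHandler
from ask_sdk_core.utils import is_request_type, is_intent_name


class LaunchRequestHandler(AbstractRequestHandler):
    """Handles the user opening the skill ("Alexa, open tech news")."""

    def can_handle(self, handler_input):
        return is_request_type("LaunchRequest")(handler_input)

    def handle(self, handler_input):
        speech = "Welcome! Ask me for today's headline."
        return handler_input.response_builder.speak(speech).ask(speech).response


class HeadlineIntentHandler(AbstractRequestHandler):
    """Handles a hypothetical 'GetHeadlineIntent' defined in the skill's interaction model."""

    def can_handle(self, handler_input):
        return is_intent_name("GetHeadlineIntent")(handler_input)

    def handle(self, handler_input):
        headline = "Conversational AI had quite a year."  # would normally come from a backend
        return handler_input.response_builder.speak(headline).response


sb = SkillBuilder()
sb.add_request_handler(LaunchRequestHandler())
sb.add_request_handler(HeadlineIntentHandler())
lambda_handler = sb.lambda_handler()  # entry point when the skill is hosted on AWS Lambda
```

The same request-handler pattern, matching an intent and returning a spoken response, is broadly what Dialogflow fulfillment webhooks and the Bot Framework SDK express in their own ways.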
With APL, developers can build visual skills that include graphics, images, slideshows, and video, and customize them for different device types. Amazon's smart speaker, the Echo Dot, saw amazing success, becoming the best seller in the smart speaker category on Amazon. At its 2018 hardware event in Seattle, Amazon announced a redesigned Echo Dot and a new addition to its Alexa-powered device line-up, the Echo Plus. As well as the continuing success of Alexa and the Amazon Echo, Amazon's decision to launch the Alexa Fellowship at a number of leading academic institutions highlights that, for the biggest companies, conversational AI is as much about research and exploration as it is about products. Like Microsoft, Amazon appears well aware that conversational AI is an area still in its infancy and still in development; as much as great products, it requires clear thinking and cutting-edge insight to ensure that it develops in a way that is both safe and impactful.

What's next?
This huge array of products is the result of advances in deep learning research. Conversational AI is no longer limited to small tasks like setting an alarm or searching for the best restaurant; we can now have a back-and-forth conversation with a conversational agent. But, needless to say, it still needs more work. Conversational agents are yet to meet user expectations around sensing and responding with emotion. In the coming years, we will see these systems get better at understanding and generating natural language. They will be able to have reasonably natural conversations with humans in certain domains, grounded in context. The continuing development of IoT will also provide AI systems with more context.

Apple has introduced Shortcuts for iOS 12 to automate your everyday tasks
Microsoft amplifies focus on conversational AI: Acquires XOXCO; shares guide to developing responsible bots
Amazon is supporting research into conversational AI with Alexa fellowships


Googlers launch industry-wide awareness campaign to fight against forced arbitration

Natasha Mathur
17 Jan 2019
6 min read
A group of Googlers launched a public awareness social media campaign from 9 AM to 6 PM EST yesterday. The group, called 'Googlers for Ending Forced Arbitration', shared information about arbitration on its Twitter and Instagram accounts throughout the day. https://twitter.com/endforcedarb/status/1084813222505410560

As part of the campaign, the group tweeted that, in surveying employees of 30+ tech companies and 10+ common temp/contractor suppliers in the industry, it found that none of the companies could meet the three primary criteria needed for a transparent workplace. The three basic criteria are: an optional arbitration policy for all employees (including contractors and temps) and for all forms of discrimination, no class action waivers, and no gag rule that keeps arbitration proceedings confidential. The group shared some hard facts about arbitration and also busted some myths. Let's look at some of the key highlights from yesterday's campaign.

At least 60 million Americans are forced to use arbitration
The group states that the use of forced arbitration policies has grown significantly in the past seven years. Over 65% of companies with 1,000 or more employees now have mandatory arbitration procedures. Employees don't have the option to take their employers to court in cases of harassment or discrimination. People of colour and women are often the ones affected most by this practice.

[Image: How employers use forced arbitration]

Forced arbitration is extremely unfair
Arbitration firms hired by companies almost always favour the companies over their employees. This is due to the fear of being rejected by the employer next time, should the arbitration firm decide in favour of the employee. The group states that employees are 1.7 times more likely to win in federal courts and 2.6 times more likely to win in state courts than in arbitration. There are no public filings of complaint details, meaning the company doesn't have to answer to anyone regarding the issues within the organization. The company can also limit its obligation to disclose the evidence you need to prove your case.

Arbitration hearings happen behind closed doors within a company
An arbitration hearing involves just the employee and their lawyer, the other party and their lawyer, and a panel of one to three arbitrators. Each party gets to pick one arbitrator, who is also hired by the employer. In practice, there is usually only a single arbitrator, as a three-arbitrator panel costs five times more than a single arbitrator, according to the American Arbitration Association.

Forced arbitration requires employees to sign away their right to class action lawsuits at the start of employment
The group states that, irrespective of whether a legal dispute exists, forced arbitration bans employees from coming together as a group, both in arbitration and in class action lawsuits. Most employers also impose a 'gag rule' that restricts employees from even talking about their experience with the arbitration policy. Certain companies do give you an option to opt out of forced arbitration using an opt-out form, but this comes with a time constraint depending on your agreement with that company. For instance, companies such as Twitter, Facebook, and Adecco give their employees a chance to opt out of forced arbitration.
[Image: Arbitration opt-out option]

JAMS and AAA are among the top arbitration organizations used by major tech giants
JAMS (Judicial Arbitration and Mediation Services) is a private company used by employers such as Google, Airbnb, Uber, Tesla, and VMware. JAMS does not publicly disclose the diversity of its arbitrators. AAA (the American Arbitration Association) is a non-profit organization where usually retired judges or lawyers serve as arbitrators; its arbitrators have an overall composition of 24% women and minorities. AAA is one of the largest arbitration organizations and is used by companies such as Facebook, Lyft, Oracle, Samsung, and Two Sigma. Katherine Stone, a professor at UCLA law school, states that the procedures followed by these arbitration firms don't allow much discovery, meaning the firms don't usually permit depositions or various kinds of document exchange before the hearing. "So, the worker goes into the hearing...armed with nothing, other than their own individual grievances, their own individual complaints, and their own individual experience. They can't learn about the experience of others," says Stone.

Female workers and African-American workers are the most likely to be subject to forced arbitration
58% of female workers and 59% of African-American workers face mandatory arbitration, with the rate varying across workgroups. For instance, in the construction industry, which is highly male-dominated, the imposition of forced arbitration is at its lowest rate. But in the education and health industries, which have a majority-female workforce, the imposition rate of forced arbitration is high.

[Image: Forced arbitration rate among different workgroups]

The Supreme Court has gradually allowed companies to expand arbitration to employees and consumers
The group states that the 1925 Federal Arbitration Act (FAA) legalized arbitration between shipping companies for settling commercial disputes. The Supreme Court, however, has since allowed companies to expand this practice of arbitration to employees and consumers too.

[Image: Supreme Court decisions]

Apart from sharing these facts, the group also shed light on the dos and don'ts that employees should follow under forced arbitration clauses.

[Image: Dos and don'ts]

The social media campaign by Googlers for Ending Forced Arbitration represents an upsurge in strength and courage among employees within the tech industry, as not just Google employees but also employees from other tech companies shared their experiences of forced arbitration. As part of the campaign, the group researched academic institutions, labour attorneys, advocacy groups, and the contracts of around 30 major tech companies. To follow all the highlights from the campaign, follow the End Forced Arbitration Twitter account.

Shareholders sue Alphabet's board members for protecting senior execs accused of sexual harassment
Recode Decode #GoogleWalkout interview shows why data and evidence don't always lead to right decisions in even the world's most data-driven company
Tech Workers Coalition volunteers talk unionization and solidarity in Silicon Valley


Pay it Forward this New Year – Rewriting the code on career development

Packt Editorial Staff
09 Jan 2019
3 min read
This festive and New Year period, Packt Publishing Ltd are commissioning their newest group of authors - you, the everyday expert - to help the next generation of developers, coders, and architects. Packt, a global leader in publishing technology and coding eBooks and videos, are asking the technology community to 'pay it forward' by looking back at their careers and passing their advice forward, via a survey, to support the next generation of technology leaders. The aim is to rewrite the code on career development and find out what everyday life looks like for those in our community. The Pay it Forward eBook that will be created will provide tips and insights from the tech profession. Rather than giving off-the-shelf advice on how to better your career, Packt are asking everyday experts - the professionals across the globe who make the industry tick - for the insights and advice they would give based on the good and the bad that they have seen. The most insightful and useful responses to the survey will be published by Packt in a new eBook, which will be available for free in early 2019. Some of the questions Pay it Forward will seek answers to include:

What is the biggest myth about working in tech?
If you could give one career hack, what would it be?
How do you keep on top of new developments and news?
What are the common challenges you have seen or experienced in your profession?
Who do you most admire and why?
What is the best piece of advice you have received that has helped you in your career?
What advice would you give to a student wishing to enter your profession?
Have you actually broken the internet? We all make mistakes; how do you handle them?
What do you love about what you do?

People can offer their responses here: http://payitforward.packtpub.com/

Commenting on Pay it Forward, Packt Publishing Ltd CEO and founder Dave Maclean said, "Over time we all gain knowledge through our experiences. We've all failed and learned and found better ways to do things. As we come into the New Year, we're reflecting on what we have learned and we're calling on our community of everyday experts to share their knowledge with people who are new to the industry, to the next generation of changemakers." "For our part, Packt will produce a book that pulls together this advice and make it available for free to help those wishing to pursue a career within technology." The survey should take no more than 10 minutes to complete and is held in complete confidence, with no disclosure of names or details unless agreed.


CES 2019 is bullshit we don't need after 2018's techlash

Richard Gall
08 Jan 2019
6 min read
The asinine charade that is CES is running in Las Vegas this week. Describing itself as 'the global stage of innovation', CES attempts to set the agenda for a new year in tech. While ostensibly it's an opportunity to see how technology might impact the lives of all of us over the next decade (or more), it is, in truth, a vapid carnival that does nothing but make the technology industry look stupid. Okay, perhaps I'm being a fun sponge: what's wrong with smart doorbells, internet-connected planks of wood, and other madcap ideas? Well, nothing really - but those inventions are only the tip of the iceberg. Disagree? Don't worry: you can find the biggest announcements from day one of CES 2019 here.

What CES gets wrong
Where CES really gets it wrong - and where it drives down a dead end of vacuity - is in showcasing the mind-numbing rush to productize and then commercialize some genuinely serious developments, developments that could transform the world in ways far less trivial than the glitz and glamor of the media coverage would suggest. This isn't to say that there won't be important news and interesting discussions to come out of CES. But even the more interesting topics can be diluted, becoming buzzwords for marketers to latch onto. As Wired remarks on Twitter, "the term AI-powered is used loosely and is almost always a marketing ploy, whether or not a product is impacted by AI." In the same thread, the publication's account also notes that 5G, another big theme for the event, won't be widely available for at least another 12 months. https://twitter.com/WIRED/status/1082294957979910144 Ultimately, what this tells us is that the focus of CES isn't really technology - not in the sense of how we build it and how we should use it. Instead, it is an event dedicated to the ways we can sell it. Perhaps in previous years the gleeful excitement of CES was nothing but a bit of light relief as we recovered from the holiday period. But this year it's different. 2018 was a year of reckoning in tech, as a range of scandals emerged that underlined the ways in which exciting technological innovation can be misused and deployed against the very people we assume it should be helping. From the Cambridge Analytica scandal to the controversy surrounding Amazon's Rekognition, Google's Project Dragonfly, and Microsoft's relationship with ICE, 2018 made it clearer than ever that buried somewhere beneath the novel and amusing inventions and better quality television screens are a set of interests that have little interest in making life better for people.

The corporate glamor of CES 2019 is just kitsch
It's not news that there are certain organisations and institutions that don't have the interests of the majority at heart. But CES 2019 does take on a new complexion in the shadow of all that happened in 2018. The question 'what's the point of all this?' takes on a more serious edge. When you add in the dissent that has come from a growing part of the Silicon Valley workforce, CES 2019 starts to look like an event that, much like many industry leaders, wants to bury the messy and complex reality of building software in favor of marketing buzz. In The Unbearable Lightness of Being, the author Milan Kundera describes kitsch as "the absolute denial of shit." Following this definition, you can see CES as a kitsch event: it pushes the decisions and inevitable trade-offs that go into developing new technologies and products into the shadows.
It doesn't take negative consequences seriously. It's all just 'shit' that should be ignored. This all adds up to a message that seems to be: better doesn't even need to be built. It's here already, no risks, no challenges. Developers don't really feature at CES. That's not necessarily a problem - after all, it's not an event for them, and what developer wants to spend time hearing marketers talk about AI? But if 2018 has taught us anything, it's that a culture of commercialization that refuses to consider consequences other than what can be done in the service of business growth can be immensely damaging. It hurts people, and it might even be hurting democracy. Okay, the way to correct things probably isn't simply to invite more engineers to CES. But by the same token, CES is hardly helping things either.

Everything important is happening outside the event
Everything important seems to be happening at the periphery of this year's CES, in some instances quite literally outside the building. Apple's ad, for example, might have been a clever piece of branding, but it has captured the attention of the world. Arguably, it's more memorable than much of what's happening inside the event. And although it's possible to be cynical, it does nevertheless raise important questions about a number of companies' attitudes to user data. https://twitter.com/NateIngraham/status/1081612316532064257 Another big talking point as this year's event began is who isn't present. Due to the government shutdown, a number of officials who were due to attend and speak have had to cancel. This acts as a reminder of the wider context in which CES 2019 is taking place, in which a nativist government looks set on controlling who moves across borders and how. It also highlights how euphemistic the phrase 'consumer technology' really is. TVs and cloud-connected toilets might take the headlines, but it's government surveillance that will likely have the biggest impact on our lives in the future. Not that any of this seemed to matter to Gary Shapiro, the chief executive of the Consumer Technology Association (the organization that puts on CES). Speaking to the BBC, Shapiro said: "It's embarrassing to be on the world stage with a dominant event in the world of technology, and our federal government... can't be there to host their colleague government executives from around the world." Shapiro's frustration is understandable from an organizer's perspective. But it also betrays the apparent ethos of CES: what's happening outside doesn't matter.

We all deserve better than CES 2019
The new products on show at CES 2019 won't make everything better. There's a chance they will make everything worse. Arguably, the more blindly optimistic we are that they'll make things better, the more likely they are to make things worse. It's only by thinking through complex questions, and taking time to consider the possible consequences of our decision making as developers, product managers, or business people, that we can actually be sure things will get better. This doesn't mean we need to stop getting excited about new inventions and innovations. But things like smart cities and driverless cars pose a whole range of issues that shouldn't be buried in the optimistic schmaltz of events like CES. They need care and attention from policy makers, designers, software engineers, and many others to ensure they actually help to build a better world for people.


“All of my engineering teams have a machine learning feature on their roadmap” - Will Ballard talks artificial intelligence in 2019 [Interview]

Packt Editorial Staff
02 Jan 2019
3 min read
The huge advancements in deep learning and artificial intelligence were perhaps the biggest story in tech in 2018. But we wanted to know what the future might hold - luckily, we were able to speak to Packt author Will Ballard about what they see in store for artificial intelligence in 2019 and beyond. Will Ballard is the chief technology officer at GLG, responsible for engineering and IT. He was also responsible for the design and operation of large data centers that helped run site services for customers including Gannett, Hearst Magazines, NFL, NPR, The Washington Post, and Whole Foods. He has held leadership roles in software development at NetSolve (now Cisco), NetSpend, and Works (now Bank of America). Explore Will Ballard's Packt titles here.

Packt: What do you think the biggest development in deep learning / AI was in 2018?
Will Ballard: I think attention models beginning to take the place of recurrent networks is a pretty impressive breakout on the algorithm side.

In Packt's 2018 Skill Up survey, developers across disciplines and job roles identified machine learning as the thing they were most likely to be learning in the coming year. What do you think of that result? Do you think machine learning is becoming a mandatory multidiscipline skill, and why?
Almost all of my engineering teams have an active, or a planned, machine learning feature on their roadmap. We've been able to get all kinds of engineers with different backgrounds to use machine learning -- it really is just another way to make functions -- probabilistic functions -- but functions.

What do you think the most important new deep learning/AI technique to learn in 2019 will be, and why?
In 2019 -- I think it is going to be all about PyTorch and TensorFlow 2.0, and learning how to host these on cloud PaaS.

The benefits of automated machine learning and metalearning
How important do you think automated machine learning and metalearning will be to the practice of developing AI/machine learning in 2019? What benefits do you think they will bring?
Even 'simple' automation techniques like grid search and running multiple different algorithms on the same data are big wins when mastered. There is almost no telling which model is 'right' till you try it, so why not let a cloud of computers iterate through scores of algorithms and models to give you the best available answer?

Artificial intelligence and ethics
Do you think ethical considerations will become more relevant to developing AI/machine learning algorithms going forwards? If yes, how do you think this will be implemented?
I think the ethical issues are important on outcomes, and on how models are used, but aren't the place of algorithms themselves.

If a developer was looking to start working with machine learning/AI, what tools and software would you suggest they learn in 2019?
Python and PyTorch.
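Ballard's point about 'simple' automation techniques like grid search is easy to make concrete. The sketch below uses scikit-learn to iterate over a small grid of models and hyperparameters on a toy dataset; the dataset, models, and parameter values are arbitrary illustrative choices, not a recommendation from the interview.

```python
# Illustrative grid search over several algorithms and hyperparameters with scikit-learn.
from sklearn.datasets import load_digits
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_digits(return_X_y=True)

# Try multiple algorithms on the same data, each with its own parameter grid.
candidates = [
    (SVC(), {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}),
    (RandomForestClassifier(random_state=0), {"n_estimators": [100, 300], "max_depth": [None, 10]}),
]

for estimator, grid in candidates:
    search = GridSearchCV(estimator, grid, cv=5)  # 5-fold cross-validated grid search
    search.fit(X, y)
    print(type(estimator).__name__, search.best_params_, round(search.best_score_, 3))
```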

Quantum computing, edge analytics, and meta learning: key trends in data science and big data in 2019

Richard Gall
18 Dec 2018
11 min read
When historians study contemporary notions of data in the early 21st century, 2018 might well be a landmark year. In many ways this was the year when big and important issues - from the personal to the political - began to surface. The techlash, a term which has defined the year, arguably emerged from conversations and debates about the uses and abuses of data. But while cynicism casts a shadow over the brightly lit data science landscape, there's still a lot of optimism out there. And more importantly, data isn't going to drop off the agenda any time soon. However, the changing conversation in 2018 does mean that the way data scientists, analysts, and engineers use data and build solutions will change. A renewed emphasis on ethics and security is now appearing, and it will likely shape the trends of 2019. But what will those trends be? Let's take a look at some of the most important areas to keep an eye on in the new year.

Meta learning and automated machine learning
One of the key themes of data science and artificial intelligence in 2019 will be doing more with less. There are a number of ways in which this will manifest itself. The first is meta learning. This is a concept that aims to improve the way machine learning systems actually work by running machine learning on machine learning systems. Essentially, this allows a machine learning algorithm to learn how to learn. By doing this, you can better decide which algorithm is most appropriate for a given problem. Find out how to put meta learning into practice: learn with Hands-On Meta Learning with Python. Automated machine learning is closely aligned with meta learning. One way of understanding it is to see it as automating the application of meta learning. So, if meta learning can help better determine which machine learning algorithms should be applied and how they should be designed, automated machine learning makes that process a little smoother: it builds the decision making into the machine learning solution. Fundamentally, it's all about "algorithm selection, hyper-parameter tuning, iterative modelling, and model assessment," as Matthew Mayo explains on KDnuggets.

Automated machine learning tools
What's particularly exciting about automated machine learning is that there are already a number of tools that make it relatively easy to do. AutoML is a set of tools developed by Google that can be used on the Google Cloud Platform, while auto-sklearn, built around the scikit-learn library, provides a similar out-of-the-box solution for automated machine learning. Although both AutoML and auto-sklearn are very new, there are newer tools available that could dominate the landscape: AutoKeras and AdaNet. AutoKeras is built on Keras (the Python neural network library), while AdaNet is built on TensorFlow. Both could become more affordable open source alternatives to AutoML. Which automated machine learning library gains the most popularity remains to be seen, but one thing is certain: it makes deep learning accessible to many organizations that previously wouldn't have had the resources or inclination to hire a team of PhD computer scientists. But it's important to remember that automated machine learning certainly doesn't mean automated data science. While tools like AutoML will help many organizations build deep learning models for basic tasks, for organizations that need a more developed data strategy, the role of the data scientist will remain vital.
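As a rough illustration of what these libraries automate, here is a minimal auto-sklearn sketch. It follows the classifier interface as documented, but constructor argument names and defaults vary between versions, so treat it as a sketch under stated assumptions rather than a tested recipe.

```python
# Minimal automated machine learning sketch with auto-sklearn (illustrative; API details may vary by version).
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
import autosklearn.classification

X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# auto-sklearn searches over preprocessing steps, algorithms, and hyperparameters
# within the time budget, then ensembles the best models it found.
automl = autosklearn.classification.AutoSklearnClassifier(
    time_left_for_this_task=300,   # total search budget in seconds (assumed parameter name)
    per_run_time_limit=30,         # cap per candidate model (assumed parameter name)
)
automl.fit(X_train, y_train)
print(accuracy_score(y_test, automl.predict(X_test)))
```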
You can't, after all, automate away strategy and decision making. Learn automated machine learning with these titles: Hands-On Automated Machine Learning and TensorFlow 1.x Deep Learning Cookbook.

Quantum computing
Quantum computing, even as a concept, feels almost fantastical. It's not just cutting-edge, it's mind-bending. But in real-world terms it also continues the theme of doing more with less. Explaining quantum computing can be tricky, but the fundamentals are this: instead of a binary system (the foundation of computing as we currently know it), in which a bit can be either 0 or 1, a quantum system has qubits, which can be 0, 1, or both simultaneously. (If you want to learn more, read this article.)

What quantum computing means for developers
So, what does this mean in practice? Essentially, because the qubits in a quantum system can be multiple things at the same time, you are able to run much more complex computations. Think about the difference in scale: running a deep learning system on a binary system has clear limits. Yes, you can scale up in processing power, but you're nevertheless constrained by the foundational fact of zeros and ones. In a quantum system where that restriction no longer exists, the scale of the computing power at your disposal increases astronomically. Once you understand the fundamental proposition, it becomes much easier to see why the likes of IBM and Google are clamouring to develop and deploy quantum technology. One of the most talked-about use cases is using quantum computers to factor the very large numbers used in cryptography (a move which carries risks, given that the difficulty of factoring is the basis for much modern encryption). But there are other applications, such as in chemistry, where complex subatomic interactions are too detailed to be modelled by a traditional computer. It's important to note that quantum computing is still very much in its infancy. While Google and IBM are leading the way, they are really only researching the area. It certainly hasn't been deployed or applied in any significant or sustained way. But this isn't to say that it should be ignored. It's going to have a huge impact on the future, and more importantly it's plain interesting. Even if you don't think you'll be getting to grips with quantum systems at work for some time (likely a decade or more), understanding the principles and how it works in practice will not only give you a solid foundation for major changes in the future, it will also help you better understand some of the existing challenges in scientific computing. And, of course, it will also make you a decent conversationalist at dinner parties.

Who's driving quantum computing forward?
If you want to get started, Microsoft has put together the Quantum Development Kit, which includes its quantum-specific programming language Q#. IBM, meanwhile, has developed its own Quantum Experience, which allows engineers and researchers to run quantum computations in the IBM cloud. As you investigate these tools you'll probably get the sense that no one's quite sure what to do with these technologies. And that's fine - if anything, it makes this the perfect time to get involved and help further research and thinking on the topic. Get a head start in the quantum computing revolution: pre-order Mastering Quantum Computing with IBM QX.

Edge analytics and digital twins
While quantum computing lingers on the horizon, the concept of the edge has quietly planted itself at the very center of the IoT revolution.
IoT might still be the term that business leaders and, indeed, wider society talk about, but for technologists and engineers, none of its advantages would be possible without the edge. Edge computing or edge analytics is essentially about processing data at the edge of a network rather than within a centralized data warehouse. Again, as you can begin to see, the concept of the edge allows you to do more with less: more speed, less bandwidth (as devices no longer need to communicate with data centers), and, in theory, more data. In the context of IoT, where just about every object in existence could be a source of data, moving processing and analytics to the edge can only be a good thing.

Will the edge replace the cloud?
There's a lot of conversation about whether the edge will replace the cloud. It won't. But it probably will replace the cloud as the place where we run artificial intelligence. For example, instead of running powerful analytics models in a centralized space, you can run them at different points across the network. This will dramatically improve speed and performance, particularly for applications that run on artificial intelligence.

A more distributed world
Think of it this way: just as software has become more distributed in the last few years, thanks to the emergence of the edge, data itself is going to become more distributed. We'll have billions of pockets of activity, whether from consumers or industrial machines, each a locus of data generation. Find out how to put the principles of edge analytics into practice: Azure IoT Development Cookbook.

Digital twins
An emerging part of the edge computing and analytics trend is the concept of digital twins. This is, admittedly, still in its infancy, but in 2019 it's likely that you'll be hearing a lot more about it. A digital twin is a digital replica of a device that engineers and software architects can monitor, model, and test. For example, if you have a digital twin of a machine, you could run tests on it to better understand its points of failure. You could also investigate ways to make the machine more efficient. More importantly, a digital twin can be used to help engineers manage the relationship between the centralized cloud and systems at the edge - the digital twin is essentially a layer of abstraction that allows you to understand what's happening at the edge without needing to go into the detail of the system. For those of us working in data science, digital twins provide better clarity and visibility on how disconnected aspects of a network interact. If we're going to make 2019 the year we use data more intelligently - maybe even more humanely - then this is precisely the sort of thing we need.

Interpretability, explainability, and ethics
Doing more with less might be one of the ongoing themes in data science and big data in 2019, but we can't ignore the fact that ethics and security will remain firmly on the agenda. It's easy to dismiss these issues as separate from the technical aspects of data mining, processing, and analytics, but they are, in fact, deeply integrated into them. Two related concepts are key facets of this: explainability and interpretability. The two terms are often used interchangeably, but there are some subtle differences. Explainability is the extent to which the inner workings of an algorithm can be explained in human terms, while interpretability is the extent to which one can understand the way in which it is working (e.g.
being able to predict the outcome in a given situation). So, an algorithm can be interpretable, but you might not quite be able to explain why something is happening. (Think about this in the context of scientific research: sometimes, scientists know that a thing is definitely happening, but they can't provide a clear explanation for why.)

Improving transparency and accountability
Either way, interpretability and explainability are important because they can help to improve transparency in machine learning and deep learning algorithms. In a world where deep learning algorithms are being applied to problems in areas from medicine to justice - where the problem of accountability is particularly fraught - this transparency isn't an option, it's essential. In practice, this means engineers must tweak the algorithm development process to make it easier for those outside the process to understand why certain things are happening and why they aren't. To a certain extent, this ultimately requires the data science world to take the scientific method more seriously than it has done. Rather than just aiming for accuracy (which is itself often open to contestation), the aim is to constantly manage the gap between what we're trying to achieve with an algorithm and how it actually goes about doing that. You can learn the basics of building explainable machine learning models in the Getting Started with Machine Learning in Python video.

Transparency and innovation must go hand in hand in 2019
So, there are two fundamental things for data science in 2019: improving efficiency, and improving transparency. Although the two concepts might look like they conflict with each other, it's actually a bit of a false dichotomy. If we had realised that 12 months ago, we might have avoided many of the issues that have come to light this year. Transparency has to be a core consideration for anyone developing systems for analyzing and processing data. Without it, the work you're doing might be flawed or unnecessary, and you will only need further iterations to rectify your mistakes or mitigate the impact of your biases. With this in mind, now is the time to learn the lessons of 2018's techlash. We need to commit to stopping the miserable conveyor belt of scandal and failure. Now is the time to find new ways to build better artificial intelligence systems.
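To make the interpretability discussion above a little more concrete, here is a small scikit-learn sketch that inspects which inputs a trained model actually leans on, one of the simplest first steps towards answering "why is this model doing what it's doing?". The dataset and model are arbitrary illustrative choices, not a prescribed workflow.

```python
# Illustrative model-inspection sketch: which features drive the predictions?
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

data = load_breast_cancer()
X_train, X_test, y_train, y_test = train_test_split(data.data, data.target, random_state=0)

model = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_train, y_train)

# Rank features by the model's own importance scores: a crude but useful starting
# point for explaining behaviour to someone outside the development process.
ranked = sorted(zip(data.feature_names, model.feature_importances_), key=lambda p: p[1], reverse=True)
for name, score in ranked[:5]:
    print(f"{name}: {score:.3f}")
print("held-out accuracy:", round(model.score(X_test, y_test), 3))
```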


Troll Patrol Report: Amnesty International and Element AI use machine learning to understand online abuse against women

Sugandha Lahoti
18 Dec 2018
5 min read
Amnesty International has partnered with Element AI to release a Troll Patrol report on the online abuse of women on Twitter. The findings are part of their Troll Patrol project, which invites human rights researchers, technical experts, and online volunteers to build a crowd-sourced dataset of online abuse against women. https://twitter.com/amnesty/status/1074946094633836544 Abuse of women on social media websites has been rising at an unprecedented rate. Social media websites have a responsibility to respect human rights and to ensure that women using the platform are able to express themselves freely and without fear. However, this has not been the case with Twitter, as Amnesty's findings show.

Amnesty's methodology was powered by machine learning
Amnesty and Element AI surveyed 778 journalists and politicians from the UK and US throughout 2017 and then used machine learning techniques to analyze abuse against women at scale. The first step was to design a large, unbiased dataset of tweets mentioning the 778 women politicians and journalists. Next, over 6,500 volunteers (aged between 18 and 70 and from over 150 countries) analyzed 288,000 unique tweets to create a labeled dataset of abusive or problematic content. This was based on simple questions, such as whether the tweets were abusive or problematic, and if so, whether they revealed misogynistic, homophobic, or racist abuse or other types of violence. Three experts also categorized a sample of 1,000 tweets to assess the quality of the labels produced by the digital volunteers. Element AI then used this subset of the Decoders' and experts' categorizations to extrapolate the abuse analysis across the full dataset.

Key findings from the report
Per the Troll Patrol report, 7.1% of tweets sent to the women in the study were "problematic" or "abusive". This amounts to 1.1 million tweets mentioning the 778 women across the year, or one every 30 seconds. Women of color (black, Asian, Latinx, and mixed-race women) were 34% more likely to be mentioned in abusive or problematic tweets than white women. Black women were disproportionately targeted, being 84% more likely than white women to be mentioned in abusive or problematic tweets. (Source: Amnesty) Online abuse targets women across the political spectrum: women faced similar levels of abuse whether liberal or conservative, and both left- and right-leaning media organizations were targeted. (Source: Amnesty)

What does this mean for people in tech
Social media organizations are repeatedly failing in their responsibility to protect women's rights online. They fall short of adequately investigating and responding to reports of violence and abuse in a transparent manner, which leads many women to silence or censor themselves on the platform. Such abuse also hinders freedom of expression online and undermines women's mobilization for equality and justice, particularly for those groups who already face discrimination and marginalization.

What can tech platforms do?
One of the recommendations of the report is that social media platforms should publicly share comprehensive and meaningful information about reports of violence and abuse against women, as well as other groups, on their platforms. They should also explain in detail how they are responding to it.
Although Twitter and other platforms are using machine learning for content moderation and flagging, they should be transparent about the algorithms they use. They should publish information about training data, methodologies, moderation policies, and technical trade-offs (such as between greater precision or recall) for public scrutiny. Machine learning automation should ideally be part of a larger content moderation system characterized by human judgment, greater transparency, rights of appeal, and other safeguards. Amnesty, in collaboration with Element AI, also developed a machine learning model to better understand the potential and risks of using machine learning in content moderation systems. This model was able to achieve results comparable to the digital volunteers at predicting abuse, although it is 'far from perfect still', Amnesty notes. It achieves about a 50% accuracy level when compared to the judgment of experts: it identified 2 in every 14 tweets as abusive or problematic, whereas experts identified 1 in every 14 tweets as abusive or problematic. "Troll Patrol isn't about policing Twitter or forcing it to remove content. We are asking it to be more transparent, and we hope that the findings from Troll Patrol will compel it to make that change. Crucially, Twitter must start being transparent about how exactly they are using machine learning to detect abuse, and publish technical information about the algorithms they rely on," said Milena Marin, senior advisor for tactical research at Amnesty International. Read more: the full list of Amnesty's recommendations to Twitter. People on Twitter (the irony) are shocked at the release of Amnesty's report, and #ToxicTwitter is trending. https://twitter.com/gregorystorer/status/1074959864458178561 https://twitter.com/blimundaseyes/status/1074954027287396354 https://twitter.com/MikeWLink/status/1074500992266354688 https://twitter.com/BethRigby/status/1074949593438265344 Check out the full Troll Patrol report on Amnesty's website. Also check out their machine learning based methodology in detail.

Amnesty International takes on Google over Chinese censored search engine, Project Dragonfly
Twitter CEO, Jack Dorsey slammed by users after a photo of him holding 'smash Brahminical patriarchy' poster went viral
Twitter plans to disable the 'like' button to promote healthy conversations; should retweet be removed instead?
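The precision/recall trade-off the report asks platforms to document is worth making concrete, since it also explains the "2 in 14 versus 1 in 14" figure above. The short sketch below uses invented labels (not Amnesty's data) to show how the two metrics pull in different directions for an abuse classifier.

```python
# Toy precision/recall illustration for an abuse classifier (invented labels, not Amnesty's data).
from sklearn.metrics import precision_score, recall_score

# 14 tweets: experts flag 1 as abusive (1); the model flags 2, including the expert-flagged one.
expert_labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
model_flags   = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]

# Precision: of the tweets the model flagged, how many did experts also flag?
# Recall: of the tweets experts flagged, how many did the model catch?
print("precision:", precision_score(expert_labels, model_flags))  # 0.5 - one false alarm per true hit
print("recall:   ", recall_score(expert_labels, model_flags))     # 1.0 - the abusive tweet was caught

# Flagging more aggressively raises recall but lowers precision; a stricter model does the opposite.
# This is the kind of trade-off Amnesty wants platforms to disclose when they automate moderation.
```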


NeurIPS 2018: How machine learning experts can work with policymakers to make good tech decisions [Invited Talk]

Bhagyashree R
18 Dec 2018
6 min read
At the 32nd annual NeurIPS conference, held earlier this month, Edward William Felten, a professor of computer science and public affairs at Princeton University, spoke about how decision makers and tech experts can work together to make better policies. The talk aimed to answer questions such as why public policy should matter to AI researchers, what role researchers can play in policy debates, and how researchers can help bridge divides between the research and policy communities. While AI and machine learning are being used in high-impact areas and have seen heavy adoption in every field, in recent years they have also gained a lot of attention from policymakers. Technology has become a huge topic of discussion among policymakers, mainly because of its cases of failure and how it is being used or misused. They have now started formulating laws and regulations, and holding discussions about how society will govern the development of these technologies. Prof. Felten explained how constructive engagement with policymakers will lead to better outcomes for technology, government, and society.

Why should tech be regulated?
Regulating tech is important, and for that researchers, data scientists, and other people in tech fields have to close the gap between their research labs and cubicles and society. Prof. Felten emphasizes that it is up to the tech people to bridge this gap, as we not only have the opportunity but also a duty to participate more actively and productively in public life. Many people are coming to the conclusion that tech should be regulated before it is too late. In a piece published by the Wall Street Journal, three experts debated whether the government should regulate AI. One of them, Ryan Calo, explains, "One of the ironies of artificial intelligence is that proponents often make two contradictory claims. They say AI is going to change everything, but there should be no changes to the law or legal institutions in response." Prof. Felten points out that laws and policies are meant to change in order to adapt to current conditions. They are not written once and for all; rather, law is a living system that adapts to what is going on in society. And if we believe that technology is going to change everything, we can expect that law will change too. Prof. Felten also said that not only tech researchers and policymakers but society too should have some say in how technology is developed: "After all the people who are affected by the change that we are going to cause deserve some say in how that change happens, how it is used. If we believe in a society which is fundamentally democratic in which everyone has a stake and everyone has a voice then it is only fair that those lives we are going to change have some say in how that change come about and what kind of changes are going to happen and which are not."

How experts can work with decision makers to make good tech decisions
There are three key approaches we can take to engage with policymakers in decisions about technology.

Engage in a two-way dialogue with policymakers
As researchers, we might think that we are tech experts or scientists who don't need to get involved in politics; we just need to share the facts we know and our job is done. But if researchers really want to maximize their impact in policy debates, they need to combine the knowledge and preferences of policymakers with their own knowledge and preferences.
That means they need to take into account what policymakers might already have heard about a particular subject and the issues or approaches that resonate with them. Prof. Felten explains that this type of understanding and exchange of ideas happens in two stages. Researchers need to ask policymakers several questions, and this is not a one-time thing but a multi-round protocol: they have to go back and forth with the person and build engagement and mutual trust over time. Then they need to put themselves in the shoes of a decision maker and understand how to structure the decision space for them. Be present in the room when the decisions are being made To have influence on the decisions that get made, researchers need to have “boots on the ground.” Though not everyone has to engage in this deep and long-term process of decision making, we need some people from the community to engage on behalf of the community. Researchers need to be present in the room when the decisions are being made. This means taking posts as advisers or civil servants. We already have a range of such posts at both local and national government levels, alongside a range of opportunities to engage less formally in policy development and consultations. Creating a career path and rewarding policy engagement To drive this engagement, we need to create a career path that rewards policy engagement. We should have a way for researchers to move between policy and research careers. Prof. Felten pointed to a range of US-based initiatives that seek to bring those with technical expertise into policy-oriented roles, such as the US Digital Service. He adds that if we do not create these career paths, and if this becomes something that people can do only by sacrificing their careers, then very few people will do it. This needs to be an activity that we learn to respect when people in the community do it well. We need to build incentives, whether that is career incentives in academia or an understanding that working in government or on policy issues is a valuable part of one kind of academic career rather than a detour or a dead end. To watch the full talk, check out the NeurIPS Facebook page. NeurIPS 2018: Rethinking transparency and accountability in machine learning NeurIPS 2018: Developments in machine learning through the lens of Counterfactual Inference [Tutorial] Accountability and algorithmic bias: Why diversity and inclusion matters [NeurIPS Invited Talk]

NVIDIA demos a style-based generative adversarial network that can generate extremely realistic images; has ML community enthralled

Prasad Ramesh
17 Dec 2018
4 min read
In a paper published last week, NVIDIA researchers came up with a way to generate photos that look like they were taken with a camera. This is done using generative adversarial networks (GANs). An alternative architecture for GANs Borrowing from the style transfer literature, the researchers use an alternative generator architecture for GANs. The new architecture induces an automatically learned, unsupervised separation of high-level attributes of an image. These attributes can be the pose or identity of a person. Images generated via the architecture also have some stochastic variation applied to them, such as freckles or hair placement. The architecture allows intuitive and scale-specific control of the synthesis to generate different variations of images. Better image quality than a traditional GAN This new generator is better than the state of the art with respect to image quality; the images have better interpolation properties, and the generator disentangles the latent factors of variation better. In order to quantify interpolation quality and disentanglement, the researchers propose two new automated methods that are applicable to any generator architecture. They also use a new high-quality, highly varied dataset of human faces. With motivation from the style transfer literature, NVIDIA researchers re-design the generator architecture to expose novel ways of controlling image synthesis. The generator starts from a learned constant input and adjusts the style of the image at each convolution layer. It makes these changes based on the latent code, thereby having direct control over the strength of image features across different scales. When noise is injected directly into the network, this architectural change causes automatic separation of high-level attributes in an unsupervised manner. Source: A Style-Based Generator Architecture for Generative Adversarial Networks In other words, the architecture combines attributes from different images in the dataset and applies some variation to synthesize images that look real. As shown in the paper, surprisingly, the redesign of the generator does not compromise image quality but instead improves it considerably. The conclusion is that a traditional GAN generator architecture is inferior to a style-based design. The researchers generate not only human faces but also bedrooms, cars, and cats with this new architecture. Public reactions This synthetic image generation has generated excitement among the public. A comment from Hacker News reads: “This is just phenomenal. Can see this being a fairly disruptive force in the media industry. Also, sock puppet factories could use this to create endless numbers of fake personas for social media astroturfing.” Another comment reads: “The improvements in GANs from 2014 are amazing. From coarse 32x32 pixel images, we have gotten to 1024x1024 images that can fool most humans.” Fake photographic images as evidence? As a thread on Twitter suggests, could this be the end of photography as evidence? Not very likely, at least for the time being. For something to be considered as evidence, very specific content is needed, for example, a specific person doing a specific action. As seen from the results in the paper, some cat images are ugly and deformed, far from looking like the real thing. Also, “Our training time is approximately one week on an NVIDIA DGX-1 with 8 Tesla V100 GPUs”, and that is a setup that costs up to $70K. 
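For readers who want a feel for the style-modulation idea described above, here is a minimal, illustrative PyTorch sketch. It is not NVIDIA's released code; module names such as `AdaIN` and `StyleBlock`, and the 512-dimensional sizes, are assumptions chosen for clarity. Each block normalizes its feature maps, injects per-pixel noise for stochastic detail, and then re-scales the result using a style vector w.

```python
import torch
import torch.nn as nn

class AdaIN(nn.Module):
    """Adaptive instance normalization: re-styles feature maps using a
    per-channel scale and bias predicted from the latent style vector w."""
    def __init__(self, channels, w_dim):
        super().__init__()
        self.norm = nn.InstanceNorm2d(channels)
        self.to_scale = nn.Linear(w_dim, channels)
        self.to_bias = nn.Linear(w_dim, channels)

    def forward(self, x, w):
        x = self.norm(x)                                    # strip existing per-channel statistics
        scale = self.to_scale(w).unsqueeze(-1).unsqueeze(-1)
        bias = self.to_bias(w).unsqueeze(-1).unsqueeze(-1)
        return scale * x + bias                             # inject the style at this resolution

class StyleBlock(nn.Module):
    """One synthesis block: convolution, per-pixel noise, then AdaIN."""
    def __init__(self, channels, w_dim):
        super().__init__()
        self.conv = nn.Conv2d(channels, channels, 3, padding=1)
        self.noise_weight = nn.Parameter(torch.zeros(1, channels, 1, 1))
        self.adain = AdaIN(channels, w_dim)
        self.act = nn.LeakyReLU(0.2)

    def forward(self, x, w):
        x = self.conv(x)
        noise = torch.randn(x.size(0), 1, x.size(2), x.size(3), device=x.device)
        x = x + self.noise_weight * noise                   # stochastic detail (freckles, hair placement)
        return self.act(self.adain(x, w))

# The generator starts from a learned constant and stacks blocks like this
# at increasing resolutions, each conditioned on w from a mapping network.
const = nn.Parameter(torch.randn(1, 512, 4, 4))
block = StyleBlock(512, w_dim=512)
w = torch.randn(2, 512)                                     # style codes for a batch of two images
out = block(const.expand(2, -1, -1, -1), w)
print(out.shape)  # torch.Size([2, 512, 4, 4])
```

In the full architecture, a separate mapping network first transforms the latent code z into w, and blocks of this kind are stacked from 4x4 up to 1024x1024 resolution, which is what gives scale-specific control over image features.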
Besides, some speculate that there will be bills in 2019 to control the use of such AI systems: https://twitter.com/BobbyChesney/status/1074046157431717894 Even the big names in AI are noticing this paper: https://twitter.com/goodfellow_ian/status/1073294920046145537 You can see a video showcasing the generated images on YouTube. This AI generated animation can dress like humans using deep reinforcement learning DeepMasterPrints: ‘master key’ fingerprints made by a neural network can now fake fingerprints UK researchers have developed a new PyTorch framework for preserving privacy in deep learning

NeurIPS 2018: Rethinking transparency and accountability in machine learning

Bhagyashree R
16 Dec 2018
8 min read
Key takeaways from the discussion To solve problems with machine learning, you must first understand them. Different people or groups of people are going to define a problem in different ways. So, we shouldn't believe that the way we want to frame the problem computationally is the right way. If we allow that our systems include people and society, it is clear that we have to help negotiate values, not simply define them. Last week, at the 32nd annual NeurIPS conference, Nitin Kohli, Joshua Kroll, and Deirdre Mulligan presented the common pitfalls we see when studying the human side of machine learning. Machine learning is being used to make decisions in high-impact areas like medicine, criminal justice, employment, and education. In recent years, we have seen that this use of machine learning and algorithmic decision making has resulted in unintended discrimination. It’s becoming clear that even models developed with the best of intentions may exhibit discriminatory biases and perpetuate inequality. Although researchers have been analyzing how to put concepts like fairness, accountability, transparency, explanation, and interpretability into practice in machine learning, properly defining these things can prove a challenge. Attempts have been made to define them mathematically, but this can bring new problems. This is because applying mathematical logic to human concepts that have unique and contested political and social dimensions necessarily has blind spots - every point of contestation can't be integrated into a single formula. In turn, this can cause friction with other disciplines as well as the public. Based on their research on what various terms mean in different contexts, Nitin Kohli, Joshua Kroll, and Deirdre Mulligan drew out some of the most common misconceptions machine learning researchers and practitioners hold. Sociotechnical problems To find a solution to a particular problem, data scientists need precise definitions. But how can we verify that these definitions are correct? Indeed, many definitions will be contested, depending on who you are and what you want them to mean. “A definition that is fair to you will not necessarily be fair to me,” remarks Mr. Kroll. Mr. Kroll explained that while definitions can be unhelpful, they are nevertheless essential from a mathematical perspective. This means there appears to be an unresolved conflict between concepts and mathematical rigor. But there might be a way forward. Perhaps it’s wrong to simply think in this dichotomy of logical rigor vs. the messy reality of human concepts. One of the ways out of this impasse is to get beyond the dichotomy itself. Although it’s tempting to think of the technical and mathematical dimension on one side, with the social and political aspect on the other, we should instead see them as intricately related. They are, Kroll suggests, socio-technical problems. Kroll goes on to say that we cannot ignore the social consequences of machine learning: “Technologies don’t live in a vacuum and if we pretend that they do we kind of have put our blinders on and decided to ignore any human problems.” Fairness in machine learning In the real world, fairness is a concept directly linked to processes. Think, for example, of the voting system. Citizens cast votes for their preferred candidates, and the candidate who receives the most support is elected. Here, even if the winning candidate was not the one a particular citizen voted for, that citizen at least got the chance to participate in the process. 
This type of fairness is called procedural fairness. However, in the technical world, fairness is often viewed in a subtly different way. When you place it in a mathematical context, fairness centers on outcome rather than process. Kohli highlighted that trade-offs between these different concepts can’t be avoided. They’re inevitable. A mathematical definition of fairness places a constraint on the behavior of a system, and this constraint narrows down the class of models that can satisfy these conditions. So, if we decide to add too many fairness constraints to the system, some of them will be self-contradictory. One more important point machine learning practitioners should keep in mind is that when we talk about the fairness of a system, that system isn’t a self-contained and coherent thing. It is not a logical construct - it’s a social one. This means there are a whole host of values, ideas, and histories that have an impact on its reality. In practice, this ultimately means that the complexity of the real world from which we draw and analyze data can have an impact on how a model works. Kohli explained this by saying, “it doesn’t really matter... whether you are building a fair system if the context in which it is developed and deployed in is fundamentally unfair.” Accountability in machine learning Accountability is ultimately about trust. It’s about the extent to which you can be sure you know what is ‘true’ about a system. It refers to the fact that you know how it works and why it does things in certain ways. In more practical terms, it’s all about invariance and reliability. To ensure accountability inside machine learning models, we need to follow a layered model. The bottom layer is an accounting or recording layer that keeps track of what a given system is doing and the ways in which it might have been changed. The next layer is a more analytical layer. This is where the records from the bottom layer are analyzed, with decisions made about performance - whether anything needs to be changed and how it should be changed. The final and top-most layer is about responsibility. It’s where the proverbial buck stops - with those outside of the algorithm, those involved in its construction. “Algorithms are not responsible, somebody is responsible for the algorithm,” explains Kroll. Transparency Transparency is a concept heavily tied up with accountability. Arguably you have no accountability without transparency. The layered approach discussed above should help with transparency, but it’s also important to remember that transparency is about much more than simply making data and code available. Instead, it demands that the decisions made in the development of the system are made available and clear too. Mr. Kroll emphasizes, “to the person at the ground-level for whom the decisions are being taken by some sort of model, these technical disclosures aren’t really useful or understandable.” Explainability In his paper Explanation in Artificial Intelligence: Insights from the Social Sciences, Tim Miller describes what explainable artificial intelligence is. According to Miller, explanation takes many forms such as causal, contrastive, selective, and social. A causal explanation gives the reasons why something happened, for example, while contrastive explanations can provide answers to questions like “Why P rather than not-P?”. But the most important point here is that explanations are selective. 
An explanation cannot include all reasons why something happened; explanations are always context-specific, a response to a particular need or situation. Think of it this way: if someone asks you why the toaster isn’t working, you could just say that it’s broken. That might be satisfactory in some situations, but you could, of course, offer a more substantial explanation, outlining what was technically wrong with the toaster, how that technical fault came to be there, how the manufacturing process allowed that to happen, how the business would allow that manufacturing process to make that mistake… you could, of course, go on and on. Data is not the truth Today, there is a huge range of datasets available that will help you develop different machine learning models. These models can be useful, but it’s essential to remember that they are models. A model isn’t the truth - it’s an abstraction, a representation of the world in a very specific way. One way of taking this fact into account is the concept of ‘construct validity’. This sounds complicated, but all it really refers to is the extent to which a test - say a machine learning algorithm - actually measures what it says it’s trying to measure. The concept is widely used in disciplines like psychology, but in machine learning it simply refers to the way we validate a model based on its historical predictive accuracy. In a nutshell, it’s important to remember that just as data is an abstraction of the world, models are also an abstraction of the data. There’s no way of changing this, but having an awareness that we’re dealing in abstractions ensures that we do not lapse into the mistake of thinking we are in the realm of ‘truth’. Building fair(er) systems will ultimately require an interdisciplinary approach, involving domain experts working in a variety of fields. If machine learning and artificial intelligence are to make a valuable and positive impact in fields such as justice, education, and medicine, it’s vital that those working in those fields work closely with those with expertise in algorithms. This won’t fix everything, but it will be a more robust foundation from which we can begin to move forward. You can watch the full talk on the Facebook page of NeurIPS. Researchers unveil a new algorithm that allows analyzing high-dimensional data sets more effectively, at NeurIPS conference Accountability and algorithmic bias: Why diversity and inclusion matters [NeurIPS Invited Talk] NeurIPS 2018: A quick look at data visualization for Machine learning by Google PAIR researchers [Tutorial]
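To make the earlier point about incompatible fairness constraints concrete, here is a small, self-contained Python sketch. It is illustrative only and not taken from the talk; the synthetic data, group base rates, and metric names are assumptions. When base rates differ across two groups, a classifier that equalizes true-positive rates still fails demographic parity, which is one reason piling mathematical fairness constraints onto the same system eventually produces contradictions.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Difference in positive-prediction rates between the two groups."""
    return abs(y_pred[group == 0].mean() - y_pred[group == 1].mean())

def equal_opportunity_gap(y_true, y_pred, group):
    """Difference in true-positive rates between the two groups."""
    tpr = lambda g: y_pred[(group == g) & (y_true == 1)].mean()
    return abs(tpr(0) - tpr(1))

rng = np.random.default_rng(0)
n = 10_000
group = rng.integers(0, 2, n)
# Different base rates per group: the source of the tension between criteria.
y_true = rng.binomial(1, np.where(group == 0, 0.3, 0.6))
# A classifier that perfectly equalizes true-positive rates across groups...
y_pred = y_true.copy()

print("equal opportunity gap:", equal_opportunity_gap(y_true, y_pred, group))  # ~0
print("demographic parity gap:", demographic_parity_gap(y_pred, group))        # clearly > 0
```

The point is not that one metric is right and the other wrong, but that each encodes a different, contestable notion of fairness, which is exactly the socio-technical tension the speakers describe.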

NeurIPS 2018: Developments in machine learning through the lens of Counterfactual Inference [Tutorial]

Savia Lobo
15 Dec 2018
7 min read
The 32nd NeurIPS Conference kicked off on the 2nd of December and continued till the 8th of December in Montreal, Canada. The conference covered tutorials, invited talks, product releases, demonstrations, presentations, and announcements related to machine learning research. “Counterfactual Inference” is one such tutorial, presented at NeurIPS by Susan Athey, the Economics of Technology Professor at the Stanford Graduate School of Business. The tutorial reviewed the literature that brings together recent developments in machine learning with methods for counterfactual inference. It focused on problems where the goal is to estimate the magnitude of causal effects, as well as to quantify the researcher’s uncertainty about these magnitudes. She starts by mentioning two sets of issues that make causal inference a must-know concept for AI. First, there are gaps between what we are doing in research and what firms are applying. There are success stories such as Google Images and so on; however, even the top tech companies do not fully adopt all machine learning/AI concepts. If a firm dumps its old, simple regression credit scoring model and makes use of a black box based on ML, is it going to worry about what happens when it uses the black-box algorithm? According to Susan, the reason why firms and economists have historically used simple models is that with a black box it is difficult to understand, just by looking at the data, whether the approach used is right. A simple, interpretable model, by contrast, helps in reasoning about the correctness of the approach, and this helps researchers to make improvements in the model. Secondly, stability and robustness are also important for applications; transfer learning helps estimate a model in one setting and use the same learning in another setting. Also, these models should show fairness, as many aspects of discrimination relate to correlation vs. causation. Finally, human-like AI behavior gives systems the ability to make reasonable, never-seen-before decisions. All of these desired properties can be obtained in a causal model. The Causal Inference Framework In this framework, the goal is to learn a model of how the world works, for example, what happens to a body when a drug enters it. The impact of an intervention can be context-specific: if a user learns something in a particular setting but it isn’t working well in another setting, that is not a problem with the framework. Causal inference is hard to do, however, and there are some challenges, including: not having the right kind of variation in the data; a lack of quasi-experimental data for estimation; unobserved contexts/confounders or insufficient data to control for observed confounders; and the analyst’s lack of knowledge about the model. Prof. Athey explains a true AI algorithm using the example of a contextual bandit, under which there might be different treatments. In this example, the system can select among alternative choices. It must have an explicit or implicit model of the payoffs from the alternatives, and it also learns from past data. Here, the initial stages of learning have limited data, and there is a statistician inside the AI that performs counterfactual reasoning. That statistician should use the best-performing techniques (in terms of efficiency and bias). 
Counterfactual Inference Approaches Approach 1: Program Evaluation or Treatment Effect Estimation The goal of this approach is to estimate the impact of an intervention or of treatment assignment policies. This literature focuses mainly on low-dimensional interventions. Here, the estimand, or the thing that people want to learn, is the average effect (Did it work?). For more sophisticated projects, people seek the heterogeneous effect (For whom did it work?) and the optimal policy (a mapping from people’s characteristics to their assignments). The main goal here is to set confidence intervals around these effects while avoiding bias and accounting for noisy sampling. This literature focuses on designs that enable identification and estimation of these effects without using randomized experiments; some of these designs include regression discontinuity, difference-in-differences, and so on. Approach 2: Structural Estimation or ‘Generative models and counterfactuals’ Here the goal is to estimate the impact on the welfare/profits of participants under alternative counterfactual regimes. These regimes may never have been observed in relevant contexts. This approach also needs a behavioral model of participants. One can make use of dynamic structural models to learn about the value function from agent choices in different states. Approach 3: Causal discovery The goal of this approach is to uncover the causal structure of a system. Here the analyst believes that there is an underlying structure where some variables are causes of others, e.g. a physical stimulus leads to biological responses. Applications of this can be found in understanding software systems and biological systems. Recent literature brings causal reasoning, statistical theory, and modern machine learning algorithms together to solve important problems. The difference between supervised learning and causal inference is that supervised learning can be evaluated on a test set in a model-free way; in causal inference, the parameter being estimated is not observed in a test set, so estimation requires theoretical assumptions and domain knowledge. Estimating ATE (Average Treatment Effects) under unconfoundedness Here only observational data is available, and the analyst has access to data that captures the part of the information used to assign units to treatments that is related to potential outcomes. The speaker used the example of how online ads are targeted using cookies. A user sees car ads because the advertiser knows that the user has visited car reviewer websites. Purchases cannot simply be compared between users who saw an ad and those who did not, because the interest in cars is the unobserved confounder. However, the analyst can see the history of the websites visited by the user, and this is the main source of information for the advertiser about user interests. Using supervised ML to estimate ATE under unconfoundedness The first supervised ML method is propensity score weighting, or KNN on the propensity score; for instance, one can use a LASSO regression model to estimate the propensity score. The second method is regression adjustment, which estimates the outcomes as a function of the features to get a causal effect. The next method is estimating the CATE (conditional average treatment effect) and taking averages, for example using the BART model. Another method mentioned by Prof. Athey is doubly robust / double machine learning, which uses cross-fitted augmented inverse propensity scores. 
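As a rough illustration of these estimators (a sketch under simplifying assumptions, not Prof. Athey's code: cross-fitting is omitted, and plain scikit-learn models stand in for the ML components), the snippet below simulates observational data with observed confounders and recovers the average treatment effect with inverse propensity weighting and an augmented IPW (doubly robust) estimator.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(42)
n = 20_000
x = rng.normal(size=(n, 3))                        # observed confounders
p = 1 / (1 + np.exp(-(x[:, 0] - 0.5 * x[:, 1])))   # true propensity to be treated
w = rng.binomial(1, p)                             # treatment assignment
y = 2.0 * w + x @ np.array([1.0, -1.0, 0.5]) + rng.normal(size=n)  # true ATE = 2

# Propensity model (who gets treated, given the confounders)
e_hat = LogisticRegression().fit(x, w).predict_proba(x)[:, 1]

# Outcome models fit separately on treated and control units
mu1 = LinearRegression().fit(x[w == 1], y[w == 1]).predict(x)
mu0 = LinearRegression().fit(x[w == 0], y[w == 0]).predict(x)

# Inverse propensity weighting estimator
ate_ipw = np.mean(w * y / e_hat - (1 - w) * y / (1 - e_hat))

# Augmented IPW (doubly robust): outcome model plus a weighted residual correction
ate_aipw = np.mean(
    mu1 - mu0
    + w * (y - mu1) / e_hat
    - (1 - w) * (y - mu0) / (1 - e_hat)
)

print(f"IPW estimate:  {ate_ipw:.3f}")   # close to 2
print(f"AIPW estimate: {ate_aipw:.3f}")  # close to 2, and typically more stable
```

The AIPW estimator remains consistent if either the propensity model or the outcome model is well specified, which is what "doubly robust" refers to; in practice the nuisance models would be flexible ML learners and cross-fitted, as mentioned in the tutorial.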
Another method she mentioned was residual balancing, which avoids assuming a sparse model, thus allowing applications with a complex assignment mechanism. If unconfoundedness fails, the alternative assumption is that there exists an instrumental variable Zi that is correlated with the treatment Wi (“relevance”) and that affects the outcome only through Wi (the exclusion restriction). Structural Models Structural models enable counterfactuals for never-seen worlds. Combining machine learning with structural models brings attention to identification and to estimation using “good” exogenous variation in the data. Also, adding a sensible structure improves the performance required for never-seen counterfactuals and increases efficiency for sparse data (e.g. longitudinal data). The nature of the structure includes: learning underlying preferences that generalize to new situations; incorporating the nature of the choice problem; and the fact that many domains have established setups that perform well in data-poor environments. With the help of discrete choice models, users can evaluate the impact of a new product introduction or of the removal of a product from the choice set. On combining these discrete choice models with ML, there are two approaches to product interactions: use information about product categories and assume products are substitutes within categories, or do not use the available information about categories and estimate substitutes/complements from the data. Susan concluded by mentioning some of the challenges of causal inference, which include data sufficiency and finding sufficient/useful variation in historical data. She also mentions that recent advances in computational methods in ML don’t help with this. However, tech firms conducting lots of experiments, running bandits, and interacting with humans at large scale can greatly expand the ability to learn about causal effects! Head over to the NeurIPS Facebook page for Susan Athey’s entire tutorial on Counterfactual Inference. Researchers unveil a new algorithm that allows analyzing high-dimensional data sets more effectively, at NeurIPS conference Accountability and algorithmic bias: Why diversity and inclusion matters [NeurIPS Invited Talk] NeurIPS 2018: A quick look at data visualization for Machine learning by Google PAIR researchers [Tutorial]
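Returning to the instrumental-variables assumption above, the following bare-bones sketch (simulated data, numpy only, illustrative rather than anything presented in the tutorial) shows why an instrument helps when unconfoundedness fails: a naive regression is biased by the unobserved confounder, while the simple Wald/two-stage estimator recovers the true effect.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 50_000
u = rng.normal(size=n)                  # unobserved confounder
z = rng.binomial(1, 0.5, n)             # instrument: shifts treatment, excluded from outcome
w = (0.8 * z + 0.5 * u + rng.normal(size=n) > 0.5).astype(float)  # treatment
y = 2.0 * w + 1.5 * u + rng.normal(size=n)                        # true effect = 2

# Naive OLS slope is biased because u drives both w and y
naive = np.cov(w, y)[0, 1] / np.var(w)

# Wald / two-stage estimator with a binary instrument
first_stage = w[z == 1].mean() - w[z == 0].mean()     # effect of z on w
reduced_form = y[z == 1].mean() - y[z == 0].mean()    # effect of z on y
ate_iv = reduced_form / first_stage

print(f"naive OLS: {naive:.2f}")   # biased upward by the confounder
print(f"IV (Wald): {ate_iv:.2f}")  # close to the true effect of 2
```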

Key Takeaways from Sundar Pichai’s Congress hearing over user data, political bias, and Project Dragonfly

Natasha Mathur
14 Dec 2018
12 min read
Google CEO, Sundar Pichai testified before the House Judiciary Committee earlier this week. The hearing titled “Transparency & Accountability: Examining Google and its Data Collection, Use, and Filtering Practices” was a three-and-a-half-hour question-answer session that centered mainly around user data collection at Google, allegations of political bias in its search algorithms, and Google’s controversial plans with China. “All of these topics, competition, censorship, bias, and others..point to one fundamental question that demands the nation’s attention. Are America’s technology companies serving as instruments of freedom or instruments of control?,” said Representative Kevin McCarthy of California, the House Republican leader. The committee members could have engaged with Pichai on more important topics had they not been busy focussing on opposing each other’s opinions over whether Google search and its other products are biased against conservatives. Also, most of Pichai’s responses were unsatisfactory as he cleverly dodged questions regarding its Project Dragonfly and user data. Here are the key highlights from the testimony. Allegations of Political Bias One common theme throughout the long hearing session was Republicans asking questions based around alleged bias against conservatives on Google's platforms. Google search Bias Rep. Lamar Smith asked questions regarding the alleged political bias that is “imbibed” in Google’s search algorithms and its culture. Smith presented an example of a study by Robert Epstein, a Harvard trained psychologist. As per the study’s results, Google’s search bias likely swung 2.6 million votes to Hillary Clinton during the 2016 elections. To this Pichai’s reply was that Google has investigated some of the studies including the one by Dr. Epstein, and found that there were issues with the methodology and its sample size. He also mentioned how Google evaluates their search results for accuracy by using a “robust methodology” that it has been using for the past 20 years. Pichai also added that “providing users with high quality, accurate, and trusted information is sacrosanct to us. It’s what our principles are and our business interests and our natural long-term incentives are aligned with that. We need to serve users everywhere and we need to earn their trust in order to do so.” Google employees’ bias, the reason for biased search algorithms, say Republicans Smith also presented examples of pro-Trump content and immigration laws being tagged as hate speech on Google search results posing threat to the democratic form of government. He also alleged that people at Google were biased and intentionally transferred their biases into these search algorithms to get the results they want and management allows it. Pichai clarified that Google doesn't manually intervene on any particular search result. “Google doesn’t choose conservative voices over liberal voices. There’s no political bias and Google operates in a neutral way,” added Pichai. Would Google allow an independent third party to study its search results to determine the degree of political bias? Pichai responded to this question saying that they already have third parties that are completely independent and haven’t been appointed by Google in place for evaluating its search algorithms. “We’re transparent as to how we evaluate our search. We publish our rater guidelines. We publish it externally and raters evaluate it, we’re trying hard to understand what users want and this is what we think is right. 
It’s not possible for an employee or a group of employees to manipulate our search algorithm”. Political advertising bias The Committee Chairman Bob Goodlatte, a Republican from Virginia also asked Pichai about political advertising bias on Google’s ad platforms that offer different rates for different political candidates to reach prospective voters. This is largely different than how other competitive media platforms like TV and radio operate - offering the lowest rate to all political candidates. He asked if Google should charge the same effective ad rates to political candidates. Pichai explained that their advertising products are built without any bias and the rates are competitive and set by a live auction process. The prices are calculated automatically based on the keywords that you’re bidding for, and on the demand in the auction. There won’t be a difference in rates based on any political reasons unless there are keywords that are of particular interest. He referred the whole situation to a demand-supply equilibrium, where the rates can differ but that will vary from time to time. There could be occasions when there is a substantial difference in rates based on the time of the day, location, how keywords are chosen etc, and it’s a process that Google has been using for over 20 years. Pichai further added that “anything to do with the civic process, we make sure to do it in a non-partisan way and it's really important for us”. User data collection and security Another highlight of the hearing was Google’s practices around user data collection and security. “Google is able to collect an amount of information about its users that would even make the NSA blush. Americans have no idea the sheer volume of information that is collected”, said Goodlatte. Location tracking data related privacy concerns During Mr. Pichai’s testimony, the first question from Rep. Goodlatte was about whether consumers understand the frequency and amount of location data that Google collects from its Android operating system. Goodlatte asked Pichai about the collection of location data and apps running on Android. To this Pichai replied that Google offers users controls for limiting location data collection. “We go to great lengths to protect their privacy, we give them transparency, choice, and control,” says Pichai. Pichai highlighted that Android is a powerful smartphone that offers services to over 2 billion people. User data that is collected via Android depends on the applications that users choose to use. He also pointed out that Google makes it very clear to its users about what information is collected. He pointed out that there are terms of service and also a “privacy checkup”. Going to  “my account” settings on Gmail gives you a clear picture of what user data they have. He also says that users can take that data to other platforms if they choose to stop using Google. On Google+ data breach Another Rep. Jerrold Nadler talked about the recent Google plus data breach that affected some 52.5 million users. He asked Pichai about the legal obligations that the company is under to publicly expose the security issues. Pichai responded to this saying that Google “takes privacy seriously,” and that Google needs to alert the users and the necessary authorities of any kind of data breach or bugs within 72 hours. He also mentions "building software inevitably has bugs associated as part of the process”.  
Google undertakes a lot of efforts to find bugs and the root cause of it, and make sure to take care of it. He also says how they have advanced protection in Gmail to offer a stronger layer of security to its users. Google’s commitment to protecting U.S. elections from foreign interference It was last year when Google discovered that Russian operatives spent tens of thousands of dollars on ads on its YouTube, Gmail and Google Search products in an effort to meddle in the 2016 US presidential election. “Does Google now know the full extent to which its online platforms were exploited by Russian actors in the election 2 years ago?” asked Nadler. Pichai responded that Google conducted a thorough investigation in 2016. It found out that there were two main ads accounts linked to Russia which advertised on google for about 4700 dollars in advertising. “We found a limited activity, improper activity, we learned from that and have increased the protections dramatically we have around our elections offering”, says Pichai. He also added that to protect the US elections, Google will do a significant review of how ads are bought, it will look for the origin of these accounts, share and collaborate with law enforcement, and other tech companies. “Protecting our elections is foundational to our democracy and you have my full commitment that we will do that,” said Pichai. Google’s plans with China Rep. Sheila Jackson Lee was the first person to directly ask Pichai about the company’s Project Dragonfly i.e. its plans of building a censored search engine with China. “We applauded you in 2010 when Google took a very powerful stand principle and democratic values over profits and came out of China,” said Jackson. Other who asked Pichai regarding Google's China plans were Rep. Tom Marino and Rep. David Cicilline. Google left China in 2010 because of concerns regarding hacking, attacks, censorship, and how the Chinese government was gaining access to its data. How is working with the Chinese govt to censor search results a part of Google’s core values? Pichai repeatedly said that Google has no plans currently to launch in China. “We don't have a search product there. Our core mission is to provide users with access to information and getting access to information is an important right (of users) so we try hard to provide that information”, says Pichai. He added that Google always has evidence based on every country that it has operated in. “Us reaching out and giving users more information has a very positive impact and we feel that calling but right now there are no plans to launch in China,” says Pichai. He also mentioned that if Google ever approaches a decision like that he’ll be fully transparent with US policymakers and “engage in consult widely”. He further added that Google only provides Android services in China for which it has partners and manufacturers all around the world. “We don't have any special agreements on user data with the Chinese government”, said Pichai.  On being asked by Rep. Marino about a report from The Intercept that said Google created a prototype for a search engine to censor content in China, Pichai replied, “we designed what a search could look like if it were to be launched in a country like China and that’s what we explored”. Rep. Cicilline asked Pichai whether any employees within Google are currently attending product meetings on Dragonfly. 
Pichai replied evasively saying that Google has “undertaken an internal effort, but right now there are no plans to launch a search service in China necessarily”. Cicilline shot another question at Pichai asking if Google employees are talking to members of the Chinese government, which Pichai dodged by responding with "Currently we are not in discussions around launching a search product in China," instead. Lastly, when Pichai was asked if he would rule out "launching a tool for surveillance and censorship in China”, he replied that Google’s mission is providing users with information, and that “we always think it’s in our duty to explore possibilities to give users access to information. I have a commitment, but as I’ve said earlier we’ll be very thoughtful and we’ll engage widely as we make progress”. On ending forced arbitration for all forms of discrimination Last month 20,000 Google employees along with Temps, Vendors, and Contractors walked out of their respective Google offices to protest discrimination and sexual harassment in the workplace. As part of the walkout, Google employees laid out five demands urging Google to bring about structural changes within the workplace. One of the demands was ending forced arbitration meaning that Google should no longer require people to waive their right to sue. Also, that every co-worker should have the right to bring a representative, or supporter of their choice when meeting with HR for filing a harassment claim. Rep. Pramila Jayapal asked Pichai if he can commit to expanding the policy of ending forced arbitration for any violation of an employee’s (also contractors) right not just sexual harassment. To this Pichai replied that Google is currently definitely looking into this further. “It’s an area where I’ve gotten feedback personally from our employees so we’re currently reviewing what we could do and I’m looking forward to consulting, and I’m happy to think about more changes here. I’m happy to have my office follow up to get your thoughts on it and we are definitely committed to looking into this more and making changes”, said Pichai. Managing misinformation and hate speech During the hearing, Pichai was questioned about how Google is handling misinformation and hate speech. Rep. Jamie Raskin asked why videos promoting conspiracy theory known as “Frazzledrip,” ( Hillary Clinton kills young women and drinks their blood) are still allowed on YouTube. To this Pichai responded with, “We would need to validate whether that specific video violates our policies”. Rep. Jerry Nadler also asked Pichai about Google’s actions to "combat white supremacy and right-wing extremism." Pichai said Google has defined policies against hate speech and that if Google finds violations, it takes down the content. “We feel a tremendous sense of responsibility to moderate hate speech, define hate speech clearly inciting violence or hatred towards a group of people. It's absolutely something we need to take a strict line on. We’ve stated our policies strictly and we’re working hard to make our enforcement better and we’ve gotten a lot better but it's not enough so yeah we’re committed to doing a lot more here”, said Pichai. Our Take Hearings between tech companies and legislators, in the current form, are an utter failure. In addition to making tech reforms, there is an urgent need to also make reforms in how policy hearings are conducted. It is high time we upgraded ourselves to the 21st century. 
These were the key highlights of the hearing held on 11th December 2018. We recommend you watch the complete hearing for a more comprehensive context. As Pichai defends Google’s “integrity” ahead of today’s Congress hearing, over 60 NGOs ask him to defend human rights by dropping Drag Google bypassed its own security and privacy teams for Project Dragonfly reveals Intercept Google employees join hands with Amnesty International urging Google to drop Project Dragonfly

The cruelty of algorithms: Heartbreaking open letter criticizes tech companies for showing baby ads after stillbirth

Bhagyashree R
13 Dec 2018
3 min read
2018 has thrown up a huge range of examples of the unintended consequences of algorithms. From the ACLU’s research in July, which showed how the algorithm in Amazon’s facial recognition software incorrectly matched images of members of Congress with mugshots, to the sexist algorithm Amazon used in its hiring process, this has been a year in which the damage that algorithms can cause has become apparent. But this week, an open letter by Gillian Brockell, who works at The Washington Post, highlighted the traumatic impact algorithmic personalization can have. In it, Brockell detailed how personalized ads accompanied her pregnancy, and how much the major platforms that dominate our digital lives appeared to know about it. “...I bet Amazon even told you [the tech companies to which the letter is addressed] my due date… when I created an Amazon registry,” she wrote. But she went on to explain how those very algorithms were incapable of processing the tragic death of her unborn baby, blind to the grief that would unfold in the aftermath: “Did you not see the three days silence, uncommon for a high frequency user like me”. https://twitter.com/STFUParents/status/1072759953545416706 Brockell’s grief was compounded by the way those companies continued to engage with her through automated messaging. She explained that although she clicked the “It’s not relevant to me” option those ads offer users, this only led algorithms to ‘decide’ that she had given birth, offering deals on strollers and nursing bras. As Brockell notes in her letter, stillbirths aren’t as rare as many think, with 26,000 happening in the U.S. alone every year. This fact only serves to emphasize the empathetic blind spots in the way algorithms are developed. “If you’re smart enough to realize that I’m pregnant, that I’ve given birth, then surely you’re smart enough to realize my baby died.” Brockell’s open letter garnered a lot of attention on social media, to such an extent that a number of the companies at which she had directed her letter responded. Speaking to CNBC, a Twitter spokesperson said, “We cannot imagine the pain of those who have experienced this type of loss. We are continuously working on improving our advertising products to ensure they serve appropriate content to the people who use our services.” Meanwhile, a Facebook advertising executive, Rob Goldman, responded, “I am so sorry for your loss and your painful experience with our products.” He also explained how these ads could be blocked: “We have a setting available that can block ads about some topics people may find painful — including parenting. It still needs improvement, but please know that we’re working on it & welcome your feedback.” Experian did not respond to requests for comment. However, even after taking Goldman’s advice, Brockell revealed she was then shown adoption adverts: https://twitter.com/gbrockell/status/1072992972701138945 “It crossed the line from marketing into Emotional Stalking,” said one Twitter user. While the political impact of algorithms has led to sustained commentary and criticism in 2018, this story reveals the personal impact algorithms can have. It highlights that as artificial intelligence systems become more and more embedded in everyday life, engineers will need an acute sensitivity and attention to detail regarding the potential use cases and consequences that certain algorithms may have. You can read Brockell’s post on Twitter. Facebook’s artificial intelligence research team, FAIR, turns five. But what are its biggest accomplishments? 
FAT Conference 2018 Session 3: Fairness in Computer Vision and NLP FAT Conference 2018 Session 4: Fair Classification

Deep Learning Indaba presents the state of Natural Language Processing in 2018

Sugandha Lahoti
12 Dec 2018
5 min read
The ’Strengthening African Machine Learning’ conference organized by Deep Learning Indaba, at Stellenbosch, South Africa, is ongoing right now. This 6-day conference will celebrate and strengthen machine learning in Africa through state-of-the-art teaching, networking, policy debate, and through support programmes. Yesterday, three conference organizers, Sebastian Ruder, Herman Kamper, and Stephan Gouws asked tech experts their view on the state of Natural Language Processing, more specifically these 4 questions: What do you think are the three biggest open problems in Natural Language Processing at the moment? What would you say is the most influential work in Natural Language Processing in the last decade, if you had to pick just one? What, if anything, has led the field in the wrong direction? What advice would you give a postgraduate student in Natural Language Processing starting their project now? The tech experts interviewed included the likes of Yoshua Bengio, Hal Daumé III, Barbara Plank, Miguel Ballesteros, Anders Søgaard, Lea Frermann, Michael Roth, Annie Louise, Chris Dyer, Felix Hill,  Kevin Knight and more. https://twitter.com/seb_ruder/status/1072431709243744256 Biggest open problems in Natural Language Processing at the moment Although each expert talked about a variety of Natural Language Processing open issues, the following common key themes recurred. No ‘real’ understanding of Natural language understanding Many experts argued that natural Language understanding is central and also important for natural language generation. They agreed that most of our current Natural Language Processing models do not have a “real” understanding. What is needed is to build models that incorporate common sense, and what (biases, structure) should be built explicitly into these models. Dialogue systems and chatbots were mentioned in several responses. Maletšabisa Molapo, a Research Scientist at IBM Research and one of the experts answered, “Perhaps this may be achieved by general NLP Models, as per the recent announcement from Salesforce Research, that there is a need for NLP architectures that can perform well across different NLP tasks (machine translation, summarization, question answering, text classification, etc.)” NLP for low-resource scenarios Another open problem is using NLP for low-resource scenarios. This includes generalization beyond the training data, learning from small amounts of data and other techniques such as Domain-transfer, transfer learning, multi-task learning. Also includes different supervised learning techniques, semi-supervised, weakly-supervised, “Wiki-ly” supervised, distantly-supervised, lightly-supervised, minimally-supervised and unsupervised learning. Per Karen Livescu, Associate Professor Toyota Technological Institute at Chicago, “Dealing with low-data settings (low-resource languages, dialects (including social media text "dialects"), domains, etc.).  This is not a completely "open" problem in that there are already a lot of promising ideas out there; but we still don't have a universal solution to this universal problem.” Reasoning about large or multiple contexts Experts believed that NLP has problems in dealing with large contexts. These large context documents can be either text or spoken documents, which currently lack common sense incorporation. 
According to, Isabelle Augenstein, tenure-track assistant professor at the University of Copenhagen, “Our current models are mostly based on recurrent neural networks, which cannot represent longer contexts well. One recent encouraging work in this direction I like is the NarrativeQA dataset for answering questions about books. The stream of work on graph-inspired RNNs is potentially promising, though has only seen modest improvements and has not been widely adopted due to them being much less straight-forward to train than a vanilla RNN.” Defining problems, building diverse datasets and evaluation procedures “Perhaps the biggest problem is to properly define the problems themselves. And by properly defining a problem, I mean building datasets and evaluation procedures that are appropriate to measure our progress towards concrete goals. Things would be easier if we could reduce everything to Kaggle style competitions!” - Mikel Artetxe. Experts believe that current NLP datasets need to be evaluated. A new generation of evaluation datasets and tasks are required that show whether NLP techniques generalize across the true variability of human language. Also what is required are more diverse datasets. “Datasets and models for deep learning innovation for African Languages are needed for many NLP tasks beyond just translation to and from English,” said Molapo. Advice to a postgraduate student in NLP starting their project Do not limit yourself to reading NLP papers. Read a lot of machine learning, deep learning, reinforcement learning papers. A PhD is a great time in one’s life to go for a big goal, and even small steps towards that will be valued. — Yoshua Bengio Learn how to tune your models, learn how to make strong baselines, and learn how to build baselines that test particular hypotheses. Don’t take any single paper too seriously, wait for its conclusions to show up more than once. — George Dahl I believe scientific pursuit is meant to be full of failures. If every idea works out, it’s either because you’re not ambitious enough, you’re subconsciously cheating yourself, or you’re a genius, the last of which I heard happens only once every century or so. so, don’t despair! — Kyunghyun Cho Understand psychology and the core problems of semantic cognition. Understand machine learning. Go to NeurIPS. Don’t worry about ACL. Submit something terrible (or even good, if possible) to a workshop as soon as you can. You can’t learn how to do these things without going through the process. — Felix Hill Make sure to go through the complete list of all expert responses for better insights. Google open sources BERT, an NLP pre-training technique Use TensorFlow and NLP to detect duplicate Quora questions [Tutorial] Intel AI Lab introduces NLP Architect Library  