
Tech News - Data

1209 Articles

Following Google, Facebook changes its forced arbitration policy for sexual harassment claims

Natasha Mathur
12 Nov 2018
3 min read
Last Thursday, Google changed its policy of forced arbitration for sexual harassment claims. A day later, Facebook announced that it is also changing its forced arbitration policy, which required employees to settle sexual harassment claims in private, as reported by the Wall Street Journal. This means that employees can now take sexual harassment complaints to a court of law.

Following in Google's footsteps, Facebook has made its arbitration policy optional. Anthony Harrison, Facebook's corporate media relations director, confirmed the change: "Today, we are publishing our updated Workplace Relationships policy and amending our arbitration agreements to make arbitration a choice rather than a requirement in sexual harassment claims. Sexual harassment is something that we take very seriously, and there is no place for it at Facebook."

Facebook also announced an updated "Relationships at work" policy. Under the updated policy, anyone who starts a relationship with someone in their management chain must disclose it to HR. Additionally, anyone at the level of Director or above who enters a relationship with someone at the company must report it to HR.

Google decided to modify its policy after 20,000 Google employees, along with temps, vendors, and contractors, walked out earlier this month to protest discrimination, racism, and sexual harassment at Google's workplace. Google made the arbitration process optional for individual sexual harassment and sexual assault claims. "Google has never required confidentiality in the arbitration process and it still may be the best path for a number of reasons (e.g. personal privacy), but, we recognize that the choice should be up to you," wrote Sundar Pichai, Google CEO, in the announcement.

Facebook is the latest tech company to change its forced arbitration policy. Other companies, such as Uber and Microsoft, have changed their arbitration policies in the recent past. Uber made arbitration optional back in May to bring "transparency, integrity, and accountability" to its handling of sexual harassment. Microsoft was one of the first major organizations to completely eliminate forced arbitration clauses for sexual harassment, last December. It seems the Google Walkout not only pushed Google to take a stand against sexual harassment but also inspired other companies to take the right steps on sensitive issues.

Related:
Facebook's big music foray: New soundtracking feature for stories and its experiments with video music and live streaming karaoke
Facebook is at it again. This time with Candidate Info where politicians can pitch on camera


OpenAI launches Spinning Up, a learning resource for potential deep learning practitioners

Prasad Ramesh
09 Nov 2018
3 min read
OpenAI released Spinning Up yesterday, an educational resource for anyone who wants to become a skilled deep reinforcement learning practitioner. Spinning Up includes many reinforcement learning examples, documentation, and tutorials.

The inspiration to build Spinning Up comes from the OpenAI Scholars and Fellows initiatives, where OpenAI observed that people with little-to-no machine learning experience can rapidly become practitioners with the right guidance and resources. Spinning Up in Deep RL is also integrated into the curriculum for OpenAI's 2019 cohorts of Scholars and Fellows.

A quick overview of Spinning Up course content:
- A short introduction to reinforcement learning: what it is, the terminology used, the different types of algorithms, and the basic theory needed to develop an understanding.
- An essay that lays out what is required to grow into a reinforcement learning research role, covering background, learning by practice, and developing a project.
- A list of important research papers organized by topic.
- A well-documented code repository of short, standalone implementations of various algorithms, including Vanilla Policy Gradient (VPG), Trust Region Policy Optimization (TRPO), Proximal Policy Optimization (PPO), Deep Deterministic Policy Gradient (DDPG), Twin Delayed DDPG (TD3), and Soft Actor-Critic (SAC).
- Finally, a few exercises to solve so you can start applying what you've learned.

Support plan for Spinning Up:
- Fast-paced support period: for the first three weeks after release, OpenAI will work quickly on bug fixes, installation issues, and resolving errors in the docs. They will streamline the user experience so that it is as easy as possible to self-study with Spinning Up.
- A major review in April 2019: around April next year, OpenAI will perform a serious review of the state of the package based on feedback received from the community. After that, any plans for future modification will be announced.
- Public release of internal development: as changes are made to Spinning Up in Deep RL with OpenAI Scholars and Fellows, they will also be pushed to the public repository so that they are immediately available to everyone.

In Spinning Up, running deep reinforcement learning algorithms is as easy as:

python -m spinup.run ppo --env CartPole-v1 --exp_name hello_world

For more details on Spinning Up, visit the OpenAI Blog.

Related:
This AI generated animation can dress like humans using deep reinforcement learning
Curious Minded Machine: Honda teams up with MIT and other universities to create an AI that wants to learn
MIT plans to invest $1 billion in a new College of Computing that will serve as an interdisciplinary hub for computer science, AI, data science
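Much of the introductory material in Spinning Up centers on basic RL quantities. For instance, the discounted reward-to-go used by policy-gradient algorithms like VPG and PPO can be computed with a simple backward pass; the sketch below is illustrative, not code from the Spinning Up repository:

```python
def discounted_returns(rewards, gamma=0.99):
    """Compute the reward-to-go G_t = r_t + gamma * G_{t+1} for each timestep."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step can reuse the return of the step after it.
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns

# With gamma=1.0 the return at each step is just the sum of future rewards.
print(discounted_returns([1.0, 1.0, 1.0], gamma=1.0))  # [3.0, 2.0, 1.0]
```

Policy-gradient methods weight the log-probability of each action by this quantity, so rewards earned earlier in an episode accumulate larger returns.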


Do Google Ads secretly track Stack Overflow users?

Vincy Davis
27 Jun 2019
5 min read
Update: A day after a user found the bug on Stack Overflow, Nick Craver, the Architecture Lead for Stack Overflow, updated users on the investigation. He says the fingerprinting issue stems from ads relayed through third-party providers. Stack Overflow has been reaching out to experts and the Google Chrome security team, and has also filed a bug in the Chrome tracker. Stack Overflow has contacted Google, its ad server, for assistance and is testing deployment of SafeFrame for all ads; the SafeFrame API can force all ads on the page to be rendered inside a SafeFrame container. Stack Overflow is also trying to deploy the Feature-Policy header to block access to most browser features from all components on the page. Craver also specified in the update that Stack Overflow has decided not to turn off these ad campaigns immediately, as it needs a reproduction to fix the issues.

A user by the name greggman discovered the bug while working in his browser's devtools, where he noticed the following message:

Image source: Stack Overflow Meta website

greggman raised the query "Why is Stack Overflow trying to start audio?" on the Stack Overflow Meta website, which is intended for bugs, features, and discussion of Stack Overflow. He found that the message appears whenever a particular ad, served for Microsoft via Google, is shown on the website.

Image source: Stack Overflow Meta website

Later, another user, TylerH, investigated and revealed some intriguing information about the bug. He found that the Google ad is employing the audio API to collect information from the user's browser in an attempt to fingerprint it. He says, "This isn't general speculation, I've spent the last half hour going through the source code linked above, and it goes to considerable lengths to de-anonymize viewers. Your browser may be blocking this particular API, but it's not blocking most of the data."

TylerH claims that this fingerprint tracking is definitely not done for legitimate feature detection. He adds that the technique is used in aggregate to generate a user fingerprint, which is included along with the advertising ID when recording analytics for the publisher. It detects the following:
- the user's system resolution and accessibility settings
- the audio API capabilities supported by the user's browser
- the mobile browser-specific APIs supported by the user's browser

TylerH states that the bug can detect many other details about users without their consent, and issues a warning to all Stack Overflow users: "Use an Ad blocker!"

As both of these findings gained momentum on the Stack Overflow Meta website, Nick Craver replied to greggman and TylerH, "Thanks for letting us know about this. We are aware of it. We are not okay with it." Craver also mentioned that Stack Overflow has reached out to Google to obtain its support, and notified users that "This is not related to ads being tested on the network and is a distinctly separate issue. Programmatic ads are not being tested on Stack Overflow at all."

Users are annoyed at this response. Many are not ready to believe that the Architecture Lead for Stack Overflow had no idea about this and is only now going to work on it. A user on Hacker News comments that Craver's response "encapsulates the entire problem with the current state of digital advertising in 1 simple sentence." A few users feel this is not surprising at all, as many websites use ads as tracking mechanisms. One HN user says, "Audio feature detection isn't even a novel technique. I've seen trackers look at download stream patterns to detect whether or not BBR congestion control is used, I have seen mouse latency based on the difference between mouse ups and downs in double clicks and I have seen speed-of-interaction checks in mouse movements."

Another comment reads, "I think ad blocking is a misnomer. What people are trying to do when blocking ads is prevent marketing people from spying on them. And the performance and resource consumption that comes from that. Personal opinion: Laws are needed to make what advertisers are doing illegal. Advertisers are spying on people to the extent where if the government did it they'd need a warrant."

Another user thinks the situation is not that bad, with Stack Overflow at least taking responsibility for the bug, writing on Hacker News, "Let's be adults here. This is SO, and I imagine you've used and enjoyed the use of their services just like the rest of us. Support them by letting passive ads sit on the edge of the page, and appreciate that they are actually trying to solve this issue."

Related:
Approx. 250 public network users affected during Stack Overflow's security attack
Stack Overflow confirms production systems hacked
Facebook again, caught tracking Stack Overflow user activity and data
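To see why collecting many small attributes matters, here is a minimal sketch of how a tracker can combine probed values into a single stable identifier. The attribute names and values below are hypothetical illustrations, not taken from the ad's actual script:

```python
import hashlib
import json

def fingerprint(attributes):
    """Hash a dictionary of probed browser attributes into one stable ID.

    No single attribute identifies a user, but the combination of many
    (resolution, audio capabilities, supported APIs, ...) is often unique.
    """
    # Canonical serialization so the same attributes always hash identically.
    canonical = json.dumps(attributes, sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

# Hypothetical probe results from one browser session.
probe = {
    "screen": "1920x1080",
    "audio_sample_rate": 44100,
    "touch_support": False,
}
print(fingerprint(probe))  # same attributes -> same ID, on any site
```

Because the ID is derived purely from device characteristics, it survives cookie clearing, which is what makes this technique attractive to advertisers and objectionable to users.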


OpenAI: Two new versions and the output dataset of GPT-2 out!

Vincy Davis
07 May 2019
3 min read
Today, OpenAI released two new versions of GPT-2, its AI model capable of generating coherent paragraphs of text without needing any task-specific training. The release includes a medium 345M-parameter version and the small 117M version of GPT-2. OpenAI has also shared the 762M and 1.5B versions with partners in the AI and security communities who are working to improve societal preparedness for large language models.

The original GPT was released in 2018. In February 2019, OpenAI announced GPT-2 with many samples and a discussion of policy implications.

Read More: OpenAI's new versatile AI model, GPT-2 can efficiently write convincing fake news from just a few words

The team at OpenAI has decided on a staged release of GPT-2, gradually releasing the family of models over time. The reasoning behind the staged release is to give people time to assess the properties of these models, discuss their societal implications, and evaluate the impacts of release after each stage.

The 345M-parameter version of GPT-2 has improved performance relative to the 117M version, though it does not make generating coherent text much easier, so the 345M version would be comparatively difficult to misuse. Many factors were considered in releasing this staged 345M version: ease of use for generating coherent text, the role of humans in the text generation process, the likelihood and timing of future replication and publication by others, evidence of use in the wild, and expert-informed inferences about unobservable uses. The team hopes that ongoing research on bias, detection, and misuse will give them the confidence to publish larger models, and in six months they will share a fuller analysis of language models' societal implications and their heuristics for release decisions.

The team at OpenAI is looking for partnerships with academic institutions, non-profits, and industry labs focused on increasing societal preparedness for large language models. They are also open to collaborating with researchers working on language model output detection, bias, and publication norms, and with organizations potentially affected by large language models.

The output dataset contains GPT-2 outputs from all four model sizes, with and without top-k truncation, as well as a subset of the WebText corpus used to train GPT-2. The dataset features approximately 250,000 samples per model/hyperparameter pair, which should be sufficient to help a wider range of researchers perform quantitative and qualitative analysis.

To know more about the release, head over to the official release announcement.

Related:
OpenAI introduces MuseNet: A deep neural network for generating musical compositions
OpenAI researchers have developed Sparse Transformers, a neural network which can predict what comes
OpenAI Five bots destroyed human Dota 2 players this weekend
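The "top-k truncation" mentioned for the output dataset refers to restricting sampling to the k highest-scoring tokens at each generation step, which trades diversity for coherence. A minimal illustrative sketch, not OpenAI's implementation:

```python
import math
import random

def top_k_sample(logits, k, rng=random):
    """Sample a token index from only the k highest-scoring logits."""
    # Keep the indices of the k largest logits; discard the rest entirely.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Softmax weights over the surviving candidates.
    weights = [math.exp(logits[i]) for i in top]
    total = sum(weights)
    # Weighted draw among the survivors.
    r = rng.random() * total
    for i, w in zip(top, weights):
        r -= w
        if r <= 0:
            return i
    return top[-1]

logits = [0.1, 5.0, -2.0, 3.0]  # toy vocabulary of 4 tokens
print(top_k_sample(logits, k=1))  # always 1: the single highest-logit token
```

With k=1 sampling becomes greedy decoding; larger k admits lower-probability tokens and more varied text, which is why the dataset ships samples generated both ways.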


Introducing Vector, a high-performance data router, written in Rust

Amrata Joshi
03 Jul 2019
3 min read
Yesterday, the team at Timber.io, a cloud-based logging platform, released Vector, a high-performance observability data router that makes collecting, transforming, and sending logs, metrics, and events easy. One of the motivations behind building Vector was to integrate into a bigger project the functionality of mtail, a tool for extracting metrics from application logs. Licensed under the Apache License, Version 2.0, Vector decouples data collection and routing from user services, giving users control and data ownership. Vector, which is written in Rust, compiles to a single static binary and has been designed to be deployed across the entire infrastructure.

Concepts of Vector

The following diagram depicts the basic concepts Vector comprises:

Image source: Vector

Sources: When Vector ingests data, it normalizes that data into a record, which sets the stage for easy and consistent processing. Examples of sources include syslog, tcp, file, and stdin.

Transforms: A transform modifies an event or the stream as a whole, acting as a filter, parser, sampler, or aggregator.

Sinks: A sink is a destination for events; its design and transmission method are controlled by the downstream service it interacts with. For instance, the TCP sink streams individual records, while the S3 sink buffers and flushes data.

Features of Vector

Memory efficient and fast: Vector is fast and memory-efficient, with no runtime or garbage collector.

Test cases: Vector includes performance and correctness tests; the performance tests measure and capture detailed performance data, while the correctness tests verify behavior. The team behind Vector has also invested in a robust test harness that provides a data-driven testing environment. Here are the test results:

Image source: GitHub

Processing data: Vector collects data from various sources in various shapes and sets the stage for easy and consistent processing of that data.

Serves as a single tool: Vector works both as a lightweight agent and as a service, acting as a single tool for users.

Guarantee support matrix: Vector features a guarantee support matrix that helps users understand their tradeoffs.

Easy deployment: Vector cross-compiles to a single static binary without any runtime.

Users seem happy about the news and consider Vector useful. A user commented on Hacker News, "I'm learning Rust and eventually plan to build such a solution but I think a lot of this project can be repurposed for what I asked much faster than building a new one. Cheers on this open source project. I will contribute whatever I can. Thanks!!"

More metrics-focused sources and sinks are expected in Vector in the future. A member of the Vector project commented, "It's still slightly rough around the edges, but Vector can actually ingest metrics today in addition to deriving metrics from log events. We have a source component that speaks the statsd protocol which can then feed into our prometheus sink. We're planning to add more metrics-focused sources and sinks in the future (e.g. graphite, datadog, etc), so check back soon!"

To know more about this news, check out Vector's page.

Related:
Implementing routing with React Router and GraphQL [Tutorial]
TP-Link kept thousands of vulnerable routers at risk of remote hijack, failed to alert customers
Amazon buys 'Eero' mesh router startup, adding fuel to its in-house Alexa smart home ecosystem ambitions
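The source, transform, and sink concepts described above form a simple data-flow pipeline. The sketch below mimics that flow in illustrative Python (Vector itself is Rust; the log format here is invented for the example):

```python
def source():
    """Yield raw log lines, as a file or stdin source would."""
    yield from ["GET /index 200", "GET /admin 500", "POST /login 200"]

def transform(events):
    """Parse each line into a record and filter to server errors only."""
    for line in events:
        method, path, status = line.split()
        if int(status) >= 500:
            yield {"method": method, "path": path, "status": int(status)}

def sink(events):
    """Deliver events downstream; here, just collect them into a list."""
    return list(events)

# Wire the pipeline: source -> transform -> sink.
print(sink(transform(source())))
# [{'method': 'GET', 'path': '/admin', 'status': 500}]
```

The generator-based wiring mirrors Vector's streaming model: each stage pulls records from the one before it, so nothing is buffered unless a sink chooses to buffer.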


Canva faced security breach, 139 million users data hacked: ZDNet reports

Fatema Patrawala
28 May 2019
3 min read
Last Friday, ZDNet reported a data breach at Canva, a popular Sydney-based startup offering a graphic design service. According to the hacker, who contacted ZDNet directly, data of roughly 139 million users was compromised in the breach.

Responsible for the breach is a hacker known online as GnosticPlayers. Since February this year, they have put up for sale the data of 932 million users, reportedly stolen from 44 companies around the world. "I download everything up to May 17," the hacker told ZDNet. "They detected my breach and closed their database server."

Source: ZDNet website

In a statement on the Canva website, the company confirmed the attack and said it has notified the relevant authorities. Canva also tweeted about the data breach on 24th May, as soon as it discovered the hack, and recommended that users change their passwords immediately.

https://twitter.com/canva/status/1132086889408749573

"At Canva, we are committed to protecting the data and privacy of all our users and believe in open, transparent communication that puts our communities' needs first," the statement said. "On May 24, we became aware of a security incident. As soon as we were notified, we immediately took steps to identify and remedy the cause, and have reported the situation to authorities (including the FBI). We're aware that a number of our community's usernames and email addresses have been accessed."

Stolen data included customer usernames, real names, email addresses, and city and country information. For 61 million users, password hashes were also present in the database. The passwords were hashed with the bcrypt algorithm, currently considered one of the most secure password-hashing algorithms. For other users, the stolen information included Google tokens, which those users had used to sign up for the site without setting a password. Of the 139 million affected users, 78 million had a Gmail address associated with their Canva account.

Canva is one of Australia's biggest tech companies. Founded in 2012, the site has shot up the Alexa website traffic rank since launch and has been ranking among the top 200 most popular websites. Three days ago, the company announced it had raised $70 million in a Series D funding round and is now valued at a whopping $2.5 billion. Canva also recently acquired two of the world's biggest free stock content sites, Pexels and Pixabay; details of Pexels and Pixabay users were not included in the stolen data.

According to reports from Business Insider, the community was dissatisfied with how Canva responded to the attack. IT consultant Dave Hall criticized the wording Canva used in a communication sent to users on Saturday, and believes Canva did not respond fast enough.

https://twitter.com/skwashd/status/1132258055767281664

One Hacker News user commented, "It seems as though these breaches have limited effect on user behaviour. Perhaps I'm just being cynical but if you aren't getting access and you are just getting hashed passwords, do people even care? Does it even matter? Of course names and contact details are not great. I get that. But will this even effect Canva?" Another user says, "How is a design website having 189M users? This is astonishing more than the hack!"

Related:
Facebook again, caught tracking Stack Overflow user activity and data
Ireland's Data Protection Commission initiates an inquiry into Google's online Ad Exchange services
Adobe warns users of "infringement claims" if they continue using older versions of its Creative Cloud products
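The reason the bcrypt detail matters: a deliberately slow, per-user-salted hash makes leaked digests expensive to crack. bcrypt itself requires a third-party package, but the same idea can be sketched with the standard library's PBKDF2, an analogous slow, salted key-derivation function:

```python
import hashlib
import hmac
import os

def hash_password(password, salt=None, iterations=200_000):
    """Derive a slow, salted hash; salt and iteration count are stored alongside."""
    salt = salt or os.urandom(16)  # unique per user, defeats rainbow tables
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest

def verify(password, salt, iterations, digest):
    """Re-derive and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, digest)

salt, n, digest = hash_password("hunter2")
print(verify("hunter2", salt, n, digest))  # True
print(verify("guess", salt, n, digest))    # False
```

Each guess against a stolen digest costs the attacker the full iteration count, which is why 61 million bcrypt hashes leaking is far less severe than 61 million plaintext or unsalted-MD5 passwords.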

PostgreSQL wins ‘DBMS of the year’ 2018 beating MongoDB and Redis in DB-Engines Ranking

Amrata Joshi
09 Jan 2019
4 min read
Last week, DB-Engines announced PostgreSQL as its Database Management System (DBMS) of the year 2018, as it gained more popularity in the DB-Engines Ranking over the year than any of the other 343 monitored systems.

Jonathan S. Katz, PostgreSQL contributor, said, "The PostgreSQL community cannot succeed without the support of our users and our contributors who work tirelessly to build a better database system. We're thrilled by the recognition and will continue to build a database that is both a pleasure to work with and remains free and open source."

PostgreSQL, which turns 30 this year, has won the DBMS title for the second time in a row. It has established itself as the preferred data store amongst developers and has been appreciated for its stability and feature set. Various systems in the DBMS market use PostgreSQL as their base technology, which itself shows how well-established PostgreSQL is.

Simon Riggs, major PostgreSQL contributor, said, "For the second year in a row, the PostgreSQL team thanks our users for making PostgreSQL the DBMS of the Year, as identified by DB-Engines. PostgreSQL's advanced features cater to a broad range of use cases all within the same DBMS. Rather than going for edge case solutions, developers are increasingly realizing the true potential of PostgreSQL and are relying on the absolute reliability of our hyperconverged database to simplify their production deployments."

How the DB-Engines Ranking scores are calculated

To determine the DBMS of the year, the team at DB-Engines subtracted the popularity scores of January 2018 from the latest scores of January 2019. The team used the difference of these numbers rather than a percentage, because a percentage would favor systems with tiny popularity at the beginning of the year.

The popularity of a system is calculated from several parameters. One is the number of mentions of the system on websites, measured as the number of results in search engine queries; the team uses Google, Bing, and Yandex for this measurement and, to count only relevant results, searches for the system name together with the term "database", e.g. "Oracle" and "database". The next measure is general interest in the system, for which the team uses the frequency of searches in Google Trends. The number of related questions and interested users on well-known IT Q&A sites such as Stack Overflow and DBA Stack Exchange is also counted. The ranking additionally factors in the number of job offers on the leading job search engines Indeed and Simply Hired, the number of profiles in professional networks such as LinkedIn and Upwork in which the system is mentioned, and the number of tweets mentioning the system. The result is a list of DBMSs sorted by how much they managed to increase their popularity in 2018.

1st runner-up: MongoDB

For 2018, MongoDB is the first runner-up; it previously won DBMS of the year in 2013 and 2014, and its growth in popularity has accelerated ever since. It is the most popular NoSQL system, and it keeps adding functionality that was previously outside the NoSQL scope. Last year, MongoDB added ACID support, which convinced many developers to rely on it for critical data. With improved support for analytics workloads, MongoDB is a great choice for a larger range of applications.

2nd runner-up: Redis

Redis, the most popular key-value store, took third place for DBMS of the year 2018; it also placed in the top three in 2014. It is best known as a high-performance, feature-rich key-value store. Redis provides a loadable modules system, which means third parties can extend its functionality. These modules offer a graph database, full-text search, time-series features, JSON data type support, and much more.
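The "DBMS of the year" computation described above reduces to a delta-and-sort over the aggregated popularity scores. A sketch with made-up score values (the real January figures are published on the DB-Engines site):

```python
# Hypothetical popularity scores; the methodology, not the numbers, is the point.
scores_jan_2018 = {"PostgreSQL": 380.0, "MongoDB": 330.0, "Redis": 130.0}
scores_jan_2019 = {"PostgreSQL": 466.0, "MongoDB": 387.0, "Redis": 149.0}

# Absolute gain over the year, not percentage, so tiny systems can't win
# on relative growth alone.
gains = {
    name: round(scores_jan_2019[name] - scores_jan_2018[name], 2)
    for name in scores_jan_2018
}

# The system with the largest gain is DBMS of the year.
ranking = sorted(gains.items(), key=lambda kv: kv[1], reverse=True)
print(ranking)  # [('PostgreSQL', 86.0), ('MongoDB', 57.0), ('Redis', 19.0)]
```

The same sort directly yields the runner-up order reported below the winner.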
Related:
PipelineDB 1.0.0, the high performance time-series aggregation for PostgreSQL, released!
Devart releases standard edition of dbForge Studio for PostgreSQL
MongoDB switches to Server Side Public License (SSPL) to prevent cloud providers from exploiting its open source code


Researchers input rabbit-duck illusion to Google Cloud Vision API and conclude it shows orientation-bias

Bhagyashree R
11 Mar 2019
3 min read
Last week, when Janelle Shane, a research scientist in optics, fed the famous rabbit-duck illusion to the Google Cloud Vision API, it returned "rabbit" as the result. However, when the image was rotated to a different angle, the Google Cloud Vision API predicted "duck".

https://twitter.com/JanelleCShane/status/1103420287519866880

Inspired by this, Max Woolf, a data scientist at BuzzFeed, tested further and concluded that the result really does vary based on the orientation of the image:

https://twitter.com/minimaxir/status/1103676561809539072

Google Cloud Vision provides pretrained API models that allow you to derive insights from input images. The API classifies images into thousands of categories, detects individual objects and faces within images, and reads printed words within images. You can also train custom vision models with AutoML Vision Beta.

Woolf used Python to rotate the image and get predictions from the API for each rotation. He built the animations with R, ggplot2, and gganimate, and rendered them with ffmpeg.

In deep learning, models are often trained with rotated copies of the input images to help them generalize better. Seeing the results of the experiment, Woolf concluded, "I suppose the dataset for the Vision API didn't do that as much / there may be an orientation bias of ducks/rabbits in the training datasets."

The reaction to this experiment was fairly split. While many Reddit users felt there might be an orientation bias in the model, others felt that, because the image is ambiguous, there is no "right answer" and hence no problem with the model. One Redditor said, "I think this shows how poorly many neural networks are at handling ambiguity." Another commented, "This has nothing to do with a shortcoming of deep learning, failure to generalize, or something not being in the training set. It's an optical illusion drawing meant to be visually ambiguous. Big surprise, it's visually ambiguous to computer vision as well. There's no 'correct' answer, it's both a duck and a rabbit, that's how it was drawn. The fact that the Cloud Vision API can see both is actually a strength, not a shortcoming."

Woolf has open-sourced the code used to generate this visualization on his GitHub page, which also includes a CSV of the prediction results at every rotation. If you are curious, you can test the Cloud Vision API with the drag-and-drop UI provided by Google.

Related:
Google Cloud security launches three new services for better threat detection and protection in enterprises
Generating automated image captions using NLP and computer vision [Tutorial]
Google Cloud Firestore, the serverless, NoSQL document database, is now generally available
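The shape of Woolf's experiment, querying the model at many rotations and tallying the label per angle, can be sketched as follows. Here `predict_label` is a hypothetical stand-in for the image rotation plus the real Cloud Vision API call, wired to flip its answer with orientation just to make the tally interesting:

```python
def predict_label(angle):
    """Stand-in for rotating the image and querying the vision API.

    Hypothetical behavior: the model flips its answer with orientation.
    """
    return "rabbit" if (angle % 360) < 180 else "duck"

def label_by_rotation(step=30):
    """Query the model at each rotation angle and record the predicted label."""
    return {angle: predict_label(angle) for angle in range(0, 360, step)}

# Tally how often each label wins across all orientations.
counts = {}
for label in label_by_rotation().values():
    counts[label] = counts.get(label, 0) + 1
print(counts)  # {'rabbit': 6, 'duck': 6}
```

In Woolf's actual run the per-angle predictions came back from Cloud Vision and were dumped to a CSV before being animated with ggplot2 and gganimate.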


DARPA’s $2 Billion ‘AI Next’ campaign includes a Next-Generation Nonsurgical Neurotechnology (N3) program

Savia Lobo
11 Sep 2018
3 min read
Last Friday (7th September 2018), DARPA announced a multi-year investment of more than $2 billion in a new program called the 'AI Next' campaign. DARPA's agency director, Dr. Steven Walker, officially unveiled the large-scale effort during D60, DARPA's 60th Anniversary Symposium held in Maryland. The campaign seeks contextual reasoning in AI systems in order to create deeper trust and collaborative partnerships between humans and machines.

The key areas the AI Next campaign may include are:
- Automating critical DoD (Department of Defense) business processes, such as security clearance vetting in a week or accrediting software systems in one day for operational deployment.
- Improving the robustness and reliability of AI systems, and enhancing the security and resiliency of machine learning and AI technologies.
- Reducing power, data, and performance inefficiencies.
- Pioneering the next generation of AI algorithms and applications, such as 'explainability' and commonsense reasoning.

The Next-Generation Nonsurgical Neurotechnology (N3) program

At the conference, DARPA officials also described the next frontier of neuroscience research: technologies for able-bodied soldiers that give them super abilities. Following this, they introduced the Next-Generation Nonsurgical Neurotechnology (N3) program, first announced in March. The program aims to fund research on technology that can transmit high-fidelity signals between the brain and some external machine without requiring that the user be cut open for rewiring or implantation. Al Emondi, manager of N3, told IEEE Spectrum that he is currently picking the researchers who will be funded under the program, with an announcement expected in early 2019.

The program has two tracks:
- Completely non-invasive: The N3 program aims for new non-invasive tech that can match the high performance currently achieved only with implanted electrodes, which are nestled in the brain tissue and therefore have a direct interface with neurons, either recording the electrical signals when the neurons "fire" into action or stimulating them to cause that firing.
- Minutely invasive: DARPA says it doesn't want its new brain tech to require even a tiny incision. Instead, minutely invasive tech might come into the body in the form of an injection, a pill, or even a nasal spray. Emondi imagines "nanotransducers" that can sit inside neurons, converting the electrical signal when a neuron fires into some other type of signal that can be picked up through the skull.

Justin Sanchez, director of DARPA's Biological Technologies Office, said that making brain tech easy to use will open the floodgates. He added, "We can imagine a future of how this tech will be used. But this will let millions of people imagine their own futures."

To know more about the AI Next campaign and the N3 program in detail, visit the DARPA blog.

Related:
Skepticism welcomes Germany's DARPA-like cybersecurity agency – The federal agency tasked with creating cutting-edge defense technology
DARPA on the hunt to catch deepfakes with its AI forensic tools underway


Alibaba launches an AI chip company named ‘Ping-Tou-Ge’ to boost China’s semiconductor industry

Savia Lobo
24 Sep 2018
3 min read
Alibaba Group has entered the semiconductor industry by launching a new subsidiary, 'Ping-Tou-Ge', that will develop computer chips specifically designed for artificial intelligence. The company made the announcement last week at its Computing Conference in Hangzhou.

Why the name 'Ping-Tou-Ge'?

"Ping-Tou-Ge" is a Mandarin nickname for the honey badger, an animal native to Africa, Southwest Asia, and the Indian subcontinent. Alibaba chief technology officer Jeff Zhang says, "Many people know that the honey badger is a legendary animal: it's not afraid of anything and has skillful hunting techniques and great intelligence." He further added, "Alibaba's semiconductor company is new; we're just starting out. And so we hope to learn from the spirit [of the honey badger]. A chip is small [like the honey badger], and we hope that such a small thing will produce great power."

Ping-Tou-Ge is one of Alibaba's efforts to improve China's semiconductor industry

The main trigger for creating Ping-Tou-Ge was the US ban on Chinese telecom giant ZTE, which brought home how heavily China's semiconductor industry depends on imported chipsets. Alibaba has been steadily increasing its footprint in the chip industry: its DAMO Academy, established in 2017, focuses on areas such as machine intelligence and data computing, and in April Alibaba acquired the Chinese chipmaker Hangzhou C-SKY Microsystems, a designer of a domestically developed embedded chipset, to enhance its own chip production capacity.

Zhang Jianfeng, head of Alibaba's DAMO Academy, said in a statement that the Hangzhou-based company will produce its first neural network chip in the second half of next year, built on an internally developed technology platform and a synergized ecosystem. Ping-Tou-Ge will combine DAMO's chip business with C-SKY Microsystems.

It will operate independently to develop its embedded chip series CK902 and its neural network chip Ali-NPU. The Ali-NPU chip is designed for AI inference in fields such as image processing and machine learning. Some of its expected features:

- Around 40 times more cost-effective than conventional chips
- 10 times better performance than mainstream CPU- and GPU-architecture AI chips in the current market
- Power and manufacturing costs cut to half

Ping-Tou-Ge will also focus on customized AI chips and embedded processors to support Alibaba's growing cloud and Internet of Things (IoT) business. These chips could be used in industries such as vehicles, home appliances, and manufacturing.

To know more about Ping-Tou-Ge in detail, visit the MIT Technology Review blog.

OpenSky is now a part of the Alibaba family

Alibaba Cloud partners with SAP to provide a versatile, one-stop cloud computing environment

Why Alibaba cloud could be the dark horse in the public cloud race

Google AdaNet, a TensorFlow-based AutoML framework

Sugandha Lahoti
31 Oct 2018
3 min read
Google researchers have come up with a new AutoML framework that can automatically learn high-quality models with minimal expert intervention. Google AdaNet is a fast, flexible, and lightweight TensorFlow-based framework for learning a neural network architecture and learning to ensemble subnetworks to obtain even better models.

How does Google AdaNet work?

AdaNet automatically searches over neural architectures and learns to combine the best ones into a high-quality model. It implements an adaptive algorithm for learning a neural architecture as an ensemble of subnetworks. It can add subnetworks of different depths and widths to create a diverse ensemble, trading off performance improvement against the number of parameters. This saves ML engineers the time spent selecting optimal neural network architectures.

Source: Google

AdaNet: built on TensorFlow

AdaNet implements the TensorFlow Estimator interface, which simplifies machine learning programming by encapsulating training, evaluation, prediction, and export for serving. AdaNet also integrates with open-source tools like TensorFlow Hub modules, TensorFlow Model Analysis, and Google Cloud's Hyperparameter Tuner. TensorBoard integration helps monitor subnetwork training, ensemble composition, and performance; TensorBoard is one of the best TensorFlow features for visualizing model metrics during training. When AdaNet is done training, it exports a SavedModel that can be deployed with TensorFlow Serving.

How to extend AdaNet to your own projects

Machine learning engineers and enthusiasts can define their own adanet.subnetwork.Builder using high-level TensorFlow APIs like tf.layers. Users who have already integrated a TensorFlow model into their system can use the adanet.Estimator to boost model performance while obtaining learning guarantees. Users are also invited to use their own custom loss functions via canned or custom tf.contrib.estimator.Heads in order to train regression, classification, and multi-task learning problems. Users can also fully define the search space of candidate subnetworks to explore by extending the adanet.subnetwork.Generator class.

Experiments: NASNet-A versus AdaNet

Google researchers took an open-source implementation of a NASNet-A CIFAR architecture and transformed it into a subnetwork. After eight AdaNet iterations, they were able to improve upon the CIFAR-10 results, and the model achieves this result with fewer parameters.

Performance of a NASNet-A model versus AdaNet learning to combine small NASNet-A subnetworks on CIFAR-10 (Source: Google)

You can check out the GitHub repo and walk through the tutorial notebooks for more details. You can also have a look at the research paper.

Top AutoML libraries for building your ML pipelines

Anatomy of an automated machine learning algorithm (AutoML)

AmoebaNets: Google's new evolutionary AutoML
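To make the ensemble-growing idea concrete, here is a toy, framework-free sketch of one AdaNet-style iteration: among candidate subnetworks, keep the one that most improves a complexity-penalized objective. The candidate pool, the loss, and the penalty weight are all made up for illustration; the real library trains TensorFlow subnetworks defined through `adanet.subnetwork.Builder`.

```python
# Toy sketch of one AdaNet-style round: add the candidate subnetwork that
# minimizes loss(ensemble + candidate) + lam * complexity(candidate).
# All quantities here are invented stand-ins, not the real framework.

def adanet_round(ensemble, candidates, loss_fn, complexity_fn, lam=0.1):
    """Add the best candidate, or leave the ensemble unchanged if none helps."""
    best, best_obj = None, loss_fn(ensemble)  # only add a candidate if it helps
    for cand in candidates:
        obj = loss_fn(ensemble + [cand]) + lam * complexity_fn(cand)
        if obj < best_obj:
            best, best_obj = cand, obj
    return ensemble + [best] if best is not None else ensemble

# A "subnetwork" here is just (name, depth): deeper candidates fit better
# in this toy loss but pay a larger complexity penalty.
def loss_fn(ensemble):
    total_depth = sum(depth for _, depth in ensemble)
    return 1.0 / (1.0 + total_depth)  # made-up loss shrinking with capacity

def complexity_fn(cand):
    return cand[1]  # penalize depth

ensemble = []
for _ in range(3):
    ensemble = adanet_round(ensemble, [("shallow", 1), ("deep", 3)],
                            loss_fn, complexity_fn)
```

After the first round the deep candidate wins; in later rounds neither candidate improves the penalized objective, so the ensemble stops growing, which mirrors the trade-off between performance and parameter count described above.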


SatPy 0.10.0, a Python library for manipulating meteorological remote sensing data, released

Amrata Joshi
26 Nov 2018
2 min read
SatPy is a Python library for reading and manipulating meteorological remote-sensing data and writing it to various image and data file formats. Last week, the team at Pytroll announced the release of SatPy 0.10.0. SatPy can build RGB composites directly from satellite instrument channel data or from higher-level processing output, and it makes data loading, manipulation, and analysis easy.

https://twitter.com/PyTrollOrg/status/1066865986953986050

Features of SatPy 0.10.0

- This version comes with two luminance-sharpening compositors, LuminanceSharpeningCompositor and SandwichCompositor. The LuminanceSharpeningCompositor replaces the luminance of an RGB, while the SandwichCompositor multiplies the RGB channels with the reflectance.
- A check_satpy function has been added for finding missing dependencies.
- Writers can now create output directories if they don't exist.
- Dependency loading now handles multiple matches better.
- The OLCI L2 reader supports the new OLCI Level-2 datasets; OLCI data is used for ocean and land processing.
- YAML is the new format for area definitions, so areas.def has been replaced with areas.yaml.
- File handlers now use filenames as strings, and readers accept pathlib.Path instances as filenames.
- In-line composites are easier to configure.
- A README document has been added to the setup.py description.

Resolved issues in SatPy 0.10.0

- The issue with resampling a user-defined scene has been resolved.
- The native resampler now works with DataArrays.
- It is now possible to review subclasses of BaseFileHandler.
- Read the Docs builds are now working.
- A custom string formatter has been added for lower/upper support.
- The inconsistent units of geostationary radiances have been resolved.

Major bug fixes

- Discrete data types are now preserved through resampling.
- Native resampling has been fixed.
- The SLSTR reader has been fixed for consistency.
- Masking in DayNightCompositor has been fixed.
- Attributes are now preserved when adding overlays or decorations.

To know more about this news, check out the official release notes.

Introducing ReX.js v1.0.0, a companion library for RegEx written in TypeScript

Spotify releases Chartify, a new data visualization library in Python for easier chart creation

Google releases Magenta Studio beta, an open source Python machine learning library for music artists
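As a rough illustration of what the SandwichCompositor does conceptually, the sketch below multiplies each RGB channel pixel-wise by a reflectance field. This is a plain-Python stand-in with made-up values, not SatPy's actual implementation, which operates on xarray-backed satellite datasets.

```python
# Conceptual stand-in for a sandwich composite: modulate each RGB channel
# by the reflectance field, pixel by pixel. Values are made-up normalized
# floats, not real satellite data.

def sandwich(rgb, reflectance):
    """Multiply every channel of `rgb` element-wise by `reflectance`."""
    return [[c * r for c, r in zip(channel, reflectance)] for channel in rgb]

rgb = [
    [0.2, 0.4],  # red channel, two pixels
    [0.6, 0.8],  # green channel
    [1.0, 0.5],  # blue channel
]
reflectance = [0.5, 1.0]  # higher-resolution reflectance field

out = sandwich(rgb, reflectance)  # each channel scaled by the reflectance
```

Where the reflectance is high, the original colours pass through; where it is low, the composite darkens, which is the sharpening effect the compositor is after.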


OpenAI's AI robot hand learns to solve a Rubik's Cube using reinforcement learning and Automatic Domain Randomization (ADR)

Savia Lobo
16 Oct 2019
5 min read
A team of OpenAI researchers shared their research on training neural networks to solve a Rubik's Cube with a human-like robot hand. The researchers trained the neural networks only in simulation, using the same reinforcement learning code as OpenAI Five paired with a new technique called Automatic Domain Randomization (ADR). In their research paper, the team demonstrates how a system trained only in simulation can handle situations it never saw during training.

"Solving a Rubik's Cube one-handed is a challenging task even for humans, and it takes children several years to gain the dexterity required to master it. Our robot still hasn't perfected its technique though, as it solves the Rubik's Cube 60% of the time (and only 20% of the time for a maximally difficult scramble)," the researchers mention on their official blog.

Alongside the reinforcement-learned control policy, Kociemba's algorithm was used for picking the solution steps.

Read Also: DeepCube: A new deep reinforcement learning approach solves the Rubik's cube with no human help

What is Automatic Domain Randomization (ADR)?

Domain randomization enables networks trained solely in simulation to transfer to a real robot. However, it was a challenge for the researchers to recreate real-world physics in the simulation environment. The team realized that factors like friction, elasticity, and dynamics are difficult to measure for complex objects like Rubik's Cubes or robotic hands, and that domain randomization alone was not enough.

To overcome this, the OpenAI researchers developed Automatic Domain Randomization (ADR), which endlessly generates progressively more difficult environments in simulation. In ADR, the neural network first learns to solve the cube in a single, non-randomized environment. As the neural network gets better at the task and reaches a performance threshold, the amount of domain randomization is increased automatically. This makes the task harder, since the neural network must now learn to generalize to more randomized environments. The network keeps learning until it again exceeds the performance threshold, when more randomization kicks in, and the process repeats.

"The hypothesis behind ADR is that a memory-augmented network combined with a sufficiently randomized environment leads to emergent meta-learning, where the network implements a learning algorithm that allows itself to rapidly adapt its behavior to the environment it is deployed in," the researchers state.

Source: OpenAI.com

OpenAI's robot hand and the Giiker Cube

The researchers used the Shadow Dexterous E Series Hand (E3M5R) as a humanoid robot hand and the PhaseSpace motion capture system to track the Cartesian coordinates of all five fingertips. They also used RGB Basler cameras for vision-based pose estimation.

Sensing the state of a Rubik's Cube from vision alone is a challenging task. The team therefore used a "smart" Rubik's Cube with built-in sensors and a Bluetooth module as a stepping stone. They also used a Giiker cube for some of the experiments, to test the control policy without compounding errors made by the vision model's face-angle predictions. The hardware is based on the Xiaomi Giiker cube, which is equipped with a Bluetooth module and can sense the state of the Rubik's Cube. However, it is limited to a face-angle resolution of 90°, which is not sufficient for state tracking on the robot setup. The team therefore replaced some components of the original Giiker cube with custom ones in order to achieve a tracking accuracy of approximately 5 degrees.

A few challenges faced

OpenAI's method currently solves the Rubik's Cube 20% of the time when applying a maximally difficult scramble that requires 26 face rotations. For simpler scrambles that require 15 rotations to undo, the success rate is 60%. The researchers consider an attempt to have failed when the Rubik's Cube is dropped or a timeout is reached. However, the network is capable of solving the Rubik's Cube from any initial condition, so if the cube is dropped, it is possible to put it back into the hand and continue solving.

The neural network is much more likely to fail during the first few face rotations and flips. The team says this happens because the neural network needs to balance solving the Rubik's Cube with adapting to the physical world during those early rotations and flips.

The team also applied a few perturbations while training the robot hand, including:

- Resetting the hidden state: During a trial, the hidden state of the policy was reset. This leaves the environment dynamics unchanged but requires the policy to re-learn them, since its memory has been wiped.
- Re-sampling environment dynamics: This corresponds to an abrupt change of environment dynamics, resampling the parameters of all randomizations while leaving the simulation state and hidden state intact.
- Breaking a random joint: This corresponds to disabling a randomly sampled joint of the robot hand by preventing it from moving. This is a more nuanced experiment, since the overall environment dynamics are the same but the way in which the robot can interact with the environment has changed.

https://twitter.com/OpenAI/status/1184145789754335232

Here's the complete video of the robot hand solving the Rubik's Cube single-handedly:

https://www.youtube.com/watch?time_continue=84&v=x4O8pojMF0w

To know more about this research in detail, you can read the research paper.

Open AI researchers advance multi-agent competition by training AI agents in a simple hide and seek environment

Introducing Open AI's Reptile: The latest scalable meta-learning algorithm on the block

Build your first Reinforcement learning agent in Keras [Tutorial]
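The expand-on-success curriculum that ADR implements can be sketched in a few lines of plain Python. Everything here, the threshold, the step size, and the stand-in evaluator, is invented for illustration; OpenAI's actual system widens per-parameter randomization ranges based on measured policy performance.

```python
# Minimal sketch of the Automatic Domain Randomization loop: start from a
# single non-randomized environment and widen the randomization whenever
# the policy's performance clears a threshold.

def adr_loop(evaluate, threshold=0.8, step=0.1, max_iters=50):
    """Widen randomization while the policy's score clears the threshold."""
    randomization = 0.0  # start from a single, non-randomized environment
    for _ in range(max_iters):
        score = evaluate(randomization)
        if score >= threshold:
            # Task solved at this difficulty: make environments harder,
            # forcing the policy to generalize further.
            randomization += step
        else:
            # Below threshold: keep training at the current difficulty.
            break
    return randomization

# Stand-in evaluator: pretend performance degrades linearly with difficulty.
final_randomization = adr_loop(lambda r: 1.0 - r)
```

With this toy evaluator, the loop widens the randomization until the pretend score drops below the threshold and then stops, which is the "more randomization kicks in once the network exceeds the performance threshold" behavior described above.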

Stanford researchers introduce DeepSolar, a deep learning framework that mapped every solar panel in the US

Bhagyashree R
20 Dec 2018
3 min read
Yesterday, researchers from Stanford University introduced DeepSolar, a deep learning framework that analyzes satellite images to identify the GPS location and size of solar panels. Using this framework, they have built a comprehensive database containing the GPS locations and sizes of solar installations in the US. The system identified 1.47 million individual solar installations across the United States, ranging from small rooftop configurations to solar farms and utility-scale systems. The DeepSolar database is publicly available to help researchers extract further insights into solar adoption. It will also help policymakers better understand the correlation between solar deployment and socioeconomic factors such as household income, population density, and education level.

How does DeepSolar work?

DeepSolar uses transfer learning to train a CNN classifier on 366,467 images sampled from over 50 cities and towns across the US, with merely image-level labels indicating the presence or absence of panels. One of the researchers, Rajagopal, explained the model to Gizmodo: "The algorithm breaks satellite images into tiles. Each tile is processed by a deep neural net to produce a classification for each pixel in a tile. These classifications are combined together to detect if a system—or part of—is present in the tile."

The deep neural net then identifies which tiles contain a solar panel. Once training is complete, the network produces an activation map, also known as a heat map, which outlines the panels and can be used to obtain the size of each solar panel system. Rajagopal further explained how this model achieves better accuracy: "A rooftop PV system typically corresponds to multiple pixels. Thus even if each pixel classification is not perfect, when combined you get a dramatically improved classification. We give higher weights to false negatives to prevent them."

What are some of the observations the researchers made?

To measure classification performance, the researchers used two metrics: precision, the rate of correct decisions among all positive decisions, and recall, the ratio of correct decisions among all positive samples. DeepSolar achieved a precision of 93.1% with a recall of 88.5% in residential areas, and a precision of 93.7% with a recall of 90.5% in non-residential areas. To measure size-estimation performance, they calculated the mean relative error (MRE), which was 3.0% for residential areas and 2.1% for non-residential areas.

Future work

Currently, the DeepSolar database covers only the contiguous US. The researchers plan to expand its coverage to all of North America, including remote areas with utility-scale solar and non-contiguous US states, and ultimately to other countries and regions of the world. Also, DeepSolar currently estimates only the horizontal projection areas of solar panels from satellite imagery. In the future, it could infer high-resolution roof orientation and tilt information from street-view images, giving a more accurate estimation of solar system size and solar power generation capacity.

To know more in detail, check out the research paper published by Ram Rajagopal et al: DeepSolar: A Machine Learning Framework to Efficiently Construct a Solar Deployment Database in the United States.

Introducing remove.bg, a deep learning based tool that automatically removes the background of any person based image within 5 seconds

NeurIPS 2018: How machine learning experts can work with policymakers to make good tech decisions [Invited Talk]

NVIDIA makes its new "brain for autonomous AI machines", Jetson AGX Xavier Module, available for purchase
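The metrics quoted in the article are straightforward to compute. The sketch below uses made-up confusion-matrix counts chosen so that the outputs match the reported residential figures; the counts themselves are illustrative, not data from the paper.

```python
# Precision, recall, and mean relative error as used to evaluate DeepSolar.
# The counts below are invented for illustration, not the paper's data.

def precision(tp, fp):
    """Correct positive decisions among all positive decisions."""
    return tp / (tp + fp)

def recall(tp, fn):
    """Correct positive decisions among all positive samples."""
    return tp / (tp + fn)

def mean_relative_error(estimated, actual):
    """Average of |estimate - truth| / truth over all size estimates."""
    return sum(abs(e - a) / a for e, a in zip(estimated, actual)) / len(actual)

p = precision(tp=931, fp=69)   # 0.931, matching the reported 93.1%
r = recall(tp=885, fn=115)     # 0.885, matching the reported 88.5%
mre = mean_relative_error([103.0, 97.0], [100.0, 100.0])  # 0.03, i.e. 3.0%
```

Note that precision and recall are computed from different denominators (all positive decisions versus all positive samples), which is why the two residential figures differ.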


TensorFlow 2.0 to be released soon with eager execution, removal of redundant APIs, tf.function and more

Amrata Joshi
15 Jan 2019
3 min read
Just two months ago, Google's TensorFlow, one of the most popular machine learning platforms, celebrated its third birthday. Last year in August, Martin Wicke, an engineer at Google, posted a list of what to expect in TensorFlow 2.0, the open source machine learning framework, on the Google group. The key features he listed include:

- The release will come with eager execution.
- It will feature more platforms and languages, along with improved compatibility.
- Deprecated APIs will be removed.
- Duplication will be reduced.

https://twitter.com/aureliengeron/status/1030091835098771457

An early preview of TensorFlow 2.0 is expected soon. TensorFlow 2.0 is expected to come with high-level APIs, robust model deployment, powerful experimentation for research, and a simplified API.

Easy model building with Keras

This release will ship with Keras, a user-friendly API standard for machine learning that will be used for building and training models. As Keras provides various model-building APIs, including sequential, functional, and subclassing, it becomes easier for users to choose the right level of abstraction for their project.

Eager execution and tf.function

TensorFlow 2.0 will also feature eager execution, which will be used for immediate iteration and debugging. tf.function will translate Python programs into TensorFlow graphs, retaining graph-level performance optimizations while letting programs be expressed in straightforward Python. Further, tf.data will be used for building scalable input pipelines.

Transfer learning with TensorFlow Hub

The TensorFlow team has made things much easier for those who do not want to build a model from scratch. Users will soon be able to use models from TensorFlow Hub, a library of reusable parts of machine learning models, to train a Keras or Estimator model.

API cleanup

Many APIs are removed in this release, among them tf.app, tf.flags, and tf.logging. The main tf.* namespace will be cleaned up by moving lesser-used functions into subpackages such as tf.math. A few APIs have been replaced with their 2.0 equivalents, like tf.keras.metrics, tf.summary, and tf.keras.optimizers. The v2 upgrade script can be used to apply these renames automatically.

Major improvements

- Queue runners will be removed in this release.
- Graph collections will also be removed.
- APIs will be renamed for better usability. For example, name_scope can be accessed using tf.name_scope or tf.keras.backend.name_scope.

To ease migration to TensorFlow 2.0, the TensorFlow team will provide a conversion tool that updates TensorFlow 1.x Python code to use TensorFlow 2.0 compatible APIs, flagging cases where code cannot be converted automatically. In this release, stored GraphDefs and SavedModels will be backward compatible. With this release, tf.contrib will no longer be part of the distribution; some existing contrib modules will be integrated into the core project or moved to a separate repository, and the rest will be removed.

To know more about this news, check out the post by the TensorFlow team on Medium.

Building your own Snapchat-like AR filter on Android using TensorFlow Lite [Tutorial]

Google expands its machine learning hardware portfolio with Cloud TPU Pods (alpha) to effectively train and deploy TensorFlow machine learning models on GCP

Google researchers introduce JAX: A TensorFlow-like framework for generating high-performance code from Python and NumPy machine learning programs
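As a toy illustration of the kind of mechanical rename the upgrade script automates, the sketch below maps a couple of deprecated 1.x symbols to their 2.0-style equivalents via plain string substitution. The two mapping entries are well-known TF 2.0 renames, but the string-rewrite approach itself is a simplification invented for illustration; the real converter parses code and flags what it cannot rewrite.

```python
# Hypothetical, minimal stand-in for the v2 upgrade script's simplest job:
# mechanically rewriting deprecated TF 1.x symbol names to 2.0 equivalents.

V2_RENAMES = {
    "tf.metrics": "tf.keras.metrics",
    "tf.train.AdamOptimizer": "tf.keras.optimizers.Adam",
}

def upgrade_line(line: str) -> str:
    """Apply longer names first so a short rename never clobbers a longer match."""
    for old in sorted(V2_RENAMES, key=len, reverse=True):
        line = line.replace(old, V2_RENAMES[old])
    return line

upgraded = upgrade_line("opt = tf.train.AdamOptimizer(learning_rate=0.001)")
```

Applying `upgrade_line` to a 1.x snippet rewrites the deprecated symbol in place while leaving the rest of the line untouched, which is the spirit of the automatic rename pass described above.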