Tech News

15 Nov 2017

5 min read

Dr. Brandon explains 'Transfer Learning' to Jon

[box type="shadow" align="" class="" width=""]Dr. Brandon: Hello and welcome to another episode of 'Date with Data Science'. Today we are going to talk about a topic that is all the rage these days in the data science community: Transfer Learning.

Jon: 'Transfer learning' sounds all sci-fi to me. Is it like the thing Prof. X does in X-Men, reading other people's minds with that dome-like headset in his chamber?

Dr. Brandon: If we are going to get the X-Men involved, what Prof. X does is closer to deep learning. We will talk about that another time. Transfer learning is simpler to explain. It's what you actually do every time you get into character, Jon. Say you are given the role of Jack Sparrow to play. You will probably read a lot about pirates, watch a lot of pirate movies, even watch Johnny Depp in character, and form your own version of Jack Sparrow. Now, after that acting assignment is over, say you are given the opportunity to audition for the role of Captain Hook, the famous pirate from Peter Pan. You won't do your research from scratch this time. You will retain the general mannerisms of a pirate that you learned from your previous role, and will only learn the nuances of Captain Hook, like acting one-handed.

Jon: That's pretty cool! So you say machines can also learn this way?

Dr. Brandon: Of course, that's what transfer learning is all about: learn something, abstract the learning sufficiently, then apply it to another related problem.

The following is an excerpt from a book by Kuntal Ganguly titled Learning Generative Adversarial Networks.[/box]

Pre-trained models are not optimized for user-specific datasets, but they are extremely useful when the task at hand is similar to the task the model was trained on. For example, a popular model, InceptionV3, is optimized for classifying images into a broad set of 1000 categories, while our domain might be classifying a handful of dog breeds.
A well-known deep learning technique that adapts an existing trained model to a similar task at hand is known as Transfer Learning. This is why Transfer Learning has gained a lot of popularity among deep learning practitioners and has, in recent years, become the go-to technique in many real-life use cases. It is all about transferring knowledge (or features) between related domains.

Purpose of Transfer Learning

Let's say you have trained a deep neural network to differentiate between fresh and rotten mangoes. During training, the network requires thousands of images of rotten and fresh mangoes and hours of training to learn, for example, that a rotten fruit oozes liquid and produces a bad odor. With this training experience, the network can now be used for a different task: differentiating between rotten and fresh apples, using the knowledge of rotten features learned while training on mango images.

The general approach of Transfer Learning is to train a base network and then copy its first n layers to the first n layers of a target network. The remaining layers of the target network are initialized randomly and trained toward the targeted use case.

The main scenarios for using Transfer Learning in your deep learning workflow are as follows:

- Smaller datasets: When you have a smaller dataset, building a deep learning model from scratch won't work well. Transfer Learning provides a way to apply a pre-trained model to new classes of data. For example, a model pre-trained on one million ImageNet images will converge to a decent solution after training on just a fraction of a smaller dataset (for example, CIFAR-10), compared with a deep learning model built from scratch on that smaller dataset.
- Fewer resources: Deep learning operations (such as convolution) require a significant amount of resources and time, and are best suited to high-end GPU-based machines. With pre-trained models, however, you can train across a full training set (let's say 50000 images) in less than a minute on a laptop/notebook without a GPU, since most of the time only the final layer of the model is modified, with a simple update of just a classifier or regressor.

Various approaches to using pre-trained models

- Using pre-trained architecture: Instead of transferring the weights of the trained model, we can reuse only the architecture and train it on our new dataset starting from our own randomly initialized weights.
- Feature extractor: A pre-trained model can be used as a feature extraction mechanism by simply removing the output layer of the network (the one that gives the probabilities of being in each of the n classes) and then freezing all the previous layers, so that the network acts as a fixed feature extractor for the new dataset.
- Partially freezing the network: Instead of replacing only the final layer and extracting features from all previous layers, sometimes we might train our new model partially (that is, keep the weights of the initial layers of the network frozen while retraining only the higher layers). The number of frozen layers can be treated as one more hyper-parameter.

Next, read about how transfer learning is being used in the real world. If you enjoyed the above excerpt, do check out the book it is from.
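The "copy the first n layers, then freeze them" idea described in the excerpt can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the toy `make_network` and `transfer_first_n_layers` helpers, the layer sizes, and the `trainable` flag are all hypothetical and are not taken from the book or from any deep learning library.

```python
import random

def make_network(layer_sizes, seed):
    """Build a toy network: one randomly initialized weight matrix per layer."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        weights = [[rng.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
        layers.append({"weights": weights, "trainable": True})
    return layers

def transfer_first_n_layers(base, target, n, freeze=True):
    """Copy the first n layers of `base` into `target` and optionally freeze them."""
    for i in range(n):
        src, dst = base[i]["weights"], target[i]["weights"]
        if (len(src), len(src[0])) != (len(dst), len(dst[0])):
            raise ValueError(f"layer {i} shapes differ; cannot copy weights")
        target[i]["weights"] = [row[:] for row in src]  # copy, do not share
        target[i]["trainable"] = not freeze
    return target

# Hypothetical sizes: a base model with 1000 output classes and a target
# model with 5 classes (say, five dog breeds).
base = make_network([8, 16, 16, 1000], seed=0)
target = make_network([8, 16, 16, 5], seed=1)

transfer_first_n_layers(base, target, n=2)
trainable = [layer["trainable"] for layer in target]
print(trainable)  # only the last, randomly initialized layer is left to train
```

Choosing `n=2` here corresponds to the "partially freezing" approach; setting `n` to all layers but the last gives the fixed feature extractor variant.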

Savia Lobo

15 Nov 2017

3 min read

Google joins the social coding movement with CoLaboratory

Google has made it easy for people to collaborate on documents, spreadsheets, and so on, with Google Drive. What next? If you are one of those data science nerds who love coding, this roll-out from Google will be an amazing experimental ground for you.

Google has released its coLaboratory project, a new tool and a boon for data science and analysis. It is designed to make collaborating on data as easy as collaborating on a Google document, with the added ability to run code and show its output within the document itself.

Collaboration is what sets coLaboratory apart. It improves collaboration among people with distinct skill sets: one may be great at coding, while another might know the front-end or GUI aspects of the project. Just as you store and share a Google document or spreadsheet, you can store and share code with coLaboratory notebooks in Google Drive. All you have to do is click on the 'Share' option at the top right of any coLaboratory notebook. You can also refer to the Google Drive file sharing instructions. This improves ad-hoc workflows by removing the need to mail documents back and forth.

CoLaboratory includes a Jupyter notebook environment that does not require any setup. With this, one does not need to download, install, or run anything on their computer. All they need is a browser, and they can use and share Jupyter notebooks.

At present, coLaboratory works with Python 2.7 on the desktop version of Chrome only. The reason is that coLaboratory with Python 2.7 has been an internal tool at Google for many years. Making it available on other browsers, with added support for other Jupyter kernels such as R or Scala, is on the cards.

CoLaboratory’s GitHub repository contains two tools for using it in the browser. The first is the coLaboratory Chrome App and the other is coLaboratory with Classic Jupyter Kernels. Both tools can be used for creating and storing notebooks within Google Drive, which allows collaborative editing of the notebooks. The only difference is that the Chrome App executes all the code within the browser using the PNaCl sandbox, whereas coLaboratory classic executes code using local Jupyter kernels (the IPython kernel) that have complete access to the host system and files.

The coLaboratory Chrome App helps in setting up a collaborative environment for data analysis. Such a setup can be a hurdle at times, as requirements vary among machines and operating systems, and installation errors can be cryptic. With a single click, however, coLaboratory, IPython, and a large set of popular scientific Python libraries can be installed. Also, because of the Portable Native Client (PNaCl), coLaboratory is secure and runs at local speeds. This allows new users to get started with IPython quickly.

Here’s what coLaboratory brings for code-lovers:

- No additional installation required; the browser does it all
- The ability to code within a document
- Storing and sharing notebooks on Google Drive
- Real-time collaboration; no fuss of mailing documents to and fro

You can find a detailed explanation of the tool on GitHub.

Pravin Dhandre

14 Nov 2017

5 min read

How Reinforcement Learning works

[box type="note" align="" class="" width=""]This article is an excerpt from a book by Rodolfo Bonnin titled Machine Learning for Developers.[/box]

Reinforcement learning is a field that has resurfaced recently, and it has become more popular in control, game solving, and situational problems where a number of steps have to be carried out to solve a problem. A formal definition of reinforcement learning is as follows: "Reinforcement learning is the problem faced by an agent that must learn behavior through trial-and-error interactions with a dynamic environment." (Kaelbling et al. 1996). In order to have a reference frame for the type of problem we want to solve, we will start by going back to a mathematical concept developed in the 1950s, called the Markov decision process.

Markov decision process

Before explaining reinforcement learning techniques, we will explain the type of problem we will attack with them. When talking about reinforcement learning, the problem we want to optimize is a Markov decision process. It consists of a mathematical model that aids decision making in situations where the outcomes are in part random and in part under the control of an agent. The main elements of this model are an Agent, an Environment, and a State, as shown in the following diagram:

[Figure: Simplified scheme of a reinforcement learning process]

The agent can perform certain actions (such as moving the paddle left or right). These actions can sometimes result in a reward r_t, which can be positive or negative (such as an increase or decrease in the score). Actions change the environment and can lead to a new state s_{t+1}, where the agent can perform another action a_{t+1}. The set of states, actions, and rewards, together with the rules for transitioning from one state to another, make up a Markov decision process.

Decision elements

To understand the problem, let's situate ourselves in the problem-solving environment and look at the main elements:

- The set of states
- The action to take, which is to go from one place to another
- The reward function, which is the value represented by the edge
- The policy, which is the way to complete the task
- A discount factor, which determines the importance of future rewards

The main difference from traditional forms of supervised and unsupervised learning is the time taken to calculate the reward, which in reinforcement learning is not instantaneous; it comes after a set of steps. The next state depends on the current state and the decision maker's action, and not on all the previous states (it doesn't have memory), so it complies with the Markov property. Since this is a Markov decision process, the probability of state s_{t+1} depends only on the current state s_t and action a_t.

[Figure: Unrolled reinforcement mechanism]

The goal of the whole process is to generate a policy P that maximizes rewards. The training samples are tuples, <s, a, r>.

Optimizing the Markov process

Reinforcement learning is an iterative interaction between an agent and the environment. The following occurs at each timestep:

- The process is in a state, and the decision maker may choose any action that is available in that state
- The process responds at the next timestep by randomly moving into a new state and giving the decision maker a corresponding reward
- The probability that the process moves into its new state is influenced by the chosen action, in the form of a state transition function

Basic RL techniques: Q-learning

One of the most well-known reinforcement learning techniques, and the one we will be implementing in our example, is Q-learning. Q-learning can be used to find an optimal action for any given state in a finite Markov decision process.
Q-learning tries to maximize the value of the Q-function, which represents the maximum discounted future reward when we perform action a in state s. Once we know the Q-function, the optimal action a in state s is the one with the highest Q-value. We can then define a policy π(s) that gives us the optimal action in any state, expressed as follows:

    π(s) = argmax_a Q(s, a)

We can define the Q-function for a transition point (s_t, a_t, r_t, s_{t+1}) in terms of the Q-function at the next point (s_{t+1}, a_{t+1}, r_{t+1}, s_{t+2}), similar to what we did with the total discounted future reward. This equation is known as the Bellman equation for Q-learning:

    Q(s_t, a_t) = r_t + γ max_a Q(s_{t+1}, a)

In practice, we can think of the Q-function as a lookup table (called a Q-table) where the states (denoted by s) are rows, the actions (denoted by a) are columns, and the elements (denoted by Q(s, a)) are the rewards that you get if you are in the state given by the row and take the action given by the column. The best action to take in any state is the one with the highest reward:

    initialize Q-table Q
    observe initial state s
    while (! game_finished):
        select and perform action a
        get reward r
        advance to state s'
        Q(s, a) = Q(s, a) + α(r + γ max_a' Q(s', a') - Q(s, a))
        s = s'

You will realize that the algorithm is basically doing stochastic gradient descent on the Bellman equation, backpropagating the reward through the state space (or episode) and averaging over many trials (or epochs). Here, α is the learning rate, which determines how much of the difference between the previous Q-value and the discounted new maximum Q-value should be incorporated, and γ is the discount factor.

We can represent this process with the following flowchart:

[Figure: flowchart of the Q-learning process]

We have successfully reviewed Q-learning, one of the most important and innovative architectures in reinforcement learning to have appeared in recent years. Every day, such reinforcement models are applied in innovative ways, whether to generate feasible new elements from a selection of previously known classes or even to win against professional players in strategy games.

If you enjoyed this excerpt from the book Machine Learning for Developers, check out the book below.
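The Q-learning update from the excerpt can be turned into a small runnable sketch. Everything about the environment here is an assumption made for illustration: a hypothetical five-state corridor where the agent starts at state 0 and receives a reward of 1 on reaching state 4. The epsilon-greedy action selection and the values of `alpha`, `gamma`, and `epsilon` are likewise illustrative choices, not the book's example.

```python
import random

N_STATES = 5       # states 0..4 on a line; state 4 is the goal (assumed toy setup)
ACTIONS = (0, 1)   # 0 = move left, 1 = move right

def step(state, action):
    """Hypothetical environment: returns (next_state, reward, done)."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done

def choose_action(q_row, epsilon, rng):
    """Epsilon-greedy selection with random tie-breaking."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    best = max(q_row)
    return rng.choice([a for a, q in enumerate(q_row) if q == best])

def q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]  # initialize Q-table
    for _ in range(episodes):
        s, done = 0, False                                 # observe initial state
        while not done:
            a = choose_action(q[s], epsilon, rng)          # select and perform action
            s_next, r, done = step(s, a)                   # get reward, advance
            best_next = 0.0 if done else max(q[s_next])
            # Q(s, a) = Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
            q[s][a] += alpha * (r + gamma * best_next - q[s][a])
            s = s_next
    return q

q_table = q_learning()
# Greedy policy read off the Q-table for every non-terminal state.
policy = [max(ACTIONS, key=lambda a: q_table[s][a]) for s in range(N_STATES - 1)]
print(policy)
```

In this toy corridor the learned greedy policy moves right in every state, which is the behavior the Q-table lookup described above should produce once the reward has propagated back from the goal.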

Sugandha Lahoti

14 Nov 2017

3 min read

Introducing Google's Tangent: A Python library with a difference

The Google Brain team, in a recent blog post, announced the arrival of Tangent, a free and open source Python library for ahead-of-time automatic differentiation.

Most machine learning algorithms require the calculation of derivatives and gradients. Doing this manually is time-consuming as well as error-prone. Automatic differentiation, or autodiff, is a set of techniques to accurately compute the derivatives of numeric functions expressed as computer programs. Autodiff techniques make it practical to train large-scale machine learning models with high performance and better usability.

Tangent uses source code transformation (SCT) in Python to perform automatic differentiation. It takes Python source code as input and produces new Python functions as output. The new Python function calculates the gradient of the input function. Because the derivative code is ordinary Python, it is as readable as the rest of the program. In contrast, TensorFlow and Theano, the two most popular machine learning frameworks, do not perform autodiff on the Python code. They instead use Python as a metaprogramming language to define a data flow graph on which SCT is performed. This can be confusing to the user, since it involves a separate programming paradigm.

Source: https://github.com/google/tangent/blob/master/docs/toolspace.png

Tangent has a one-function API:

    import tangent
    df = tangent.grad(f)

For printing out derivatives:

    import tangent
    df = tangent.grad(f, verbose=1)

Because it uses SCT, it generates a new Python function. This new function follows standard semantics and its source code can be inspected directly. This makes it easy for users to understand and debug, and it has no runtime overhead. Another highlight is that it is compatible with TensorFlow and NumPy. It is high performing and is built on Python, which has a large and growing community. For processing arrays of numbers, TensorFlow Eager functions are also supported in Tangent.

This library also auto-generates derivatives of code that contains if statements and loops. It also provides easy methods to generate custom gradients. It improves usability by using abstractions for easily inserting logic into the generated gradient code.

Tangent also provides forward-mode automatic differentiation. This is a better alternative to backpropagation (reverse mode) in cases where the number of outputs exceeds the number of inputs, where reverse mode becomes inefficient. Forward-mode autodiff, in contrast, runs in time proportional to the number of input variables.

According to the GitHub repository, “Tangent is useful to researchers and students who not only want to write their models in Python but also read and debug automatically-generated derivative code without sacrificing speed and flexibility.”

Currently, Tangent does not support classes and closures, although the developers do plan on incorporating classes. This will enable class definitions of neural networks and parameterized functions.

Tangent is still in the experimental stage. In the future, the developers plan to extend it to other numeric libraries and add support for more aspects of the Python language. These include closures, classes, and more NumPy and TensorFlow functions. They also plan to add more advanced autodiff and compiler functionalities.

To summarize, here’s a bullet list of the key features of Tangent:

- Auto differentiation capabilities
- Code is easy to interpret, debug, and modify
- Easily compatible
- Custom gradients
- Forward-mode autodiff
- High performance and optimization

You can learn more about the project on their official GitHub.
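To make the idea of forward-mode autodiff more concrete, here is a minimal sketch using dual numbers and operator overloading. Note that this is a different implementation technique from Tangent's source code transformation and does not use the Tangent library at all; the `Dual` class and the `derivative` helper are hypothetical names introduced only for this illustration.

```python
class Dual:
    """A value paired with its derivative with respect to one input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    @staticmethod
    def _lift(other):
        # Treat plain numbers as constants (derivative 0).
        return other if isinstance(other, Dual) else Dual(other)

    def __add__(self, other):
        other = Dual._lift(other)
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = Dual._lift(other)
        # Product rule: (uv)' = u v' + u' v
        return Dual(self.value * other.value,
                    self.value * other.deriv + self.deriv * other.value)

    __rmul__ = __mul__


def derivative(f, x):
    """Evaluate df/dx at x by seeding the input's derivative with 1."""
    return f(Dual(x, 1.0)).deriv


def f(x):
    return x * x * x + 2 * x  # f'(x) = 3x^2 + 2


print(derivative(f, 3.0))  # 29.0
```

One forward pass yields the derivative with respect to a single input, which is why the cost of forward mode grows with the number of inputs rather than the number of outputs.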

Packt Editorial Staff

14 Nov 2017

5 min read

14th Nov.' 17 - Headlines

New machine learning language Tile, new HPC systems from Dell EMC and HPE, Microsoft’s Neural Fuzzing, and Amazon’s project Ironman in today’s data science news.

Introducing Tile

Tile: A new language for machine learning from Vertex.AI

Vertex.AI has released a new machine learning language called Tile. It is a tensor manipulation language used in PlaidML’s backend to generate custom kernels for each specific operation on each GPU. The automatically produced kernels make it easier to add support for GPUs and new processors, and save time and effort overall. Tile’s syntax balances expressiveness and optimization to cover the widest range of operations needed to build neural networks. It closely resembles mathematical notation for describing linear algebra operations, and fully supports automatic differentiation. Vertex.AI said in its official blog that Tile was designed to be parallelizable as well as analyzable. In Tile, it’s possible to analyze issues such as cache coherency, use of shared memory, and memory bank conflicts.

Dell EMC announces new HPC systems

Dell EMC announces high-performance computing bundles aimed at AI, deep learning

At the SuperComputing 2017 conference in Denver, Dell EMC introduced a set of high-performance computing (HPC) systems, Dell EMC Ready Bundles for Machine and Deep Learning. These systems, the company said, are intended to bring HPC and data analytics into the mainstream, helping with fraud detection, image processing, financial investment analysis and personalized medicine. The set of services is expected to be available in the first half of 2018.

Dell EMC announces new PowerEdge server designed specifically for HPC workloads

Dell EMC introduced a new PowerEdge server designed specifically for HPC workloads: the Dell EMC PowerEdge C4140 server. As part of a joint development agreement with NVIDIA, this new server supports up to four NVIDIA Tesla V100 GPU accelerators with PCIe and NVLink high-speed interconnect technology. The server also leverages two Intel Xeon Scalable Processors, and is thus “ideal for intensive machine learning and deep learning applications to drive advances in scientific imaging, oil and gas exploration, financial services and other HPC industry verticals.” The Dell EMC PowerEdge C4140 is expected to be available worldwide in December 2017.

Hewlett Packard announces set of upgraded HPC systems for AI

HPE Apollo 2000 Gen10

In a bid to make high-performance computing (HPC) and AI more accessible to enterprises, Hewlett Packard Enterprise has announced a set of upgraded high-density compute and storage systems. The HPE Apollo 2000 Gen10 is a multi-server platform for enterprises looking to support HPC and deep learning applications with limited datacenter space. The platform supports NVIDIA Tesla V100 GPU accelerators to enable deep learning training and inference for use cases such as real-time video analytics for public safety. Enterprises deploying the HPE Apollo 2000 Gen10 system can start small with a single 2U shared infrastructure and scale out to up to 80 HPE ProLiant Gen10 servers in a 42U rack.

HPE Apollo 4510 Gen10

The HPE Apollo 4510 Gen10 system is designed for enterprises with data-intensive workloads that use object storage as an active archive. The system has 16 percent more cores than the previous generation, HPE said, and offers storage capacity of up to 600TB in a 4U form factor with standard server depth. It also supports NVMe cards.

HPE Apollo 70

Hewlett Packard Enterprise has announced the launch of HPE Apollo 70, its first ARM-based HPC system, using Cavium's 64-bit ARMv8-A ThunderX2 server processor. Set to become available in 2018, the system is designed for memory-intensive HPC workloads, and is compatible with HPC components from HPE's ecosystem partners, including Red Hat Enterprise Linux, SUSE Linux Enterprise Server for ARM, and Mellanox InfiniBand and Ethernet fabric solutions.

HPE LTO-8 Tape

Hewlett Packard Enterprise announced HPE LTO-8 Tape, which allows enterprises to offload primary storage to tape, with a storage capacity of 30 terabytes per tape cartridge, double that of the previous LTO-7 generation. The HPE LTO-8 Tape is slated for general availability in December 2017.

HPE T950

The HPE T950 tape library now stores up to 300 petabytes of data, Hewlett Packard Enterprise announced. The HPE TFinity ExaScale tape library provides storage capacity for up to 1.6 exabytes of data, the company said.

Announcing Microsoft's Neural Fuzzing

Neural Fuzzing: Microsoft uses machine learning, deep neural networks for new vulnerability testing

Microsoft has announced a new method for discovering software security vulnerabilities, called ‘neural fuzzing.’ The method combines machine learning and deep neural networks to use past experience to better identify overlooked issues. Neural fuzzing takes traditional fuzz testing and inserts a deep neural network into the feedback loop of a ‘greybox fuzzer.’ Development Lead William Blum said the neural fuzzing approach is simple because it is not based on sophisticated handcrafted heuristics; instead, it simply learns from an existing fuzzer. He also argued that the new method explores data more quickly than a traditional fuzzer, and that it could be applied to any fuzzer, including blackbox and random fuzzers. “Right now, our model only learns fuzzing locations, but we could also use it to learn other fuzzing parameters such as the type of mutation or strategy to apply,” Blum said.

Amazon to launch Ironman

Amazon Web Services set to launch AI project Ironman, ease the use of Google’s TensorFlow

Amazon Web Services could introduce a new service code-named ‘Ironman’ that will make it easier for people to do artificial intelligence work involving lots of different kinds of data, according to a report published in The Information. The Ironman program includes a new AWS cloud “data warehouse” service that collects data from multiple sources within a company and stores it in a central location. In addition, AWS plans to make it easier for people to use TensorFlow. Google made TensorFlow available under an open-source license in 2015, and the library is now widely used among researchers.

Sugandha Lahoti

14 Nov 2017

3 min read

Introducing Tile: A new machine learning language with auto-generating GPU kernels

Recently, Vertex.AI announced a simple and compact machine learning language for its PlaidML framework. Tile is a tensor manipulation language built to bring the PlaidML framework to a wider developer audience. PlaidML is the company's open source and portable deep learning framework, developed for deploying neural networks on any device.

A key obstacle the developers of PlaidML faced was scalability. For any framework to be adopted across a wide variety of platforms, software support is required. By software support we mean the implementation of software kernels, which are the glue between frameworks and the underlying system. Tile comes to the rescue here because it can generate these kernels automatically. This addresses the problem of compatibility by making it easier to add support for different NVIDIA GPUs as well as other types of processors, such as those from AMD and Intel.

Tile runs in the backend of PlaidML to produce custom kernels for each specific operation on each GPU. Because these kernels are machine generated rather than written by hand, it becomes much easier to add support for different processors. Using Tile, machine learning operations can be methodically implemented on parallel computing architectures and easily converted into optimized GPU kernels.

Another key feature of Tile is that the code is very easy to write and understand, because coding in Tile is similar to writing mathematical notation. In addition, all machine learning operations expressed in this language can be automatically differentiated. Being so easy to understand makes it easy to adopt for machine learning practitioners as well as software engineers and mathematicians.

This is an example of a Tile matrix multiply:

    function (A[M, L], B[L, N]) -> (C) {
        C[i, j: M, N] = +(A[i, k] * B[k, j]);
    }

Note how closely it resembles linear algebra operations, with an easy syntax. This syntax is expressive as well as optimized for covering all the operations required to build neural networks. PlaidML uses Tile as the intermediate language when integrating with Keras. This significantly reduces the amount of backend Keras code that has to be written, so it becomes easy to support and implement new operations such as dilated convolutions. Tile can also address and analyze issues such as cache coherency, shared memory usage, and memory bank conflicts.

According to the official blog of Vertex.AI, Tile is characterized by:

- Control-flow & side-effect free operations on n-dimensional tensors
- Mathematically oriented syntax resembling tensor calculus
- N-dimensional, parametric, composable, and type-agnostic functions
- Automatic Nth-order differentiation of all operations
- Suitability for both JITing and pre-compilation
- Transparent support for resizing, padding & transposition

The developers are currently working to bring the language to a formal specification. In the future, they intend to use a similar approach to make TensorFlow, PyTorch, and other frameworks compatible with PlaidML. If you’re interested in learning how to write code in Tile, you can check the Tile tutorial on their GitHub.
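For readers unfamiliar with the notation, the Tile matrix multiply shown in the article reads as "for every output index (i, j), sum A[i, k] * B[k, j] over k". The plain Python sketch below spells out that reading with explicit loops. It is only an interpretation of the one-line Tile example for illustration, not PlaidML code, and the `matmul` helper name is hypothetical.

```python
def matmul(a, b):
    """C[i][j] = sum over k of A[i][k] * B[k][j], for A of shape M x L and B of shape L x N."""
    m, l, n = len(a), len(b), len(b[0])
    if any(len(row) != l for row in a):
        raise ValueError("inner dimensions of A and B do not match")
    return [[sum(a[i][k] * b[k][j] for k in range(l)) for j in range(n)]
            for i in range(m)]

A = [[1, 2], [3, 4], [5, 6]]       # M = 3, L = 2
B = [[7, 8, 9], [10, 11, 12]]      # L = 2, N = 3
C = matmul(A, B)
print(C)  # [[27, 30, 33], [61, 68, 75], [95, 106, 117]]
```

The index k appears only on the right-hand side of the Tile expression, which is what marks it as the summed-over (contracted) dimension.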

Abhishek Jha

14 Nov 2017

3 min read

Has IBM edged past Google in the battle for Quantum Supremacy?

Last month, when researchers at Google unveiled a blueprint for quantum supremacy, little did they know that rival IBM was about to snatch pole position. In what could be the largest and most sophisticated quantum computer built to date, IBM has announced the development of a quantum computer capable of handling 50 qubits (quantum bits). Big Blue also announced a 20-qubit processor that will be made available through the IBM Q cloud by the end of the year.

"Our 20-qubit machine has double the coherence time, at an average of 90 microseconds, compared to previous generations of quantum processors with an average of 50 microseconds. It is also designed to scale; the 50-qubit prototype has similar performance," Dario Gil, who leads IBM's quantum computing and artificial intelligence research division, said in his blog post.

IBM’s progress in this space has been truly rapid. After launching the 5-qubit system in May 2016, the company followed with a 15-qubit machine this year, and then upgraded the IBM Q experience to 20 qubits, with 50 qubits next in line. That is quite a leap in 18 months.

As a technology, quantum computing is a rather difficult area to understand, because information is processed differently here. Unlike normal computers, whose bits are either a 0 or a 1, qubits can exist in multiple states at once, leading to all kinds of programming possibilities for this type of computing. Add to that the coherence factor, which makes it very difficult for programmers to build a quantum algorithm.

While the company did not divulge the technical details of how its engineers simultaneously expanded the number of qubits and increased the coherence times, it did mention that the improvements were due to better “superconducting qubit design, connectivity and packaging.” It added that the 50-qubit prototype is a “natural extension” of the 20-qubit technology and that both exhibit "similar performance metrics."

The major goal, though, is to create a fault-tolerant universal system that is capable of correcting errors automatically while having high coherence. "The holy grail is fault-tolerant universal quantum computing. Today, we are creating approximate universal, meaning it can perform arbitrary operations and programs, but it’s approximating so that I have to live with errors and a limited window of time to perform the operations," Gil said.

The good news is that an ecosystem is building up. Through the IBM Q experience, more than 60,000 users have run over 1.7 million quantum experiments and generated over 35 third-party research publications. That the beta-testers included 1,500 universities, 300 high schools and 300 private-sector participants means quantum computing is closer to implementation in the real world, in areas like medicine, drug discovery and materials science.

"Quantum computing will open up new doors in the fields of chemistry, optimisation, and machine learning in the coming years," Gil added. "We should savor this period in the history of quantum information technology, in which we are truly in the process of rebooting computing."

All eyes are now on Google, IBM’s nearest rival in quantum computing at this stage. While IBM’s 50-qubit processor has taken away half the charm of Google’s soon-to-be-announced 49-qubit system, expect more surprises in the offing, as Google has so far managed to keep its entire quantum computing machinery behind closed doors.

Amarabha Banerjee

13 Nov 2017

3 min read

PINT (Paper IN Two mins) - Making Neural Network Architectures generalize via Recursion

This is a quick summary of the research paper titled Making Neural Programming Architectures Generalize via Recursion by Jonathon Cai, Richard Shin, and Dawn Song, published on 6th Nov 2016.
The idea of solving a common task is central to developing any algorithm or system. The primary challenge in designing such a system is generalizing the result to a large set of data. Simply put, the same system should be able to predict accurate results when the data is vast and varied across different domains. This is where most ANN systems fail. The researchers claim that recursion, a process inherent in many algorithms, will, if introduced explicitly into the learned program, help us arrive at a system and architecture that can predict accurate results over limitless amounts of data. This technique is called the Recursive Neural Program. For more on this and the different neural network programs, you can refer to the original research paper.
[Figure: a sample illustration of a Neural Network Program]
The Problem with Learned Neural Networks
The most common technique applied to date was the learned neural network: a method where a program was given increasingly complex tasks, for example the grade-school addition problem or, in simpler words, adding two numbers. The problem with this approach was that the program kept solving the task correctly only as long as the number of digits was small. When the number of digits increased, the results were chaotic: some were correct and some were not, because the program had learned a complex method that did not hold up as the problem grew. The real reason behind this was the architecture, which stayed the same as the complexity of the problem increased, so the program could not adapt and eventually gave chaotic responses.
The Solution of Recursion
The essence of recursion is that it helps the system break the problem down into smaller pieces and then solve those pieces separately. This means that irrespective of how complex the problem is, the recursive process will break it down into standard units, so the solution remains uniform and consistent. Keeping the theory of recursion in mind, a group of researchers have implemented it in their neural network program and created a recursive architecture based on the Neural Programmer-Interpreter (NPI).
[Figure: the different algorithms and techniques used to create neural network based programs]
The present system is based on the May 2016 formulation proposed by Reed et al. The system induces supervised recursion in solving any task: a particular function stores an output in a particular memory cell, then calls that output value back while checking it against the desired result. This self-calling of the program or function induces recursion, which helps the program decompose the problem into multiple smaller units, and hence the results are more accurate than with other techniques.
The scientists have successfully applied this technique to solve four common tasks, namely:
Grade School Addition
Bubble Sort
Topological Sort
Quick Sort
They have found that the recursive neural network architecture gives 100 percent success rates in predicting correct results for all four of the above tasks. The flip side of this technique is the amount of supervision still required while performing the tasks. This will be subject to further investigation and research. For a more detailed approach and results on the different neural network programs and their performance, please refer to the original research paper.
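To make the decomposition idea concrete, here is a minimal sketch of grade-school addition written recursively in plain Python. It is not the NPI model or the paper's code; the function names `add_digits` and `add` are made up for illustration. It only shows why a recursive formulation behaves the same for 3 digits or 30: each call handles one column and hands a strictly smaller problem to the next call.

```python
def add_digits(a, b, carry=0):
    """Add two numbers given as digit lists, least-significant digit first."""
    # Base case: no columns left, only a possible final carry.
    if not a and not b:
        return [carry] if carry else []
    # Handle exactly one column (a missing digit counts as 0) ...
    x = a[0] if a else 0
    y = b[0] if b else 0
    total = x + y + carry
    # ... then recurse on the remaining, strictly shorter problem.
    return [total % 10] + add_digits(a[1:], b[1:], total // 10)


def to_digits(value):
    """Convert a non-negative integer to digits, least-significant first."""
    return [int(ch) for ch in reversed(str(value))]


def add(m, n):
    """Add two non-negative integers via the recursive column routine."""
    digits = add_digits(to_digits(m), to_digits(n))
    return int("".join(str(d) for d in reversed(digits)))


if __name__ == "__main__":
    print(add(999, 1))
    print(add(12345678901234567890, 98765432109876543210))
```

Because every call performs the same small step, correctness for one column plus correctness of the recursive call covers inputs of any length, which is the generalization argument the summary above attributes to recursion.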

Packt Editorial Staff

13 Nov 2017

3 min read

13th Nov.' 17 - Headlines

IBM's 50-qubit machine, a visual analytics tool called SpotLyt, and AllegroGraph Triple Attributes in today’s trending stories in data science news. The largest quantum computer 50Q IBM announces 50-qubit quantum computer IBM has announced a quantum computer that handles 50 quantum bits (qubits). The company said it also has a working prototype of a 20-qubit system that will be made available through IBM Q cloud by year end. IBM did not divulge the technical details about how its engineers could simultaneously expand the number of qubits and increase the coherence times, but it did mention in the official statement that the improvements were due to better “superconducting qubit design, connectivity and packaging.” The 50-qubit prototype, known as 50Q, is a “natural extension” of the 20-qubit technology and exhibits “similar performance metrics,” the company added. The 50-qubit machine is so far the largest and most powerful quantum computer ever built. At this stage, IBM’s nearest rival in quantum computing is Google, which could demonstrate a working 49-qubit system before the end of 2017. Launching SpotLyt Brytlyt announces visual analytics tool SpotLyt for billion row data sets GPU-accelerated database & analytics platform Brytlyt has introduced SpotLyt, a real-time visualization and analytical tool designed for massive datasets. SpotLyt can be used either as a stand-alone visualization tool or as an add-on to a company’s current visualization set-up. “We built SpotLyt because we found existing visualization tools don't handle geo-visualization over 20,000 data points very well,” Brytlyt CEO Richard Heyns said, “Since SpotLyt uses Brytlyt's own data rendering engine to visualize billion row datasets, analysts can now get a holistic and detailed point of view at their fingertips.” AllegroGraph more secure than ever Franz adds Triple Attribute security to AllegroGraph Franz has announced Triple Attributes for its semantic graph database AllegroGraph. 
The new feature provides the necessary power and flexibility to address high-security data environments such as HIPAA access controls, privacy rules for banks, and security models for policing, intelligence and government. “Enterprises want the flexibility of graph databases, but they also want the security they have come to rely on with relational databases,” Franz CEO Jans Aasman said. Though the Triple Attributes feature was initiated for government level data security, it can be implemented for diverse data analytics from real world events like crop yields to storing blockchain hashes and ICO public keys for KYC applications. Triple Attribute Security is now available in AllegroGraph v6.3.

Amarabha Banerjee

13 Nov 2017

8 min read

Getting started with Storm Components for Real Time Analytics

[box type="note" align="" class="" width=""]In this article by Shilpi Saxena and Saurabh Gupta from their book Practical Real-time data Processing and Analytics we shall explore Storm's architecture with its components and configure it to run in a cluster. [/box]
Initially, real-time processing was implemented by pushing messages into a queue and then reading the messages from it using Python or any other language to process them one by one. The primary challenges with this approach were:
If the processing of any message fails, it has to be put back into the queue for reprocessing
The queues and the workers (processing units) have to be kept up and running all the time
Below are the two main reasons that make Storm a highly reliable real-time engine:
Abstraction: Storm provides a distributed abstraction in the form of streams. A stream can be produced and processed in parallel. A spout produces a new stream, and a bolt is a small unit of processing on a stream. A topology is the top-level abstraction. The advantage of abstraction here is that nobody needs to worry about what is going on internally, like serialization/deserialization, sending/receiving messages between different processes, and so on. The user can focus on writing the business logic.
A guaranteed message processing algorithm: Nathan Marz developed an algorithm based on random numbers and XORs that requires only about 20 bytes to track each spout tuple, regardless of how much processing is triggered downstream.
Storm Architecture and Storm components
The nimbus node acts as the master node in a Storm cluster. It is responsible for analyzing the topology and distributing tasks to different supervisors according to their availability. It also monitors failures; if one of the supervisors dies, it redistributes the tasks among the available supervisors. The nimbus node uses Zookeeper to keep track of tasks and maintain its state.
If the nimbus node fails, it can be restarted; it then reads the state from Zookeeper and resumes from the point where it failed.
Supervisors act as slave nodes in the Storm cluster. One or more workers, that is, JVM processes, can run on each supervisor node. A supervisor coordinates with its workers to complete the tasks assigned by the nimbus node. If a worker process fails, the supervisor finds available workers to complete the tasks.
A worker process is a JVM running on a supervisor node. It contains one or more executors, and it coordinates with them to finish the task.
An executor is a single thread spawned by a worker. Each executor is responsible for running one or more tasks.
A task is a single unit of work. It performs the actual processing on data. It can be either a spout or a bolt.
Apart from the above processes, there are two other important parts of a Storm cluster: logging and the Storm UI. The logviewer service is used to view the logs of workers on the supervisors from the Storm UI.
The following are the primary characteristics of Storm that make it special and ideal for real-time processing:
Fast
Reliable
Fault-Tolerant
Scalable
Programming Language Agnostic
Storm Components
Tuple: The basic data structure of Storm. It can hold multiple values, and the data type of each value can be different.
Topology: As mentioned earlier, the topology is the highest level of abstraction. It contains the flow of processing, including spouts and bolts. It is a kind of graph computation.
Stream: The stream is the core abstraction of Storm. It is an unbounded sequence of tuples. A stream can be processed by different types of bolts, which results in a new stream.
Spout: A spout is a source of a stream. It reads messages from sources like Kafka, RabbitMQ, and so on as tuples and emits them in a stream. There are two types of spout:
Reliable: The spout keeps track of each tuple and replays the tuple in case of any failure.
Unreliable: The spout does not track the tuple once it has been emitted into the stream.
Setting up and configuring Storm
Before setting up Storm, we need to set up Zookeeper, which is required by Storm.
Setting up Zookeeper
Below are instructions on how to install, configure and run Zookeeper in standalone and cluster mode.
Installing
Download Zookeeper from http://www-eu.apache.org/dist/zookeeper/zookeeper-3.4.6/zookeeper-3.4.6.tar.gz. After the download, extract zookeeper-3.4.6.tar.gz as below:
    tar -xvf zookeeper-3.4.6.tar.gz
The following files and folders will be extracted:
[Figure: extracted Zookeeper files and folders]
Configuring
There are two types of deployment with Zookeeper: standalone and cluster. There is no big difference in configuration, just a few extra parameters for cluster mode.
Standalone
As shown in the previous figure, go to the conf folder and change the zoo.cfg file as follows:
    # Length of a single tick in milliseconds. It is used to regulate heartbeats and timeouts.
    tickTime=2000
    # Amount of time to allow followers to connect and sync with the leader.
    initLimit=5
    # Amount of time to allow followers to sync with Zookeeper.
    syncLimit=2
    # Directory where Zookeeper keeps transaction logs.
    dataDir=/tmp/zookeeper/tmp
    # Listening port for clients to connect to.
    clientPort=2182
    # Maximum number of clients that can connect to a Zookeeper node.
    maxClientCnxns=30
Cluster
In addition to the above configuration, add the following configuration on each node of the cluster:
    server.1=zkp-1:2888:3888
    server.2=zkp-2:2888:3888
    server.3=zkp-3:2888:3888
The general form is server.x=[hostname]:nnnn:mmmm. Here x is the ID assigned to each Zookeeper node. In the dataDir configured above, create a file named "myid" and put the corresponding ID of the Zookeeper node in it. The ID should be unique across the cluster, and the same ID is used as x here. nnnn is the port used by followers to connect to the leader node, and mmmm is the port used for leader election.
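The relationship between the server.x lines and each node's myid file is easy to get wrong by hand, so here is a small Python sketch that derives both from one host list. The helper names `cluster_lines` and `myid_for` are hypothetical and not part of Zookeeper; the hostnames and ports are simply the ones used in the excerpt above.

```python
def cluster_lines(hosts, peer_port=2888, election_port=3888):
    """Return the server.x entries for zoo.cfg, numbering hosts from 1."""
    return [
        f"server.{node_id}={host}:{peer_port}:{election_port}"
        for node_id, host in enumerate(hosts, start=1)
    ]


def myid_for(hosts, hostname):
    """Return the text to store in <dataDir>/myid on the given host."""
    # The ID must match the x in that host's server.x line.
    return str(hosts.index(hostname) + 1)


if __name__ == "__main__":
    hosts = ["zkp-1", "zkp-2", "zkp-3"]
    for line in cluster_lines(hosts):
        print(line)
    print("myid on zkp-2:", myid_for(hosts, "zkp-2"))
```

Generating both values from the same ordered list keeps the ID in myid and the x in server.x in step, which is the constraint the text describes.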
Running
Use the following command to run Zookeeper from the Zookeeper home directory:
    bin/zkServer.sh start
The command returns to the console after printing the message below, and the process keeps running in the background:
    Starting zookeeper ... STARTED
The following command can be used to check the status of the Zookeeper process:
    bin/zkServer.sh status
The following output is shown in standalone mode:
    Mode: standalone
The following output is shown in cluster mode:
    Mode: follower # in case of follower node
    Mode: leader # in case of leader node
Setting up Apache Storm
Below are instructions on how to install, configure and run Storm with nimbus and supervisors.
Installing
Download Storm from http://www.apache.org/dyn/closer.lua/storm/apache-storm-1.0.3/apache-storm-1.0.3.tar.gz. After the download, extract apache-storm-1.0.3.tar.gz, as follows:
    tar -xvf apache-storm-1.0.3.tar.gz
Below are the files and folders that will be extracted:
[Figure: extracted Storm files and folders]
Configuring
As shown in the previous figure, go to the conf folder and add/edit properties in storm.yaml.
Set the Zookeeper hostnames in the Storm configuration:
    storm.zookeeper.servers:
      - "zkp-1"
      - "zkp-2"
      - "zkp-3"
Set the Zookeeper port:
    storm.zookeeper.port: 2182
Set the nimbus node hostname so that the Storm supervisors can communicate with it:
    nimbus.host: "nimbus"
Set the Storm local data directory, which keeps small pieces of information like conf, jars, and so on:
    storm.local.dir: "/usr/local/storm/tmp"
Set the number of workers that will run on the current supervisor node. It is best practice to use the same number of workers as the number of cores in the machine.
    supervisor.slots.ports:
      - 6700
      - 6701
      - 6702
      - 6703
      - 6704
      - 6705
Allocate memory to the worker, supervisor, and nimbus:
    worker.childopts: "-Xmx1024m"
    nimbus.childopts: "-XX:+UseConcMarkSweepGC -XX:+UseCMSInitiatingOccupancyOnly -XX:CMSInitiatingOccupancyFraction=70"
    supervisor.childopts: "-Xmx1024m"
Topology-related configuration: the first setting is the maximum amount of time (in seconds) for a tuple's tree to be acknowledged (fully processed) before it is considered failed. The second setting turns debug logs off, so Storm will generate only info logs.
    topology.message.timeout.secs: 60
    topology.debug: false
Running
There are four services needed to start a complete Storm cluster:
Nimbus: First of all, we need to start the nimbus service in Storm. The following is the command to start it:
    bin/storm nimbus
Supervisor: Next, we need to start the supervisor nodes to connect with the nimbus node. The following is the command:
    bin/storm supervisor
UI: To start the Storm UI, execute the following command:
    bin/storm ui
You can access the UI on http://nimbus-host:8080. It is shown in the following figure.
[Figure: Storm UI]
Logviewer: The log viewer service helps to see the worker logs in the Storm UI. Execute the following command to start it:
    bin/storm logviewer
Summary
We started with the history of Storm, where we discussed how Nathan Marz got the idea for Storm and what type of challenges he faced while releasing Storm as open source software and then in Apache. We discussed the architecture of Storm and its components. Nimbus, supervisors, workers, executors, and tasks are part of Storm's architecture. Its components are tuple, stream, topology, spout, and bolt. We discussed how to set up Storm and configure it to run in a cluster. Zookeeper has to be set up first, as Storm requires it. The above was an excerpt from the book Practical Real-time data Processing and Analytics
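The guaranteed message processing algorithm mentioned earlier in this excerpt (random numbers and XORs) can be sketched in a few lines of Python. This is a simplified illustration of the idea only, not Storm's acker implementation: the `AckTracker` class and its method names are assumptions made for this example. The point is that a single fixed-size value can tell us when every tuple in a tree has been both created and acknowledged, because XOR-ing the same random ID twice cancels it out.

```python
import random


class AckTracker:
    """Tracks one spout tuple's whole tree with a single XOR-ed 64-bit value."""

    def __init__(self, rng=None):
        self.rng = rng or random.Random()
        self.ack_val = 0  # constant-size state, however large the tree grows

    def emit(self):
        """Register a newly created tuple and return its random 64-bit ID."""
        tuple_id = 0
        while tuple_id == 0:  # avoid the degenerate all-zero ID
            tuple_id = self.rng.getrandbits(64)
        self.ack_val ^= tuple_id
        return tuple_id

    def ack(self, tuple_id):
        """Acknowledge a tuple; XOR-ing its ID again cancels it out."""
        self.ack_val ^= tuple_id

    def fully_processed(self):
        """The tree is done when every emitted ID has been XOR-ed in twice."""
        return self.ack_val == 0


if __name__ == "__main__":
    tracker = AckTracker()
    root = tracker.emit()        # tuple emitted by the spout
    child_a = tracker.emit()     # tuples a bolt emits while handling root
    child_b = tracker.emit()
    tracker.ack(root)
    print(tracker.fully_processed())   # children are still pending
    tracker.ack(child_a)
    tracker.ack(child_b)
    print(tracker.fully_processed())
```

The tracked value stays the same size no matter how many downstream tuples are created, which matches the excerpt's remark that only a small, fixed number of bytes is needed per spout tuple. A value of zero before the tree is finished is possible in principle but extremely unlikely with random 64-bit IDs.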

Packt Editorial Staff

13 Nov 2017

2 min read

Week at a Glance (4th - 10th Nov. '17): Top News from Data Science

Last week saw some interesting partnerships between tech giants, new tool announcements in the conversational AI space, significant version updates and further advancement towards the democratization of AI development and usage. Here is a quick rundown of news in the data science space worth your attention!
News Highlights
China’s Baidu launches Duer OS Prometheus Project to accelerate conversational AI
Frenemies: Intel and AMD partner on laptop chip to keep Nvidia at bay
Introducing “Pyro” for deep probabilistic modeling
Salesforce myEinstein: Now build AI apps with ‘clicks, not code’
Apache Kafka 1.0: From messaging system to streaming platform
Cisco Spark Assistant: World’s first AI voice assistant for meetings
In Other News
10th Nov.’ 17 – Headlines
AI.io launches PLATO, an AI-based operating platform for enterprise
HPE developing its own neural network chip that is faster than anything in the market
Atos launches next generation AI servers “BullSequana S” that are ultra-scalable, ultra-flexible
9th Nov.’ 17 – Headlines
Bitcoin price surges to record high, then tanks, as plans to split digital currency is called off
MongoDB 3.6 released: Change Streams, Retryable Writes among key updates in MongoDB’s biggest ever release
Introducing Grid: A scalable Blockchain system for better performance, resource segregation and working governance model
8th Nov.’ 17 – Headlines
spaCy 2.0 released with 13 new neural network models for 7+ languages
Microsoft says it will extend HoloLens AI processor to other devices from daily life
Cloud SQL for PostgreSQL integrates high availability and replication
Artificial Intelligence creeps into CryptoTrading, AiX claims to develop first AI broker
7th Nov.’ 17 – Headlines
Google introduces Tangent, a Python library for automatic differentiation
Salesforce, Google form strategic partnership on cloud
Rockwell unveils Project Scio, a scalable analytics platform for industrial IoT applications
HPE launches Superdome Flex platform for high performance data analytics for mission critical workloads
Google releases its internal tool Colaboratory
Neuromation announces ICO to facilitate AI adoption with blockchain-powered platform
DefinedCrowd unveils data platform API at Web Summit 2017
6th Nov.’ 17 – Headlines
Tableau announces support for Amazon Redshift Spectrum in Tableau 10.4
IBM brings new cloud data tools, updates Unified Data Governance Platform
IBM’s Goodbye to Bluemix brand
Periscope Data unveils new platform to bolster “data driven culture” for professional data teams
Caviar announces real estate-backed digital asset platform
SIA: MCN collaborates with SAS to unveil single source data platform

Packt Editorial Staff

10 Nov 2017

3 min read

10th Nov.' 17 - Headlines

Duer OS Prometheus Project, PLATO platform, HPE's neural network chip, and BullSequana S AI servers in today's trending data science news.
Baidu's new OS for AI capabilities
Baidu launches new operating system Duer OS Prometheus Project to advance conversational AI
Baidu Inc. has officially launched a new operating system to speed up conversational AI capabilities. Known as the “Duer OS Prometheus Project,” the operating system is already providing conversational support to 10 major domains and more than 100 subdomains in China. Baidu has announced a $1 million fund to invest in efforts in this space.
Announcing AI platform PLATO
AI.io launches PLATO, an AI-based operating platform for enterprise
AI.io has announced PLATO, an AI-based neural network platform whose name stands for Perceptive Learning Artificial intelligence Technology Operating platform. The platform powers apps in consumer and enterprise use cases for machine learning, deep learning, natural language processing, computer vision, machine reasoning, and cognitive AI. Based on the specific needs of the business, PLATO can be used to extract value from large amounts of structured and unstructured data using an intuitive, easy-to-use interface and toolkit. The data can then be converted, normalized, and enriched with concepts, relationships, sentiment and tone, giving PLATO the ability to understand the content in a fully cognitive manner. Businesses can then embed the data, with new insights, into existing applications and workflows. In this way PLATO can enhance overall decision-making, resulting in revenue growth.
HPE's upcoming processor could well be an accelerator
HPE developing its own neural network chip that is faster than anything in the market
Hewlett Packard Enterprise could be developing an advanced chip for high-performance computing under the intense power and physical space limitations characteristic of space missions.
Recently, when VP and GM Tom Bradicich was asked about the processor architecture, he said the Dot Product Engine (DPE) is less of a full processor and more like an accelerator, which offloads certain multiplication elements common in neural network inference and broader HPC applications. “DPE is not a neural network per se, in the sense that it’s not a fixed configuration, but rather is reconfigurable, and can be used for inference of several types of neural networks like DNN, CNN, RNN. Hence it can do neural network jobs and workloads,” he said, adding that DPE executes linear algebra in the analog domain, which is more efficient than digital implementations, such as dedicated ASICs. With the added advantage of reconfigurability, DPE is fast because it accelerates vector-matrix math (dot product multiplication) by exploiting Ohm's Law on a memristor array. In fact, the way it is designed, it is “faster than anything available on the market,” Bradicich claimed, with a “much better fit for the performance, power, and space requirements of extreme edge environments.”
Announcing BullSequana S servers
Atos launches next generation AI servers “BullSequana S” that are ultra-scalable, ultra-flexible
Atos has developed BullSequana S, its in-house next-generation servers optimized for machine learning, business‐critical computing applications and in-memory environments. BullSequana S comes with a unique combination of powerful processors, both CPUs and GPUs. Built on a modular architecture, the BullSequana S server gives customers the agility to add machine learning and AI capacity to existing enterprise workloads, thanks to the introduction of a GPU. Within a single server, GPU, storage and compute modules are mixed. BullSequana S integrates the advanced Intel Xeon Scalable processors (Skylake) with an innovative architecture designed by Atos’ R&D teams.
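As a side note on the Dot Product Engine item above, the phrase "exploiting Ohm's Law on a memristor array" can be illustrated with an idealized model in Python. This is a textbook-style sketch, not HPE's design: the function name `crossbar_multiply` is made up here, and the model ignores wire resistance, noise, and device non-linearity. Each cell passes a current equal to its conductance times the row voltage (Ohm's Law), and the currents on a shared column wire add up, so each column current is a dot product.

```python
def crossbar_multiply(voltages, conductances):
    """Return column currents I_j = sum_i V_i * G_ij for an ideal crossbar.

    voltages:      one input voltage per row (the input vector)
    conductances:  rows x columns grid of cell conductances (the matrix)
    """
    n_cols = len(conductances[0])
    return [
        sum(v * row[j] for v, row in zip(voltages, conductances))
        for j in range(n_cols)
    ]


if __name__ == "__main__":
    volts = [1.0, 2.0]                 # input vector, applied to the rows
    cells = [[0.5, 1.0],               # conductance matrix
             [0.25, 0.0]]
    print(crossbar_multiply(volts, cells))
```

In the analog device the summation happens physically on the wire in one step, whereas this digital model has to loop over every cell, which is the intuition behind the efficiency claim quoted above.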

Amarabha Banerjee

10 Nov 2017

6 min read

How Near Real Time (NRT) Applications work

[box type="note" align="" class="" width=""]In this article by Shilpi Saxena and Saurabh Gupta from their book Practical Real-time data Processing and Analytics we shall explore what a near real time architecture looks like and how an NRT app works. [/box]
It's very important to understand the key aspects where traditional monolithic application systems fall short of serving the need of the hour:
Backend DB: Single-point, monolithic data access.
Ingestion flow: The pipelines are complex and tend to induce latency in the end-to-end flow.
Systems are failure prone, and the recovery approach is difficult and complex.
Synchronization and state capture: It's very difficult to capture and maintain the state of facts and transactions in the system. Diversely distributed systems and real-time system failures further complicate the design and maintenance of such systems.
The answer to the above issues is an architecture that supports streaming and thus gives its end users access to actionable insights in real time over ever-flowing in-streams of real-time fact data. Its characteristics are:
Local state and consistency of the system for large-scale, high-velocity systems
Data doesn't arrive at intervals; it keeps flowing in, and it's streaming in all the time
No single state of truth in the form of a backend database; instead, the applications subscribe to or tap into a stream of fact data
Before we delve further, it's worthwhile to understand the notion of time:
[Figure: SLAs for batch, near real-time, and real-time implementations]
Looking at this figure, it's easy to correlate the SLAs with each type of implementation (batch, near real-time, and real-time) and the kinds of use cases each implementation caters to. For instance, batch implementations have SLAs ranging from a couple of hours to days, and such solutions are predominantly deployed for canned/pre-generated reports and trends. Near real-time solutions have SLAs on the order of a few seconds to hours and cater to situations requiring ad-hoc queries, mid-resolution aggregators, and so on.
Real-time applications are the most mission-critical in terms of SLA and resolution: every event counts, and the results have to return within milliseconds to seconds.
Near real time (NRT) Architecture
In essence, NRT architecture consists of four main components/layers, as depicted in the following figure:
The message transport pipeline
The stream processing component
The low-latency data store
Visualization and analytical tools
The first step is collecting data from the source and providing it to the "data pipeline", which is a logical pipeline that collects the continuous events or streaming data from various producers and provides them to the consumer stream processing applications. These applications transform, collate, correlate, aggregate, and perform a variety of other operations on this live streaming data and then finally store the results in the low-latency data store. Then, a variety of analytical, business intelligence, and visualization tools and dashboards read this data from the data store and present it to the business user.
Data collection
This is the beginning of the journey for all data processing. Be it batch or real time, the first and most immediate challenge is getting the data from its source to the systems that process it. We can look at the processing unit as a black box, with the data sources and consumers as publishers and subscribers.
It's captured in the following diagram:
[Figure: data sources and consumers as publishers and subscribers around the processing unit]
The key criteria for data collection tools, in the general context of big data and real-time processing specifically, are as follows:
Performance and low latency
Scalability
Ability to handle structured and unstructured data
Apart from this, the data collection tool should be able to cater to data from a variety of sources such as:
Data from traditional transactional systems: There are three approaches here. The first is to duplicate the ETL process of these traditional systems and tap the data from the source. The second is to tap the data from these ETL systems. The third, and better, approach is to use a virtual data lake architecture for data replication.
Structured data from IoT/Sensors/Devices, or CDRs: This is data that comes at a very high velocity and in a fixed format. The data can be from a variety of sensors and telecom devices.
Unstructured data from media files, text data, social media, and so on: This is the most complex of all incoming data, where the complexity is due to the dimensions of volume, velocity, variety, and structure.
Stream processing
The stream processing component itself consists of three main sub-components, which are:
The Broker: collects and holds the events or data streams from the data collection agents.
The "Processing Engine": transforms, correlates, and aggregates the data, and performs the other necessary operations
The "Distributed Cache": serves as a mechanism for maintaining a common data set across all distributed components of the processing engine
The same aspects of the stream processing component are depicted in the following diagram:
[Figure: sub-components of the stream processing component]
There are a few key attributes that the stream processing component should cater to:
Distributed components, thus offering resilience to failures
Scalability to cater to the growing needs of the application or a sudden surge of traffic
Low latency to handle the overall SLAs expected from such an application
Easy operationalization of use cases, to be able to support evolving use cases
Built for failures: the system should be able to recover from inevitable failures without any event loss, and should be able to reprocess from the point where it failed
Easy integration points with respect to off-heap/distributed cache or data stores
A wide variety of operations, extensions, and functions to work with the business requirements of the use case
Analytical layer - serve it to the end user
The analytical layer is the most creative and interesting of all the components of an NRT application. So far, all we have talked about is backend processing, but this is the layer where we actually present the output/insights to the end user graphically and visually, in the form of an actionable item. A few of the challenges these visualization systems should be capable of handling are:
Need for speed
Understanding the data and presenting it in the right context
Dealing with outliers
The figure depicts the flow of information from event producers to the collection agents, followed by the brokers and processing engine (transformation, aggregation, and so on) and then the long-term storage.
From the storage unit, the visualization tools draw the insights and present them in the form of graphs, alerts, charts, Excel sheets, dashboards, or maps to the business owners, who can assimilate the information and take action based upon it. The above was an excerpt from the book Practical Real-time data Processing and Analytics.
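The broker, processing engine, and data store layers described in this excerpt can be sketched as a toy, single-process pipeline in Python. Everything here is hypothetical and in-memory: the `Broker` and `ProcessingEngine` classes are illustrative names, and a real NRT system would use a distributed broker and a real low-latency store. The sketch also shows the "reprocess from the point where it failed" attribute: the engine keeps an offset, so a restart continues from the last processed event instead of starting over or double counting.

```python
from collections import defaultdict


class Broker:
    """Collects and holds events from the data collection agents."""

    def __init__(self):
        self.events = []

    def publish(self, event):
        self.events.append(event)

    def read_from(self, offset):
        """Return all events at or after the given position."""
        return self.events[offset:]


class ProcessingEngine:
    """Aggregates events per key and remembers how far it has read."""

    def __init__(self, broker, store):
        self.broker = broker
        self.store = store   # stands in for the low-latency data store
        self.offset = 0      # checkpoint: index of the next unprocessed event

    def run_once(self):
        """Process every event published since the last checkpoint."""
        for event in self.broker.read_from(self.offset):
            self.store[event["key"]] += event["value"]
            self.offset += 1  # advance only after the event is applied


if __name__ == "__main__":
    broker = Broker()
    store = defaultdict(int)
    engine = ProcessingEngine(broker, store)

    broker.publish({"key": "page_a", "value": 1})
    broker.publish({"key": "page_b", "value": 5})
    engine.run_once()

    broker.publish({"key": "page_a", "value": 2})
    engine.run_once()        # only the new event is processed

    # A dashboard (the analytical layer) would read from the store.
    print(dict(store))
```

Keeping the checkpoint next to the aggregation logic is a deliberate simplification; in a distributed setup the offset would itself need to be stored durably so that a replacement engine can pick it up.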

Abhishek Jha

10 Nov 2017

2 min read

China’s Baidu launches Duer OS Prometheus Project to accelerate conversational AI

Experts believe artificial intelligence is the operating system of the future. Earlier this year, when Baidu announced its DuerOS platform, it clearly threw its hat in the ring. Now the Chinese search giant has gone a step further by launching the Duer OS Prometheus Project to advance conversational AI.

“Voice is increasingly becoming how we interact with our devices today,” Kaihua Zhu, chief technology officer of Baidu’s DuerOS, said in a statement. “Open datasets, interdisciplinary collaboration and financial incentives will create the conditions necessary for rapid advancement of conversational AI.”

DuerOS already provides conversational support for 10 major domains and over 100 subdomains in China. Since its beta launch at the beginning of 2017, it has quickly become the preferred choice of third-party hardware manufacturers in China that need Mandarin voice recognition support, for devices ranging from refrigerators and air conditioners to TV set-top boxes, storytelling machines and smart speakers.

Baidu will gradually open three large-scale datasets, covering far-field wake word detection, far-field speech recognition, and multi-turn conversations, so that developers can train their algorithms for conversational AI systems. The wake word detection dataset will consist of around 500,000 voice clips of five to ten popular Chinese wake words, including "xiaodu xiaodu", the wake word that activates DuerOS-enabled devices. The speech recognition dataset will include thousands of hours of Mandarin speech, so that people can train systems that accurately “hear” human speech under complex conditions such as noisy environments. The project will also release thousands of dialogue samples covering 10 different domains to promote the development of multi-turn conversation technology.

To seek broader support for the operating system, Baidu has announced a $1 million fund to invest in efforts related to voice and machine learning. Guoguo Chen, Baidu’s Principal Architect for DuerOS, noted that in the age of AI, data should not be a barrier that prevents smaller organizations and individuals from developing leading-edge conversational AI systems. “By opening our dataset and offering interdisciplinary collaborations and financial incentives, we hope to accelerate the pace of innovation in this space and advance the future of conversational computing,” Chen said.

The DuerOS Prometheus project is sponsored by the Baidu Duer Business Unit, together with Baidu Speech Technology Group, Baidu Campus Branding and Baidu Cloud.

Packt Editorial Staff

09 Nov 2017

3 min read

9th Nov.' 17 - Headlines

Bitcoin prices soar and tumble, MongoDB announces its biggest release, and a proposed Grid aims to improve blockchain systems, in today’s top stories in data science news.

Bitcoin's roller-coaster amid SegWit2x cancellation

Bitcoin price surges to record high, then tanks, as plans to split the digital currency are called off

Bitcoin was scheduled to upgrade around Nov. 16 following a proposal called SegWit2x, which would have split the digital currency in two. But with major bitcoin developers recently dropping their support for the upgrade, the developers behind SegWit2x called off the plans on Wednesday. In response, the bitcoin price reached an all-time high of around $7,900. This was followed by a crash of roughly $1,000, which took the price down to $6,977. Experts believe the rapid price swing could reflect a conflict between the short- and long-term impacts of the SegWit2x cancellation. The hard fork would have split Bitcoin into two competing blockchains, resulting in an ugly fight for supremacy.

Announcing MongoDB 3.6

MongoDB 3.6 released: Change Streams, Retryable Writes among key updates in MongoDB’s biggest ever release

MongoDB has announced its biggest release yet, version 3.6, with over a hundred new and updated features. With new array update operators, users can now specify in-place updates to specific array items at any depth of nesting. Extensions to the $lookup aggregation stage now allow uncorrelated subqueries and multiple matching conditions, so referencing and joining documents in complex combinations can be handled in the database. MongoDB 3.6 also introduces Change Streams, which applications can use to get real-time notification of updates to collection data. To handle network outages gracefully, MongoDB 3.6 adds Retryable Writes, a new feature ensuring that writes are performed exactly once, even in the face of outages. In addition, MongoDB 3.6 extends its existing document validation capabilities with the introduction of JSON Schema.
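As a rough illustration of two of these features, the sketch below builds an update that uses the filtered positional operator with array filters, and a JSON Schema validator for a collection. The `school` database, the `students` collection, the field names and the connection URI are hypothetical choices made for this example, not taken from the announcement. The server-facing function is only a sketch: it assumes the `pymongo` driver (3.6 or later) and a reachable MongoDB 3.6+ deployment, and it is not called here.

```python
def grade_update_spec(threshold, bonus):
    """Build an update that adds `bonus` to every grade scoring >= threshold.

    "$[g]" is a filtered positional operator; the matching condition for the
    identifier "g" is supplied separately as an array filter.
    """
    update = {"$inc": {"grades.$[g].score": bonus}}
    array_filters = [{"g.score": {"$gte": threshold}}]
    return update, array_filters


def student_validator():
    """Build a $jsonSchema validator document (hypothetical student schema)."""
    return {
        "$jsonSchema": {
            "bsonType": "object",
            "required": ["name", "grades"],
            "properties": {
                "name": {"bsonType": "string"},
                "grades": {
                    "bsonType": "array",
                    "items": {
                        "bsonType": "object",
                        "required": ["score"],
                        "properties": {"score": {"bsonType": "int", "minimum": 0}},
                    },
                },
            },
        }
    }


def apply_to_server(uri="mongodb://localhost:27017/?retryWrites=true"):
    """Not executed here: needs pymongo and a running MongoDB 3.6+ server.

    The URI is an assumption; retryable writes additionally require a
    replica set or sharded cluster.
    """
    from pymongo import MongoClient  # imported lazily so the sketch loads without it

    client = MongoClient(uri)
    db = client["school"]
    students = db.create_collection("students", validator=student_validator())
    update, filters = grade_update_spec(threshold=90, bonus=1)
    return students.update_many({}, update, array_filters=filters)


update, filters = grade_update_spec(threshold=90, bonus=1)
```

Keeping the update and validator as plain dictionaries makes them easy to inspect or unit test before anything is sent to a database.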
“With MongoDB 3.6, schema isn’t a straightjacket, it’s a framework of validation you can tune to exactly the degree you need,” Co-Founder Eliot Horowitz said in the official announcement.

A new 'Grid' blockchain system

Introducing Grid: A scalable Blockchain system for better performance, resource segregation and working governance model

Grid, a new blockchain initiative, proposes to establish a blockchain system that functions as an operating system, similar to Linux. Under its design, Grid will run nodes on clusters. It will assign transactions to different groups based on which transactions are mutually exclusive. Transactions within a group will be processed sequentially, while the groups themselves will be processed in parallel. Grid adopts a Main Chain + N Side Chains architecture, in which each business scenario has a dedicated Side Chain to fulfill its requirements. Segregating resources in this way is intended to increase the system's processing efficiency and avoid congestion. Grid also promises a better governance model by permitting Side Chains to join or exit the Main Chain dynamically based on stakeholder voting, thereby introducing competition and an incentive to improve each Side Chain.

Singapore-based Grid Foundation is promoting Grid’s development and applications, while technical development will be led by Beijing Hoopox Information and Technology Co. Ltd.
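The grouping idea described for Grid, sequential processing inside a group and parallel processing across groups, can be sketched in a few lines of Python. This is an illustrative interpretation only, not Grid's actual implementation: it assumes that two transactions are mutually exclusive when they touch a common account, and the transaction format, account names and helper functions are all hypothetical.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def group_transactions(transactions):
    """Partition transactions so any two that share an account land in one group."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    # Accounts touched by the same transaction belong to the same component.
    for tx in transactions:
        first, *rest = tx["accounts"]
        for other in rest:
            union(first, other)

    # Collect transactions per component, preserving their original order.
    groups = defaultdict(list)
    for tx in transactions:
        groups[find(tx["accounts"][0])].append(tx)
    return list(groups.values())


def run_group(group, balances):
    """Apply one group's transfers strictly in order."""
    for tx in group:
        src, dst = tx["accounts"]
        balances[src] -= tx["amount"]
        balances[dst] += tx["amount"]


def process(transactions, balances):
    """Run groups in parallel; groups touch disjoint accounts, so they do not conflict."""
    groups = group_transactions(transactions)
    with ThreadPoolExecutor() as pool:
        list(pool.map(lambda g: run_group(g, balances), groups))
    return groups


# Hypothetical example data.
transactions = [
    {"id": 1, "accounts": ["alice", "bob"], "amount": 5},
    {"id": 2, "accounts": ["carol", "dave"], "amount": 3},
    {"id": 3, "accounts": ["bob", "erin"], "amount": 2},
]
balances = {name: 10 for name in ["alice", "bob", "carol", "dave", "erin"]}
groups = process(transactions, balances)
```

In this example, transactions 1 and 3 both involve `bob`, so they end up in the same group and run in order, while transaction 2 forms its own group and can run concurrently.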

Dr. Brandon explains 'Transfer Learning' to Jon

Google joins the social coding movement with CoLaboratory

How Reinforcement Learning works

Introducing Google's Tangent: A Python library with a difference

14th Nov.' 17 - Headlines

Introducing Tile : A new machine learning language with auto generating GPU Kernels

Has IBM edged past Google in the battle for Quantum Supremacy?

PINT (Paper IN Two mins) - Making Neural Network Architectures generalize via Recursion

13th Nov.' 17 - Headlines

Getting started with Storm Components for Real Time Analytics

Trending Topics

Week at a Glance (4th - 10th Nov. '17): Top News from Data Science

10th Nov.' 17 - Headlines

How Near Real Time (NRT) Applications work

China’s Baidu launches Duer OS Prometheus Project to accelerate conversational AI

9th Nov.' 17 - Headlines
