Artificial Intelligence | 0 articles | Tech News, Tutorials & Expert Insights

article-image-paper-in-two-minutes-i-revnet-a-deep-invertible-convolutional-network

02 Apr 2018

4 min read

Paper in Two minutes: i-RevNet, a deep invertible convolutional network

02 Apr 2018

The ICLR 2018 accepted paper, i-RevNet: Deep Invertible Networks, introduces i-RevNet, an invertible convolutional network, that does not discard any information about the input while classifying images. This paper is authored by Jörn-Henrik Jacobsen, Arnold W.M. Smeulders, and Edouard Oyallon. The 6th annual ICLR conference is scheduled to happen between April 30 - May 03, 2018. i-RevNet, a deep invertible convolutional network What problem is the paper attempting to solve? A CNN is generally composed of a cascade of linear and nonlinear operators. These operators are very effective in classifying images of all sorts but reveal little information about the contribution of the internal representation to the classification. The learning process of a CNN works by a regular reduction of large amounts of uninformative variability in the images to reveal the essence of the visual class. However, the extent to which information is discarded is lost somewhere in the intermediate nonlinear processing steps. Also, there is a wide belief, that discarding information is essential for learning representations that generalize well to unseen data. The authors of this paper show that discarding information is not necessary and propose to explain this theory with empirical evidence. This paper also provides an understanding of the variability reduction process by proposing an invertible convolutional network. The i-RevNet does not discard any information about the input while classifying images. It has a built-in pseudo-inverse, allowing for easy inversion. It basically uses linear and invertible operators for performing downsampling, instead of non-invertible variants like spatial pooling. Paper summary i-RevNet is an invertible deep network, which builds upon the recently introduced RevNet, where the non-invertible components of the original RevNets are replaced by invertible ones. i-RevNets retain all information about the input signal in any of their intermediate representations up until the last layer. They achieve the same performance on Imagenet compared to similar non-invertible RevNet and ResNet architectures. The above image describes the blocks of an i-RevNet. The strategy implemented by an i-RevNet consists in an alternation between additions, and nonlinear operators, while progressively down-sampling the signal operators. The pair of the final layer is concatenated through a merging operator. Using this architecture, the authors avoid the non-invertible modules of a RevNet (e.g. max-pooling or strides) which are necessary to train them in a reasonable time and are designed to build invariance w.r.t. Translation variability. Their method replaces the non-invertible modules by linear and invertible modules Sj, that can reduce the spatial resolution while maintaining the layer’s size by increasing the number of channels. Key Takeaways This work provides a solid empirical evidence that learning invertible representations does not discard any information about their input on large-scale supervised problems. i-RevNet, the invertible network proposed, is a class of CNN which is fully invertible and permits to exactly recover the input from its last convolutional layer. i-RevNets achieve the same classification accuracy in the classification of complex datasets as illustrated on ILSVRC-2012 when compared to the RevNet and ResNet architectures with a similar number of layers. The inverse network is obtained for free when training an i-RevNet, requiring only minimal adaption to recover inputs from the hidden representations. Reviewer feedback summary Overall Score: 25/30 Average Score: 8.3 Reviewers agreed the paper is a strong contribution, despite some comments about the significance of the result; i.e., why is invertibility a "surprising" property for learnability, in the sense that F(x) = {x, phi(x)}, where phi is a standard CNN satisfies both properties: invertible and linear measurements of F producing good classification. Having said that, the reviews agreed that the paper is well written and easy to follow and considered it to be a great contribution to the ICLR conference.

0
0
7774

article-image-neuroevolution-step-toward-thinking-machine

Amarabha Banerjee

16 Oct 2017

9 min read

Neuroevolution: A step towards the Thinking Machine

Amarabha Banerjee

16 Oct 2017

9 min read

“I propose to consider the question - Can machines think?” - Alan Turing The goal for AI research has always remained the same - create a machine that has human-like decision-making capabilities based on available information. This includes the machine’s ability to analyze and process huge amounts of data and then make a meaningful inference from it. Machine learning, deep learning and other old and new paradigms in AI research are all attempts at imparting complex decision-making capabilities to machines or systems. Alan Turing’s famous test for AI has set the standards over the years for what qualifies as a smart AI i.e. a thinking machine. The imitation game is about an AI/ bot interacting with a human anonymously, in a way that the human can’t decipher the fact that it’s a machine. This not-so-trivial test has seen many adaptations over the years like the modern day Tokyo test. These tests set challenging boundaries that machines must cross to be considered capable of possessing intelligence. Neuroevolution, a few decades old theory, remodeled in a modern day format with the help of Neural and Deep Neural Networks, promises to challenge these boundaries and even break them. With neuroevolution, machines aim to solve complex problems on their own with satisfactory levels of accuracy even though they do not know how to achieve those results. Neuroevolution: The Essence “If a wild animal habitually performs some useless activity, natural selection will favor rival individuals who instead devote time to surviving and reproducing...Ruthless utilitarianism trumps, even if it doesn’t always seem that way.” - Richard Dawkins This is the essence of Neuroevolution. But the process itself is not as simple. Just like the human evolution process, in the beginning, a set of algorithms work on a problem. The algorithms that show an inclination to solve the problem in the right way are selected for the next stage. They then undergo random minor mutations - i.e., small logical changes in the inherent algorithm structure. Next, we check whether these changes enable the algorithms to achieve the same result with better accuracy or efficiency. The successful ones then move to the next stage with further mutations introduced. This is similar to how nature did the sorting for us and humans evolved from a natural need to survive in unfamiliar situations. Since the concept uses Neural Networks, it has come to be known as Neuroevolution. Neuroevolution, in the simplest terms, is the process of “descent with modification” by which machines/systems evolve and get better at solving the problems they were built for. Backpropagation to DNN: The Evolution Neural networks are made up of nodes. These nodes function like neurons in the human brain that receive a set of inputs and generate a response based on the type, intensity, frequency etc of stimuli. A single node looks like the below illustration: An algorithm can be viewed as a node. With backpropagation, the algorithm is modified in an iterative manner - where the error generated after each pass, is fed back to the system. The algorithms (nodes) responsible for higher error contribution are identified and assigned less weight in the next pass. Thus, backpropagation is a way to assign appropriate weights to nodes by calculating error contributions of individual nodes. These nodes, when combined in different layers, form the structure of Deep Neural Networks. Deep Neural networks have separate input and output layers and a middle layer of hidden nodes which form the core of DNN. This hidden layer consists of multiple nodes like the following. In case of DNNs, as before in each iteration, the weight of the nodes are adjusted based on their accuracy. The number of iterations is a factor that varies for each DNN. As explained earlier, the system without any external stimuli continues to improve on its own. Now, where have we seen this before? Of course, this looks a lot like a simplified, miniature version of evolution! Unfit nodes are culled by reducing the weight they have in the overall output, and the ones with favorable results are encouraged, just like the natural selection. However, the only thing that is missing from this is the mutation and the ability to process mutation. This is where we introduce the mutations in the successful algorithms and let them evolve on their own. Backpropagation in DNNs doesn’t change the algorithm or it’s approach, it merely increases or decreases the algorithm’s overall contribution to the desired result. Forcing random mutations of neural and deep neural networks and then letting these mutations take shape as these neural networks together try to solve a given problem seem pretty straightforward. The point where everything starts getting messy is when different layers or neural networks start solving the given problem in their own pre-defined way. One of two things may then happen: The neural networks behave in self-contradiction and stall the overall problem-solving process. The system as such cannot take any decision and becomes dormant. The neural networks are in some sort of agreement regarding a decision. The decision itself might be correct or incorrect. Both scenarios present us with dilemmas - how to restart a stalled process and how to achieve better decision making capability. The solution to both of situations lies in enabling the DNNs to rectify themselves first by choosing the correct algorithms. And then by mutating them with an intention to allow them to evolve and reach a decision toward achieving greater accuracy. Here’s a look at some popular implementations of this idea. Neuroevolution in flesh and blood Cutting edge AI research giants like OpenAI backed by Elon Musk and Google DeepMind have taken the concept of neuroevolution and applied them to a bunch of deep neural networks. Both aim to evolve these algorithms in a way that the smarter ones survive and eventually create better and faster models & systems. Their approaches are however starkly different. The Google implementation Google’s way is simple - It takes a number of algorithms, divides them into groups and assigns one particular task to all. The algorithms that fare better at solving these problems are then chosen for the next stage, much like the reward and punishment system in reinforcement learning. However, the difference here is that the faster algorithms are not just chosen for the next step, but their models and parameters are tweaked slightly - this is our way of introducing a mutation into the successful algorithms. These minor mutations then play out as these modified algorithms try to solve the given problem. Again, the better ones remain and the rest are culled out. This way, the algorithms themselves find a way to perform better and better until they are reasonably close to the desired result. The most important advantage of this process is that the algorithms keep track of their evolution process as they get smarter. A major limitation of Google’s approach is that the time taken for performing these complex computations is too high, hence the result takes time to show. Also, once the mutation kicks in, their behavior is not controlled externally - i.e., quite literally they can go berserk because of the resulting mutation - which means the process can fail even at an advanced stage. The OpenAI implementation Let’s contrast this with OpenAI’s master-worker approach to neuroevolution. OpenAI used a set of nearly 1440 algorithms to play the game of Atari and submit their scores to the master algorithm. Then, the algorithms with better scores were chosen and given a mutation and put back into the same process. In more abstract terms, the OpenAI method looks like this. A set of worker algorithms are given a certain complex problem to solve. The best scores are passed on to the master algorithm. The better algorithms are then mutated and set to perform the same tasks. The scores are again recorded and passed on to the master algorithm. This happens through multiple iterations. The master algorithm progressively eliminates the chance of failure since the master algorithm knows which algorithms to employ when given a certain problem. However, it does not know the road to success as it has access only to the final scores and not how those scores were achieved. The advantage of this approach is that better results are guaranteed, there are no cases of decision conflict and the system stalling. The flip side is that this system only knows its way through the given problem. All this effort to evolve the system to a better one will have to be repeated for a similar but different problem. The process is therefore cumbersome and lengthy. The Future with Neuroevolution Human evolution has taken millions of years to reach where we are today. Evolving AI and enabling them to pass the Turing test, or to further make them smart enough to pass a university entrance exam will require significant improvement from the current crop of AI. Amazon’s Alexa and Apple’s Siri are mere digital assistants. If we want AI driven smart systems with seamless integration of AI into our everyday life, algorithms with evolutionary characteristics are a must. Neuroevolution might hold the secret to inventing smart AIs that can ultimately propel human civilization to greater heights of development and advancement. “It seems probable that once the machine thinking method had started, it would not take long to outstrip our feeble powers...They would be able to converse with each other to sharpen their wits. At some stage, therefore, we should have to expect the machines to take control." - Alan Turing

0
0
7117

article-image-what-you-missed-at-last-weeks-icml-2018-conference

Sugandha Lahoti

18 Jul 2018

6 min read

What you missed at last week’s ICML 2018 conference

Sugandha Lahoti

18 Jul 2018

6 min read

The 35th International Conference on Machine Learning (ICML) 2018, took place on July 10, 2018 - July 15, 2018 in Stockholm, Sweden. ICML is one of the most anticipated conferences for every data scientist and ML practitioner and features some of the best ML researchers who come to talk about their research and discuss new ideas. It won’t be wrong to say that Deep learning and its subsets were the showstopper of this conference with a large number of research papers and AI professionals implementing it in their methods. These included sessions and paper presentations on, Gaussian Processes, -Networks and Relational Learning, Time-Series Analysis, Deep Bayesian Non-parametric Tracking, Generative Models, etc. Also, other deep learning subsets such as Representation Learning, Ranking and Preference Learning, Supervised Learning, Transfer and Multi-Task Learning, etc were heavily featured. The conference consisted of one day of tutorials (July 10), followed by three days of main conference sessions (July 11-13), followed by two days of workshops (July 14-15). Best Talks and Seminars of ICML 2018 ICML 2018 featured two informative talks dealing with the applications of Artificial Intelligence in other domains. Day 1 was inaugurated by an invited talk from Prof. Dawn Song on “AI and Security: Lessons, Challenges and Future Directions’’. She talked about the impact of AI in computer security, differential privacy techniques, and the synergy between AI, computer security, and blockchain. She also gave an overview of challenges and new techniques to enable privacy-preserving machine learning. Day 3 featured an inaugural talk by Max Welling on “Intelligence per Kilowatt hour”, focusing on the connection between physics and AI. According to Max, in the coming future, companies will find it too expensive to run energy absorbing ML tools to power their AI engines, or the heat dissipation in edge devices will be too high to be safe. So the next frontier of AI is going to be finding the most energy efficient combination of hardware and algorithms. There were also two plenary talks. Language to Action: towards Interactive Task Learning with Physical Agents, by Joyce Chai and Building Machines that Learn and Think Like People by Josh Tenenbaum. Best Research Papers of ICML 2018 Among the many interesting research papers that were submitted to the ICML 2018 conference, here are the winners. Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples by Anish Athalye, Nicholas Carlini, and David Wagner was lauded and bestowed with the Best Paper award. The paper identifies obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. They identify the three different types of obfuscated gradients and develop attack techniques to overcome them. Delayed Impact of Fair Machine Learning by Lydia T. Liu, Sarah Dean, Esther Rolf, and Max Simchowitz also got the Best Paper award. This paper examines the circumstances where fairness criteria promotes the long-term well-being of disadvantaged groups, measured in terms of a temporal variable of interest. The paper also introduces a one-step feedback model of decision-making that exposes how decisions change the underlying population over time. Bonus: The Test of Time award Day 4 witnessed Facebook researchers Ronan Collobert and Jason Weston receiving the honorary ‘Test of Time award’ for their 2008 ICML paper, A Unified Architecture for Natural Language Processing: Deep Neural Networks with Multitask Learning. The paper proposed a single convolutional neural network that takes a sentence and outputs it’s language processing predictions. So the network can identify and distinguish part-of-speech tags, chunks, named entity tags, semantic roles, semantically similar words and the likelihood that the sentence makes sense (grammatically and semantically) using a language model. At the time of the paper publishing there was almost no neural networks research in Natural Language Processing. The paper’s use of word embeddings and how they are trained, the use of auxiliary tasks and multitasking, and the use of convolutional neural nets in NLP, really inspired the neural networks of today. For instance, Facebook’s recent machine translation and summarization tool Fairseq uses CNNs for language. AllenNLP’s Elmo learns improved word embeddings via a neural net language model and applies them to a large number of NLP tasks. Featured Tutorials at ICML 2018 The ICML 2018 featured a total of 9 tutorials in sets of 3 each. All the tutorials took place on Day 1. These included: Imitation Learning by Yisong Yue and Hoang M Le where they gave a broad overview of imitation learning techniques and its recent applications. Learning with Temporal Point Processes by Manuel Gomez Rodriguez and Isabel Valera. They talk about temporal point processes in machine learning from basics to advanced concepts such as marks and dynamical systems with jumps. Machine Learning in Automated Mechanism Design for Pricing and Auctions by Nina Balcan, Tuomas Sandholm, and Ellen Vitercik. This tutorial covered automated mechanism design for revenue maximization. Toward Theoretical Understanding of Deep Learning by Sanjeev Arora where he explained about what kind of theory may ultimately arise for deep learning with examples. Defining and Designing Fair Algorithms by Sam Corbett-Davies and Sharad Goel. They illustrated the problems that lie at the foundation of algorithmic fairness, drawing on ideas from machine learning, economics, and legal theory. Understanding your Neighbors: Practical Perspectives From Modern Analysis by Sanjoy Dasgupta and Samory Kpotufe. This tutorial aimed to cover new perspectives on k-NN, and translate new theoretical insights to a broader audience. Variational Bayes and Beyond: Bayesian Inference for Big Data by Tamara Broderick where she covered modern tools for fast, approximate Bayesian inference at scale. Machine Learning for Personalised Health by Danielle Belgrave and Konstantina Palla. This tutorial evaluated the current drivers of machine learning in healthcare and present machine learning strategies for personalised health. Optimization Perspectives on Learning to Control by Benjamin Recht where he showed how to learn models of dynamical systems, how to use data to achieve objectives in a timely fashion, how to balance model specification etc. Workshops at ICML 2018 Day 5 and 6 of the ICML 2018 conference were dedicated entirely for Workshops based on topics ranging from AI in health to AI in computational psychology to Humanizing AI to AI for Wildlife Conservation. Some other workshops included Bridging the Gap between Human and Automated Reasoning Data Science meets Optimization Domain Adaptation for Visual Understanding Eighth International Workshop on Statistical Relational AI Enabling Reproducibility in Machine Learning MLTrain@RML Engineering Multi-Agent Systems Exploration in Reinforcement Learning Federated AI for Robotics Workshop (F-Rob-2018) This is just a brief overview of the ICML conference, where we have handpicked a select few paper presentations and invited talks. You can see the full schedule along with the list of selected research papers at the ICML website. 7 of the best machine learning conferences for the rest of 2018 Microsoft start AI School to teach Machine Learning and Artificial Intelligence Google introduces Machine Learning courses for AI beginners

0
0
6954

article-image-yoshua-bengio-et-al-twin-networks

Savia Lobo

16 Feb 2018

5 min read

Yoshua Bengio et al on Twin Networks

Savia Lobo

16 Feb 2018

5 min read

The paper “Twin Networks: Matching the Future for Sequence Generation”, is written by Yoshua Bengio in collaboration with Dmitriy Serdyuk, Nan Rosemary Ke, Alessandro Sordoni, Adam Trischler, and Chris Pal. This paper proposes a simple technique for encouraging generative RNNs to plan ahead. To achieve this, the authors have presented a simple RNN model, which further has two separate networks--a forward and a backward--that run in opposite directions during training. The main motive to train them in opposite direction is the hypothesis that the states of the forward model should be able to predict the entire future sequence. Yoshua Bengio is a Canadian computer scientist.He is known for his work on artificial neural networks and deep learning. His main research ambition is to understand principles of learning that yield intelligence. Yoshua has been co-organizing the Learning Workshop with Yann Le Cun, with whom he has also created the International Conference on Representation Learning (ICLR), since the year 1999. Yoshua has also organized or co-organized numerous other events, principally the deep learning workshops and symposia at NIPS and ICML since 2007. The article talks about TwinNet i.e. Twin Networks, a method for training RNN architectures to better model the future in its internal state, which are supervised by another RNN modelling the future in reverse order. Twin Networks: Matching the Future for Sequence Generation What problem is the paper attempting to solve? Recurrent Neural Networks (RNNs) are the basis of state-of-art models for generating sequential data such as text and speech and are usually trained by teacher forcing. This corresponds to optimizing one-step ahead prediction. At present, there is no explicit bias toward planning in the training objective, the model may prefer to focus on the most recent tokens instead of capturing subtle long-term dependencies, which could contribute to global coherence. Local correlations are usually stronger than long-term dependencies and thus end up dominating the learning signal. This results obtained from this are, samples from RNNs tend to exhibit local coherence but lack meaningful global structure. Recent efforts to address this problem have involved augmenting RNNs with external memory, with unitary or hierarchical architectures, or with explicit planning mechanisms. Parallel efforts aim to prevent overfitting on strong local correlations by regularizing the states of the network, by applying dropout or penalizing various statistics. To solve this, the paper proposes TwinNet, a simple method for regularizing a recurrent neural network that encourages modeling those aspects of the past that are predictive of the long-term future. Paper summary This paper presents a simple technique which enables generative recurrent neural networks to plan ahead. A backward RNN is trained to generate a given sequence in a reverse order. The states of the forward model are implicitly forced to predict cotemporal states of the backward model. The paper empirically shows that this approach achieves 9% relative improvement for a speech recognition task, and also achieves significant improvement on a COCO caption generation task. Overall, the model is driven by the intuition: (a) The backward hidden states contain a summary of the future of the sequence (b) To predict the future more accurately, the model will have to form a better representation of the past. The paper also demonstrates the success of the TwinNet approach experimentally, through several conditional and unconditional generation tasks that include speech recognition, image captioning, language modelling, and sequential image generation. Key Takeaways The paper introduces a simple method for training generative recurrent networks that regularizes the hidden states of the network to anticipate future states The paper also provides an extensive evaluation of the proposed model on multiple tasks and concludes that it helps training and regularization for conditioned generation (speech recognition, image captioning) and for the unconditioned case (sequential MNIST, language modelling) As a deeper analysis, the paper includes a visualization, which includes the introduced cost and observes that it negatively correlates with the word frequency. Reviewer feedback summary Overall Score: 21/30 Average Score: 7/10 The reviewers stated that the paper presents a novel approach to regularize RNNs and give results on different datasets indicating wide range of application. However, based on our results, they said that further experimentation and extensive hyperparameter search is needed. Overall, the paper is detailed, simple to implement and positive empirical results support the described approach. The reviewers have also pointed out a few limitations which include: Major downside of the approach is the cost in terms of resources. The twin model requires large memory and takes longer to train (~ 2-4 times) while providing little improvement over the baseline. During evaluation we found that the attention twin model gives results like “a woman at table a with cake a”, where it forces the model to look like a sentence from the back side too. This might be the reason for low metric values observed in soft attention twin net model. The effect of twin net as a regularizer can be examined against other regularization strategies for comparison purposes.

0
0
6849

article-image-meet-the-whos-who-of-reinforcement-learning

Fatema Patrawala

12 Jul 2018

7 min read

Meet the who's who of Reinforcement learning

Fatema Patrawala

12 Jul 2018

7 min read

0
0
6838

article-image-what-we-learned-oracle-openworld-2017

Amey Varangaonkar

06 Oct 2017

5 min read

What we learned from Oracle OpenWorld 2017

Amey Varangaonkar

06 Oct 2017

5 min read

“Amazon’s lead is over.” These famous words by the Oracle CTO Larry Ellison in the Oracle OpenWorld 2016 garnered a lot of attention, as Oracle promised their customers an extensive suite of cloud offerings, and offered a closer look at their second generation IaaS data centers. In the recently concluded OpenWorld 2017, Oracle continued on their quest to take on AWS and other major cloud vendors by unveiling a host of cloud-based products and services. Not just that, they have juiced these offerings up with Artificial Intelligence-based features, in line with all the buzz surrounding AI. Key highlights from the Oracle OpenWorld 2017 Autonomous Database Oracle announced a totally automated, self-driving database that would require no human intervention for managing or fine-tuning the database. Using machine learning and AI to eliminate human error, the new database guarantees 99.995% availability. While taking another shot at AWS, Ellison promised in his keynote that customers moving from Amazon’s Redshift to Oracle’s database can expect a 50% cost reduction. Likely to be named as Oracle 18c, this new database is expected to be shipped across the world by December 2017. Oracle Blockchain Cloud Service Oracle joined IBM in the race to dominate the Blockchain space by unveiling its new cloud-based Blockchain service. Built on top of the Hyperledger Fabric project, the service promises to transform the way business is done by offering secure, transparent and efficient transactions. Other enterprise-critical features such as provisioning, monitoring, backup and recovery are also some of the standard features which this service will offer to its customers. “There are not a lot of production-ready capabilities around Blockchain for the enterprise. There [hasn’t been] a fully end-to-end, distributed and secure blockchain as a service,” Amit Zavery, Senior VP at Oracle Cloud. It is also worth remembering that Oracle joined the Hyperledger consortium just two months ago, and the signs of them releasing their own service were there already. Improvements to Business Management Services The new features and enhancements introduced for the business management services were one of the key highlights of the OpenWorld 2017. These features now empower businesses to manage their customers better, and plan for the future with better organization of resources. Some important announcements in this area were: Adding AI capabilities to its cloud services - The Oracle Adaptive Intelligent Apps will now make use of the AI capabilities to improve services for any kind of business Developers can now create their own AI-powered Oracle applications, making use of deep learning Oracle introduced AI-powered chatbots for better customer and employee engagement New features such as enhanced user experience in the Oracle ERP cloud and improved recruiting in the HR cloud services were introduced Key Takeaways from Oracle OpenWorld 2017 With the announcements, Oracle have given a clear signal that they’re to be taken seriously. They’re already buoyed by a strong Q1 result which saw their revenue from cloud platforms hit $1.5 billion, indicating a growth of 51% as compared to Q1 2016, Here are some key takeaways from the OpenWorld 2017, which are underlined by the aforementioned announcements: Oracle undoubtedly see cloud as the future, and have placed a lot of focus on the performance of their cloud platform. They’re betting on the fact that their familiarity with the traditional enterprise workload will help them win a lot more customers - something Amazon cannot claim. Oracle are riding on the AI wave and are trying to make their products as autonomous as possible - to reduce human intervention and human error, to some extent. With enterprises looking to cut costs wherever possible, this could be a smart move to attract more customers. The autonomous database will require Oracle to automatically fine-tune, patch, and upgrade its database, without causing any downtime. It will be interesting to see if the database can live up to its promise of ‘99.995% availability’. Is the role of Oracle DBAs going to be at risk, due to the automation? While it is doubtful that they will be out of jobs, there is bound to be a significant shift in their day to day operations. It is speculated that the DBAs would require to spend less time on the traditional administration tasks such as fine-tuning, patching, upgrading, etc. and instead focus on efficient database design, setting data policies and securing the data. Cybersecurity has been a key theme in Ellison’s keynote and the OpenWorld 2017 in general. As enterprise Blockchain adoption grows, so does the need for a secure, efficient digital transaction system. Oracle seem to have identified this opportunity, and it will be interesting to see how they compete with the likes of IBM and SAP to gain major market share. Oracle’s CEO Mark Hurd has predicted that Oracle can win the cloud wars, overcoming the likes of Amazon, Microsoft and Google. Judging by the announcements in the OpenWorld 2017, it seems like they may have a plan in place to actually pull it off. You can watch highlights from the Oracle OpenWorld 2017 on demand here. Don’t forget to check out our highly popular book Oracle Business Intelligence Enterprise Edition 12c, your one-stop guide to building an effective Oracle BI 12c system.

0
0
6407

article-image-handpicked-weekend-reading-8th-dec-2017

Aarthi Kumaraswamy

09 Dec 2017

2 min read

Handpicked for your weekend Reading – 8th Dec 2017

Aarthi Kumaraswamy

09 Dec 2017

2 min read

While you were away attending NIPS 2017 this week, a lot has been happening around you in the data science and machine learning space. No worries! Here is a brief roundup of the best of what we published on the Datahub this week for your weekend reading. [box type="shadow" align="" class="" width=""]If you would like to share your insights and takeaways from NIPS with our readers on the DataHub, write to us at contributors@packtpub.com.[/box] NIPS 2017 Highlights - Part 1 3 great ways to leverage Structures for Machine Learning problems by Lise Getoor at NIPS 2017 Top Research papers showcased at NIPS 2017 – Part 2 Top Research papers showcased at NIPS 2017 – Part 1 Watch out for more in this area in the coming weeks. Expert in Focus Kate Crawford, Principal Researcher at Microsoft Research and a Distinguished Research Professor at New York University, on 20 lessons on bias in machine learning systems, Keynote at NIPS 2017 3 Things that happened this week in Data Science News PyTorch 0.3.0 releases, ending stochastic functions DeepVariant: Using Artificial Intelligence in Human Genome Sequencing Amazon unveils Sagemaker: An end-to-end machine learning service For a more comprehensive roundup of top news stories this week, check out our weekly news roundup post. Get hands-on with these Tutorials Understanding Streaming Applications in Spark SQL Implementing Linear Regression Analysis with R What are Slowly Changing Dimensions (SCD) and why you need them in your Data Warehouse? Do you agree with these Insights & Opinions? 4 popular algorithms for Distance-based outlier detection Stitch Fix: Full Stack Data Science and other winning strategies Admiring the many faces of Facial Recognition with Deep Learning One Shot Learning: Solution to your low data problem

0
0
5824

article-image-deep-learning-microsoft-cntk

SarvexJatasra

05 Aug 2016

7 min read

Deep Learning with Microsoft CNTK

SarvexJatasra

05 Aug 2016

7 min read

“Deep learning (deep structured learning, hierarchical learning, or deep machine learning) is a branch of machine learning based on a set of algorithms that attempt to model high-level abstractions in data by using multiple processing layers, with complex structures or otherwise, composed of multiple nonlinear transformations.” -Wikipedia High Performance Computing is not a new concept, but only recently the technical advances along with economies of scale have ensured that HPC is accessible to the masses with affordable yet powerful configurations. Anyone interested can buy commodity hardware and start working on Deep Learning, thus bringing a machine learning subset of artificial intelligence out of research labs and into garages. DeepLearning.net is a starting point for more information about Deep Learning. Nvidia ParallelForAll is a nice resource for learning GPU-based Deep Learning (Core Concepts, History and Training, and Sequence Learning). What is CNTK Microsoft Research released its Computational Network Toolkit in January this year. CNTK is a unified deep-learning toolkit that describes neural networks as a series of computational steps via a directed graph. CNTK allows the following models: - Feed Forward Deep Neural Networks (DNN) - Convolutional Neural Networks (CNN) - Recurrent Neural Networks (RNN)/Long Short Term Memory Units (LTSM) - Stochastic Gradient Descent (SGD) Why CNTK? Better Scaling When Microsoft CNTK was released, the stunning feature that it brought was distributed computing, that is, a developer was not limited by the number of GPUs installed on a single machine. This was a significant breakthrough because even the best of machines was limited by4-way SLI, thus limiting the total number of cores to 4 x 3072 = 12288. The configuration of the developer machine put an extra load on the hardware configuration, because this configuration left very little room for upgrades. There is only one motherboard available that supports4-way PCI-E Gen3x16, and there are very few manufacturers who provide good quality 1600W watt power supply to support four Titans. This meant that developers were forced to pay a hefty premium for upgradability in terms of the motherboard and processor, settling for an older generation processor. Distributed computing in High Performance Computing is essential, since it allows scaling out as opposed to scaling up. Developers can build grids with cheaper nodes and the latest processors with a lower hardware cost for an entry barrier. Microsoft Research demonstrated in December 2015 that distributed GPU computing is most efficient in CNTK. In comparison, Google TensorFlow, FAIR Torch, andCaffe did not allow scaling beyond a single machine, and Theano was the worst, as it did not even scale on multiple GPUs on the same machine. Google Research, on April 13, released support for distributed computing. The speed up claimed is 56X for 100 GPUs and 40X for 50 GPUs. The performance deceleration is sharp for any sizable distributed Machine Learning setup. I do not have any comparative performance figures for CNTK, but scaling with GPUs on a single machine for CNTKhad very good numbers. GPU Performance One of the shocking finds with my custom build commodity hardware (2xTitanX) was the TFLOPS achieved under Ubuntu 14.04 LTS and Windows 10. With the fully updated OS and latest drivers from NVIDIA, I got double the number of TFLOPS under Windows than Ubuntu. I would like to rerun the samples with Ubuntu 16.04 LTS, but until then, I have a clear winner in performance with Windows. CNTK works perfectly on Windows, but TensorFlow has a dependency onBazel, which as of now does not build on Windows (Bug#947). Google can look into this and make TensorFlow work on windows, or Ubuntu &Nvidia can achieve the same TFLOPS as Windows. But until that time, architects have two options:to either settle for lower TFLPOS under Ubuntu with TensorFlow, or migrate to CNTK with increased performance. Getting started with CNTK Let’s see how toget started with CNTK. Binary Installation Currently, the CNTK binary installation is the easiest way to get started with CNTK. Just follow the instructions. The only downside is that the currently available binaries are compiled with CUDA 7.0, rather than the latest CUDA 7.5 (released almost a year ago). Codebase Compilation If you want to learn CNTK in detail, and if you are feeling adventurous, you should try compiling CNTK from source. Compile the code base, even when you do not expect to use the generated binary, because the whole compilation process will be a good peek under the hood and enhance your understanding of Deep Learning. The instructions for the Windows installation are available here, whereas the Linux installation instructions are available here. If you want to enable 1-bit Stochastic Gradient Descent (1bit-SGD), you should follow these instructions.1bit-SGD is licensed more restrictively, and you have to understand the differences if you are looking for commercial deployments. Windows Compilation is characterized by older versions of libraries. NvidiaCUDA andcuDNN were recently updated to 7.5, whereas other dependencies such asNvidia CUB, Boost, and OpenCV are still using older versions. Kindly pay extra attention to the versions listed in the documentation to ensure smooth compilation. Nvidia has updated the support for its Nsight to Visual Studio 2015; however, Microsoft CNTK still supportsonly Visual Studio 2013. Samples To test the CNTK installation, here are some really great samples: Simple2d (Feed Forward) Speech / AN4 (Feed Forward &LSTM) Image / MNIST (CNN) Text / PennTreeback (RNN) Alternative Deep Learning Toolkits Theano Theano is possibly the oldest Deep Learning Framework available. The latest release, 0.8,which was released on March 16, enables the much awaited multi-GPU support (there are no indications of distributed computing support though).cuDNNv5 and CNMeM are also supported. A detailed report is available here. Python bindings are available. Caffe Caffe is a deep learning framework primarily oriented towards image processing. Python bindings are available. Google TensorFlow TensorFlow is a deep learning framework written in C++ with Python API bindings. The computation graph is pure Python, making it slower than other frameworks, as demonstrated by benchmarks. Google has been pushing Go for a long time now, and it has even open sourced the language. But when it came to TensorFlow, Python was chosen over Go. There are concerns about Google supporting commercial implementations. FAIR Torch Facebook AI Research (Fair) has release its extension to Torch7. Torch is a scientific computing framework with Lua as its primary language. Lua has certain advantages over Python (lower interpreter overhead, simpler integration with C code), which lend themselves to Torch. Moreover, multi-core using OpenMP directives points to better performance. Leaf Leaf is the latest addition to the machine learning framework. It is based on the Rust programming language (supposed to replace C/C++). Leaf is a framework created by hackers for hackers rather than scientists. Leaf has some nice performance improvements. Conclusion Deep Learning with GPUs is an emerging field, and there is much required to be done to make good products out of machine learning. So every product needs to evaluate all of the possible alternatives (programming language, operating system, drivers, libraries, and frameworks) available for specific use-cases. Currently, there is no one-size-fits-all approach available. About the author SarvexJatasra is a Technology Aficionado, exploring ways to apply technology to make lives easier. He is currently working as the Chief Technology Officer at 8Minutes. When not in touch with technology, he is involved in physical activities such as swimming and cycling.

0
0
5665

article-image-6-ways-ai-transforming-web-development

Sugandha Lahoti

19 Dec 2017

8 min read

6 ways you can give an Artificially Intelligent Makeover to the Web Development

Sugandha Lahoti

19 Dec 2017

8 min read

The web is an ever changing world! Users are always seeking personalized content and richer experiences - websites which can do whatever users want, however they want, and exactly as they want. In other words, end-users expect smarter applications with self-learning capabilities and hyper-customized user experiences. Now this poses a major challenge for developers - How do they design websites that deliver new and personalized content every time? Traditional approaches for web development are not the answer. They can, in fact, be problematic. Why you ask? Here are some clues. Building basic layouts and designing the website alone takes time. Forget customizing for dynamic content. Web app testing is a tedious, time-intensive process prone to errors. Even mundane web development decisions depend on the developer, slowing down the time taken to go live. Automating the web development process, starting with the more structured, repetitive and clearly defined tasks, can help developers pay less attention to cumbersome details and focus on the more value adding aspects of development such as formulating design strategy, planning the ultimate user experience and other such activities. Artificial Intelligence can help not just with this kind of intelligent automation, but also help do a lot more - from assisting with design conceptualization, website implementation to web analytics. This human-machine collaboration have the potential to transform the web as we know it. How AI can improve web development Through the lens of a typical software development lifecycle, let’s look at some ways in which AI is transforming web development. 1. Automating complex requirement gathering and analysis Using an AI powered chatbot or voice assistant, for instance, one can automate the process of collecting client requirements and end-user stories without human intervention. It can also prepare a detailed description of the gathered data and use data extraction tools to generate insights that then drive the web design and development strategy. This is possible through a carefully constructed system that employs NLP, ML, computer vision and image recognition algorithms and tools amongst others. Kore.ai is one such platform which empowers decision makers with insights they need to drive business outcomes within data-driven analytics. 2. Simplifying web designing with AI virtual assistants Designing basic layouts and templates of web pages is a tedious job for all developers. AI tools such as virtual assistants like The Grid’s Molly can help here by simplifying the design process. By prompting questions to the user (in this case the web owner or even the developer), and extracting content from their answers, AI assistants can create personalized content with the exact combination of branding, layout, design, and content required by that user. Take Adobe Sensei for instance - it can automatically analyze inputs and recommend design elements to the user. These range from automating basic photo editing skills such as cropping using image recognition techniques to creating elements in images which didn’t exist before by studying the neighbouring pixels. Developers now need only to focus on training a machine to think and perform like a designer. 3. Redefining web programming with self-learning algorithms AI can also facilitate programming. AI programs can perform basic tasks like updating and adding records to a database, predict which bits of code are most likely to be used to solve a problem, and then utilize those predictions to prompt developers to adopt a particular solution. An interesting example is Pix2Code that aims at automating front-end development. Infact, AI algorithms can also be utilized to create self-modifying codes right from scratch! Think of it as a fully-functional piece of code without human involvement. Developers can therefore build smarter apps and bots using AI tech at much faster rates than before. They would however need to train these machines and feed them with good datasets to start with. The smarter the design and more comprehensive the training, the better the results these systems produce. This is where the developers’ skills make a crucial difference. 4. Putting testing and quality assurance on auto-pilot AI algorithms can help an application test itself, with little or no human input. They can predict the key parameters of software testing processes based on historical data. They can also detect failure patterns and amplify failure prediction at a much higher efficiency than traditional QA approaches. Thus, bug identification and remedial will no longer be a long and slow process. As we speak, Microsoft is readying the release of an AI Bug Finder for developers, Microsoft Security Risk Detection. In this new AI powered QA environment, developers can discover more effective ways of testing, identify outliers faster, and work on effective code coverage techniques all with no or basic testing experience. In simple terms, developers can just focus on perfecting the build while the AI handles complex test cases and resultant bugs automatically. 5. Harnessing web analytics for SEO with AI SEO strategies rely heavily on number crunching. Many web analytics tools are good, however their potential is currently limited by the processing capabilities of humans who interpret that data for their websites. With AI backed data mining and analytics, one can maximise the usefulness of a website’s meta-data and other user generated data and metadata. The Predictive Engines built using AI technologies can generate insights which can point the developers to irregularities in their website architecture or highlight ill-fit content from an SEO point of view. Using such insights, AI can list out better ways to design websites and develop web content that connect with the target audience. Market Brew is one such Artificially Intelligent SEO platform which uses AI to help developers react and plan the content for their websites in ways that search engines might perceive them. 6. Providing superior end-user experience with chatbots AI powered chatbots can take customer support and interaction to the next level. A simple rule based chatbot responds to only specific preset commands. An AI infused chatbot, on the other hand, can simulate an actual conversation by learning something new from every conversation and tailoring answers and actions accordingly. They can automate routine tasks and provide relevant information and services. Imagine the possibilities here - these chatbots can enhance visitor engagement by responding to queries, to comments on blog posts, and provide real-time assistance and personalization. eBay’s AI-powered ShopBot is one such chatbot built on facebook messenger which can help consumers narrow down best deals from eBay from their entire listings and answer customer-centric queries. Skill up Developers! With the rise of AI, it is clear that developers will play a different role from the traditional role of a programmer-cum-designer. Developers will need to adapt their technical skillsets to rise above and complement the web development work that AI is capable of taking over. For example, they will now need to focus more on training AI algorithms to learn web and mobile usage patterns for better recommendations. A data-driven approach is now required with more focus on curating the data and taking the software through the process of learning by itself, and writing scripts to interact with the software. To do these, web developers need to get upto speed with the basics of machine learning, natural language processing, deep learning, etc and apply the tools and techniques related to AI into their web development workflow. The Future Perfect Artificial Intelligence has found its way into everything imaginable. Within web development this translates to automated web design, intelligent application development, highly proficient recommendation engines and many more. Today the use of AI in web development is a nice to have ammunition in an web developer’s toolkit. Soon, AI will make itself indispensable to the web development ecosystem ushering in the intelligent web. Developers will be a hybrid of designers, programmers and ML engineers who have a good grasp of user experience, are comfortable thinking in abstracts and algorithms and equally well versed with translating them into AI assisted elegant code . The next milestone for AI in web development is building self-improving apps which can think beyond the confines of human thought. Such apps would have the ability to perceive connections between data points that have not been previously considered, or are out of the reach of human intelligence. The ultimate goal of such machines, on the web and otherwise, would be to gain clairvoyance on aspects humans are missing or are oblivious to. Here’s hoping that when such a revolutionary AI hits the market, it impacts society for the greater good.

0
0
5293

article-image-decoding-chief-robotics-officer-cro

Aaron Lazar

20 Dec 2017

7 min read

Decoding the Chief Robotics Officer (CRO) Role

Aaron Lazar

20 Dec 2017

7 min read

The world is moving swiftly towards an automated culture and this means more and more machines will enter the fray. There’ve been umpteen debates on whether this is a good or bad thing - talks of how fortune tellers might be overtaken by Artificial Intelligence, etc. From the progress mankind has made, benefiting from these machines, we can safely say for now, that it’s only been a boon. With this explosion of moving and thinking metal, there’s a strong need for some governance at a top level. It now looks like we need to shift a bit to make more room at the C-level table, cos we’ve got the Master of the Machines arriving! Well “Master of the Machines” does sound cool, although not many companies would appreciate the “professionalism” of the term. The point is, the rise of a brand new C-level role, the Chief Robotics Officer, seems just on the horizon. Like we did in the Chief Data Officer article, we’re going to tell you more about this role and its accompanying responsibilities and by the end, you’ll be able to understand whether your organisation needs a CRO. Facts and Figures As far as I can remember, one of the first Chief Robotics Officers (CRO), was John Connor (#challengeme). Jokes apart, the role was introduced at the Chief Robotics Officer (CRO) Summit after having been spoken quite a lot in 2015. You’ve probably heard about this role by another name - Chief Autonomy Officer. Gartner predicts that 10% of large enterprises in supply-chain industries will have created a CRO position by 2020. Cisco states that as many as 60% of industries like logistics, healthcare, manufacturing, energy, etc,will have a CRO by 2025. The next generation of AI and Robots will affect workforce, business models, operations and competitive position of leading organisations. Therefore, it’s not surprising that the Boston Consulting Group projects that the market for robots will reach $67 billion by 2025. Why all the fuss? It’s quite evident that robots and smart machines will soon take over/redefine the way a lot of jobs are currently being performed by humans. This means that robots will be working alongside humans and as such there’s a need for the development of principles, processes, and disciplines that govern or manage this collaboration. According to Myria research, “The CROs (and their teams) will be at the forefront of technology, to translate technology fluency into clear business advantages, and to maintain Robotics and Intelligent Operational Systems (RIOS) capabilities that are intimately linked to customer-facing activities, and ultimately, to company performance”. With companies like Amazon, Adidas, Crowne Plaza Hotels and Walmart already deploying robots worth millions in research, to move in the direction of automation, there is clearly a need for a CRO. What might the Chief Robotics Officer’s responsibilities be? If you search for job listings of the role, you probably won’t succeed because the role is still in the making and there are no properly defined responsibilities. Although, if we were to think of what the CRO’s responsibilities might me, here’s what we could expect: Piece the Puzzle together: CROs will be responsible for bringing business functions like Engineering, HR, IT, and others together, implementing and maintaining automation technologies within the technical, social and financial contexts of a company. Manage the Robotics Life Cycle: The CRO would be responsible for defining and managing different aspects of the robotics life cycle. They will need to identify ways and means to improve the way robots function and boost productivity. Code of Conduct: CROs will need to design and develop the principles, processes and disciplines that will manage robots and smart machines, to enable them to collaborate seamlessly with a human workforce. Define Integration: CROs will define robotic environments and integration touch points with other business functions such as supply chain, manufacturing, agriculture, etc. Brand Guardians: With the addition of non-humans in the workforce, CROs will be responsible for the brand health and any violations caused by their robots. Define Management Techniques: CROs will bridge the gap between the machines and humans and will develop techniques that humans can use to manage robotic workers. On a broader level, these responsibilities look quite similar to those of a Chief Information Officer, Chief Digital Information Officer or even the Director of IT. Key CRO Skills Well, with the robots in place people management skills would be lesser required, or not. You might think that a CRO is expected to possess only technical skills because of the nature of the job. Although, they still will have to interact with humans and manage their collaboration with the machines. This brings in the challenge of managing change. Not everyone is comfortable working with machines and a certain amount of understanding and skill will need to be developed. With Brand Management and other strategic goals involved, the CRO must be on their toes moulding the technological side of the business to achieve short and long term goals. IT Managers, those in charge of automation and Directors who are skilled in Robotics, will be interested in scaling up to the position. On another note, there might be over 35% vacant robotics jobs by 2020, owing to the rapid growth of the field. Futuristic Challenges Some of the major challenges we expect to see could be managing change and an environment where humans and bots work together. The European Union has been thinking of considering robots as “electronic persons” with rights in the near future. This will result in complications about who is right and who is wrong. Moreover, there are plans about rewarding and penalising bots, based on their performance. How do you penalize a bot? Maybe penalising would come in the form of not charging the bot for a few days or formatting it’s memory, if it’s been naughty! Or rewards could be in the form of a software update or debugging it more frequently? These probably sound silly at the moment, but you never know what the future might have in store. The Million $ Question: Do we need a CRO? So far, there haven’t been any companies that have publicly announced about hiring a CRO, although many manufacturing companies already have senior roles related to robotics, such as Vice President of Advanced Automation and Robotics, or Vice President of Advanced Engineering. However, these roles are purely technical and not strategic. It’s clear that there needs to be someone at the high table calling the shots and strategies for a collaborative future, and world where robots and machines will work in harmony. Remy Glaisner of Myria Research predicts that the CROs will occupy a strategic role on a par with CIOs within the next five to eight years. CIOs might even get replaced by CROs in the long run. You never know, in the future the CRO might work with a bot themselves - the bot helping in taking decisions at an organisation/strategic level. The sky's the limit! In the end, small, medium or even a large sized businesses that are already planning to hire a CRO to drive automation, are on the right track. A careful evaluation of the benefits of having one in your organisation to lead your strategy, will help you decide on whether to take the CRO path or not. With automation bound to increase in importance in a coming years, it looks as though strategic representation will be inevitable for people with skills in the field.

0
0
5286

article-image-pieter-abbeel-et-al-improve-exploratory-behaviour-deep-rl-algorithms

Sugandha Lahoti

14 Feb 2018

4 min read

Pieter Abbeel et al on how to improve the exploratory behaviour of Deep Reinforcement Learning algorithms

Sugandha Lahoti

14 Feb 2018

4 min read

The paper, Parameter space noise for exploration proposes parameter space noise as an efficient solution for exploration, a big problem for deep reinforcement learning. This paper is authored by Pieter Abbeel, Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, and Marcin Andrychowicz. Pieter Abbeel is currently a professor at UC Berkeley since 2008. He was also a Research Scientist at OpenAI (2016-2017). Pieter is one of the pioneers of deep reinforcement learning for robotics, including learning locomotion and visuomotor skills. His current research focuses on robotics and machine learning with particular focus on deep reinforcement learning, meta-learning, and AI safety. Deep reinforcement learning is the combination of deep learning with reinforcement learning to create artificial agents to achieve human-level performance across many challenging domains. This article will talk about one of Pieter’s top accepted research papers in the field of deep reinforcement learning at the 6th annual ICLR conference scheduled to happen between April 30 - May 03, 2018. Improving the exploratory behavior of Deep RL algorithms with Parameter Space Noise What problem is the paper attempting to solve? This paper is about the exploration challenge in deep reinforcement learning (RL) algorithms. The main purpose of exploration is to ensure that the agent’s behavior does not converge prematurely to a local optimum. Enabling efficient and effective exploration is difficult since it is not directed by the reward function of the underlying Markov decision process (MDP). A large number of methods have been proposed to tackle this challenge in high-dimensional and/or continuous-action MDPs. These methods increase the exploratory nature of these algorithms through the addition of temporally-correlated noise or through the addition of parameter noise. The main limitation of these methods is that they are either only proposed and evaluated for the on-policy setting with relatively small and shallow function approximators or disregard all temporal structure and gradient information. Paper summary This paper proposes adding noise to the parameters (parameter space noise) of a deep network when taking actions in deep reinforcement learning to encourage exploration. The effectiveness of this approach is demonstrated through empirical analysis across a variety of reinforcement learning tasks (i.e.DQN, DDPG, and TRPO). It answers the following questions: Do existing state-of-the-art RL algorithms benefit from incorporating parameter space noise? Does parameter space noise aid in exploring sparse reward environments more effectively? How does parameter space noise exploration compare against evolution strategies for deep policies with respect to sample efficiency? Key Takeaways The paper describes a method which proves parameter space noise as a conceptually simple yet effective replacement for traditional action space noise like -greedy and additive Gaussian noise. This work shows that parameter perturbations can successfully be combined with contemporary on- and off-policy deep RL algorithms such as DQN, DDPG, and TRPO and often results in improved performance compared to action noise. The paper attempts to prove with experiments that using parameter noise allows solving environments with very sparse rewards, in which action noise is unlikely to succeed. Parameter space noise is a viable and interesting alternative to action space noise, which is still the effective standard in most reinforcement learning applications. Reviewer feedback summary Overall Score: 20/30 Average Score: 6.66 The reviewers were pleased with the paper. They termed it as a simple strategy for exploration that is effective empirically. The paper was found to be clear and well written with thorough experiments across deep RL domains. The authors have also released open-source code along with their paper for reproducibility, which was appreciated by the reviewers. However, a common trend among the reviews was that the authors overstated their claims and contributions. The reviewers called out some statements in particular (e.g. the discussion of ES and RL). They also felt that the paper lacked a strong justification for the method other than it being empirically effective and intuitive.

0
0
5225

Akram Hussain

18 Mar 2015

4 min read

Is 2015 the Year of Deep Learning?

Akram Hussain

18 Mar 2015

4 min read

The new phenomenon to hit the world of ‘Big Data’ seems to be ‘Deep Learning’. I’ve read many articles and papers where people question whether there’s a future for it, or if it’s just a buzzword that will die out like many a term before it. Likewise I have seen people who are genuinely excited and truly believe it is the future of Artificial intelligence; the one solution that can greatly improve the accuracy of our data and development of systems. Deep learning is currently a very active research area, by no means is it established as an industry standard, but rather one which is picking up pace and brings a strong promise of being a game changer when dealing with raw, unstructured data. So what is Deep Learning? Deep learning is a concept conceived from machine learning. In very simple terms, we think of machine learning as a method of teaching machines (using complex algorithms to form neural networks) to make improved predictions of outcomes based on patterns and behaviour from initial data sets. The concept goes a step further however. The idea is based around a set of techniques used to train machines (Neural Networks) in processing information that can generate levels of accuracy nearly equivalent to that of a human eye. Deep learning is currently one of the best providers of solutions regarding problems in image recognition, speech recognition, object recognition and natural language processing. There are a growing number of libraries that are available, in a wide range of different languages (Python, R, Java) and frameworks such as: Caffe,Theanodarch, H20, Deeplearning4j, DeepDist etc. How does Deep Learning work? The central idea is around ‘Deep Neural Networks’. Deep Neural Networks take traditional neural networks (or artificial neural networks) and build them on top of one another to form layers that are represented in a hierarchy. Deep learning allows each layer in the hierarchy to learn more about the qualities of the initial data. To put this in perspective; the output of data in level one is then the input of data in level 2. The same process of filtering is used a number of times until the level of accuracy allows the machine to identify its goal as accurately as possible. It’s essentially a repeat process that keeps refining the initial dataset. Here is a simple example of Deep learning. Imagine a face, we as humans are very good at making sense of what our eyes show us, all the while doing it without even realising. We can easily make out ones: face shape, eyes, ears, nose, mouth etc. We take this for granted and don’t fully appreciate how difficult (and complex) it can get whilst writing programs for machines to do what comes naturally to us. The difficulty for machines in this case is pattern recognition - identifying edges, shapes, objects etc. The aim is to develop these ‘deep neural networks’ by increasing and improving the number of layers - training each network to learn more about the data to the point where (in our example) it’s equal to human accuracy. What is the future of Deep Learning? Deep learning seems to have a bright future for sure, not that it is a new concept, I would actually argue it’s now practical rather than theoretical. We can expect to see the development of new tools, libraries and platforms, even improvements on current technologies such as Hadoop to accommodate the growth of Deep Learning. However it may not be all smooth sailing. It is still by far very difficult and time consuming task to understand, especially when trying to optimise networks as datasets grow larger and larger, surely they will be prone to errors? Additionally, the hierarchy of networks formed would surely have to be scaled for larger complex and data intensive AI problems. Nonetheless, the popularity around Deep learning has seen large organisations invest heavily, such as: Yahoo, Facebook, Googles acquisition of Deepmind for $400 million and Twitter’s purchase of Madbits. They are just few of the high profile investments amongst many. 2015 really does seem like the year Deep learning will show its true potential. Prepare for the advent of deep learning by ensuring you know all there is to know about machine learning with our article. Read 'How to do Machine Learning with Python' now. Discover more Machine Learning tutorials and content on our dedicated page. Find it here.

0
0
5016

article-image-intent-based-networking-systems-ibns-machine-learning-sprinkles-intelligence-in-networks

Sugandha Lahoti

27 Nov 2017

6 min read

Intent-based Networking Systems (IBNS): Machine Learning sprinkles intelligence in networks

Sugandha Lahoti

27 Nov 2017

6 min read

Machine Learning is entangling the world in the web of smart, intelligent and automated algorithms. The next bug caught in the mesh is Networking. The paradigm of network management is all set to take a pretty big shift with machine learning sprinkled networks. Termed as Intent-based network, it is a network management system focussed on improving network availability and agility through means of automation. Exploring an Intent-based Network Essentially, an Intent-based networking software (IBNS), like any other networking solution helps to plan, plot, execute, and operate networks. A distinguishable fact is, that the intent based network can translate the “intent” of a user. The user may be the end-user or the network administrator. The “intent” may be through a series of commands or through APIs. How does it work?— a network admin describes the state of a network, or the policy they want a network to follow, the IBNS validates whether such a request is feasible and then executes it. The second differentiating feature of an IBNS is the ability to perform automated implementation of the network state and policies described above. They can also control the security level to be applied to applications based on user role, time, and device conditions. Additionally, the vLANs, mandate firewalls and other network technologies required within the network can be automated in an IBNS. An IBNS can monitor a network in real-time and take necessary actions to assess and improve the network. The ultimate goal of an IBNS is, to ensure that the desired condition of a network is maintained while making sure an automated reparative action is taken in case of any discrepancy. Why networks need machine learning to create intent? Traditional Networks cannot be integrated into networking environments after a certain number of layers. They require manual integration which can be a daunting process demanding advanced skill-set. Traditional network systems are also prone to failover and capacity augment as they involve managing individual devices. This resulted in developing a new set of centrally globalized network system, controlled from one place that permeates through the network. Machine learning was considered a possible contender for aiding the development of such networks and hence IBNS came to be. The idea of IBNS was proposed long ago, but requirements for developing such a system were not available. In the recent times, advances in ML have made it possible to develop such a system which can automate tasks. By automation we mean, automating the entire network policy when the network administrator simply defines the desired state of the network. With the presence of ML, IBNS can do real time analysis of network conditions, which involves monitoring, identifying and reacting to real time data using algorithmic validations. Why IBNS could be the future of networking? Central control system The intent based network uses a central, cloud based control system to program the routes to be taken by the network elements to deliver required information. The control system can also reprogram them in case of a network path failure. Automation of network elements Intent Based networks are all about reducing human efforts. A network is built on millions of connected devices. Managing a huge number of devices will require a lot of manpower. Also, the current networking system will have difficulty in managing enormous amount of data exploding from IoT devices. With an IBNS, human intervention required to build these network elements across services and circuits is reduced by a considerable amount. IBNS provides an automated network management system which mathematically validates the desired state of networks as specified by network administrators. Real-time decision making The network systems continuously gather information about the network architecture while adapting themselves to the desired condition. If a network is congested at a particular time frame, it can modify the route. For example, the GPS system can re-route a driver if he/she encounters some sort of traffic jam or a closed lane. IBNS can also gather information about the current status of the network by consistent ingestion of real-time state pointers in the networking system. The real time data, so gathered can in turn make it smarter and more predictive. Business benefits gained Intent-based software will open better opportunities for IT employees. Due to elimination of mundane operational tasks, IT admins can increase their productivity levels while improving their skill set. For an organization, it would mean lower cost of operation, reduced fallacy, better reliability & security, resource management, increased optimization, and multi-vendor device management. What do we see around the globe in the IBNS domain? The rising potential of IBNS is giving startups and MNCs higher opportunities to invest in this technology. Apstra have launched their Apstra Operating system which provides deployment of an IBN system to a variety of customers including Cisco, Arista, Juniper, and other white box users. The solution focuses on a single data center. It prevents and repairs network outages to improve infrastructure agility. The AOS also provides network admins full transparency to intent with the ability to ask any question about the network, amend any aspect of the intent after deployment, while allowing complete freedom for users to perform customizations of the AOS. Anuta networks, the pioneer in Network service orchestration have announced their award winning NCX intent-based orchestration platform to automate network services, which has enabled DevOps for networking. They provide more than 35 industry leading vendors with orchestrated devices. They also own detailed and exhaustive REST API for network integration. One of the top pioneers in the networking domain, Cisco has launched itself in the intent-based networking space with the solution—Software Defined Access (SDA). The software enables easy to perform operational tasks and build powerful networks by automating the network. It also boasts of features like: Multi-site management Improved visibility with simple topological views Integration with Kubernetes Zero -trust security With tech-innovation taking place all around, networking organizations are racing towards breaking the clutter by bringing their intent based networks to the networking space. According to Gartner analyst, Andrew Lerner, “Intent-based networking is nascent, but could be the next big thing in networking, as it promises to improve network availability and agility, which are key, as organizations transition to digital business. I&O leaders responsible for networking need to determine if and when to pilot this technology.”

0
0
4340

article-image-tensorflow-next-gen-machine-learning

Ariel Scarpinelli

01 Jun 2016

7 min read

Tensorflow: Next Gen Machine Learning

Ariel Scarpinelli

01 Jun 2016

7 min read

Last November, Google open sourced its shiny Machine Intelligence package, promising a simpler way to develop deep learning algorithms that can be deployed anywhere, from your phone to a big cluster without a hassle. They even take advantage of running over GPUs for better performance. Let's Give It a Shot! First things first, let's install it: # Ubuntu/Linux 64-bit, CPU only (GPU enabled version requires more deps): $ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/linux/cpu/tensorflow-0.7.1-cp27-none-linux_x86_64.whl # Mac OS X, CPU only:$ sudo easy_install --upgrade six$ sudo pip install --upgrade https://storage.googleapis.com/tensorflow/mac/tensorflow-0.7.1-cp27-none-any.whl We are going to play with the old-known iris dataset, where we will train a neural network to take dimensions of the sepals and petals of an iris plant and classify it between three different types of iris plants: Iris setosa, Iris versicolour, and Iris virginica. You can download the training CSV dataset from here. Reading the Training Data Because TensorFlow is prepared for cluster-sized data, it allows you to define an input by feeding it with a queue of filenames to process (think of MapReduce output shards). In our simple case, we are going to just hardcode the path to our only file: import tensorflow as tf def inputs(): filename_queue = tf.train.string_input_producer(["iris.data"]) We then need to set up the Reader, which will work with the file contents. In our case, it's a TextLineReader that will produce a tensor for each line of text in the dataset: reader = tf.TextLineReader()key, value = reader.read(filename_queue) Then we are going to parse each line into the feature tensor of each sample in the dataset, specifying the data types (in our case, they are all floats except the iris class, which is a string). # decode_csv will convert a Tensor from type string (the text line) in # a tuple of tensor columns with the specified defaults, which also # sets the data type for each column sepal_length, sepal_width, petal_length, petal_width, label = tf.decode_csv(value, record_defaults=[[0.0], [0.0], [0.0], [0.0], [""]]) # we could work with each column separately if we want; but here we # simply want to process a single feature vector containing all the # data for each sample. features = tf.pack([sepal_length, sepal_width, petal_length, petal_width]) Finally, in our data file, the samples are actually sorted by iris type. This would lead to bad performance of the model and make it inconvenient for splitting between training and evaluation sets, so we are going to shuffle the data before returning it by using a tensor queue designed for it. All the buffering parameters can be set to 1500 because that is the exact number of samples in the data, so will store it completely in memory. The batch size will also set the number of rows we pack in a single tensor for applying operations in parallel: return tf.train.shuffle_batch([features, label], batch_size=100, capacity=1500, min_after_dequeue=100) Converting the Data Our label field on the training dataset is a string that holds the three possible values of the Iris class. To make it friendly with the neural network output, we need to convert this data to a three-column vector, one for each class, where the value should be 1 (100% probability) when the sample belongs to that class. This is a typical transformation you may need to do with input data. def string_label_as_probability_tensor(label): is_setosa = tf.equal(label, ["Iris-setosa"]) is_versicolor = tf.equal(label, ["Iris-versicolor"]) is_virginica = tf.equal(label, ["Iris-virginica"]) return tf.to_float(tf.pack([is_setosa, is_versicolor, is_virginica])) The Inference Model (Where the Magic Happens) We are going to use a single neuron network with a Softmax activation function. The variables (learned parameters of our model) will only be the matrix weights applied to the different features for each sample of input data. # model: inferred_label = softmax(Wx + b) # where x is the features vector of each data example W = tf.Variable(tf.zeros([4, 3])) b = tf.Variable(tf.zeros([3])) def inference(features): # we need x as a single column matrix for the multiplication x = tf.reshape(features, [1, 4]) inferred_label = tf.nn.softmax(tf.matmul(x, W) + b) return inferred_label Notice that we left the model parameters as variables outside of the scope of the function. That is because we want to use those same variables both while training and when evaluating and using the model. Training the Model We train the model using backpropagation, trying to minimize cross entropy, which is the usual way to train a Softmax network. At a high level, this means that for each data sample, we compare the output of the inference with the real value and calculate the error (how far we are). Then we use the error value to adjust the learning parameters in a way that minimizes that error. We also have to set the learning factor; it means for each sample, how much of the computed error we will apply to correct the parameters. There has to be a balance between the learning factor, the number of learning loop cycles, and the number of samples we pack tighter in the same tensor in batch; the bigger the batch, the smaller the factor and the higher the number of cycles. def train(features, tensor_label): inferred_label = inference(features) cross_entropy = -tf.reduce_sum(tensor_label*tf.log(inferred_label)) train_step = tf.train.GradientDescentOptimizer(0.001) .minimize(cross_entropy) return train_step Evaluating the Model We are going to evaluate our model using accuracy, which is the ratio of cases where our network identifies the right iris class over the total evaluation samples. def evaluate(evaluation_features, evaluation_labels): inferred_label = inference(evaluation_features) correct_prediction = tf.equal(tf.argmax(inferred_label, 1), tf.argmax(evaluation_labels, 1)) accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32)) return accuracy Running the Model We are only left to connect our graph and run it in a session, where the defined operations are actually going to use the data. We also split our input data between training and evaluation around 70%:30%, and run a training loop with it 1,000 times. features, label = inputs() tensor_label = string_label_as_probability_tensor(label) train_step = train(features[0:69, 0:4], tensor_label[0:69, 0:3]) evaluate_step = evaluate(features[70:99, 0:4], tensor_label[70:99, 0:3]) with tf.Session() as sess: sess.run(tf.initialize_all_variables()) # Start populating the filename queue. coord = tf.train.Coordinator() threads = tf.train.start_queue_runners(coord=coord) for i in range(1000): sess.run(train_step) print sess.run(evaluate_step) # should print 0 => setosa print sess.run(tf.argmax(inference([[5.0, 3.6, 1.4, 0.2]]), 1)) # should be 1 => versicolor print sess.run(tf.argmax(inference([[5.5, 2.4, 3.8, 1.1]]), 1)) # should be 2 => virginica print sess.run(tf.argmax(inference([[6.9, 3.1, 5.1, 2.3]]), 1)) coord.request_stop() coord.join(threads) sess.closs() If you run this, it should print an accuracy value close to 1. This means our network correctly classifies the samples in almost 100% of the cases, and also we are providing the right answers for the manual samples to the model. Conclusion Our example was very simple, but TensorFlow actually allows you to do much more complicated things with similar ease, such as working with voice recognition and computer vision. It may not look much different than using any other deep learning or math packages, but the key is the ability to run the expressed model in parallel. Google is willing to create a mainstream DSL to express data algorithms focused on machine learning, and they may succeed in doing so. For instance, although Google has not yet open sourced the distributed version of the engine, a tool capable of running Tensorflow-modeled graphs directly over an Apache Spark cluster was just presented at the Spark Summit, which shows that the community is interested in expanding its usage. About the author Ariel Scarpinelli is a senior Java developer in VirtualMind and is a passionate developer with more than 15 years of professional experience. He can be found on Twitter at @ triforcexp.

0
0
4319

article-image-ibm-machine-learning-what-it-and-what-does-it-mean-data-scientists

Graham Annett

11 Apr 2017

5 min read

IBM Machine Learning: What Is It, and What Does It Mean For Data Scientists?

Graham Annett

11 Apr 2017

5 min read

Like many cloud providers, IBM is invested and interested in developing and building out their machine learning offerings. Because more and more companies are interested in trying to figure out what machine learning can currently do and what it is on the cusp of doing, there has been a plethora of new services and companies trying to see if they can either do anything new or be better than the many other competitors in the market. While much of the machine learning world is based around cloud computing and the ability to horizontally scale during training, the reality is that not every company is cloud based. While IBM has had many machine learning tools and services available on their various cloud platforms, the IBM Machine Learning system seems to be just an on-premise compromise to many of the previously cloud-based APIs that IBM offered under their machine learning and Watson brand. While I am sure many enterprise-level companies have a large amount of data that would previously not have been able to use these services, I’m unaware whether and skeptical that these services would be any better or more utilitarian than their cloud counterparts. This seems like it may be particularly useful to companies in industries with many regulations or worries about hosting data outside of their private network, although this mentality seems like it is being slowly eroded away and becoming outdated in many senses. One of the great things about the IBM Machine Learning system (and many similar companies as well) are the APIs that allow developers to pick whatever language they would like and allowing multiple frameworks because there are so many available and interesting options at the moment. This is a really important aspect for something like deep learning, where there is a constant barrage of new architectures and ideas that iterate upon prior ideas, but require new architecture and developer implementations. While I have not used IBM’s new system and will most likely not be in any particular situation where it would be available, I was lucky enough to participate in IBM’s Catalyst program and used a lot of their cloud services for a deep learning project and testing out many of their available Bluemix offerings. While I thought the system was incredibly nice and simple to use compared to many other cloud providers, I found the various machine learning APIs they offered either weren’t that useful or seemed to provide worse results than their comparable Google, Azure, and other such services. Perhaps that has changed, or their new services are much better and will be available on this new system, but it is hard to find any definitive information that this is true. One aspect of these machine learning tools that I am not excited about is the constant focus on using machine learning to create some business cost-savings models (which the IBM Machine Learning press release touts), which companies may state is passed onto the customers, but it seems like this is rarely the truth (and one of the things that was stressed on in the IBM Machine Learning launch event). The ability for machine learning methodology to solve tedious and previously complicated problems is much more important and fascinating to me than simply saving money via optimizing business expenses. While many of these problems are much harder to solve and we are far from solving them, the current applications of machine learning in business and marketing areas often provides proof for the rhetoric that machine learning is exploitive and a toxic solution. Along with this, while the IBM Machine Learning and Data Science products may seem to be aimed at someone in a data scientist role, I can never help but wonder to what extent data scientists are actually using these tools outside of pure novelty or an extremely simple prototyping step in a more in-depth analytical pipeline. I personally think the ability of these tools to create usefulness for someone who may otherwise not be interested in many aspects of traditional data science is where the tools are incredibly powerful and useful. While not all traditional developers or analysts are interested in in-depth data exploration, creating pipelines, and many other traditional data science skills, the ability to do so at ease and without having to learn skills and tooling can be seen as a huge burden to those not particularly interested. While true machine learning and data science skills are unlikely to be completely replaceable, many of the skills that a traditional data scientist has will be less important as more people are capable of doing what previously may have been a quite technical or complicated process. These sorts of products and services that are a part of the IBM Machine Learning catalog reinforce that idea and are an important step in allowing services to be used regardless of data location or analytical background, and are an important step forward for machine learning and data science in general. About the author Graham Annett is an NLP engineer at Kip (Kipthis.com). He has been interested in deep learning and has worked with and contributed to Keras. He can be found on GitHub or on his website.

0
0
4186

Tech Guides - Artificial Intelligence

Paper in Two minutes: i-RevNet, a deep invertible convolutional network

Neuroevolution: A step towards the Thinking Machine

What you missed at last week’s ICML 2018 conference

Yoshua Bengio et al on Twin Networks

Meet the who's who of Reinforcement learning

What we learned from Oracle OpenWorld 2017

Handpicked for your weekend Reading – 8th Dec 2017

Deep Learning with Microsoft CNTK

6 ways you can give an Artificially Intelligent Makeover to the Web Development

Decoding the Chief Robotics Officer (CRO) Role

Trending Topics

Pieter Abbeel et al on how to improve the exploratory behaviour of Deep Reinforcement Learning algorithms

Is 2015 the Year of Deep Learning?

Intent-based Networking Systems (IBNS): Machine Learning sprinkles intelligence in networks

Tensorflow: Next Gen Machine Learning

IBM Machine Learning: What Is It, and What Does It Mean For Data Scientists?

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access