Tech Guides - Data

281 Articles

Is Blockchain a failing trend or can it build a better world? Harish Garg provides his insight [Interview]

Packt Editorial Staff
02 Jan 2019
4 min read
In 2018, Blockchain and cryptocurrency exploded across tech. We spoke to Packt author Harish Garg about what he sees as the future of Blockchain in 2019 and beyond.

Harish Garg, founder of BignumWorks Software LLP, is a data scientist and lead software developer with 17 years' software industry experience. BignumWorks is an India-based software consultancy that provides software development and technical training services. Harish worked for McAfee/Intel for over 11 years and is an expert in creating data visualizations using R, Python, and web-based visualization libraries. Find all of Harish Garg's books for Packt here.

From early adopters to the enterprise

What do you think was the biggest development in blockchain during 2018?

The biggest development in Blockchain during 2018 was the explosion of Blockchain-based digital currencies. We now have thousands of different coins and projects supported by these coins. 2018 was also the year when Blockchain really captured the imagination of the public at large, beyond just technically savvy early adopters. 2018 also saw first a dramatic rise in the price of digital currencies, especially Bitcoin, and then a similarly dramatic fall in the last half of the year.

Do you think 2019 is the year that enterprise embraces blockchain? Why?

Absolutely. Early adoption of enterprise blockchain was already underway in 2018. Companies like IBM have already released and matured their Blockchain offerings for enterprises. 2018 also saw the big behemoth of cloud services, Amazon Web Services, launch its own Blockchain solutions. We are on the cusp of wider adoption of Blockchain in enterprises in 2019.

Key Blockchain challenges in 2019

What do you think the principal challenges in deploying blockchain technology are, and how might developers address them in 2019?

Two schools of thought have been emerging about the way blockchain is perceived. On one side, there are people pitching Blockchain as some kind of ultimate utopia, the last solution to solve all of humanity's problems. On the other end of the spectrum are people who dismiss Blockchain as another fading trend with nothing substantial to offer. These two schools pose the biggest challenge to the success of Blockchain technology. The truth lies somewhere in between. Developers need to take the job of Blockchain evangelism into their own hands and make sure the right kind of expectations are set for policy makers and customers.

Have the Bitcoin bubble and greater scrutiny from regulators made blockchain projects less feasible, or do they provide a more solid market footing for the technology? Why?

Bitcoin would have invited a lot of scrutiny from regulators and governments even without the bubble. Bitcoin upends the notion of a nation state controlling the supply of money, so different governments are reacting to it with a wide range of actions, ranging from outright bans on using the existing banking system to buy and sell Bitcoin and other digital currencies, to putting a legal framework in place to let their citizens trade in them securely. The biggest fear governments have is black money being pumped into digital currencies. With proper KYC procedures, these fears can be addressed. However, governments and financial institutions are also realizing the advantages Blockchain offers in streamlining their banking and financial markets, and are launching pilot projects to adopt it.
Blockchain and disruption in 2019

Will Ethereum continue to dominate the industry or are there new platforms that you think present a serious challenge? Why?

Ethereum does have an early-mover advantage. However, we know that an early-mover advantage is not such a big moat for new competitors to cross. Competing and bigger platforms are likely to emerge from the likes of Facebook, Amazon, and IBM that will solve the scalability issues Ethereum faces.

What industries do you think blockchain technology is most likely to disrupt in 2019, and why?

Finance and banking are still the biggest industries that will see an explosion of creative products coming out of the adoption of Blockchain technology. Products for government use are going to be big, especially wherever there is a need for an immutable source of truth, as in the case of land records.

Do you have any other thoughts on the future of blockchain you'd like to share?

We are at a very early stage of Blockchain adoption. It's very hard to predict right now what kind of killer apps will emerge a few years down the line. Nobody predicted in 2007 that smartphones would give rise to apps like Uber. The important thing is to have the right mix of optimism and skepticism.

Black Friday Special: 17 ways in 2017 that online retailers use machine learning

Sugandha Lahoti
24 Nov 2017
10 min read
Black Friday sales are just around the corner. Both online and traditional retailers have geared up to race past each other in the ultimate shopping frenzy of the year. Although both brick-and-mortar retailers and online platforms will generate high sales, online retailers will sweep past the offline platforms. Why? For online retailers, the best part remains the fact that, shopping online, customers don't have to deal with pushy crowds, traffic, salespeople, and long queues. Online shoppers have access to a much larger array of products, and they can switch between stores just by switching between tabs on their smart devices.

Considering the surge of shoppers expected in such peak seasons, Big Data analytics is a helpful tool for online retailers. With the advances in machine learning, Big Data analytics is no longer confined to the technology landscape; it also represents a way for retailers to connect with consumers in a purposeful way. For retailers both big and small, adopting the right ML-powered Big Data analytics strategy can help increase sales, retain customers, and generate higher revenues. Here are 17 reasons why data is an important asset for retailers, especially on the 24th of this month.

A. Improving site infrastructure

The first thing a customer sees when landing on an e-commerce website is the UI, ease of access, product classification, number of filters, and so on. Hence, building an easy-to-use website is paramount. Here's how ML-powered Big Data analytics can help:

1. E-commerce site analysis

A complete site analysis is one of the ways to increase sales and retain customers. By analyzing page views, actual purchases, bounce rates, and least popular products, the e-commerce website can be altered for better usability. Data mining techniques can also be used to enhance website features. This includes web mining, which is used to extract information from the web, and log files, which contain information about the user. For time-bound sales like Black Friday and Cyber Monday, this is quite helpful for better product placement, removing unnecessary products, and showcasing products which cater to a particular user base.

2. Generating test data

Generating test data enables deeper analysis, which in turn helps increase sales. Big Data analytics can lend a helping hand here by organizing products based on type, shopper gender and age group, brand, pricing, number of views of each product page, and the information provided for that product. During peak seasons such as Black Friday, ML-powered data analytics can analyze the most visited pages and shopper traffic flow for better product placements and personalized recommendations.
B. Enhancing products and categories

Every retailer in the world is looking for ways to reduce costs without sacrificing the quality of their products. Big Data analytics in combination with machine learning is of great help here.

3. Category development

Big Data analytics can help in building new product categories, or in eliminating or enhancing old ones. This is possible by using machine learning techniques to analyze patterns in marketing data as well as other external factors such as product niches. ML-powered assortment planning can help in selecting and planning products for a specified period of time, such as Thanksgiving week, so as to maximize sales and profit. Data analytics can also help in defining category roles in order to clearly define the purpose of each category in the total business lifecycle. This is done to ensure that efforts made around a particular category actually contribute to category development. It also helps to identify key categories, which are the featured products that specifically meet an objective, for example healthy food items or cheap electronics.

4. Range selection

An optimum and dynamic product range is essential to retain customers. Big Data analytics can utilize sales data and shopper history to measure a product range for maximum profitability. This is especially important for Black Friday and Cyber Monday deals, where products are sold at heavily discounted rates.

5. Inventory management

Data analytics can give an overview of best-selling products, non-performing or slow-moving products, seasonal products, and so on. These data points can help retailers manage their inventory and reduce the associated costs (a minimal sketch of a slow-mover analysis follows this list). Machine learning powered Big Data analytics is also helpful in making product localization strategies, i.e. determining which product sells well in which areas. In order to localize for China, Amazon changed its China branding to Amazon.cn. To make it easy for Chinese customers to pay, Amazon China introduced portable POS devices so users can pay the delivery person by credit card at their doorstep.

6. Waste reduction

Big Data analytics can analyze sales and reviews to identify products which don't do well, and either eliminate the product or combine it with a better-performing companion product to increase its sales. Analyzing data can also help in listing products that were returned due to damage or defects. Generating insights from this data using machine learning models can be helpful to retailers in many ways: they can modify their stocking methods, and improve their packaging and logistics support for those kinds of products.

7. Supply chain optimization

Big Data analytics also has a role to play in supply chain optimization. This includes using sales and forecast data to plan and manage the movement of goods from retailers to warehouses to transport, and onto the doorstep of customers. Top retailers like Amazon are offering deals under the Black Friday banner for the entire week. Expanding the sale window is a great supply chain optimization technique for more manageable selling.
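To make the inventory idea in point 5 a little more concrete, here is a minimal pandas sketch of a slow-mover analysis. The file name, column names, and the 10% cut-off are hypothetical; any sales export with product and quantity fields would do.

```python
import pandas as pd

# Hypothetical sales export: one row per order line.
sales = pd.read_csv("sales.csv")   # assumed columns: product_id, quantity, order_date

# Units sold per product over the period covered by the export.
units = sales.groupby("product_id")["quantity"].sum().sort_values()

# Flag the bottom 10% of products by units sold as slow movers (arbitrary cut-off).
threshold = units.quantile(0.10)
slow_movers = units[units <= threshold]

print(f"{len(slow_movers)} slow-moving products")
print(slow_movers.head(10))
```

The same grouped view can be extended with margins or stock levels to decide which slow movers to discount or delist.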
C. Upgrading the customer experience

Customers are the most important asset for any retailer. Big Data analytics can help you retain, acquire, and attract customers.

8. Shopper segmentation

Machine learning techniques can link and analyze granular data such as behavioral, transactional, and interaction data to identify and classify customers who behave in similar ways. This eliminates the associated guesswork and helps in creating rich and highly dynamic consumer profiles. According to a report by Research Methodology, Walmart uses a mono-segment type of positioning targeted at a single customer segment. Walmart also pays attention to young consumers, given the strategic importance of winning their long-term loyalty.

9. Promotional analytics

An important factor for better sales is analyzing how customers respond to promotions and discounts. Analyzing data on an hour-to-hour basis on special days such as Black Friday or Cyber Monday, which see high customer traffic, can help retailers plan better promotions and improve brand penetration. The Boston Consulting Group uses data analytics to accurately gauge the performance of promotions and to predict promotion performance in advance.

10. Product affinity models

By analyzing a shopper's past transaction history, product affinity models can identify the customers with the highest propensity to buy a particular product. Retailers can then use this for attracting more customers or providing existing ones with better personalization. Product affinity models can also cluster products that are mostly bought together, which can be used to improve recommendation systems.

11. Customer churn prediction

The massive quantity of customer data being collected can be used for predicting customer churn rate. Customer churn prediction is helpful in retaining customers, attracting new ones, and acquiring the right type of customers in the first place. Classification models such as logistic regression can be used to predict the customers most likely to churn; a minimal sketch of such a model follows below. As part of its Azure Machine Learning offering, Microsoft has a Retail Customer Churn Prediction Template to help retail companies predict customer churn.
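As a rough illustration of the churn model mentioned in point 11, here is a minimal scikit-learn sketch. The file name and feature columns are hypothetical; in practice they would come from transaction and engagement history.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

# Hypothetical customer table with a binary "churned" label.
df = pd.read_csv("customers.csv")
features = ["orders_last_90d", "avg_basket_value", "days_since_last_order", "support_tickets"]
X, y = df[features], df["churned"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)

# Rank customers by predicted churn probability and check discrimination with AUC.
probs = model.predict_proba(X_test)[:, 1]
print("Test AUC:", roc_auc_score(y_test, probs))
```

The predicted probabilities can then be used to target retention offers at the customers most likely to leave.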
D. Formulating and aligning business strategies

Every retailer needs tools and strategies to help a product or service reach and influence consumers, generate profits, and contribute to the long-term success of the business. Below are some pointers on how ML-powered Big Data analytics can help retailers do just that.

12. Building dynamic pricing models

Pricing models can be designed by looking at a customer's purchasing habits and surfing history. This descriptive analytics can be fed into a predictive model to obtain an optimal pricing model, including price sensitivity scores and price-to-demand elasticity. For example, Amazon uses a dynamic price optimization technique, offering its biggest discounts on its most popular products while making profits on less popular ones. IBM's Predictive Customer Intelligence can dynamically adjust the price of a product based on a customer's purchase decision.

13. Time series analysis

Time series analysis can be used to identify patterns and trends in customer purchases, or in a product's lifecycle, by observing information in a sequential fashion. It can also be used to predict future values based on the sequence so generated. For online retailers, this means using historical sales data to forecast future sales, and analyzing time-dependent patterns to list new arrivals and raise or lower prices depending on events such as Black Friday or Cyber Monday sales.

14. Demand forecasting

Machine learning powered Big Data analytics can learn demand levels from a wide array of factors such as product nature, characteristics, seasonality, relationships with associated products, and relationships with other market factors. It can then forecast the demand for a particular product using a simulation model. Such predictive analytics are highly accurate and also reduce costs, especially for events like Black Friday, where there is a high surge of shoppers.

15. Strategy adjustment

Predictive Big Data analytics can help shorten the go-to-market time for product launches, allowing marketers to adjust their strategy midcourse if needed. For Black Friday or Cyber Monday deals, an online retailer can predict the demand for a particular product and amend strategies along the way, such as increasing the discount or keeping a product at the discounted rate for a longer time.

16. Reporting and sales analysis

Big Data analytics tools can analyze large quantities of retail data quickly. Most such tools also have a simple UI dashboard which gives retailers detailed answers to their queries in a single click. This saves a lot of time that was previously spent creating reports or sales summaries. Reports generated from a data analytics tool are quick and easy to understand.

17. Marketing mix spend optimization

Forecasting sales and proving the ROI of marketing activities are two pain points faced by most retailers. Marketing mix modelling is a Big Data statistical analysis which uses historical data to show the impact of marketing activities on sales, and then forecasts the impact of future marketing tactics. Insights derived from such tools can be used to enhance marketing strategies and optimize costs.

By adopting the strategies mentioned above, retailers can maximize their gains this holiday season, starting with Black Friday, which begins as the clock chimes 12 today. Machine-powered Big Data analytics is there to help retailers attract new shoppers, retain them, enhance product lines, define new categories, and formulate and align business strategies. Gear up for a Big Data Black Friday this 2017!

Handpicked for your Weekend Reading – 24th Nov ’17

Aarthi Kumaraswamy
24 Nov 2017
2 min read
We hope you had a great Thanksgiving and are having the time of your life shopping for your wishlist this weekend. The last thing you want to do is spend time you would rather spend shopping scouring the web for content you would like to read. Here is a brief roundup of the best of what we published on the Datahub this week for your weekend reading.

Thanksgiving Weekend Reading

A mid-autumn Shopper's dream - What an Amazon-fulfilled Thanksgiving would look like
Data science folks have 12 reasons to be thankful for this Thanksgiving
Black Friday Special - 17 ways in 2017 that online retailers use machine learning
Through the customer's eyes - 4 ways Artificial Intelligence is transforming e-commerce

Expert in Focus

Shyam Nath, director of technology integrations, Industrial IoT, GE Digital, on why the Industrial Internet of Things (IIoT) needs Architects

3 Things that happened this week in Data Science News

Amazon ML Solutions Lab to help customers "work backwards" and leverage machine learning
Introducing Gluon - a powerful and intuitive deep learning interface
New MapR Platform 6.0 powers DataOps

Get hands-on with these Tutorials

Visualizing 3D plots in Matplotlib 2.0
How to create 3D Graphics and Animation in R
Implementing the k-nearest neighbors algorithm in Python

Do you agree with these Insights & Opinions?

Why you should learn Scikit-learn
4 ways Artificial Intelligence is leading disruption in Fintech
7 promising real-world applications of AI-powered Mixed Reality

AI chip wars: Is Brainwave Microsoft's Answer to Google's TPU?

Amarabha Banerjee
18 Oct 2017
5 min read
When Google decided to design its own chip, the TPU, it generated a lot of buzz around faster and smarter computation with its ASIC-based architecture. Google claimed its move would significantly enable intelligent apps to take over, and industry experts somehow believed a reply from Microsoft was always coming (remember Bing?). Well, Microsoft has announced its arrival into the game with its own real-time AI-enabled chip called Brainwave. Interestingly, as the two tech giants compete in chip manufacturing, developers are certainly going to have more options while facing the complex computational processes of modern-day systems.

What is Brainwave?

Until recently, Nvidia was the dominant market player in the microchip segment, creating GPUs (Graphics Processing Units) for faster processing and computation. But after Google disrupted the trend with its TPU (Tensor Processing Unit), the surprise package in the market has come from Microsoft, more so because its 'real-time data processing' Brainwave chip claims to be faster than the Google chip (the TPU 2.0, or Cloud TPU). The one thing the Google and Microsoft chips have in common is that they can both train and run deep neural networks much faster than any of the existing chips. The fact that Microsoft claims Brainwave supports real-time AI systems with minimal lag by itself raises an interesting question: are we looking at a new revolution in the microchip industry? The answer perhaps lies in the inherent methodology and architecture of both these chips (TPU and Brainwave) and the way they function. What are the practical challenges of implementing them in real-world applications?

The Brainwave architecture: move over GPU, the DPU is here

In case you are wondering what the hype around Microsoft's Brainwave chip is about, the answer lies directly in its architecture and design. Present-day complex computational standards are defined by high-end games, for which GPUs (Graphical Processing Units) were originally designed. Brainwave differs completely from the GPU architecture: the core components of a Brainwave chip are Field Programmable Gate Arrays, or FPGAs. Microsoft has developed a huge number of FPGA modules on top of which DNN (Deep Neural Network) layers are synthesized. Together, this setup can be compared to something like hardware microservices, where each task is assigned by software to different FPGA and DNN modules. These software-controlled modules are called DNN Processing Units, or DPUs. This eliminates CPU latency and the need to transfer data to and from the backend.

Two seemingly different methodologies are involved here: the hard DPU and the soft DPU. Microsoft has used the soft DPU approach, where the allocation of memory modules is determined by software and by the volume of data at the time of processing, whereas a hard DPU has a predefined memory allocation which doesn't allow for the flexibility that is so vital in real-time processing. The software-controlled feature is exclusive to Microsoft, and unlike other AI processing chips, Microsoft has developed its own easy-to-process data types that are faster to process. This enables the Brainwave chip to perform near real-time AI computations easily. Thus, in a way, Microsoft Brainwave holds an edge over the Google TPU when it comes to real-time decision making and computation capabilities.

Brainwave's edge over TPU 2 - is it real time?
The reason Google ventured into designing its own chips was the need to increase the number of data centers as user queries grew. Google realized that instead of running data queries via data centers, it would be far more plausible if the computation were performed in the native system. That's where it needed more computational capability than modern-day market leaders like the Intel x86 Xeon processors and the Nvidia Tesla K80 GPU offered. But Google opted for Application Specific Integrated Circuits (ASICs) instead of FPGAs, the reason being that they were completely customizable: not specific to one particular neural network, but applicable to multiple networks. The trade-off for this ability to run multiple neural networks was, of course, real-time computation, which Brainwave can achieve because of its DPU architecture. The initial data released by Microsoft shows that Brainwave has a data transfer bandwidth of 20 TB/sec, 20 times faster than the latest Nvidia GPU chip. The energy efficiency of Brainwave is also claimed to be 4.5 times better than that of current chips. Whether Google will up its ante and improve the existing TPU architecture to make it suitable for real-time computation is something only time can tell.

(Source: Brainwave HOTCHIPS 2017 presentation on the Microsoft Research Blog)

Future outlook and challenges

Microsoft is yet to declare benchmarking results for the Brainwave chip, but Microsoft Azure customers most definitely look forward to the availability of Brainwave for faster and better computational abilities. What is even more promising is that Brainwave works seamlessly with Google's TensorFlow and Microsoft's own CNTK framework. Tech startups like Rigetti, Mythic, and Waves are trying to create mainstream applications which will employ AI and quantum computation techniques. This will bring AI to the masses by creating practical AI-driven applications for everyday consumers, and these companies have shown a keen interest in both the Microsoft and Google AI chips. In fact, Brainwave may be best suited to companies such as these, which are looking to use AI capabilities for everyday tasks and are currently few in number because of the limited computational capabilities of current chips. The challenges for all AI chips, including Brainwave, will still revolve around their data handling capabilities, the reliability of their performance, and improving the memory capabilities of current hardware systems.

Data science folks have 12 reasons to be thankful for this Thanksgiving

Savia Lobo
21 Nov 2017
8 min read
We are nearing the end of 2017, but with each ending chapter, we have remarkable achievements to be thankful for. For the data science community, this year was filled with a number of new technologies, tools, version updates, and more. 2017 saw blockbuster releases such as PyTorch, TensorFlow 1.0, and Caffe 2, among many others. We invite data scientists, machine learning experts, and other data science professionals to come together on this Thanksgiving Day and thank the organizations which made our interactions with AI easier, faster, better, and generally more fun. Let us recall our blessings in 2017, one month at a time...

January: Thank you, Facebook and friends, for handing us PyTorch

Hola 2017! While the world was still in the New Year mood, a brand new deep learning framework was released. Facebook, along with a few other partners, launched PyTorch. PyTorch came as an improvement on the popular Torch framework, now supporting the Python language over the less popular Lua. As PyTorch worked just like Python, it was easier to debug and to create unique extensions. Another notable change was the adoption of a dynamic computational graph, used to create graphs on the fly with high speed and flexibility.

February: Thanks, Google, for TensorFlow 1.0

The month of February brought data scientists a Valentine's gift with the release of TensorFlow 1.0. Announced at the first annual TensorFlow Developer Summit, TensorFlow 1.0 was faster, more flexible, and production-ready. Here's what the TensorFlow box of chocolates contained:

Full compatibility with Keras
Experimental APIs for Java and Go
New Android demos for object and image detection, localization, and stylization
A brand new TensorFlow debugger
An introductory glance at XLA, a domain-specific compiler for TensorFlow graphs

March: We thank Francois Chollet for making Keras 2 a production-ready API

Congratulations! Keras 2 is here. This was great news for data science developers, as Keras 2, a high-level neural network API, allowed faster prototyping. It provided support for both CNNs (Convolutional Neural Networks) and RNNs (Recurrent Neural Networks). Keras has a user-friendly API designed specifically for humans, and it allows easy creation of modules, which makes it perfect for carrying out advanced research. Developers can code in Python, a compact, easy-to-debug language (a minimal sketch of the API appears below).

April: We like Facebook for brewing us Caffe 2

Data scientists were greeted by a fresh aroma of coffee this April, as Facebook released the second version of its popular deep learning framework, Caffe. Caffe 2 came up as an easy-to-use deep learning framework to build DL applications and leverage community contributions of new models and algorithms. Caffe 2 arrived with first-class support for large-scale distributed training, new hardware support, mobile deployment, and the flexibility for future high-level computational approaches. It also provided easy methods to convert DL models built in the original Caffe to the new version, and came with over 400 different operators, the basic units of computation in Caffe 2.
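To give a flavor of the Keras 2 API mentioned under March, here is a minimal sketch of a small classifier built with the Sequential API, assuming the standalone Keras 2 package is installed; the input shape and training data are hypothetical.

```python
from keras.models import Sequential
from keras.layers import Dense

# A tiny binary classifier over 20 hypothetical input features.
model = Sequential()
model.add(Dense(64, activation="relu", input_shape=(20,)))
model.add(Dense(64, activation="relu"))
model.add(Dense(1, activation="sigmoid"))

model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.summary()

# With real data in hand you would then call, for example:
# model.fit(x_train, y_train, epochs=5, batch_size=32)
```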
May: Thank you, Amazon, for supporting Apache MXNet on AWS, and Google for your TPU

The month of May brought some exciting launches from two tech giants, Amazon and Google: Amazon Web Services brought Apache MXNet on board, and Google's second-generation TPU chips were announced. Apache MXNet, now available on AWS, lets developers build machine learning applications that train quickly and run anywhere, making it a scalable approach for developers. Next up were Google's second-generation TPU (Tensor Processing Unit) chips, designed to speed up machine learning tasks. These chips were supposed to be (and are) more capable than CPUs and even GPUs.

June: We thank Microsoft for CNTK v2

The middle of the month arrived with Microsoft's announcement of version 2 of its Cognitive Toolkit. The new Cognitive Toolkit was enterprise-ready, offered production-grade AI, and allowed users to create, train, and evaluate their own neural networks, scalable across multiple GPUs. It also included Keras API support, faster model compression, Java bindings, and Spark support, and featured a number of new tools to run trained models on low-powered devices such as smartphones.

July: Thank you, Elastic.co, for bringing ML to the Elastic Stack

July made machine learning generally available to Elastic Stack users with version 5.5. With ML, anomaly detection on Elasticsearch time series data was made possible. This allows users to analyze the root cause of problems in the workflow and thus reduce false positives.

August: Thank you, Google, for Deeplearn.js

August announced the arrival of Google's Deeplearn.js, an initiative that allowed machine learning models to run entirely in a browser. Deeplearn.js was an open source, WebGL-accelerated JS library. It offered an interactive client-side platform which helped developers carry out rapid prototyping and visualization. Developers were now able to use a hardware accelerator such as the GPU via WebGL and perform faster computations with 2D and 3D graphics. Deeplearn.js also allowed TensorFlow models to be imported into the browser. Surely something to be thankful for!

September: Thanks, Splunk and MySQL, for your upgrades

September's surprises came with the release of Splunk 7.0, which helps bring machine learning to the masses with an added Machine Learning Toolkit that is scalable, extensible, and accessible. It includes native support for metrics, which speeds up query processing performance by 200x. Other features include seamless event annotations, improved visualization, faster data model acceleration, and a cloud-based self-service application. September also brought the release of MySQL 8.0, which included first-class support for Unicode 9.0, extended support for native JSON data, window functions and recursive SQL syntax for queries that were previously impossible or difficult to write, and added document-store functionality. So, big thanks for the Splunk and MySQL upgrades.

October: Thank you, Oracle, for the Autonomous Database Cloud, and Microsoft for SQL Server 2017

As fall arrived, Oracle unveiled the world's first Autonomous Database Cloud. It provided full automation of tuning, patching, updating, and maintaining the database. It was self-scaling, instantly resizing compute and storage without downtime and with low manual administration costs. It was also self-repairing, and guaranteed 99.995 percent reliability and availability. That's a lot of reduction in workload!
Next, developers were greeted with the release of SQL Server 2017, a major step towards making SQL Server a platform. It included multiple enhancements to the Database Engine such as adaptive query processing, automatic database tuning, graph database capabilities, new Availability Groups, and the Database Tuning Advisor (DTA). It also had a new Scale Out feature in SQL Server 2017 Integration Services (SSIS), and SQL Server Machine Learning Services added support for the Python language.

November: A humble thank you to Google for TensorFlow Lite and to Elastic.co for Elasticsearch 6.0

Just a month more for the year to end! The data science community has had a busy November, with too many releases to keep an eye on and Microsoft Connect(); spilling the beans. So, November, thank you for TensorFlow Lite and Elastic 6. TensorFlow Lite, a lightweight product for mobile and embedded devices, is designed to be: lightweight, allowing inference of on-device machine learning models with a small binary size and faster initialization/startup; fast, with dramatically improved model loading time and accelerated hardware support; and cross-platform, with a runtime tailor-made to run on various platforms, starting with Android and iOS. Elasticsearch 6.0 is now generally available, with features such as easy upgrades, index sorting, better shard recovery, and support for sparse doc values. There are other new features spread across the Elastic Stack, comprising Kibana, Beats, and Logstash - Elastic's solutions for visualization and dashboards, data ingestion, and log storage.

December: Thanks in advance, Apache, for Hadoop 3.0

Christmas gifts may arrive for data scientists in the form of the general availability of Hadoop 3.0. The new version is expected to include support for erasure coding in HDFS, version 2 of the YARN Timeline Service, shaded client jars, support for more than two NameNodes, MapReduce task-level native optimization, and support for opportunistic containers and distributed scheduling, to name a few. It will also include a rewritten version of the Hadoop shell scripts with bug fixes, improved compatibility, and many changes to some existing installation procedures.

Phew! That was a long list of tools for data scientists and developers to be thankful for this year. Whether they are new frameworks, libraries, or new software, each one is unique and helpful for creating data-driven applications. Hopefully, you have used some of them in your projects. If not, be sure to give them a try, because 2018 is all set to overload you with new, and even more amazing, tools, frameworks, libraries, and releases.

One Shot Learning: Solution to your low data problem

Savia Lobo
04 Dec 2017
5 min read
The fact that machines are successful in replicating human intelligence is mind-boggling. However, this is only possible if machines are fed the correct mix of algorithms, a huge collection of data, and, most importantly, training, which in turn leads to faster prediction or recognition of objects within images. On the other hand, when you train humans to recognize a car, for example, you simply have to show them a live car or an image. The next time they see any vehicle, it is easy for them to distinguish a car amongst other vehicles. Can machines, in a similar way, learn from a single training example like humans do?

Computers and machines lack a key capability that distinguishes them from humans: memory. Machines cannot remember, and hence require millions of data points to be fed in to understand object detection from any angle. In order to reduce this appetite for training data and enable machines to learn with less data at hand, one shot learning comes to their assistance.

What is one shot learning and how is it different from other learning?

Deep neural network models outperform humans at various tasks such as image recognition, speech recognition, and so on. However, such performance is possible only due to extensive, incremental training on large data sets. In cases where there is a smaller dataset or fewer training examples, a traditional model is trained on the data that is available. During this process, it relearns new parameters and incorporates new information, and completely forgets what it previously learned. This leads to poor training, or catastrophic interference. One shot learning proves to be a solution here, as it is capable of learning with one, or a minimal number of, training samples, without forgetting. The reason is that such models possess meta-learning, a capability often seen in neural networks that have memory.

How one shot learning works

One shot learning strengthens the ability of deep learning models without the need for a huge dataset to train on. An implementation of one shot learning can be seen in the Memory Augmented Neural Network (MANN) model. A MANN has two parts: a controller and an external memory module. The controller is either a feed-forward neural network or an LSTM (Long Short Term Memory) network, which interacts with the external memory module using a number of read/write heads. These heads fetch representations from memory or place them into it. LSTMs are proficient in long-term storage, through slow updates of weights, and in short-term storage, via the external memory module. They are trained to meta-learn, i.e. they can rapidly learn unseen functions with fewer data samples. Thus, MANNs are said to be capable of meta-learning.

The MANN model is then trained on datasets that include different classes with very few samples each. For instance, the Omniglot dataset is a collection of handwritten characters from different alphabets, with very few samples of each. After training the model over thousands of iterations with few samples, it was able to recognize never-seen-before image samples taken from a disjoint part of the Omniglot dataset. This shows that MANN models are able to perform various object categorization tasks with minimal data samples. Similarly, one shot learning can also be achieved using a Neural Turing Machine or Active One Shot Learning. Therefore, learning with a single attempt, or one shot, actually involves meta-learning: the model gradually learns useful representations from the raw data using an algorithm such as gradient descent, and, using these learnings as base knowledge, it can rapidly cohere never-seen-before information with a single or one-shot appearance via an external memory module.
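The image-recognition use case below relies on siamese networks that compare a pair of examples. As a rough, hypothetical illustration (not the MANN architecture described above), here is a minimal PyTorch sketch of a siamese embedding network trained with a contrastive-style loss; the image tensors are random stand-ins for real data.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEmbedder(nn.Module):
    """Maps a 28x28 grayscale image (e.g. an Omniglot character) to a 64-d embedding."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 64),
        )

    def forward(self, x):
        return F.normalize(self.net(x), dim=1)

def contrastive_loss(z1, z2, same, margin=1.0):
    """Pull embeddings of same-class pairs together, push different-class pairs apart."""
    d = F.pairwise_distance(z1, z2)
    return (same * d.pow(2) + (1 - same) * F.relu(margin - d).pow(2)).mean()

# One-shot classification then amounts to embedding a query image and picking the
# support-set image whose embedding is closest.
model = SiameseEmbedder()
x1, x2 = torch.randn(8, 1, 28, 28), torch.randn(8, 1, 28, 28)   # dummy image pairs
same = torch.randint(0, 2, (8,)).float()                        # 1 = same class, 0 = different
loss = contrastive_loss(model(x1), model(x2), same)
loss.backward()
```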
Use cases of one shot learning

Image recognition: Image representations are learnt using a supervised, metric-based approach. For instance, a siamese neural network, a pair of identical sister networks, discriminates between the class identities of an image pair. The features of this network can then be reused for one-shot learning without retraining.

Object recognition within images: One shot learning allows neural network models to recognize known objects and their categories within an image. For this, the model learns to recognize the object with a small set of training samples, and later compares the probability of the object being present within the image provided. A model trained this way can recognize objects in an image despite clutter, viewpoint, and lighting changes.

Predicting accurate drugs: The datasets available for drug discovery are either limited or expensive. A molecule found during a biological study often does not end up being a drug due to concerns such as toxicity, low solubility, and so on. Hence, little data is available about a candidate molecule. Using one shot learning, an iterative LSTM combined with a graph convolutional neural network can be used to optimize the candidate molecule, by finding similar molecules with increased pharmaceutical activity and lower risks to patients. A detailed explanation of how accurate drugs can be predicted from low data is given in a research paper published by the American Chemical Society (ACS).

One shot learning is in its infancy, and therefore its use cases are mostly seen in familiar applications such as image and object recognition. As the technique advances and adoption grows, other applications of one shot learning will come into the picture.

Conclusion

One shot learning is being applied in machine learning and deep learning models that have little data available for their training. A plus point for the future is that organizations will not have to collect huge amounts of data for their ML models to be trained; a few training samples will do the job! A large number of organizations are looking forward to adopting one shot learning within their deep learning models. It will be exciting to see whether one shot learning glides through to become the base of every neural network implementation.

4 ways Artificial Intelligence is leading disruption in Fintech

Pravin Dhandre
23 Nov 2017
6 min read
In the digital disruption era, Artificial Intelligence in fintech is viewed as an emerging technology forming the very premise for a revolution in the sector. Tech giants in the Fortune 500 technology list, such as Apple, Microsoft, and Facebook, are putting resources into product innovation and technology automation. Businesses are investing hard to bring agility, better quality, and high-end functionality to drive double-digit revenue growth. Widely used AI-powered applications such as virtual assistants, chatbots, algorithmic trading, and purchase recommendation systems are fueling businesses with low marginal costs, growing revenues, and a better customer experience. According to a survey by the National Business Research Institute, more than 62% of companies will deploy AI-powered fintech solutions in their applications to identify new opportunities and areas in which to scale the business.

What has led to the disruption?

The financial sector is experiencing a fast technological evolution, from providing personalized financial services to executing smart operations that simplify complex and repetitive processes. The use of machine learning and predictive analytics has enabled financial companies to provide smart suggestions on buying and selling stocks, bonds, and commodities. Insurance companies are accelerating the automation of their loan applications, thereby saving countless hours. The leading investment bank Goldman Sachs automated its stock trading business, replacing trading professionals with computer engineers. BlackRock, one of the world's largest asset management companies, offers high-net-worth investors an automated advice platform superseding highly paid Wall Street professionals. Applications such as algorithmic trading, personal chatbots, fraud prevention and detection, stock recommendations, and credit risk assessment are the ones finding their merit in banking and financial services companies. Let us understand the changing scenarios with next-gen technologies.

Fraud prevention and detection

Firms tackle fraud prevention using anomaly detection APIs, designed using machine learning and deep learning mechanisms. They help identify and report any suspicious or fraudulent activity taking place amongst the billions of transactions that occur on a daily basis. Fintech companies are investing huge capital to handle cyber-crime, resulting in global market spend of more than 400 billion dollars annually. Multinational giants such as MasterCard, Sun Financial, Goldman Sachs, and the Bank of England use AI-powered systems to safeguard against and prevent money laundering, banking fraud, and illegal transactions. Danske Bank, a renowned Nordic financial service provider, deployed AI engines in its operations, helping it investigate millions of online banking transactions in less than a second. With this, the cost of fraud investigation fell drastically and actionable insights were delivered faster.
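To illustrate the anomaly-detection idea above, here is a minimal scikit-learn sketch using an Isolation Forest. The file name, transaction features, and contamination rate are hypothetical stand-ins for what a real fraud pipeline would use.

```python
import pandas as pd
from sklearn.ensemble import IsolationForest

# Hypothetical transaction log with a few numeric features per transaction.
tx = pd.read_csv("transactions.csv")   # assumed columns: amount, merchant_risk, hour, txn_per_day
features = tx[["amount", "merchant_risk", "hour", "txn_per_day"]]

# Fit an unsupervised anomaly detector; assume roughly 0.1% of transactions are anomalous.
detector = IsolationForest(contamination=0.001, random_state=42)
detector.fit(features)

# -1 marks outliers (candidate fraud), 1 marks normal transactions.
tx["flag"] = detector.predict(features)
suspicious = tx[tx["flag"] == -1]
print(f"Flagged {len(suspicious)} transactions for review")
```

In practice the flagged transactions would feed a review queue or a rules engine rather than being blocked outright.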
AI-powered chatbots

Chatbots are automated customer support chat applications powered by Natural Language Processing (NLP). They help deliver quick, engaging, personalized, and effective conversations to the end user. With an upsurge in the number of investors and varied investment options, customers seek financial guidance, profitable investment options, and query resolution, faster and in real time. A large number of banks, such as Barclays, Bank of America, and JPMorgan Chase, are using AI-supported digital chatbots to automate their client support, delivering an effective customer experience with smarter financial decisions. Bank of America, the largest bank in the US, launched Erica, a chatbot which guides customers with investment option notifications, easy bill payments, and a weekly update on their mortgage score. MasterCard offers a chatbot to its customers which not only allows them to review their bank balance or transaction history but also facilitates seamless payments worldwide.

Credit risk management

For money lenders, the most common business risk is credit risk, and it piles up largely due to inaccurate credit risk assessment of borrowers. If you are unaware of the term, credit risk is simply the risk associated with a borrower defaulting on repaying the loan amount. AI-backed credit risk evaluation tools, developed using predictive analytics and advanced machine learning techniques, have enabled bankers and financial service providers to simplify borrower credit evaluation, transforming the labor-intensive scorecard assessment method. Wells Fargo, an American international banking company, adopted AI technology for mortgage verification and loan processing. It resulted in lower market exposure risk on its lending assets, and the team was able to establish smarter and faster credit risk management. Millions of structured and unstructured data points could be analyzed for investigation, proving AI to be an extremely valuable asset for credit security and assessment.

Algorithmic trading

Millions of US citizens own individual stocks, mutual funds, and exchange-traded funds, and a good number of them trade on a daily basis, making it imperative for major broking and financial trading companies to offer AI-powered algorithmic trading platforms. Such a platform enables customers to execute trades strategically and earn significant returns. The algorithms analyze hundreds of millions of data points and derive a decisive trading pattern, enabling traders to book higher profits in every microsecond of the trading hour. The France-based international bank BNP Paribas deployed algorithmic trading which aids its customers in executing trades strategically and provides a graphical representation of stock market liquidity. With the help of this, customers are able to determine the most appropriate way of executing a trade under various market conditions. The advances in automated trading have assisted users with suggestions and rich insights, helping humans take better decisions.
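As a toy illustration of rule-based algorithmic trading (far simpler than the production systems described above), here is a minimal pandas sketch of a moving-average crossover signal. The price file and window lengths are hypothetical, and this is not a recommendation of any strategy.

```python
import pandas as pd

# Hypothetical daily price history with "date" and "close" columns.
prices = pd.read_csv("prices.csv", parse_dates=["date"], index_col="date")

# Short- and long-window simple moving averages.
prices["sma_fast"] = prices["close"].rolling(20).mean()
prices["sma_slow"] = prices["close"].rolling(100).mean()

# Signal: hold the asset (1) while the fast average is above the slow one, else stay out (0).
prices["signal"] = (prices["sma_fast"] > prices["sma_slow"]).astype(int)

# Compare the strategy's daily returns against simply buying and holding.
daily_ret = prices["close"].pct_change()
strategy_ret = daily_ret * prices["signal"].shift(1)   # trade on the previous day's signal
print("Buy & hold:", (1 + daily_ret).prod() - 1)
print("Crossover :", (1 + strategy_ret.fillna(0)).prod() - 1)
```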
How do we see the future of AI in the financial sector?

The influence of AI in fintech has brought disruption to almost every financial institution, from investment banks to retail banking to small credit unions. Data science and machine learning practitioners are endeavoring to position AI as an essential part of the banking ecosystem, and financial companies are working with data analytics and fintech professionals to make AI the primary interface for interaction with their customers.

However, the sector commonly faces challenges in adopting emerging technologies, and AI is no exception. The foremost challenge companies face is the availability of massive data that is clean and rich enough to train machine learning algorithms. The next hurdle is the reliability and accuracy of the data insights provided by an AI-driven solution: in a dynamic market, businesses can see the efficacy of their models decline, causing serious harm to the company. Hence, they need to be smart and cannot solely trust AI technology to achieve the business mission. The absence of emotional intelligence in chatbots is another area of concern, resulting in unsatisfactory customer service experiences. While there may be other roadblocks, rising investment in AI technology should help financial companies overcome such challenges and develop competitive intelligence in their product offerings. Looking to the near future, the adoption of cutting-edge technologies such as machine learning and predictive analytics will drive higher customer engagement, an exceptional banking experience, less fraud, and higher operating margins for banks, financial institutions, and insurance companies.

"My Favorite Tools to Build a Blockchain App" - Ed, The Engineer

Aaron Lazar
23 Oct 2017
7 min read
Hey! It's great seeing you here. I am Ed, the Engineer, and today I'm going to open up my secret toolbox and share some great tools I use to build Blockchains. If you're a Blockchain developer or a developer-to-be, you've come to the right place! If you are not one, maybe you should consider becoming one.

"There are only 5,000 developers dedicated to writing software for cryptocurrencies, Bitcoin, and blockchain in general. And perhaps another 20,000 had dabbled with the technology, or have written front end applications that connect with the blockchain." - William Mougayar, The Business Blockchain

Decentralized apps, or dapps as they are fondly called, are serverless applications that can be run on the client side, within a blockchain-based distributed network. We're going to learn what the best tools are to build dapps, and over the next few minutes we'll take these tools apart one by one. For a better understanding of where they fit into our development cycle, we'll group them into stages, just like the buildings we build. So, shall we begin? Yes, we can!! ;)

The Foundation: Platforms

The first and foremost element for any structure to stand tall and strong is its foundation. The same goes for Blockchain apps. Here, in place of all the mortar and other things, we've got decentralized and public blockchains. There are several existing networks, the likes of Bitcoin, Ethereum, or Hyperledger, that can be used to build dapps. Ethereum and Bitcoin are both decentralized, public chains that are open source, while Hyperledger is private and also open source. Bitcoin may not be a good choice to build dapps on, as it was originally designed for peer-to-peer transactions and not for building smart contracts.

The Pillars of Concrete: Languages

Once you've got your foundation in place, you need to start raising pillars that will act as the skeleton for your applications. How do we do this? Well, we've got two great languages specifically for building dapps.

Solidity: An object-oriented language that you can use for writing smart contracts. The best part of Solidity is that you can use it across all platforms, making it the number one choice for many developers. It's a lot like JavaScript and more robust than other languages. Along with Solidity, you might want to use solc, the compiler for Solidity. At the moment, Solidity is the language that's getting the most support and has the best documentation.

Serpent: Before the dawn of Solidity, Serpent was the reigning language for building dapps, something like how bricks replaced stone to build massive structures. Serpent is still being used in many places to build dapps, and it has great real-time garbage collection.

The Transit Mixers: Frameworks

After you choose your language to build dapps, you need a framework to simplify the mixing of concrete to build your pillars. I find these frameworks interesting:

Embark: This is a framework for Ethereum you can use to quicken development and streamline the process with tools and functionality. It allows you to develop and deploy dapps easily, or even build a serverless HTML5 application that uses decentralized technology. It equips you with tools to create new smart contracts which can be made available in JavaScript code.

Truffle: Here is another great framework for Ethereum, which boasts of taking on the task of managing your contract artifacts for you. It includes support for the library that links complex Ethereum apps and provides custom deployments.
The Contractors: Integrated Development Environments

Maybe you are not the kind that likes to build things from scratch; you just need a one-stop place where you can say what kind of building you want and everything else falls into place. Hire a contractor. If you're looking for the complete package to build dapps, there are two great tools you can use: Ethereum Studio and Remix (Browser-Solidity). The IDE takes care of everything, right from emulating the live network to testing and deploying your dapps.

Ethereum Studio: This is an adapted version of Cloud9, built for Ethereum with some additional tools. It has a blockchain emulator called the sandbox, which is great for writing automated tests. Fair warning: you must pay for this tool, as it's not open source, and you must use Azure Cloud to access it.

Remix: This can pretty much do the same things that Ethereum Studio can. You can run Remix from your local computer and allow it to communicate with an Ethereum node client on your local machine. This lets you execute smart contracts while connected to your local blockchain. Remix is still under development at the time of writing this article.

The Rebound Hammer: Testing tools

Nothing goes live until it's tried and tested. Just like the rebound hammer you might use to check the quality of concrete, we have a great tool that helps you test dapps.

Blockchain Testnet: For testing purposes, use the testnet, an alternative blockchain. Whether you want to create a new dapp using Ethereum or any other chain, I recommend that you use the related testnet, which ideally works as a substitute for the true blockchain that you will be using for the real dapp. Testnet coins are different from actual bitcoins and do not hold any value, allowing you as a developer or tester to experiment without needing to use real bitcoins or having to worry about breaking the primary bitcoin chain.

The Wallpaper: dapp Browsers

Once you've developed your dapp, it needs to look pretty for consumers to use. Dapp browsers are mostly the user interfaces for the decentralized web. Two popular tools that help you bring dapps to your browser are Mist and Metamask.

Mist: A popular browser for decentralized web apps. Just as Firefox or Chrome are for Web 2.0, the Mist browser will be for the decentralized Web 3.0. Ethereum developers can use Mist not only to store Ether or send transactions, but also to deploy smart contracts.

Metamask: With Metamask, you can comfortably run dapps in your browser without having to run a full Ethereum node. It includes a secure identity vault that provides a UI to manage your identities on various sites, as well as sign blockchain contracts.
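To show roughly what talking to a local test chain looks like from code, here is a minimal Python sketch using the third-party web3.py library. It assumes a local test node (for example Ganache) listening on port 8545, and method names may vary slightly between web3.py versions.

```python
from web3 import Web3

# Connect to a hypothetical local test node exposing the standard JSON-RPC port.
w3 = Web3(Web3.HTTPProvider("http://127.0.0.1:8545"))
print("Connected to test node:", w3.is_connected())

# Inspect the test chain: current block height and the node's pre-funded accounts.
print("Latest block number:", w3.eth.block_number)
accounts = w3.eth.accounts
print("Test accounts:", accounts[:3])

# Balances on the testnet are in test ether (reported in wei) and hold no real value.
if accounts:
    print("Balance of first account (wei):", w3.eth.get_balance(accounts[0]))
```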
If you go the BaaS way, let me warn you - you're probably going to miss out on all the fun of building your very own blockchain from scratch. With so many banks and financial entities beginning to set up blockchains for recording transactions and transfers of assets, and investors betting billions on distributed-ledger-related startups, there are only a handful of developers out there who have the required skills. That leaves you with a strong reason to build great blockchains and sharpen your skills in the area. Our Building Blockchain Projects book should help you put some of these tools to use in building reliable and robust dapps. So what are you waiting for? Go grab it now and have fun building blockchains!


Using Meta-Learning in Nonstationary and Competitive Environments with Pieter Abbeel et al

Sugandha Lahoti
15 Feb 2018
5 min read
This ICLR 2018 accepted paper, Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments, addresses the use of meta-learning to operate in non-stationary environments, represented as a Markov chain of distinct tasks. The paper is authored by Pieter Abbeel, Maruan Al-Shedivat, Trapit Bansal, Yura Burda, Ilya Sutskever, and Igor Mordatch.

Pieter Abbeel has been a professor at UC Berkeley since 2008 and was a Research Scientist at OpenAI (2016-2017). His current research focuses on robotics and machine learning, with a particular focus on meta-learning and deep reinforcement learning. Another author of this paper, Ilya Sutskever, is the co-founder and Research Director of OpenAI, and was previously a Research Scientist on the Google Brain team for three years.

Meta-learning, or learning to learn, typically uses metadata to understand how automatic learning can become flexible in solving learning problems - that is, to learn the learning algorithm itself. Continuous adaptation in real-world environments is essential for any learning agent, and a meta-learning approach is an appropriate choice for this task. This article talks about one of the top accepted research papers in the field of meta-learning at the 6th annual ICLR conference, scheduled for April 30 - May 3, 2018.

Using a gradient-based meta-learning algorithm for nonstationary environments

What problem is the paper attempting to solve?
Reinforcement learning algorithms, although achieving impressive results ranging from playing games to dialogue systems to robotics, are limited to solving tasks in stationary environments. The real world, on the other hand, is often nonstationary, either due to complexity, changes in the dynamics of the environment over the lifetime of a system, or the presence of multiple learning actors. Nonstationarity breaks the standard assumptions and requires agents to continuously adapt, both at training and execution time, in order to succeed.

The classical approaches to dealing with nonstationarity are usually based on context detection and tracking, i.e., reacting to changes in the environment that have already happened by continuously fine-tuning the policy. However, nonstationarity allows only limited interaction before the properties of the environment change. This immediately puts learning into the few-shot regime and often renders simple fine-tuning methods impractical. In order to continuously learn and adapt from limited experience in nonstationary environments, the authors propose a learning-to-learn (meta-learning) approach.

Paper summary
The paper proposes a gradient-based meta-learning algorithm suitable for continuous adaptation of RL agents in nonstationary environments. The agents meta-learn to anticipate the changes in the environment and update their policies accordingly. The method builds upon previous work on gradient-based model-agnostic meta-learning (MAML), which has been shown to be successful in few-shot settings. The authors re-derive MAML for multi-task reinforcement learning from a probabilistic perspective, and then extend it to dynamically changing tasks. The paper also considers the problem of continuous adaptation to a learning opponent in a competitive multi-agent setting, and introduces RoboSumo - a 3D environment with simulated physics that allows pairs of agents to compete against each other.
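To make the gradient-based meta-learning idea more tangible, here is a toy MAML-style sketch in Python with PyTorch. It is not the authors' RL implementation - just a one-parameter regression example in which the "task" (the slope of a line) keeps changing, and the meta-objective is the loss after a single adaptation step.

# Toy sketch of gradient-based (MAML-style) meta-learning.
# Each task is a 1-D regression y = a * x whose slope a changes from task to
# task. We meta-learn an initialization theta that adapts well in ONE step.
import torch

torch.manual_seed(0)

theta = torch.zeros(1, requires_grad=True)
meta_opt = torch.optim.SGD([theta], lr=0.01)
inner_lr = 0.05

def sample_task_batch(a, n=20):
    x = torch.randn(n, 1)
    return x, a * x

for step in range(2000):
    a = torch.empty(1).uniform_(-2.0, 2.0)     # sample a task (a new "environment")
    x_s, y_s = sample_task_batch(a)            # support set: data used to adapt
    x_q, y_q = sample_task_batch(a)            # query set: data used to evaluate

    # Inner step: adapt theta on the support set. create_graph=True keeps the
    # adaptation differentiable so the meta-gradient can flow through it.
    inner_loss = ((x_s * theta - y_s) ** 2).mean()
    (grad,) = torch.autograd.grad(inner_loss, theta, create_graph=True)
    theta_adapted = theta - inner_lr * grad

    # Outer step: the meta-objective is the loss AFTER adaptation.
    meta_loss = ((x_q * theta_adapted - y_q) ** 2).mean()
    meta_opt.zero_grad()
    meta_loss.backward()
    meta_opt.step()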
The paper answers the following questions:

- What is the behavior of different adaptation methods (in nonstationary locomotion and competitive multi-agent environments) when the interaction with the environment is strictly limited to one or very few episodes before it changes?
- What is the sample complexity of different methods, i.e., how many episodes are required for a method to successfully adapt to the changes?

Additionally, it answers the following questions specific to the competitive multi-agent setting:

- Given a diverse population of agents that have been trained under the same curriculum, how do different adaptation methods rank in a competition versus each other?
- When the population of agents is evolved for several generations, what happens to the proportions of different agents in the population?

Key Takeaways
This work proposes a simple gradient-based meta-learning approach suitable for continuous adaptation in nonstationary environments. The method was applied to nonstationary locomotion and to a competitive multi-agent setting - the RoboSumo environment. The key idea of the method is to regard nonstationarity as a sequence of stationary tasks and train agents to exploit the dependencies between consecutive tasks, such that they can handle similar nonstationarities at execution time. In both cases, i.e. the nonstationary locomotion tasks and the multi-agent setting, meta-learned adaptation rules were more efficient than the baselines in the few-shot regime. Additionally, agents that meta-learned to adapt demonstrated the highest level of skill when competing in iterated games against each other.

Reviewer feedback summary
Overall Score: 24/30
Average Score: 8
The paper was termed a great contribution to ICLR. According to the reviewers, it addresses a very important problem for general AI and is well written. They also appreciated the careful experiment designs and thorough comparisons, which make the results convincing. They found that editorial rigor and image quality could be better, but suggested no content-related improvements. The paper was praised as a dense and rich treatment of rapid meta-learning.


7 of the best machine learning conferences for the rest of 2018

Richard Gall
12 Jun 2018
8 min read
We're just about half way through the year - scary, huh? But there's still time to attend a huge range of incredible machine learning conferences in 2018. Given that in this year's Skill Up survey developers working in every field told us that they're interested in learning machine learning, it will certainly be worth your while (and money). We fully expect this year's machine learning conference circuit to capture the attention of those beyond the analytics world.

The best machine learning conferences in 2018
But which machine learning conferences should you attend for the rest of the year? There's a lot out there, and they're not always that cheap. Let's take a look at 7 of the best machine learning conferences for the rest of this year.

AI Summit London
When and where? June 12-14 2018, Kensington Palace and ExCel Center, London, UK.
What is it? AI Summit is all about AI and business - it's as much for business leaders and entrepreneurs as it is for academics and data scientists. The summit covers a lot of ground, from pharmaceuticals to finance to marketing, but the main idea is to explore the incredible ways artificial intelligence is being applied to a huge range of problems.
Who is speaking? According to the event's website, there are more than 400 speakers at the summit. The keynote speakers include a number of impressive CEOs, including Patrick Hunger, CEO of Saxo Bank, and Helen Vaid, Global Chief Customer Officer of Pizza Hut.
Who's it for? This machine learning conference is primarily for anyone who would like to consider themselves a thought leader. Don't let that put you off, though: with a huge number of speakers from across the business world, it is a great opportunity to see what the future of AI might look like.

ML Conference, Munich
When and where? June 18-20, 2018, Sheraton Munich Arabella Park Hotel, Munich, Germany.
What is it? Munich's ML Conference is also about the applications of machine learning in the business world. But it's a little more practical-minded than AI Summit - it's more about how to actually start using machine learning from a technological standpoint.
Who is speaking? Speakers at ML Conference are researchers and machine learning practitioners. Alison Lowndes from NVIDIA will be speaking, likely offering some useful insight on how NVIDIA is helping make deep learning accessible to businesses; Christian Petters, solutions architect at AWS, will also be speaking on the important area of machine learning in the cloud.
Who's it for? This is a good conference for anyone starting to become acquainted with machine learning. Obviously data practitioners will be the core audience here, but sysadmins and app developers starting to explore machine learning would also benefit from this sort of machine learning conference.

O'Reilly AI Conference, San Francisco
When and where? September 5-7 2018, Hilton Union Square, San Francisco, CA.
What is it? According to O'Reilly's page for the event, this conference is being run to counter those conferences built around academic AI research. It's geared (surprise, surprise) towards the needs of businesses. Of course, there's a little bit of aggrandizing marketing spin there, but the idea is fundamentally a good one: it's all about exploring how cutting-edge AI research can be used by businesses. It sits somewhere between the two above - practical enough to be of interest to engineers, but with enough blue-sky scope to satisfy the thought leaders.
Who is speaking? O'Reilly have some great speakers here.
There's someone else making an appearance for NVIDIA - Gaurav Agarwal, who's heading up the company's automated vehicles project. There's also Sarah Bird from Facebook, who will likely have some interesting things to say about how her organization is planning to evolve its approach to AI over the years to come.
Who is it for? This is for those working at the intersection of business and technology. Data scientists and analysts grappling with strategic business questions, and CTOs and CMOs beginning to think seriously about how AI can change their organization, will all find something here.

O'Reilly Strata Data Conference, New York
When and where? September 12-13, 2018, Javits Center, New York, NY.
What is it? O'Reilly's Strata Data Conference is slightly more big data focused than its AI Conference. Yes, it will look at AI and deep learning, but it's going to tackle those areas from a big data perspective first and foremost. It's more established than the AI Summit (it actually started back in 2012 as Strata + Hadoop World), so there's a chance it will have a slightly more conservative vibe. That could be a good or bad thing, of course.
Who is speaking? This is one of the biggest big data conferences on the planet. As you'd expect, the speakers are from some of the biggest organizations in the world, from Cloudera to Google and AWS. There are a load of names we could pick out, but the one we're most excited about is Varant Zanoyan from Airbnb, who will be talking about Zipline, Airbnb's new data management platform for machine learning.
Who's it for? This is a conference for anyone serious about big data. There's going to be a considerable amount of technical detail here, so you'll probably want to be well acquainted with what's happening in the big data world.

ODSC Europe 2018, London
When and where? September 19-22, Novotel West, London, UK.
What is it? The Open Data Science Conference is very much all about the open source communities that are helping push data science, machine learning, and AI forward. There's certainly a business focus, but the event is as much about collaboration and ideas. The organizers are keen to stress how mixed the crowd is at the event: from data scientists to web developers, academics and business leaders, ODSC is all about inclusivity. It's also got a clear practical bent - everyone will want different things from the conference, but learning is key here.
Who is speaking? ODSC haven't yet listed speakers on their website, simply stating that "our speakers include some of the core contributors to many open source tools, libraries, and languages". This indicates the direction of the event - community driven, and all about the software behind it.
Who's it for? More than any of the other machine learning conferences listed here, this is probably the one that really is for everyone. Yes, it might be more technical than theoretical, but it's designed to bring people into projects. Speakers want to get people excited, whether they're an academic, app developer, or CTO.

MLConf SF, San Francisco
When and where? November 14 2018, Hotel Nikko, San Francisco, CA.
What is it? MLConf has a lot in common with ODSC. The focus is on community and inclusivity rather than being overtly corporate. However, it is very much geared towards cutting-edge research from people working in industry and academia - this means it has a little more of a specialist angle than ODSC.
Who is speaking? At the time of writing, MLConf are on the lookout for speakers.
If you're interested, submit an abstract - guidelines can be found here. However, the event does have Uber's Senior Data Science Manager Franziska Bell scheduled to speak, which is sure to be an interesting discussion of the organization's current thinking and the challenges posed by the huge amounts of data at its disposal.
Who's it for? This is an event for machine learning practitioners and students. Level of expertise isn't strictly an issue - an inexperienced data analyst could get a lot from this. With some key figures from the tech industry speaking, there will certainly be something for those in leadership and managerial positions too.

AI Expo, Santa Clara
When and where? November 28-29, 2018, Santa Clara Convention Center, Santa Clara, CA.
What is it? Santa Clara's AI Expo is one of the biggest machine learning conferences. With four different streams - AI technologies, AI and the consumer, AI in the enterprise, and data analytics for AI and IoT - the event organizers are aiming for pretty comprehensive coverage.
Who is speaking? The event's website boasts 75+ speakers. The most interesting include Elena Grewal, Airbnb's Head of Data Science, Matt Carroll, who leads developer relations at Google Assistant, and LinkedIn's Senior Director of Data Science, Xin Fu.
Who is it for? With so much on offer, this has wide appeal. From marketers to data analysts, there's likely to be something here. However, with so much going on, you do need to know what you want to get out of an event like this - so be clear on what AI means to you and what you want to learn.

Did we miss an important machine learning conference? Are you attending any of these this year? Let us know in the comments - we'd love to hear from you.

Looking at the different types of Lookup cache

Savia Lobo
20 Nov 2017
6 min read
[box type="note" align="" class="" width=""]The following is an excerpt from a book by Rahul Malewar titled Learning Informatica PowerCenter 10.x. In this article we walk through the various types of lookup cache, based on how a cache is defined.[/box]

Cache is the temporary memory that is created when you execute a process. It is created automatically when a process starts and is deleted automatically once the process is complete. The amount of cache memory is decided based on the property you define at the transformation level or session level. You usually leave the property at its default setting, so that the cache can grow as required. If the size required for caching the data is more than the cache size defined, the process fails with an overflow error. There are different types of caches available.

Building the lookup cache - sequential or concurrent
You can define the session property to create the cache either sequentially or concurrently.

Sequential cache
When you select to create the cache sequentially, the Integration Service caches the data row by row as the records enter the lookup transformation. When the first record enters the lookup transformation, the lookup cache gets created and stores the matching record from the lookup table or file in the cache. This way, the cache stores only the matching data, which saves cache space by not storing unnecessary data.

Concurrent cache
When you select to create the cache concurrently, the Integration Service does not wait for the data to flow from the source; it caches the complete lookup data first. Once the caching is complete, it allows the data to flow from the source. When you select a concurrent cache, performance is better than with a sequential cache, since the scanning happens internally using the data stored in the cache.

Persistent cache - the permanent one
You can configure the cache to permanently save the data. By default, the cache is created as non-persistent, that is, the cache is deleted once the session run is complete. If the lookup table or file does not change across session runs, you can reuse the existing persistent cache.

Suppose you have a process that is scheduled to run every day, and you are using a lookup transformation on a reference table that is not supposed to change for six months. If you use a non-persistent cache, the same data is cached every day, which wastes time and space. If you select to create a persistent cache, the Integration Service makes the cache permanent in the form of a file in the $PMCacheDir location, so you save the time spent creating and deleting the cache memory every day. When the data in the lookup table changes, you need to rebuild the cache. You can define the condition in the session task to rebuild the cache by overwriting the existing cache; to do so, you check the rebuild option on the session property.

Sharing the cache - named or unnamed
You can enhance performance and save cache memory by sharing the cache if there are multiple lookup transformations used in a mapping. If both lookup transformations have the same structure, sharing the cache enhances performance by creating the cache only once; this way, we avoid creating the cache multiple times. You can share the cache either named or unnamed.

Sharing an unnamed cache
If you have multiple lookup transformations used in a single mapping, you can share the unnamed cache.
Since the lookup transformations are present in the same mapping, naming the cache is not mandatory. The Integration Service creates the cache while processing the first record in the first lookup transformation and shares the cache with the other lookups in the mapping.

Sharing a named cache
You can share a named cache with multiple lookup transformations in the same mapping or in another mapping. Since the cache is named, you can assign the same cache using its name in the other mapping. When you process the first mapping with a lookup transformation, it saves the cache in the defined cache directory under the defined cache file name. When you process the second mapping, it looks in the same location for that cache file and uses the data. If the Integration Service does not find the mentioned cache file, it creates a new cache.

If you run multiple sessions simultaneously that use the same cache file, the Integration Service processes both sessions successfully only if the lookup transformations are configured to read only from the cache. If both lookup transformations try to update the cache file, or one lookup tries to read the cache file while the other tries to update it, the sessions fail because there is a conflict in the processing. Sharing the cache enhances performance by reusing the cache that has already been created; we save processing time and repository space by not storing the same data multiple times for lookup transformations.

Modifying the cache - static or dynamic
When you create a cache, you can configure it to be static or dynamic.

Static cache
A cache is said to be static if it does not change with the changes happening in the lookup table; the static cache is not synchronized with the lookup table. By default, the Integration Service creates a static cache. The lookup cache is created as soon as the first record enters the lookup transformation, and the Integration Service does not update the cache while it is processing the data.

Dynamic cache
A cache is said to be dynamic if it changes with the changes happening in the lookup table; the dynamic cache is synchronized with the lookup table. You can choose to make the cache dynamic from the lookup transformation properties. The lookup cache is created as soon as the first record enters the lookup transformation, and the Integration Service keeps updating the cache while it is processing the data. The Integration Service marks a new row inserted in the dynamic cache as an insert, marks a row that is updated as an update, and marks every record that doesn't change as unchanged.

You use the dynamic cache when you process slowly changing dimension tables. For every record inserted in the target, the record is inserted in the cache, and for every record updated in the target, the record is updated in the cache. A similar process happens for deleted and rejected records.
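If it helps to see the static-versus-dynamic distinction outside of the PowerCenter GUI, here is a purely conceptual sketch in Python - plain dictionaries standing in for the lookup cache. This is an illustration of the idea only, not Informatica code or its API.

# Conceptual illustration only -- plain Python, not Informatica.
lookup_table = {101: "Gold", 102: "Silver"}   # pretend this is the lookup source

# Static cache: built once, never updated while rows are processed.
static_cache = dict(lookup_table)

# Dynamic cache: kept in sync as rows flow through, the way a dynamic
# lookup behaves when loading a slowly changing dimension.
dynamic_cache = dict(lookup_table)

def process_row(cust_id, tier, cache, dynamic=False):
    """Return the action the Integration Service would flag for this row."""
    if cust_id in cache:
        action = "update" if dynamic and cache[cust_id] != tier else "unchanged"
    else:
        action = "insert" if dynamic else "miss"
    if dynamic and action in ("insert", "update"):
        cache[cust_id] = tier                 # the cache itself is modified
    return action

print(process_row(103, "Bronze", static_cache))                 # 'miss'   -> cache untouched
print(process_row(103, "Bronze", dynamic_cache, dynamic=True))  # 'insert' -> cache updated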


Ian Goodfellow et al on better text generation via filling in the blanks using MaskGANs

Savia Lobo
19 Feb 2018
5 min read
In the paper "MaskGAN: Better Text Generation via Filling in the ______", Ian Goodfellow, along with William Fedus and Andrew M. Dai, proposes a way to improve sample quality using Generative Adversarial Networks (GANs), which explicitly train the generator to produce high-quality samples and have shown a lot of success in image generation.

Ian Goodfellow is a research scientist at Google Brain. His research interests lie in the fields of deep learning, machine learning security and privacy, and particularly in generative models. He is known as the father of Generative Adversarial Networks, and he runs the Self-Organizing Conference on Machine Learning, which was founded at OpenAI in 2016.

Generative Adversarial Networks (GANs) are an architecture for training generative models in an adversarial setup: a generator produces synthetic images and tries to fool a discriminator that is trained to distinguish between real and synthetic images. GANs have had a lot of success in producing more realistic images than other approaches, but they have only seen limited use for text sequences. They were originally designed to output differentiable values, so discrete language generation is challenging for them. The team of researchers introduces an actor-critic conditional GAN that fills in missing text conditioned on the surrounding context. The paper also shows that this GAN produces more realistic text samples compared to a maximum-likelihood-trained model.

MaskGAN: Better Text Generation via Filling in the _______

What problem is the paper attempting to solve?
The paper highlights how text generation has traditionally been done with recurrent neural network models, by sampling from a distribution that is conditioned on the previous word and a hidden state consisting of a representation of the words generated so far. These are typically trained with maximum likelihood in an approach known as teacher forcing. However, this method causes problems during sample generation: the model is often forced to condition on sequences that were never conditioned on at training time, which leads to unpredictable dynamics in the hidden state of the RNN. Methods such as Professor Forcing and Scheduled Sampling have been proposed to address this issue, but they work indirectly, either by making the hidden state dynamics predictable (Professor Forcing) or by randomly conditioning on sampled words at training time; they do not directly specify a cost function on the output of the RNN that encourages high sample quality. The method proposed in the paper tackles text generation with GANs through a sensible combination of novel approaches.

MaskGANs Paper summary
The paper proposes to improve sample quality using Generative Adversarial Networks (GANs), which explicitly train the generator to produce high-quality samples. The model is trained on a text fill-in-the-blank, or in-filling, task. In this task, portions of a body of text are deleted or redacted, and the goal of the model is to infill the missing portions of text so that they are indistinguishable from the original data. While in-filling text, the model operates autoregressively over the tokens it has filled in so far, as in standard language modeling, while conditioning on the true known context. If the entire body of text is redacted, the task reduces to language modeling.
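As a concrete picture of the in-filling setup (only the data preparation, not the actor-critic training itself), here is a small Python sketch that redacts a span of tokens the way the fill-in-the-blank task described above does. The mask token and masking ratio are arbitrary choices for illustration.

# Sketch of the in-filling input format: blank out a span of tokens so a
# model can be asked to fill it back in, conditioned on the visible context.
import random

MASK = "<m>"

def mask_tokens(tokens, mask_ratio=0.5, contiguous=True, seed=0):
    """Redact part of a token sequence; return (masked sequence, hidden targets)."""
    rng = random.Random(seed)
    n = max(1, int(len(tokens) * mask_ratio))
    if contiguous:
        start = rng.randrange(0, len(tokens) - n + 1)
        hidden = set(range(start, start + n))
    else:
        hidden = set(rng.sample(range(len(tokens)), n))
    masked = [MASK if i in hidden else t for i, t in enumerate(tokens)]
    targets = [t for i, t in enumerate(tokens) if i in hidden]
    return masked, targets

sentence = "the film was a pleasant surprise from start to finish".split()
print(mask_tokens(sentence))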
The paper also shows, qualitatively and quantitatively, evidence that the proposed method produces more realistic text samples compared to a maximum-likelihood-trained model.

Key Takeaways
- The paper introduces a text generation model trained on in-filling (MaskGAN), giving a good sense of what MaskGANs are.
- The paper considers the actor-critic architecture in extremely large action spaces, new evaluation metrics, and the generation of synthetic training data.
- The proposed contiguous in-filling task, i.e. MaskGAN, is a good approach to reduce mode collapse and help with training stability for textual GANs.
- The paper shows that MaskGAN samples on a larger dataset (IMDB reviews) are significantly better than those of the corresponding tuned MaskMLE model, as judged by human evaluation.
- The model can produce high-quality samples despite having much higher perplexity on the ground-truth test set.

Reviewer feedback summary/takeaways
Overall Score: 21/30
Average Score: 7/10
Reviewers liked the overall idea behind the paper. They appreciated the benefit the model gains from context (left context and right context) by solving a "fill-in-the-blank" task at training time and translating this into text generation at test time. One reviewer stated that the experiments were well carried through and very thorough. Another commented that the importance of the MaskGAN mechanism had been highlighted and that the description of the reinforcement learning training part had been clarified.

Alongside the pros, the reviewers also noted some cons:
- A lot of pre-training is required for the proposed architecture.
- Generated texts are generally locally valid but not always globally valid.
- It was not made very clear whether the discriminator also conditions on the unmasked sequence.

A reviewer also raised some unanswered questions:
- Was pre-training done for the baseline as well?
- How was the masking done? How were the words to mask chosen? Was this at random?
- Is it actually usable in place of ordinary LSTM (or RNN)-based generation?


Why training a Generative Adversarial Network (GAN) Model is no piece of cake

Shoaib Dabir
14 Dec 2017
5 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book by Kuntal Ganguly titled Learning Generative Adversarial Networks. The book gives complete coverage of generative adversarial networks.[/box]

The article highlights some of the common challenges that a developer might face while training GAN models.

Common challenges faced while working with GAN models
Training a GAN is basically about two networks, a generator G(z) and a discriminator D(z), racing against each other and trying to reach an optimum, more specifically a Nash equilibrium. The definition of Nash equilibrium, as per Wikipedia: (in economics and game theory) a stable state of a system involving the interaction of different participants, in which no participant can gain by a unilateral change of strategy if the strategies of the others remain unchanged.

1. Setup failure and bad initialization
If you think about it, this is exactly what a GAN is trying to do: the generator and discriminator reach a state where neither can improve further, given that the other is kept unchanged. Now, the setup of gradient descent is to take a step in a direction that reduces the loss measure defined on the problem - but we are by no means enforcing that the networks reach Nash equilibrium in a GAN, which has a non-convex objective with continuous, high-dimensional parameters. The networks take successive steps to minimize a non-convex objective and end up in an oscillating process rather than decreasing the underlying true objective.

In most cases, when your discriminator attains a loss very close to zero, you can tell right away that something is wrong with your model. The biggest pain-point is figuring out what is wrong. Another practical trick during the training of a GAN is to purposefully make one of the networks stall or learn more slowly, so that the other network can catch up. In most scenarios it's the generator that lags behind, so we usually let the discriminator wait. This might be fine to some extent, but remember that for the generator to get better it requires a good discriminator, and vice versa. Ideally, you want both networks to learn at a rate where both get better over time. The ideal minimum loss for the discriminator is close to 0.5 - this is where the generated images are indistinguishable from the real images from the perspective of the discriminator.

2. Mode collapse
One of the main failure modes when training a generative adversarial network is called mode collapse, or sometimes the helvetica scenario. The basic idea is that the generator can accidentally start to produce several copies of exactly the same image. The reason is related to the game-theory setup: we can think of the way we train generative adversarial networks as first maximizing with respect to the discriminator and then minimizing with respect to the generator. If we fully maximize with respect to the discriminator before we start to minimize with respect to the generator, everything works out just fine. But if we go the other way around, minimizing with respect to the generator and then maximizing with respect to the discriminator, everything breaks. The reason is that if we hold the discriminator constant, it will describe a single region in space as being the point that is most likely to be real rather than fake, and the generator will then choose to map all noise input values to that same most-likely-to-be-real point.
3. Problems with counting
GANs can sometimes be far-sighted and fail to differentiate the number of particular objects that should occur at a location - for example, generating a head with more eyes than it originally had.

4. Problems with perspective
GANs are sometimes not capable of differentiating between front and back views, and hence fail to adapt well to 3D objects when generating 2D representations of them.

5. Problems with global structure
GANs do not understand holistic structure, similar to the problems with perspective. For example, a GAN can generate an image of a quadruple cow, that is, a cow standing on its hind legs and simultaneously on all four legs. That is definitely unrealistic and not possible in real life!

Training GAN models, then, comes with some common challenges. The major ones are setup failure and, above all, mode collapse (the helvetica scenario), along with the problems with counting, perspective, and global structure listed above. To read more about solutions with real-world examples, check out the book Learning Generative Adversarial Networks.
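To see where these dynamics come from, here is a minimal, self-contained GAN training loop in Python with PyTorch - a toy 1-D example, not code from the book. It shows the alternating discriminator/generator updates described above, and watching d_loss is exactly the kind of health check the excerpt recommends: a discriminator loss collapsing toward zero is the warning sign.

# Toy GAN: learn to generate samples from N(4, 1.25) starting from noise.
import torch
import torch.nn as nn

torch.manual_seed(0)

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())
opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()
ones, zeros = torch.ones(64, 1), torch.zeros(64, 1)

for step in range(2000):
    # Discriminator update: real samples -> 1, generated samples -> 0.
    real = 4.0 + 1.25 * torch.randn(64, 1)     # "real" data
    fake = G(torch.randn(64, 8)).detach()      # detach: do not update G here
    d_loss = bce(D(real), ones) + bce(D(fake), zeros)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # Generator update: try to make D classify fakes as real.
    fake = G(torch.randn(64, 8))
    g_loss = bce(D(fake), ones)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

    if step % 500 == 0:
        # d_loss heading to zero means D is winning and G has stopped learning.
        print(f"step {step}: d_loss={d_loss.item():.3f} g_loss={g_loss.item():.3f}")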

Containerized Data Science with Docker

Darwin Corn
03 Jul 2016
4 min read
So, you're itching to begin your journey into data science but you aren't sure where to start. Well, I'm glad you've found this post, since I will explain step by step how I circumvented the unnecessarily large technological barrier to entry and got my feet wet, so to speak.

Containerization in general, and Docker in particular, have taken the IT world by storm in the last couple of years by making LXC containers more than just VM alternatives for the enterprising sysadmin. Even if you're coming at this post from a world devoid of IT, the odds are good that you've heard of Docker and their cute whale mascot. Of course, now that Microsoft is on board the containerization bandwagon and a consortium of bickering stakeholders has formed, you know that container tech is here to stay. I know, FreeBSD has had the concept of 'jails' for almost two decades now. But thanks to Docker, container tech is now usable across the big three of Linux, Windows, and Mac (if a bit hack-y in the case of the latter two), and today we're going to use its strengths in an exploration into the world of data science.

Now that I have your interest piqued, you're wondering where the two intersect. Well, if you're like me, you've looked at the footprint of RStudio and the nightmare maze of dependencies of IPython and "noped" right out of there. Thanks to containers, these problems are solved! With Docker, you can limit the amount of memory available to the container, and the way containers are constructed ensures that you never have to deal with troubleshooting broken dependencies on update ever again.

So let's install Docker, which is as straightforward as using your package manager on Linux, or downloading Docker Toolbox and running the installer if you're using a Mac or Windows PC. The instructions that follow are tailored to a Linux installation, but are easily adapted to Windows or Mac as well. On those two platforms, you can even bypass these CLI commands and use Kitematic, or so I hear.

Now that you have Docker installed, let's look at some use cases for how to use it to facilitate our journey into data science. First, we are going to pull the Jupyter Notebook container so that you can work with that language-agnostic tool:

# docker run --rm -it -p 8888:8888 -v "$(pwd):/notebooks" jupyter/notebook

The -v "$(pwd):/notebooks" flag mounts the current directory to the /notebooks directory in the container, allowing you to save your work outside the container. This matters because you'll be using the container as a temporary working environment: the --rm flag ensures that the container is destroyed when it exits, so if you rerun the command to get back to work after turning off your computer, the container will be replaced with an entirely new one. The volume mount gives the container access to a folder on the local filesystem, ensuring that your work survives the casually disposable nature of development containers.

Now go ahead and navigate to http://localhost:8888, and let's get to work. You did bring a dataset to analyze in a notebook, right? The actual nuts and bolts of data science are beyond the scope of this post, but for a quick intro to data and learning materials, I've found Kaggle to be a great resource.
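If you would rather script the container startup than type the CLI command above, the same run can be expressed with the Docker SDK for Python (the docker package on PyPI). This is a sketch under the assumption that the SDK and a local Docker daemon are available; the image name and mount path are taken from the command above.

# Sketch: launch the Jupyter notebook container from Python instead of the CLI.
# Assumes `pip install docker` and a running Docker daemon.
import os
import docker

client = docker.from_env()

container = client.containers.run(
    "jupyter/notebook",                 # same image as the CLI example
    ports={"8888/tcp": 8888},           # equivalent of -p 8888:8888
    volumes={os.getcwd(): {"bind": "/notebooks", "mode": "rw"}},  # -v "$(pwd):/notebooks"
    mem_limit="1g",                     # cap memory so the host stays responsive
    auto_remove=True,                   # clean up when the container exits, like --rm
    detach=True,
)
print("Notebook container started:", container.short_id)
# Visit http://localhost:8888 as before; stop it with container.stop() when done.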
While we're at it, let's look at that other issue I mentioned previously - the application footprint. Recently a friend of mine convinced me to use R, and I was enjoying working with the language until I got my hands on some real data and immediately felt the pain of an application not designed for endpoint use. I ran a regression and it locked up my computer for minutes! Fortunately, you can use a container to isolate it and feed it only limited resources to keep the rest of the computer happy:

# docker run -m 1g -ti --rm r-base

This command drops you into an interactive R CLI that should keep even the leanest of modern computers humming along without a hiccup. Of course, you can also use the -c and --blkio-weight flags to restrict access to CPU and disk I/O resources respectively, if limiting the container to 1 GB of RAM wasn't enough.

So, a program installation and a command or two (or a couple of clicks in the Kitematic GUI), and we're off and running doing data science with none of the typical headaches.

About the Author
Darwin Corn is a systems analyst for the Consumer Direct Care Network. He is a mid-level professional with diverse experience in the information technology world.


Is data science getting easier?

Erik Kappelman
10 Sep 2017
5 min read
The answer is yes, and no. This is a question that could easily have been asked of textile manufacturing in the 1890s, and it would have received a similar answer. By this I mean that textile manufacturing improved leaps and bounds throughout the industrial revolution; however, despite their productivity, textile mills were some of the most dangerous places to work. Before I explain my answer further, let's agree on a definition of data science. Wikipedia defines data science as "an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured." I see this as the process of acquiring, managing, and analyzing data.

Advances in data science
First, let's discuss why data science is definitely getting easier. Advances in technology and data collection have made data science easier. For one thing, data science as we know it wasn't even possible 40 years ago, but due to advanced technology we can now analyze, gather, and manage data in completely new ways. Scripting languages like R and Python have mostly replaced more convoluted languages like Haskell and Fortran in the realm of data analysis. Tools like Hadoop bring together a lot of different functionality to expedite every element of data science. Smartphones and wearable tech collect data more effectively and efficiently than older data collection methods, which gives data scientists more data of higher quality to work with. Perhaps most importantly, the utility of data science has become more and more recognized throughout the broader world, which helps provide data scientists the support they need to be truly effective. These are just some of the reasons why data science is getting easier.

Unintended consequences
While many of these tools make data science easier in some respects, there are also some unintended consequences that actually might make it harder. Improved data collection has been a boon for the data science industry, but using the data that is streaming in is similar to drinking out of a firehose. Data scientists are continually required to come up with more complicated ways of taking data in, because the stream of data has become incredibly strong. While R and Python are definitely easier to learn than older alternatives, neither language is usually accused of being parsimonious: what a skilled Haskell programmer might be able to do in 100 lines might take a less skilled Python scripter 500 lines. Hadoop, and tools like it, simplify the data science process, but it seems like there are 50 new tools like Hadoop a day. While these tools are powerful and useful, sometimes data scientists spend more time learning about tools and less time doing data science, just to keep up with the industry's landscape. So, like many other fields related to computer science and programming, new tech is simultaneously making things easier and harder.

Golden age of data science
Let me rephrase the title question in an effort to provide even more illumination: is now the best time to be a data scientist, or to become one? The answer to this question is a resounding yes. While all of the drawbacks I brought up remain true, I believe that we are in a golden age of data science, for all of the reasons already mentioned, and more. We have more data than ever before, and our data collection abilities are improving at an exponential rate.
The current situation has gone so far as to create the necessity for a whole new field of data analysis, Big Data. Data science is one of the most vast and quickly expanding human frontiers at present. Part of the reason for this is what data science can be used for: it can effectively answer questions that were previously unanswered. Of course, this makes for an attractive field of study from a research standpoint.

One final note on whether or not data science is getting easier. If you are a person who actually creates new methods or techniques in data science, especially if you need to support these methods and techniques with formal mathematical and scientific reasoning, data science is definitely not getting easier for you. As I just mentioned, Big Data is a whole new field of data science created to deal with new problems caused by the efficacy of new data collection techniques. If you are a researcher or academic, all of this means a lot of work. Bootstrapped standard errors were used in data analysis before a formal proof of their legitimacy was created; data science techniques might move at the speed of light, but formalizing and proving these techniques can literally take lifetimes. So if you are a researcher or academic, things will only get harder. If you are more of a practical data scientist, it may be slightly easier for now, but there's always something!

About the Author
Erik Kappelman wears many hats, including blogger, developer, data consultant, economist, and transportation planner. He lives in Helena, Montana and works for the Department of Transportation as a transportation demand modeler.