Tech News

27 Feb 2018

3 min read

27th Feb 2018 – Data Science News Daily Roundup

The fourth cumulative update for SQL Server 2017, MongoDB 3.6.3 ready for production, Google debuts AdSense ‘Auto ads’ with machine learning, and more in today’s top stories around machine learning, deep learning, and data science news.

1. SQL Server 2017 4th Cumulative Update release

Microsoft released the fourth cumulative update (CU) for SQL Server 2017 RTM (release to manufacturing). This update includes all fixes introduced after the release of SQL Server 2017, so installing it also resolves the issues fixed in any previous RTM CU. The latest 2017 update is CU4 - KB4056498. For information about the fixes and other improvements, read the release notes.

2. MongoDB 3.6.3 is out and ready for production deployment

MongoDB version 3.6.3 is out. This version includes some minor fixes and is a recommended upgrade for all 3.6 users. The fixes in 3.6.3 include:

- 3.6 mongod crash on find with index and nested $and/$or
- Tailing oplog on secondary fails with CappedPositionLost
- Specifying --bind_ip localhost results in the error “address already in use”

For the complete list, see all JIRA issues closed in 3.6.3. For more information, read the 3.6.3 Changelog.

3. Google releases ML-powered AdSense ‘Auto ads’

Google announced AdSense ‘Auto ads’, a new ad format that uses machine learning to read a web page and place ads that are appropriate for it. The system decides both where to place the ads and how many to run. According to Google, the format is highly optimized, easy to use, and can increase revenue opportunities for publishers. Publishers can activate Auto ads with a single line of code. Machine learning is used not only to decide where an ad should be placed but also to ingest analytics on how well that ad performs, which teaches the system to place ads better in the future. To know more about Auto ads and how it works, visit the official AdSense blog post.

4. Microsoft updates its Quantum Development Kit with support for Linux and Mac

Microsoft announced a major update to its Quantum Development Kit, which lets more developers experiment with quantum computing on more platforms. The update comprises:

- Support for Mac- and Linux-based development
- A full open source license for its quantum development libraries and samples
- Interoperability with the Python programming language
- Faster simulator performance

For details on these enhancements, read Microsoft’s official blog.

5. Pendo Systems releases version 4.0

Pendo Systems released version 4.0 of its Pendo Machine Learning Platform (PMLP). The new version includes an improved machine learning toolset, which accelerates time to implementation and is capable of tackling highly complex machine learning processing challenges. Version 4.0:

- Creates training data via an enhanced UI, which helps streamline the management, classification, and processing of documents and lets users train models against them.
- Adds new connectivity options with CMIS (Content Management Interoperability Services) support and web crawling.
- Includes new plugins that integrate with other systems to provide access to a range of machine learning algorithms.

Read more about this in the official press release.

Packt Editorial Staff

26 Feb 2018

3 min read

26th Feb 2018 – Data Science News Daily Roundup

Alibaba’s 11-qubit quantum computer, InfluxDB’s support for ephemeral data, reqres releases on CRAN, OpenAI’s MADDPG, and more in today’s top stories around machine learning, blockchain, and data science news.

1. Alibaba launches its 11-qubit quantum computing service

Alibaba has moved further into quantum computing with the launch of its 11-qubit quantum computing service. This is a joint venture between Alibaba’s cloud service subsidiary Aliyun and the Chinese Academy of Sciences. The service is available to the public on the Quantum Computing Cloud Platform. Alibaba Quantum Lab (AQL) has also released an ambitious 15-year roadmap. By 2025, it expects to have built quantum computers that will be the world’s fastest by today’s measure. By 2030, AQL hopes to achieve a general quantum computing prototype with 50–100 qubits. Aliyun is also offering a new 32-qubit quantum computer simulation service. By comparing simulated experiment results with real results from quantum computers, users can measure the latter’s performance and verify their correctness.

2. InfluxDB adds support for ephemeral data to its databases

InfluxData Inc. has updated its time-series database platforms with support for ephemeral data, that is, data that exists only for a very short period of time. Such data is increasingly generated by new technology deployments such as software containers, Kubernetes, and IoT sensors, and its short-lived nature makes it hard for existing database solutions to keep up with the influx. To counter this issue, InfluxData has built two time-series databases, InfluxDB and InfluxEnterprise, which are designed to query time-stamped metrics, events, and measurements more efficiently than traditional relational databases. InfluxDB has a significant number of users, including IBM Corp., which uses the platform to analyze operational information in real time.

3. reqres releases on CRAN

reqres is now released on CRAN. reqres is a new (in the R context) approach to working with HTTP messages, that is, the requests sent to a server and the responses it returns. There are two main objects in reqres, the Request class and the Response class. Both are built on R6 and are heavily inspired by the request and response classes in Express.js (a web server framework for Node.js). With reqres on CRAN, working directly with HTTP messages becomes simpler: reqres takes care of the minimum requirements and lets developers focus on the server logic instead.

4. OpenAI releases MADDPG, an algorithm for multi-agent reinforcement learning

OpenAI researchers have developed a new algorithm for centralized learning and decentralized execution in multi-agent environments. Called MADDPG, the algorithm allows agents to learn to collaborate and compete with each other. MADDPG extends a reinforcement learning algorithm called DDPG, taking inspiration from actor-critic reinforcement learning techniques. Each agent is treated as an “actor”, and each actor gets advice from a “critic” that helps the actor decide which actions to reinforce during training. More information is available at the OpenAI blog.

5. ServiceNow launches Agent Intelligence to make machine learning more accessible to organizations

ServiceNow has added machine learning capabilities directly into the Now Platform, making them accessible to all its cloud services and to other applications built on ServiceNow. Its Agent Intelligence ML solution will automate the categorization, prioritization, and assignment of work to reduce resolution times, minimize human error, and improve customer satisfaction. It can also quickly classify and route requests with fewer errors, increasing agent productivity. Agent Intelligence will initially be applied to improving the speed and quality of IT and customer-service processes.

Packt Editorial Staff

23 Feb 2018

3 min read

23rd Feb 2018 – Data Science News Daily Roundup

Microsoft announces SQL Server update for mssql-cli, ENAS-PyTorch, NumPy v1.14.1 released, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Microsoft announces SQL Server update for mssql-cli

Microsoft announced a new update for mssql-cli, its new interactive command-line query tool for SQL Server. mssql-cli is an open source tool that works cross-platform and is part of the dbcli community. The feature highlight of this v1.0.0 is the special commands. Microsoft states in its blog that these special commands are shortcuts for performing common tasks and queries. All special commands start with a backslash (\), and the built-in IntelliSense shows a list of the special commands available. Read more at the SQL Server Blog.

2. ENAS-PyTorch: A PyTorch implementation of “Efficient Neural Architecture Search via Parameters Sharing”

ENAS-PyTorch is a PyTorch implementation of “Efficient Neural Architecture Search via Parameters Sharing”. ENAS reduces the computational requirement (GPU-hours) of Neural Architecture Search (NAS) by 1000x. This is done via parameter sharing between models that are subgraphs within a large computational graph. To know more, visit the GitHub repository.

3. NumPy v1.14.1 released

NumPy version 1.14.1 has been released. This is a bugfix release for some problems reported following the 1.14.0 release. The major problems fixed include:

- Problems with the new array printing, particularly the printing of complex values.
- Problems with np.einsum due to the new optimize=True default. Some fixes for optimization have been applied, and optimize=False is now the default.
- The sort order in np.unique when axis=<some-number> will now always be lexicographic in the subarray elements. In previous NumPy versions there was an optimization that could result in sorting the subarrays as unsigned byte strings.
- The change in 1.14.0 that made multi-field indexing of structured arrays return a view instead of a copy has been reverted, but it remains on track for NumPy 1.15.

To know more, read NumPy’s release notes.

4. Feature Labs launches software solutions to automate feature engineering for machine learning and AI applications

Feature Labs, Inc. launched a set of tools to help data scientists build machine learning algorithms more quickly. As stated by Max Kanter, CEO and founder of Feature Labs, the company plans to automate ‘feature engineering’, a time-consuming and manual process for data scientists. Feature Labs uses “Deep Feature Synthesis” to automatically create features from raw relational and transactional datasets. Max Kanter also said, “Feature Labs is unique because we automate feature engineering, which is the process of using domain knowledge to extract new variables from raw data that make machine learning algorithms work.” Read more about this news on Feature Labs’ official website.

5. SentryOne releases version 18.1 with enhanced support of SSAS Tabular

SentryOne released version 18.1 of its SentryOne Platform. The updated version has enhanced support for SSAS Tabular in BI Sentry, SentryOne’s performance monitoring, diagnosis, and optimization solution for SQL Server Analysis Services (SSAS). Jason Hall, SentryOne Vice President of Product, said, “This update also introduces general performance enhancements to the SentryOne client, and additional performance enhancements to our APS and Azure SQL DW products.” For more detailed information, read the official press release.
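The two NumPy behaviours mentioned in the 1.14.1 item of this roundup can be illustrated with a short sketch. The arrays are made up for illustration; the only API assumptions are the `axis` argument of `np.unique` and the `optimize` argument of `np.einsum`, both of which exist in NumPy 1.14 and later.

```python
import numpy as np

# Rows are the "subarrays" here; with axis=0, np.unique returns the distinct
# rows sorted lexicographically (first column, then second column).
rows = np.array([[2, 1], [1, 3], [1, 2], [2, 1]])
unique_rows = np.unique(rows, axis=0)

# optimize=False (the default restored in 1.14.1) evaluates the contraction
# directly; it is passed explicitly so the behaviour does not depend on the
# installed version's default.
a = np.arange(6).reshape(2, 3)
b = np.arange(12).reshape(3, 4)
product = np.einsum("ij,jk->ik", a, b, optimize=False)

print(unique_rows)
print(product)
```

Passing `optimize` explicitly is a reasonable habit when code has to behave identically across NumPy versions whose defaults differ.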

Packt Editorial Staff

22 Feb 2018

3 min read

22nd Feb 2018 – Data Science News Daily Roundup

MariaDB MaxScale v2.2, Accenture AI testing services, Qualcomm’s AI engine, AllenNLP v0.4, and more in today’s top stories around machine learning, deep learning, and data science news.

1. MariaDB MaxScale 2.2, an advanced database proxy for MariaDB, is now generally available

MariaDB has announced the general availability of MariaDB MaxScale 2.2. MaxScale is an advanced database proxy for MariaDB. Version 2.2 brings a variety of new features, including:

- Replication cluster failover management
- High availability of MaxScale
- Security features for General Data Protection Regulation (GDPR) compliance
- Readiness for the upcoming MariaDB Server 10.3
- Improved management interface
- Proxy Protocol, which eases the configuration and authorization of users by eliminating the need to duplicate them in both MariaDB MaxScale and MariaDB Server

For the full list of changes, have a look at the release notes.

2. Accenture announces new services for testing AI systems

Accenture has launched new AI testing services. These services are built on a “Teach and Test” methodology designed to help companies build, monitor, and measure reliable AI systems. The “Teach” phase emphasizes the choice of data, models, and algorithms that are used to train machine learning systems. In the “Test” phase, AI outputs are compared with key performance indicators, and the system is assessed on whether it can explain how a decision or outcome was determined, using innovative techniques and cloud-based tools to monitor the system. Accenture has used this methodology to train a conversational virtual agent for a financial services company’s website. The agent was trained 80 percent faster than previously possible and achieved an 85 percent accuracy rate on customer recommendations.

3. Qualcomm launches its new Artificial Intelligence Engine

To help developers provide better machine learning-based enhancements, Qualcomm has launched a new AI engine. The Qualcomm Artificial Intelligence Engine consists of several hardware and software components that app developers can use to provide “AI-powered user experiences”, with or without a network connection. Key features include:

- Snapdragon Neural Processing Engine (NPE) software framework to accelerate AI user experiences on a device. The Snapdragon NPE supports the TensorFlow, Caffe, and Caffe2 frameworks, in addition to the Open Neural Network Exchange (ONNX) interchange format.
- Support for the Android Neural Networks API, giving developers access to Snapdragon platforms directly through the Android operating system.
- Hexagon Neural Network (NN) library, allowing developers to run AI algorithms directly on the Hexagon Vector Processor.

4. Microsoft Azure Notebooks will now let users learn data science, free of charge

Microsoft has made creating and sharing live, working code easier with its Microsoft Azure Notebooks service. The service is now available free of charge and allows data science enthusiasts to learn programming and data science outside of traditional schooling. Microsoft Azure Notebooks lets users get started quickly on tasks such as data visualization and prototyping, all within a web browser. It is an implementation of the popular open-source Jupyter Notebooks and is available to anyone who creates a free account.

5. AllenNLP, an open-source NLP research library built on PyTorch, releases its version 0.4

AllenNLP has released version 0.4 of its NLP research library, which is built on PyTorch. The major changes include:

- Inclusion of ELMo, which produces contextualized word embeddings that greatly improve model performance.
- Support for lazy datasets: users can now stream data through the trainer with a lower memory footprint.
- First-class support for models that operate on spans instead of tokens.
- Support for programmatically importing additional dependencies.
- A simple server to create a stand-alone web demo for a model.
- Constrained decoding added to the ConditionalRandomField module (and to the corresponding NER tagger model).

Additional features and bug fixes are listed in the GitHub repo.

Packt Editorial Staff

21 Feb 2018

3 min read

21st Feb 2018 – Data Science News Daily Roundup

JupyterLab now user-ready, JetBrains announces beta version of Datalore, Baidu’s latest research paper on speech synthesis, and more in today’s top stories around machine learning, deep learning, and data science news.

1. JupyterLab is now ready for users

The Jupyter team announced the beta release of JupyterLab, the next-generation web-based interface for Project Jupyter. JupyterLab is an interactive development environment for working with notebooks, code, and data. It provides a high level of integration between notebooks, documents, and activities:

- Drag-and-drop to reorder notebook cells and copy them between notebooks.
- Run code blocks interactively from text files (.py, .R, .md, .tex, etc.).
- Link a code console to a notebook kernel to explore code interactively without cluttering up the notebook with temporary scratch work.
- Edit popular file formats with live preview, such as Markdown, JSON, CSV, Vega, VegaLite, and more.

Read more on the Jupyter blog.

2. JetBrains announces beta version of Datalore: A web application for machine learning

JetBrains has launched a public beta of Datalore, an intelligent web application for data analysis and visualization in Python. Datalore includes features such as:

- Intelligent and easy-to-use code editor
- Incremental computations
- Out-of-the-box machine learning tools
- Real-time collaboration
- Different computational instances

To read about the features in detail, visit JetBrains’ official blog.

3. Baidu’s latest breakthrough in speech synthesis

Baidu Research recently published a new research paper, “Neural Voice Cloning with a Few Samples”. The paper focuses on two fundamental approaches to voice cloning: speaker adaptation and speaker encoding. Both techniques can be applied to a multi-speaker generative speech model with speaker embeddings without degrading its quality. In terms of the naturalness of the speech and its similarity to the original speaker, both demonstrate good performance, even with very few cloning audios. Read the research paper for complete information on this topic.

4. IBM and Unity collaboration brings Watson into virtual reality environments

IBM announced a collaboration with Unity to build a development kit that lets companies draw on IBM’s cloud-based Watson artificial intelligence suite in their projects. The features this collaboration would bring are:

- The ability to analyze the objects in a virtual environment using Watson Visual Recognition.
- The ability for Unity developers to configure games and projects to understand speech, communicate with users, and understand the intent of a user in natural language.
- Watson’s Vision API, which will allow developers to integrate real-time visual recognition into their Unity projects.

Visit Unity’s official blog post for details on the extended features of this collaboration.

5. Satoshipowered.ai wants to link VR and blockchain

Satoshipowered.ai (SAI), a decentralized autonomous game development and crowd publishing organization, stated that it wants to use blockchain’s decentralized bookkeeping to give players true ownership over digital goods, which could introduce economic scarcity to games with a focus on virtual worlds. SAI announced that it will make use of the Ethereum blockchain, which allows anyone to spin up their own customized digital coin. Developers can rework the blockchain that keeps track of Ethereum to also keep track of any other records.

Sugandha Lahoti

20 Feb 2018

4 min read

Paper in Two minutes: Using Mean Field Games for learning behavior policy of large populations

This ICLR 2018 accepted paper, Deep Mean Field Games for Learning Optimal Behavior Policy of Large Populations, deals with inference in models of collective behavior, specifically how to infer the parameters of a mean field game (MFG) representation of collective behavior. The paper is authored by Jiachen Yang, Xiaojing Ye, Rakshit Trivedi, Huan Xu, and Hongyuan Zha. The 6th annual ICLR conference is scheduled to happen between April 30 - May 03, 2018.

Mean field game theory is the study of decision making in very large populations of small interacting agents. It models the behavior of multiple agents, each individually trying to optimize their position in space and time, with their preferences partly determined by the choices of all the other agents.

Estimating the optimal behavior policy of large populations with Deep Mean Field Games

What problem is the paper attempting to solve?

The paper considers the problem of representing and learning the behavior of a large population of agents in order to construct an effective predictive model of that behavior. For example, a population’s behavior directly affects the ranking of a set of trending topics on social media, represented by the global population distribution over topics. Each user’s observation of this global state influences their choice of the next topic in which to participate, thereby contributing to future population behavior.

Classical predictive methods such as time series analysis are also used to build predictive models from data. However, these models do not treat the behavior as the result of optimizing a reward function, and so may not provide insight into the motivations that produce a population’s behavior policy. Alternatively, methods that employ the underlying population network structure assume that nodes are only influenced by a local neighborhood and do not include a representation of a global state. Hence, they face difficulty in explaining events as the result of uncontrolled implicit optimization.

MFG overcomes the limitations of these alternative predictive methods by determining how a system naturally behaves according to its underlying optimal control policy. The paper proposes a novel approach for estimating the parameters of an MFG. Its main contribution is in relating the theories of MFG and reinforcement learning (RL) within the classic context of Markov Decision Processes (MDPs). The suggested method uses inverse RL to learn both the reward function and the forward dynamics of the MFG from data.

Paper summary

The paper covers the problem in three sections: theory, algorithm, and experiment. The theoretical contribution begins by transforming a continuous-time MFG formulation into a discrete-time formulation and then relates the MFG to an associated MDP problem.

In the algorithm section, an RL solution to the MFG problem is suggested. The authors relate solving an optimization problem on an MDP of a single agent to solving the inference problem of the (population-level) MFG. This leads to learning a reward function from demonstrations using a maximum likelihood approach, where the reward is represented by a deep neural network. The policy is learned through an actor-critic algorithm, based on gradient descent with respect to the policy parameters.

The algorithm is then compared with previous approaches on toy problems with artificially created reward functions. The authors then demonstrate the algorithm on real-world social data with the aim of recovering the reward function and predicting the future trajectory.

Key Takeaways

- The paper describes a data-driven method to solve a mean field game model of population evolution, by proving a connection between mean field games and Markov Decision Processes and building on methods in reinforcement learning.
- The method is scalable to arbitrarily large populations because the MFG framework represents population density rather than individual agents.
- In experiments on real data, MFG emerges as a powerful framework for learning a reward and policy that can predict trajectories of a real-world population more accurately than alternatives.

Reviewer feedback summary

Overall Score: 26/30
Average Score: 8.66

The reviewers are unanimous in finding the work in this paper highly novel and significant. According to the reviewers, there is still minimal work at the intersection of machine learning and collective behavior, and this paper could help to stimulate the growth of that intersection. On the flip side, surprisingly, the paper was criticized with the statement “scientific content of the work has critical conceptual flaws”. However, the authors’ rebuttals persuaded the reviewers that the concerns were largely addressed.
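To make the point about population density concrete, here is a minimal sketch of a discrete-time update on a distribution over topics rather than on individual agents. It is an illustration under stated assumptions, not the paper’s algorithm: the fixed `transition` matrix and the three-topic setup are hypothetical, whereas in the paper the transition policy is learned and depends on the current population state.

```python
def step(distribution, transition):
    """One forward step: new[j] = sum_i distribution[i] * transition[i][j]."""
    n = len(distribution)
    return [
        sum(distribution[i] * transition[i][j] for i in range(n))
        for j in range(n)
    ]

# Hypothetical example with three topics. Row i gives the fraction of the
# population on topic i that moves to each topic at the next time step.
transition = [
    [0.8, 0.1, 0.1],
    [0.2, 0.7, 0.1],
    [0.3, 0.3, 0.4],
]
population = [0.5, 0.3, 0.2]  # initial share of the population per topic

trajectory = [population]
for _ in range(5):
    trajectory.append(step(trajectory[-1], transition))

for t, dist in enumerate(trajectory):
    print(t, [round(p, 4) for p in dist])
```

The cost of each step depends only on the number of topics, not on the number of users, which is the sense in which a density-based representation scales to large populations.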

Packt Editorial Staff

20 Feb 2018

4 min read

20th Feb 2018 – Data Science News Daily Roundup

Kepler brings blockchain for AI, ARM releases two AI chip designs, Coinbase commerce plugin, and more in today’s top stories around machine learning, blockchain, and data science news.

1. Kepler Technologies plans to build a decentralized ecosystem for the development of AI and robotics

Kepler Technologies, a blockchain-based startup, plans to use blockchain-based solutions for the development of AI and robotics projects. Its decentralized platform is backed by smart contracts and powered by proprietary analytical algorithms. Every proposal presented to Kepler Technologies will be recorded to the blockchain through its Proof-of-Creation network protocols. The platform’s KEP token is the driver behind all voting and funding on the ecosystem. KEP tokens will be used to incentivize behaviors on the platform in a decentralized way. They will also let users buy products at a discount, vote for or against projects to be developed, and fund proposals that are accepted. The company has also established a blockchain tech incubation platform where users can exhibit their ideas and connect with global investors to gain financial support. Innovators can connect with each other from anywhere in the world to develop their projects.

2. ARM releases two new AI chip designs for mobile devices

ARM has released designs for two new AI processors intended to deliver large amounts of computational capability to mobile devices. The first, the ARM Machine Learning (ML) Processor, will speed up general AI applications from machine translation to facial recognition. The second, the ARM Object Detection (OD) Processor, is a second-generation design optimized for processing visual data and detecting people and objects. The company said that its ML processor can handle more than 4.6 trillion operations per second while drawing very little power. Devices using the ARM ML Processor will be able to perform ML independently of the cloud. The OD processor is expected to be available to industry customers at the end of this month, while the ML processor design will be available sometime in the middle of the year.

3. Coinbase unveils a new plugin for Ethereum, Bitcoin, and other cryptocurrencies

Coinbase, the popular crypto broker, has launched a new PayPal-like plugin service for cryptocurrency merchants. It allows them to integrate crypto payments by adding a Coinbase Commerce button. The plugin is available for Ethereum, Bitcoin, Bitcoin Cash, and Litecoin. Previously, the merchant service was directly integrated with Coinbase and required a Coinbase account. Now it is a crypto integration option no different from paying by credit card or PayPal.

4. IBM plans to use blockchain technology to aid the government

IBM wants to use blockchain technology in US governance processes to help make services more secure. According to Jerry Cuomo, IBM’s vice-president of blockchain technology, “US government should employ the digital ledger technology for services such as paying taxes, creating secure identities, tracking food and drug shipments, among other purposes”. He preferred integrating blockchain into existing government projects and programmes rather than creating new projects based on the technology. The federal and state governments in the US are already working on several experimental blockchain projects, with some states working on blockchain-based driver’s licenses and identification cards. IBM itself is working with the Centers for Disease Control and Prevention (CDC) on implementing blockchain to increase the speed of the CDC’s ability to develop new drugs.

5. Hyperband, hyperparameter optimization for PyTorch

A new PyTorch implementation of Hyperband is in development. Hyperband is a hyperparameter optimization algorithm that exploits the iterative nature of SGD and the embarrassing parallelism of random search. Unlike Bayesian optimization methods, which focus on optimizing hyperparameter configuration selection, Hyperband poses the problem as a hyperparameter evaluation problem. It adaptively allocates more resources to promising configurations while quickly eliminating poor ones. This allows it to evaluate orders of magnitude more hyperparameter configurations. It is described in the paper Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization by Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar. The implementation details are available in the GitHub repo.
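The idea of giving more resources to promising configurations while eliminating poor ones early can be sketched with successive halving, the subroutine that Hyperband runs repeatedly with different trade-offs between the number of configurations and the budget per configuration. This is a minimal illustration, not the PyTorch implementation mentioned above: `toy_loss`, the learning-rate search space, and the parameter names are hypothetical stand-ins for training a real model.

```python
import math
import random

def successive_halving(configs, evaluate, min_budget=1, eta=3):
    """Keep the best 1/eta of the configurations each round while
    multiplying the per-configuration budget by eta."""
    budget = min_budget
    survivors = list(configs)
    while len(survivors) > 1:
        # Score every surviving configuration at the current budget
        # (lower loss is better) and keep only the top fraction.
        scored = sorted(survivors, key=lambda cfg: evaluate(cfg, budget))
        keep = max(1, len(scored) // eta)
        survivors = scored[:keep]
        budget *= eta
    return survivors[0]

def toy_loss(lr, budget):
    # Hypothetical stand-in for "train for `budget` epochs and return the
    # validation loss": best near lr = 1e-2, improving with more budget.
    return (math.log10(lr) + 2.0) ** 2 + 1.0 / budget

random.seed(0)
# Random search over learning rates sampled log-uniformly in [1e-5, 1].
candidates = [10 ** random.uniform(-5, 0) for _ in range(27)]
best_lr = successive_halving(candidates, toy_loss)
print(best_lr)
```

With 27 candidates and eta=3, the rounds evaluate 27, 9, and 3 configurations at budgets 1, 3, and 9, so most of the total budget goes to the few configurations that survive.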

Packt Editorial Staff

19 Feb 2018

3 min read

19th Feb 2018 – Data Science News Daily Roundup

Oracle Database 18c, spaCy v2.0.8 released, Apache Phoenix 5.0.0-alpha released, and more in today’s top stories around machine learning, deep learning, and data science news. 1. Oracle Database 18c: Now available on the Oracle Cloud and Oracle Engineered Systems Oracle announced that its Oracle Database 18c is now available on the Oracle Cloud and Oracle Engineered systems. The 18c is the first annual release in Oracle’s new database software release model, and is a core component of Oracle's recently announced Autonomous Database Cloud. The 18c does not contain any seismic changes in functionality but there are lots of incremental improvements, which include: High performance analytics High availability Multitenant improvements Application Development Big Data and Data Warehousing Security improvements To read more on the new features of Oracle Database 18c in detail, visit the documentation blog. 2. spaCy v2.0.8 released spaCy is an open-source software library for carrying out advanced Natural Language Processing. Its v2.0.8 has been released with some new features, performance improvements and minor bug fixes. The features and performance improvements are: NEW: Lexical attribute IS_CURRENCY via Token.is_currency for currency symbols. Add noun_chunks syntax iterator for Norwegian. Add get_beam_parse method in ArcEager. Revert changes to the Matcher in favour of the new and improved API (#1971) coming in v2.1.0. Fixes various typos and inconsistencies For bug fixes and other detailed information, visit the GitHub Repo. 3. Turi Create 4.1.1 released Turi Create is designed to simplify the development of custom machine learning models. The last major release of Turi Create was on December, 2017, which was Turi Create 4.0: the initial open source release by Apple. 
Version 4.1.1 of Turi Create is released, which includes two fixes: import turicreate fails on macOS 10.12.6 (#256) Miscellaneous documentation consistency fixes For source code and other information visit the GitHub repository. 4. Apache Phoenix 5.0.0-alpha released Apache Phoenix is an open source, parallel, relational database engine that supports OLTP for Hadoop using the Apache HBase as the backing store. It enables OLTP and operational analytics in Hadoop for low latency applications. The Apache Phoenix 5.0.0 is an alpha release. This release is the first version of Phoenix which is compatible with Apache Hadoop 3.0.x and Apache HBase 2.0.x. Known issues: The Apache Hive integration is known to be non-functional (PHOENIX-4423) Split/Merge logic with Phoenix local indexes are broken (PHOENIX-4440) Apache Tepha integration/transactional tables are non-functional (PHOENIX-4580) Point-in-time queries and tools that look at “old” cells are broken, e.g. IndexScrutiny (PHOENIX-4378) Developers encourage users to test this release out and report any observed issues for the official 5.0.0 release quality to be significantly improved. 5. Cloudera Enterprise 5.14 released Cloudera released its Cloudera Enterprise 5.14. This is a maintenance release and has fixed two issues, which were: Cloudera Manager upgrade workflow incorrectly requires deploying some optional management roles Logging issue slows down Hive and HDFS Replication jobs For further details, visit the Cloudera documentation page. 6. Eggplant AI 2.0: Machine Learning brought to Software Testing Testplant released a new version 2.0 of its Eggplant AI. This version uses AI, machine learning, and analytics to intelligently navigate applications, predict quality issues, and correlate data, which can help product teams quickly identify and resolve issues. 
Eggplant AI 2.0 highlights: uses AI and neural networks to auto-generate tests and focus test execution on the user journeys most likely to find defects; enables software and app vendors to keep up with the pace of DevOps and user expectations; helps improve the user experience.
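The spaCy item in this roundup mentions a new lexical attribute for currency symbols. As a rough illustration of what such a flag can mean, here is a minimal standard-library sketch that treats a token as currency when every character falls in the Unicode "Sc" (currency symbol) category. The helper names is_currency and currency_tokens are hypothetical, and this is an assumption about one reasonable rule, not spaCy's own implementation.

```python
import unicodedata
from typing import List


def is_currency(text: str) -> bool:
    """Return True if every character of `text` is a Unicode currency symbol.

    Hypothetical helper for illustration only; spaCy's rule may differ.
    """
    return bool(text) and all(unicodedata.category(ch) == "Sc" for ch in text)


def currency_tokens(sentence: str) -> List[str]:
    """Split on whitespace and keep the tokens flagged as currency symbols."""
    return [token for token in sentence.split() if is_currency(token)]


if __name__ == "__main__":
    print(currency_tokens("The fee is $ 20 or € 18"))
```

Whitespace splitting stands in for real tokenization here; a full NLP pipeline would also separate symbols that are attached to numbers.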

Packt

16 Feb 2018

2 min read

Courses Week - Discover Packt's New Live and On Demand Products

Practical training delivered by experts. That’s what we wanted to achieve when we started developing our new courses product. And now you can try them for yourself - courses week, running from 12th-18th February, is the perfect time to discover a range of live and on-demand courses and pick them up for an incredible price.

Packt courses explained

On-demand courses

On-demand courses give you access to training delivered by an expert that you can follow however you want. Watch online, pause, rewind, and skip ahead at your convenience. Then put your learning to the test with featured quizzes after each section - proven to help you better retain and apply your learning. Featuring courses on everything from deep learning to AWS, from Akka to Angular 2, we’ve got courses on some of the most important trends and tools in modern tech. Throughout courses week every single on-demand course will be available for just $10. That’s hours of training combined with expert insight for the price of lunch.

Live courses

Live courses are an event - which means you’ll be attending a unique training session with like-minded people. Each session will be run by an expert in the field with unmatched industry experience. In it, they’ll explore concepts, use cases, and challenges, using their insight to illustrate how to make cutting-edge trends and tools work for you. Throughout courses week you’ll be able to book your seat at our short (2 hour) courses for just $10. You’ll also be able to book your place on an 8 hour session with DevOps expert Viktor Farcic for $49.

Course bundles

We've put together related courses into bundles so you can purchase your own complete program of training.
You can get any of these on-demand course bundles for $20 throughout courses week: Deep Learning with TensorFlow and Python; Modern Pentesting Techniques and Tools; Modern Angular Web Development; Redefining software with AWS and Azure; Data science with Python. At just $20, it's your chance to learn new skills with industry experts for an incredible price.

Packt Editorial Staff

16 Feb 2018

3 min read

16th Feb 2018 – Data Science News Daily Roundup

Facebook introduces Tensor Comprehensions, IBM adds AI to Cloud Video, Microsoft’s VS Code now ships with Anaconda, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Facebook Announces Tensor Comprehensions

Facebook AI Research (FAIR) announced the release of Tensor Comprehensions, a C++ library and mathematical language for running large-scale models on various hardware backends. It uses Just-In-Time compilation to automatically produce high-performance code on demand for machine learning workloads. The release includes: a mathematical notation to express a broad family of ML ideas in a simple syntax; a C++ frontend for this mathematical notation based on Halide IR; a polyhedral Just-in-Time (JIT) compiler based on the Integer Set Library (ISL); a multi-threaded, multi-GPU autotuner based on evolutionary search. For further details, visit the Facebook Research blog.

2. IBM adds AI Capabilities to IBM Cloud Video

IBM Cloud Video unveiled new AI-powered Automated Watson Caption Support and Speech-to-Text capabilities for its enterprise video offering. These new AI capabilities will help recognize speech within videos and convert spoken words and phrases into text for video captions. Here’s how they would be helpful: automatic transcript generation and real-time processing will slash editing workflows and costs; the advanced search and discovery features will help optimize employee engagement; increased accessibility and compliance will make content more digestible for all team members. To know more, read the official press release published online.

3. Microsoft’s Visual Studio Code is now included in the Anaconda distribution

Microsoft’s Visual Studio Code will now ship as part of the popular Python data science platform Anaconda.
According to Microsoft, “Visual Studio Code can easily be installed at the same time as Anaconda, providing a great editing and debugging experience for Python users, with special features tailor-made for Anaconda users.” Microsoft has previously invested in the Python community with the Python extension for VS Code and support for Python in Azure Machine Learning, SQL Server, and Azure Notebooks. According to the Anaconda team, “VS Code is a good IDE choice for its users on Windows, macOS, and Linux because of its debugging, code completion, and Git integration features.” It also offers a number of extensions that developers can tailor to their specific needs.

4. MongoDB announces support for multi-document ACID transactions in version 4.0

MongoDB has announced that it will support multi-document ACID transactions in its 4.0 release. With this release, MongoDB will combine the power of NoSQL with cross-collection ACID transaction support, which should make it easier for developers to write mission-critical applications on MongoDB. ACID (Atomicity, Consistency, Isolation, Durability) describes the ability to guarantee that a transaction is valid, which is difficult when data is distributed across multiple documents. With multi-document transactions, MongoDB will provide a globally consistent view of data across replica sets and enforce all-or-nothing execution to maintain data integrity.

5. Cloudant 2.8.0 is now released

Cloudant, the cloud-based service based on the Apache-backed CouchDB project, has released version 2.8.0. The changes include: added support for the /_search_disk_size endpoint, which retrieves disk size information for a specific search index; an updated default IBM Cloud Identity and Access Management token URL; removal of the broken source and target parameters that constantly threw AttributeError when creating a replication document. The full list of changes is available at the GitHub repo.
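The MongoDB item above describes all-or-nothing execution across several documents and collections. The idea can be sketched with plain Python dictionaries: run every operation against a private copy and only publish the copy if none of them fails. Everything here (run_transaction, debit, record_order, and the in-memory store layout) is hypothetical and chosen for illustration; it is not the MongoDB API and says nothing about how MongoDB implements transactions.

```python
import copy
from typing import Callable, Dict, List

# Hypothetical in-memory layout: collection name -> document id -> document
Collections = Dict[str, Dict[str, dict]]


def run_transaction(store: Collections,
                    operations: List[Callable[[Collections], None]]) -> bool:
    """Apply every operation or none of them.

    Operations run against a private deep copy; the live store is only
    updated if all of them finish without raising.
    """
    working = copy.deepcopy(store)
    try:
        for operation in operations:
            operation(working)
    except Exception:
        return False  # discard the working copy; `store` is untouched
    store.clear()
    store.update(working)
    return True


def debit(db: Collections) -> None:
    """Take 50 from alice's balance, failing if she cannot afford it."""
    account = db["accounts"]["alice"]
    if account["balance"] < 50:
        raise ValueError("insufficient funds")
    account["balance"] -= 50


def record_order(db: Collections) -> None:
    """Write an order document in a second collection."""
    db["orders"]["o1"] = {"buyer": "alice", "total": 50}


if __name__ == "__main__":
    store = {"accounts": {"alice": {"balance": 30}}, "orders": {}}
    print(run_transaction(store, [record_order, debit]), store)
```

In the usage example the order is written first and the debit then fails, so neither change is visible afterwards. Copying the whole store is only practical for a toy; it is used here to keep the rollback logic obvious.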

Packt Editorial Staff

15 Feb 2018

4 min read

15th Feb 2018 – Data Science News Daily Roundup

TensorFlow 1.6.0-rc1, node-oracledb 2.1 on npm, eXist-db v4.0.0 released, and more in today’s top stories around machine learning, deep learning, and data science news.

1. TensorFlow 1.6.0-rc1 has been released

TensorFlow 1.6.0 release candidate 1 has been released with two breaking changes: prebuilt binaries are now built against CUDA 9.0 and cuDNN 7; prebuilt binaries will use AVX instructions, which may break TensorFlow on older CPUs. There are also some major features and improvements: a new Optimizer internal API for non-slot variables; tf.estimator.{FinalExporter,LatestExporter} now export stripped SavedModels, which improves forward compatibility of the SavedModel; FFT support added to XLA CPU/GPU; Android TF can now be built with CUDA acceleration on compatible Tegra devices. In addition, there are bug fixes and other changes, such as updates to the documentation and to Google Cloud Storage (GCS) support, which you can read about at the GitHub repo.

2. Oracle announces ‘Oracle Enterprise Data Management Cloud’

Oracle extended its Enterprise Performance Management (EPM) suite by announcing Oracle Enterprise Data Management Cloud at its Modern Finance Experience 2018 in New York. Hari Sankar, Oracle’s group vice president, EPM product management, said, “The new offering will help customers manage important metadata structures related to their financial applications to avoid misalignment and lack of consistency.” Some benefits of the Oracle Enterprise Data Management Cloud include: Faster cloud adoption, which allows migration and mapping of enterprise data elements and ongoing changes across public, private, and hybrid cloud environments from Oracle or third parties. Improved business agility, which allows faster business transformation through modeling M&A scenarios, reorganizations and restructuring, and chart of accounts standardization and redesign.
Better alignment of enterprise applications, which can manage ongoing changes across front-office, back-office, and performance management applications through self-service enterprise data maintenance, sharing, and rationalization. A system of reference for all your enterprise data, which supports enterprise data across business domains, including master data, reference data, dimensions, hierarchies, business taxonomies, associated relationships, mappings, and attributes across diverse business contexts.

3. Oracle’s node-oracledb 2.1 is now available from npm

Oracle announced that node-oracledb 2.1.0, the Node.js module for accessing Oracle Database, is now available on npm. The top features of this release include: support for SYSDBA, SYSOPER, SYSASM, SYSBACKUP, SYSDG, SYSKM, and SYSRAC privileges in standalone connections; a new 'destroy()' method for 'queryStream()' streams; an improved Error object with new 'errorNum' and 'offset' properties; new 'versionSuffix' and 'versionString' properties on the oracledb object to help show the release status and version. Node-oracledb 2.1 no longer compiles with the long-obsolete Node 0.10 or 0.12 versions. See the Change Log for the complete list of changes in node-oracledb 2.1.

4. Oracle brings Industry 4.0 capabilities to its IoT Cloud

Oracle announced new capabilities for its Oracle IoT Cloud applications. They will be added to applications including Asset Monitoring, Production Monitoring, Fleet Monitoring, Connected Worker, and Service Monitoring for Connected Assets. The Industry 4.0 capabilities include: Digital Twin; Augmented Reality; Machine Vision; Auto Data Science. The advanced monitoring and analytics capabilities of these new offerings allow organizations to improve efficiency, reduce costs, and identify new sources of revenue through advanced tracking of assets, workers, and vehicles, real-time issue detection, and predictive analytics.
To read about these new offerings in detail, visit Oracle’s official press release.

5. eXist-db v4.0.0 released

eXist-db v4.0.0 is a major release. It contains API changes, several new features, and bug fixes. New features include: the fn:unparsed-text, fn:unparsed-text-lines, and fn:unparsed-text-available functions; an implementation of the HTML ASCII Case Insensitive Collation for XPath 3.1; replacement of ASCIIFoldingFilter with ICUFoldingFilter in NoDiacriticsAnalyzer for better language search support; a new User Manager application shipped for the Dashboard; an updated Cache Extension Module, which implements an LRU policy with both TTL and size options and includes the new functions cache:names(), cache:keys($name), and cache:destroy($name); the scheduled task option unschedule-on-exception, now exposed in conf.xml; explicit names for each thread that eXist creates, for easier identification; Bash scripts that now use /bin/env to locate bash; updated third-party dependencies. See the release notes for API changes, bug fixes, and other performance improvements.
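The eXist-db item above mentions a cache with an LRU policy plus TTL and size options. As a rough sketch of how those two limits interact, the Python class below evicts the least recently used entry when the size limit is exceeded and drops entries whose time to live has passed. The class name LruTtlCache, its parameters, and the injectable clock are all assumptions made for illustration; this is not eXist-db's code or its XQuery API.

```python
import time
from collections import OrderedDict
from typing import Any, Callable, List, Optional


class LruTtlCache:
    """Size-bounded cache with least-recently-used eviction and per-entry TTL."""

    def __init__(self, max_size: int, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_size = max_size
        self.ttl_seconds = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: "OrderedDict[str, tuple]" = OrderedDict()

    def put(self, key: str, value: Any) -> None:
        """Store a value, refreshing its TTL and marking it most recently used."""
        self._entries.pop(key, None)
        self._entries[key] = (value, self._clock() + self.ttl_seconds)
        while len(self._entries) > self.max_size:
            self._entries.popitem(last=False)  # evict the least recently used

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        item = self._entries.get(key)
        if item is None:
            return None
        value, expires_at = item
        if self._clock() >= expires_at:
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # a hit makes the key most recently used
        return value

    def keys(self) -> List[str]:
        """List stored keys from least to most recently used."""
        return list(self._entries)
```

Expired entries are only removed lazily when they are read, which keeps the sketch short; a production cache would usually also purge them in the background.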

Packt Editorial Staff

14 Feb 2018

3 min read

14th Feb 2018 – Data Science News Daily Roundup

Keras 2.1.4, updates to the TensorFlow Object Detection API, PyTorch 0.3.1, new releases of Pgpool-II, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Keras 2.1.4 released

Keras 2.1.4 is released with bug fixes and improvements in performance and example scripts. The major changes include: in ImageDataGenerator, the default interpolation of image transforms changes from nearest to bilinear; stateful metrics are now allowed in model.compile(..., metrics=[...]), where a stateful metric inherits from Layer and implements __call__ and reset_states; support for the constants argument in StackedRNNCells; some TensorBoard features (loss and metrics plotting) are enabled in the TensorBoard callback with non-TensorFlow backends; a reshape argument is added to model.load_weights(), to optionally reshape the weights being loaded to the size of the target weights in the model. The full list of changes is available in the release notes.

2. TensorFlow Object Detection API gets updated with instance segmentation

TensorFlow announced the addition of instance segmentation to its Object Detection API. Instance segmentation segments an object region once it is detected, giving more fine-grained information about the extent of the object within the box. With this update, the API supports a number of instance segmentation models similar to those discussed in the Mask R-CNN paper. The models now predict masks in addition to object bounding boxes. Four instance segmentation config files are provided for training models: mask_rcnn_inception_resnet_v2_atrous_coco; mask_rcnn_resnet101_atrous_coco; mask_rcnn_resnet50_atrous_coco; mask_rcnn_inception_v2_coco. More details can be read at the official GitHub repo.

3. PyTorch 0.3.1 release

PyTorch has published 0.3.1, a minor release with bug fixes and performance improvements.
The team has removed support for CUDA capability 3.0 and 5.0, and binary releases for CUDA 7.5 are now stopped. They will now add CPU-only binary releases that are 10x smaller in size than the full binary with CUDA capabilities. Other changes: added a Cosine Annealing Learning Rate Scheduler; added a reduce argument to PoissonNLLLoss to be able to compute unreduced losses; added random_split, which randomly splits a dataset into non-overlapping new datasets of given lengths; introduced scopes to annotate ONNX graphs for better TensorBoard visualization of models; allowed map_location in torch.load to be a string, such as map_location='cpu' or map_location='cuda:2'. Bug fixes and other improvements are available in the changelog.

4. Pgpool-II 3.7.2, 3.6.9, 3.5.13, 3.4.16 and 3.3.20 are now officially released

Pgpool-II is a tool that adds useful features to PostgreSQL, such as connection pooling, load balancing, and automatic failover. The Pgpool Global Development Group has announced the availability of versions 3.7.2, 3.6.9, 3.5.13, 3.4.16, and 3.3.20 of Pgpool-II. The changes include: a fix for the socket-writing bug introduced in Pgpool-II 3.7.0, 3.6.6, and 3.5.10; support for building with LibreSSL; TCP_NODELAY and non-blocking mode set on the frontend socket; the systemd service file changed to use STOP_OPTS=" -m fast"; pgpool_setup changed to add restore_command in recovery.conf. For more information, take a look at the release notes.

5. OmniDB 2.5 is now released with support for Oracle Databases

OmniDB 2.5, the browser-based database management tool, is now released. It allows users to manage multiple databases in a unified workspace with a user-friendly and fast-performing interface. The following features and improvements are added: Basic support for Oracle databases. Users can manage, connect, and interact with Oracle databases using most of the same features provided to manage PostgreSQL databases. New DDL Panel.
A new panel located below the treeview displays properties and DDL of the currently selected node. For a complete list of updates, read the OmniDB change tracker.
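The PyTorch item in this roundup lists a cosine annealing learning rate scheduler among the additions. The schedule itself is a short formula: the rate starts at a maximum and follows half a cosine wave down to a minimum over a fixed number of steps. The sketch below computes it in plain Python; the function names and parameters are hypothetical and it is not PyTorch's implementation, only the standard cosine annealing formula written out.

```python
import math
from typing import List


def cosine_annealing_lr(step: int, total_steps: int,
                        lr_max: float, lr_min: float = 0.0) -> float:
    """Learning rate at `step`, decaying from lr_max to lr_min along a half cosine."""
    cosine = math.cos(math.pi * step / total_steps)
    return lr_min + 0.5 * (lr_max - lr_min) * (1.0 + cosine)


def schedule(total_steps: int, lr_max: float, lr_min: float = 0.0) -> List[float]:
    """Learning rates for steps 0..total_steps inclusive."""
    return [cosine_annealing_lr(step, total_steps, lr_max, lr_min)
            for step in range(total_steps + 1)]


if __name__ == "__main__":
    for step, rate in enumerate(schedule(total_steps=10, lr_max=0.1)):
        print(step, round(rate, 4))
```

The decay is slow at the start and end and fastest in the middle, which is the main difference from a linear schedule.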

Packt Editorial Staff

13 Feb 2018

4 min read

13th Feb 2018 – Data Science News Daily Roundup

Cloud TPUs available in beta, Microsoft’s plans for blockchain, Oracle’s PaaS, Scanpy, and more in today’s top stories around machine learning, blockchain, and data science news.

1. Google’s Cloud TPUs now available in beta for accelerating machine learning

Google’s Cloud TPUs, a family of Google-designed hardware accelerators, are now available in beta. These custom chips are optimized to speed up and scale up specific ML workloads programmed with TensorFlow. The company first announced Cloud TPUs at its I/O developer conference on May 17-19, 2017, for a limited number of developers and researchers. Each Cloud TPU features four custom ASICs and packs up to 180 teraflops of floating-point performance and 64 GB of high-bandwidth memory onto a single board. Developers who already use TensorFlow don’t have to make any major changes to their code to use this service. Usage will be billed at $6.50 per Cloud TPU per hour. Using a single Cloud TPU, developers can train ResNet-50 to the expected accuracy on the ImageNet benchmark challenge in less than a day, for well under $200 (24 hours at $6.50 per hour comes to $156). Once TPU pods (arrays of Cloud TPUs connected via an ultra-fast, dedicated network to form multi-petaflop ML supercomputers) are available, ResNet-50 and Transformer training times will drop from almost a day to less than 30 minutes.

2. Microsoft plans to use blockchain technology for identity management

In a recent blog post, Microsoft revealed plans to use blockchain technology, in the form of decentralized identity systems, to solve problems with managing personal data and identity online. A decentralized identity system is not controlled by any single, centralized institution. It removes the possibility of censorship and gives individuals full control over their identity and reputation. The platform takes inspiration from Microsoft’s commitment to the ID2020 alliance.
Initially, Microsoft will support blockchain-based decentralized IDs (DIDs) through the Microsoft Authenticator app. Microsoft plans to work with DID method implementations that follow a specific standard outlined by a W3C working group. According to Ankur Patel, PM, Microsoft’s identity division, “Using our technology individuals will get a secure encrypted digital hub where they can store their identity data and easily control access to it.”

3. Oracle adds advanced autonomous capabilities to its Cloud platform

Oracle laid out a broader vision for Oracle Cloud Platform with a range of autonomous service capabilities. Oracle PaaS (Platform as a Service) capabilities support the needs of the entire organization, including developers, enterprise architects, data scientists, IT operations, and business users. Autonomous PaaS services include advanced capabilities such as auto code generation, self-defining data flows, and automated data discovery and preparation. They also offer voice-enabled integration links, machine learning-based continuous data analysis, and self-learning bots that understand user intent and continually refine that understanding. They will speed up IT deployments by letting developers jump right to creating new functionality rather than spending time on routine tasks. The autonomous PaaS services promise to lower IT costs and improve security because they require less human management and eliminate human error.

4. Scientists develop Scanpy to help manage enormous datasets

Scientists from the Helmholtz Zentrum München have developed Scanpy, a program that helps manage enormous datasets. Scanpy was built to analyze the gene-expression data of a large number of individual cells. It allows comprehensive analysis of large gene-expression datasets with a broad range of machine-learning and statistical methods. Scanpy is written in Python and uses a graph-like coordinate system.
Instead of characterizing a single cell by the expression value for thousands of genes, the system simply characterizes cells by identifying their closest neighbors, very much like the connections in social networks. In fact, to identify cell types, Scanpy uses the same algorithms as Facebook does for identifying communities. To read more, visit the official documentation.

5. Accelirate has announced its partnership with Chirrp.ai to strengthen its enterprise-class chatbot solutions capability

Accelirate has partnered with Chirrp.ai, an AI-powered communication channel provider, to strengthen its chatbot solutions. With this partnership, the enterprise-grade chatbots will be able to handle low-, medium-, and high-complexity use cases. In a high-complexity use case, clients use chatbots as an NLP/NLU-powered application-delivery mechanism that can handle complex user queries as well as application rules and workflows right from within the chatbot interface. Initially, the chatbots will be configured to understand structured as well as unstructured customer queries and provide appropriate answers without involving a human. The chatbot will gather the relevant customer information, query the backend systems (which can be accomplished by using RPA robots), and present the information to the customer interactively. All without human intervention!
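The Scanpy item above describes characterizing each cell by its closest neighbors rather than by thousands of raw expression values. A minimal sketch of that idea is a k-nearest-neighbor graph over expression vectors, shown below with the Python standard library only. The function names, the toy two-gene profiles, and the use of Euclidean distance are assumptions for illustration; this is not Scanpy's API, and the community-detection step the article mentions is not shown.

```python
import math
from typing import Dict, List, Sequence


def euclidean(a: Sequence[float], b: Sequence[float]) -> float:
    """Straight-line distance between two expression vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_graph(profiles: Dict[str, Sequence[float]], k: int) -> Dict[str, List[str]]:
    """Map each cell name to the names of its k closest other cells."""
    graph: Dict[str, List[str]] = {}
    for name, vector in profiles.items():
        distances = sorted(
            (euclidean(vector, other), other_name)
            for other_name, other in profiles.items()
            if other_name != name
        )
        graph[name] = [other_name for _, other_name in distances[:k]]
    return graph


if __name__ == "__main__":
    # Hypothetical cells with two-gene expression profiles
    cells = {"c1": [1.0, 0.0], "c2": [0.9, 0.1], "c3": [0.0, 1.0], "c4": [0.1, 0.9]}
    print(knn_graph(cells, k=1))
```

This brute-force version compares every pair of cells, so it only suits small examples; tools built for very large datasets typically rely on faster approximate neighbor searches.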

Packt Editorial Staff

12 Feb 2018

4 min read

12th Feb 2018 – Data Science News Daily Roundup

DeepMind IMPALA, Dynamometer open-sourced, VoltDB v8.0, and more in today’s top stories around machine learning, deep learning, and data science news.

1. DeepMind introduces IMPALA - a new and efficient distributed architecture capable of solving many tasks at the same time

DeepMind has developed a new distributed agent named IMPALA (Importance-Weighted Actor-Learner Architectures) that maximises data throughput using an efficient distributed architecture with TensorFlow. IMPALA was developed to tackle the challenging DMLab-30 suite, a set of environments designed using DeepMind Lab, DeepMind's open source RL environment. These environments enable any deep RL researcher to test systems on a large spectrum of interesting tasks, either individually or in a multi-task setting. IMPALA is inspired by the popular A3C architecture, which uses multiple distributed actors to learn the agent’s parameters. When tested on the DMLab-30 levels, IMPALA was 10 times more data efficient and achieved double the final score compared to distributed A3C. Moreover, IMPALA showed positive transfer from training in multi-task settings compared to training in a single-task setting. To know more about IMPALA, you can read the research paper.

2. LinkedIn open-sources Dynamometer, a new tool for testing big-data performance

LinkedIn has open-sourced Dynamometer, a tool for stress-testing large Hadoop big-data deployments without using massive amounts of infrastructure. Using Dynamometer, IT teams can test production workloads and ensure they’ll be able to cope with any changes to their Hadoop clusters. It is designed for those running large-scale Hadoop deployments, as well as those who propose changes to the core Hadoop project and want to ensure new features don’t hurt performance. Visit the GitHub repo for detailed information on LinkedIn’s Dynamometer.

3. VoltDB Introduces VoltDB v8.0, a Translytical Database for Powering Real-Time Decisions

VoltDB, provider of an enterprise-class translytical database for business-critical applications, announced the latest version (v8.0) of its flagship solution. According to Forrester analyst Mike Gualtieri, a translytical database is a “single unified database that supports transaction and analytics in real time without sacrificing transactional integrity, performance, and scale.” The new version delivers more predictable, long-tail latency responses based on real-time data and historical intelligence, improving real-time processing and offering self-service analysis. What’s new in VoltDB v8.0: improved network security; user-defined functions; common table expressions; Kafka enhancements; a Python V3 API. For detailed information on v8.0, read the release notes.

4. Amazon adds encryption at rest to DynamoDB database service

Amazon Web Services Inc. added a new encryption feature to its DynamoDB database service, which helps secure users’ data better. DynamoDB, Amazon’s NoSQL database service, is designed to store and retrieve unstructured data, and is typically used for big-data workloads and analysis. With the new update, users can choose to encrypt data stored “at rest,” that is, when the data is not being used. The option is not switched on by default, so users will have to enable it manually when creating a new database table. Visit AWS’ official post for a detailed read on this topic.

5. Apache Flink® Master Branch Monthly: New in Flink in January 2018

The Apache Flink team highlighted a selection of features that were merged into Flink’s master branch during the past month in its “Flink Master Monthly” blog post.
The features merged are: improvements to Flink’s deployment and process model (FLIP-6); groundwork for task recovery from local state, which speeds up failure recovery; an improved state backend abstraction; network stack changes to improve performance; application-level flow control for improved control of checkpointing behavior; improved Mesos integration with Docker; Table API / Streaming SQL; ecosystem integrations; integration of generated config tables into the documentation.

Packt Editorial Staff

09 Feb 2018

3 min read

9th Feb 2018 – Data Science News Daily Roundup

PostgreSQL 10.2, 9.6.7, 9.5.11, 9.4.16, and 9.3.21 released, Bokeh 0.12.14 released, Cloudera Altus Analytic DB Beta, and more in today’s top stories around machine learning, deep learning, and data science news.

1. PostgreSQL 10.2, 9.6.7, 9.5.11, 9.4.16, and 9.3.21 released!

The PostgreSQL Global Development Group has released updates 10.2, 9.6.7, 9.5.11, 9.4.16, and 9.3.21. This release: fixes two security issues; fixes issues with VACUUM, GIN indexes, and hash indexes that could lead to data corruption; includes fixes for parallel queries and logical replication. Read the detailed release document on the official website.

2. Bokeh 0.12.14 released

The Bokeh organization announced the incremental release of Bokeh 0.12.14. This version has two highlights: new multi-gesture tools for editing glyphs directly; an update for compatibility with the upcoming Tornado 5.0. This release also includes some bug fixes and documentation improvements. Visit the change log on GitHub and the official documentation for full details of this release.

3. MapR simplifies end-to-end workflow for Data Scientists with MapR Expansion Pack (MEP) 4.1

MapR Technologies announced the availability of MapR Expansion Pack (MEP) 4.1, which allows data scientists and engineers to build scalable deep learning pipelines and makes operational data instantly available for data science. It also enables them to achieve over 2X improvement in performance across a variety of data discovery and ad-hoc queries. MEP 4.1 allows building real-time pipelines and brings data science capabilities to a broad set of users with new language support. The team also added features to MapR-DB, MapR Data Science Refinery, and Apache Drill 1.12 in MapR Expansion Pack 4.1, which include: MapR Data Science Refinery extends support for distributing Python archives for PySpark.
This allows data scientists to leverage popular Python data science libraries in a distributed way to create scalable deep learning pipelines. MapR Data Science Refinery enables Apache Zeppelin to easily leverage a diverse set of Python libraries and environments that can be shared and stored in MapR-XD. PySpark jobs can directly read and write to MapR-DB OJAI, making operational data instantly available for data science. Python and Java bindings for the MapR-DB OJAI Connector for Apache Spark enable developers to read from and write to MapR-DB from Spark using Java and Python. With this, developers can now build data-intensive business applications in Java and Python. A new version of Apache Drill, Drill 1.12, enables fast data exploration on operational data in MapR-DB and historical data in Parquet for data scientists, with over 2X performance improvements across a variety of data discovery and ad-hoc queries.

4. Cloudera Altus Analytic DB Beta Available

Cloudera announced the beta version of its Altus Analytic DB, which is built on the Cloudera Altus platform-as-a-service foundation. Altus Analytic DB also supports the Altus Data Engineering service. Cloudera’s Altus Analytic DB: allows maintaining a single shared repository of data in open file formats; provides multiple clusters over shared data; provides fully controlled data security; makes it easy to provision a cluster. Read more about each feature in detail on Cloudera’s official website.

5. Extract! 4.0 - the first fully Deep Learning powered Resume Parsing Solution

Textkernel announced the first Deep Learning powered ‘Resume Parsing Solution’, named Extract! 4.0. The resume parsing software is currently available for English. Matt McNair, VP Global Services at CareerBuilder - Textkernel's parent company, said, “Deep Learning has transformed entire industries including automotive, healthcare, retail and financial services. Today, Textkernel is revolutionizing the HR domain with its launch of Extract! 4.0.” For detailed information on Extract! 4.0 and Deep Learning, visit Textkernel’s official website.
