Tech News

08 Feb 2018

4 min read

8th Feb 2018 – Data Science News Daily Roundup

RapidMiner Studio 8.1.0, MySQL Shell 8.0.4, Project Maestro’s new release, upgrades to S/4HANA, and more in today’s top stories around machine learning, deep learning, and data science news.

1. What's New in RapidMiner Studio 8.1.0

RapidMiner has announced the release of RapidMiner 8.1 with Auto Model to accelerate data science. Auto Model is a new way of working for rapid creation, comparison, and exploration of new models. The release also adds a powerful global search: users can now search for operators, repository contents, UI actions, and Marketplace content. Other enhancements include:
New Process Templates, upgraded to use the latest operator versions.
Read Excel now allows sheet selection by name.
Read CSV, Read XML, and Read Excel have a new expert parameter to read all values as polynominal, which allows the user to disable type guessing.
Passwords are now hidden in the Password Manager dialog and stored with stronger encryption.
Search Twitter and Get Twitter User Statuses now support 280-character tweets.
For other bug fixes and enhancements, read the official documentation.

2. MySQL announces Shell 8.0.4 and the general availability of Oracle Enterprise Manager for MySQL Database

MySQL has introduced a new Upgrade Checker (UC) utility with the latest release of MySQL Shell. The utility helps verify that a 5.7 system is ready for an upgrade to MySQL 8.0. UC connects to a specified server and runs a series of checks. If it discovers any issues, it displays them along with advice on resolving them. It also prints a summary and returns an integer value describing the severity of the issues found:
0: no issues, or only issues categorized as notices.
1: no fatal errors were found, but some potential issues were detected.
2: UC found errors that must be fixed before upgrading to 8.0.
More information is available at the MySQL Server Blog.
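The three severity values returned by the Upgrade Checker lend themselves to a simple gate in an upgrade script. The following is only a minimal sketch of that idea: the helper names `describe` and `can_upgrade` are hypothetical, and the code does not call MySQL Shell itself; it only interprets an integer result like the one described above.

```python
# Hypothetical helpers for interpreting the Upgrade Checker's integer result.
# The meanings are taken from the description above; nothing here talks to
# MySQL Shell or to a server.
SEVERITY = {
    0: "no issues, or only notices",
    1: "no fatal errors, but potential issues were detected",
    2: "errors that must be fixed before upgrading to 8.0",
}


def describe(result_code: int) -> str:
    """Return a human-readable meaning for an Upgrade Checker result."""
    if result_code not in SEVERITY:
        raise ValueError(f"unexpected Upgrade Checker result: {result_code}")
    return SEVERITY[result_code]


def can_upgrade(result_code: int) -> bool:
    """True when the reported severity does not block an upgrade (0 or 1)."""
    describe(result_code)  # validates the code
    return result_code < 2


if __name__ == "__main__":
    for code in (0, 1, 2):
        print(code, describe(code), "-> proceed" if can_upgrade(code) else "-> stop")
```

Treating a result of 1 as non-blocking is a design choice of this sketch; a stricter pipeline might stop on anything other than 0.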
The MySQL development team has also announced the general availability of Oracle Enterprise Manager for MySQL Database. This is the official MySQL plugin, which provides comprehensive performance, availability, and configuration information to Oracle Enterprise Manager (13c or later), Oracle's enterprise IT management product line. More information on the contents of this release is available in the changelog.

3. Tableau releases Beta 3 of Project Maestro

Tableau’s latest release of Project Maestro improves data cleaning so that dirty data can be made ready for analysis quickly and accurately. The major changes include:
Quick text cleaning, which applies common calculations to text fields to change the case or remove unwanted characters without the user having to write the calculation manually.
Fast, visual filters. The new quick filter experience makes it easy to filter ranges of values for dates and numbers. Users can also write a calculation to handle more complex filtering tasks.
Easy debugging features, to find errors and navigate to them.
One-click removal of columns or steps.
The full details are available at the official blog.

4. S/4HANA cloud service from SAP gets a major upgrade

SAP has unveiled a major update to its S/4HANA Cloud service. The update adds more intelligent functionality in machine learning, in-memory analytics, and in-context collaboration. The changes mostly target the Finance, Procurement, Sales, Manufacturing, and Professional Services sectors. They include:
A new improvement request submission form, which can be used on the Customer Influence site.
Automated Payment Advice Processing, powered by machine learning and SAP Leonardo. This helps users turn documents into structured data by automatically extracting payment information from PDF files.
A Predictive Quotation Conversion Rates calculator for understanding probable orders and predicted sales volumes, allowing for more accurate forecasting.
A Release Billing Proposal application for a transparent view of non-billable services in professional services firms.

5. HarperDB Launches Database Solution for IoT, App Developers, and Enterprise

HarperDB has launched an HTAP (Hybrid Transactional/Analytical Processing) database solution. HarperDB's database is powered by a data storage algorithm that ingests both unstructured and structured data into a fully indexed, single-model data store. Both NoSQL and SQL capabilities are provided natively in real time, with no increase in the storage footprint. The database targets IoT deployments, where it can run on the edge. It is also useful for app developers, allowing them to spend more time on coding and less on managing a complex database. For enterprises, it provides a single model for structured and unstructured data.

Packt Editorial Staff

07 Feb 2018

4 min read

7th Feb 2018 – Data Science News Daily Roundup

Sisense 7.0, Distilled IMPACT Behavioral Analytics Model, Gnocchi 4.2 release, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Sisense 7.0 aids non-technical users to gain data expertise

Sisense announced the release of Sisense Version 7.0, which delivers an intuitive, visual, drag-and-drop interface for data preparation that non-technical business users can use to easily find, add, and combine complex data sources. It delivers smart, machine learning-driven recommendations that guide users through the data preparation process by recommending specific fields for easily 'mashing up' data sources, which saves time, unveils new insights, and reduces the chance of errors. Using advanced machine learning for smart data preparation and visualization field suggestions is a new step in making analytics accessible to everyone, regardless of technical skill or expertise. For more information, read the detailed coverage here.

2. Distilled Analytics releases distilled IMPACT behavioral analytics model

Distilled Analytics announced the release of Distilled IMPACT, an approach to quantitatively measuring the non-financial factors associated with for-profit investment, using advanced behavioral analytics supported by artificial intelligence. The Distilled IMPACT platform quantifies non-financial activities using granular, discrete measures, particularly around human factors, to enable asset growth for impact investing by providing greater trust and transparency. It helps organizations understand impact by analyzing patterns of movement from aggregated and third-party data sources, revealing fundamental insight into human behavior.

3. IBM’s Watson Captioning to leverage Artificial Intelligence to Automate Closed Captioning Process

IBM leverages Artificial Intelligence (AI) to automate the closed captioning process as part of its latest offering, Watson Captioning.
This new service will provide businesses with:
A scalable solution that saves time and capital.
Maximized productivity through streamlined workflows.
Increased caption accuracy over time.
The new IBM offering provides a seamless user experience via tools including Machine Generated Captions, Embedded Smart Layout, Watson Caption Editor, and Live Captioning.

4. Gnocchi 4.2 released, with added features and performance

Gnocchi 4.2 has been released. Gnocchi is an open-source time series database designed to store large amounts of aggregates while being performant, scalable, and fault-tolerant. Here is a quick look at the features added in Gnocchi 4.2:
A wildcard can be used instead of a metric name in the Dynamic Aggregates API.
The Dynamic Aggregates API has a new method called ‘rateofchange’.
A new format for the batch payload is available that allows the archive policy description to be passed.
Gnocchi now strictly respects the timespan configured in the archive policy when storing aggregates.
A new type, ‘datetime’, is available for resource type attributes.
A new /v1/influxdb endpoint allows data to be ingested from InfluxDB clients. Only write is implemented. This should ease the transition for users coming from InfluxDB tools such as Telegraf.
Metricd exposes a new option called greedy (true by default) that controls whether eager processing of new measures is enabled when available.
The Gnocchi API can act as a Prometheus Remote Write Adapter to receive Prometheus metrics. The endpoint to configure in the Prometheus configuration is: https://<gnocchi-host-port>/v1/prometheus/write.
The deprecated dynamic aggregation (moving average) has been removed.
To learn about these features in detail, visit the official website.

5. Podium Data releases Podium 3.2 to take its data lake catalog to the cloud

Podium Data Inc. brings self-service big data to the cloud with the release of version 3.2 of its Data Marketplace.
Data Marketplace is a data catalog used with data lakes to eliminate the need for the extensive extraction and massaging procedures that characterize pure-Hadoop models. Podium promotes the software as providing self-service, on-demand access to quality data. According to the company, the Podium 3.2 release lets users combine on-premises and cloud data. The Podium architecture separates storage from computing, so that data provided by the data delivery teams can support multiple variations of an analytical application from a single store. With version 3.2, sources now include the Amazon Web Services Inc. and Microsoft Corp. Azure clouds. Version 3.2 also permits assets inside and outside the cloud to be merged and joined. For a detailed understanding of the Data Marketplace, visit the official website.

Packt Editorial Staff

06 Feb 2018

3 min read

6th Feb 2018 – Data Science News Daily Roundup

TensorFlow 1.6.0-rc, RocksDB 5.10.2, Grafana v5.0, the upcoming release of Spark 2.3, and more in today’s top stories around machine learning, deep learning, and data science news.

1. TensorFlow 1.6.0-rc released

The TensorFlow 1.6 release candidate introduces some breaking changes along with major features and improvements:
Prebuilt binaries are now built against CUDA 9.0 and cuDNN 7.
Prebuilt binaries will now use AVX instructions. (This may break TF on older CPUs.)
tf.estimator.{FinalExporter,LatestExporter} can now export stripped SavedModels. This improves forward compatibility of the SavedModel.
FFT support has been added to XLA CPU/GPU.
For bug fixes and other changes, visit the GitHub repo.

2. Facebook’s RocksDB 5.10.2 is now released

RocksDB, the high-performance embedded database for key-value data built by Facebook, has released version 5.10.2. The new features include:
CRC32C now uses the 3-way pipelined SSE algorithm crc32c_3way on supported platforms to improve performance.
RocksDB now provides lifetime hints when writing files on Linux. This reduces hardware write amplification on storage devices that support multiple streams.
A new DB stat, NUMBER_ITER_SKIP, returns the number of internal keys skipped during iterations.
The PerfContext counters key_lock_wait_count and key_lock_wait_time have been added. They measure the number of times transactions wait on key locks and the total amount of time spent waiting.
The complete release notes and changes are available at the official GitHub repo.

3. Grafana v5.0 is out in Beta

Grafana, the open platform for analytics and monitoring, is now available in a version 5.0 beta. The major new features and enhancements include:
A new Dashboard Layout Engine with an easier drag, drop, and resize experience and new types of layouts.
A new UX and UI improvements in both look and function.
Dashboard Folders for organizing dashboards.
Permissions on folders and dashboards to help manage larger Grafana installations.
Datasource provisioning, to set up datasources and dashboards via config files.
Persistent dashboard URLs, which make it possible to rename dashboards without breaking links.
The full list of changes can be read in the official documentation.

4. What is expected from the upcoming Apache Spark 2.3 Release

Apache Spark is soon to release version 2.3.0, which will be presented in an upcoming live webinar. The expected changes include:
New DataSource APIs that help developers easily read and write data for Continuous Processing in Structured Streaming.
PySpark support for vectorization, giving Python developers the ability to run native Python code fast.
Improved performance by taking advantage of NVMe SSDs.
Native Kubernetes support.

5. Ian Goodfellow releases code for SN-GAN and the projection discriminator

Ian Goodfellow, the inventor of GANs, has released the code for SN-GAN and the projection discriminator. Spectral Normalization for GANs is a novel weight normalization technique that stabilizes the training of the discriminator of GANs. cGANs with Projection Discriminator is a projection-based way to incorporate the conditional information into the discriminator of GANs that respects the role of the conditional information in the underlying probabilistic model. The release includes a Chainer implementation for conditional image generation on the ILSVRC2012 dataset (ImageNet) with spectral normalization and projection discriminator. The entire code implementation is available on GitHub.
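To make the spectral normalization idea from the SN-GAN item a little more concrete: the technique divides a weight matrix by an estimate of its largest singular value, usually obtained by power iteration. The sketch below is a plain-Python illustration under simplifying assumptions, not the released Chainer code. The function names are hypothetical, the matrix is a small list of lists, and the power iteration is run to convergence in one call rather than one step per training update.

```python
import math
import random


def matvec(W, v):
    """Compute W @ v for a matrix stored as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def matvec_t(W, u):
    """Compute W^T @ u."""
    return [sum(W[i][j] * u[i] for i in range(len(W))) for j in range(len(W[0]))]


def normalize(v):
    """Scale v to unit Euclidean length (assumes v is not the zero vector)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def spectral_norm(W, n_iter=50, seed=0):
    """Estimate the largest singular value of W by power iteration."""
    if n_iter < 1:
        raise ValueError("n_iter must be at least 1")
    rng = random.Random(seed)
    u = normalize([rng.gauss(0.0, 1.0) for _ in W])
    for _ in range(n_iter):
        v = normalize(matvec_t(W, u))  # approximate right singular vector
        u = normalize(matvec(W, v))    # approximate left singular vector
    return sum(ui * wi for ui, wi in zip(u, matvec(W, v)))


def spectrally_normalize(W, n_iter=50):
    """Return W divided by its estimated spectral norm."""
    sigma = spectral_norm(W, n_iter)
    return [[w / sigma for w in row] for row in W]


if __name__ == "__main__":
    W = [[3.0, 0.0], [0.0, 1.0]]
    print(spectral_norm(W))
    print(spectrally_normalize(W))
```

After normalization the largest singular value of the matrix is approximately 1, which is what bounds how sharply the layer can stretch its input.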

Savia Lobo

05 Feb 2018

6 min read

AutoML: Developments and where it is heading

With the growing demand for ML applications, there is also a demand for machine learning tasks such as data preprocessing and optimizing model hyperparameters to be easily handled by non-experts. These tasks are repetitive, yet because of their complexity they were considered the preserve of ML experts. To address this, and to deliver off-the-shelf machine learning methods that keep their quality without expert knowledge, Google came out with a project named AutoML, an approach that automates the design of ML models. You could also refer to our article on Automated Machine Learning (AutoML) for a clear understanding of how AutoML functions.

Trying AutoML on smaller datasets

AutoML brought altogether new dimensions to machine learning workflows, where repetitive tasks performed by human experts could be taken over by machines. When Google started off with AutoML, it applied the approach to two smaller deep learning datasets, CIFAR-10 and Penn Treebank, to test it on image recognition and language modeling tasks respectively. The result was that the AutoML approach could design models on par with those designed by ML experts. Also, on comparing the designs drafted by humans and by AutoML, it was seen that the machine-suggested architecture included new elements. These elements were later found to alleviate vanishing and exploding gradient issues, which suggests that the machines had produced a new architecture that could be useful for multiple tasks. The machine-designed architecture also has many channels through which the gradients can flow backwards. This could help explain why LSTM RNNs work better than standard RNNs.

Trying AutoML on larger datasets

After its success on small-scale datasets, Google tested AutoML on large-scale datasets such as ImageNet and the COCO object detection dataset.
Testing AutoML on these was a challenge because they are orders of magnitude larger, and because simply applying AutoML directly to ImageNet would require many months of training the AutoML method. To make AutoML tractable for large-scale datasets, some alterations were made to the approach. The changes include:
Redesigning the search space so that AutoML could find the best layer, which can then be stacked many times in a flexible manner to create a final network.
Carrying out the architecture search on the CIFAR-10 dataset and transferring the best learned architecture to ImageNet image classification and COCO object detection.
With these changes, AutoML found two best layers, a normal cell and a reduction cell, which when combined resulted in a novel architecture called “NASNet”. These two cells work well on CIFAR-10, and also on ImageNet and COCO object detection. According to Google, NASNet achieved a prediction accuracy of 82.7% on the ImageNet validation set. This accuracy surpassed all previous Inception models built by Google. Further, the features learned from ImageNet classification were transferred to object detection tasks using the COCO dataset. The learned features combined with Faster R-CNN resulted in state-of-the-art predictive performance on the COCO object detection task in both the largest and the mobile-optimized models. Google suspected that the image features learned on ImageNet and COCO can be reused for various other computer vision applications. Hence, Google open-sourced NASNet for inference on image classification and for object detection in the Slim and Object Detection TensorFlow repositories.

Towards Cloud AutoML: Automated Machine learning platform for everyone

Cloud AutoML has been Google’s latest buzz for its customers, as it makes AI available to everyone.
Using Google’s advanced techniques such as learning2learn and transfer learning, Cloud AutoML helps businesses with limited ML expertise start building their own high-quality custom models. Cloud AutoML also benefits AI experts by improving their productivity and letting them explore new fields in AI. The experts can also help less-skilled engineers build powerful systems. Companies such as Disney and Urban Outfitters are using AutoML to make search and shopping on their websites more relevant. With AutoML moving to the cloud, Google released its first Cloud AutoML product, Cloud AutoML Vision, an image recognition tool that makes it fast and easy to build custom ML models. The tool has a drag-and-drop interface that allows one to easily upload images, train and manage models, and then deploy those trained models directly on Google Cloud. When used to classify popular public datasets like ImageNet and CIFAR, Cloud AutoML Vision has shown state-of-the-art results, with fewer misclassifications than the generic ML APIs.

Here are some highlights of Cloud AutoML Vision:
It is built on Google’s leading image recognition approaches, along with transfer learning and neural architecture search technologies. Hence, one can expect an accurate model even if the business has limited expertise in ML.
One can build a simple model in minutes, or a full, production-ready model in a day, in order to pilot an AI-enabled application.
AutoML Vision has a simple graphical UI with which one can easily specify data. It then turns the data into a high-quality model customized for one’s specific needs.
Starting with images, Google plans to roll out Cloud AutoML tools and services for text and audio too. However, Google isn’t the only one in the race. Competitors including AWS and Microsoft are also bringing in tools, such as Amazon’s SageMaker and Microsoft’s service for customizing image recognition models, to help developers automate machine learning.
Some other automated tools include:
Auto-sklearn: An automated project that helps users of scikit-learn, a package of common machine learning functions, choose the right estimator. Auto-sklearn includes a generic estimator that conducts an analysis to determine the best algorithm and set of hyperparameters for a given scikit-learn job.
Auto-WEKA: A similar tool for machine learners using the Java programming language and the Weka ML package. Auto-WEKA uses a fully automated approach to select a learning algorithm and set its hyperparameters together, unlike previous methods, which addressed these two problems in isolation.
H2O Driverless AI: This uses a web-based UI and is specifically designed for business users who want to gain insights from data but do not want to get into the intricacies of machine learning algorithms. The tool allows users to choose one or multiple target variables in the dataset that need a solution, and the system provides the answer. The results come as interactive charts, explained with annotations in plain English.

Currently, Google’s AutoML is leading the pack. It will be exciting to see how Google scales an automated ML environment to match traditional ML. Google is not alone: other businesses are also contributing to the movement towards an automated machine learning ecosystem. We have seen some tools join the automation league and can expect more to follow. These tools could also move to the cloud in future to be more widely available to non-experts, similar to Google's Cloud AutoML. With machine learning becoming automated, we can expect more and more systems to move a step closer to widening the scope of AI.
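The joint selection of a learning algorithm and its hyperparameters that Auto-sklearn and Auto-WEKA automate can be illustrated with a toy example. The sketch below is an assumption-laden simplification: it uses plain random search instead of the more sophisticated optimizers those tools rely on, the dataset is synthetic, and every name in it (`SEARCH_SPACE`, `random_search`, the two toy models) is hypothetical.

```python
import random


def make_data(n, rng):
    """Toy 1-D binary classification data: the label is 1 when x > 0.5."""
    return [(x, int(x > 0.5)) for x in (rng.random() for _ in range(n))]


def threshold_model(train, t):
    """Candidate algorithm 1: predict 1 above a fixed threshold t."""
    return lambda x: int(x > t)


def knn_model(train, k):
    """Candidate algorithm 2: majority vote among the k nearest training points."""
    def predict(x):
        nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
        votes = sum(label for _, label in nearest)
        return int(2 * votes > len(nearest))
    return predict


# Each entry pairs a model builder with a sampler for its single hyperparameter.
SEARCH_SPACE = {
    "threshold": (threshold_model, lambda rng: rng.uniform(0.0, 1.0)),
    "knn": (knn_model, lambda rng: rng.choice([1, 3, 5, 7])),
}


def accuracy(model, data):
    """Fraction of examples the model labels correctly."""
    return sum(model(x) == y for x, y in data) / len(data)


def random_search(train, valid, n_trials=30, seed=0):
    """Sample (algorithm, hyperparameter) pairs; keep the best on validation data."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        name = rng.choice(sorted(SEARCH_SPACE))
        build, sample = SEARCH_SPACE[name]
        param = sample(rng)
        score = accuracy(build(train, param), valid)
        if best is None or score > best[0]:
            best = (score, name, param)
    return best


if __name__ == "__main__":
    rng = random.Random(1)
    train, valid = make_data(40, rng), make_data(20, rng)
    print(random_search(train, valid))
```

The point of searching both dimensions in one loop is that the best hyperparameter value only has meaning relative to the algorithm it belongs to, which is why treating the two choices in isolation can miss good combinations.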

Packt Editorial Staff

05 Feb 2018

3 min read

5th Feb 2018 – Data Science News Daily Roundup

MySQL Cluster 7.6.4, new features in dbForge SQL Complete v5.8, new chip for linking IoT and Blockchain, and more in today’s top stories around machine learning, deep learning, and data science news.

1. MySQL Cluster 7.6.4 is out

MySQL Cluster 7.6.4 has been announced. It contains a number of attractive features, including:
A rewritten local checkpoint algorithm, designed to scale to DataMemory sizes of at least 16 TBytes.
Improvements in the MySQL Cluster Configurator (MCC, Auto Installer).
A new cloud feature for configuring nodes with a LocationDomainId.
A new ODirectSyncFlag, which improves disk write speeds by around 2-3x.
A changed default behaviour of restart configuration, leading to a very significant reduction in restore times.
Improvements to the parallel query implementation (pushdown join, SPJ).
A parallel UNDO log applier for disk columns.
Bug fixes.
For a detailed read on the features, visit the blog post.

2. Introducing new productivity features in dbForge SQL Complete v5.8

Devart, one of the leading developers of database tools and administration software, announced the release of dbForge SQL Complete v5.8. dbForge SQL Complete is an add-in for Microsoft SQL Server Management Studio and Microsoft Visual Studio. The new release includes new productivity features such as Result Grid Aggregates, Find in Results Grid, Execution warnings, a CRUD generator, and much more. For detailed information on the new features and improvements, refer here.

3. New chip that helps IoT devices communicate with Blockchain

Filament, an industrial internet solutions startup, has developed a new Blocklet chip. The chip allows IoT devices to communicate with any blockchain technology. Additionally, the chip has a very small footprint and low power consumption. It is also secure, containing a robust cryptographic chain-of-custody protocol. The chip currently supports the Hyperledger Sawtooth blockchain, and support will shortly be expanded to the Ethereum blockchain ledger.

4.
Nexus Earth collaborates with SingularityNET to Integrate Artificial Intelligence with Blockchain Technology

Nexus (NXS) has partnered with SingularityNET (AGI) to explore new technologies. The partnership could result in a secure, scalable, and censorship-resistant blockchain AI infrastructure. SingularityNET is looking to expand its horizons by creating a decentralized AI network based on blockchain. It is also planning to explore the use of Nexus' satellite-based alternative internet protocol. As for Nexus, the collaboration gives it a valuable use case for deploying its 3D Chain architecture and exploring AI applications on layers 1 and 2 of its network.

5. Microsoft forms Cortana Intelligence Institute to advance AI

Microsoft on Thursday announced the establishment of the Cortana Intelligence Institute. It is a collaboration with the Royal Melbourne Institute of Technology (RMIT) focused on broadening the capabilities of Microsoft's virtual assistant. Researchers from RMIT will work with Microsoft personnel to apply AI to new tasks that currently can’t be handled by neural networks. The primary task on the agenda is to assemble a “multidimensional” user dataset for development purposes. Microsoft aims to gather a wide variety of information, ranging from online activity patterns to location. It is also looking to build new AI models that can understand contextual data well enough to interpret and carry out complex user requests involving multiple steps. Read the complete coverage for detailed information on the institute.

Sugandha Lahoti

02 Feb 2018

7 min read

How Deep Neural Networks can improve Speech Recognition and generation

While watching your favorite movie or TV show, you must sometimes have found it difficult to decipher what the characters are saying, especially if they are talking really fast, or, well, you’re watching a show in a language you don’t know. You quickly add subtitles and voila, the problem is solved. But do you know how these subtitles work? Instead of a person writing them, a computer automatically recognizes the characters' speech and dialogue and generates the script. However, this is just a trivial example of what computers and neural networks can do in the field of speech understanding and generation. Today, we’re going to talk about how deep neural networks have improved the ability of our computing systems to understand and generate human speech.

How traditional speech recognition systems work

Traditionally, speech recognition models used classification algorithms to arrive at a distribution of possible phonemes for each frame. These classification algorithms were based on highly specialized features such as Mel-frequency cepstral coefficients (MFCCs). Hidden Markov Models (HMMs) were used in the decoding phase. The HMM was accompanied by a pre-trained language model and was used to find the most likely sequence of phones that can be mapped to output words. With the emergence of deep learning, neural networks came to be used in many aspects of speech recognition, such as phoneme classification, isolated word recognition, audiovisual speech recognition, audiovisual speaker recognition, and speaker adaptation. Deep learning enabled the development of Automatic Speech Recognition (ASR) systems. These ASR systems require separate models, namely an acoustic model (AM), a pronunciation model (PM), and a language model (LM). The AM is typically trained to recognize context-dependent states or phonemes by bootstrapping from an existing model, which is used for alignment. The PM maps the sequences of phonemes produced by the AM into word sequences.
Word sequences are scored using an LM trained on large amounts of text data, which estimates the probabilities of word sequences. However, training independent components added complexity and was suboptimal compared to training all components jointly. This prompted the ASR community to develop end-to-end systems, which attempt to learn the separate components of an ASR system jointly as a single system.

A single system Speech recognition model

End-to-end trained neural networks can essentially recognize speech without using an external pronunciation lexicon or a separate language model. End-to-end trained systems can directly map the input acoustic speech signal to word sequences. In such sequence-to-sequence models, the AM, PM, and LM are trained jointly in a single system. Since these models directly predict words, the process of decoding utterances is also greatly simplified. End-to-end ASR systems do not require bootstrapping from decision trees or time alignments generated by a separate system, which makes training such models simpler than training conventional ASR systems. There are several sequence-to-sequence models, including connectionist temporal classification (CTC), the recurrent neural network (RNN) transducer, and attention-based models. CTC models are used to train end-to-end systems that directly predict grapheme sequences. CTC was proposed by Graves et al. as a way of training end-to-end models without requiring a frame-level alignment of the target labels for a training utterance. Graves extended the basic CTC model to include a separate recurrent LM component, in a model referred to as the recurrent neural network (RNN) transducer. The RNN transducer augments the encoder network from the CTC model architecture with a separate recurrent prediction network over the output symbols. Attention-based models are also a type of end-to-end sequence model.
These models consist of an encoder network, which maps the input acoustics into a higher-level representation. They also have an attention-based decoder that predicts the next output symbol based on the previous predictions.

[Figure: A schematic representation of various sequence-to-sequence modeling approaches]

Google’s Listen-Attend-Spell (LAS) end-to-end architecture is one such attention-based model. The end-to-end system achieves a word error rate (WER) of 5.6%, which corresponds to a 16% relative improvement over a strong conventional system that achieves a 6.7% WER. Additionally, the end-to-end model used to output the initial word hypothesis, before any hypothesis rescoring, is 18 times smaller than the conventional model. These sequence-to-sequence models are comparable with traditional approaches on dictation test sets. However, the traditional models outperform end-to-end systems on voice-search test sets. Work is under way on building optimal models for voice-search tests as well. More work is also expected on building multi-dialect and multi-lingual systems, so that data for all dialects and languages can be combined to train one network, without the need for a separate AM, PM, and LM for each dialect or language.

Enough with understanding speech. Let’s talk about generating it

Text-to-speech (TTS) conversion, i.e., generating natural-sounding speech from text so that people can converse with machines, has been one of the top research goals in recent times. Deep neural networks have greatly improved the overall development of TTS systems, as well as enhanced individual pieces of such systems. In 2012, Google first used Deep Neural Networks (DNNs) instead of the Gaussian Mixture Models (GMMs) that were then the core technology behind TTS systems. DNNs assessed sounds at every instant in time with increased speech recognition accuracy.
Later, better neural network acoustic models were built using CTC and sequence discriminative training techniques based on RNNs. Although blazingly fast and accurate, these TTS systems were largely based on concatenative TTS, where a very large database of short speech fragments was recorded from a single speaker and then recombined to form complete utterances. Because the voice was tied to that recorded database, it was hard to modify, which led to the development of parametric TTS, where all the information required to generate the data was stored in the parameters of the model, and the contents and characteristics of the speech were controlled via the inputs to the model. WaveNet further enhanced these parametric models by directly modeling the raw waveform of the audio signal, one sample at a time. WaveNet yielded more natural-sounding speech using raw waveforms and was able to model any kind of audio, including music. Baidu then came out with its Deep Voice TTS system, constructed entirely from deep neural networks. The system was able to do audio synthesis in real time, giving up to a 400X speedup over previous WaveNet inference implementations. Google then released Tacotron, an end-to-end generative TTS model that synthesized speech directly from characters. Tacotron achieved a 3.82 mean opinion score (MOS), outperforming the traditional parametric system in terms of speech naturalness. Tacotron was also considerably faster than sample-level autoregressive methods because it generates speech at the frame level. Most recently, Google has released Tacotron 2, which took inspiration from past work on Tacotron and WaveNet. It features a Tacotron-style recurrent sequence-to-sequence feature prediction network that generates mel spectrograms, followed by a modified version of WaveNet that generates time-domain waveform samples conditioned on the generated mel spectrogram frames. The model achieved a MOS of 4.53, compared to a MOS of 4.58 for professionally recorded speech.
Deep Neural Networks have been a strong force behind the developments of end-to-end speech recognition and generation models. Although these end-to-end models have compared substantially well against the classical approaches, more work is to be done still. As of now, end-to-end speech models cannot process speech in real time. Real-time speech processing is a strong requirement for latency-sensitive applications such as voice search. Hence more progress is expected in such areas. Also, end-to-end models do not give expected results when evaluated on live production data. There is also difficulty in learning proper spellings for rarely used words such as proper nouns. This is done quite easily when a separate PM is used. More efforts will need to be made to address these challenges as well.
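As a quick check on the word error rate figures quoted above for LAS, the relative improvement can be recomputed from the two WER values. This is only a small arithmetic sketch; the function name `relative_improvement` is made up for illustration and is not taken from any Google code.

```python
def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Relative WER reduction of the new system over the baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# Figures quoted in the article (in percent).
conventional_wer = 6.7  # strong conventional system
las_wer = 5.6           # LAS end-to-end system

improvement = relative_improvement(conventional_wer, las_wer)
print(f"Relative WER improvement: {improvement:.1%}")
```

The result rounds to the 16% relative improvement stated in the text: the absolute gap is 1.1 percentage points, measured against the 6.7% baseline.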

Packt Editorial Staff

02 Feb 2018

3 min read

2nd Feb 2018 – Data Science News Daily Roundup

PdVega, a new library for pandas, Elastic Cloud Enterprise version 1.1.3, Google Analytics’ Audiences report, EverString’s Data platform, and more in today’s top stories around machine learning, blockchain, and data science news.

1. Introducing PdVega, a library for creating Interactive Vega-Lite Plots for Pandas

The PdVega library allows quick creation of interactive Vega-Lite plots from Pandas dataframes. Vega-Lite is a visualization specification that lets users declaratively describe which data features should map to which visualization features, using a well-defined JSON schema. PdVega uses an API that is nearly identical to Pandas’ built-in plotting API and is designed for easy use within the Jupyter notebook. The resulting plots are dynamic data visualizations with a minimum of boilerplate. More information is available in the official documentation.

2. Elastic Cloud Enterprise version 1.1.3 released

Elastic Cloud Enterprise (ECE) 1.1.3 has been released with an important bug fix to support Elasticsearch 6.1.x deployments. This new release adds support for Elasticsearch and Kibana version 6.1.3 by fixing a potential data loss bug that could occur during any cluster configuration change, such as an upgrade or the addition of capacity. Other bug fixes include:

- For stack versions 6.1.0 and above, Kibana now navigates to the home page when there is no data.
- Added a check that ensures the reallocation of clusters happens only after data is successfully migrated.
- Added an internal configuration flag when starting ZooKeeper that corrects a failure during an ECE upgrade.

Other minor bug fixes and changes can be found in the release notes.

3. Google Analytics rolls out new ‘Audiences’ report to analyze a website’s custom audiences

Google Analytics has introduced a new report called “Audiences”, which analyzes a website’s custom audiences. The new Audience dimension can be used in segments and custom reports. With the new Audiences report, users can view how their audience is performing and subsequently evaluate remarketing efforts. The Audiences report can display the following metrics:

- Acquisition: The volume of users an audience is sending, and how well the audience works to generate potential new business.
- Behavior: How well a site engages a particular audience based on bounce rate, pages per session, and time on site.
- Conversions: How well an audience is performing in terms of goal completions and transactions.

4. Hortonworks updates its streaming analytics platform for better data flow management

Hortonworks Inc. has updated its Hortonworks DataFlow (HDF) streaming analytics platform. It can now share and publish data flows directly to production, with improved support for complex processes. According to Scott Gnau, Hortonworks’ CTO, “The new release will be particularly useful for companies in regulated environments that need to rigorously document and govern their data”. HDF can now also be integrated with the Apache Atlas data governance and metadata framework, Hortonworks’ SmartSense problem resolution and optimization software, and the Apache Knox authentication gateway. The new release is available as of 1st Feb 2018. All Hortonworks enhancements have been incorporated into their respective open-source projects.

5. EverString announces ML powered Data Platform for B2B marketing firms

EverString announced the launch of a new Data Platform powered by machine learning to provide sales and marketing teams with company intelligence. The platform combines machine learning and AI to help keep contact, firmographic, technographic, and intent insights up to date in real time. It can automatically identify problematic data and apply machine learning to improve system-wide accuracy. With this platform, B2B companies can prioritize their pipeline to focus time and resources on high-value prospects, and maintain their growing databases with accurate data on relevant prospects.
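To make the PdVega item above more concrete, here is a minimal sketch of the kind of declarative Vega-Lite specification such a library would produce from tabular data. It does not use PdVega itself: the helper `line_chart_spec`, the sample rows, and the field names are hypothetical, and the schema URL assumes the Vega-Lite v2 schema that was current at the time.

```python
import json


def line_chart_spec(rows, x_field, y_field):
    """Build a Vega-Lite line chart spec (hypothetical helper, not PdVega's API)."""
    return {
        # Assumed schema version; adjust to the Vega-Lite release in use.
        "$schema": "https://vega.github.io/schema/vega-lite/v2.json",
        "data": {"values": rows},
        "mark": "line",
        # The encoding block maps data fields to visual channels.
        "encoding": {
            "x": {"field": x_field, "type": "quantitative"},
            "y": {"field": y_field, "type": "quantitative"},
        },
    }


# Made-up sample data standing in for the rows of a dataframe.
rows = [
    {"day": 1, "visits": 120},
    {"day": 2, "visits": 135},
    {"day": 3, "visits": 128},
]

spec = line_chart_spec(rows, "day", "visits")
print(json.dumps(spec, indent=2))
```

The point of the sketch is the mapping in `encoding`: the chart is described by stating which field drives which channel, rather than by issuing drawing commands.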

Packt Editorial Staff

01 Feb 2018

4 min read

1st Feb 2018 – Data Science News Daily Roundup

OpenAI’s seven unsolved problems, NVIDIA integrated GPU to IBM cloud, open-sourcing Psychlab, InterSystems IRIS data platform generally available, and more in today’s top stories around machine learning, deep learning, and data science news.

1. OpenAI releases new batch of seven unsolved problems

OpenAI states that it is releasing a new batch of seven unsolved problems that came up during the course of its research. These questions offer a meaningful way for new people to enter the field, as well as for practitioners to hone their skills. They are also a great way for people to get a job at OpenAI. The unsolved problems include:

- Implement and solve a multiplayer clone of the classic Snake game as a Gym environment. (One can refer to slither.io for inspiration.)
- Explore the effect of parameter averaging schemes on sample complexity and amount of communication in RL algorithms.
- Transfer learning between different games via generative models.
- Use linear attention for the Transformer model (which uses soft attention with softmax) in order to use the resulting model for RL.
- Use a learned VAE of data to perform “learned data augmentation”.
- Experimentally investigate (and qualitatively explain) the effect of different regularization methods on an RL algorithm of choice.

Excited? Read the details on the OpenAI blog.

2. DeepMind open-sources Psychlab

DeepMind has open-sourced Psychlab, a platform built on top of DeepMind Lab, for others to use. Psychlab allows direct application of methods from fields like cognitive psychology to study the behaviours of artificial agents in a controlled environment. Along with open-sourcing Psychlab, the DeepMind team has also built a series of classic experimental tasks to run on the virtual computer monitor, which has a flexible and easy-to-learn API, enabling others to build their own tasks. Read more about Psychlab and the added tasks on DeepMind’s blog.

3. NVIDIA integrates its fastest GPU accelerator within IBM Cloud: Boosts AI and HPC workloads

IBM has announced the availability of the NVIDIA Tesla V100 GPU on its Cloud, which aims to accelerate enterprise efforts in mission-critical artificial intelligence (AI), deep learning, and HPC workloads. The V100 GPU is NVIDIA's fastest and most advanced GPU accelerator on the market, says John Considine, general manager of cloud infrastructure services for IBM Watson and Cloud Platform. Users can now equip individual IBM Cloud bare metal servers with up to two NVIDIA Tesla V100 PCIe GPU accelerators. This combination of IBM's high-speed network connectivity and bare metal servers with the V100 GPUs will provide a major boost to compute-intensive workloads. In a blog post, Considine said, "With the Tesla P100 GPU accelerator, you can leverage up to 65 percent more deep learning capabilities and 50 times the performance than its predecessor". For details, visit IBM’s blog.

4. DeepMind’s new research paper on achieving symbolic generalisation in deep neural networks

DeepMind has published a new paper in the Journal of Artificial Intelligence Research (JAIR). The paper shows how deep neural networks can be extended to generalize both visually and symbolically. It proposes a Differentiable Inductive Logic framework, which can solve the tasks that traditional Inductive Logic Programming (ILP) systems are suited for. It is also robust to noise and error in the training data, which ILP cannot cope with. Further, as it is trained by backpropagation, it can be hybridised by connecting it with neural networks over ambiguous data. This provides data efficiency and generalisation beyond what neural networks can achieve on their own. Read the detailed information on DeepMind’s blog. You can also read the research paper here.

5. InterSystems IRIS data platform is now generally available

InterSystems announced the general availability of InterSystems IRIS Data Platform, the first data platform to deliver multi-workload and multi-model data management, native interoperability, and an open analytics platform in a single product. IRIS is a complete unified data platform that makes it faster and easier to build real-time, data-rich applications. It allows organizations to combine event and transactional data with large sets of historical and other data to capture untapped business opportunities and to improve operational efficiency. InterSystems IRIS Data Platform aims:

- To deliver concurrent transactional and analytic processing, and multiple data representations (including relational and non-relational models which are always synchronized) in a single database.
- To provide a complete set of interoperability capabilities for integrating disparate data and applications and creating seamless real-time business processes.
- To include business intelligence and natural language processing capabilities, and an open analytics platform that allows best-of-breed, third-party analytics to be easily incorporated through dedicated connectors and industry standards.
- To support flexible deployment options for public and private cloud, on premises, and hybrid environments.

Read the details in the official press release.
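One of the OpenAI problems listed above asks for linear attention in place of softmax attention. The sketch below illustrates, under simplifying assumptions, why dropping the softmax matters: with unnormalised linear attention, the keys and values can be folded once into a single matrix that is then reused for every query. This is only an illustration of that algebraic identity in plain Python; the function names and the toy vectors are made up, and it is not OpenAI's formulation of the problem.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def linear_attention_direct(query: Vector, keys: Matrix, values: Matrix) -> Vector:
    """Unnormalised linear attention: sum over j of (query . key_j) * value_j."""
    out = [0.0] * len(values[0])
    for key, value in zip(keys, values):
        weight = dot(query, key)  # no softmax applied to the scores
        for i, v in enumerate(value):
            out[i] += weight * v
    return out


def accumulate_memory(keys: Matrix, values: Matrix) -> Matrix:
    """Fold all (key, value) pairs into one matrix W = sum over j of key_j value_j^T."""
    d_k, d_v = len(keys[0]), len(values[0])
    memory = [[0.0] * d_v for _ in range(d_k)]
    for key, value in zip(keys, values):
        for a in range(d_k):
            for b in range(d_v):
                memory[a][b] += key[a] * value[b]
    return memory


def linear_attention_from_memory(query: Vector, memory: Matrix) -> Vector:
    """Same result as the direct form, computed as query^T W."""
    d_v = len(memory[0])
    return [sum(query[a] * memory[a][b] for a in range(len(query))) for b in range(d_v)]


# Toy inputs, chosen only for illustration.
keys = [[1.0, 0.0], [0.0, 2.0]]
values = [[1.0, 1.0], [3.0, 0.0]]
query = [0.5, 1.0]

direct = linear_attention_direct(query, keys, values)
via_memory = linear_attention_from_memory(query, accumulate_memory(keys, values))
print(direct, via_memory)
```

Because the accumulated matrix has a fixed size regardless of how many key-value pairs were folded in, it can be updated step by step like a recurrent state, which is what makes the approach attractive for RL settings.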

Packt Editorial Staff

31 Jan 2018

4 min read

31st Jan 2018 – Data Science News Daily Roundup

Hyperledger Sawtooth 1.0, an implementation of AlphaGoZero, enhancements in PostgreSQL 11, SAS AI offerings, and more in today’s top stories around machine learning, blockchain, and data science news.

1. Hyperledger Sawtooth 1.0, the second blockchain framework from Hyperledger, is now production ready

Hyperledger announced the availability of their second blockchain framework, Hyperledger Sawtooth 1.0. It is the latest open source digital ledger project, after Hyperledger Fabric, which reached version 1.0 in July 2017. Sawtooth 1.0 is equipped with several new enterprise features:

- On-chain governance: Users can now utilize smart contracts to vote on blockchain configuration settings such as the allowed participants and smart contracts.
- Advanced transaction execution engine: The engine can process transactions in parallel to accelerate block creation and validation.
- Support for Ethereum: Sawtooth runs Solidity smart contracts and allows integration with Ethereum tooling.
- Dynamic consensus: Users can upgrade or swap the blockchain consensus protocol on the fly, enabling the integration of more scalable algorithms as they become available.

Interested users can download the code here. They can also read the official documentation here.

2. Minigo: An open-source implementation of the AlphaGoZero algorithm

Minigo is a pure Python implementation of a neural network-based Go AI, using TensorFlow. It is inspired by DeepMind's AlphaGo algorithm. Minigo is based on Brian Lee's MuGo, which is a pure Python implementation of the first AlphaGo paper. The project provides a clear set of learning examples using TensorFlow, Kubernetes, and Google Cloud Platform for establishing reinforcement learning pipelines on various hardware accelerators. It reproduces the methods of the original DeepMind AlphaGo papers through an open-source implementation and open-source pipeline tools. The project aims to contribute data, results, and discoveries for the benefit of the Go, machine learning, and Kubernetes communities. More information is available at the official GitHub repo.

3. PostgreSQL 11 plans to add enhancements to Partitioning & Indexes

PostgreSQL 11 is due for release this year, and the team plans to add some enhancements to partitioning and indexes. The idea is to allow Partitioned Tables to have Referential Integrity by way of Primary Keys and Foreign Keys; some additional tweaks can also be expected. Foreign Keys (FKs) are implemented using row Triggers, so support for Triggers on Partitioned Tables is needed before FKs can be enforced on them. Primary Keys are implemented using Unique Indexes, so support for Unique Indexes on Partitioned Tables is needed first. The features, in the order in which they have to be implemented, are:

- Create Index on Partitioned Tables
- Allow Unique Index on Partitioned Tables
- Create Triggers on Partitioned Tables
- Allow FKs on Partitioned Tables

For more detail on this news, visit the website.

4. SAS launches new AI offerings for Text Analytics, Data Mining, and Machine Learning

SAS, an analytics software development firm, has released a variety of new offerings for its SAS Viya Platform. These include SAS Visual Text Analytics and significant enhancements to SAS Visual Data Mining and Machine Learning. SAS Visual Text Analytics is a modern and flexible framework that can perform text mining, contextual extraction, categorization, sentiment analysis, and search operations. It extracts value from unstructured data using NLP, machine learning, and linguistic rules. The software allows users to prepare data for analysis, visually explore topics, build text models, and deploy them within existing systems or business processes. SAS Visual Data Mining and Machine Learning now offers an end-to-end visual environment for data access, data wrangling, sophisticated model building, and deployment. It uses in-memory, distributed processing to answer critical business queries. It also supports programming from popular open source languages like Python and R.

5. Cisco advances its intent-based networking with new analytics services

Cisco has introduced three new analytics tools to advance its intent-based networking services. These analytics services are assurance products spanning the entire networking portfolio:

- Network Assurance Engine continually verifies network health and uses models to pinpoint issues with the network. It uses continuous verification of the entire network to help keep a business running as intended, even as the network changes dynamically. Cisco's ACI and Tetration connect to the Network Assurance Engine to link network and application monitoring.
- DNA Center Assurance is a service that connects users and application behavior to make predictions. It provides problem isolation so IT teams can find a root cause quickly, replicate problems, and offer guided remediation.
- Meraki Network Health is a cloud IT management tool that automates network and IT operations. The tool finds poorly performing access points and provides insights to improve service.

Packt Editorial Staff

30 Jan 2018

4 min read

30th Jan 2018 – Data Science News Daily Roundup

Microsoft releases new data science tools, TensorFlow publishes an implementation of SPINN, MachineLabs now supports private labs, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Microsoft Releases DataScience Tools for Interactive Data Exploration and Data Modeling

Microsoft has introduced the early preview release of the Data Science Utilities developed by the Team Data Science Process (TDSP). At present, the Data Science Utilities are released in the GitHub repository and include:

- Interactive Data Exploration, Analysis, and Reporting (IDEAR) in R, MRS, and Python: tools developed for data scientists to interactively explore, visualize, and analyze data sets prior to building models. The Python version of IDEAR is delivered through Jupyter Notebooks, which run on Jupyter Notebook servers and notebook services with a Python 2.7 or 3.5 kernel, as long as the required Python libraries are installed on the notebook server.
- Automated Modeling and Reporting in R (AMAR in R): a tool that creates an automated workflow for generating and comparing multiple modeling approaches on a data set.

One can easily run these utilities on sample data in the Data/Common directory. To read more on this, visit the GitHub repo.

2. TensorFlow publishes an implementation of SPINN written with Eager execution

TensorFlow recently published an implementation of SPINN written with Eager execution. Stack-Augmented Parser-Interpreter Neural Network (SPINN) is a recursive neural network that utilizes syntactic parse information for natural language understanding. It was originally described in the paper A Fast Unified Model for Parsing and Sentence Understanding. The TensorFlow implementation is based on Jek Bradbury's PyTorch implementation. It includes model definition and training routines; a pipeline for loading and preprocessing the SNLI data and GloVe word embeddings, written using the tf.data API; saving and loading of checkpoints; TensorBoard summaries for monitoring and visualization; and more. More information can be found at the GitHub repo.

3. MachineLabs now supports private labs

MachineLabs has announced support for Private Labs. MachineLabs is an open platform for sharing machine learning experiments with others, which means that labs can be viewed by and shared with everyone, even via a browser. Users who want to work in secrecy, use the platform for a company’s internal tasks, or experiment by trial and error without cluttering the public labs can now use private labs. Making a lab private is as easy as setting the private flag in the lab's settings. Once private, the lab and its executions are visible only to the user. Users can find all their public and private labs on their own profile page, where private labs are marked with a small “private” badge. Learn more about Private Labs in MachineLabs’ blog post.

4. Aureum 5.3 to power predictive analytics for Data-Driven Industrial World

Peaxy announced the release of Aureum 5.3, a data access solution that provides a foundation for industrial digital twins and predictive analytics. Manuel Terranova, CEO, Peaxy, says, “Aureum has evolved since 2012 from an advanced distributed data platform to an incredibly useful infrastructure component in complex analytical solutions and predictive applications. Our team of engineers are experts in supporting predictive analytics solutions to difficult industrial problems at enterprise scale.” Aureum 5.3 is being used by Fortune 100 companies in the aviation, power generation, and oil & gas industries as an essential data staging area for analytics that solve real-world business problems. Learn more on the website.

5. Dodge Data & Analytics launches Dodge Construction Central

Dodge Data & Analytics announced the launch of Dodge Construction Central, a single unified hub where all construction industry and project stakeholders can discover, share, and access new and unique insights from across the entire construction ecosystem and along the full project lifecycle to make timely, data-driven decisions. Dodge Construction Central delivers deep intelligence to project stakeholders from the most comprehensive industry data cloud. It empowers project stakeholders to collaborate with project teams and integrate insights directly into their business processes by leveraging artificial intelligence, advanced analytics, collaboration, and workflow automation technologies. To learn about the new capabilities offered by Dodge Construction Central, you can visit this website.

Packt News

29 Jan 2018

2 min read

Packt teams up with Humble Bundle to bring developers a selection of mobile development bundles

Following on from the popularity of the Python bundle at the start of January, Packt is once again working with Humble Bundle to bring readers a selection of tech resources - this time on all things mobile development. Find the offer on Humble Bundle.

Featuring a combination of Packt’s most popular and latest, cutting-edge mobile development titles, such as Mastering iOS 11 Programming, Second Edition, and Mastering React Native, it’s an opportunity for mobile developers - from beginners to experienced professionals - to stock up on skills for 2018.

‘Mobile development is a field where innovation is happening at an impressive rate - this means developers have a great opportunity to get involved in some really exciting projects’ says Heather Gopsill, Head of Channels at Packt. ‘The books we’ve put together with the team at Humble Bundle will help readers develop essential skills to equip them for the future of mobile. That they can do this while donating to incredible causes is even better!’

Readers can pay from $1 to purchase:

- Android Application Development Cookbook, Second Edition
- iOS 10 Programming for Beginners
- Learning Ionic, Second Edition
- Creating Cross Platform Games with Xamarin
- Three months of Mapt Pro

Readers who pay from $8 can also purchase:

- Android Programming for Beginners
- Mastering Android Studio 3
- Swift 4 Protocol-oriented Programming
- Xamarin 4.x Cross Platform Application Development
- Mastering React Native
- Ionic 2 Solutions
- Kotlin Fundamentals
- React Native Projects

And readers who contribute from $15 can get all of those titles as well as these:

- React Native Cookbook
- Android Development with Kotlin
- Mastering Cross-Platform Development with Xamarin
- Mastering iOS 11 Programming, Second Edition
- Mastering Swift 4, Fourth Edition
- React and React Native
- Swift 4 Programming Cookbook
- Developing iOS 11 Applications Using Swift 4
- iOS 11 Programming with Swift
- Mastering Kotlin for Android Development

Customers will be invited to donate to featured charities code.org and Charity: Water, but will also be able to select from Humble Bundle’s huge database of partner charities. The bundle offer ends on 12th February 2018.

Packt Editorial Staff

29 Jan 2018

4 min read

29th Jan 2018 – Data Science News Daily Roundup

TensorFlow 1.5.0, DataNucleus AccessPlatform 5.1.6, Databricks comes to Microsoft Azure, Citus 7.2, and more in today’s top stories around machine learning, deep learning, and data science news.

1. TensorFlow 1.5.0 now generally available with preview of TensorFlow Lite

TensorFlow 1.5.0 is now generally available. Previously, TensorFlow 1.5 RC was announced on 4th January, 2018. TensorFlow 1.5 brings many new features and changes. The breaking changes revolve around the prebuilt binaries, which are now built against CUDA 9 and cuDNN 7. Also, starting from the 1.6 release, prebuilt binaries will use AVX instructions. Other major features and improvements include:

- The availability of the eager execution preview version.
- The availability of the TensorFlow Lite dev preview.
- Accelerated Linear Algebra (XLA) related changes.
- Addition of streaming_precision_recall_at_equal_thresholds, a method for computing streaming precision and recall with O(num_thresholds + size of predictions) time and space complexity.
- RunConfig default behavior will no longer set a random seed, making random behavior independently random on distributed workers.
- The implementation of tf.flags is replaced with absl.flags.
- Support for CUBLAS_TENSOR_OP_MATH in fp16 GEMM.
- Support for CUDA on NVIDIA Tegra devices.

For details on bug fixes and other changes, see the full release notes here.

2. A new version of DataNucleus AccessPlatform is now available

DataNucleus AccessPlatform is Apache 2 licensed and provides persistence and retrieval of Java objects to a range of datastores using JDO/JPA/REST APIs, with a range of query languages. Version 5.1.6 has now been released, with many enhancements and bug fixes. The changes and addressed issues include:

- ClassUtils.getConstructorWithArguments doesn’t allow skipping the type check of one of the arguments.
- Support for queries with "IS NULL" / "IS NOT NULL" added.
- Support for String toUpperCase/toLowerCase/trim/trimLeft/trimRight/substring in JDOQL/JPQL added.
- Support for Numeric cos/sin/tan/acos/asin/atan/toDegrees/toRadians in JDOQL/JPQL added.
- Unable to execute an UPDATE JPQL Query against a domain class that contains 'Set' in its name.
- Lists might appear empty while they are actually not (forEach).
- Retrieval code doesn't handle primitive retrieval when not existing in database.
- Inequality Filter method, .ne(), gives QueryExecutionException.
- Query with candidate being base of inheritance tree using "complete-table" strategy fails when overriding the "id" column name.
- JDOQL query fails when using reference to interface field, and implementations share table.
- @Basic @Lob ArrayList<byte[]> entity field results in erroneous metamodel.
- @Basic @Lob Serializable entity field results in erroneous metamodel.

The entire changelog can be found in their release notes.

3. Databricks integrates with Microsoft Azure

Until now, Databricks' services were available as a single cloud offering based on the Amazon Web Services (AWS) cloud. As of 27th January 2018, a new flavor of the Apache Spark service has been announced. Called Azure Databricks (ADB), it is based on and tightly integrated with Microsoft Azure. The Apache Spark-based analytics platform is optimized for the Microsoft Azure cloud services platform. It provides one-click setup, streamlined workflows, and an interactive workspace that allows data scientists, data engineers, and business analysts to collaborate. This new service is a first-party offering from Microsoft. It consists of three major parts: a notebook-based collaborative workspace, the Databricks Runtime, and a serverless compute model. ADB has direct support for Azure Blob Storage and Azure Data Lake Store. It also integrates with Cosmos DB and Azure Active Directory. More information is available here.

4. A new version of Citus (7.2), the distributed database is now released

Citus has announced version 7.2 of its distributed database. Citus 7.2 extends distributed SQL support for queries that run on data spread across a cluster of machines. The changes for distributed queries in Citus 7.2 include:

- Common Table Expressions (CTEs).
- Complex subqueries.
- Set operations (UNION, INTERSECT, etc).
- Joins between distributed and local tables through CTEs.
- Joins that include non-equality clauses.
- Partition management automation with pg_partman.

Citus 7.2 is compatible with PostgreSQL 9.6 and 10. It can be downloaded by following the instructions here. It can also be deployed in a single click through the Citus Cloud console. To learn more about the new features, visit the official blog.

5. CapLinked announces TransitNet, a new blockchain framework

CapLinked has launched its new blockchain framework, TransitNet, to protect and record enterprise transactions. TransitNet is a decentralized protocol that protects digital assets and permanently records data access. The protocol is accessible via an API and is used to apply protections and activity tracking to information exchanged during business deals. TransitNet adds security to transfer of funds, as opposed to Ripple and other decentralized technologies that are addressing payments. TransitNet’s decentralized application will allow users to apply protections and track their digital assets when they’re transferred to third parties. They’ll be able to encrypt, watermark, and set access parameters for digital assets being moved, and track their movement on an immutable decentralized ledger.
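The TensorFlow item above mentions computing precision and recall at equally spaced thresholds in O(num_thresholds + size of predictions) time. One way to reach that bound, sketched here in plain Python, is to bucket each prediction by the highest threshold it passes and then take running sums from the top threshold down. This is an illustration of the idea only, not TensorFlow's code: the function name, the sample data, and the convention of reporting precision 1.0 when nothing is predicted positive are all assumptions.

```python
def precision_recall_at_equal_thresholds(labels, predictions, num_thresholds=5):
    """Precision and recall at thresholds i / (num_thresholds - 1), i = 0..n-1.

    Assumes predictions lie in [0, 1]. A prediction p counts as positive at
    threshold t when p >= t.
    """
    n = num_thresholds
    tp_bucket = [0] * n  # true labels, bucketed by highest threshold passed
    fp_bucket = [0] * n  # false labels, bucketed the same way

    # One pass over the predictions: O(size of predictions).
    for label, p in zip(labels, predictions):
        idx = min(int(p * (n - 1)), n - 1)
        if label:
            tp_bucket[idx] += 1
        else:
            fp_bucket[idx] += 1

    total_positives = sum(tp_bucket)
    precision, recall = [0.0] * n, [0.0] * n
    tp = fp = 0

    # One pass over the thresholds, highest first: O(num_thresholds).
    for i in range(n - 1, -1, -1):
        tp += tp_bucket[i]
        fp += fp_bucket[i]
        # Assumed convention: precision is 1.0 when no item is predicted positive.
        precision[i] = tp / (tp + fp) if (tp + fp) else 1.0
        recall[i] = tp / total_positives if total_positives else 0.0
    return precision, recall


# Made-up sample data for illustration.
labels = [1, 0, 1, 1, 0]
predictions = [0.9, 0.8, 0.6, 0.3, 0.1]
precision, recall = precision_recall_at_equal_thresholds(labels, predictions)
print(precision)
print(recall)
```

A naive approach would rescan all predictions for every threshold, costing O(num_thresholds × size of predictions); the bucketing step is what removes the product.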

Packt Editorial Staff

25 Jan 2018

4 min read

25th Jan 2018 – Data Science News Daily Roundup

Microsoft’s New Windows Desktop Program, SAP HANA adds Geospatial Data, Alphabet launches Chronicle, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Microsoft’s New Windows Desktop Program Analytics for developers

Microsoft has launched a new online toolkit that developers can use to analyze how well their software is performing. It can identify bugs and other potential issues that detract from the user experience. The new Windows Desktop Program is a one-stop portal where developers can view their desktop application analytics or access the data via an API. After signing up for the Windows Desktop Application Program and registering your certificates, you can use the analytics reports to:

View a summary of all failure types, sorted by number of hits
Drill down into each failure and download stack traces and CAB files to debug the issue faster
Compare the health status and adoption of a newly released version of your application to previous releases
View health data in aggregate or by region, allowing you to isolate issues that are specific to a region
Compare performance and adoption of your desktop applications across Windows versions, such as the latest Windows 10 or Windows Insider releases

To know more about the Windows Desktop Application Program, check out the video from the Windows Developer series.

2. SAP HANA adds Geospatial Data to its database offerings

SAP announced that it is adding geospatial data to its growing list of database offerings as it launches a partnership with spatial analytics specialist Esri. The aim of this partnership is to adopt SAP’s in-memory data and application development platform, with Esri running its ArcGIS enterprise geodatabase on the SAP HANA platform. The integration is designed to boost performance and scaling, with the goal of integrating location with other enterprise data sets. Among the promised applications are geographic information systems (GIS), mapping, advanced visualizations, and spatial analytics. For detailed information on this topic, visit the website.

3. Alphabet launches Chronicle to track hackers with the help of machine learning

Alphabet, the parent company of Google, has launched Chronicle. Chronicle is developing technology that finds hackers faster than humans currently can. The company will build tools that use machine learning to identify the signs of hackers in company systems and shorten the time it takes to stop a breach. It joins a crowded field of cybersecurity firms that are trying to help companies find hackers sooner. Chronicle aims to give organizations a much higher-resolution view of their security situation by combining machine learning, large amounts of computing power, and large amounts of storage. To know more about Chronicle, visit this website.

4. BigTime Software launches reporting and data analytics tool

BigTime Software, Inc., announced the launch of its resource allocation product within its Premier platform. The new tool gives enterprise managers a real-time view of reporting and data analytics. The product is SaaS-based and scales to hundreds of enterprise users. Brian Saunders, CEO of BigTime Software, stated, “We accelerated the rollout of our Premier product based on the demand from beta customers who said the platform had an immediate impact on project visibility and profitability, particularly in managing multiple projects and large staffing teams across multiple time zones.” Click on the link for detailed information on this topic.

5. Ray framework close to production-ready

Ray, a high-performance distributed execution framework, is targeted at large-scale machine learning and reinforcement learning applications. It is being developed by a team of UC Berkeley professors including Ion Stoica, who has previously developed large-scale systems like Apache Spark. Ray comes with two libraries for accelerating deep learning and reinforcement learning development:

Ray Tune: Hyperparameter Optimization Framework
Ray RLlib: Scalable Reinforcement Learning Library

To explore the Ray framework, visit the codebase on GitHub and its documentation.
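
The kind of workload a hyperparameter optimization framework such as Ray Tune targets, evaluating many configurations and keeping the best one, can be sketched with Python's standard library alone. The example below is only a stand-in under stated assumptions: it does not use Ray's API, the objective function `validation_loss` and the search space are hypothetical, and a local thread pool takes the place of distributed workers.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def validation_loss(config):
    """Toy objective standing in for a real training run (hypothetical)."""
    lr, batch_size = config["lr"], config["batch_size"]
    # By construction the minimum is at lr=0.01, batch_size=64.
    return (lr - 0.01) ** 2 * 1e4 + abs(batch_size - 64) / 64


def grid_search(search_space, objective, max_workers=4):
    """Evaluate every combination in the grid concurrently; return the best."""
    names = sorted(search_space)
    configs = [
        dict(zip(names, values))
        for values in product(*(search_space[name] for name in names))
    ]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        losses = list(pool.map(objective, configs))
    # Compare on the loss only, since dicts are not orderable.
    best_loss, best_config = min(zip(losses, configs), key=lambda pair: pair[0])
    return best_config, best_loss


search_space = {"lr": [0.001, 0.01, 0.1], "batch_size": [32, 64, 128]}
best_config, best_loss = grid_search(search_space, validation_loss)
print(best_config, best_loss)
```

A distributed framework would spread the same independent evaluations over many processes or machines; the structure of the search (build configurations, evaluate each, keep the minimum) stays the same.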


Packt Editorial Staff

24 Jan 2018

4 min read

24th Jan 2018 – Data Science News Daily Roundup


Changes in MySQL 8.0.4, PyCharm Early Release, Dundas BI 5, Cortex 5 to accelerate Enterprise AI Adoption, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Changes in MySQL 8.0.4 (Release Candidate)

MySQL has made several changes in the 8.0.4 release candidate. Here are some of them:

Some configuration options in InnoDB and replication have been deprecated and will be removed.
The Performance Schema now uses SHA-256 hashes for statement digests rather than MD5 hashes.
There are some packaging notes included for Linux, Windows, and macOS.
A new RESTART SQL statement is available that enables restarting a MySQL server instance from within a client session.
The MySQL test suite now includes CRC32() tests.
X Plugin now supports Caching SHA-2 Pluggable Authentication.

These and many other changes are explained in detail in the link here.

2. PyCharm Early Release 2018.1 EAP now available

The Early Access Program (EAP) for PyCharm 2018.1 is here. Here’s what’s new in 2018.1 EAP 1:

Scientific Project type: You can create a new scientific project straight from the new project screen. Scientific projects are created by default with a Conda environment and will scaffold the directory for your data.
Improved HiDPI support: It now supports configurations running Windows 8.1 or higher, with multiple displays that have different scale factors or a display with a fractional scale factor.
Open Terminal from the Project tool window: Right-click a folder in the Project tool window to start a terminal in that folder.
Better code completion for Python: Includes improved stubs for the Python standard library to improve code completion for these libraries.

To read more about this news in detail, visit this link. You can also check out the release notes.

3. Dundas Data Visualization announces its Dundas BI 5

Dundas Data Visualization, Inc. announced today the availability of Dundas BI 5, the latest release of its fully customizable BI and analytics platform. Dundas BI 5 offers complete 360-degree data control with new methods for analyzing data, including advanced visual analytics, new built-in integrations, and an improved user experience. Listed below are some features of Dundas BI 5:

Advanced Predictive Analytics with forecasting and clustering, using Python
Tree and Chord Diagrams for visual detection of hidden relationships
Heat Maps and other data mapping enhancements
A new homepage enabling faster and easier navigation around data content
Easier and faster Data Preparation with greater data visibility
New Advanced Visual Interactions for intuitive point-and-click analysis

To read about this coverage in detail, visit the website.

4. Litmus Automation Announces Loop Insights: A platform for complex IoT Analytics and Visualization

Litmus Automation, a provider of the Industrial Internet of Things (IIoT) platform, today announced the availability of Loop Insights. Loop Insights is a live dashboard, offered with the Loop platform, for complex IoT analytics and visualization. It is one of several flexible modules designed to provide information and insight on IoT integrations in just a few clicks. Loop Insights enables the creation of derived complex visualizations and analysis for measuring key data and device KPIs. It is also suited to monitoring downtime and uptime to improve overall equipment effectiveness. Vatsal Shah, Co-founder and CEO of Litmus, says, “We have developed Loop Insights to keep all business users on an IoT project on the same page. While IoT projects are growing rapidly, we see customers capturing a vast amount of data without the ability to properly process and put that data to work for their business. The Loop IoT platform, with Loop Insights, brings data from the edge to the internal service provider for better business intelligence in real-time.” For a detailed read, visit this link.

5. CognitiveScale launches Cortex 5 to Accelerate Enterprise AI Adoption

CognitiveScale, a leading augmented intelligence software company, today announced Cortex 5. It is the next generation of augmented intelligence cloud software powered by artificial intelligence (AI) and blockchain technology. Matt Sanchez, CTO and co-founder, CognitiveScale, states, “Our Cortex 5 platform helps businesses derive rapid benefit from AI powered business processes by bridging the data, skills, and tooling gaps between data science workflows and the software development lifecycle.” Cortex 5 is designed to help businesses with limited ML expertise start building their own high-quality AI systems through three interrelated cloud-based software offerings:

AI Marketplace
AI Platform
AI Systems

Cortex 5 is available on both Amazon AWS and Microsoft Azure cloud environments and supports both enterprise and hybrid cloud deployments. To read about the cloud-based software offerings in detail, you may visit the website.
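
Returning to the first item above, the move from MD5 to SHA-256 for Performance Schema statement digests can be illustrated with Python's `hashlib`. This is a rough sketch, not MySQL's actual algorithm: the `normalize` helper is a hypothetical, heavily simplified stand-in for the server's statement normalization, and the example statements are invented.

```python
import hashlib
import re


def normalize(statement):
    """Simplified, hypothetical normalization: replace quoted and numeric
    literals with '?' and collapse runs of whitespace."""
    statement = re.sub(r"'[^']*'", "?", statement)
    statement = re.sub(r"\b\d+\b", "?", statement)
    return re.sub(r"\s+", " ", statement).strip()


def digest(statement, algorithm):
    """Hash the normalized statement text with the named algorithm."""
    data = normalize(statement).encode("utf-8")
    return hashlib.new(algorithm, data).hexdigest()


a = "SELECT * FROM orders WHERE id = 42"
b = "SELECT  *  FROM orders WHERE id = 7"

# Statements that differ only in literals share one digest.
print(normalize(a))
print(digest(a, "sha256") == digest(b, "sha256"))

# SHA-256 yields a 256-bit digest (64 hex characters); MD5 yields 128 bits (32).
print(len(digest(a, "md5")), len(digest(a, "sha256")))
```

The practical difference visible here is digest width: anything that stores or compares these values has to accommodate 64 hexadecimal characters instead of 32.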


Packt Editorial Staff

23 Jan 2018

4 min read

23rd Jan 2018 – Data Science News Daily Roundup


MariaDB Server 10.3, Detectron open sourced by Facebook, Salesforce Sales Cloud and Google Analytics 360 team up, and more in today’s top stories around machine learning, deep learning, and data science news.

1. Announcing MariaDB Server 10.3 second Beta release

MariaDB, one of the fastest growing open source relational databases, announces the second beta release of its server 10.3. The most powerful enhancement of this release is System Versioned Tables, which can process temporal data. This addition opens up new use cases for the server, such as retrospective and trend data analysis, forensic discovery, and data auditing. System Versioned Tables could be used for compliance, audit, risk analysis, or position analysis. The MariaDB server 10.3.4 beta includes various other features and enhancements, among them:

Enhancements in Database Compatibility
Better user flexibility
Enhancements in performance/storage
Enhancements in the Storage Engine

To read about each one in detail, visit the link given.

2. Facebook AI research open sources Detectron

Facebook AI Research (FAIR) open sourced Detectron today. Detectron is a state-of-the-art platform for object detection research. The Detectron project has supported a large number of projects, including Mask R-CNN and Focal Loss for Dense Object Detection, which won the Marr Prize and Best Student Paper awards, respectively, at ICCV 2017. These Detectron-powered algorithms provide intuitive models for important computer vision tasks, such as instance segmentation, and have played a key role in the advancement of visual perception systems that the Facebook community has achieved in recent years. A number of Facebook teams also use this platform to train custom models for a variety of applications, including augmented reality and community integrity. To know more about Detectron, visit the GitHub repository.

3. Salesforce Sales Cloud teams up with Google Analytics 360

After announcing its partnership with Salesforce in November 2017, Google today announced their first integration: Salesforce Sales Cloud with Google Analytics 360. Sales pipeline data from Sales Cloud can now be imported directly into Google Analytics 360. With this, marketers can combine offline sales data with their digital analytics data to see a complete view of the conversion funnel. The built-in connections between Analytics 360 and Google’s media buying platforms offer additional ways to find new customers and drive incremental revenue. Enterprises such as Rackspace and Carbonite are already benefiting from this integration, saving hours of piecing together data and reaching new, more valuable audiences. To know about this integration in detail, visit the link given here.

4. IBM and Salesforce combine for next level analytics

IBM and Salesforce have decided to combine Watson and Einstein service cloud to deliver new AI-driven recommendations for future best actions. With AI-driven predictive analytics, companies will be able to create personalized, customer-triggered interactions based on the latest call or messaging chat, to help build stronger connections with their customers. Marc Benioff, chairman and CEO, Salesforce, said, “The combination of IBM Cloud and Watson services with Salesforce Einstein and Quip will deliver even more innovation to empower companies to connect with their customers in a whole new way, leveraging the power of the cloud and AI.” IBM also plans to build new IBM Watson Quip Live Apps, which bring together the power of IBM’s Watson and Salesforce’s Quip. These interactive custom-built applications will be embedded directly into any Quip document to increase the effectiveness of sales teams across the lifecycle of an opportunity.

5. ggtern version 2.2.2 released

ggtern is a software package for the statistical computing language R and an extension to ggplot2. ggtern version 2.2.2 has just been submitted to CRAN, and it includes new features such as Ternary Hexbin and Ternary Tribin. Ternary Hexbin bins points in a regular hexagonal grid to produce a pseudo-surface. Ternary Tribin operates in much the same way, except that the binwidth no longer has meaning; instead, the density (number of panels) of the triangular mesh is controlled exclusively by the ‘bins’ argument. To read more on this news in detail, visit the link: www.ggtern.com
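
The idea behind triangular binning of ternary data can be sketched in a few lines of Python. This is an illustration under assumptions, not ggtern's implementation (ggtern is an R package): the function `tribin_index`, the indexing scheme, and the reading of `bins` as the number of segments per side of the triangle are all hypothetical.

```python
from collections import Counter
from math import floor


def tribin_index(a, b, c, bins):
    """Map a composition (a, b, c) to one cell of a triangular mesh.

    Each side of the ternary triangle is split into `bins` segments, which
    yields bins**2 small triangles. A cell is identified by the integer
    parts of the three scaled coordinates.
    """
    total = a + b + c
    if total <= 0 or min(a, b, c) < 0:
        raise ValueError("parts must be non-negative with a positive sum")
    # Normalize so the parts sum to 1, then scale to mesh units.
    i, j, k = (floor(bins * part / total) for part in (a, b, c))
    # A point exactly on a mesh vertex gives i + j + k == bins; assign it
    # to one of the adjacent cells by stepping back along the first axis
    # that allows it.
    if i + j + k == bins:
        if i > 0:
            i -= 1
        elif j > 0:
            j -= 1
        else:
            k -= 1
    return (i, j, k)


# Hypothetical compositions, given as raw parts that need not sum to 1.
points = [(8, 1, 1), (7, 2, 1), (4, 4, 2), (1, 1, 8), (10, 0, 0)]
counts = Counter(tribin_index(a, b, c, bins=2) for a, b, c in points)
print(counts)
```

With `bins=2` the mesh has four cells: indices summing to `bins - 1` are the three corner triangles and the index `(0, 0, 0)` is the inverted centre triangle. Raising `bins` refines the mesh without any notion of a bin width, which matches the behaviour the release note describes.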


