Tech News

18 Oct 2017

2 min read

Intel takes Facebook’s help on AI chip; Cisco uses AI to predict IT services; and more - 18th Oct.' 17 Headlines

Intel’s AI chip in news

Intel collaborates with Facebook on its upcoming artificial intelligence chip NNP

Intel said it is now working with Facebook on its much-anticipated artificial intelligence chip, which will ship by the end of this year. The chip giant is making an ambitious debut in the field of artificial intelligence with its upcoming Nervana Neural Network Processor (NNP). “We are thrilled to have Facebook in close collaboration sharing its technical insights as we bring this new generation of AI hardware to market,” CEO Brian Krzanich wrote. The processor takes its name from Nervana Systems, the chip startup that Intel acquired in 2016.

New AI services in news

Cisco using new AI services to predict IT failures

Cisco has launched two new AI-enabled services, Business Critical Services and High-Value Services, which apply machine learning and artificial intelligence to help businesses manage IT risks and predict failures. Cisco Business Critical Services will help predict opportunities by applying actionable analytics, automation and technology expertise, whereas Cisco High-Value Services will enhance the overall utilization of advanced software, solutions and the network.

SAP uses machine learning to optimize online shopping

SAP has released several machine learning, facial recognition, and Internet of Things features to improve the e-commerce experience with targeted marketing campaigns. With the updated SAP Hybris Marketing Cloud solution, companies can use the right messages to target the right customers. SAP said the personalized offers will ensure data protection and privacy.

Other Data Science News

Tableau announces its Hyper engine is now in beta release 10.5

Tableau’s much-anticipated in-memory data processing engine “Hyper” is now in beta release 10.5 and will be generally available early next year, the company announced at its recent conference in Las Vegas.
The Hyper engine was described at Tableau’s conference last year, and the company claims it solves performance problems when handling large-scale structured data extracts. In future, Hyper will be enhanced to address NoSQL and graph workloads.

IBM SPSS Modeler to teach data science, machine learning for free

IBM SPSS Modeler is now available for free as part of IBM’s Academic Initiative program, under which the company provides many software and cloud services for free or at reduced cost. Students and professors can go to ibm.onthehub.com and search for IBM SPSS Modeler. They will need to obtain a license key, which is valid for one year.

Packt Editorial Staff

17 Oct 2017

4 min read

Google’s AutoML beats human AI capacities and more - 17th Oct.' 17 Data science news headlines

Google AutoML in News

AutoML has started creating better AIs than researchers, Google says

Google’s AutoML project has started replicating itself, and the AI software is producing machine learning code with a higher efficiency rate than researchers do. AutoML was launched this year at Google’s annual developer conference in May with the aim of making machines ‘intelligent’ enough to create other intelligent machines. Now the project is yielding great success, as AutoML has started building ML software that is more powerful than human-designed AI systems, even in complicated tasks related to augmented reality and automation. CEO Sundar Pichai said Google plans to ‘democratize’ AutoML in future by making it available outside Google.

AI platforms in News

Mitchell's WorkCenter™ Assisted Review is P&C industry's first AI-driven Claim Review Solution

Mitchell, a leading provider of technology, connectivity and information solutions to the Property & Casualty (P&C) claims and Collision Repair industries, has launched an integrated workflow solution that applies artificial intelligence to the estimate review process. Named Mitchell WorkCenter™ Assisted Review, the solution uses machine-learning technology to help identify incorrect replace or repair decisions, helping insurance companies review more estimates in less time while refining estimating guidelines and consistency. Early pilot tests demonstrated that A.I.-identified claims consistently reduced the time spent on the audit and review function per claim by a substantial margin.

IZEA uses artificial intelligence on its content with ContentMine

IZEA has introduced a new feature in its IZEAx platform, named ContentMine, that automatically mines content and tags photo and video assets using artificial intelligence. Apart from text and image processing, ContentMine includes smart groups, content ratings, and several search filters, and can programmatically grab screenshots of published social media content.
“ContentMine serves as an intelligent repository for all the content generated through IZEA campaigns, and allows marketers to upload content produced outside our platform as well,” said Ted Murphy, Founder and CEO of IZEA. “If an Instagram picture taken by an influencer contains a dog and a car, ContentMine will programmatically identify those objects and make them searchable. Marketers can use the content analysis engine built into ContentMine to reduce the time historically spent tagging and manually organizing content.”

Other trending data science news

IBM’s new services make cloud migration easier, faster and more affordable

IBM Cloud Migration Services and IBM Cloud Deployment Services are two new services from IBM that provide less expensive and faster ways to move business data and applications to the cloud. With Cloud Migration Services, businesses can assess their existing IT infrastructure and plan their migration to the cloud accordingly. Cloud Deployment Services, by contrast, is a next-gen automation platform for building private and hybrid clouds across multiple platforms and service providers. Overall, IBM claims the new services drastically reduce design, build, deployment and testing effort.

AMA initiates integrated big data analytics platform IHMI to organize health data

In what could usher in a new era of patient care, the American Medical Association has announced that it is working on a big data analytics platform named Integrated Health Model Initiative (IHMI) to develop a common data model that could improve the way healthcare information is organized and shared. AMA is collaborating with Cerner Corporation, IBM, Intermountain Healthcare, PCORI, AMIA, and SNOMED on this project. “We spend more than three trillion dollars a year on health care in America and generate more health data than ever before.
Yet some of the most meaningful data, data to unlock potential improvements in patient outcomes, is fragmented, inaccessible or incomplete,” AMA CEO James Madara noted.

Razorthink Big Brain: An advanced deep learning platform that automates data science tasks

Razorthink has launched a data science platform that automates the data preparation, modeling, evaluation and deployment of deep learning solutions. The automation platform is named Razorthink Big Brain, and it generates Expert AIs for customized business cases with superior predictive analytics. Built with hybrid algorithms that learn without human intervention, the Big Brain platform can use deep learning neural networks to discover insights that are not possible with traditional machine learning algorithms.

Packt Editorial Staff

16 Oct 2017

3 min read

IBM’s new blockchain platform, Elasticsearch's availability on Alibaba Cloud, and more - 16th Oct.' 17 Headlines

Blockchain in News

IBM launches blockchain network for international banking across 12 currency corridors

The use of digital currency received yet another boost as tech giant IBM introduced its blockchain network for cross-border transactions. IBM has collaborated with KlickEx Group and Stellar.org on this platform, which the company described as a ‘paradigm shift’ toward transferring money digitally across the world in real time. Currently, the network is handling transactions in a regulated environment across 12 currency corridors encompassing the Pacific Islands, Australia, New Zealand and the United Kingdom, but scalability and volume could increase soon. Jesse Lund, IBM’s VP of Blockchain, predicted a "drastic shift in the construct of payments infrastructure" within the next five years.

Universa, Yellowrockets announce world’s first decentralized Blockchain accelerator

Universa and Yellowrockets have reportedly launched the world’s first decentralized Blockchain accelerator. Recruitment for the accelerator has started, Alexander Borodich, founder of the Universa blockchain platform, announced. The YellowRockets team will organize and manage the accelerator programs, for which applications must be submitted by 10th of November on www.urockets.com. After the projects are examined and shortlisted, a PreCamp will be held at the BAZAAR Tech Convention in Sochi from Nov. 16-19, and only after that will the best blockchain startups be selected for the acceleration program. It could be a good meeting point for blockchain start-ups, industry experts, and investors.

Ethereum implements Blockchain Hard Fork to Byzantium

Ethereum, the second largest cryptocurrency by market cap, has officially updated with the first half of the Metropolis hard fork, nicknamed Byzantium.
The Byzantium upgrade is part of the Metropolis protocol designed to improve the blockchain by boosting network privacy and making it easier for decentralized applications (dapps) to proliferate on the platform. In 2015, Ethereum introduced a large-scale upgrade in its roadmap under the name Metropolis, but the upgrade encountered substantial delays. As a result, Metropolis was broken into two phases: Byzantium and Constantinople.

Other Data Science News

Alibaba, Elastic collaborate to add Elasticsearch on Alibaba Cloud

The Alibaba Group and Elastic have joined hands to offer Elasticsearch on the Alibaba Cloud platform. The new service is called Alibaba Cloud Elasticsearch. Customers of Alibaba Cloud can now deploy Elastic’s real-time search, data ingestion, and analytic features as a hosted, turnkey solution, according to an announcement during a keynote at The Computing Conference 2017. “Alibaba Cloud Elasticsearch will be a highly differentiated service as it uses Elastic’s advanced search product and powerful X-Pack features across every tier of our service in a way that is easy to get started, consume, and manage,” said Yeming Wang, Deputy General Manager, Alibaba Cloud Global. The product is available immediately and includes Elastic’s Kibana and X-Pack features.

Aarthi Kumaraswamy

13 Oct 2017

2 min read

Top 15 Applications of Machine Learning on Twitter

Packt Editorial Staff

13 Oct 2017

3 min read

Microsoft, AWS join forces for Gluon; Google, IBM unveil open API Grafeas; and more - 12th Oct.' 17 Headlines

Microsoft, Amazon announce deep learning interface Gluon that is accessible to all developers

Amazon Web Services and Microsoft have together developed a new open source deep learning library called Gluon that is accessible to all developers. Using the interface, developers of all skill levels can build neural networks using simple, concise code, without sacrificing performance. Gluon will help developers build machine learning models using a simple Python API and a range of pre-built, optimized neural network components. “We created the Gluon interface so building neural networks and training models can be as easy as building an app,” said Swami Sivasubramanian, VP of Amazon AI. Gluon currently works with Apache MXNet and will support Microsoft Cognitive Toolkit (CNTK) in an upcoming release.

Google, IBM launch open source API Grafeas for governing software supply chains as “central source of truth”

IBM and Google have announced the launch of an open source initiative called Grafeas, which offers developers a uniform way of auditing and governing their software supply chains. The software supply chain has several stages, such as code, build, test, deploy and operate. At each stage, different tools generate metadata about various software components. Grafeas provides an open API that captures and aggregates this metadata. Using the API, developers can easily track when and where the code was changed and who changed it, whether the code passed a security scan, and what types of vulnerabilities were found if it failed. As part of Grafeas, Google is also introducing Kritis, which helps developers create Kubernetes governance policies based on the metadata stored in Grafeas. IBM said it will offer Grafeas and Kritis as part of the IBM Container Service on IBM Cloud. Grafeas and Kritis are Greek words which mean “scribe” and “judge” respectively.
Other Data Science News

Box announces Box Skills to manage growing multimedia content with artificial intelligence

To manage the growing amount of multimedia content, Box has launched a new artificial intelligence toolkit called Box Skills. The company announced that its Box Skills framework will integrate the best AI and machine learning tools directly into the content on Box in a secured environment. Box Skills gives customers the flexibility to use algorithms from various companies, and to mix and match machine learning tools from Google, IBM, and Microsoft. Making the announcement at its BoxWorks conference, Box previewed three specific skills it will initially offer in public beta: audio intelligence, using technology from IBM Watson; video intelligence, powered by Microsoft Cognitive Services; and image intelligence, using Google Cloud Platform. Box Skills will be available in beta in early 2018.

TensorFlow receives hardware support from NVIDIA and Movidius

TensorFlow now has new hardware support from NVIDIA and Movidius. TensorFlow will now run on NVIDIA’s Jetson TX2 and Intel’s Movidius chip. The Movidius Neural Compute Stick Software Development Kit (NC SDK) now supports TensorFlow. TensorRT 3 is part of the NVIDIA Deep Learning SDK; TensorRT includes the TensorRT Optimizer and runtime. DIGITS 6 also supports TensorFlow (DIGITS stands for NVIDIA Deep Learning GPU Training System).

Vora 2.0 released, SAP partners can now deploy Vora on multiple cloud systems

SAP has launched a new edition of Vora, its big data analytics software. The new release, Vora 2.0, offers SAP partners multi-cloud deployment options. It uses a container architecture and leverages the open source Kubernetes platform for deployment, thus simplifying overall deployment and cluster management on public cloud. The product is thus cloud-ready, and more hybrid-ready.

Packt Editorial Staff

12 Oct 2017

3 min read

GitHub's plan for coding automation, TensorFlow releases TensorFlow Lattice - 11th Oct.' 17 Headlines

GitHub's plan for coding automation, TensorFlow Lattice release, and more in today’s data science news.

GitHub in News

GitHub will leverage its 10 years of data to automate coding and offer project insights

Introducing new features in what it called "just the start of a longterm roadmap", GitHub announced several automated coding features at its GitHub Universe conference this week. GitHub intends to leverage the data aggregated on its platform over the past 10 years, and to demonstrate how machine learning and data science can be applied to software development. The new tools will help developers track dependencies, keep code secure and discover new projects. Its new “dependency graph” feature gives developers insights into the projects they depend on, indicating whether the software is up to date or still supported by a community, and provides detailed information on its license and security vulnerabilities.

TensorFlow in News

TensorFlow team releases TensorFlow Lattice

The TensorFlow team has announced the release of TensorFlow Lattice, which is meant to ensure that machine learning models follow global trends even when training data is noisy. The team said that TensorFlow Lattice is a library that implements Monotonic Calibrated Interpolated Look-Up Tables in TensorFlow. The library includes a collection of regularizations and monotonicity constraints configurable per feature. It has a set of TensorFlow estimators for regression and classification with the most common setups for lattice models, and includes lattices and piecewise linear calibration as layers that can be composed into custom models. TensorFlow Lattice is not an official Google product.

TensorFlow 1.4.0 released, several custom functions and bug fixes added

TensorFlow 1.4.0-rc0 has been released, as per the official announcement on the TensorFlow Twitter page. Among the new features, tf.data is now part of the core TensorFlow API and several other custom transformation functions have been added.
The release also fixes bugs that required attention, such as the race condition in TensorForest TreePredictionsV4Op. In TensorFlow 1.4.0, Google Cloud Storage file system and Hadoop file system support are now default build options. API changes include the removal of seldom used and unnecessary functions. The API is now subject to backwards compatibility guarantees.

In other Data Science News

ViewLift adds AI on its platform to get insights on customer behavior, drive targeted retention strategies

Leading content distribution platform ViewLift has integrated artificial intelligence engine technology into its platform services. ViewLift Intelligence, or VLI, will use advanced machine learning and AI algorithms to leverage its data and offer enhanced customer behavioral insights and retention tools for operators. ViewLift Intelligence can analyze user viewing behavior, content preferences, subscription packages, acquisition method, and device preferences. The company says it can accurately predict which paying subscribers are likely to cancel subscriptions in the near future. This could help operators build targeted strategies to reduce the churn rate across multiple channels.

Packt Editorial Staff

11 Oct 2017

5 min read

NVIDIA unveils supercomputer Pegasus, IBM integrates Data Science Experience - 10th Oct' 17 Headlines

NVIDIA says its supercomputer Pegasus will drive fully autonomous robotaxis

In what could truly make self-driving cars a reality, NVIDIA has designed the world's first AI computer, codenamed “Pegasus”, that is capable of handling Level 5 driving without requiring steering wheels, pedals, or mirrors. It will instead rely on sensors, cameras, radars and lidars to drive fully autonomous robotaxis. The advanced computing system, NVIDIA Drive PX Pegasus, extends the capabilities of its predecessor NVIDIA Drive PX 2 by more than 10 times in terms of processing power and performance. "Driverless cars will enable new ride- and car-sharing services. New types of cars will be invented, resembling offices, living rooms or hotel rooms on wheels. Travelers will simply order up the type of vehicle they want based on their destination and activities planned along the way. The future of society will be reshaped," NVIDIA founder and CEO Jensen Huang said. Hundreds of tech companies are striving to bring autonomous self-driving cars to the road, and Pegasus will be marketed to them from the second half of 2018, the company said in its announcement. Shares of NVIDIA hit a record high following the news.

IBM in News

IBM advances analytics by integrating PowerAI and Data Science Experience

IBM is bringing its two key data science tools, Data Science Experience and IBM PowerAI, together. The company said in an announcement that the integration is intended to provide machine learning and deep learning on a single machine. The Data Science Experience gives users collaboration tools for managing and monitoring data models, according to Dinesh Nirmal, IBM’s vice president of analytics development. PowerAI, meanwhile, brings in GPUs as well as deep learning libraries and algorithms that can be used on multiple frameworks, such as TensorFlow, he said.
With this integration, users can create and train intelligence-led models using the deep learning frameworks to gain expanded data insights. Nirmal said that while 80% of enterprise problems can be solved with machine learning, there are specific use cases where deep learning is more effective. “If you’re running a huge neural network, that complexity requires deep learning. Or if you’re FedEx, to know what happened to a damaged box and how it got damaged, you would use deep learning. Anything that is data and process intensive,” he noted.

Others in Data Science News

Sage launches Sage Business Cloud to provide unified set of business solutions

Sage, the leading provider of cloud business management solutions, has unveiled Sage Business Cloud. The platform offers a set of core products and add-on applications as a complete solution that meets unique business needs. The company claimed that Sage Business Cloud could be the “only cloud platform that businesses will ever need” and that it could also use the latest advancements in AI and machine learning to further help businesses improve productivity and efficiency. “Sage Business Cloud is the next transformative wave of business software. As the fourth industrial revolution continues to take hold, we want to make our customers' lives simple. Businesses of all shapes and sizes need products that aid productivity, enable them to respond at lightning speed and deliver insights as well as opportunity,” Sage CEO Stephen Kelly said.

Puppet partners Google to offer customers cloud platform modules supporting migration and management

Puppet has entered into a collaboration with Google Cloud that could offer its customers Google Cloud Platform (GCP) services, including its advanced machine learning and data analytics capabilities. The partnership may also help slash their IT costs.
Puppet is known for its automated approach to software delivery and operations, and its customers can now take advantage of Google Cloud’s flexibility and agility as well. According to the joint announcement, Google Cloud will also release the technology it used to generate modules so that the Puppet module ecosystem can move faster, keeping up with rapidly changing APIs in the cloud. “Our customers want choice, flexibility and the ability to manage everything they have, from their physical infrastructure to cloud resources for maximum operational efficiency and scale,” said Nigel Kersten, Chief Technical Strategist at Puppet. “With Google Cloud's expertise in providing world class infrastructure and Puppet's widely adopted enterprise management platform, we're helping customers accelerate their move to the cloud.”

NICE accelerates machine learning capabilities in next evolution of cognitive process automation

NICE has announced the next evolution in its cognitive automation platform: an integration with technology partner Celaton to infuse NICE Robotic Automation with enhanced machine learning capabilities. According to the company, this integration slashes manual effort by as much as 85% across some of the most complex business processes, and reduces process time by almost 95%. With cognitive machine learning capabilities, complex data is quickly consumed and interpreted, and sound judgments are made by robots, which are instructed to respond to customer queries or complaints in an intelligent and highly personalized manner. “Robotic Process Automation has already made great strides globally by significantly impacting business efficiencies and ROI.
We have now entered a new era of cognitive automation, and we are delighted to be at the forefront of innovation as we boldly expand our machine learning capabilities,” said Miki Migdal, president of the NICE Enterprise Product Group. “The integration with Celaton not only addresses many of the more complex and challenging business problems facing our customers today, but also marks a significant contribution to the cognitive automation arena.”

Packt Editorial Staff

10 Oct 2017

3 min read

Uber open sources AthenaX, Cortana says ‘hi’ on Skype, and more - 9th Oct' 17 - Headlines

Uber unveils open source streaming analytics platform AthenaX

To serve users better with actionable insights, Uber has built an SQL-based streaming analytics platform named AthenaX. The in-house platform has been open sourced on GitHub. As its business grew, Uber needed an infrastructure that could analyze real-time events and was easy to navigate. “AthenaX empowers our users, both technical and non-technical, to run comprehensive, production-quality streaming analytics using Structured Query Language (SQL),” the company said in its announcement. “Our real-world experience shows that AthenaX enables users to bring large-scale streaming analytic workloads in production within a matter of hours compared to weeks.”

Qlik’s “Visualize Your World” Data Analytics 2017 Tour kicks off

Qlik has commenced its annual “Visualize Your World” data analytics global tour, which is being held in 27 cities around the world. Over 15,000 registrants may attend the event this year across Asia Pacific, the Middle East, Europe, Africa, and the Americas. “Following a tremendously successful 2016 Tour, we are excited to once again host these events to connect with people in the region who are passionate in learning more about the biggest technology trends in the data analytics space and how blending machine learning with human intuition and creativity creates a multiplier effect for their businesses,” said Julian Quinn, Vice President at Qlik for APAC regions. Qlik will unveil some of its latest innovations at the event. Registration is free.

MicroStrategy 10.9 introduces Dossiers that could “deliver analytics for everyone”

MicroStrategy Inc. has announced general availability of MicroStrategy 10.9, the newest feature release, which introduces Dossier, a new storybook experience around analytics. Dossier is an interactive, streamlined interface that presents relevant data analytics in chapters and pages in a format everyone can understand. “MicroStrategy 10.9 represents the biggest leap forward since our MicroStrategy 10 platform launch and underscores our vision of delivering ‘Intelligence Everywhere’,” said Tim Lang, senior executive vice president and chief technology officer at MicroStrategy. “We believe collaborative analytics accelerates the velocity of decision making. That's why we're introducing Dossier, an easier and faster method of consuming analytics that we believe end users are going to love. MicroStrategy 10.9 empowers users to do more with their analytics regardless of their technical skill or role.”

Other Data Science News

Python package pomegranate releases latest version 0.8.0

A new version of pomegranate, a Python package for probabilistic modeling, has been released. pomegranate v0.8.0 adds several new capabilities, such as built-in out-of-core learning, built-in parallelism, minibatch learning, and semi-supervised learning. Also, multivariate Gaussian distributions can now use a GPU through the CuPy package, pomegranate developer Jacob Schreiber said in an announcement, adding that this sped up operations by around 4x in test runs. pomegranate v0.8.0 is not yet compatible with networkx v2.0, and users may need to downgrade networkx to use pomegranate. Detailed documentation has been released for pomegranate v0.8.0, including an FAQ for each section.

Microsoft introduces AI assistant Cortana into Skype

Microsoft has added the AI assistant Cortana to Skype. Every Skype user will now see Cortana in their contact list, where it can be used either for one-on-one chats that answer queries with suggested replies, or for conversations that involve scheduling events, searching for nearby restaurants, or sharing IMDB movie reviews. The gradual rollout of Cortana on Skype has begun for iOS and Android users in the U.S., Microsoft announced, adding that the feature currently does not work in voice or video calls.

Packt Editorial Staff

06 Oct 2017

10 min read

Stream me up, Scotty!

Note: The following is an excerpt from the book Scala and Spark for Big Data Analytics, Chapter 9, Stream me up, Scotty - Spark Streaming, written by Md. Rezaul Karim and Sridhar Alla. It explores the big three stream processing paradigms that are in use today.

In today's world of interconnected devices and services, it is hard to spend even a few hours a day without our smartphone to check Facebook, or hail an Uber ride, or tweet something about the burger we just bought, or check the latest news or sports updates on our favorite team. We depend on our phones and the Internet for a lot of things, whether it is to get work done, or just browse, or e-mail a friend. There is simply no way around this phenomenon, and the number and variety of applications and services will only grow over time.

As a result, smart devices are everywhere, and they generate a lot of data all the time. This phenomenon, also broadly referred to as the Internet of Things, has changed the dynamics of data processing forever. Whenever you use any of the services or apps on your iPhone, Droid or Windows phone, real-time data processing is at work in some shape or form. Since so much depends on the quality and value of the apps, there is a lot of emphasis on how the various startups and established companies are tackling the complex challenges of SLAs (Service Level Agreements), and on the usefulness and timeliness of the data.

One of the paradigms being researched and adopted by organisations and service providers is the building of very scalable, near real-time or real-time processing frameworks on cutting-edge platforms or infrastructure. Everything must be fast and also reactive to changes and failures. You would not like it if your Facebook feed updated once every hour or if you received email only once a day; so, it is imperative that data flow, processing, and usage are all as close to real time as possible.
Many of the systems we are interested in monitoring or implementing generate a lot of data as an indefinite, continuous stream of events. As in any data processing system, we have the same fundamental challenges of data collection, storage, and data processing. However, the additional complexity is due to the real-time needs of the platform. In order to collect such indefinite streams of events and then process all such events to generate actionable insights, we need to use highly scalable, specialized architectures to deal with tremendous rates of events. As such, many systems have been built over the decades, including AMQ, RabbitMQ, Storm, Kafka, Spark, Flink, Gearpump, Apex, and so on.

Modern systems built to deal with such large amounts of streaming data come with very flexible and scalable technologies that are not only very efficient but also help realize business goals much better than before. Using such technologies, it is possible to consume data from a variety of data sources and then use it in a variety of use cases almost immediately or at a later time as needed.

Let us talk about what happens when you book an Uber ride on your smartphone to go to the airport. With a few touches on the smartphone screen, you're able to select a point, choose the credit card, make the payment, and book the ride. Once you're done with your transaction, you then get to monitor the progress of your car in real time on a map on your phone. As the car is making its way toward you, you're able to monitor exactly where the car is, and you can also decide to pick up coffee at the local Starbucks while you're waiting for the car to pick you up. You could also make informed decisions regarding the car and the subsequent trip to the airport by looking at the expected time of arrival of the car.
If it looks like the car is going to take quite a bit of time picking you up, and if this poses a risk to the flight you are about to catch, you could cancel the ride and hop in a taxi that just happens to be nearby. Alternatively, if it so happens that the traffic situation is not going to let you reach the airport on time, thus posing a risk to the flight you are due to catch, you also get to make a decision regarding rescheduling or canceling your flight.

Now, in order to understand how real-time streaming architectures such as Uber’s Apollo work to provide such invaluable information, we need to understand the basic tenets of streaming architectures. On the one hand, it is very important for a real-time streaming architecture to be able to consume extreme amounts of data at very high rates while, on the other hand, also ensuring reasonable guarantees that the data that is getting ingested is also processed. The following diagram shows a generic stream processing system with a producer putting events into a messaging system while a consumer is reading from the messaging system.

Processing of real-time streaming data can be categorized into the following three essential paradigms:

At least once processing
At most once processing
Exactly once processing

Let's look at what these three stream processing paradigms mean to our business use cases. While exactly once processing of real-time events is the ultimate nirvana for us, it is very difficult to always achieve this goal in different scenarios. We have to compromise on the property of exactly once processing in cases where the benefit of such a guarantee is outweighed by the complexity of the implementation.
Stream Processing Paradigm 1: At least once processing

The at least once processing paradigm involves a mechanism to save the position of the last event received only after the event is actually processed and the results persisted somewhere so that, if there is a failure and the consumer restarts, the consumer will read the old events again and process them. However, since there is no way to know whether those events were processed fully, partially, or not at all before the failure, fetching them again can produce duplicates. This results in the behavior that events get processed at least once.

At least once is ideally suited to any application that involves updating some instantaneous ticker or gauge to show current values. Any cumulative sum, counter, or anything that depends on the accuracy of aggregations (sum, groupBy, and so on) does not fit this kind of processing, simply because duplicate events will cause incorrect results.

The sequence of operations for the consumer is as follows:

1. Save results
2. Save offsets

Below is an illustration of what happens if there is a failure and the consumer restarts. Since the events have already been processed but the offsets have not been saved, the consumer will read from the previously saved offsets, thus causing duplicates. Event 0 is processed twice in the following figure:

Stream Processing Paradigm 2: At most once processing

The at most once processing paradigm involves a mechanism to save the position of the last event received before the event is actually processed and the results persisted somewhere so that, if there is a failure and the consumer restarts, the consumer will not try to read the old events again. However, since there is no guarantee that the received events were all processed, this causes a potential loss of events as they are never fetched again. This results in the behavior that the events are processed at most once or not processed at all.
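The two paradigms described so far differ only in whether the offset is saved after or before the results. The following is a minimal sketch of that difference, not an implementation from the book: the event list, the results list, the offset dictionary, and the injected `Crash` are all hypothetical in-memory stand-ins for a messaging system, a results store, and an offset store.

```python
# Hypothetical in-memory stand-ins for a message log, a results store,
# and an offset store; none of these names come from the excerpt.
EVENTS = ["event-0", "event-1", "event-2"]


class Crash(Exception):
    """Simulated consumer failure."""


def run_consumer(events, results, state, mode, crash_at=None):
    """Consume events starting from the saved offset.

    mode="at_least_once": save results, then save the offset.
    mode="at_most_once":  save the offset, then save results.
    crash_at is the offset at which a failure is injected between
    the two steps (None means no failure).
    """
    offset = state["offset"]
    while offset < len(events):
        if mode == "at_least_once":
            results.append(events[offset])   # 1. save results
            if offset == crash_at:
                raise Crash()
            state["offset"] = offset + 1     # 2. save offsets
        else:
            state["offset"] = offset + 1     # 1. save offsets
            if offset == crash_at:
                raise Crash()
            results.append(events[offset])   # 2. save results
        offset += 1


def simulate(mode):
    """Run once with a failure at offset 0, then restart the consumer."""
    results, state = [], {"offset": 0}
    try:
        run_consumer(EVENTS, results, state, mode, crash_at=0)
    except Crash:
        pass  # the consumer restarts from whatever offset was saved
    run_consumer(EVENTS, results, state, mode)
    return results


at_least_once = simulate("at_least_once")
at_most_once = simulate("at_most_once")
print(at_least_once)  # ['event-0', 'event-0', 'event-1', 'event-2']
print(at_most_once)   # ['event-1', 'event-2']
```

In this sketch, event 0 shows up twice under at least once and never under at most once, which mirrors the duplicate and the gap that the text describes.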
At most once is ideally suited to any application that involves updating some instantaneous ticker or gauge to show current values, as well as any cumulative sum, counter, or other aggregation, provided accuracy is not mandatory and the application does not need absolutely all events. Any events lost will cause incorrect or missing results.

The sequence of operations for the consumer is as follows:

1. Save offsets
2. Save results

Below is an illustration of what happens if there is a failure and the consumer restarts. Since the events have not been processed but the offsets are saved, the consumer will read from the saved offsets, causing a gap in the events consumed. Event 0 is never processed in the following figure:

Stream Processing Paradigm 3: Exactly once processing

The exactly once processing paradigm is similar to the at least once paradigm: it involves a mechanism to save the position of the last event received only after the event has actually been processed and the results persisted somewhere so that, if there is a failure and the consumer restarts, the consumer will read the old events again and process them. Since there is no way to know whether those events were processed fully, partially, or not at all before the failure, fetching them again can produce duplicates. However, unlike the at least once paradigm, the duplicate events are not processed and are dropped, thus resulting in the exactly once paradigm.

The exactly once processing paradigm is suitable for any application that involves accurate counters or aggregations, or that in general needs every event processed only once and also definitely once (without loss).

The sequence of operations for the consumer is as follows:

1. Save results
2. Save offsets

The following illustration shows what happens if there is a failure and the consumer restarts.
Since the events have already been processed but the offsets have not been saved, the consumer will read from the previously saved offsets, thus causing duplicates. Event 0 is processed only once in the following figure because the consumer drops the duplicate event 0:

How does the exactly once paradigm drop duplicates? There are two techniques which can help here:

Idempotent updates
Transactional updates

Idempotent updates involve saving results based on some unique ID/key generated so that, if there is a duplicate, the generated unique ID/key will already be in the results (for instance, a database) and the consumer can drop the duplicate without updating the results. This is complicated, as it's not always possible or easy to generate unique keys. It also requires additional processing on the consumer end. Another point is that the database can be separate for results and offsets.

Transactional updates save results in batches that have a transaction beginning and a transaction commit phase so that, when the commit occurs, we know that the events were processed successfully. Hence, when duplicate events are received, they can be dropped without updating results. This technique is even more complicated than idempotent updates, as now we need some transactional data store. Another point is that the database must be the same for results and offsets.

You should look into the use case you're trying to build and see if at least once processing or at most once processing can be applied reasonably widely and still achieve an acceptable level of performance and accuracy.

If you enjoyed this excerpt, be sure to check out the book Scala and Spark for Big Data Analytics it appears in. You will also like this exclusive interview on why Spark is ideal for stream processing with Romeo Kienzler, Chief Data Scientist in the IBM Watson IoT worldwide team and author of Mastering Apache Spark, 2nd Edition.
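The two duplicate-dropping techniques from the excerpt, idempotent updates and transactional updates, can be illustrated with a small sketch. It assumes an in-memory SQLite database as the data store; the table names, the consumer name `c1`, and the helper functions are hypothetical and chosen only for illustration, and a production stream processor would use its own storage and offset management.

```python
import sqlite3


def open_store():
    """Create a hypothetical store holding both results and offsets."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE results (event_id TEXT PRIMARY KEY, value INTEGER)")
    db.execute("CREATE TABLE offsets (consumer TEXT PRIMARY KEY, position INTEGER)")
    db.execute("INSERT INTO offsets VALUES ('c1', 0)")
    db.commit()
    return db


def idempotent_save(db, event_id, value):
    """Idempotent update: a duplicate event_id is ignored, not re-applied."""
    db.execute(
        "INSERT OR IGNORE INTO results (event_id, value) VALUES (?, ?)",
        (event_id, value),
    )
    db.commit()


def transactional_save(db, position, event_id, value, fail=False):
    """Transactional update: result and offset commit together or not at all."""
    with db:  # commits on success, rolls back if an exception is raised
        db.execute(
            "INSERT INTO results (event_id, value) VALUES (?, ?)",
            (event_id, value),
        )
        if fail:
            raise RuntimeError("simulated failure before commit")
        db.execute(
            "UPDATE offsets SET position = ? WHERE consumer = 'c1'",
            (position + 1,),
        )


# Idempotent path: the same event is delivered twice but stored once.
db = open_store()
idempotent_save(db, "event-0", 10)
idempotent_save(db, "event-0", 10)  # duplicate delivery after a restart
idempotent_rows = db.execute("SELECT COUNT(*) FROM results").fetchone()[0]

# Transactional path: a failure rolls back the result and the offset,
# so the retry after restart applies the event exactly once.
db2 = open_store()
try:
    transactional_save(db2, 0, "event-0", 10, fail=True)
except RuntimeError:
    pass  # nothing was committed
rows_after_failure = db2.execute("SELECT COUNT(*) FROM results").fetchone()[0]
transactional_save(db2, 0, "event-0", 10)
final_rows = db2.execute("SELECT COUNT(*) FROM results").fetchone()[0]
final_position = db2.execute(
    "SELECT position FROM offsets WHERE consumer = 'c1'"
).fetchone()[0]
```

In the idempotent path the unique key does the deduplication, so results and offsets could live in different stores. In the transactional path both tables must sit in the same database, because the single commit is what keeps the result and the offset in step.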

Packt Editorial Staff

06 Oct 2017

3 min read

Google Compute Engine memory levels raised, Edward merges into TensorFlow and more - 5th Oct' 17 Headlines

Google Compute Engine launched with up to 96 CPU cores and 624 GB of memory Google Compute Engine has announced an offering with 64 CPU cores and 416 GB of memory. Google has thus doubled the memory Compute Engine previously offered with 32 cores. “These machine types run on Intel Xeon Scalable processors (codenamed Skylake), and offer the most vCPUs of any cloud provider on that chipset. Skylake in turn provides up to 20% faster compute performance, 82% faster HPC performance, and almost 2X the memory bandwidth compared with the previous generation Xeon,” Google said in its announcement. Users can also adjust their workload requirements with custom CPU and memory configurations in case they don’t require that much power. In the future, Google is even considering products that deliver up to 4TB of memory. Edward merges into TensorFlow Edward, a Python library for probabilistic modeling, inference and criticism, has announced its official merger into TensorFlow. Dustin Tran, who leads the development of Edward, announced that for now Edward will be in the contrib module to avoid redundancy with other submodules. “We’re not sure if all of Edward’s features will be in TensorFlow just yet: for example, it’s unclear where to put Edward’s precise PPL. That said, expect that in this move many new innovations in Edward’s design will appear as we make programmable inference far more flexible, more generally compatible with hardware and distributed choices, and most importantly, more accessible by researchers and applied MLers alike,” Dustin said in the official announcement. Other Data Science News PostgreSQL 10 released: logical replication, declarative table partitioning key features PostgreSQL Global Development Group has announced the release of PostgreSQL 10. The latest version includes several additions that were long anticipated such as native logical replication, declarative table partitioning, and improved query parallelism. 
"Our developer community focused on building features that would take advantage of modern infrastructure setups for distributing workloads," said Magnus Hagander, a core team member of the PostgreSQL Global Development Group. The versioning for PostgreSQL has henceforth been revised to "x.y" format, meaning the next minor release will be 10.1 and next major release will be 11. Microsoft Azure Functions adds support for Java Microsoft’s Azure Functions serverless computing platform will now support Java, the company announced at the JavaOne conference in San Francisco. Azure Functions has so far supported C#, JavaScript, F#, PHP, Python, Bash, Batch and PowerShell, and now the service intends to tap the large developer base of Java. To use Azure Functions, Java developers will not have to learn any new tools. Microsoft is coming up with a Maven plugin using which developers can write and deploy the Maven-enabled apps directly to Azure Functions. PyPy v5.9 released with added support for Pandas and NumPy PyPy has announced the release of its version 5.9, and it now supports Pandas and NumPy too. PyPy 5.8 was released earlier this year in June, where the growing community of PyPy users had reported cases of bugs and other issues. The latest version has several incremental improvements, and the PyPy team has advised that its users go for an update to resolve several ongoing performance issues. According to the announcement, PyPy has released both PyPy3.5 v5.9 (a beta-quality interpreter for Python 3.5 syntax) and PyPy2.7 v5.9 (an interpreter supporting Python 2.7 syntax). NumPy and Pandas now work on PyPy2.7 (together with Cython 0.27.1). CFFI, which has been updated to 1.11.1, now supports complex arguments in API mode, as well as char16_t and char32_t and has improved support for callbacks.

Packt Editorial Staff

05 Oct 2017

3 min read

Dragonchain ICO, DeepMind’s ethical compass, Google’s Teachable Machine and more - 4th Oct 17 Headlines

DeepMind sets up a separate unit for ethical regulations around AI Google-owned DeepMind has set up DeepMind Ethics & Society (DMES), a unit for societal and ethical impact of artificial intelligence. Tech consultant Sean Legassick will lead the unit, along with former Google policy manager and government adviser Verity Harding. DMES is expected to focus on six areas: privacy, transparency and fairness, economic impacts, governance and accountability, managing AI risk, AI morality and values, and how AI can address global challenges. The announcement for DMES comes a year after a group of leading technology firms, activists, and academic organizations came together to set up the Partnership on AI, pressing for best practices in AI developments. DMES is also separate from DeepMind’s secretive internal ethics and safety board, which has been functioning since 2014 when Google acquired DeepMind. Google’s Teachable Machine lets users train a web app to respond in a certain way to certain actions Google has introduced a new AI feature named “Teachable Machine” using which users can set a pattern and ask the app to identify it using a webcam. The platform promises to teach machine learning concepts in the easiest way possible – all you have to do is to turn on your webcam, train the robot brain what a certain action and motion looks like, and then tell it what kind of reactions you would like. The machine uses the JavaScript-based deeplearn.js framework, and works well with most browsers. Google introduces improved version of WaveNet on Assistant Google has announced that it is now using an updated and better version of WaveNet on Google Assistant. WaveNet was unveiled by Google a year back, as a deep neural network for generating raw audio waveforms that could produce better and more realistic speech. At the time of its launch, the platform had been called too computationally intensive to deploy in the real world. 
This had led the developers to work hard over the last 12 months to improve the speed and overall quality. At the moment WaveNet is only available for English (U.S.) and Japanese, Google said, adding that the new version is the first product to launch on its latest TPU cloud infrastructure. Other Data Science News LinkedIn announces new data analytics tool named Talent Insights LinkedIn has launched Talent Insights, a paid big data analytics product that may answer questions like what schools are producing the most successful data scientists or which companies have been hiring the most Python developers; or what skills are growing the fastest in the industry. Aimed at smarter hiring, Talent Insights will come initially with two views: “Talent Pool” and “Company report.” The product was announced in a closed beta at LinkedIn’s Talent Connect event, and is expected to be generally available in 2018. Disney’s original blockchain platform Dragonchain kicks off Initial Coin Offering (ICO) Dragonchain, the blockchain platform originally developed at Disney and now managed by the Dragonchain Foundation, launched its public Initial Coin Offering (ICO) on 2nd October, marking the one-year anniversary of Disney releasing it as open source. Tokens will be issued until 2nd November, and proceeds received will be used in providing access to Dragonchain platform services, project incubation, and professional services to support enterprises, start-ups, and entrepreneurs building applications on the platform. "Our vision for Dragonchain is a secure and flexible blockchain platform paired with a crowd scaled incubator," said Joe Roets, Founder and CEO of Dragonchain Inc. "The system is modeled to create feedback loops and accelerate blockchain projects and market success."

Packt Editorial Staff

04 Oct 2017

5 min read

3rd Oct 17 - Headlines

The Internet of Drones, Oracle OpenWorld updates including AI-infused Oracle cloud, and more in today’s data science news. FlytBase launches AI Platform for Drone applications FlytBase Inc., which has built the world’s first IoT platform for commercial drones – the “Internet of Drones” (IoD) – has released its AI Platform for Drone applications at the Drone World Expo. Continuing on its mission to bring intelligence and connectivity to commercial drones, FlytBase is enhancing its cloud and edge compute platforms to further integrate AI and machine learning solutions. With recent advancements in data visualization and AI, drones can become autonomous enough to reach near human level performance. On other occasions, AI helps drones process certain unique perspectives and features of the data that are otherwise difficult to get with human efforts. FlytBase said it is extending its platform further to leverage AI solutions for aerial image data. Oracle OpenWorld in News Oracle enhances its cloud with ‘intelligent and adaptive’ apps With Oracle Adaptive Intelligent Apps, Oracle has integrated AI and machine learning functionalities across its cloud applications. The company announced at its ongoing OpenWorld conference that the new apps will infuse AI solutions directly into Oracle Enterprise Resource Planning Cloud, Oracle Supply Chain Management Cloud, Oracle Human Capital Management Cloud and Oracle Customer Experience Cloud Suite. In addition to reacting in real time, the apps can also ‘adapt’ based on the available data and this may significantly help businesses in decision making. “The new AI capabilities combine first and third-party data with advanced machine learning and sophisticated decision science to deliver the industry’s most powerful AI-based modern business applications,” said Steve Miranda, Oracle’s executive vice president of applications development. 
Oracle cloud offers NVIDIA Tesla P100 GPU instances, V100 GPUs next Oracle announced it is now offering NVIDIA’s P100 GPU instances in its public cloud, with plans to add the more powerful V100 GPUs in the near future. Oracle bare metal cloud is now offering NVIDIA Tesla P100 GPUs for technical computing. Oracle is also working with NVIDIA to offer access to the next generation of GPUs, Tesla V100, based on the Volta Architecture in both bare metal and virtual machine compute instances. The company has called it a game changer of sorts for the customers, in ways that can help them rent a supercomputer by the hour. “Enterprises need accelerated computing to run compute-intensive AI, HPC, and advanced analytics workloads,” Ian Buck, general manager and vice president of Accelerated Computing at NVIDIA, commented on the development. “NVIDIA and Oracle’s collaboration will provide Fortune 500 companies that use Oracle Cloud on-demand access to the world’s most advanced GPU computing technology available.” Oracle unveils Oracle Container Native Application Development Platform Oracle has announced the launch of Oracle Container Native Application Development Platform at its ongoing OpenWorld conference. The frictionless, integrated platform offers a comprehensive suite of cloud services for enterprises to build, deploy, and manage container-native microservices and serverless applications. Oracle Container Native Application Development Platform includes three new services: Oracle Container Engine – a managed Kubernetes service to create and manage Kubernetes clusters; Oracle Container Registry Service – a private container registry service for storing and sharing container images across multiple deployments; and Oracle Container Pipelines – a full container lifecycle management CI/CD service. 
In its release, Oracle noted that developers want to avoid being locked-in by their cloud vendors, and therefore the cloud-neutral Oracle Container Native Application Development Platform offers them "the nirvana of the true hybrid cloud." In other Data Science News Cloud Firestore: Firebase launches second NoSQL database for app development Firebase, Google’s platform for app development, has launched a new flexible and scalable database service called Cloud Firestore. Designed to query, store and sync app data in a simpler and easier way, Cloud Firestore is a fully managed globally distributed NoSQL document database. Alex Dufetel, product manager for Firebase at Google, said that Cloud Firestore is “strongly consistent” despite being replicated across multiple regions, as it does away with complex use cases so that developing apps gets easier irrespective of the scale. “Delivering a great server-side experience for backend developers is a top priority,” Dufetel said, “We're launching SDKs for Java, Go, Python, and Node.js today, with more languages coming in the future.” Cloud Firestore is now available as a public beta. John Snow Labs Open Sources the NLP Library for Apache Spark Global data operations company John Snow Labs has released its Natural Language Processing software library for Apache Spark as open source. Written in Scala, the NLP software library contains Scala and Python APIs. 
“With JSL-NLP, we’re delivering on the promise to enable customers to take advantage of the latest open source technology and academic breakthroughs in data science, all within a high performance, enterprise-grade code base,” said the founding team, adding that “JSL-NLP encompasses a wide range of highly efficient Natural Language Understanding tools for text mining, question answering, chatbots, fact extraction, topic modelling or Search, running at a scale and performance that has not been available to date.” The NLP library will continue to be financially sponsored by John Snow Labs for its development.

Packt Editorial Staff

03 Oct 2017

4 min read

2nd Oct' 17 - Headlines

Oracle OpenWorld updates including the first ever autonomous database, Apache Solr 7.0.0 release and more in today’s data science news. Apache Solr™ 7.0.0 available The Lucene PMC announced the release of Apache Solr 7.0.0 on September 20. Solr 7 has more flexibility with two new replica types, TLOG & PULL, as updates are handled by replicas based on their types. TLOG can use its transaction log to recover and become a leader, while the PULL type replica cannot become a leader as it does not have a transaction log. In earlier releases, any replica could have become a leader when a leader was lost. Autoscaling is another new feature in Solr 7 that helps manage clusters in simpler ways with more automation. Among other features, the new version also provides rich document parsing, enhanced RESTful APIs and parallel SQL. Oracle OpenWorld in News Oracle 18c: World’s first self-driving database In what could possibly be the next generation of industry-leading databases, Oracle has launched the first-of-its-kind fully automated database called Oracle 18c. Calling automation essential to preventing and handling data theft, Oracle CTO Larry Ellison announced at the Oracle OpenWorld conference the new autonomous database that can patch itself in real time without needing to go offline. Oracle said their aim is to automate both the threat detection and the immediate remediation, without having a delay waiting for “a human to schedule downtime to gracefully implement a patch in a month or two." Oracle 18c’s data warehouse version will be available in December while the OLTP version will be available in June 2018. Oracle announces AI platform Cloud Service, chatbots to tap deep learning, machine learning capabilities At its ongoing OpenWorld event, Oracle has unveiled the AI Platform Cloud Service that may help developers quickly create and deploy enterprise AI services. 
The company also announced the availability of intelligent, AI-led chatbots in the Oracle Mobile Cloud, delivering a multi-channel platform to companies for integrating machine learning features. “Oracle AI Platform Cloud instances come pre-installed with familiar AI libraries, tools, and deep learning frameworks, including Caffe, Jupyter Notebook, Keras, NumPy, scikit-learn, and TensorFlow, among others,” Oracle said in its release, adding that machine learning practitioners can access Oracle Object Store and easily connect to existing Spark/Hadoop clusters. The AI-powered bots, which will help automate information processing and customer conversations, will work with Facebook Messenger, Skype, Slack, Kik, Amazon Echo, Amazon Dot, and Google Home. Oracle Blockchain Cloud Service may enhance security, scalability and supply chains In a major announcement, Oracle has unveiled its enterprise-grade blockchain cloud service. The advanced cloud platform, fully managed by Oracle, is expected to simplify and secure operations with its continuous backup, in-built monitoring, and point-in-time recovery features. “Enterprises can now streamline operations across their ecosystem and expand their market reach with new revenue streams, sharing data and transacting within and outside the Oracle Cloud,” said Amit Zavery, senior vice president, Oracle Cloud Platform. Oracle recently joined the open source consortium for blockchain project Hyperledger. In other Data Science News MathWorks introduces Release 2017b of the MATLAB and Simulink Product Families, adds deep learning capabilities MathWorks has announced its Release 2017b with several new features in MATLAB and Simulink. The release also includes six new products, as well as updates and bug fixes to 86 other products. R2017b boosts deep learning capabilities with several features that simplify the way researchers, engineers, and domain experts design, train, and deploy models. 
“With R2017b, engineering and system integration teams can extend the use of MATLAB for deep learning to better maintain control of the entire design process and achieve higher-quality designs faster. They can use pretrained networks, collaborate on code and models, and deploy to GPUs and embedded devices. Using MATLAB can improve result quality while reducing model development time by automating ground truth labeling,” said David Rich, MATLAB marketing director, MathWorks.

Packt Editorial Staff

29 Sep 2017

3 min read

Baidu brings AI on smartphones with Mobile Deep learning - 28th Sept' 17 - Headlines

Baidu open sources its Mobile deep learning, IBM’s HPC research and more in today’s data science news. Open source announcements in News Baidu brings AI on smartphones with Mobile Deep Learning Baidu has open sourced Mobile Deep Learning (MDL), a convolution-based neural network customized for mobile devices. MDL can identify objects in an image (taken from a smartphone camera) in a fraction of a second, and pass the suggestion to Baidu to carry forward the search process. Offering faster speed and reduced complexity, MDL supports both iOS and Android, though it may run better on Apple. The code is now available on GitHub and takes up nearly 4 MB of space. Last year, Baidu had open sourced its PaddlePaddle deep learning package, and developers suggest PaddlePaddle will be the best model to use with MDL. With Abseil, Google open sources internal C++ and Python libraries Google has open sourced Abseil, a set of libraries from the very building blocks of its internal codebase. “These libraries are the nuts-and-bolts that underpin almost everything that Google runs,” the company said, adding that Abseil was developed over the last decade to support important projects like gRPC, Protocol Buffers, and TensorFlow. Abseil includes C++ and Python utilities. While the C++ libraries are now available on GitHub under the Apache license, Google will soon make available a Python version of the library. In Other Data Science News IBM’s new Deep Learning model can slash computational expense of HPC infrastructure Researchers at IBM’s Dublin research facility claim to have developed a deep learning model that could advance high-performance computing (HPC) by 12,000 percent. Using available conditions of wave, ocean currents and winds, the framework can help in forecasting wave conditions in real time. The research indicates that simulations can be done on lower-end computing devices like Raspberry Pi, without requiring HPC infrastructure. 
The deep learning model can also be utilized to make the running HPC infrastructure train smartphones or other cheaper computing devices. Royal Bank of Canada tests Blockchain for cross-border fund transfers Canada’s largest bank, the Royal Bank of Canada (RBC), is trialing blockchain technology for payments to and from the United States. The trial allows the bank to explore the potential of the tech without fully replacing the existing system. "We wanted to set it up as a shadow ledger so that we can demonstrate our leadership in exploiting that technology while at the same time recognizing that the technology is still early in its adoption phase," RBC's executive vice president Martin Wildberger said, adding that while the technology could prove "transformative and critical," it still needs more time to mature. MapR advances its database to process real-time analytics MapR Technologies has enhanced the scope of its database MapR-DB to drive real-time analytics. The company announced that its latest database version expands the scope for self-service SQL data exploration with enhanced Drill integration, and also supports connectors to native Spark and Hive for real-time processing. The new MapR-DB version also aids real-time application integration with global data capture.

Packt Editorial Staff

28 Sep 2017

4 min read

Yahoo open sources Vespa, Salesforce CRM, Microsoft Dynamics 365 get smarter - 27th Sept' 17 Headlines

Yahoo open sources Vespa, Salesforce CRM, Dynamics 365 get smarter and more in today's data science news. Vespa: Yahoo open sources internal big data processing and serving engine After Hadoop, Yahoo has just open sourced its most important internal software named Vespa. Yahoo’s parent company Oath, which is owned by Verizon, said in an announcement that it had been using Vespa for content recommendations and searches. Vespa, which is now live on GitHub, dates back to the early 2000s and handles around 3 billion ad requests daily. “By releasing Vespa, we are making it easy for anyone to build applications that can compute responses to user requests, over large datasets, at real time and at internet scale – capabilities that up until now, have been within reach of only a few large companies,” Oath said in its release. CRM IN NEWS Salesforce releases Data Studio: A new platform for sharing data on marketing cloud Salesforce has launched a new platform for data sharing within the marketing cloud. The new product, Data Studio, gives data owners more control over the way they share data and offers marketers better access to relevant data volumes. More importantly, marketers can reach out to their existing customers through artificial intelligence. With Data Studio, publishers have authority over specific attributes such as who the potential buyers are, why they are buying the data, and for how long the data can be used. “Data marketplaces typically provide opaque access to data,” said Raji Bedi, Vice President of Product Management at Salesforce Marketing Cloud, “What our customers desire is the ability to understand the origin, fair rights, and usage of that data. 
Marketers would prefer to have data with more transparency and a deeper understanding of their audience…this means there's a more targeted reach for marketers and more revenue for data publishers.” Microsoft Adds AI capabilities to Dynamics 365 for Customer Support "Our goal is to increase satisfaction across areas where we engage the customer and within internal support teams who can work more effectively and efficiently," wrote Steve Guggenheimer, corporate vice president of Microsoft's AI unit, in a blog post. "This is accomplished by having virtual agents engage with customers to solve their issues, and seamlessly transfer to support agents only when necessary. The agents receive real-time suggestions when a customer is handed off and can provide real-time feedback to train the virtual agents to become even more effective over time." ML FOR CLOUD IN NEWS First End-to-End Automated Big Data Warehousing Platform launched in Cloud Infoworks said it has released the first and industry’s only end-to-end platform for automated big data warehousing in the cloud which will help organizations “build and deploy big data use-cases in days instead of months.” The platform, Infoworks Cloud Big Data Warehouse, uses advanced automation to handle big data infrastructure reducing the complexity. "We are enabling enterprises to rapidly modernize their data warehouse environments both on premise and in the cloud, and derive strategic value from their big data initiatives," Infoworks CEO Amar Arsikere said, “The unprecedented level of automation built into the Infoworks platform enables enterprises to rapidly design and deploy big data analytics use-cases without any coding." Cloudera Altus Data Engineering: Cloudera partners with Microsoft for Azure cloud platform Cloudera, which has been teaming up with Microsoft in a series of collaborations, has announced the upcoming beta release of Altus Data Engineering for the Microsoft Azure cloud environment. 
Cloudera Altus Data Engineering on Azure simplifies DevOps and reduces management complexity of infrastructure that are otherwise time consuming. "Enterprise customers increasingly choose Microsoft Azure for their large-scale data processing workloads. We are excited that Cloudera Altus will bring an easy-to-use, end-user focused managed service experience on Azure, that is backed by the proven enterprise-grade Cloudera distribution," said Corey Sanders, who is the director of Compute at Microsoft Azure. "Azure is the only public cloud that provides Azure Data Lake Storage designed for big data at cloud scale. Together with Cloudera Altus, we help customers build, deploy, and share analytics solutions." Hortonworks DataPlane Service will manage and govern data regardless of where it is In what could be a departure from big data architectures that consolidate data into a single data lake, Hortonworks has launched its cloud-based DataPlane Service that will analyze, manage and govern the data across environments, letting enterprises secure their data irrespective of the use case. It will capture data regardless of whether data is in motion or at rest. While DataPlane Service could be a fabric to manage all kinds of data no matter where they reside, its necessity goes beyond infrastructure reasons as data protection laws are getting stringent day by day.


Intel takes Facebook’s help on AI chip; Cisco uses AI to predict IT services; and more - 18th Oct.' 17 Headlines

Google’s AutoML beats human AI capacities and more - 17th Oct.' 17 Data science news headlines

IBM’s new blockchain platform, Elasticsearch's availability on Alibaba Cloud, and more - 16th Oct.' 17 Headlines

Top 15 Applications of Machine Learning on Twitter

Microsoft, AWS join forces for Gluon; Google, IBM unveil open API Grafeas; and more - 12th Oct.' 17 Headlines

Github's plan for coding automation, TensorFlow releases Tensorflow Lattice - 11th Oct.' 17 Headlines

NVIDIA unveils supercomputer Pegasus, IBM integrates Data Science Experience - 10th Oct' 17 Headlines

Uber open sources AthenaX, Cortana says ‘hi’ on Skype, and more - 9th Oct' 17 - Headlines

Stream me up, Scotty!

Google Compute Engine memory levels raised, Edwards merge into Tensorflow and more - 5th Oct' 17 Headlines

Trending Topics

Dragonchain ICO, DeepMind’s ethical compass, Google’s Teachable Machine and more - 4th Oct 17 Headlines

3rd Oct 17 - Headlines

2nd Oct' 17 - Headlines

Baidu brings AI on smartphones with Mobile Deep learning - 28th Sept' 17 - Headlines

Yahoo open sources Vespa, Salesforce CRM, Microsoft Dynamics 365 get smarter - 27th Sept' 17 Headlines
