
Tech News - Data


7th Dec.' 17 - Headlines

Packt Editorial Staff
07 Dec 2017
6 min read
NVIDIA's CUTLASS, ONNX 1.0, Qualcomm's Snapdragon 845 chip, Ethereum interface Status, and new machine learning changes in Google Sheets in today's top stories around machine learning, AI and data science news.

Announcing CUTLASS for fast linear algebra in CUDA C++

NVIDIA's CUTLASS will help developers build new algorithms in CUDA C++ using high-performance GEMM constructs as building blocks. NVIDIA has released CUTLASS: CUDA Templates for Linear Algebra Subroutines. It is a collection of CUDA C++ templates and abstractions for implementing high-performance GEMM computations at all levels and scales within CUDA kernels. Unlike other templated GPU libraries for dense linear algebra, the purpose of CUTLASS is to decompose the "moving parts" of GEMM into fundamental components abstracted by C++ template classes, allowing programmers to easily customize and specialize them within their own CUDA kernels. "Our CUTLASS primitives include extensive support for mixed-precision computations, providing specialized data-movement and multiply-accumulate abstractions for handling 8-bit integer, half-precision floating point (FP16), single-precision floating point (FP32), and double-precision floating point (FP64) types," NVIDIA said in its statement. "One of the most exciting features of CUTLASS is an implementation of matrix multiplication that runs on the new Tensor Cores in the Volta architecture using the WMMA API. Tesla V100's Tensor Cores are programmable matrix-multiply-and-accumulate units that can deliver up to 125 Tensor TFLOP/s with high efficiency." NVIDIA is releasing the CUTLASS source code on GitHub as an initial exposition of CUDA GEMM techniques that will evolve into a template library API.

ONNX is production ready

Announcing ONNX 1.0: Open Neural Network Exchange (ONNX), a joint initiative from Facebook and Microsoft later joined by Amazon Web Services, has reached the production milestone of version 1.0. ONNX 1.0 enables users to move deep learning models between frameworks, making it easier to put them into production. For example, developers can build sophisticated computer vision models using frameworks such as PyTorch and run them for inference using Microsoft Cognitive Toolkit or Apache MXNet (a minimal export sketch appears at the end of this story). Since the initial release of ONNX in September, numerous hardware partners including Qualcomm, Huawei, and Intel have announced support for the ONNX format on their hardware platforms, making it easier for users to run models on different hardware.

Qualcomm's new flagship chip Snapdragon 845

Qualcomm's next-generation processor, Snapdragon 845, focuses on AI, augmented and virtual reality. At its annual Snapdragon Technology Summit, Qualcomm announced updates about its latest premium processor due out next year: Snapdragon 845. Though the processor will be built on the same 10 nm process technology as the 835 (its predecessor), Qualcomm has made changes to the architecture to embrace next-generation AR and VR applications. In addition to a greater focus on imaging and AI, it will have robust battery life. Snapdragon 845 will still support Gigabit LTE speeds via the X20 modem, and it will feature four Cortex A75 and four Cortex A53 cores as its processing module. At the Summit, Xiaomi also made an appearance to announce that its upcoming flagship phone will come equipped with the Snapdragon 845 processor. The chip will also be found in non-Android devices, including Windows 10 PCs.
The Spectra 280 ISP and Adreno 630 are new additions to improve photography and video capture, along with SLAM (simultaneous localization and mapping) for collision avoidance. Qualcomm also said that the new chip will deliver a 3x performance boost in AI. The company has added support for the TensorFlow Lite and ONNX frameworks, alongside the existing TensorFlow and Caffe support.

Google Sheets is getting 'smarter'

Google is rolling out new machine learning features in its spreadsheet software to save time and surface more intuitive answers. Google said it is enhancing the "Explore" feature in Sheets with new capabilities, including formula suggestions and pivot tables powered by machine learning, to deliver faster and more useful insights. Sheets is part of Google's productivity suite, meant to rival Microsoft Corp.'s popular Excel spreadsheet software. At present, users typically type quick formulas such as =SUM or =AVERAGE into Sheets to get answers from their data. Now Google wants to introduce machine intelligence into the process, so that when users begin typing a formula, Sheets pops up suggestions for full formulas based on the context of the data in that spreadsheet. Creating pivot tables has always been tricky and time-consuming, so Sheets can now 'intelligently' suggest pivot tables to find the answers, and users can ask questions in everyday language. For example, they can type "what is the sum of revenue by salesperson?" and Sheets will suggest the best pivot table to answer that question. Additional new features in Sheets include a new user interface for pivot tables, customizable headings for rows and columns, and new ways to view data. "Now, when you create a pivot table, you can 'show values as a % of totals' to see summarized values as a fraction of grand totals," Google said. "Once you have a table, you can right-click on a cell to 'view details' or even combine pivot table groups to aggregate data the way you need it." Google also added the ability to create "waterfall charts," which provide a way to visualize sequential changes in data. Users can now also quickly import or paste "fixed-width formatted data files" into Sheets. The new updates will roll out over the coming weeks.

A new entrant to the Enterprise Ethereum Alliance

Status, the first mobile Ethereum OS, joins the Enterprise Ethereum Alliance. Status, the Ethereum blockchain-based decentralized browser with built-in chat and wallet functionality, has joined the Enterprise Ethereum Alliance (EEA), the world's largest open-source blockchain initiative. With membership spanning the Fortune 500, enterprises, startups, research facilities and even governments, the EEA's mission is to enhance the privacy, security, and scalability of Ethereum-based blockchain technologies. Status recently closed a $100 million funding round through the sale of its SNT tokens. Currently in development, it is an open source mobile platform that serves as a gateway to decentralized apps (DApps) and services built on Ethereum. The base offering enables access to encrypted messages, smart contracts, digital currency and more.
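To make the ONNX workflow described above concrete, here is a minimal, hedged Python sketch of exporting a PyTorch vision model to the ONNX format; the model choice, tensor shape, and file name are illustrative placeholders rather than details from the announcement.

```python
import torch
import torchvision

# Any torch.nn.Module with a traceable graph works; ResNet-18 is just a stand-in.
model = torchvision.models.resnet18()
model.eval()

dummy_input = torch.randn(1, 3, 224, 224)  # example input that fixes the traced shapes

# Serialize the model to the ONNX interchange format. The resulting file can be
# loaded for inference by other ONNX-compatible runtimes and frameworks.
torch.onnx.export(
    model,
    dummy_input,
    "resnet18.onnx",
    input_names=["input"],
    output_names=["logits"],
)
```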


Join us! Power BI Webinar. Wednesday 30 September 2020 – 8:00 AM – 9:00 AM PDT from Microsoft Power BI Blog | Microsoft Power BI

Matthew Emerick
28 Sep 2020
1 min read
This webinar walks through shortcuts that help you work faster in your usual context and save a lot of time when using Power BI together with Excel.


Tableau migrates to the cloud: how we evaluated our modernization from What's New

Anonymous
26 Oct 2020
7 min read
Erin Gengo, Manager, Analytics Platforms, Tableau

Cloud technologies make it easier, faster, and more reliable to ingest, store, analyze, and share data sets that range in type and size. The cloud also provides strong tools for governance and security, which enable organizations to move faster on analytics initiatives. Like many of our customers, we at Tableau wanted to realize these benefits. Having done the heavy lifting to move our data into the cloud, we now have the opportunity to reflect and share our migration story.

As we embarked on the journey of selecting and moving to a cloud-driven data platform from a conventional on-premises solution, we were in a unique position. With our mission to help people see and understand data, we've always encouraged employees to use Tableau to make faster, better decisions. Between our culture of democratizing data and rapid, significant growth, we consequently had servers running under people's desks, powering data sources that were often in conflict. It also created a messy server environment where we struggled to maintain proper documentation, apply standard governance practices, and manage downstream data to avoid duplication. When it came time to migrate, this put pressure on analysts and strained resources. Despite some of our unique circumstances, we know Tableau isn't alone in facing these challenges—from deciding what and when to migrate to the cloud, to how to better govern self-service analytics and arrive at a single source of truth. We're pleased to share our lessons learned so customers can make informed decisions along their own cloud journeys.

Our cloud evaluation measures

Because the cloud is now the preferred place for businesses to run their IT infrastructure, choosing to shift our enterprise analytics to a SaaS environment (Tableau Online) was a key first step. After that, we needed to carefully evaluate cloud platforms and choose the best solution for hosting our data. The top criteria we focused on during the evaluation phase were:

Performance: The platform had to be highly performant to support everything from ad-hoc analysis to high-volume, regular reporting across diverse use cases. We wanted fewer "knobs" to turn and an infrastructure that adapted to usage patterns, responded dynamically, and included automatic encryption.

Scale: We wanted scalable compute and storage that would adjust to changes in demand—whether we were in a busy time of closing financial books for the quarter or faced with quickly and unpredictably shifting needs, like an unexpected pandemic. Whatever we chose needed compute power that scaled to match our data workloads.

Governance and security: We're a data-driven organization, but because much of that data wasn't always effectively governed, we knew we were missing out on value that the data held. Thus, we required technology that supported enterprise governance as well as the increased security that our growing business demands.

Flexibility: We needed the ability to scale infrastructure up or down to meet performance and cost needs. We also wanted a cloud platform that matched Tableau's handling of structured, unstructured, or semi-structured data types to increase performance across our variety of analytics use cases.

Simplicity: Tableau sought a solution that was easy to use and manage across skill levels, including teams with seasoned engineers and teams without them that managed their data pipelines through Tableau Prep.
If teams quickly saw the benefit of the cloud architecture in streamlining workflows and reducing their time to insight, it would help them focus on creating data context and supporting governance that enabled self-service—a win-win for all.

Cost-efficiency: A fixed database infrastructure can create large overhead costs. Knowing many companies purchase their data warehouse to meet the highest-demand timeframes, we needed high performance and capacity, but not 24/7. That could have cost Tableau millions of dollars in unused capacity.

Measurement and testing considerations

We needed to deploy at scale and account for diverse use cases, as well as quickly get our people answers from their data to make important, in-the-moment decisions. After narrowing our choices, we followed up with testing to ensure the cloud solution performed as efficiently as we needed it to. We tested (a minimal query-timing sketch appears at the end of this story):

Dashboard load times; we tested more than 20,000 Tableau vizzes
Data import speeds
Compute power
Extract refreshes
How fast the solution allows our London and Singapore data centers to access data stored in our US-West-2a regional data center

We advise similar testing for organizations like us, but we also suggest asking some other questions to guarantee the solution aligns with your top priorities and concerns:

What could the migration path look like from your current solution to the cloud? (For us, SQL Server to Snowflake)
What's the learning curve like for data engineers—both for migration and afterward?
Is the cost structure of the cloud solution transparent, so you can reasonably forecast or estimate your costs?
Will the solution lower administration and maintenance?
How does the solution fit with your current development practices and methods, and what is the impact on processes that may have to change?
How will you handle authentication?
How will this solution fit with our larger vendor and partner ecosystem?

Tableau's choice: Snowflake

There isn't a one-size-fits-all approach, and it's worth exploring various cloud data platforms. We found that by prioritizing requirements and making careful, conscious choices about where we wouldn't make any sacrifices, a few vendors rose to the top as our shortlist for evaluation.

In our data-heavy, dynamic environment where needs and situations change on a dime, we found Snowflake met our needs and then some. It is feature-rich with a dynamic, collaborative environment that brings Tableau together—sales, marketing, finance, product development, and executives who must quickly make decisions for the health, safety, and progress of the business.

"This process had a transformational effect on my team, who spent years saying 'no' when we couldn't meet analytics demands across Tableau," explained Phillip Cheung, a product manager who helped drive the evaluation and testing process. "Now we can easily respond to any request for data in a way that fully supports self-service analytics with Tableau."

Cloud adoption, accelerated

With disruption on a global scale, the business landscape is changing like we've never experienced. Every organization, government agency, and individual has been impacted by COVID-19. We're all leaning into data for answers and clarity to move ahead. And through these times of rapid change, the cloud has proven even more important than we thought. As a result of the pandemic, organizations are accelerating and prioritizing cloud adoption and migration efforts.
According to a recent IDC survey, almost 50 percent of technology decision makers expect to moderately or significantly increase demand for cloud computing as a result of the pandemic. Meredith Whalen, chief research officer, said, “A number of CIOs tell us their cloud migration investments paid off during the pandemic as they were able to easily scale up or down.” (Source: IDC. COVID-19 Brings New C-Suite Priorities, May 2020.) We know that many of our customers are considering or already increasing their cloud investments. And we hope our lessons learned will help others gain useful perspective in moving to the cloud, and to ultimately grow more adaptive, resilient, and successful as they plan for the future. So stay tuned—as part of this continued series, we’ll also be sharing takeaways and experiences from IT and end users during key milestones as we moved our data and analytics to the cloud.
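For readers planning similar performance checks, here is a small, hedged Python sketch of the kind of query-timing test described above, using the snowflake-connector-python package. The connection details, table, and queries are placeholders, not Tableau's actual test harness.

```python
import time
import snowflake.connector  # pip install snowflake-connector-python

# Placeholder credentials and object names -- substitute your own.
conn = snowflake.connector.connect(
    user="ANALYST",
    password="********",
    account="xy12345.us-west-2",
    warehouse="REPORTING_WH",
    database="SALES",
    schema="PUBLIC",
)

queries = {
    "daily_revenue": "SELECT order_date, SUM(amount) FROM orders GROUP BY order_date",
    "top_customers": (
        "SELECT customer_id, SUM(amount) AS total "
        "FROM orders GROUP BY customer_id ORDER BY total DESC LIMIT 100"
    ),
}

cur = conn.cursor()
for name, sql in queries.items():
    start = time.perf_counter()
    cur.execute(sql)
    rows = cur.fetchall()  # fetch results so transfer time is included in the measurement
    elapsed = time.perf_counter() - start
    print(f"{name}: {len(rows)} rows in {elapsed:.2f}s")

cur.close()
conn.close()
```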


Thursday News, October 15 from Featured Blog Posts - Data Science Central

Matthew Emerick
15 Oct 2020
1 min read
Here is our selection of articles and technical contributions featured on DSC since Monday:

Announcements
Penn State Master's in Data Analytics – 100% Online
eBook: Data Preparation for Dummies

Technical Contributions
A quick demonstration of polling confidence interval calculations using simulation
Why you should NEVER run a Logistic Regression (unless you have to)
Cross-validation and hyperparameter tuning
Why You Should Learn Sitecore CMS?

Articles
AI is Driving Software 2.0… with Minimal Human Intervention
Applications of Machine Learning in FinTech
Why Fintech is the Future of Banking?
Real Estate: How it is Impacted by Business Intelligence
Determining How Cloud Computing Benefits Data Science

Enjoy the reading!


Microsoft’s Visual Studio IntelliCode gets improved features: Whole-line code completions, AI-assisted refactoring, and more!

Savia Lobo
06 Nov 2019
3 min read
At Ignite 2019, Microsoft shared a few improvements to Visual Studio IntelliCode, Microsoft's tool for AI-assisted coding that offers intelligent suggestions to improve code quality and productivity. Amanda Silver, a director of Microsoft's developer division, writes in her official blog post: "At Microsoft Ignite, we showed a vision of how AI can be applied to developer tools. After talking with thousands of developers over the last couple years, we found that the most highly effective assistance can only come from one source: the collective knowledge of the open source, GitHub community."

Latest improvements in Microsoft's IntelliCode

Whole-line code completions and AI-assisted suggestions

IntelliCode now provides whole-line code completion suggestions. IntelliCode extends the GPT-2 transformer language model to learn about programming languages and coding patterns. OpenAI's GPT model architecture can generate conditional synthetic text examples without needing domain-specific training datasets. For the initial language-specific base models, the team adopted an unsupervised learning approach that learns from over 3,000 top GitHub repositories. The base model extracts statistical coding patterns and learns the intricacies of programming languages from GitHub repos to assist developers in their coding. Based on the code context, as the user types, IntelliCode uses semantic information and sourced patterns to predict the most likely completion in line with the user's code. IntelliCode has also extended machine-learning model training capabilities beyond the initial base model to enable teams to train their own team completions.

AI-assisted refactoring detection

IntelliCode suggests code changes in the IDE and also locally synthesizes, on demand, edit scripts from any set of repetitive pattern changes. IntelliCode saves developers a lot of time with an AI technology called program synthesis, or programming-by-examples (PBE). PBE has been developed at Microsoft by the PROSE team and has been applied to various products including Flash Fill in Excel and webpage table extraction in Power BI. "IntelliCode advances the state-of-the-art in PBE by allowing patterns to be learned from noisy traces as opposed to explicitly provided examples, without any additional steps on your part," Silver writes. Talking about security, Silver says, "our PROSE-based models work entirely locally, so your code never leaves your machine." She also said that over the past few months, the team has used unsupervised machine learning techniques to create a model that is predictive for Python. Silver also told VentureBeat, "So the result is that as you're coding Python, it actually feels more like the editing experience that you might get from a statically typed programming language — without actually having to make Python statically typed. And so as you type, you get statement completion for APIs and you can get argument completion that's based on the context of the code that you've written thus far."

Many users are impressed with the improvements in IntelliCode. A user tweeted, "Training ML against repos is super clever."
https://twitter.com/nathaniel_avery/status/1191760019479519232
https://twitter.com/raschneiderman/status/1191704366035734530

To know more about the improvements in IntelliCode in detail, read Microsoft's official blog post.
Microsoft releases TypeScript 3.7 with much-awaited features like Optional Chaining, Assertion functions and more
Mapbox introduces MARTINI, a client-side terrain mesh generation code
DeepMind AI's AlphaStar achieves Grandmaster level in StarCraft II with 99.8% efficiency


Types of Variables in Data Science in One Picture from Featured Blog Posts - Data Science Central

Matthew Emerick
18 Oct 2020
1 min read
While there are several dozen different types of possible variables, all can be categorized into a few basic areas. This simple graphic shows you how they are related, with a few examples of each type.  More info: Types of variables in statistics and research  

6th Dec.' 17 - Headlines

Packt Editorial Staff
06 Dec 2017
6 min read
PyTorch v0.3.0, IBM's Power Systems servers, Core ML support for TensorFlow Lite, Microsoft using AMD's EPYC processors, and Google's new machine learning services for text and video are among today's top data science news.

PyTorch removes stochastic functions

PyTorch 0.3.0 released with performance improvements, ONNX/CUDA 9/cuDNN 7 support and bug fixes. PyTorch has released version 0.3.0 with several performance improvements, new layers, the ability to ship models to other frameworks (via ONNX), CUDA 9, cuDNN v7, and "lots of bug fixes." Among the most important changes, PyTorch has removed stochastic functions, i.e. Variable.reinforce(), because of their limited functionality and broad performance implications. "The motivation for stochastic functions was to avoid book-keeping of sampled values. In practice, users were still book-keeping in their code for various reasons. We constructed an alternative, equally effective API, but did not have a reasonable deprecation path to the new API. Hence this removal is a breaking change," the PyTorch team said, adding that they have introduced the torch.distributions package to replace stochastic functions (a short migration sketch appears at the end of this story). Among the other changes, PyTorch said that in v0.3.0 some loss functions can compute per-sample losses in a mini-batch, and that more loss functions will be covered in the next release. There is also a built-in profiler in the autograd engine which works for both CPU and CUDA models. In addition to new API changes, PyTorch 0.3.0 brings a big reduction in framework overhead and 4x to 256x faster Softmax/LogSoftmax, apart from new tensor features. PyTorch models that are ConvNet-like and RNN-like (static graphs) can now be shipped to the ONNX format, a common model interchange format that can be executed in Caffe2, Core ML, CNTK, MXNet, and TensorFlow.

AMD processors coming to Azure machines

Microsoft Azure is the first global cloud provider to deploy AMD EPYC processors. Microsoft is the first global cloud provider that will use AMD's EPYC platform to power its data centers. In an official announcement, Microsoft said it has worked closely with AMD to develop the next generation of storage-optimized VMs, called the Lv2-Series, powered by AMD's EPYC processors. The Lv2-Series is designed to support customers with demanding workloads like MongoDB, Cassandra, and Cloudera that are storage intensive and demand high levels of I/O. Lv2-Series VMs use the AMD EPYC 7551 processor, featuring a core frequency of 2.2 GHz and a maximum single-core turbo frequency of 3.0 GHz. Lv2-Series VMs will come in sizes ranging up to 64 vCPUs and 15 TB of local resource disk.

IBM's Power Systems servers speed up deep learning training by 4x

Power System AC922: IBM takes deep learning to the next level with its first Power9-based systems. In its quest to be the AI-workload leader for data centers, IBM unveiled its first Power9 server, the Power System AC922, at the AI Summit in New York. It runs a version of the Power9 chip tuned for Linux, with the four-way multithreading variant SMT4. Power9 chips with SMT4 can offer up to 24 cores, though the chips in the AC922 top out at 22 cores. The fastest Power9 in the AC922 runs at 3.3 GHz. The air-cooled AC922 model 8335-GTG, set for release in mid-December, as well as two other models (one air-cooled and one water-cooled) scheduled to ship in the second quarter of next year, offer two Power9 chips each and run Red Hat and Ubuntu Linux.
In 2018, IBM plans to release servers with a version of the Power9 tuned for AIX and System i, with SMT8 eight-way multithreading and PowerVM virtualization, topping out at 12 cores but likely running at faster clock speeds. The Power9 family is the first processor line to support a range of new I/O technologies, including PCI-Express 4.0 and NVLink 2.0, as well as OpenCAPI. IBM claims that the Power Systems servers can make the training of deep learning frameworks four times faster. The U.S. Department of Energy's Summit and Sierra supercomputers, at Oak Ridge National Laboratory and Lawrence Livermore National Laboratory, respectively, are also based on Power9.

AI perfects an imperfect-information game

A study on AI's win in heads-up no-limit Texas hold'em poker wins the Best Paper award at NIPS 2017. A detailed study of how the AI Libratus defeated the best human players at heads-up no-limit Texas hold'em poker earlier this year has won the Best Paper award at NIPS 2017. The paper delves deep into the analysis behind imperfect-information game AI versus perfect-information games such as chess and Go, and expounds the idea that was used to defeat top humans in heads-up no-limit Texas hold'em poker. Earlier this year, in January, the artificial intelligence system Libratus, developed by a team at Carnegie Mellon University, beat four professional poker players. The complete paper is available on arXiv.

No more "versus" between Core ML and TensorFlow Lite

Google announces Apple's Core ML support in TensorFlow Lite. In November, Google announced the developer preview of TensorFlow Lite. Now, Google has collaborated with Apple to add support for Core ML in TensorFlow Lite. With this announcement, iOS developers can leverage the strengths of Core ML for deploying TensorFlow models. In addition, TensorFlow Lite will continue to support cross-platform deployment, including iOS, through the TensorFlow Lite format (.tflite), as described in the original announcement. Support for Core ML is provided through a tool that takes a TensorFlow model and converts it to the Core ML model format (.mlmodel). For more information, users can check out the TensorFlow Lite documentation pages and the Core ML converter. The pip-installable PyPI package is available at this link: https://pypi.python.org/pypi/tfcoreml/0.1.0.

Google launches new machine learning services for analyzing video and text content

Google announces Cloud Video Intelligence and Cloud Natural Language Content Classification are now generally available. Google has announced the general availability of two new machine learning services: Cloud Video Intelligence and Cloud Natural Language Content Classification. Cloud Video Intelligence is a machine learning application programming interface designed to analyze video content, while Cloud Natural Language Content Classification is an API that helps classify content into more than 700 different categories. Google Cloud Video Intelligence was launched in beta in March this year, and has since been fine-tuned for greater accuracy and deeper analysis. "We've been working closely with our beta users to improve the model's accuracy and discover new ways to index, search, recommend and moderate video content. Cloud Video Intelligence is now capable of deeper analysis of your videos — everything from shot change detection, to content moderation, to the detection of 20,000 labels," Google said. Its code is available on GitHub.
On the other hand, Google’s Content Classification for Cloud Natural Language service is designed for text-based content. Announced in September, its main job is to read through texts and categorize them appropriately. The API can be used to sort documents into more than 700 different categories, such as arts and entertainment, hobbies and leisure, law and government, news and many more.
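As a hedged illustration of the PyTorch change described above, the sketch below shows the torch.distributions style of writing a REINFORCE-type update, which replaces the removed Variable.reinforce() bookkeeping. It is written against a recent PyTorch release rather than 0.3.0 exactly, and the logits and reward are placeholder values rather than a full training loop.

```python
import torch
from torch.distributions import Categorical

# Logits would normally come from a policy network; a fixed tensor keeps the sketch short.
logits = torch.tensor([0.1, 0.6, 0.3], requires_grad=True)

dist = Categorical(logits=logits)
action = dist.sample()           # sampling no longer relies on Variable.reinforce()
reward = 1.0                     # placeholder reward from the environment

# The REINFORCE objective is written out explicitly with log_prob, as the
# torch.distributions package intends, so gradients flow through the log-probability.
loss = -dist.log_prob(action) * reward
loss.backward()
print(logits.grad)
```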


Google introduces NIMA: A Neural Image Assessment model

Savia Lobo
19 Dec 2017
4 min read
Google recently introduced NIMA (Neural Image Assessment), a deep convolutional neural network. NIMA is trained to predict which images a user would consider technically or aesthetically attractive. It is able to generalize objects based on their categories despite many variations, similar to object recognition networks. It can be used to score images reliably with high correlation to human perception, and also in other labor-intensive and subjective tasks such as intelligent photo editing, optimizing visual quality for improved user engagement, or minimizing perceived visual errors within an imaging pipeline.

Assessment of image quality and aesthetics has been a persistent issue in the fields of image processing and computer vision. Image quality assessment deals with measuring pixel-level degradations such as noise, blur, and compression artifacts, whereas aesthetic assessment captures semantic-level characteristics associated with emotions and beauty in images. In recent times, deep CNNs trained using human-labeled data have been used to detect the subjective nature of image quality for some specific classes of images, for instance landscapes. But such an approach is limited, as it categorizes images into only two classes, namely high and low. NIMA, by contrast, predicts the distribution of ratings. This leads to a higher-quality prediction with a higher correlation to the ground truth ratings, and it can be applied to images in general rather than only to specific kinds.

Let's explore some applications of the NIMA model:

Distribution of ratings

Instead of classifying images with a low/high score or regressing to the mean score, the NIMA model produces a distribution of ratings for any given image on a scale of 1 to 10, with 10 being the highest aesthetic score associated with an image. It assigns likelihoods to each of the possible scores, which is more directly in line with how training data is typically captured. Hence, it turns out to be a better predictor of human preferences when measured against other approaches.

Ranking photos aesthetically

Various functions of the NIMA vector score, such as the mean, can be used to rank photos aesthetically. Some test photos from the large-scale database for Aesthetic Visual Analysis (AVA) dataset are taken into account, where each AVA photo is scored by an average of 200 people in response to photography contests. After training on the aesthetic ranking of these photos, the NIMA model closely matched the mean scores given by human raters. So NIMA is highly likely to perform equally well on other datasets, with predicted quality scores close to human ratings.

NIMA scoring for detecting the quality of an image

NIMA scores can also be used to differentiate between the quality of images which have the same subject but may have been distorted in other ways. For instance, the predicted mean scores can be used to qualitatively rank photos, as demonstrated on the TID2013 test set, which contains various types and levels of distortions. Source: https://arxiv.org/pdf/1709.05424.pdf

Perceptual image enhancement

Quality and aesthetic scores are used to perceptually tune image enhancement operators. In other words, maximizing the NIMA score as part of a loss function can increase the likelihood of enhancing the perceptual quality (the ability to interpret something through human senses) of an image.
NIMA can be used as a training loss to tune a tone enhancement algorithm. Baseline aesthetic ratings can be improved by contrast adjustments directed by the NIMA score. The NIMA model is also able to guide a deep CNN filter to find aesthetically near-optimal settings of its parameters, such as brightness, highlights, and shadows.

To summarize, with NIMA, Google suggests that quality assessment models based on ML may be capable of a wider range of useful functions, for instance improved image capture and the ability to sort the best pictures out of many. For a deeper understanding of the workings of NIMA, you can go through the research paper.
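Because NIMA predicts a full distribution over the 1-10 rating scale rather than a single number, the mean (and spread) used for ranking can be recovered with a few lines of arithmetic. The snippet below is a minimal sketch of that computation, not Google's code, and the example distribution is hypothetical.

```python
import numpy as np

def mean_and_std(rating_distribution):
    """Collapse a 10-bin rating distribution (scores 1..10) into a mean score and spread."""
    scores = np.arange(1, 11)
    p = np.asarray(rating_distribution, dtype=float)
    p = p / p.sum()  # normalize, in case the predicted bins do not sum exactly to 1
    mean = float((scores * p).sum())
    std = float(np.sqrt(((scores - mean) ** 2 * p).sum()))
    return mean, std

# Hypothetical predicted distribution for one image.
example = [0.01, 0.02, 0.05, 0.10, 0.20, 0.25, 0.18, 0.10, 0.06, 0.03]
print(mean_and_std(example))  # mean of 6.0 with a modest spread
```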


On-premises data gateway September 2020 update is now available from Microsoft Power BI Blog | Microsoft Power BI

Matthew Emerick
24 Sep 2020
1 min read
September 2020 gateway release


Search company Elastic goes public and doubles its value on day 1

Richard Gall
08 Oct 2018
2 min read
At the end of last week, on October 5, Elastic - the company behind the hugely popular ElasticSearch search tool - went public. And it looks like the move paid off: by the end of its first day as a public company, shares in Elastic had doubled in price. At the start of the day they were around $36 per share; by the close of trading they had leapt to $70. This meant its market cap had risen from $2.5 billion to $4.9 billion. We probably shouldn't be surprised. As Fortune pointed out over the weekend, demand for shares in Elastic was already getting pretty hot towards the end of September: "It originally filed to sell the stock at $26 to $29 apiece on Sept. 24, while Thursday's pricing was higher even than banks managing the sale were expecting."
https://twitter.com/elastic/status/1048302021143670786

Elastic's journey to the stock market

ElasticSearch was first released back in February 2010. The tool contained the imprint of Elastic's mission: to make powerful search accessible to modern businesses. And it has certainly done just that. Today Elastic powers immensely popular apps like Uber and Tinder, as well as giving logging and processing power to companies like Cisco, where big data has become the norm. According to founder Shay Banon, writing in a blog post published on Friday, Elastic has seen "more than 350 million product downloads, a... community of more than 100,000 developers, and more than 5,500 customers."

What next for Elastic?

Banon insists that "as a public company, [Elastic] will continue doing the things that have made us Elastic." This includes continued investment in the developer communities that have grown up around Elastic's products, new product features, and working with customers to adapt to changing trends in software infrastructure, so that Elastic's products can be deployed anywhere. Business as usual might well be a recipe for success for Elastic. Investors will be hoping that the organization continues to deliver on its mission, as demand for better, faster search isn't going to end any time soon.

Lyft introduces Amundsen: a data discovery and metadata engine for its researchers and data scientists

Amrata Joshi
03 Apr 2019
4 min read
Yesterday, the team at Lyft introduced a data discovery and metadata engine called Amundsen, built to increase the productivity of data scientists and research scientists at Lyft. The team named it after the Norwegian explorer Roald Amundsen. The aim is to improve the productivity of data users by simplifying their lives with this data search interface. According to the UNECE (United Nations Economic Commission for Europe), the data in our world has grown over 40x over the last 10 years. The growth in data volumes has given rise to major challenges around productivity and compliance, which were important to solve. The team at Lyft found the solution to these problems in the metadata rather than the actual data. "Metadata, also defined as 'data about the data', is a set of data that describes and gives information about other data." The team solved a part of the productivity problem using the metadata.

How did the team come up with Amundsen?

The team at Lyft realized that the majority of their time was spent on data discovery instead of prototyping and productionalization, which is where they actually wanted to invest more time. Data discovery involves answering questions like: Does a certain type of data exist? Where is it? What is the source of truth for that data? Does it need to be accessed? This is the reason the team at Lyft came up with the idea for Amundsen, inspired heavily by search engines like Google, but aimed at searching for data within the organization. Users can search for data by typing a search term into the search box, for instance "election results" or "users". For those who aren't sure what they are searching for, the platform offers a list of popular tables in the organization to browse through. Image source: Lyft

How does the search ranking feature function?

Once the user enters the search term, the results show in-line metadata and a description of the table, as well as the date when the table was last updated. These results are chosen by fuzzy-matching the entered text against a few metadata fields, such as the table name, column names, table description, and column descriptions. Amundsen uses an algorithm similar to PageRank, where highly queried tables show up higher, while those queried less appear later in the search results (a toy sketch of this ranking idea appears later in this piece).

What does the detail page look like?

After selecting a result, users get to the detail page, which shows the name of the table along with its manually curated description, followed by the column list. A special blue arrow by a column indicates that it's a popular column, which encourages users to use it. On the right-hand pane, users can see who the owner is, who the frequent users are, and a general profile of the data. Image source: Lyft

Further classification of metadata

The team at Lyft divided the metadata into a few categories and gave different access to each of the categories.

Existence and other fundamental metadata: This category includes the name and description of tables and fields, owners, last updated, and so on. This metadata is available to everyone, and anyone can access it.

Richer metadata: This category includes column stats and a preview. This metadata is available only to users who have access to the data, because these stats may contain sensitive information which should be considered privileged.
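The ranking behaviour described above (fuzzy matching over names and descriptions, weighted by how often a table is queried) can be sketched in a few lines. This toy example is an illustration of the idea only, not Lyft's implementation; the catalog entries are made up.

```python
from difflib import SequenceMatcher
from math import log

# Toy catalog: table name, description, and how often the table is queried.
tables = [
    {"name": "core.users", "description": "All registered users", "query_count": 12000},
    {"name": "elections.results", "description": "Election results by county", "query_count": 300},
    {"name": "tmp.users_backup", "description": "One-off copy of the users table", "query_count": 4},
]

def score(table, term):
    # Fuzzy-match the search term against name and description, then weight by
    # query popularity -- a rough stand-in for the PageRank-like ranking described above.
    term = term.lower()
    text_score = max(
        SequenceMatcher(None, term, table["name"].lower()).ratio(),
        SequenceMatcher(None, term, table["description"].lower()).ratio(),
    )
    return text_score * log(1 + table["query_count"])

def search(term):
    return sorted(tables, key=lambda t: score(t, term), reverse=True)

for t in search("users"):
    print(t["name"])
```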
According to the team at Lyft, Amundsen has been successful at Lyft and has shown a high adoption rate and Customer Satisfaction (CSAT) score. Users can now easily discover more data in a shorter time. Amundsen can also be used to store and tag all personal data within the organization, which can help an organization remain compliant. To know more, check out the official post by Lyft.

Lyft acquires computer vision startup Blue Vision Labs, in a bid to win the self-driving car race
Uber and Lyft drivers strike in Los Angeles
Uber open-sources Peloton, a unified resource scheduler


Google Cloud and GO-JEK announce Feast, a new and open source feature store for machine learning

Natasha Mathur
21 Jan 2019
3 min read
Google Cloud announced the release of Feast, a new open source feature store that helps organizations better manage, store, and discover features for their machine learning projects, last week. Feast, a collaboration between Google Cloud and GO-JEK (an Indonesian tech startup), is an open, extensible, and unified platform for feature storage. "Feast is an essential component in building end-to-end machine learning systems at GO-JEK. We're very excited to release it to the open source community," says Peter Richens, Senior Data Scientist at GO-JEK. It has been developed with the aim of solving common challenges faced by machine learning development teams, including:

Machine learning features not being reused (features representing similar business concepts get redeveloped many times when existing work from other teams could have been reused).
Feature definitions vary (teams define features differently and many times there is no easy access to the documentation of a feature).
It is hard to serve up-to-date features (teams are hesitant to use real-time data).
Inconsistency between training and serving (training requires historical data, whereas prediction models require the latest values; when data is broken down into various independent systems, it leads to inconsistencies, as the systems then require separate tooling).

Feast addresses these challenges by providing teams with a centralized platform that allows them to easily reuse features developed by other teams across different projects, and as more features are added to the store, it becomes cheaper to build models. Apart from that, Feast manages the ingestion of data by unifying it from both batch and streaming sources (using Apache Beam) into the feature warehouse and feature serving stores. Users can then query features in the warehouse using the same set of feature identifiers. It also allows easy access to historical feature data, which in turn can be used to produce datasets for training models. Moreover, Feast allows teams to capture documentation, metadata and metrics about features, allowing teams to communicate clearly about them. Feast aims to be deployable on Kubeflow in the future and to integrate seamlessly with other Kubeflow components, such as a Python SDK for use with Kubeflow's Jupyter notebooks and Kubeflow Pipelines, since Kubeflow focuses on improving the packaging, training, serving, orchestration, and evaluation of models. "We hope that Feast can act as a bridge between your data engineering and machine learning teams," says the Feast team. For more information, check out the official Google Cloud announcement.

Watson-CoreML: IBM and Apple's new machine learning collaboration project
Google researchers introduce JAX: A TensorFlow-like framework for generating high-performance code from Python and NumPy machine learning programs
Dopamine: A Tensorflow-based framework for flexible and reproducible Reinforcement Learning research by Google
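To give a feel for the workflow Feast enables (the same feature definitions serving both historical training data and fresh online values), here is a hedged sketch using the present-day Feast Python SDK, which looks quite different from the initial release described in this story; the feature names and entities are placeholders from a hypothetical repository.

```python
import pandas as pd
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # points at a feature repository definition

# Point-in-time correct historical features, e.g. for building a training set.
entity_df = pd.DataFrame({
    "driver_id": [1001, 1002],
    "event_timestamp": pd.to_datetime(["2021-04-12 10:59:42", "2021-04-12 08:12:10"]),
})
training_df = store.get_historical_features(
    entity_df=entity_df,
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:acc_rate"],
).to_df()

# The latest values of the same features, as a model would read them at serving time.
online_features = store.get_online_features(
    features=["driver_hourly_stats:conv_rate", "driver_hourly_stats:acc_rate"],
    entity_rows=[{"driver_id": 1001}],
).to_dict()
```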


Locked nested projects provide unprecedented flexibility in governing your site from What's New

Anonymous
05 Nov 2020
4 min read
Mark Shulman, Product Manager, Tableau

It's okay to admit it—when you heard that Tableau introduced locked nested sub-projects in 2020.1, it may not have given you goose bumps or sent shivers down your spine. But we are here to say that if you're responsible for governing and structuring your Tableau site, it may be one of the most powerful features to come along in quite a while. This "little" feature is easy to overlook, but it has a big positive impact: it minimizes the need for additional sites, lets you delegate admin responsibilities, and provides the flexibility that your organization needs.

How do locked nested projects change my site management?

Mark Wu, Tableau Zen Master, stated, "Sub projects can now be locked independently. This is a game changer for all of us!" To understand why this is a big deal, let's go back to pre-2020.1 days. When you locked a project, the entire parent/child hierarchy underneath it had exactly the same permissions. This limitation required you to either have a very broad, flat tree structure, or to work around it by spawning unnecessary new sites. You don't need to do that anymore!

Before: Pre-2020.1, the world was broad and flat. Locked projects drove the same permissions down to all their sub-projects, which triggered the proliferation of more top-level projects. User navigation of the site is much more challenging with so many projects. This limitation also led some to stand up additional sites, which has the downside of creating a "hard" boundary to sharing data sources and can make collaboration more challenging. Sites require duplicate project hierarchies, increasing the effort to create, permission, and manage across them.

After: Your site structure reflects your organization's depth. You can now lock a project at ANY level in your site's project folder structure, regardless of whether the parent is locked with different permissions. That allows you maximum flexibility to structure and permission your site in ways not possible before Tableau 2020.1.

Why would I want to use locked nested projects?

Many organizations want to manage their content and permissions in ways that mimic their organizational structures. Think of all the potential benefits. Now you can empower project leaders or owners to lock sub-projects with the permissions that meet their specific group needs at any level in the hierarchy. You free up admin time by delegating to the folks closer to the work, and help your Tableau site to be better organized and governed. Locked nested projects simplify permissioning by allowing you to:

Lock sub-projects independently
Simplify top-level projects
Group similar projects together
Ensure access consistency
Ease admin burden by delegating to project owners/leaders

Organizations aren't flat, and neither is the way you govern your content. You can now organize your Tableau site and projects exactly the way you want—by department, region, content development lifecycle, or perhaps a combination. Common ways of organizing projects, such as these, can all take advantage of locked nested projects, with different group access locked in place at the sub-project level and below. Overall, we strongly recommend using locked projects along with group permission rules to help ease the management of a Tableau site. Unlocked projects can promote a wild-west culture, where everyone manages their own content permissions differently.
In contrast, locked projects ensure consistency of permissions across content and provide the ability to delegate the admin role to project owners or leaders who know the appropriate details for each group's content.

How do I use locked nested projects?

You can apply nested projects to an existing project hierarchy regardless of when you created it.

Click the three-dot Action menu - Permissions... for a project.
Click the Edit link in the upper left of the Permissions dialog.
Click Locked and then check Apply to nested projects.

It's very easy to overlook the new check box when you click to edit the locked settings for a project.

Where can I find out more about locked nested projects?

There are many resources available to get you started with locked nested projects:

Tableau Help: Lock Content Permissions
Tableau Help: Use Projects to Manage Content Access
Tableau Blueprint: Content Governance
Tableau Community: V2020.1 Nested Project Permission (Thanks to Mark Wu)
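For admins who prefer scripting over the UI steps above, the same setting can also be reached through Tableau's REST API. Below is a hedged sketch using the tableauserverclient Python library; the server URL, credentials, and project name are placeholders, and the exact content_permissions values accepted can vary by server version.

```python
import tableauserverclient as TSC

auth = TSC.TableauAuth("admin_user", "password", site_id="mysite")
server = TSC.Server("https://tableau.example.com", use_server_version=True)

with server.auth.sign_in(auth):
    all_projects, _ = server.projects.get()
    target = next(p for p in all_projects if p.name == "Finance / EMEA")

    # Lock permissions at this (possibly nested) project so its content inherits them.
    target.content_permissions = "LockedToProject"
    server.projects.update(target)
```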

8th Dec.' 17 - Headlines

Packt Editorial Staff
08 Dec 2017
6 min read
OpenAI's block-sparse GPU kernels, Nvidia's Titan V desktop GPU, Coinbase's surge on the bitcoin hike, and a new blockchain, Overledger, to link existing blockchains, among today's trending stories in artificial intelligence, machine learning, and data science news.

Nvidia doesn't support sparse matrix networks, so OpenAI created "block-sparse GPU kernels"

AI research firm OpenAI launches software to speed up GPU-powered neural networks. OpenAI announced it has developed a library of tools that can help researchers build faster, more efficient neural networks that take up less memory on GPUs. Because Nvidia (the biggest manufacturer of GPUs for neural networks) doesn't support sparse matrix networks in its hardware, OpenAI has created what it calls "block-sparse GPU kernels" to create these sparse networks on Nvidia's chips (a toy illustration follows at the end of this story). OpenAI said it used its enhanced neural networks to perform sentiment analysis on user reviews on websites including Amazon and IMDB and reported some impressive performance gains. "The sparse model outperforms the dense model on all sentiment datasets," OpenAI's researchers wrote in a blog post. "Our sparse model improves the state of the art on the document-level IMDB dataset from 5.91 per cent error to 5.01 per cent. This is a promising improvement over our previous results which performed best only on shorter sentence-level datasets." The kernels are written in Nvidia's CUDA programming language, and are currently only compatible with the TensorFlow deep learning framework. They also only support Nvidia GPUs, but can be expanded to support other frameworks and hardware. OpenAI said it wants to "freely collaborate" with other institutions and researchers so that its block-sparse GPU kernels can be used for other projects. The code is available on GitHub.

Nvidia's "most powerful PC GPU ever created"

Nvidia announces the $2,999 PC GPU "Titan V": the Volta-powered GPU delivers 110 teraflops of deep learning horsepower, 9x its predecessor. Nvidia has just announced its new flagship graphics card, the Titan V, based on the architecture of its "Volta" GV100 graphics processor. It marks a new era, as the Titan V is Nvidia's first HBM2-equipped prosumer graphics card available for the masses. It comes with 12 GB of HBM2 memory across a 3072-bit wide memory interface. The GPU has 5120 shader processors, 640 Tensor cores, and 320 texture units. It has a base clock of 1200 MHz and a 1455 MHz boost clock. The 12 GB memory runs at 1.7 Gbps (three 4 GB HBM2 stacks), and the card has 6-pin and 8-pin PCIe power connectors. The display outputs comprise three DisplayPort and one HDMI connectors. With the regular vapor chamber cooler and a copper heatsink, the card has 16 power phases and a 250 W TDP. Notably, there are no SLI fingers; instead, Nvidia seems to be using NVLink connections at the top of the PCB. Priced at a staggering $2,999, the GPU is available only through the NVIDIA store. More info will follow soon, Nvidia said.

The insane Bitcoin bubble that is underway

Coinbase becomes the No. 1 iPhone app in the US, crashes on demand. With bitcoin surging at an unprecedented pace over the last few days, Coinbase has suddenly become the most downloaded app in the U.S. The popular bitcoin wallet ranked around 400th on the free chart in the App Store less than a month ago, but has spiked to the top slot, beating out the likes of YouTube, Facebook Messenger, and Instagram.
Coinbase's rise to the top of the App Store is attributed to the ongoing crazy ride of bitcoin, as the cryptocurrency has skyrocketed from just under $10,000 at the start of the week to over $18,000 in no time. So much so that Coinbase is now unable to handle the demand load: for large portions of the day its service was unavailable, and the app was hanging quite often. Coinbase later tweeted that its site was "down for maintenance" as it was experiencing record-high traffic.

"The Blockchain to Fix All Blockchains"

Overledger: Quant Network creates cross-blockchain data interoperability technology. London-based Quant Network has launched Overledger, a technology for data interoperability across different blockchains. The idea is to do something similar to TCP/IP, which enabled the internet. "The uniqueness of our operating system is that Overledger is not another blockchain," Quant Network Chief Strategist Paolo Tasca said. "We do not impose new consensus mechanisms, new gateways, adapters or special validating nodes on top of existing blockchains. Overledger is a virtual blockchain that links existing blockchains and allows developers to build multi-chain applications (or in other terms blockchain-agnostic applications)." Gilbert Verdian, CEO and co-founder of Quant Network, confirmed that a patent for the Overledger technology was filed in the first week of December. According to Verdian, Quant Network is focusing on three goals: developing an API to connect the world's networks to multiple blockchains; bridging existing networks (e.g. financial services) to new blockchains; and developing a new blockchain operating system with a protocol and a platform to create next-generation, multi-chain applications.

Machine learning in fashion searches

Syte.ai unveils new API "Visual Search for All" for online fashion retailers. Syte.ai, a visual search startup focused on fashion, has launched a new API that makes adding visual search accessible to more e-commerce sites. Called Visual Search for All, the white-label feature can be integrated into retail websites or apps within 24 hours and lets shoppers upload photos saved on their phones, like screenshots from Instagram, to find similar products on sale. It is based on the same technology as Syte.ai's search tools for large fashion brands and publishers, which show shoppers relevant items when they hover a cursor over part of an image. "Once it indexes a brand's product feed, Visual Search for All can be added to a site's search bar in less than a day by adding a line of HTML," co-founder Lihi Pinto Fryman said, noting that clients pay a monthly license fee based on the number of image matches likely to be used. Facebook Messenger and Line users can try out Syte.ai's technology by sending images to its chatbot, Syte Inspire. The Israeli fashion tech startup had raised $8 million earlier this year from investors including top Asian tech firms NHN, Line Corp. and Naver.

Honda's self-driving cars project

Honda teams up with China's SenseTime on AI tech for self-driving cars. Honda has signed a 5-year joint research and development agreement with China's SenseTime, an IT firm specializing in artificial intelligence, for self-driving car technology. As part of its 2030 Vision strategy announced in June, Honda aims to have a car with Level 4 self-driving capability on sale by 2025. According to Honda, SenseTime excels in image recognition technologies, especially recognition of moving objects, powered by deep learning technology.
In their new partnership, Honda will join its AI algorithms for environment understanding, risk prediction and action planning with SenseTime’s moving object recognition technologies. The goal is to develop a reliable self-driving system that will be able to handle both highways and complex urban environments, the automaker said.
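The block-sparse idea behind OpenAI's kernels can be pictured with a small, framework-free sketch: the weight matrix is divided into fixed-size blocks and entire blocks are zeroed out, so a specialized kernel only needs to store and multiply the surviving blocks. The example below is a toy illustration in NumPy, not OpenAI's CUDA code.

```python
import numpy as np

rng = np.random.default_rng(0)
block, n_blocks = 32, 16          # a 512 x 512 weight matrix split into 32 x 32 blocks
density = 0.25                    # keep roughly 25% of the blocks

mask = rng.random((n_blocks, n_blocks)) < density
weights = rng.standard_normal((n_blocks * block, n_blocks * block))
weights *= np.kron(mask, np.ones((block, block)))  # zero out the dropped blocks

kept = int(mask.sum())
print(f"{kept}/{mask.size} blocks kept -> "
      f"{kept * block * block} of {weights.size} weights stored")
```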


Driving a data culture in a world of remote everything from Microsoft Power BI Blog | Microsoft Power BI

Matthew Emerick
22 Sep 2020
1 min read
During this new Microsoft Ignite format, with 48 hours of digital sessions and interactions where thousands of IT professionals will come together, we have several exciting innovations to announce that will help customers drive clarity when they need it most.