Data | Tech News, Tutorials & Expert Insights


13 Dec 2019

8 min read

New QGIS 3D capabilities and future plans presented by Martin Dobias, a core QGIS developer


In his talk titled QGIS 3D: current state and future at FOSS4G 2019, Martin Dobias, CTO of Lutra Consulting, talked about the new features in QGIS 3D. He also shared a list of features that could be added to QGIS 3D to make 3D rendering in QGIS more powerful. Free and Open Source Software for Geospatial (FOSS4G) 2019 was a five-day event held from Aug 26-30 in Bucharest. FOSS4G is a conference where geospatial professionals, students, and professors come together to discuss free and open-source software for geospatial storage, processing, and visualization.

[box type="shadow" align="" class="" width=""] Further Learning This article explores the new features in QGIS 3D native rendering support. If you are embarking on your QGIS journey, check out our book Learn QGIS - Fourth Edition by Andrew Cutts and Anita Graser. In this book, you will explore the QGIS user interface, load your data, edit it, and then create data. QGIS often surprises new users with its mapping capabilities; you will discover how easily you can style and create your first map. But that’s not all! In the final part of the book, you’ll learn about spatial analysis, powerful tools in QGIS, and conclude by looking at Python processing options. [/box]

3D visualization in QGIS

QGIS 3D native rendering support was introduced in QGIS 3. Prior to that, developers had to rely on third-party tools like NVIZ from GRASS GIS, GVIZ, Globe plugin, Qgis2threejs plugin, and more. Though these worked, “the integration was never great with the rest of QGIS,” remarks Dobias. In 2017, a QGIS grant proposal was accepted to start the initial work on QGIS 3D. A year later, QGIS 3 was announced with an interactive, fully integrated interface for you to work in 3D. QGIS 3 has a separate interface dedicated to 3D data visualization called 3D map view, which you can access from the View menu. After you select this option, a new window will open that you can dock to the main panel. 
In the new window you will see all the layers that are visible in the main map view, with digital elevation and vector data rendered in 3D. With native QGIS 3D support you can render raster, vector, and mesh layers. It also provides various methods for visualizing and styling the 3D data depending on the data or geometry type. Here are some of the features that Dobias talked about:

Point-based rendering

Starting with QGIS 3, you have three ways to render points:
Basic symbols: You can use symbols such as spheres, cylinders, boxes, or cubes, apply a color, and apply a few transformations.
3D models loaded from a file: You can use the Open Asset Import Library (Assimp) to load the 3D models. This library allows you to import and export a wide range of 3D model file formats including Collada, Wavefront, and more. After loading the model you can make tweaks like changing the color. However, there are currently limitations like “you can only change the color of the whole model and not the individual components,” Dobias mentioned.
Billboard rendering: This feature was contributed by Ismail Sunni as a part of the Google Summer of Code (GSoC) 2019 project, QGIS 3D Improvement. The billboard support, which was released in QGIS 3.10, lets you render points as billboards in the 3D map view.

Line rendering

For line rendering, you have two options:
Simple lines: In this approach, you define the width of a line in pixels and it does not change when you zoom in or out. This technique preserves Z coordinates.
Buffered lines: In this approach, you define the line width in map units, so the line scales with the map as you zoom in or out. Buffered rendering ignores Z coordinates.

Polygon rendering

For polygon rendering, you have four different options:
Planar 3D entity: QGIS 3 provides a method to draw polygon geometries as planar polygons.
Extrusion: Extrusion is a way to create 3D symbology from 2D features by stretching them vertically. 
QGIS now supports extruding a planar polygon to make it look like a box. You can specify a constant height or you can write an expression that determines it.
Polyhedral surfaces or PolygonZ: QGIS 3 has a provision for creating polyhedral surfaces. A polyhedron is simply a three-dimensional solid which consists of a collection of polygons, usually joined at their edges.
Triangular mesh or MultiPatch: It is similar to a polyhedral surface; the only difference is that it consists of individual triangles.

3D map tools

Navigation: You can use the mouse and keyboard to navigate the map. Now, with the latest QGIS release, you can also navigate using on-screen controls. Dobias said, “This is good for beginners when they are not completely sure about other means of moving the map.”
Identify tool: With this tool, you can interact with the map canvas and get information on features in a pop-up window. It works exactly like its 2D counterpart, the only difference being that it operates on a 3D entity.
Measurement tool: This tool was also built as part of the GSoC project. It enables you to measure real distances between given points.

Other 3D capabilities

Print layout support

QGIS already had support for saving the 3D map view as an image file, but for print layouts you needed to perform multiple steps. You had to first save 3D scene images and then embed them within print layouts. Also, the resolution of the saved images was limited to the size of the 3D window. To simplify the use of 3D scenes for printing and allow high resolution scene exports, QGIS 3 supports a new type of layout item that is capable of high resolution exports of 3D map scenes.

Camera animation support

With the QGIS 3D support, users can now define keyframes on a timeline with camera positions and view directions for various points in time. The 3D engine will interpolate camera parameters between keyframes to create animations. 
These resulting animations can then be played within the 3D view or exported frame-by-frame to a series of images.

Configuration of lights

By default, the 3D view has a single white light placed above the centre of the 3D scene. Now, users can set up light source position, color, and intensity and even define multiple lights for some interesting effects.

Rule-based 3D rendering

Previously, it was only possible to define one 3D renderer per layer, meaning all features appeared the same. QGIS 3 features rule-based rendering for 3D to make it much easier to apply more complex styling in 3D without having to duplicate vector layers and apply filters. There are many other 3D capabilities that you can explore including terrain shading, better camera control, and more.

Where you can find data for 3D maps

Dobias shared a few great sources of free-to-use 3D city models, available in formats such as CityGML and CityJSON. To easily load CityJSON datasets in QGIS you can use the CityJSON Loader plugin. OpenStreetMap (OSM) is another project that provides buildings data. You can also use the Google dataset search. Just type CityGML in the search box and find the data you need.

QGIS 3D capabilities to expect in the future

Dobias further talked about the future plans for QGIS 3D. Currently, the team is working on improving support for larger 3D scenes and making them load faster. For the far future, Dobias shared a wishlist of features that can be implemented in QGIS to make its 3D support much more powerful:
Enhancing the 3D rendering performance
More rendering techniques like shadows, transparency
New materials to show textured objects
More styles for vector layers such as lines and 3D pipes
More data types such as point cloud and 3D rasters
Formats support like 3D tiles, Arc SceneLayer
Animation of data in scenes
Profile tool
Blender export
Rendering of point cloud

You just read about some of the latest features in QGIS 3 for 3D rendering. 
If you are new to QGIS and want to grasp its fundamentals, check out our book Learn QGIS - Fourth Edition by Anita Graser and Andrew Cutts. In this book, you will explore various ways to load data into QGIS, understand how to style data and present it in a map, and create maps and explore ways to expand them. You will get acquainted with the new processing toolbox in QGIS 3.4, manipulate your geospatial data and gain quality insights, and work with QGIS 3.4 in 3D.

Why geospatial analysis and GIS matters more than ever today
Top 7 libraries for geospatial analysis
Uber’s kepler.gl, an open source toolbox for GeoSpatial Analysis
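
The camera animation feature described in this article rests on interpolating camera parameters between keyframes. The following is a minimal Python sketch of that idea using plain linear interpolation. It is not QGIS code: the `Keyframe` class, the `camera_at` function, and the choice of a single yaw angle for the view direction are all hypothetical, and the real 3D engine may use a different interpolation scheme.

```python
from bisect import bisect_right
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Keyframe:
    """Hypothetical keyframe: a camera pose at one point on the timeline."""
    time: float                           # seconds from the start of the animation
    position: Tuple[float, float, float]  # camera x, y, z in scene units
    yaw: float                            # view direction in degrees (assumed parameter)


def lerp(a: float, b: float, t: float) -> float:
    """Linear interpolation between a and b for t in [0, 1]."""
    return a + (b - a) * t


def camera_at(keyframes: List[Keyframe], time: float) -> Keyframe:
    """Return the interpolated camera pose at `time`.

    `keyframes` must be sorted by time. Times outside the timeline
    clamp to the first or last keyframe.
    """
    if not keyframes:
        raise ValueError("at least one keyframe is required")
    if time <= keyframes[0].time:
        return keyframes[0]
    if time >= keyframes[-1].time:
        return keyframes[-1]

    # Find the pair of keyframes that surrounds the requested time.
    times = [k.time for k in keyframes]
    i = bisect_right(times, time)
    start, end = keyframes[i - 1], keyframes[i]

    # Fraction of the way from `start` to `end`.
    t = (time - start.time) / (end.time - start.time)
    position = tuple(lerp(a, b, t) for a, b in zip(start.position, end.position))
    return Keyframe(time, position, lerp(start.yaw, end.yaw, t))


frames = [
    Keyframe(0.0, (0.0, 0.0, 100.0), 0.0),
    Keyframe(10.0, (100.0, 0.0, 50.0), 90.0),
]
print(camera_at(frames, 5.0))
```

Halfway between the two keyframes, the camera sits halfway along the path at half the rotation. This naive angle interpolation does not handle wrap-around at 360 degrees, which a production implementation would need to address.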


Packt

07 Feb 2017

32 min read

Context – Understanding your Data using R



Packt Editorial Staff

02 Jan 2019

3 min read

“All of my engineering teams have a machine learning feature on their roadmap” - Will Ballard talks artificial intelligence in 2019 [Interview]


The huge advancements of deep learning and artificial intelligence were perhaps the biggest story in tech in 2018. But we wanted to know what the future might hold. Luckily, we were able to speak to Packt author Will Ballard about what he sees in store for artificial intelligence in 2019 and beyond. Will Ballard is the chief technology officer at GLG, responsible for engineering and IT. He was also responsible for the design and operation of large data centers that helped run site services for customers including Gannett, Hearst Magazines, NFL, NPR, The Washington Post, and Whole Foods. He has held leadership roles in software development at NetSolve (now Cisco), NetSpend, and Works (now Bank of America). Explore Will Ballard's Packt titles here.

Packt: What do you think the biggest development in deep learning / AI was in 2018?
Will Ballard: I think attention models beginning to take the place of recurrent networks is a pretty impressive breakout on the algorithm side.

In Packt’s 2018 Skill Up survey, developers across disciplines and job roles identified machine learning as the thing they were most likely to be learning in the coming year. What do you think of that result? Do you think machine learning is becoming a mandatory multidiscipline skill, and why?
Almost all of my engineering teams have an active, or a planned machine learning feature on their roadmap. We’ve been able to get all kinds of engineers with different backgrounds to use machine learning -- it really is just another way to make functions -- probabilistic functions -- but functions.

What do you think the most important new deep learning/AI technique to learn in 2019 will be, and why?
In 2019 -- I think it is going to be all about PyTorch and TensorFlow 2.0, and learning how to host these on cloud PaaS.

The benefits of automated machine learning and metalearning

How important do you think automated machine learning and metalearning will be to the practice of developing AI/machine learning in 2019? 
What benefits do you think they will bring?
Even ‘simple’ automation techniques like grid search and running multiple different algorithms on the same data are big wins when mastered. There is almost no telling which model is ‘right’ till you try it, so why not let a cloud of computers iterate through scores of algorithms and models to give you the best available answer?

Artificial intelligence and ethics

Do you think ethical considerations will become more relevant to developing AI/machine learning algorithms going forward? If yes, how do you think this will be implemented?
I think the ethical issues are important on outcomes, and on how models are used, but aren’t the place of algorithms themselves.

If a developer was looking to start working with machine learning/AI, what tools and software would you suggest they learn in 2019?
Python and PyTorch.
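
To make the grid search mentioned in the interview concrete, here is a minimal sketch in plain Python. It tries every combination of candidate settings and keeps the best-scoring one. The `grid_search` helper and the `toy_score` function are hypothetical illustrations: in practice the scoring function would train a model and evaluate it on validation data, and a library implementation would normally be used instead.

```python
from itertools import product


def grid_search(param_grid, score_fn):
    """Evaluate score_fn on every combination in param_grid.

    param_grid maps a parameter name to a list of candidate values.
    Returns the best parameter combination and its score (higher is better).
    """
    names = list(param_grid)
    best_params, best_score = None, float("-inf")
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        score = score_fn(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


def toy_score(learning_rate, depth):
    # Hypothetical stand-in for "train a model and measure validation quality".
    # It peaks at learning_rate=0.1 and depth=3.
    return -((learning_rate - 0.1) ** 2) - (depth - 3) ** 2


best_params, best_score = grid_search(
    {"learning_rate": [0.01, 0.1, 1.0], "depth": [1, 3, 5]},
    toy_score,
)
print(best_params)
```

Because each combination is scored independently, the loop is easy to spread across many machines, which is the "cloud of computers" idea from the answer above.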



Sunith Shetty

24 Jul 2018

10 min read

Build Hadoop clusters using Google Cloud Platform [Tutorial]


Cloud computing has transformed the way individuals and organizations access and manage their servers and applications on the internet. Before Cloud computing, everyone used to manage their servers and applications on their own premises or in dedicated data centers. The increase in raw computing power (CPUs and GPUs with multiple cores on a single chip) and in storage capacity (HDD and SSD) presents challenges in efficiently utilizing the available computing resources. In today's tutorial, we will learn different ways of building a Hadoop cluster on the Cloud and ways to store and access data on the Cloud. This article is an excerpt from a book written by Naresh Kumar and Prashant Shindgikar titled Modern Big Data Processing with Hadoop.

Building Hadoop cluster in the Cloud

The Cloud offers a flexible and easy way to rent resources such as servers, storage, networking, and so on. The Cloud has made it very easy for consumers with the pay-as-you-go model, but much of the complexity of the Cloud is hidden from us by the providers. In order to better understand whether Hadoop is well suited to being on the Cloud, let's try to dig further and see how the Cloud is organized internally. 
At the core of the Cloud are the following mechanisms:
A very large number of servers with a variety of hardware configurations
Servers connected and made available over IP networks
Large data centers to host these devices
Data centers spanning geographies with evolved network and data center designs
If we pay close attention, we are talking about the following:
A very large number of different CPU architectures
A large number of storage devices with a variety of speeds and performance
Networks with varying speed and interconnectivity
Let's look at a simple design of such a data center on the Cloud:
We have the following devices in the preceding diagram:
S1, S2: Rack switches
U1-U6: Rack servers
R1: Router
Storage area network
Network attached storage
As we can see, Cloud providers have a very large number of such architectures to make them scalable and flexible. You may have rightly guessed that as the number of such servers increases, when we request a new server the provider can allocate it anywhere in the region. This makes it a bit challenging for compute and storage to be together but also provides elasticity. In order to address this co-location problem, some Cloud providers give the option of creating a virtual network and taking dedicated servers, and then allocating all their virtual nodes on these servers. This is somewhat closer to a data center design, but flexible enough to return resources when not needed. Let's get back to Hadoop and remind ourselves that in order to get the best from the Hadoop system, we should have the CPU power closer to the storage. This means that the physical distance between the CPU and the storage should be as small as possible, so that the bus speeds match the processing requirements. 
The slower the I/O speed between the CPU and the storage (for example, iSCSI, storage area network, network attached storage, and so on), the poorer the performance we get from the Hadoop system, as the data is being fetched over the network, kept in memory, and then fed to the CPU for further processing. This is one of the important things to keep in mind when designing Hadoop systems on the Cloud. Apart from performance reasons, there are other things to consider:
Scaling Hadoop
Managing Hadoop
Securing Hadoop
Now, let's try to understand how we can take care of these in the Cloud environment. Hadoop can be installed by the following methods:
Standalone
Pseudo-distributed
Fully-distributed
When we want to deploy Hadoop on the Cloud, we can deploy it in the following ways:
Custom shell scripts
Cloud automation tools (Chef, Ansible, and so on)
Apache Ambari
Cloud vendor provided methods: Google Cloud Dataproc, Amazon EMR, Microsoft HDInsight
Third-party managed Hadoop: Cloudera
Cloud agnostic deployment: Apache Whirr

Google Cloud Dataproc

In this section, we will learn how to use Google Cloud Dataproc to set up a single node Hadoop cluster. The steps can be broken down into the following:
Getting a Google Cloud account.
Activating Google Cloud Dataproc service.
Creating a new Hadoop cluster.
Logging in to the Hadoop cluster.
Deleting the Hadoop cluster.

Getting a Google Cloud account

This section assumes that you already have a Google Cloud account.

Activating the Google Cloud Dataproc service

Once you log in to the Google Cloud console, you need to visit the Cloud Dataproc service. The activation screen looks something like this:

Creating a new Hadoop cluster

Once Dataproc is enabled in the project, we can click on Create to create a new Hadoop cluster. After this, we see another screen where we need to configure the cluster parameters: I have left most of the settings at their default values. 
Later, we can click on the Create button which creates a new cluster for us. Logging in to the cluster After the cluster has successfully been created, we will automatically be taken to the cluster lists page. From there, we can launch an SSH window to log in to the single node cluster we have created. The SSH window looks something like this: As you can see, the Hadoop command is readily available for us and we can run any of the standard Hadoop commands to interact with the system. Deleting the cluster In order to delete the cluster, click on the DELETE button and it will display a confirmation window, as shown in the following screenshot. After this, the cluster will be deleted: Looks so simple, right? Yes. Cloud providers have made it very simple for users to use the Cloud and pay only for the usage. Data access in the Cloud The Cloud has become an important destination for storing both personal data and business data. Depending upon the importance and the secrecy requirements of the data, organizations have started using the Cloud to store their vital datasets. The following diagram tries to summarize the various access patterns of typical enterprises and how they leverage the Cloud to store their data: Cloud providers offer different varieties of storage. Let's take a look at what these types are: Block storage File-based storage Encrypted storage Offline storage Block storage This type of storage is primarily useful when we want to use this along with our compute servers, and want to manage the storage via the host operating system. To understand this better, this type of storage is equivalent to the hard disk/SSD that comes with our laptops/MacBook when we purchase them. In case of laptop storage, if we decide to increase the capacity, we need to replace the existing disk with another one. When it comes to the Cloud, if we want to add more capacity, we can just purchase another larger capacity storage and attach it to our server. 
This is one of the reasons why the Cloud has become popular, as it has made it very easy to add or shrink the storage that we need. It's good to remember that, since there are many different types of access patterns for our applications, Cloud vendors also offer block storage with varying storage/speed characteristics, measured in terms of capacity, IOPS, and so on. Let's take an example of this capacity upgrade requirement and see what we do to utilize this block storage on the Cloud. In order to understand this, let's look at the example in this diagram: Imagine a server created by the administrator called DB1 with an original capacity of 100 GB. Later, due to unexpected demand from customers, an application started consuming all the 100 GB of storage, so the administrator has decided to increase the capacity to 1 TB (1,024 GB). This is what the workflow looks like in this scenario:
Create a new 1 TB disk on the Cloud
Attach the disk to the server and mount it
Take a backup of the database
Copy the data from the existing disk to the new disk
Start the database
Verify the database
Destroy the data on the old disk and return the disk
This process is simplified, and in production it might take some time, depending upon the type of maintenance that is being performed by the administrator. But, from the Cloud perspective, acquiring new block storage is very quick.

File storage

Files are the basics of computing. If you are familiar with UNIX/Linux environments, you already know that everything is a file in the Unix world. But don't get confused by that, as every operating system has its own way of dealing with hardware resources. In this case we are not worried about how the operating system deals with hardware resources; we are talking about the important documents that users store as part of their day-to-day business. 
These files can be: Movie/conference recordings Pictures Excel sheets Word documents Even though they are simple-looking files in our computer, they can have significant business importance and should be dealt with in a careful fashion, when we think of storing these on the Cloud. Most Cloud providers offer an easy way to store these simple files on the Cloud and also offer flexibility in terms of security as well. A typical workflow for acquiring the storage of this form is like this: Create a new storage bucket that's uniquely identified Add private/public visibility to this bucket Add multi-geography replication requirement to the data that is stored in this bucket Some Cloud providers bill their customers based on the number of features they select as part of their bucket creation. Please choose a hard-to-discover name for buckets that contain confidential data, and also make them private. Encrypted storage This is a very important requirement for business critical data as we do not want the information to be leaked outside the scope of the organization. Cloud providers offer an encryption at rest facility for us. Some vendors choose to do this automatically and some vendors also provide flexibility in letting us choose the encryption keys and methodology for the encrypting/decrypting data that we own. Depending upon the organization policy, we should follow best practices in dealing with this on the Cloud. With the increase in the performance of storage devices, encryption does not add significant overhead while decrypting/encrypting files. This is depicted in the following image: Continuing the same example as before, when we choose to encrypt the underlying block storage of 1 TB, we can leverage the Cloud-offered encryption where they automatically encrypt and decrypt the data for us. So, we do not have to employ special software on the host operating system to do the encryption and decryption. 
Remember that encryption can be a feature that's available in both the block storage and file-based storage offerings from the vendor.

Cold storage

This storage is very useful for storing important backups in the Cloud that are rarely accessed. Since we are dealing with a special type of data here, we should also be aware that the Cloud vendor might charge significantly higher amounts for data access from this storage, as it's meant to be written once and forgotten (until it's needed). The advantage of this mechanism is that we pay less to store even petabytes of data.

We looked at the different steps involved in building our own Hadoop cluster on the Cloud. And we saw different ways of storing and accessing our data on the Cloud. To know more about how to build expert Big Data systems, do check out the book Modern Big Data Processing with Hadoop.

Read More:
What makes Hadoop so revolutionary?
Machine learning APIs for Google Cloud Platform
Getting to know different Big data Characteristics
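
The file storage section above advises choosing a hard-to-discover name for buckets that hold confidential data. One way to do that is to append a cryptographically random suffix to a readable prefix. The sketch below uses only the Python standard library; the `private_bucket_name` helper, the example prefix, and the 63-character cap are assumptions for illustration (several providers cap bucket names at around that length), so check your provider's naming rules before relying on it.

```python
import secrets

# Assumed upper bound on bucket name length; verify against your provider's rules.
MAX_BUCKET_NAME_LENGTH = 63


def private_bucket_name(prefix: str, random_bytes: int = 16) -> str:
    """Build a bucket name that is hard to guess.

    The suffix comes from the `secrets` module, which is designed for
    security-sensitive randomness (unlike the `random` module).
    """
    suffix = secrets.token_hex(random_bytes)  # 2 hex characters per byte
    name = f"{prefix.lower()}-{suffix}"
    if len(name) > MAX_BUCKET_NAME_LENGTH:
        raise ValueError("bucket name too long; use a shorter prefix")
    return name


# "acme-payroll" is a made-up prefix used only for this example.
print(private_bucket_name("acme-payroll"))
```

An unguessable name is only a complement to access control: the bucket should still be marked private, as the article recommends.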


Packt

13 Feb 2017

14 min read

CNN architecture



Gebin George

27 Apr 2018

9 min read

How to Debug an application using Qt Creator


Today, we will learn about debugging an application using Qt Creator. A debugger is a program used to test and debug other programs, for example after a sudden crash during execution or unexpected behavior in the program's logic. Most of the time (if not always), debuggers are used in the development environment and in conjunction with an IDE. In our case, we will learn how to use a debugger with Qt Creator. It is important to note that debuggers are not part of the Qt Framework, and, just like compilers, they are usually provided by the operating system SDK. Qt Creator automatically detects and uses debuggers if they are present on a system. You can check this by opening the Qt Creator Options page via the main menu: Tools, then Options. Select Build & Run from the list on the left side and then switch to the Debuggers tab at the top. You should see one or more autodetected debuggers in the list.

[box type="info" align="" class="" width=""]Windows Users: You should see something similar to the screenshot after this information box. If not, this means you have not installed any debuggers. You can easily download and install one using the instructions provided here: https://docs.microsoft.com/en-us/windows-hardware/drivers/debugger/ Or, you can independently search for the following topic online: Debugging Tools for Windows (WinDbg, KD, CDB, NTSD). After the debugger is installed (presumably CDB, the Microsoft Console Debugger, for Microsoft Visual C++ compilers, and GDB for GCC compilers), you can restart Qt Creator and return to this page. You should see one or more entries similar to the following. Since we have installed a 32-bit version of the Qt and OpenCV Frameworks, choose the entry with x86 in its name to view its path, type, and other properties. 
macOS and Linux Users: There shouldn't be any action needed on your part and, depending on the OS, you'll see a GDB, LLDB, or some other debugger in the entries.[/box] Here's the screenshot of the Build & Run tab on the Options page: Depending on the operating system and the installed debugger, the preceding screenshot might be slightly different. Nevertheless, you'll have a debugger that you need to make sure is correctly set as the debugger for the Qt Kit you are using. So, make a note of the debugger path and name and switch to the Kits tab, and, after selecting the Qt Kit you were using, make sure the debugger for it is correctly set, as you can see in the following screenshot: Don't worry about choosing the wrong debugger, or any other options, since you'll be warned with relevant icons beside the Qt Kit icon selected at the top. The icon seen in the following image on the left side is usually displayed when everything is okay with the Kit, the second one from the left is an indication that something is not right, and the one on the right means a critical error. Move your mouse over the icon when it appears to see more information about the required actions needed to fix the issue: [box type="info" align="" class="" width=""]Critical issues with Qt Kits can be caused by many different factors such as a missing compiler which will make the kit completely useless until the issue is resolved. 
An example of a warning message in a Qt Kit would be a missing debugger, which will not make the kit useless, but you won't be able to use the debugger with it, thus it means less functionality than a completely configured Qt Kit.[/box] After the debugger is correctly set, you can start debugging your applications in one of the following ways, which basically have the same result: ending up in the Debugger view of the Qt Creator: Starting an application in Debugging mode Attaching to a running application (or process) [box type="info" align="" class="" width=""]Note that a debugging process can be started in many ways, such as remotely, by attaching to a process running on a separate machine and so on. However, the preceding methods will suffice for most cases and especially for the ones relevant to the Qt+OpenCV application development and what we learned throughout this book.[/box] Getting started with the debugging mode To start an application in the debugging mode, after opening a Qt project, you can use one of the following methods: Pressing the F5 button Using the Start Debugging button, right below the usual Run button with a similar icon, but with a small bug on it Using the main menu entries in the following order: Debug/Start Debugging/Start Debugging. To attach the debugger to a running application, you can use the main menu entries in the following order: Debug/Start Debugging/Attach to Running Application. This will open up the List of Processes window, from which you can choose your application or any other process you want to debug using its process ID or executable name. You can also use the Filter field (as seen in the following image) to find your application, since, most probably, the list of processes will be quite a long one. After choosing the correct process, make sure to press the Attach to Process button. 
No matter which one of the preceding methods you use, you will end up in the Qt Creator Debug mode, which is quite similar to the Edit mode, but it also allows you to do the following, among many other things:
Add, enable, disable, and view breakpoints in the code (a breakpoint is simply a point or a line in the code at which we want the debugger to pause the process so that we can analyze the state of the program in more detail)
Interrupt running programs and processes to view and examine the code
View and examine the function call stack (the call stack is a stack containing the hierarchical list of functions that led to a breakpoint or interrupted state)
View and examine the variables
Disassemble the source code (disassembling in this sense means extracting the exact instructions that correspond to the function calls and other C++ code in our program)
You'll notice a performance drop in the application when it is started in debugging mode, because the code is being monitored and traced by the debugger. Here's a screenshot of the Qt Creator Debug mode, in which all of the capabilities mentioned earlier are visible in a single window:

The area marked with the number 1 in the preceding screenshot is the code editor that you have already used throughout the book and are quite familiar with. Each line of code has a line number; you can click to the left of a line number to toggle a breakpoint anywhere you want in the code. You can also right-click on the line numbers to set, remove, disable, or enable a breakpoint by selecting Set Breakpoint at Line X, Remove Breakpoint X, Disable Breakpoint X, or Enable Breakpoint X, where X in all of the commands mentioned here needs to be replaced by the line number. Apart from the code editor, you can also use the area marked with the number 4 in the preceding screenshot to add, delete, edit, and further modify breakpoints in the code. 
You can also right-click on the toolbar below the code editor that contains the debugger controls to open up the following menu and add or remove panes that display additional debug and analysis information. We will cover the default debugger view, but make sure to check out each one of the following options on your own to familiarize yourself with the debugger even more:

The area marked with the number 2 in the preceding screenshot can be used to view the call stack. Whether you interrupt the program while it is running (by pressing the Interrupt button or choosing Debug/Interrupt from the menu), set a breakpoint and stop the program at a specific line of code, or malfunctioning code causes the program to fall into a trap and pause the process (since a crash or exception will be caught by the debugger), you can always view the hierarchy of function calls that led to the interrupted state, and analyze them further, in area 2 of the preceding Qt Creator screenshot. Finally, you can use the third area in the previous screenshot to view the local and global variables of the program at the interrupted location in the code. You can see the contents of the variables, whether they are standard data types, such as integers and floats, or structures and classes, and you can further expand them to test and analyze any possible issues in your code. Using a debugger efficiently can save hours when testing and solving issues in your code. In terms of practical usage, there is really no substitute for using the debugger as much as you can and developing habits of your own, while also making note of the good practices and tricks you find along the way, including the ones we just went through. If you are interested, you can also read online about other possible methods of debugging, such as remote debugging, debugging using crash dump files (on Windows), and more. 
We saw how to debug an application in practice using the Qt Creator Debug mode. [box type="note" align="" class="" width=""]You read an excerpt from the book, Computer Vision with OpenCV 3 and Qt 5, written by Amin Ahmadi Tazehkandi. The book covers development of cross-platform applications using OpenCV 3 and Qt 5.[/box]

3 ways to deploy a QT and OpenCV application
Debugging Your .NET Application


Sunith Shetty

06 Apr 2018

17 min read

Predicting Bitcoin price from historical and live data


Sugandha Lahoti

10 Apr 2019

5 min read

2019 Stack Overflow survey: A quick overview


The results of the 2019 Stack Overflow survey have just been published: nearly 90,000 developers took the 20-minute survey this year. The survey offers some very interesting insights, from developers’ preferred programming languages, to the development platforms they hate the most, to the blockers to developer productivity. As the survey is quite detailed and comprehensive, here’s a quick look at the most important takeaways.

Key highlights from the Stack Overflow Survey

Programming languages

Python again emerged as the fastest-growing programming language and finished a close second behind Rust among the most loved languages. Interestingly, Python and TypeScript received the same share of votes, with almost 73% of respondents saying they love working with each. Python was also the language developers most wanted to learn next, and JavaScript remains the most used programming language. The most dreaded languages were VBA and Objective-C.

Source: Stack Overflow

Frameworks and databases in the Stack Overflow survey

Developers preferred the React.js and Vue.js web frameworks, while Drupal and jQuery were the most dreaded. Redis was voted the most loved database and MongoDB the most wanted database. MongoDB’s inclusion in the list is surprising considering its controversial Server Side Public License. Over the last few months, Red Hat dropped support for MongoDB over this license, and so did the GNU Health Federation. Both of these organizations chose PostgreSQL over MongoDB, which is probably one of the reasons why PostgreSQL was the second most loved and wanted database in the Stack Overflow Survey 2019.

Source: Stack Overflow

It’s interesting to see WebAssembly making its way into the popular technologies segment and also appearing among the top-paying technologies. Respondents who use Clojure, F#, Elixir, and Rust earned the highest salaries. Stack Overflow also added a new segment this year called "Blockchain in the real world", which gives insight into the adoption of blockchain.
Most respondents (80%) said that their organizations are not using or implementing blockchain technology.

Source: Stack Overflow

Developer lifestyles and learning

About 80% of respondents say that they code as a hobby outside of work, and over half of respondents had written their first line of code by the time they were sixteen, although this experience varies by country and by gender. For instance, women wrote their first code later than men, and non-binary respondents wrote code earlier than men. About one-quarter of respondents are enrolled in a formal college or university program, full-time or part-time. Of professional developers who studied at the university level, over 60% said they majored in computer science, computer engineering, or software engineering. DevOps specialists and site reliability engineers are among the highest paid and most experienced developers, are the most satisfied with their jobs, and are looking for new jobs at the lowest levels. The survey also noted that developers who are system admins or DevOps specialists are 25-30 times more likely to be men than women. Chinese developers are the most optimistic about the future, while developers in Western European countries like France and Germany are among the least optimistic.

Developers also overwhelmingly believe that Elon Musk will be the most influential person in tech in 2019. With more than 30,000 people responding to a free-text question asking who they think will be the most influential person this year, an amazing 30% named Tesla CEO Musk. For perspective, Jeff Bezos was in second place, named by ‘only’ 7.2% of respondents.

Although the proportion of women among US survey respondents went up from 9% to 11% this year, growth is still slow and points to problems with inclusion in the tech industry in general and on Stack Overflow in particular. When thinking about blockers to productivity, different kinds of developers report different challenges.
Men are more likely to say that being tasked with non-development work is a problem for them, while gender minority respondents are more likely to say that toxic work environments are a problem.

Stack Overflow survey demographics and diversity challenges

This report is based on a survey of 88,883 software developers from 179 countries around the world. It was conducted from January 23 to February 14, and the median time spent on the survey for qualified responses was 23.3 minutes. The majority of survey respondents this year were people who said they are professional developers, who code sometimes as part of their work, or who are students preparing for such a career. Most of them were from the US, India, China, and Europe.

Stack Overflow acknowledged that its results did not represent racial disparities evenly and that people of color continue to be underrepresented among developers. This year nearly 71% of respondents were of White or European descent, a slight improvement from last year (74%). The survey notes that, “In the United States this year, 22% of respondents are people of color; last year 19% of United States respondents were people of color.” This clearly signifies that a lot of work still needs to be done, particularly for people of color, women, and underrepresented groups.

That said, in August last year, Stack Overflow revamped its Code of Conduct to include more virtues around kindness, collaboration, and mutual respect. It also updated its developer salary calculator to include 8 new countries. Go through the full report to learn more about developer salaries, job priorities, career values, the best music to listen to while coding, and more.
Developers believe Elon Musk will be the most influential person in tech in 2019, according to Stack Overflow survey results
Creators of Python, Java, C#, and Perl discuss the evolution and future of programming language design at PuPPy
Stack Overflow is looking for a new CEO as Joel Spolsky becomes Chairman


Packt

10 Mar 2016

17 min read

Exploring HDFS


Savia Lobo

22 Feb 2018

5 min read

FAT* 2018 Conference Session 2 Summary: Interpretability and Explainability


This session of FAT* 2018 is about interpretability and explainability in machine learning models. With advances in deep learning, machine learning models have become more accurate. However, as accuracy and sophistication increase, it becomes harder to keep the models explainable. As a result, these models may appear as black boxes to business users, who use them without knowing what lies within. It is therefore important to make ML models interpretable and explainable, so that users can understand what is happening ‘behind the scenes’. This understanding is especially important in heavily regulated industries such as finance, medicine, and defence.

The Conference on Fairness, Accountability, and Transparency (FAT*), which will be held on the 23rd and 24th of February, 2018, is a multi-disciplinary conference that brings together researchers and practitioners interested in fairness, accountability, and transparency in socio-technical systems. The FAT* 2018 conference will feature 17 research papers, 6 tutorials, and 2 keynote presentations from leading experts in the field. This article covers the research papers from the 2nd session, which is dedicated to Interpretability and Explainability of machine-learned decisions. If you’ve missed our summary of the 1st session on Online Discrimination and Privacy, visit the article link to catch up.

Paper 1: Meaningful Information and the Right to Explanation

This paper addresses an active debate in policy, industry, academia, and the media about whether and to what extent Europe’s new General Data Protection Regulation (GDPR) grants individuals a “right to explanation” of automated decisions.
The paper examines two major papers:

European Union Regulations on Algorithmic Decision Making and a “Right to Explanation” by Goodman and Flaxman (2017)
Why a Right to Explanation of Automated Decision-Making Does Not Exist in the General Data Protection Regulation by Wachter et al. (2017)

This paper demonstrates that the specified framework is built on incorrect legal and technical assumptions. In addition to responding to the existing scholarly contributions, the article articulates a positive conception of the right to explanation, located in the text and purpose of the GDPR. The authors take the position that the right should be interpreted functionally and flexibly, and should, at a minimum, enable a data subject to exercise his or her rights under the GDPR and human rights law.

Key takeaways:

The first paper, by Goodman and Flaxman, states that the GDPR creates a "right to explanation" but offers no argument for it.
The second paper responds to the first: Wachter et al. published an extensive critique, arguing against the existence of such a right.
The current paper, in turn, is partially concerned with responding to the arguments of Wachter et al.

Paper 2: Interpretable Active Learning

The paper highlights how, with complex and opaque ML models, the process of active learning has also become opaque. Little is known about which specific trends and patterns an active learning strategy may be exploring. The paper builds on LIME (the Local Interpretable Model-agnostic Explanations framework) to provide explanations for active learning recommendations. The authors, Richard Phillips, Kyu Hyun Chang, and Sorelle A. Friedler, demonstrate uses of LIME in generating locally faithful explanations for an active learning strategy. Further, the paper shows how these explanations can be used to understand how different models and datasets explore a problem space over time.
Key takeaways: The paper demonstrates how active learning choices can be made more interpretable to non-experts. It also discusses techniques that make active learning interpretable to expert labelers, so that queries and query batches can be explained and the uncertainty bias can be tracked via interpretable clusters. It showcases per-query explanations of uncertainty to develop a system that allows experts to choose whether to label a query. This will allow them to incorporate domain knowledge and their own interests into the labeling process. It introduces a quantified notion of uncertainty bias, the idea that an algorithm may be less certain about its decisions on some data clusters than others. Paper 3: Interventions over Predictions: Reframing the Ethical Debate for Actuarial Risk Assessment Actuarial risk assessments might be unduly perceived as a neutral way to counteract implicit bias and increase the fairness of decisions made within the criminal justice system, from pretrial release to sentencing, parole, and probation. However, recently, these assessments have come under increased scrutiny, as critics claim that the statistical techniques underlying them might reproduce existing patterns of discrimination and historical biases that are reflected in the data. The paper proposes that machine learning should not be used for prediction, but rather to surface covariates that are fed into a causal model for understanding the social, structural and psychological drivers of crime. The authors, Chelsea Barabas, Madars Virza, Karthik Dinakar, Joichi Ito (MIT), Jonathan Zittrain (Harvard), propose an alternative application of machine learning and causal inference away from predicting risk scores to risk mitigation. Key takeaways: The paper gives a brief overview of how risk assessments have evolved from a tool used solely for prediction to one that is diagnostic at its core. The paper places a debate around risk assessment in a broader context. 
One can get a fuller understanding of the way these actuarial tools have evolved to achieve a varied set of social and institutional agendas. It argues for a shift away from predictive technologies, towards diagnostic methods that will help in understanding the criminogenic effects of the criminal justice system itself, as well as evaluate the effectiveness of interventions designed to interrupt cycles of crime. It proposes that risk assessments, when viewed as a diagnostic tool, can be used to understand the underlying social, economic and psychological drivers of crime. The authors also posit that causal inference offers the best framework for pursuing the goals to achieve a fair and ethical risk assessment tool.


Amrata Joshi

19 Sep 2019

6 min read

GQL (Graph Query Language) joins SQL as a Global Standards Project and will be the international standard declarative query language for graphs


On Tuesday, the team at Neo4j, the graph database management system, announced that the international committees behind the development of the SQL standard have voted to initiate GQL (Graph Query Language) as a new database query language. GQL is set to become the international standard declarative query language for property graphs, and it is also a Global Standards Project. GQL is developed and maintained by the same international group that maintains the SQL standard.

How did the proposal for GQL pass?

The GQL initiative was first put forward in the GQL Manifesto in May last year. In June this year, national standards bodies from around the world, represented in ISO/IEC’s Joint Technical Committee 1 (responsible for IT standards), started voting on the GQL project proposal. The ballot closed earlier this week and the proposal passed, with ten countries, including Germany, Korea, the United States, the UK, and China, voting in favor. Seven countries agreed to put forward their experts to work on the project. Japan was the only country to vote against, arguing that existing languages already do the job and that the SQL/Property Graph Query extensions, along with the rest of the SQL standard, can do the same.

According to the Neo4j team, the GQL project will initiate development of next-generation technology standards for accessing data. Its charter mandates building on the core foundations established by SQL, along with ongoing collaboration to ensure SQL and GQL interoperability and compatibility. GQL reflects the rapid growth of the graph database market, as seen in the increasing adoption of the Cypher language. Stefan Plantikow, GQL project lead and editor of the planned GQL specification, said, “I believe now is the perfect time for the industry to come together and define the next generation graph query language standard.” Plantikow further added, “It’s great to receive formal recognition of the need for a standard language.
Building upon a decade of experience with property graph querying, GQL will support native graph data types and structures, its own graph schema, a pattern-based approach to data querying, insertion and manipulation, and the ability to create new graphs, and graph views, as well as generate tabular and nested data. Our intent is to respect, evolve, and integrate key concepts from several existing languages including graph extensions to SQL.” Keith Hare, who has served as the chair of the international SQL standards committee for database languages since 2005, charted the progress toward GQL and said, “We have reached a balance of initiating GQL, the database query language of the future whilst preserving the value and ubiquity of SQL.” Hare further added, “Our committee has been heartened to see strong international community participation to usher in the GQL project. Such support is the mark of an emerging de jure and de facto standard.”

The need for a graph-specific query language

Researchers and vendors needed a graph-specific query language because of the following limitations:

The SQL/PGQ language is restricted to read-only queries.
SQL/PGQ cannot project new graphs.
The SQL/PGQ language can only access graphs that are based on taking a graph view over SQL tables.

Researchers and vendors needed a language like Cypher that covers insertion and maintenance of data, not just data querying. SQL, however, is not an apt model for a graph-centric language that takes graphs as query inputs and outputs a graph as a result. GQL, on the other hand, builds on openCypher, a project that brings Cypher to Apache Spark and gives users a composable graph query language.

SQL and GQL can work together

According to most of the companies and national standards bodies that are supporting the GQL initiative, GQL and SQL are not competitors. Instead, these languages can complement each other via interoperation and shared foundations.
Alastair Green, Query Languages Standards & Research Lead at Neo4j, writes, “A SQL/PGQ query is in fact a SQL sub-query wrapped around a chunk of proto-GQL.” SQL is a language built around tables, whereas GQL is built around graphs. Users can use GQL to find and project a graph from a graph. Green further writes, “I think that the SQL standards community has made the right decision here: allow SQL, a language built around tables, to quote GQL when the SQL user wants to find and project a table from a graph, but use GQL when the user wants to find and project a graph from a graph. Which means that we can produce and catalog graphs which are not just views over tables, but discrete complex data objects.”

It is still not clear when the first implementable version of GQL will be out. The official page reads, “The work of the GQL project starts in earnest at the next meeting of the SQL/GQL standards committee, ISO/IEC JTC 1 SC 32/WG3, in Arusha, Tanzania, later this month. It is impossible at this stage to say when the first implementable version of GQL will become available, but it is highly likely that some reasonably complete draft will have been created by the second half of 2020.”

Developer community welcomes the new addition

Users are excited to see how GQL will incorporate Cypher. A user commented on HackerNews, “It's been years since I've worked with the product and while I don't miss Neo4j, I do miss the query language. It's a little unclear to me how GQL will incorporate Cypher but I hope the initiative is successful if for no other reason than a selfish one: I'd love Cypher to be around if I ever wind up using a GraphDB again.” A few others mistook GQL for Facebook’s GraphQL and are sceptical about the name.
A comment on HackerNews reads, “Also, the name is of course justified, but it will be a mess to search for due to (Facebook) GraphQL.” A user commented, “I read the entire article and came away mistakenly thinking this was the same thing as GraphQL.” Another user commented, “That's quiet an unfortunate name clash with the existing GraphQL language in a similar domain.”

Other interesting news in Data

Media manipulation by Deepfakes and cheap fakes require both AI and social fixes, finds a Data & Society report
Percona announces Percona Distribution for PostgreSQL to support open source databases
Keras 2.3.0, the first release of multi-backend Keras with TensorFlow 2.0 support is now out


Fatema Patrawala

30 Sep 2019

3 min read

Transformers 2.0: NLP library with deep interoperability between TensorFlow 2.0 and PyTorch, and 32+ pretrained models in 100+ languages


Last week, Hugging Face, a startup specializing in natural language processing, released a landmark update to its popular Transformers library, offering unprecedented compatibility between two major deep learning frameworks, PyTorch and TensorFlow 2.0. Transformers (formerly known as pytorch-transformers and pytorch-pretrained-bert) provides general-purpose architectures (BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet…) for Natural Language Understanding (NLU) and Natural Language Generation (NLG), with 32+ pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch. Transformers 2.0 embraces the ‘best of both worlds’, combining PyTorch’s ease of use with TensorFlow’s production-grade ecosystem. The new library makes it easier for scientists and practitioners to select different frameworks for the training, evaluation, and production phases of developing the same language model. “This is a lot deeper than what people usually think when they talk about compatibility,” said Thomas Wolf, who leads Hugging Face’s data science team. “It’s not only about being able to use the library separately in PyTorch and TensorFlow. We’re talking about being able to seamlessly move from one framework to the other dynamically during the life of the model.” https://twitter.com/Thom_Wolf/status/1177193003678601216 “It’s the number one feature that companies asked for since the launch of the library last year,” said Clement Delangue, CEO of Hugging Face.
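The framework switch Wolf describes can be sketched roughly as follows. This is a minimal illustration, assuming the transformers package and both TensorFlow 2.0 and PyTorch are installed; the function name, the bert-base-cased checkpoint, and the save directory are illustrative choices rather than details from the announcement, and calling the function downloads the checkpoint.

```python
import os


def move_tf_model_to_pytorch(checkpoint="bert-base-cased", save_dir="./tf_to_pt_model/"):
    """Load a BERT classifier in TensorFlow 2.0, then reload the same weights in PyTorch.

    The transformers imports live inside the function so that defining it does
    not require the heavy dependencies; calling it does.
    """
    from transformers import BertForSequenceClassification, TFBertForSequenceClassification

    # Step 1: instantiate the TensorFlow 2.0 (tf.keras) version of the model.
    tf_model = TFBertForSequenceClassification.from_pretrained(checkpoint)

    # ... train or fine-tune tf_model here with the usual tf.keras compile/fit calls ...

    # Step 2: save weights and configuration to a directory (created if missing).
    os.makedirs(save_dir, exist_ok=True)
    tf_model.save_pretrained(save_dir)

    # Step 3: reload the saved TensorFlow weights into the PyTorch model class.
    pt_model = BertForSequenceClassification.from_pretrained(save_dir, from_tf=True)
    return pt_model
```

The same pattern should apply to the other architectures the library ships, since each has paired TensorFlow and PyTorch classes; only the class names change.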
Notable features in Transformers 2.0

8 architectures with over 30 pretrained models, in more than 100 languages
Load a model and pre-process a dataset in less than 10 lines of code
Train a state-of-the-art language model in a single line with the tf.keras fit function
Share pretrained models, reducing compute costs and carbon footprint
Deep interoperability between TensorFlow 2.0 and PyTorch models
Move a single model between TF2.0/PyTorch frameworks at will
Seamlessly pick the right framework for training, evaluation, production
As powerful and concise as Keras

About Hugging Face Transformers

With half a million installs since January 2019, Transformers is the most popular open-source NLP library. More than 1,000 companies, including Bing, Apple, and Stitchfix, are using it in production for text classification, question-answering, intent detection, text generation, or conversational use cases. Hugging Face, the creators of Transformers, have raised US$5M so far from investors in companies like Betaworks, Salesforce, Amazon and Apple. On Hacker News, users are praising the company and noting how Transformers has become the most important library in NLP.

Other interesting news in data

Baidu open sources ERNIE 2.0, a continual pre-training NLP model that outperforms BERT and XLNet on 16 NLP tasks
Dr Joshua Eckroth on performing Sentiment Analysis on social media platforms using CoreNLP
Facebook open-sources PyText, a PyTorch based NLP modeling framework


Savia Lobo

29 Dec 2017

9 min read

Saving backups on cloud services with ElasticSearch plugins


[box type="note" align="" class="" width=""]Our article is a book excerpt taken from Mastering Elasticsearch 5.x, written by Bharvi Dixit. This book guides you through the intermediate and advanced functionalities of Elasticsearch, such as querying, indexing, searching, and modifying data. In other words, you will gain all the knowledge necessary to master Elasticsearch and put it to efficient use.[/box]

This article will explain how Elasticsearch, with the help of additional plugins, allows us to push our data outside of the cluster to the cloud. There are four possibilities for where our repository can be located, at least using officially supported plugins:

The S3 repository: AWS
The HDFS repository: Hadoop clusters
The GCS repository: Google cloud services
The Azure repository: Microsoft's cloud platform

Let's go through these repositories to see how we can push our backup data to the cloud services.

The S3 repository

The S3 repository is a part of the Elasticsearch AWS plugin, so to use S3 as the repository for snapshotting, we first need to install the plugin on every node of the cluster, and each node must be restarted after the plugin installation:

sudo bin/elasticsearch-plugin install repository-s3

After installing the plugin on every Elasticsearch node in the cluster, we need to alter their configuration (the elasticsearch.yml file) so that the AWS access information is available.
The example configuration can look like this:

cloud:
  aws:
    access_key: YOUR_ACCESS_KEY
    secret_key: YOUR_SECRET_KEY

To create the S3 repository that Elasticsearch will use for snapshotting, we need to run a command similar to the following one:

curl -XPUT 'http://localhost:9200/_snapshot/my_s3_repository' -d '{
  "type": "s3",
  "settings": {
    "bucket": "bucket_name"
  }
}'

The following settings are supported when defining an S3-based repository:

bucket: This is the required parameter describing the Amazon S3 bucket to which the Elasticsearch data will be written and from which Elasticsearch will read the data.
region: This is the name of the AWS region where the bucket resides. By default, the US Standard region is used.
base_path: By default, Elasticsearch puts the data in the root directory. This parameter allows you to change where the data is placed in the repository.
server_side_encryption: By default, encryption is turned off. You can set this parameter to true in order to use the AES256 algorithm to store data.
chunk_size: By default, this is set to 1GB and specifies the size of the data chunk that will be sent. If the snapshot size is larger than the chunk_size, Elasticsearch will split the data into smaller chunks that are not larger than the size specified in chunk_size. The chunk size can be specified in size notations such as 1GB, 100mb, and 1024kB.
buffer_size: The size of this buffer is set to 100mb by default. When the chunk size is greater than the value of buffer_size, Elasticsearch will split it into buffer_size fragments and use the AWS multipart API to send it. The buffer size cannot be set lower than 5 MB because that would disallow the use of the multipart API.
endpoint: This defaults to AWS's default S3 endpoint. Setting a region overrides the endpoint setting.
protocol: This specifies whether to use http or https. It defaults to the value of cloud.aws.protocol or cloud.aws.s3.protocol.
compress: This defaults to false. When set to true, it allows snapshot metadata files to be stored in a compressed format. Please note that index files are already compressed by default.
read_only: This makes a repository read-only. It defaults to false.
max_retries: This specifies the number of retries Elasticsearch will make before giving up on storing or retrieving the snapshot. By default, it is set to 3.

In addition to the preceding properties, we are allowed to set two additional properties that override the credentials stored in elasticsearch.yml, which will be used to connect to S3. This is especially handy when you want to use several S3 repositories, each with its own security settings:

access_key: This overrides cloud.aws.access_key from elasticsearch.yml
secret_key: This overrides cloud.aws.secret_key from elasticsearch.yml

Note: AWS instances resolve S3 endpoints to a public IP. If the Elasticsearch instances reside in a private subnet in an AWS VPC, then all traffic to S3 will go through that VPC's NAT instance. If your VPC's NAT instance is a smaller instance size (for example, a t1.micro) or is handling a high volume of network traffic, your bandwidth to S3 may be limited by that NAT instance's networking bandwidth limitations. So, if you are running your Elasticsearch cluster inside a VPC, make sure that you are using instances with a high networking bandwidth and that there is no network congestion.

Note: Instances residing in a public subnet in an AWS VPC will connect to S3 via the VPC's Internet gateway and will not be bandwidth limited by the VPC's NAT instance.

The HDFS repository

If you use Hadoop and its HDFS (http://wiki.apache.org/hadoop/HDFS) filesystem, a good alternative for backing up the Elasticsearch data is to store it in your Hadoop cluster. As with S3, there is a dedicated plugin for this.
To install it, we can use the following command: sudo bin/elasticsearch-plugin install repository-hdfs Note : The HDFS snapshot/restore plugin is built against the latest Apache Hadoop 2.x (currently 2.7.1). If your Hadoop distribution is not protocol compatible with Apache Hadoop, you can replace the Hadoop libraries inside the plugin folder with your own (you might have to adjust the security permissions required). Note: Even if Hadoop is already installed on the Elasticsearch nodes, for security reasons, the required libraries need to be placed under the plugin folder. Note that in most cases, if the distribution is compatible, one simply needs to configure the repository with the appropriate Hadoop configuration files. After installing the plugin on each node in the cluster and restarting every node, we can use the following command to create a repository in our Hadoop cluster: curl -XPUT 'http://localhost:9200/_snapshot/es_hdfs_repository' -d '{ "type": "hdfs" "settings": { "uri": "hdfs://namenode:8020/", "path": "elasticsearch_snapshots/es_hdfs_repository" } }' The available settings that we can use are as follows: uri: This is a required parameter that tells Elasticsearch where HDFS resides. It should have a format like hdfs://HOST:PORT/. path: This is the information about the path where snapshot files should be stored. It is a required parameter. load_default: This specifies whether the default parameters from the Hadoop configuration should be loaded and set to false if the reading of the settings should be disabled. This setting is enabled by default. chunk_size: This specifies the size of the chunk that Elasticsearch will use to split the snapshot data. If you want the snapshotting to be faster, you can use smaller chunks and more streams to push the data to HDFS. By default, it is disabled. conf.<key>: This is an optional parameter and tells where a key is in any Hadoop argument. 
The value provided using this property will be merged with the configuration. As an alternative, you can define your HDFS repository and its settings inside the elasticsearch.yml file of each node as follows: repositories: hdfs: uri: "hdfs://<host>:<port>/" path: "some/path" load_defaults: "true" conf.<key> : "<value>" compress: "false" chunk_size: "10mb" The Azure repository Just like Amazon S3, we are able to use a dedicated plugin to push our indices and metadata to Microsoft cloud services. To do this, we need to install a plugin on every node of the cluster, which we can do by running the following command: sudo bin/elasticsearch-plugin install repository-azure The configuration is also similar to the Amazon S3 plugin configuration. Our elasticsearch.yml file should contain the following section: cloud: azure: storage: my_account: account: your_azure_storage_account key: your_azure_storage_key Do not forget to restart all the nodes after installing the plugin. After Elasticsearch is configured, we need to create the actual repository, which we do by running the following command: curl -XPUT 'http://localhost:9200/_snapshot/azure_repository' -d '{ "type": "azure" }' The following settings are supported by the Elasticsearch Azure plugin: account: Microsoft Azure account settings to be used. container: As with the bucket in Amazon S3, every piece of information must reside in the container. This setting defines the name of the container in the Microsoft Azure space. The default value is elasticsearch-snapshots. base_path: This allows us to change the place where Elasticsearch will put the data. By default, the value for this setting is empty which causes Elasticsearch to put the data in the root directory. compress: This defaults to false and when enabled it allows us to compress the metadata files during the snapshot creation. chunk_size: This is the maximum chunk size used by Elasticsearch (set to 64m by default, and this is also the maximum value allowed). 
You can lower it when the data should be split into smaller chunks. The chunk size is given in size value notation, such as 32m or 5k.

An example of creating a repository using these settings follows:

    curl -XPUT "http://localhost:9205/_snapshot/azure_repository" -d '{
      "type": "azure",
      "settings": {
        "container": "es-backup-container",
        "base_path": "backups",
        "chunk_size": "32m",
        "compress": true
      }
    }'

The Google cloud storage repository

Similar to Amazon S3 and Microsoft Azure, we can use a GCS repository plugin for snapshotting and restoring our indices. The settings for this plugin are similar to those of the other cloud plugins. To learn how to work with the Google cloud repository plugin, please refer to the following URL: https://www.elastic.co/guide/en/elasticsearch/plugins/5.0/repository-gcs.html

In this article, we learned how to back up data from Elasticsearch clusters to different cloud repositories by using additional Elasticsearch plugins. If you found our excerpt useful, you may explore other interesting features and advanced concepts of Elasticsearch 5.x like aggregation, index control, sharding, replication, and clustering in the book Mastering Elasticsearch 5.x.


Wilson D'souza

07 Nov 2017

7 min read

Machine Learning Algorithms: Implementing Naive Bayes with Spark MLlib


In this article by Siamak Amirghodsi, Meenakshi Rajendran, Broderick Hall, and Shuen Mei from their book Apache Spark 2.x Machine Learning Cookbook, we look at how to implement the Naïve Bayes classification algorithm with Spark 2.0 MLlib. The associated code and exercise are available at the end of the article.

How to implement Naive Bayes with Spark MLlib

Naïve Bayes is one of the most widely used classification algorithms, and it can be trained and optimized quite efficiently. Spark’s machine learning library, MLlib, primarily focuses on simplifying machine learning and has great support for multinomial naïve Bayes and Bernoulli naïve Bayes. Here we use the famous Iris dataset and the Apache Spark API NaiveBayes() to classify/predict which of the three classes of flower a given set of observations belongs to. This is an example of a multi-class classifier and requires multi-class metrics for measurements of fit. Let’s have a look at the steps to achieve this:

For the Naive Bayes exercise, we use a famous dataset called iris.data, which can be obtained from UCI. The dataset was originally introduced in the 1930s by R. Fisher. It is a multivariate dataset with flower attribute measurements classified into three groups. In short, by measuring four columns, we attempt to classify a flower into one of the three classes of Iris (that is, Iris Setosa, Iris Versicolour, Iris Virginica). We can download the data from here: https://archive.ics.uci.edu/ml/datasets/Iris/

The column definition is as follows:
Sepal length in cm
Sepal width in cm
Petal length in cm
Petal width in cm
Class:
-- Iris Setosa => Replace it with 0
-- Iris Versicolour => Replace it with 1
-- Iris Virginica => Replace it with 2

The steps/actions we need to perform on the data are as follows:
Download and then replace column five (that is, the label or classification classes) with a numerical value, thus producing the iris.data.prepared data file.
The Naïve Bayes call requires numerical labels and not text, which is very common with most tools.
Remove the extra lines at the end of the file.
Remove duplicates within the program by using the distinct() call.

Start a new project in IntelliJ or in an IDE of your choice. Make sure that the necessary JAR files are included.

Set up the package location where the program will reside:

    package spark.ml.cookbook.chapter6

Import the necessary packages for SparkSession to gain access to the cluster and Log4j.Logger to reduce the amount of output produced by Spark:

    import org.apache.spark.mllib.linalg.{Vector, Vectors}
    import org.apache.spark.mllib.regression.LabeledPoint
    import org.apache.spark.mllib.classification.{NaiveBayes, NaiveBayesModel}
    import org.apache.spark.mllib.evaluation.{BinaryClassificationMetrics, MulticlassMetrics, MultilabelMetrics, binary}
    import org.apache.spark.sql.{SQLContext, SparkSession}
    import org.apache.log4j.Logger
    import org.apache.log4j.Level

Initialize a SparkSession specifying configurations with the builder pattern, thus making an entry point available for the Spark cluster, and then load the prepared data file:

    val spark = SparkSession
      .builder
      .master("local[4]")
      .appName("myNaiveBayes08")
      .config("spark.sql.warehouse.dir", ".")
      .getOrCreate()

    // The SparkContext is reached through the session; no separate sc is defined in this recipe
    val data = spark.sparkContext.textFile("../data/sparkml2/chapter6/iris.data.prepared.txt")

Parse the data using map() and then build a LabeledPoint data structure. In this case, the last column is the label and the first four columns are the features. Recall that the text in the last column (that is, the class of Iris) has already been replaced with numeric values (that is, 0, 1, 2):

    val NaiveBayesDataSet = data.map { line =>
      val columns = line.split(',')
      // Column 4 is the numeric label; columns 0 to 3 are the features
      LabeledPoint(
        columns(4).toDouble,
        Vectors.dense(columns(0).toDouble, columns(1).toDouble, columns(2).toDouble, columns(3).toDouble))
    }

Then make sure that the file does not contain any redundant rows. In this case, it has three redundant rows.
We will use the distinct dataset going forward:

    println(" Total number of data vectors =", NaiveBayesDataSet.count())
    val distinctNaiveBayesData = NaiveBayesDataSet.distinct()
    println("Distinct number of data vectors = ", distinctNaiveBayesData.count())

Output:

    (Total number of data vectors =,150)
    (Distinct number of data vectors = ,147)

We inspect the data by examining the output:

    distinctNaiveBayesData.collect().take(10).foreach(println(_))

Output:

    (2.0,[6.3,2.9,5.6,1.8])
    (2.0,[7.6,3.0,6.6,2.1])
    (1.0,[4.9,2.4,3.3,1.0])
    (0.0,[5.1,3.7,1.5,0.4])
    (0.0,[5.5,3.5,1.3,0.2])
    (0.0,[4.8,3.1,1.6,0.2])
    (0.0,[5.0,3.6,1.4,0.2])
    (2.0,[7.2,3.6,6.1,2.5])
    ..............

Split the data into training and test sets using a 30% and 70% ratio. The 13L in this case is simply a seeding number (L stands for the long data type) to make sure the result does not change from run to run when using the randomSplit() method:

    val allDistinctData = distinctNaiveBayesData.randomSplit(Array(.30, .70), 13L)
    val trainingDataSet = allDistinctData(0)
    val testingDataSet = allDistinctData(1)

Print the count for each set:

    println("number of training data =", trainingDataSet.count())
    println("number of test data =", testingDataSet.count())

Output:

    (number of training data =,44)
    (number of test data =,103)

Build the model using train() and the training dataset:

    val myNaiveBayesModel = NaiveBayes.train(trainingDataSet)

Use the testing dataset plus the map() and predict() methods to classify the flowers based on their features. Each resulting pair holds the predicted label followed by the actual label:

    val predictedClassification = testingDataSet.map(x => (myNaiveBayesModel.predict(x.features), x.label))

Examine the predictions via the output:

    predictedClassification.collect().foreach(println(_))

    (2.0,2.0)
    (1.0,1.0)
    (0.0,0.0)
    (0.0,0.0)
    (0.0,0.0)
    (2.0,2.0)
    .......

Use MulticlassMetrics() to create metrics for the multi-class classifier.
As a reminder, this is different from the previous recipe, in which we used BinaryClassificationMetrics():

    val metrics = new MulticlassMetrics(predictedClassification)

Use the commonly used confusion matrix to evaluate the model:

    val confusionMatrix = metrics.confusionMatrix
    println("Confusion Matrix= \n", confusionMatrix)

Output:

    (Confusion Matrix=
    ,35.0  0.0   0.0
    0.0   34.0  0.0
    0.0   14.0  20.0  )

We examine other properties to evaluate the model:

    val myModelStat = Seq(metrics.precision, metrics.fMeasure, metrics.recall)
    myModelStat.foreach(println(_))

Output:

    0.8640776699029126
    0.8640776699029126
    0.8640776699029126

How it works...

We used the Iris dataset for this recipe, but we prepared the data ahead of time and then selected the distinct rows by using the NaiveBayesDataSet.distinct() API. We then proceeded to train the model using the NaiveBayes.train() API. In the last step, we predicted using .predict() and then evaluated the model performance via MulticlassMetrics() by outputting the confusion matrix, precision, F-measure, and recall metrics.

The idea here was to classify the observations based on a selected feature set (that is, feature engineering) into classes that correspond to the left-hand label. The difference here was that we are applying joint probability given conditional probability to the classification. This concept is known as Bayes' theorem, which was originally proposed by Thomas Bayes in the 18th century. There is a strong assumption of independence that must hold true for the underlying features to make Bayes' classifier work properly.

At a high level, the way we achieved this method of classification was to simply apply Bayes' rule to our dataset. As a refresher from basic statistics, Bayes' rule can be written as follows:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

The formula states that the probability of A given that B is true is equal to the probability of B given that A is true, times the probability of A being true, divided by the probability of B being true.
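To make the rule concrete, here is a minimal sketch in plain Scala (no Spark required) that applies it to two candidate classes under the naive independence assumption. The object name, the class names, and every probability below are hypothetical values chosen for illustration. They are not taken from the Iris data or from MLlib's internals.

```scala
object NaiveBayesRuleSketch {
  // For each class c: posterior(c) = prior(c) * product of P(feature_i | c) / evidence.
  // Multiplying the per-feature likelihoods is the "naive" independence assumption.
  def posteriors(priors: Map[String, Double],
                 likelihoods: Map[String, Seq[Double]]): Map[String, Double] = {
    // Numerator of Bayes' rule for every class
    val joint = priors.map { case (label, prior) =>
      label -> prior * likelihoods(label).product
    }
    // P(B): total probability of the observed features across all classes
    val evidence = joint.values.sum
    require(evidence > 0.0, "the observed features must be possible under some class")
    joint.map { case (label, j) => label -> j / evidence }
  }

  def main(args: Array[String]): Unit = {
    // Hypothetical priors and per-feature likelihoods for one observation
    val priors = Map("setosa" -> 0.5, "versicolour" -> 0.5)
    val likelihoods = Map(
      "setosa" -> Seq(0.8, 0.5),
      "versicolour" -> Seq(0.2, 0.5)
    )
    posteriors(priors, likelihoods).foreach { case (label, p) =>
      println(f"$label%-12s $p%.3f")
    }
  }
}
```

With these made-up numbers the posterior for the first class comes out near 0.8 and for the second near 0.2, so the classifier would pick the first class. The posteriors always sum to 1 because each numerator is divided by the same evidence term.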
Stated in words, Bayes' rule makes for a complicated sentence, but if we step back and think about it, it makes sense. The Bayes' classifier is a simple yet powerful one that allows the user to take the entire probability feature space into consideration. To appreciate its simplicity, one must remember that probability and frequency are two sides of the same coin. The Bayes' classifier belongs to the class of incremental learners, which update themselves upon encountering a new sample. This allows the model to update itself on-the-fly as new observations arrive rather than only operating in batch mode.

We evaluated the model with different metrics. Since this is a multi-class classifier, we have to use MulticlassMetrics() to examine model accuracy.

Download the exercise and code files here: Exercise Files_Implementing Naive Bayes algorithm with Spark MLlib

For more information on Multiclass Metrics, please see the following link: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.mllib.evaluation.MulticlassMetrics

Documentation for the constructor can be found here: http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.ml.classification.NaiveBayes

If you enjoyed this article, you should have a look at Apache Spark 2.x Machine Learning Cookbook, which contains this excerpt.
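As a supplement to the evaluation step, the three identical values printed by the recipe (0.864...) are consistent with an overall accuracy: 89 correct predictions out of 103 test rows. The following minimal sketch in plain Scala recomputes that figure and the per-class precision and recall from the confusion matrix shown in the recipe output. It assumes that rows are actual labels and columns are predicted labels. The object and function names are hypothetical helpers, not part of the MLlib API.

```scala
object ConfusionMatrixSketch {
  // Confusion matrix copied from the recipe output.
  // Assumption: rows are actual labels, columns are predicted labels.
  val matrix: Vector[Vector[Double]] = Vector(
    Vector(35.0, 0.0, 0.0),
    Vector(0.0, 34.0, 0.0),
    Vector(0.0, 14.0, 20.0)
  )

  // Overall accuracy: diagonal (correct predictions) divided by all predictions
  def accuracy(m: Vector[Vector[Double]]): Double =
    m.indices.map(i => m(i)(i)).sum / m.map(_.sum).sum

  // Precision for class k: correct predictions of k divided by everything predicted as k (column sum)
  def precision(m: Vector[Vector[Double]], k: Int): Double =
    m(k)(k) / m.map(row => row(k)).sum

  // Recall for class k: correct predictions of k divided by everything that actually is k (row sum)
  def recall(m: Vector[Vector[Double]], k: Int): Double =
    m(k)(k) / m(k).sum

  def main(args: Array[String]): Unit = {
    println(f"accuracy = ${accuracy(matrix)}%.4f")
    for (k <- matrix.indices)
      println(f"class $k: precision = ${precision(matrix, k)}%.3f, recall = ${recall(matrix, k)}%.3f")
  }
}
```

Under the stated orientation, the matrix shows that the only errors are 14 rows of class 2 predicted as class 1, which is why the per-class figures for those two classes differ from the single overall number.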


Packt Editorial Staff

31 Oct 2017

14 min read

(13*3)+ Halloween costume ideas for Data science nerds


Are you a data scientist, a machine learning engineer, an AI researcher, or simply a data enthusiast? Channel your inner data science nerd with these geeky ideas for your Halloween costume!

The Data Science Spectrum

Don’t know what to go as to this evening’s party because you’ve been busy cleaning that terrifying data? Don’t worry, here are some easy-to-put-together Halloween costume ideas just for you.

1. Big Data
Go as Baymax, the healthcare robot (who can also switch into battle mode when required). Grab all the white clothes that you have. Stuff your tummy with some pillows and wear a white mask with cutouts for eyes. You are all ready to save the world. In fact, convince a friend or your brother to go as Hiro!

2. A.I. agent
Enter as Agent Smith, the AI antagonist, this Halloween. Lure everyone with your bold black suit paired with a white shirt and a black tie. A pair of polarized sunglasses will complete your look as the AI agent. Capture the crowd by being the most intelligent and cold-hearted personality of all.

3. Data Miner
Put on your dungarees with a tee. Fix a flashlight atop your cap. Grab a pickaxe from the gardening toolkit, if you have one. Smear some mud onto your face. Enter the party wheeling in loads of data boxes that you have freshly mined. You’ll definitely grab some traffic for data. Unstructured data, anyone?

4. Data Lake
Go as a data lake this Halloween. Simply grab any blue item from your closet. Draw some fish, crabs, and weeds on it (use a child’s marker for that). After all, they represent the data you have. And you’re all set.

5. Dark Data
Unleash the darkness within your soul! Just kidding. You don’t actually have to turn to the evil side. Just coming as your favorite black-costumed character will do. Looking for inspiration? Maybe a witch, the Dark Knight, or Darth Vader.
6. Cloud
A fluffy, white cloud is what you need to be this Halloween. Raid your nearby drugstore for loads of cotton balls. Better still, tear up that old pillow you have been meaning to throw away for a while. Glue the fiber inside onto an unused tee. You will be the cutest cloud ever seen. Don’t forget to carry an umbrella in case you turn grey!

7. Predictive Analytics
Make your own paper wizard hat with silver stars and moons pasted on it. If you can arrange for an advocate’s gown, it would be great. Otherwise, you could use a long black bed sheet as a cape. And most importantly, carry a crystal ball to show off some prediction stunts at the Halloween party.

8. Gradient boosting
Enter Halloween as the energy booster. Wear what you want. Grab loads of empty energy drink tetra packs and stick them all over you. Place one on your head too. Wear a nameplate that says “G-booster Energy drink”. Fuel up some weak models this Halloween.

9. Cryptocurrency
Wear black from head to toe. In fact, paint your face black as well, like the Grim Reaper. Then grab a piece of cardboard. Cut out a circle, paint it orange, and then draw a gold B symbol, just like the one you see on a bitcoin. This Halloween costume will definitely grab you as much attention as this popular cryptocurrency gets.

10. IoT
Are you a fan of IoT and the massive popularity it has gained? Then you should definitely dress up as your web-slinging, friendly neighborhood Spider-Man. Just grab a Spider-Man costume from any costume store and attach some handmade web slings. Remember to connect with people by displaying your IoT knowledge.

11. Self-driving car
Choose a single-color outfit (P.S. the color you would choose for your car). Cut out four wheels and paste two on your lower calves and two on your arms. Cut out headlights too. Put on a pair of wiper goggles.
And yes, you do not need a steering wheel or the brakes, clutch, and accelerator. Enter the Halloween party at your own pace, and go self-driving this Halloween. Bonus point: You can call yourself Bumblebee or Optimus Prime.

Machine Learning and Deep learning Frameworks

If machine learning or deep learning is your forte, here are some fresh Halloween costume ideas based on some of the popular frameworks in that space.

12. Torch
Flame up the party with a costume inspired by the Fantastic Four superhero Johnny Storm, a.k.a. the Human Torch. Wear a yellow tee and orange slacks. Draw some orange flames on your tee. And finally, wear a flame-inspired headband. Someone is a hot machine learning library!

13. TensorFlow
No effort is needed for this one. Just arrange for a pumpkin costume, make a paper cut-out of the TensorFlow logo, and wear it as a crown. Go as the most powerful and widely popular deep learning library. You will be the star of the Halloween party as you are a Google kid.

14. Caffe
Go as your favorite Starbucks coffee this Halloween. Wear any brown dress or tee. Draw or stick a Starbucks logo on it. And then add froth to the top by bunching up a cream-colored sheet. Mamma Mia!

15. Pandas
Go as a panda this Halloween! Better still, go as a group of pandas. The best option is to buy a panda costume. But if you don’t want that, wear a white tee, black slacks, black goggles, and some cardboard cutouts for ears. This will make you not only the cutest animal at the party but also a top data manipulation library. Good luck finding your python at the party, by the way.

16. Jupyter Notebook
Go as a top trending open-source web application by dressing up as the largest planet in our solar system. People will surely be intimidated by your mass and also by your computing power.

17. H2O
Go to Halloween as a world-famous open source deep learning platform.
No, no, you don’t have to go as the platform itself. Instead, go as its chemical alter ego, water. Wear all blue and then grab some leftover asymmetric blue cloth pieces to stick at your sides. Thirsty, anyone?

Data Viz & Analytics Tools

If you’re all about analytics and visualization, grab the attention of every data geek at your party by dressing up as your favorite data insight tools.

18. Excel
Grab an old white tee and paint some green horizontal stripes on it. You’re all ready to go as the most widely used spreadsheet. The simplest of costumes, yet the most useful: a timeless classic that never goes out of fashion.

19. MATLAB
If you have seriously run out of costume ideas, going out as MATLAB is your only solution. Just grab a blue tablecloth. Stick or sew it to an orange curtain and throw it over your head. You’re all ready to go as the multi-paradigm numerical computing environment.

20. Weka
Wear brown overalls, a brown wig, and paint your face brown. Make an orange beak out of chart paper, and wear a pair of orange stockings or socks with your trousers tucked in. You are all set to enter as a data mining bird with ML algorithms and Java under your wings.

21. Shiny
Go all shimmery! Get some glitter powder and put it all over you. (You’ll have a tough time removing it, though.) Otherwise, choose a glittery outfit with glittery shoes, and touch up with some glitter on your face. Let the party see the bling of R that you bring. You will be the most attractive storyteller out there.

22. Bokeh
A colorful polka-dotted outfit and some dim lights will do the magic. You are all ready to steal the show with such dazzle. Make sure you enter the party gates with Python. An eye-catching beauty-and-the-beast pair.

23. Tableau
Enter the Halloween party as one of your favorite characters from history. But there is one condition: You cannot talk or move.
Enjoy your Halloween by being still. Weird, but you’ll definitely catch everyone’s eye.

24. Microsoft Power BI
Power up your Halloween party by entering as a data insights superhero. Wear a yellow turtleneck, a stylish black leather jacket, black pants, some mid-thigh high boots, and a slick attitude. You’re ready to save your party!

Data Science oriented Programming languages

These hand-picked Halloween costume ideas are for you if you consider yourself a top coder. By a top coder we mean you’re all about learning new programming languages in your spare and, well, your not so spare time.

25. Python
Easy-peasy as the language looks, the reptile is not that easy to handle. A python-print shirt and trousers would do the job. You could get more people giving you candies, some out of fear, others out of ease. Definitely go as the top trending, go-to language that everyone loves! And yes, don’t forget the fangs.

26. R
Grab an eye patch and your favorite leather pants. Wear a loose white shirt with a rugged waistcoat and a sword. Here you are, all decked up as a pirate for your next loot. You’ll surely thank me for giving you a brilliant Halloween idea. But yes! Don’t forget to make that Arrrr (R) noise!

27. Java
Go as a freshly roasted coffee bean! People at your Halloween party will be allured by your aroma. They will definitely compliment your unique idea and also the fact that you’re the most popular programming language.

28. SAS
March into your Halloween party as a Special Air Service (SAS) agent. You will be disciplined, accurate, precise, and smart, just like the advanced software suite that goes by the same name. You will need a full black military costume with a gas mask, some fake ammunition from a nearby toy store, and some attitude, of course!
29. SQL
If you pride yourself on being very organized or are a stickler for the rules, you should go as SQL this Halloween. Prep yourself with an all-blue outfit. Spike up your hair and spray on some temporary green hair color. Cut out bold letters S, Q, and L from plain white paper and stick them on your chest. You are now ready to enter the Halloween party as the most popular database language of all time. Sink in all the data that you collect this Halloween.

30. Scala
If Scala is your favorite programming language, add a spring to your Halloween by going as, well, a spring! Wear the brightest red that you have. Using a marker, draw some swirls around your body (you can ask your mom to help). Just remember to evoke a 3D picture. And you’re all set.

31. Julia
If you want to make a red carpet entrance to your Halloween party, go as the Academy Award-winning actress Julia Roberts. You can even take inspiration from her character in the 90s hit film Pretty Woman. For extra oomph, wear a pink, red, and purple necklace to highlight the Julia programming language.

32. Ruby
Act pricey this Halloween. Be the elegant, dynamic, yet simple programming language. Go blood red: put on your brightest red lipstick and red pumps, and dazzle with all the red accessories that you have. You’ll definitely gather some secret admirers around the hall.

33. Go
Go as the mascot of Go, the top trending programming language. All you need is a blue gopher costume. Fear not if you don’t have one. Just wear a powder blue jumpsuit, grab a baby pink nose, and clip on a single, large fake front tooth. Ready for the party!

34. Octave
Go as a numerically competent programming language. And if that doesn’t sound very trendy, go as piano keys depicting an octave. You simply need to wear all white and divide your space into 8 sections. Then draw 5 horizontal black stripes.
You won’t be able to draw them vertically, well, because there are too many of them. Here you go, you’re all set to fill the party with your melody.

Fancy an AI system inspired Halloween costume?

This is for you if you love the way AI works and the enigma that it has thrown around the world. This is for you if you are spellbound by AI magic. You should go dressed as one of these at your Halloween party this season. Just pick the AI you want to look like and follow the advice.

35. IBM Watson
Wear a dark blue hat, a matching long overcoat, a vest, and a pale blue shirt with a dark tie tucked into the vest. Complement it with a mustache and a brooding look. You are now ready to be IBM Watson at your Halloween party.

36. Apple Siri
If you want to be all cool and sophisticated like Apple’s Siri, wear an alluring black turtleneck dress. Don’t forget to carry your latest iPhone and AirPods. Be sure you don’t have a sore throat, in case someone needs your assistance.

37. Microsoft Cortana
If Microsoft Cortana is your choice of voice assistant, dress up as Cortana, the fictional synthetic intelligence character in the Halo video game series. Wear a blue bodysuit. Get a bob if you’re daring (a wig would also do). Paint some dark blue robot-like designs over your body and, well, your face. And you’re all set.

38. Salesforce Einstein
Dress up as the world’s most famous physicist and also an AI-powered CRM. How? Just grab a white shirt, a blue pullover, and a blue tie (Salesforce colors). Finish your look with a brown tweed coat, brown pants and shoes, a rugged white wig and mustache, and a deep thought on your face.

39. Facebook Jarvis
Get inspired by Iron Man’s Jarvis, the coolest A.I. in the Marvel universe. Just grab a sheet of plexiglass and draw some holograms and technological symbols on it with a neon marker. (Try to keep the color palette in shades of blue and red.)
Then fix the plexiglass in a curve in front of your face with a headband. Do practice saying “Hello, Mr. Stark.”

40. Amazon Echo
This is also an easy one. Grab a long, black sheet of chart paper. Roll it into a tube around your body. Draw the Amazon symbol at the bottom with a glittery silver sketch pen, color your hair blue, and there you go. If you have a girlfriend, convince her to go as Amazon Alexa.

41. SAP Leonardo
Put on a hat, a long cloak, and a fake overgrown mustache and beard. Accessorize with a color palette and a paintbrush. You will be the Leonardo da Vinci of the Halloween party. Wait a minute, don’t forget to cut out the SAP initials and stick them on your cap. After all, you are entering as SAP’s very own digital revolution system.

42. Intel Neon
Deck the Halloween hall with a Harley Quinn costume. For some extra drama, wrap some neon blue lights around your head. Create an Intel logo out of blue neon lights and wear it as your neckpiece.

43. Microsoft Brainwave
This one will require a DIY task. Arrange for a red and a green t-shirt and cut each in half vertically. Stitch the halves together so that the green is on the left and the red on the right. Do the same with blue and yellow pants, with yellow on the left and blue on the right. You will look like the mighty Microsoft logo. Wear a skullcap with wires protruding out and HoloLens-like eyewear to go with it. And so, you are all ready to enter the Halloween party as Microsoft’s deep learning acceleration platform for real-time AI.

44. Sophia, the humanoid
Enter with all the confidence and top-to-toe professional attire. Be ready to answer any question thrown at you with grace and without a hint of skepticism. And to top it off, sport a clean-shaven head. And there, you are all ready to blow everyone’s mind with a mix of beauty and super-intelligent brains.
Happy Halloween folks!
