
How-To Tutorials - Data

1210 Articles
Sugandha Lahoti
25 Jul 2018
7 min read

Decoding the reasons behind Alphabet’s record high earnings in Q2 2018

Alphabet, Google’s parent company, saw its stock price rise quickly after it announced its Q2 2018 earnings, surprising analysts (in a good way) all over the world. Shares of Alphabet jumped more than 5% in after-hours trading on Monday, hitting a new record high.

It would seem that the EU’s fine of €4.34 billion on Google for breaching EU antitrust laws had little effect on its progress in terms of Q2 earnings. According to Ruth Porat, Google's CFO, Alphabet generated revenue of $32.66 billion during Q2 2018, compared to $26.01 billion during the same quarter last year. Even with the fine included, Alphabet still booked a net income of $3.2 billion, which equals earnings of $4.54 per share. Had the EU decision gone the other way, Alphabet would have had $32.6 billion in revenue and a profit of $8.2 billion.

“We want Google to be the source you think of when you run into a problem.” - Sundar Pichai, Google CEO, in the Q2 2018 Earnings Call

In Monday afternoon’s earnings call, CEO Sundar Pichai focused on three major domains that have helped Alphabet achieve its Q2 earnings. First, he claimed that machine learning and AI were becoming a crucial unifying component across all of Google's products and offerings, helping to cement and consolidate its position in the market. Second, Pichai suggested that investments in computing, video, cloud, and advertising platforms have helped push Google into valuable new markets. Third, the company's investment in new businesses and emerging markets was proving to be a real growth driver that should secure Google's future success.

Let us look at the various facets of Google’s growth strategy that have proven successful this quarter.

Investing in AI

With the world spinning around the axis of AI, Alphabet is empowering all of its product and service offerings with AI and machine learning.
At Google I/O, its annual developer conference held earlier this year, Google announced updates to its products that rely on machine learning. For example, the revamped Google News app uses machine learning to surface relevant news stories for users, and improvements to Google Assistant also helped the organization strengthen its position in that particular market. (By the end of 2018, it will be available in more than 30 languages in 80 countries.) This is another smart move by Alphabet in its plan to make information accessible to all while creating more revenue opportunities for itself and expanding its partnerships to new vendors and enterprise clients.

Google Translate also saw a huge bump in volume, especially during the World Cup, as fans from all over the world traveled to Russia to witness the football gala. Another smart decision was adding updates to Google Maps, which has achieved 50% year-on-year growth in Indonesia, India, and Nigeria, three very big and expanding markets.

Defending its Android ecosystem and business model

The first Android phone arrived in 2008. The project was built on the simple idea of a mobile platform that was free and open to everyone. Today, there are more than 24,000 Android-powered devices from over 1400 phone manufacturers.

Google’s decision to build a business model that encourages this open ecosystem to thrive has been a clever strategy. It not only generates significant revenue for the company but also brings a world of developers and businesses into its ecosystem. It's vendor lock-in with a friendly face. Of course, with the EU watching closely, Google has to be careful to follow regulation. Failure to comply could mean penalty payments of up to 5% of Alphabet's average daily worldwide turnover.
According to Brian Wieser, an analyst at Pivotal Research Group, however, “There do not appear to be any signs that should cause a meaningful slow down anytime soon, as fines from the EU are not likely to hamper Alphabet’s growth rate. Conversely, regulatory changes such as GDPR in Europe (and similar laws implemented elsewhere) could have the effect of reinforcing Alphabet’s growth.”

Forming new partnerships

Google has always been keen to form new partnerships and strategic alliances with a wide variety of companies and startups. It has been smart in systematically looking for partners that complement its strengths and help bring the end product to market. Partnering also provides flexibility: instead of developing new solutions and tools in-house, Google can bring interesting innovations into its ecosystem simply thanks to its financial clout.

For example, Google has partnered with many electronics companies to expand the number of devices compatible with Google Assistant. Furthermore, its investment in computing platforms and AI has helped the organization generate considerable momentum in its Made by Google hardware business across Pixel, Home, Nest, and Chromecast.

Interestingly, we also saw an acceleration in business adoption of Chromebooks, which Google positions as the most cost-efficient and secure way for businesses to enable their employees to work in the cloud. Unit sales of managed Chromebooks in Q2 grew by more than 175% year-on-year.

“Advertising on YouTube has always been an incredibly strong and growing source of income for its creators. Now Google is also building new ways for creators to source income such as paid channel memberships, merchandise shelves on YouTube channels, and endorsement opportunities through Famebit,” said Pichai. Famebit is a startup Google acquired in 2016 that uses data analytics to build tools that connect brands with the right creators.
This acquisition proved to be quite successful, as almost half of the creators that used Famebit in 2018 doubled their revenue in the first 3 months.

Google has also made significant strides in developing new shopping and commerce partnerships, such as with leading global retailers like Carrefour, designed to give people the power to shop wherever and however they want. Such collaborations are great for Google, as they bring its shopping, ads, and cloud products under one roof.

The success of Google Cloud’s vertical strategy and customer-centric approach was illustrated by key wins including Domino's Pizza, SoundCloud, and PwC moving to GCP this quarter. Target, the US department store chain, is also migrating key areas of its business to GCP. AirAsia has also expanded its relationship with Google to use ML and data analytics. This suggests that the cloud business is only going to grow further. Moreover, the fact that Google Cloud Platform caters to clients from very different industries and domains signals a robust way to expand its cloud empire.

Supporting future customers

Google is not just thinking about its current customer base. It is also working on specialized products to support the next wave of people who are coming online for the first time, enabled by the rising accessibility of mobile devices. It has established high-speed public WiFi in 400 train stations in India in collaboration with Indian Railways and has proposed the system in Indonesia and Mexico as well. It has also announced a Google AI research center in Ghana, Africa, to spur AI innovation with researchers and engineers from Africa, and it has expanded the Google IT Support Professional Certificate program to more than 25 community colleges in the US.

This strong performance by Alphabet, even in the midst of the EU antitrust case, was the most talked-about news among Wall Street analysts. Most of them consider the stock a buy.
For the next quarter, Google wants to continue fueling its growing cloud business. “We are investing for the long run,” Pichai said. The company also doesn't plan to dramatically alter its Android strategy and will continue to offer the OS for free. Pichai said, “I’m confident that we will find a way to make sure Android is available at scale to users everywhere.”

Sunith Shetty
10 Aug 2018
11 min read

Time series modeling: What is it, Why it matters and How it's used

A series can be defined as a number of events, objects, or people of a similar or related kind coming one after another; if we add the dimension of time, we get a time series. A time series can be defined as a series of data points in time order. In this article, we will look at what a time series is and why it is essential for forecasting.

This article is an excerpt from a book written by Harish Gulati titled SAS for Finance.

The importance of time series

What importance, if any, does time series have and how will it be relevant in the future? These are just a couple of fundamental questions that any user should find answers to before delving further into the subject. Let's try to answer this by posing a question. Have you heard the terms big data, artificial intelligence (AI), and machine learning (ML)? These three terms make learning time series analysis relevant.

Big data is primarily about a large amount of data that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interaction. AI is a kind of technology that is being developed by data scientists, computational experts, and others to enable processes to become more intelligent, while ML is an enabler that is helping to implement AI. All three of these terms are interlinked with the data they use, and a lot of this data is time series in nature. This could be financial transaction data, the behavior pattern of individuals during various parts of the day, or data related to life events that we might experience. An effective mechanism that enables us to capture the data, store it, analyze it, and then build algorithms to predict transactions, behavior (and life events, in this instance) will depend on how big data is utilized and how AI and ML are leveraged.

A common perception in the industry is that time series data is used for forecasting only.
In practice, time series data is used for:

  • Pattern recognition
  • Forecasting
  • Benchmarking
  • Evaluating the influence of a single factor on the time series
  • Quality control

For example, a retailer may identify a pattern in clothing sales every time it gets a celebrity endorsement, or an analyst may decide to use car sales volume data from 2012 to 2017 to set a selling benchmark in units. An analyst might also build a model to quantify the effect of Lehman's crash at the height of the 2008 financial crisis in pushing up the price of gold. Variance in the success of treatments across time periods can also be used to highlight a problem, the tracking of which may enable a hospital to take remedial measures. These are just some of the examples that showcase how time series analysis isn't limited to just forecasting. In this chapter, we will review how the financial industry and others use forecasting, discuss what a good and a bad forecast is, and hope to understand the characteristics of time series data and its associated problems.

Forecasting across industries

Since one of the primary uses of time series data is forecasting, it's wise that we learn about some of its fundamental properties. To understand what the industry means by forecasting and the steps involved, let's visit a common misconception about the financial industry: only lending activities require forecasting. We need forecasting in order to grant personal loans, mortgages, overdrafts, or simply assess someone's eligibility for a credit card, as the industry uses forecasting to assess a borrower's affordability and their willingness to repay the debt. Even deposit products such as savings accounts, fixed-term savings, and bonds are priced based on some forecasts. How we forecast, and the rationale for that methodology, differs between borrowing and lending cases, however.
All of these areas are related to time series, as we inevitably end up using time series data as part of the overall analysis that drives financial decisions. Let's understand the forecasts involved here a bit better. When we are assessing an individual's lending needs and limits, we are forecasting for a single person yet comparing the individual to a pool of good and bad customers who have been offered similar products. We are also assessing the individual's financial circumstances and behavior through industry-available scoring models or by assessing their past behavior, with the financial provider assessing the lending criteria. In the case of deposit products, as long as the customer is eligible to transact (can open an account and has passed know your customer (KYC), anti-money laundering (AML), and other checks), financial institutions don't perform forecasting at an individual level. However, the behavior of a particular customer is primarily driven by the interest rate offered by the financial institution. The interest rate, in turn, is driven by the forecasts the financial institution has done to assess its overall treasury position. The treasury is the department that manages the central bank's money and has the responsibility of ensuring that all departments are funded, which is generated through lending and attracting deposits at a lower rate than a bank lends. The treasury forecasts its requirements for lending and deposits, while various teams within the treasury adhere to those limits. Therefore, a pricing manager for a deposit product will price the product in such a way that the product will attract enough deposits to meet the forecasted targets shared by the treasury; the pricing manager also has to ensure that those targets aren't overshot by a significant margin, as the treasury only expects to manage a forecasted target. In both lending and deposit decisions, financial institutions do tend to use forecasting. 
A lot of these forecasts are interlinked, as we saw in the example of the treasury's expectations and the subsequent pricing decision for a deposit product. To decide on its future lending and borrowing positions, the treasury must have used time series data to determine the potential business appetite for lending and borrowing in the market, and would have assessed that against the current cash flow situation within the relevant teams and institutions.

Characteristics of time series data

Any time series analysis has to take into account the following factors:

  • Seasonality
  • Trend
  • Outliers and rare events
  • Disruptions and step changes

Seasonality

Seasonality is a phenomenon that occurs each calendar year. The same behavior can be observed each year. A good forecasting model will be able to incorporate the effect of seasonality in its forecasts. Christmas is a great example of seasonality, where retailers have come to expect higher sales over the festive period. Seasonality can extend into months but is usually only observed over days or weeks. When looking at time series where the periodicity is hours, you may find a seasonality effect for certain hours of the day.

Some of the reasons for seasonality include holidays, climate, and changes in social habits. For example, travel companies usually run far fewer services on Christmas Day, citing a lack of demand. During most holidays people love to travel, but this lack of demand on Christmas Day could be attributed to social habits, where people tend to stay at home or have already traveled. Social habit becomes a driving factor in the seasonality of journeys undertaken on Christmas Day.

It's easier for the forecaster when a particular seasonal event occurs on a fixed calendar date each year; the issue comes when some popular holidays depend on lunar movements, such as Easter, Diwali, and Eid. These holidays may occur in different weeks or months over the years, which will shift the seasonality effect.
Also, if some holidays fall closer to other holiday periods, it may lead to individuals taking extended holidays, and travel sales may increase more than expected in such years. The coffee shop near the office may also experience lower sales for a longer period. Changes in the weather can also impact seasonality; for example, a longer, warmer summer may be welcome in the UK, but this would impact retail sales in the autumn as most shoppers wouldn't need to buy a new wardrobe. In hotter countries, sales of air-conditioners would increase substantially compared to the summer months' usual seasonality. Forecasters could offset this unpredictability in seasonality by building in a weather forecast variable. We will explore similar challenges in the chapters ahead.

Seasonality shouldn't be confused with a cyclic effect. A cyclic effect is observed over a longer period of generally two years or more. The property sector is often associated with having a cyclic effect, where it has long periods of growth or slowdown before the cycle continues.

Trend

A trend is merely a long-term direction of observed behavior that is found by plotting data against a time component. A trend may indicate an increase or decrease in behavior. Trends may not even be linear, but a broad movement can be identified by analyzing plotted data.

Outliers and rare events

Outliers and rare events are terms that are often used interchangeably by businesses. They can have a big impact on data, and some sort of outlier treatment is usually applied to data before it is used for modeling. It is almost impossible to predict an outlier or rare event, but they do affect a trend. An example of an outlier could be a customer walking into a branch to deposit an amount that is 100 times the daily average of that branch. In this case, the forecaster wouldn't expect that trend to continue.

Disruptions

Disruptions and step changes are becoming more common in time series data.
One reason for this is the abundance of available data and the growing ability to store and analyze it. Disruptions could include instances when a business hasn't been able to trade as normal. Flooding at the local pub may lead to reduced sales for a few days, for example. While analyzing daily sales across a pub chain, an analyst may have to make note of a disruptive event and its impact on the chain's revenue.

Step changes are also more common now due to technological shifts, mergers and acquisitions, and business process re-engineering. When two companies announce a merger, they often try to sync their data. They might have been selling x and y quantities individually, but after the merger will expect to sell x + y + c (where c is the positive or negative effect of the merger). Over time, when someone plots sales data in this case, they will probably spot a step change in sales that happened around the time of the merger. The book illustrates this with three charts of online travel bookings: a trend graph, a step change and disruptions chart, and a seasonality chart.

In the trend graph, we can see that online travel bookings are increasing. In the step change and disruptions chart, we can see that Q1 of 2012 saw a substantial increase in bookings, whereas Q1 of 2014 saw a substantial dip. The increase was due to the merger of two companies that took place in Q1 of 2012. The decrease in Q1 of 2014 was attributed to prolonged snow storms in Europe and the ash cloud disruption from volcanic activity over Iceland. While online bookings kept increasing after the step change, the disruption caused by the snow storm and ash cloud only had an effect on sales in Q1 of 2014. In this case, the modeler will have to treat the merger and the disruption differently when using them in the forecast, as the disruption could be regarded as an outlier and treated accordingly. Also note that the seasonality chart shows that Q4 of each year sees almost a 20% increase in travel bookings, and this pattern continues each calendar year.
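The characteristics described above can be explored on a small synthetic series. The following is only a minimal sketch in Python, not the book's SAS code: the booking figures, the 20% Q4 uplift, the step and dip sizes, and the helper names (`make_series`, `seasonal_indices`, `year_over_year`) are all hypothetical values chosen to mimic the charts discussed in the text.

```python
from statistics import mean

def make_series(years=4, base=100.0, growth=2.0, q4_uplift=0.20,
                step_at=4, step_size=30.0, dip_at=12, dip_size=25.0):
    """Build hypothetical quarterly bookings: trend + step change + Q4 seasonality + one-off dip."""
    series = []
    for t in range(years * 4):
        level = base + growth * t        # long-term linear trend
        if t >= step_at:
            level += step_size           # permanent step change (for example, a merger)
        if t % 4 == 3:
            level *= 1 + q4_uplift       # seasonal uplift in every Q4
        if t == dip_at:
            level -= dip_size            # one-off disruption in a single quarter
        series.append(level)
    return series

def seasonal_indices(series, period=4):
    """Average ratio of each quarter to the mean of its own year."""
    ratios = [[] for _ in range(period)]
    for start in range(0, len(series) - period + 1, period):
        year = series[start:start + period]
        year_mean = mean(year)
        for quarter, value in enumerate(year):
            ratios[quarter].append(value / year_mean)
    return [mean(r) for r in ratios]

def year_over_year(series, period=4):
    """Change versus the same quarter one year earlier, which cancels the seasonal pattern."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

series = make_series()
print([round(i, 2) for i in seasonal_indices(series)])
print([round(d, 1) for d in year_over_year(series)])
```

In this toy data, the seasonal index for Q4 stands out from the other quarters, the year-over-year changes are unusually large for the four quarters following the step change, and the single disrupted quarter shows up as the only negative year-over-year change. A real analysis would use a proper decomposition method; comparing each quarter with the same quarter a year earlier is simply one easy way to separate seasonality from step changes and disruptions.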
In this article, we defined time series and learned why it is important for forecasting. We also looked at the characteristics of time series data. To learn more about how to leverage the analytical power of SAS to perform financial analysis efficiently, you can check out the book SAS for Finance.

Packt
06 Oct 2015
26 min read

Hand Gesture Recognition Using a Kinect Depth Sensor

The goal of this article, by Michael Beyeler, author of the book OpenCV with Python Blueprints, is to develop an app that detects and tracks simple hand gestures in real time using the output of a depth sensor, such as a Microsoft Kinect 3D sensor or an Asus Xtion. The app will analyze each captured frame to perform the following tasks:

  • Hand region segmentation: The user's hand region will be extracted in each frame by analyzing the depth map output of the Kinect sensor, which is done by thresholding, applying some morphological operations, and finding connected components
  • Hand shape analysis: The shape of the segmented hand region will be analyzed by determining contours, convex hull, and convexity defects
  • Hand gesture recognition: The number of extended fingers will be determined based on the hand contour's convexity defects, and the gesture will be classified accordingly (with no extended finger corresponding to a fist, and five extended fingers corresponding to an open hand)

Gesture recognition is an ever-popular topic in computer science. This is because it not only enables humans to communicate with machines (human-machine interaction or HMI), but also constitutes the first step for machines to begin understanding human body language. With affordable sensors, such as Microsoft Kinect or Asus Xtion, and open source software such as OpenKinect and OpenNI, it has never been easier to get started in the field yourself. So what shall we do with all this technology?

The beauty of the algorithm that we are going to implement in this article is that it works well for a number of hand gestures, yet is simple enough to run in real time on a generic laptop. And if we want, we can easily extend it to incorporate more complicated hand pose estimations.
In the end product, no matter how many fingers of my left hand I extend, the algorithm correctly segments the hand region (white), draws the corresponding convex hull (the green line surrounding the hand), finds all convexity defects that belong to the spaces between fingers (large green points) while ignoring others (small red points), and infers the correct number of extended fingers (the number in the bottom-right corner), even for a fist.

This article assumes that you have a Microsoft Kinect 3D sensor installed. Alternatively, you may install Asus Xtion or any other depth sensor for which OpenCV has built-in support. First, install OpenKinect and libfreenect from http://www.openkinect.org/wiki/Getting_Started. Then, you need to build (or rebuild) OpenCV with OpenNI support. The GUI used in this article will again be designed with wxPython, which can be obtained from http://www.wxpython.org/download.php.

Planning the app

The final app will consist of the following modules and scripts:

  • gestures: A module that consists of an algorithm for recognizing hand gestures. We separate this algorithm from the rest of the application so that it can be used as a standalone module without the need for a GUI.
  • gestures.HandGestureRecognition: A class that implements the entire process flow of hand gesture recognition. It accepts a single-channel depth image (acquired from the Kinect depth sensor) and returns an annotated RGB color image with an estimated number of extended fingers.
  • gui: A module that provides a wxPython GUI application to access the capture device and display the video feed. In order to have it access the Kinect depth sensor instead of a generic camera, we will have to extend some of the base class functionality.
  • gui.BaseLayout: A generic layout from which more complicated layouts can be built.
Setting up the app

Before we can get to the nitty-gritty of our gesture recognition algorithm, we need to make sure that we can access the Kinect sensor and display a stream of depth frames in a simple GUI.

Accessing the Kinect 3D sensor

Accessing Microsoft Kinect from within OpenCV is not much different from accessing a computer's webcam or camera device. The easiest way to integrate a Kinect sensor with OpenCV is by using an OpenKinect module called freenect. For installation instructions, see the installation notes earlier in this article. The following code snippet grants access to the sensor using cv2.VideoCapture:

    import cv2
    import freenect

    # OpenNI-compatible capture device (OpenCV 2.4 constant name)
    device = cv2.cv.CV_CAP_OPENNI
    capture = cv2.VideoCapture(device)

On some platforms, the first call to cv2.VideoCapture fails to open a capture channel. In this case, we provide a workaround by opening the channel ourselves (note that isOpened() takes no arguments):

    if not capture.isOpened():
        capture.open(device)

If you want to connect to your Asus Xtion, the device variable should be assigned the cv2.cv.CV_CAP_OPENNI_ASUS value instead.

In order to give our app a fair chance to run in real time, we will limit the frame size to 640 x 480 pixels:

    capture.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH, 640)
    capture.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT, 480)

If you are using OpenCV 3, the constants you are looking for might be called cv2.CAP_PROP_FRAME_WIDTH and cv2.CAP_PROP_FRAME_HEIGHT.

The read() method of cv2.VideoCapture is inappropriate when we need to synchronize a set of cameras or a multihead camera, such as a Kinect. In this case, we should use the grab() and retrieve() methods instead. An even easier way when working with OpenKinect is to use the sync_get_depth() and sync_get_video() methods.

For the purpose of this article, we will need only the Kinect's depth map, which is a single-channel (grayscale) image in which each pixel value is the estimated distance from the camera to a particular surface in the visual scene.
The latest frame can be grabbed via this code:

    depth, timestamp = freenect.sync_get_depth()

The preceding code returns both the depth map and a timestamp. We will ignore the latter for now. By default, the map is in 11-bit format, which cannot be visualized with cv2.imshow right away. Thus, it is a good idea to convert the image to 8-bit precision first.

In order to reduce the range of depth values in the frame, we will clip the maximal distance to a value of 1,023 (or 2**10-1). This will get rid of values that correspond either to noise or distances that are far too large to be of interest to us:

    import numpy as np

    # clip in place to 10 bits, then drop the two lowest bits (0..1023 -> 0..255)
    np.clip(depth, 0, 2**10-1, depth)
    depth >>= 2

Then, we convert the image into 8-bit format and display it:

    depth = depth.astype(np.uint8)
    cv2.imshow("depth", depth)

Running the app

In order to run our app, we will need to execute a main function routine that accesses the Kinect, generates the GUI, and executes the main loop of the app:

    import numpy as np
    import wx
    import cv2
    import freenect

    from gui import BaseLayout
    from gestures import HandGestureRecognition

    def main():
        device = cv2.cv.CV_CAP_OPENNI
        capture = cv2.VideoCapture()
        if not(capture.isOpened()):
            capture.open(device)

        capture.set(cv2.cv.CV_CAP_PROP_FRAME_WIDTH, 640)
        capture.set(cv2.cv.CV_CAP_PROP_FRAME_HEIGHT, 480)

We will design a suitable layout (KinectLayout) for the current project:

        # start graphical user interface
        app = wx.App()
        layout = KinectLayout(None, -1, 'Kinect Hand Gesture Recognition', capture)
        layout.Show(True)
        app.MainLoop()

The Kinect GUI

The layout chosen for the current project (KinectLayout) is as plain as it gets. It should simply display the live stream of the Kinect depth sensor at a comfortable frame rate of 10 frames per second. Therefore, there is no need to further customize BaseLayout:

    class KinectLayout(BaseLayout):
        def _create_custom_layout(self):
            pass

The only parameter that needs to be initialized this time is the recognition class.
This will be useful in just a moment:

        def _init_custom_layout(self):
            self.hand_gestures = HandGestureRecognition()

Instead of reading a regular camera frame, we need to acquire a depth frame via the freenect method sync_get_depth(). This can be achieved by overriding the following method:

        def _acquire_frame(self):

As mentioned earlier, by default, this function returns a single-channel depth image with 11-bit precision and a timestamp. However, we are not interested in the timestamp, and we simply pass on the frame if the acquisition was successful:

            frame, _ = freenect.sync_get_depth()
            # return success if a frame was acquired
            if frame is not None:
                return (True, frame)
            else:
                return (False, frame)

The rest of the visualization pipeline is handled by the BaseLayout class. We only need to make sure that we provide a _process_frame method. This method accepts a depth image with 11-bit precision, processes it, and returns an annotated 8-bit RGB color image. Conversion to a regular grayscale image is the same as mentioned in the previous subsection:

        def _process_frame(self, frame):
            # clip max depth to 1023, convert to 8-bit grayscale
            np.clip(frame, 0, 2**10 - 1, frame)
            frame >>= 2
            frame = frame.astype(np.uint8)

The resulting grayscale image can then be passed to the hand gesture recognizer, which will return the estimated number of extended fingers (num_fingers) and the annotated RGB color image mentioned earlier (img_draw):

            num_fingers, img_draw = self.hand_gestures.recognize(frame)

In order to simplify the segmentation task of the HandGestureRecognition class, we will instruct the user to place their hand in the center of the screen.
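The clip-and-shift conversion at the start of _process_frame can be checked in isolation on a few hand-picked values. This is a minimal sketch that needs no Kinect: the helper name `depth_to_uint8` and the sample array are hypothetical, and the depth map is assumed to arrive as an unsigned 16-bit NumPy array holding 11-bit values.

```python
import numpy as np

def depth_to_uint8(depth):
    """Clip an 11-bit depth map to 10 bits, then drop the two lowest bits to get 8-bit values."""
    # Work on a copy so the caller's array is untouched (the article clips in place instead)
    depth = np.asarray(depth, dtype=np.uint16).copy()
    np.clip(depth, 0, 2**10 - 1, out=depth)  # anything above 1023 saturates at 1023
    depth >>= 2                              # 0..1023 maps onto 0..255
    return depth.astype(np.uint8)

# Hypothetical raw readings, including one value beyond the 10-bit range
raw = np.array([[0, 3, 4], [512, 1023, 2047]], dtype=np.uint16)
print(depth_to_uint8(raw))
```

Shifting right by two bits is an integer division by 4, so neighboring raw depth values collapse onto the same gray level. That loss of resolution is acceptable here because the result is only used for display and coarse segmentation.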
To provide a visual aid for centering the hand, let's draw a rectangle around the image center and highlight the center pixel of the image in orange. Integer division (//) keeps the pixel coordinates integral:

            height, width = frame.shape[:2]
            cv2.circle(img_draw, (width//2, height//2), 3, [255, 102, 0], 2)
            cv2.rectangle(img_draw, (width//3, height//3),
                          (width*2//3, height*2//3), [255, 102, 0], 2)

In addition, we print num_fingers on the screen:

            cv2.putText(img_draw, str(num_fingers), (30, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255))
            return img_draw

Tracking hand gestures in real time

The bulk of the work is done by the HandGestureRecognition class, especially by its recognize method. This class starts off with a few parameter initializations, which will be explained and used later:

    class HandGestureRecognition:
        def __init__(self):
            # maximum depth deviation for a pixel to be considered
            # within range
            self.abs_depth_dev = 14

            # cut-off angle (deg): everything below this is a convexity
            # point that belongs to two extended fingers
            self.thresh_deg = 80.0

The recognize method is where the real magic takes place. This method handles the entire process flow, from the raw grayscale image all the way to a recognized hand gesture. It implements the following procedure:

It extracts the user's hand region by analyzing the depth map (img_gray) and returning a hand region mask (segment):

        def recognize(self, img_gray):
            segment = self._segment_arm(img_gray)

It performs contour analysis on the hand region mask (segment). Then, it returns the largest contour area found in the image (contours) and any convexity defects (defects):

            [contours, defects] = self._find_hull_defects(segment)

Based on the contours found and the convexity defects, it detects the number of extended fingers (num_fingers) in the image.
Then, it annotates the output image (img_draw) with contours, defect points, and the number of extended fingers:

        img_draw = cv2.cvtColor(img_gray, cv2.COLOR_GRAY2RGB)
        num_fingers, img_draw = self._detect_num_fingers(contours, defects, img_draw)

It returns the estimated number of extended fingers (num_fingers) as well as the annotated output image (img_draw):

        return (num_fingers, img_draw)

Hand region segmentation

The automatic detection of an arm, and later the hand region, could be designed to be arbitrarily complicated, maybe by combining information about the shape and color of an arm or hand. However, using skin color as a determining feature to find hands in visual scenes might fail terribly in poor lighting conditions or when the user is wearing gloves. Instead, we choose to recognize the user's hand by its shape in the depth map. Allowing hands of all sorts to be present in any region of the image unnecessarily complicates the mission of this article, so we make two simplifying assumptions:

We will instruct the user of our app to place their hand in front of the center of the screen, orienting their palm roughly parallel to the orientation of the Kinect sensor so that it is easier to identify the corresponding depth layer of the hand.

We will also instruct the user to sit roughly 1 to 2 meters away from the Kinect, and to slightly extend their arm in front of their body so that the hand will end up in a slightly different depth layer than the arm. However, the algorithm will still work even if the full arm is visible.

In this way, it will be relatively straightforward to segment the image based on the depth layer alone. Otherwise, we would have to come up with a hand detection algorithm first, which would unnecessarily complicate our mission. If you feel adventurous, feel free to do this on your own.
Finding the most prominent depth of the image center region

Once the hand is placed roughly in the center of the screen, we can start finding all image pixels that lie on the same depth plane as the hand. For this, we simply need to determine the most prominent depth value of the center region of the image. The simplest approach would be as follows: look only at the depth value of the center pixel (note that the shape of an image array is given as rows, then columns, that is, height before width):

    height, width = depth.shape
    center_pixel_depth = depth[height // 2, width // 2]

Then, create a mask in which all pixels at a depth of center_pixel_depth are white and all others are black:

    import numpy as np
    depth_mask = np.where(depth == center_pixel_depth, 255, 0).astype(np.uint8)

However, this approach will not be very robust, because chances are that:

Your hand is not placed perfectly parallel to the Kinect sensor
Your hand is not perfectly flat
The Kinect sensor values are noisy

Therefore, different regions of your hand will have slightly different depth values. The _segment_arm method takes a slightly better approach, that is, looking at a small neighborhood in the center of the image and determining the median depth value, which serves as a robust estimate of the most prominent depth.
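To see why a single center pixel is fragile while a median over a small neighborhood is not, here is a minimal, dependency-free sketch. The 5 x 5 depth patch, the function names, and the window size are all made up for illustration; the article's own code works on NumPy arrays instead of nested lists.

```python
from statistics import median


def center_pixel_mask(depth):
    """White (255) where depth equals the single center pixel, black (0) elsewhere."""
    height, width = len(depth), len(depth[0])
    ref = depth[height // 2][width // 2]
    return [[255 if value == ref else 0 for value in row] for row in depth]


def median_center_depth(depth, half=1):
    """Median depth of the (2*half+1) x (2*half+1) window around the center."""
    height, width = len(depth), len(depth[0])
    cy, cx = height // 2, width // 2
    window = [depth[y][x]
              for y in range(cy - half, cy + half + 1)
              for x in range(cx - half, cx + half + 1)]
    return median(window)


# Hypothetical depth patch: a hand at a depth of about 100, background at 200,
# and one noisy reading (130) exactly at the center pixel.
depth = [
    [200, 200, 200, 200, 200],
    [200, 101,  99, 100, 200],
    [200, 100, 130, 101, 200],
    [200,  99, 100, 102, 200],
    [200, 200, 200, 200, 200],
]

center_depth = depth[2][2]                      # the noise spike
robust_depth = median_center_depth(depth, 1)    # median of the 3 x 3 center window
naive_mask = center_pixel_mask(depth)           # matches only the spike itself
```

In this toy patch the single-pixel reference picks up the noise spike, so the naive mask selects one pixel only, whereas the median of the 3 x 3 window lands on the depth of the hand.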
First, we find the center (for example, 21 x 21 pixels) region of the image frame:

    def _segment_arm(self, frame):
        """ segments the arm region based on depth """
        center_half = 10  # half-width of the center region (21 // 2)
        lowerHeight = self.height // 2 - center_half
        upperHeight = self.height // 2 + center_half
        lowerWidth = self.width // 2 - center_half
        upperWidth = self.width // 2 + center_half
        center = frame[lowerHeight:upperHeight, lowerWidth:upperWidth]

We can then determine the median depth value of this center region, med_val (np.median treats the region as a flattened, one-dimensional vector by default):

        med_val = np.median(center)

We can now compare med_val with the depth value of all pixels in the image and create a mask in which all pixels whose depth values are within a particular range [med_val-self.abs_depth_dev, med_val+self.abs_depth_dev] are white and all other pixels are black. However, for reasons that will be clear in a moment, let's paint the pixels gray instead of white:

        frame = np.where(abs(frame - med_val) <= self.abs_depth_dev, 128, 0).astype(np.uint8)

The result will look like this:

Applying morphological closing to smoothen the segmentation mask

A common problem with segmentation is that a hard threshold typically results in small imperfections (that is, holes, as in the preceding image) in the segmented region. These holes can be alleviated using morphological opening and closing. Opening removes small objects from the foreground (assuming that the objects are bright on a dark background), whereas closing removes small holes (dark regions).
This means that we can get rid of the small black regions in our mask by applying morphological closing (dilation followed by erosion) with a small 3 x 3 pixel kernel:

        kernel = np.ones((3, 3), np.uint8)
        frame = cv2.morphologyEx(frame, cv2.MORPH_CLOSE, kernel)

The result looks a lot smoother, as follows:

Notice, however, that the mask still contains regions that do not belong to the hand or arm, such as what appears to be one of my knees on the left and some furniture on the right. These objects just happen to be on the same depth layer as my arm and hand. If possible, we could now combine the depth information with another descriptor, maybe a texture-based or skeleton-based hand classifier, that would weed out all non-skin regions.

Finding connected components in a segmentation mask

An easier approach is to realize that most of the time, hands are not connected to knees or furniture. We already know that the center region belongs to the hand, so we can simply apply cv2.floodFill to find all the connected image regions. Before we do this, we want to be absolutely certain that the seed point for the flood fill belongs to the right mask region. This can be achieved by assigning a grayscale value of 128 to the seed point. But we also want to make sure that the center pixel does not, by any coincidence, lie within a cavity that the morphological operation failed to close. So, let's set a small region (roughly 7 x 7 pixels) around the center to a grayscale value of 128 instead:

        small_kernel = 3
        frame[self.height // 2 - small_kernel:self.height // 2 + small_kernel,
              self.width // 2 - small_kernel:self.width // 2 + small_kernel] = 128

Because flood filling (as well as morphological operations) is potentially dangerous, the Python bindings of later OpenCV versions require specifying a mask that avoids flooding the entire image. This mask has to be 2 pixels wider and taller than the original image and can be used in combination with the cv2.FLOODFILL_MASK_ONLY flag.
It can be very helpful in constraining the flood filling to a small region of the image or a specific contour so that we do not connect two neighboring regions that should have never been connected in the first place. It's better to be safe than sorry, right? Ah, screw it! Today, we feel courageous! Let's make the mask entirely black:

        mask = np.zeros((self.height + 2, self.width + 2), np.uint8)

Then we can apply the flood fill to the center pixel (seed point) and paint all the connected regions white:

        flood = frame.copy()
        cv2.floodFill(flood, mask, (self.width // 2, self.height // 2), 255, flags=4 | (255 << 8))

At this point, it should be clear why we decided to start with a gray mask earlier. We now have a mask that contains white regions (arm and hand), gray regions (neither arm nor hand but other things in the same depth plane), and black regions (all others). With this setup, it is easy to apply a simple binary threshold to highlight only the relevant regions of the pre-segmented depth plane:

        ret, flooded = cv2.threshold(flood, 129, 255, cv2.THRESH_BINARY)

This is what the resulting mask looks like:

The resulting segmentation mask can now be returned to the recognize method, where it will be used as an input to _find_hull_defects as well as a canvas for drawing the final output image (img_draw).

Hand shape analysis

Now that we (roughly) know where the hand is located, we aim to learn something about its shape.

Determining the contour of the segmented hand region

The first step involves determining the contour of the segmented hand region. Luckily, OpenCV comes with a pre-canned version of such an algorithm: cv2.findContours. This function acts on a binary image and returns a set of points that are believed to be part of the contour.
Because there might be multiple contours present in the image, it is possible to retrieve an entire hierarchy of contours:

    def _find_hull_defects(self, segment):
        contours, hierarchy = cv2.findContours(segment, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)

Furthermore, because we do not know which contour we are looking for, we have to make an assumption to clean up the contour result. Since it is possible that some small cavities are left over even after the morphological closing, but we are fairly certain that our mask contains only the segmented area of interest, we will assume that the largest contour found is the one that we are looking for. Thus, we simply traverse the list of contours, calculate the contour area (cv2.contourArea), and store only the largest one (max_contour):

        max_contour = max(contours, key=cv2.contourArea)

Finding the convex hull of a contour area

Once we have identified the largest contour in our mask, it is straightforward to compute the convex hull of the contour area. The convex hull is basically the envelope of the contour area. If you think of all the pixels that belong to the contour area as a set of nails sticking out of a board, then the convex hull is the shape formed by a tight rubber band that surrounds all the nails. We can get the convex hull directly from our largest contour (max_contour):

        hull = cv2.convexHull(max_contour, returnPoints=False)

Because we now want to look at convexity defects in this hull, we are instructed by the OpenCV documentation to set the returnPoints optional flag to False, so that the hull is returned as indices into the contour rather than as points. The convex hull drawn in green around a segmented hand region looks like this:

Finding convexity defects of a convex hull

As is evident from the preceding screenshot, not all points on the convex hull belong to the segmented hand region. In fact, all the fingers and the wrist cause severe convexity defects, that is, points of the contour that are far away from the hull.
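The rubber-band picture of the convex hull, and the idea that the valley between two fingers is left off the hull, can be reproduced on a handful of points without OpenCV. The sketch below uses Andrew's monotone chain algorithm as a stand-in; it is not meant to describe how cv2.convexHull is implemented, and the five-point "two-finger" outline is a hypothetical example.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); its sign gives the turn direction."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: returns the hull vertices in order around the hull."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower = []
    for p in pts:
        # drop points that would make the chain turn the wrong way (or go straight)
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    upper = []
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # the last point of each chain is the first point of the other one
    return lower[:-1] + upper[:-1]


# Hypothetical outline: two "fingertips" at (4, 4) and (0, 4) with a valley at (2, 1).
outline = [(0, 0), (4, 0), (4, 4), (2, 1), (0, 4)]
valley = (2, 1)

hull = convex_hull(outline)
valley_on_hull = valley in hull
```

The valley point belongs to the outline but not to the hull, which is exactly the situation that a convexity defect describes.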
We can find these defects by looking at both the largest contour (max_contour) and the corresponding convex hull (hull):

        defects = cv2.convexityDefects(max_contour, hull)

The output of this function (defects) is an array with one 4-tuple per defect. Each tuple contains start_index (the point of the contour where the defect begins), end_index (the point of the contour where the defect ends), farthest_pt_index (the point within the defect that is farthest from the convex hull), and fixpt_depth (the distance between the farthest point and the convex hull). We will make use of this information in just a moment when we reason about fingers. But for now, our job is done. The extracted contour (max_contour) and convexity defects (defects) can be passed to recognize, where they will be used as inputs to _detect_num_fingers:

        return (max_contour, defects)

Hand gesture recognition

What remains to be done is classifying the hand gesture based on the number of extended fingers. For example, if we find five extended fingers, we assume the hand to be open, whereas no extended fingers imply a fist. All that we are trying to do is count from zero to five and make the app recognize the corresponding number of fingers. This is actually trickier than it might seem at first. For example, people in Europe might count to three by extending their thumb, index finger, and middle finger. If you do that in the US, people there might get horrendously confused, because people do not tend to use their thumbs when signaling the number two. This might lead to frustration, especially in restaurants (trust me). If we could find a way to generalize these two scenarios, maybe by appropriately counting the number of extended fingers, we would have an algorithm that could teach simple hand gesture recognition to not only a machine but also (maybe) to an average waitress. As you might have guessed, the answer has to do with convexity defects. As mentioned earlier, extended fingers cause defects in the convex hull.
However, the inverse is not true; that is, not all convexity defects are caused by fingers! There might be additional defects caused by the wrist as well as the overall orientation of the hand or the arm. How can we distinguish between these different causes for defects?

Distinguishing between different causes for convexity defects

The trick is to look at the angle formed at the defect point that is farthest from the convex hull (farthest_pt_index) by the start and end points of the defect (start_index and end_index, respectively), as illustrated in the following screenshot:

In this screenshot, the orange markers serve as a visual aid to center the hand in the middle of the screen, and the convex hull is outlined in green. Each red dot corresponds to the point farthest from the convex hull (farthest_pt_index) for every convexity defect detected. If we compare a typical angle that belongs to two extended fingers (such as θj) to an angle that is caused by general hand geometry (such as θi), we notice that the former is much smaller than the latter. This is because humans can spread their fingers only a little, thus creating a narrow angle made by the farthest defect point and the neighboring fingertips. Therefore, we can iterate over all convexity defects and compute the angle between the said points. For this, we will need a utility function that calculates the angle (in radians) between two arbitrary, list-like vectors, v1 and v2:

    def angle_rad(v1, v2):
        return np.arctan2(np.linalg.norm(np.cross(v1, v2)), np.dot(v1, v2))

This method uses the cross product to compute the angle, rather than the standard way. The standard way of calculating the angle between two vectors v1 and v2 is by calculating their dot product, dividing it by the norm of v1 and the norm of v2, and taking the arccosine of the result.
However, this method has two imperfections:

You have to manually avoid division by zero if either the norm of v1 or the norm of v2 is zero
The method returns relatively inaccurate results for small angles

Similarly, we provide a simple function to convert an angle from degrees to radians:

    def deg2rad(angle_deg):
        return angle_deg / 180.0 * np.pi

Classifying hand gestures based on the number of extended fingers

What remains to be done is actually classifying the hand gesture based on the number of extended fingers. The _detect_num_fingers method will take as input the detected contour (contours), the convexity defects (defects), and a canvas to draw on (img_draw):

    def _detect_num_fingers(self, contours, defects, img_draw):

Based on these parameters, it will then determine the number of extended fingers. However, we first need to define a cut-off angle that can be used as a threshold to classify convexity defects as being caused by extended fingers or not. Except for the angle between the thumb and the index finger, it is rather hard to get anything close to 90 degrees, so anything close to that number should work. We do not want the cut-off angle to be too high, because that might lead to misclassifications:

        self.thresh_deg = 80.0

For simplicity, let's focus on the special cases first. If we do not find any convexity defects, it means that we possibly made a mistake during the convex hull calculation, or there are simply no extended fingers in the frame, so we return 0 as the number of detected fingers:

        if defects is None:
            return [0, img_draw]

But we can take this idea even further. Due to the fact that arms are usually slimmer than hands or fists, we can assume that the hand geometry will always generate at least two convexity defects (which usually belong to the wrist).
So if there are no additional defects, it implies that there are no extended fingers:

        if len(defects) <= 2:
            return [0, img_draw]

Now that we have ruled out all special cases, we can begin counting real fingers. If there are a sufficient number of defects, we will find a defect between every pair of fingers. Thus, in order to get the number right (num_fingers), we should start counting at 1:

        num_fingers = 1

Then we can start iterating over all convexity defects. For each defect, we will extract the four elements and draw its hull for visualization purposes:

        for i in range(defects.shape[0]):
            # each defect point is a 4-tuple
            start_idx, end_idx, farthest_idx, _ = defects[i, 0]
            start = tuple(contours[start_idx][0])
            end = tuple(contours[end_idx][0])
            far = tuple(contours[farthest_idx][0])
            # draw the hull
            cv2.line(img_draw, start, end, [0, 255, 0], 2)

Then we will compute the angle between the two edges from far to start and from far to end. If the angle is smaller than self.thresh_deg degrees, it means that we are dealing with a defect that is most likely caused by two extended fingers. In this case, we want to increment the number of detected fingers (num_fingers), and we draw the point with green. Otherwise, we draw the point with red:

            # if angle is below a threshold, defect point belongs
            # to two extended fingers
            if angle_rad(np.subtract(start, far), np.subtract(end, far)) < deg2rad(self.thresh_deg):
                # increment number of fingers
                num_fingers = num_fingers + 1
                # draw point as green
                cv2.circle(img_draw, far, 5, [0, 255, 0], -1)
            else:
                # draw point as red
                cv2.circle(img_draw, far, 5, [255, 0, 0], -1)

After iterating over all convexity defects, we pass the number of detected fingers and the assembled output image to the recognize method:

        return (min(5, num_fingers), img_draw)

This will make sure that we do not exceed the common number of fingers per hand.
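The counting rule can also be tried in isolation, without a camera or OpenCV. The following is a minimal sketch that assumes each defect is already given as a (start, end, far) triple of 2-D points instead of indices into a contour; the function name count_fingers and the three sample defects are hypothetical and only mirror the logic described above.

```python
import math


def angle_rad(v1, v2):
    """Angle between two 2-D vectors via atan2(|cross|, dot)."""
    cross = v1[0] * v2[1] - v1[1] * v2[0]
    dot = v1[0] * v2[0] + v1[1] * v2[1]
    return math.atan2(abs(cross), dot)


def count_fingers(defects, thresh_deg=80.0):
    """Count extended fingers from (start, end, far) point triples."""
    # two or fewer defects are attributed to the wrist: no extended fingers
    if len(defects) <= 2:
        return 0
    num_fingers = 1  # n narrow defects separate n + 1 fingers
    for start, end, far in defects:
        to_start = (start[0] - far[0], start[1] - far[1])
        to_end = (end[0] - far[0], end[1] - far[1])
        if angle_rad(to_start, to_end) < math.radians(thresh_deg):
            num_fingers += 1
    return min(5, num_fingers)


# Hypothetical defects: two narrow valleys between fingers and one wide wrist defect.
defects = [
    ((0, 10), (4, 10), (2, 0)),    # narrow angle at the far point
    ((4, 10), (8, 10), (6, 0)),    # narrow angle at the far point
    ((0, 0), (20, 0), (10, 2)),    # wide angle, as caused by the wrist
]

fingers = count_fingers(defects)
```

With these sample points, the two narrow defects fall below the 80-degree cut-off and the wide one does not, so the sketch reports three extended fingers.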
The result can be seen in the following screenshots:

Interestingly, our app is able to detect the correct number of extended fingers in a variety of hand configurations. Defect points between extended fingers are easily classified as such by the algorithm, and others are successfully ignored.

Summary

This article showed a relatively simple and yet surprisingly robust way of recognizing a variety of hand gestures by counting the number of extended fingers. We first saw how a task-relevant region of the image can be segmented using depth information acquired from a Microsoft Kinect 3D Sensor, and how morphological operations can be used to clean up the segmentation result. By analyzing the shape of the segmented hand region, the algorithm classifies hand gestures based on the types of convexity defects found in the image. Once again, mastering our use of OpenCV to perform a desired task did not require us to produce a large amount of code. Instead, we were challenged to gain an important insight that made us use the built-in functionality of OpenCV in the most effective way possible. Gesture recognition is a popular but challenging field in computer science, with applications in a large number of areas, such as human-computer interaction, video surveillance, and even the video game industry. You can now use your advanced understanding of segmentation and structure analysis to build your own state-of-the-art gesture recognition system.

Sunith Shetty
29 Dec 2017
5 min read

Getting to know and manipulate Tensors in TensorFlow

This article is a book excerpt written by Rodolfo Bonnin, titled Building Machine Learning Projects with TensorFlow. In this book, you will learn to build powerful machine learning projects to tackle complex data for gaining valuable insights.

Today, you will learn about tensors, their properties, and how they are used to represent data.

What are tensors?

TensorFlow bases its data management on tensors. Tensors are concepts from the field of mathematics, and were developed as a generalization of the linear algebra terms of vectors and matrices. Talking specifically about TensorFlow, a tensor is just a typed, multidimensional array, with additional operations, modeled in the tensor object.

Tensor properties - ranks, shapes, and types

TensorFlow uses the tensor data structure to represent all data. Any tensor has a static type and dynamic dimensions, so you can change a tensor's internal organization in real-time. Another property of tensors is that only objects of the tensor type can be passed between nodes in the computation graph. Let's now see what the properties of tensors are (from now on, every time we use the word tensor, we'll be referring to TensorFlow's tensor objects).

Tensor rank

The rank of a tensor represents its dimensional aspect, but is not the same as a matrix rank. It represents the quantity of dimensions in which the tensor lives, and is not a precise measure of the extension of the tensor in rows/columns or spatial equivalents. A rank one tensor is the equivalent of a vector, and a rank two tensor is a matrix. For a rank two tensor you can access any element with the syntax t[i, j]. For a rank three tensor you would need to address an element with t[i, j, k], and so on.
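The idea that rank counts dimensions rather than elements can be illustrated without TensorFlow. The sketch below is only an analogy on plain nested Python lists: the helper names rank and shape are hypothetical, the sample values are made up, and it assumes regular (non-ragged), non-empty lists. Note that plain lists use chained indexing such as t[i][j][k] where a tensor or NumPy array accepts t[i, j, k].

```python
def rank(value):
    """Number of nesting levels of a regular nested list (0 for a scalar)."""
    levels = 0
    while isinstance(value, list):
        levels += 1
        value = value[0]   # assumes non-empty, regular nesting
    return levels


def shape(value):
    """Length along each dimension of a regular nested list."""
    dims = []
    while isinstance(value, list):
        dims.append(len(value))
        value = value[0]
    return dims


scalar = 7
vector = [1, 2, 3, 4]
matrix = [[1, 2, 3], [4, 5, 6]]
cube = [[[1, 2], [2, 3]], [[3, 4], [5, 6]]]

ranks = [rank(scalar), rank(vector), rank(matrix), rank(cube)]
element = cube[1][1][0]   # chained indexing on plain lists
```

Here the vector has four elements but rank one, while the cube has only eight elements and rank three: the rank says how many indices are needed to reach a single element, not how large the structure is.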
In the following example, we will create a tensor, and access one of its components:

    import tensorflow as tf
    sess = tf.Session()
    tens1 = tf.constant([[[1, 2], [2, 3]], [[3, 4], [5, 6]]])
    print(sess.run(tens1)[1, 1, 0])

Output: 5

This is a tensor of rank three, because in each element of the containing matrix, there is a vector element:

Rank | Math entity | Code definition example
0 | Scalar | scalar = 1000
1 | Vector | vector = [2, 8, 3]
2 | Matrix | matrix = [[4, 2, 1], [5, 3, 2], [5, 5, 6]]
3 | 3-tensor | tensor = [[[4], [3], [2]], [[6], [100], [4]], [[5], [1], [4]]]
n | n-tensor | …

Tensor shape

The TensorFlow documentation uses three notational conventions to describe tensor dimensionality: rank, shape, and dimension number. The following table shows how these relate to one another:

Rank | Shape | Dimension number | Example
0 | [] | 0 | 4
1 | [D0] | 1 | [2]
2 | [D0, D1] | 2 | [6, 2]
3 | [D0, D1, D2] | 3 | [7, 3, 2]
n | [D0, D1, … Dn-1] | n-D | A tensor with shape [D0, D1, … Dn-1]

In the following example, we create a sample rank three tensor, and print the shape of it:

Tensor data types

In addition to dimensionality, tensors have a fixed data type. You can assign any one of the following data types to a tensor:

Data type | Python type | Description
DT_FLOAT | tf.float32 | 32 bits floating point.
DT_DOUBLE | tf.float64 | 64 bits floating point.
DT_INT8 | tf.int8 | 8 bits signed integer.
DT_INT16 | tf.int16 | 16 bits signed integer.
DT_INT32 | tf.int32 | 32 bits signed integer.
DT_INT64 | tf.int64 | 64 bits signed integer.
DT_UINT8 | tf.uint8 | 8 bits unsigned integer.
DT_STRING | tf.string | Variable length byte arrays. Each element of a tensor is a byte array.
DT_BOOL | tf.bool | Boolean.

Creating new tensors

We can either create our own tensors, or derive them from the well-known numpy library.
In the following example, we create a tensor from a numpy array and another one from a Python list, and then inspect them:

    import tensorflow as tf
    import numpy as np
    x = tf.constant(np.random.rand(32).astype(np.float32))
    y = tf.constant([1, 2, 3])
    x
    y

Output: <tf.Tensor 'Const_2:0' shape=(3,) dtype=int32>

From numpy to tensors and vice versa

TensorFlow is interoperable with numpy, and normally the eval() function calls will return a numpy object, ready to be worked with the standard numerical tools. We must note that the tensor object is a symbolic handle for the result of an operation, so it doesn't hold the resulting values of the structures it contains. For this reason, we must run the eval() method to get the actual values, which is the equivalent to Session.run(tensor_to_eval). In this example, we build a numpy array and convert it to a tensor:

    import tensorflow as tf  # we import tensorflow
    import numpy as np  # we import numpy
    sess = tf.Session()  # start a new Session object
    x_data = np.array([[1., 2., 3.], [3., 2., 6.]])  # 2x3 matrix
    x = tf.convert_to_tensor(x_data, dtype=tf.float32)
    print(x)

Output: Tensor("Const_3:0", shape=(2, 3), dtype=float32)

Useful method: tf.convert_to_tensor: This function converts Python objects of various types to tensor objects. It accepts tensor objects, numpy arrays, Python lists, and Python scalars.

Getting things done - interacting with TensorFlow

As with the majority of Python's modules, TensorFlow allows the use of Python's interactive console:

In the previous figure, we call the Python interpreter (by simply calling python) and create a tensor of constant type. Then we invoke it again, and the Python interpreter shows the shape and type of the tensor. We can also use the IPython interpreter, which will allow us to employ a format more compatible with notebook-style tools, such as Jupyter:

When talking about running TensorFlow Sessions in an interactive manner, it's better to employ the InteractiveSession object.
Unlike the normal tf.Session class, the tf.InteractiveSession class installs itself as the default session on construction. So when you try to eval a tensor, or run an operation, it will not be necessary to pass a Session object to indicate which session it refers to. To summarize, we have learned about tensors, the key data structure in TensorFlow and simple operations we can apply to the data. To know more about different machine learning techniques and algorithms that can be used to build efficient and powerful projects, you can refer to the book Building Machine Learning Projects with TensorFlow.    

Richard Gall
25 Sep 2018
5 min read

What is a convolutional neural network (CNN)? [Video]

What is a convolutional neural network, exactly? Well, let's start with the basics: a convolutional neural network (CNN) is a type of neural network that is most often applied to image processing problems. You've probably seen them in action anywhere a computer is identifying objects in an image. But you can also use convolutional neural networks in natural language processing projects, too. The fact that they are useful for these fast growing areas is one of the main reasons they're so important in deep learning and artificial intelligence today.

What makes a convolutional neural network unique?

Once you understand how a convolutional neural network works and what makes it unique from other neural networks, you can see why they're so effective for processing and classifying images. But let’s first take a regular neural network. A regular neural network has an input layer, hidden layers and an output layer. The input layer accepts inputs in different forms, while the hidden layers perform calculations on these inputs. The output layer then delivers the outcome of the calculations and extractions. Each of these layers contains neurons that are connected to neurons in the previous layer, and each neuron has its own weight. This means you aren’t making any assumptions about the data being fed into the network - great usually, but not if you’re working with images or language.

Convolutional neural networks work differently as they treat data as spatial. Instead of neurons being connected to every neuron in the previous layer, they are instead only connected to neurons close to it and all have the same weight. This simplification in the connections means the network upholds the spatial aspect of the data set. It means your network doesn’t think an eye is all over the image. The word ‘convolutional’ refers to the filtering process that happens in this type of network.
Think of it this way: an image is complex - a convolutional neural network simplifies it so it can be better processed and ‘understood.’

What's inside a convolutional neural network?

Like a normal neural network, a convolutional neural network is made up of multiple layers. There are a couple of layers that make it unique - the convolutional layer and the pooling layer. However, like other neural networks, it will also have a ReLU or rectified linear unit layer, and a fully connected layer. The ReLU layer acts as an activation function, ensuring non-linearity as the data moves through each layer in the network - without it, the data being fed into each layer would lose the dimensionality that we want to maintain. The fully connected layer, meanwhile, allows you to perform classification on your dataset.

The convolutional layer

The convolutional layer is the most important, so let’s start there. It works by placing a filter over an array of image pixels - this then creates what’s called a convolved feature map. It’s a bit like looking at an image through a window which allows you to identify specific features you might not otherwise be able to see.

The pooling layer

Next we have the pooling layer - this downsamples or reduces the sample size of a particular feature map. This also makes processing much faster as it reduces the number of parameters the network needs to process. The output of this is a pooled feature map. There are two ways of doing this: max pooling, which takes the maximum input of a particular convolved feature, or average pooling, which simply takes the average. These steps amount to feature extraction, whereby the network builds up a picture of the image data according to its own mathematical rules. If you want to perform classification, you'll need to move into the fully connected layer. To do this, you'll need to flatten things out - remember, a neural network with a more complex set of connections can only process linear data.
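The layer sequence described above (filter, ReLU, pooling, flatten) can be traced by hand on a tiny input. The following is a minimal sketch in plain Python, not a trainable network: the 5 x 5 image, the 2 x 2 edge filter, and all function names are made up for illustration, and a real project would use a framework such as TensorFlow instead.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image (no padding, stride 1).

    As in most CNN libraries, the kernel is not flipped, so strictly speaking
    this is a cross-correlation.
    """
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[y + i][x + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for x in range(out_w)]
            for y in range(out_h)]


def relu(feature_map):
    """Rectified linear unit: negative responses are set to zero."""
    return [[max(0, value) for value in row] for row in feature_map]


def max_pool(feature_map, size=2):
    """Non-overlapping size x size max pooling."""
    return [[max(feature_map[y + i][x + j]
                 for i in range(size) for j in range(size))
             for x in range(0, len(feature_map[0]) - size + 1, size)]
            for y in range(0, len(feature_map) - size + 1, size)]


def flatten(feature_map):
    """Turn the 2-D feature map into a flat list for a fully connected layer."""
    return [value for row in feature_map for value in row]


# Hypothetical image: a bright vertical stripe (rising edge, then falling edge).
image = [[0, 0, 1, 1, 0] for _ in range(5)]
# Hypothetical filter that responds to dark-to-bright transitions from left to right.
kernel = [[-1, 1],
          [-1, 1]]

convolved = convolve2d(image, kernel)   # 4 x 4 convolved feature map
activated = relu(convolved)             # falling-edge responses are zeroed
pooled = max_pool(activated, size=2)    # 2 x 2 pooled feature map
features = flatten(pooled)              # input for a fully connected layer
```

In this toy case the filter responds positively to the rising edge and negatively to the falling edge, ReLU discards the negative response, and pooling halves each spatial dimension before the result is flattened.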
How to train a convolutional neural network

There are a number of ways you can train a convolutional neural network. If you’re working with unlabelled data, you can use unsupervised learning methods. One of the most popular ways of doing this is using auto-encoders - this allows you to squeeze data into a space with low dimensions, performing calculations in the first part of the convolutional neural network. Once this is done you’ll then need to reconstruct with additional layers that upsample the data you have. Another option is to use generative adversarial networks, or GANs. With a GAN, you train two networks. The first gives you artificial data samples that should resemble data in the training set, while the second is a ‘discriminative network’ - it should distinguish between the artificial samples and the 'true' ones.

What's the difference between a convolutional neural network and a recurrent neural network?

Although there's a lot of confusion about the difference between a convolutional neural network and a recurrent neural network, it's actually simpler than many people realise. Whereas a convolutional neural network is a feedforward network that filters spatial data, a recurrent neural network, as the name implies, feeds data back into itself. From this perspective recurrent neural networks are better suited to sequential data. Think of it like this: a convolutional network is able to perceive patterns across space - a recurrent neural network can see them over time.

How to get started with convolutional neural networks

If you want to get started with convolutional neural networks, Python and TensorFlow are great tools to begin with. It’s worth exploring the MNIST dataset too. This is a database of handwritten digits that you can use to get started with building your first convolutional neural network. To learn more about convolutional neural networks, artificial intelligence, and deep learning, visit Packt's store for eBooks and videos.

Amarabha Banerjee
08 Mar 2018
13 min read

4 common challenges in Web Scraping and how to handle them

Our article is an excerpt from the book Web Scraping with Python, written by Richard Lawson. This book contains step by step tutorials on how to leverage Python programming techniques for ethical web scraping.

In this article, we will explore the primary challenges of web scraping and how to handle them. Developing a reliable scraper is never easy; there are so many what-ifs that we need to take into account. What if the website goes down? What if the response returns unexpected data? What if your IP is throttled or blocked? What if authentication is required? While we can never predict and cover all what-ifs, we will discuss some common traps, challenges, and workarounds. Note that several of the recipes require access to a website that I have provided as a Docker container. They require more logic than the simple, static site we used in earlier chapters. Therefore, you will need to pull and run a Docker container using the following Docker commands:

    docker pull mheydt/pywebscrapecookbook
    docker run -p 5001:5001 pywebscrapecookbook

Retrying failed page downloads

Failed page requests can be easily handled by Scrapy using retry middleware. When installed, Scrapy will attempt retries when receiving the following HTTP error codes:

    [500, 502, 503, 504, 408]

The process can be further configured using the following parameters:

RETRY_ENABLED (True/False - default is True)
RETRY_TIMES (# of times to retry on any errors - default is 2)
RETRY_HTTP_CODES (a list of HTTP error codes which should be retried - default is [500, 502, 503, 504, 408])

How to do it

The 06/01_scrapy_retry.py script demonstrates how to configure Scrapy for retries.
The script file contains the following configuration for Scrapy:

    process = CrawlerProcess({
        'LOG_LEVEL': 'DEBUG',
        'DOWNLOADER_MIDDLEWARES': {
            "scrapy.downloadermiddlewares.retry.RetryMiddleware": 500
        },
        'RETRY_ENABLED': True,
        'RETRY_TIMES': 3
    })
    process.crawl(Spider)
    process.start()

How it works

Scrapy will pick up the configuration for retries as specified when the spider is run. When encountering errors, Scrapy will retry up to three times before giving up.

Supporting page redirects

Page redirects in Scrapy are handled using redirect middleware, which is enabled by default. The process can be further configured using the following parameters:

  • REDIRECT_ENABLED (True/False - default is True)
  • REDIRECT_MAX_TIMES (the maximum number of redirections to follow for any single request - default is 20)

How to do it

The script in 06/02_scrapy_redirects.py demonstrates how to configure Scrapy to handle redirects. This configures a maximum of two redirects for any page. Running the script reads the NASA sitemap and crawls that content. This contains a large number of redirects, many of which are redirects from HTTP to HTTPS versions of URLs. There will be a lot of output, but here are a few lines demonstrating the output:

    Parsing: <200 https://www.nasa.gov/content/earth-expeditions-above/>
    ['http://www.nasa.gov/content/earth-expeditions-above', 'https://www.nasa.gov/content/earth-expeditions-above']

This particular URL was processed after one redirection, from an HTTP to an HTTPS version of the URL. The list defines all of the URLs that were involved in the redirection. You will also be able to see where redirection exceeded the specified level (2) in the output pages.
The following is one example:

    2017-10-22 17:55:00 [scrapy.downloadermiddlewares.redirect] DEBUG: Discarding <GET http://www.nasa.gov/topics/journeytomars/news/index.html>: max redirections reached

How it works

The spider is defined as the following:

    class Spider(scrapy.spiders.SitemapSpider):
        name = 'spider'
        sitemap_urls = ['https://www.nasa.gov/sitemap.xml']

        def parse(self, response):
            print("Parsing: ", response)
            print(response.request.meta.get('redirect_urls'))

This is identical to our previous NASA sitemap based crawler, with the addition of one line printing the redirect_urls. In any call to parse, this metadata will contain all redirects that occurred to get to this page. The crawling process is configured with the following code:

    process = CrawlerProcess({
        'LOG_LEVEL': 'DEBUG',
        'DOWNLOADER_MIDDLEWARES': {
            "scrapy.downloadermiddlewares.redirect.RedirectMiddleware": 500
        },
        'REDIRECT_ENABLED': True,
        'REDIRECT_MAX_TIMES': 2
    })

Redirect is enabled by default, but this sets the maximum number of redirects to 2 instead of the default of 20.

Waiting for content to be available in Selenium

A common problem with dynamic web pages is that even after the whole page has loaded, and hence the get() method in Selenium has returned, there may still be content that is not yet available because Ajax requests from the page are still pending. An example of this is needing to click a button that is not enabled until all data has been loaded asynchronously after the page itself has loaded. Take the following page as an example: http://the-internet.herokuapp.com/dynamic_loading/2. This page finishes loading very quickly and presents us with a Start button. When the button is pressed, we are presented with a progress bar for five seconds, and when this is completed, we are presented with Hello World! Now suppose we want to scrape this page to get the content that is exposed only after the button is pressed and after the wait?
How do we do this?

How to do it

We can do this using Selenium. We will use two features of Selenium. The first is the ability to click on page elements. The second is the ability to wait until an element with a specific ID is available on the page. First, we get the button and click it. The button's HTML is the following:

    <div id='start'>
        <button>Start</button>
    </div>

When the button is pressed and the load completes, the following HTML is added to the document:

    <div id='finish'>
        <h4>Hello World!</h4>
    </div>

We will use the Selenium driver to find the Start button, click it, and then wait until a div with an ID of 'finish' is available. Then we get that element and return the text in the enclosed <h4> tag. You can try this by running 06/03_press_and_wait.py. Its output will be the following:

    clicked
    Hello World!

Now let's see how it worked.

How it works

Let us break down the explanation. We start by importing the required items from Selenium:

    from selenium import webdriver
    from selenium.webdriver.support import ui

Now we load the driver and the page:

    driver = webdriver.PhantomJS()
    driver.get("http://the-internet.herokuapp.com/dynamic_loading/2")

With the page loaded, we can retrieve the button:

    button = driver.find_element_by_xpath("//*/div[@id='start']/button")

And then we can click the button:

    button.click()
    print("clicked")

Next we create a WebDriverWait object:

    wait = ui.WebDriverWait(driver, 10)

With this object, we can ask Selenium to wait for certain events. The second argument sets a maximum wait of 10 seconds.
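Conceptually, an explicit wait of this kind is a polling loop: a condition is called repeatedly until it returns something truthy or the timeout expires. The following is a minimal standalone sketch of that idea in plain Python. It is not Selenium's implementation, and the names wait_until, WaitTimeout, poll_interval, and ignored are hypothetical, chosen only for illustration.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=10.0, poll_interval=0.5, ignored=(LookupError,)):
    """Call condition() until it returns a truthy value or the timeout elapses.

    Exceptions listed in `ignored` are treated as "not ready yet", much as a
    lookup for a missing element fails before the content has loaded.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            pass  # not ready yet; try again after the poll interval
        if time.monotonic() >= deadline:
            raise WaitTimeout("condition not met within %.2f seconds" % timeout)
        time.sleep(poll_interval)


# Simulated page: the 'finish' element only appears on the third lookup.
attempts = {"count": 0}


def find_finish_element():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("element not present yet")
    return "Hello World!"


text = wait_until(find_finish_element, timeout=2.0, poll_interval=0.01)
print(text)
```

A short poll interval makes the wait return sooner once the content appears, at the cost of more lookups; the timeout bounds how long a scraper can hang on a page that never finishes loading.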
Using the WebDriverWait object, we can wait until we meet a criterion: that an element is identifiable using the following XPath:

    wait.until(lambda driver: driver.find_element_by_xpath("//*/div[@id='finish']"))

When this completes, we can retrieve the h4 element and get its enclosed text:

    finish_element = driver.find_element_by_xpath("//*/div[@id='finish']/h4")
    print(finish_element.text)

Limiting crawling to a single domain

We can tell Scrapy to limit the crawl to only pages within a specified set of domains. This is an important task, as links can point to anywhere on the web, and we often want to control where crawls end up going. Scrapy makes this very easy to do. All that needs to be done is setting the allowed_domains field of your spider class.

How to do it

The code for this example is 06/04_allowed_domains.py. You can run the script with your Python interpreter. It will execute and generate a ton of output, but if you keep an eye on it, you will see that it only processes pages on nasa.gov.

How it works

The code is the same as previous NASA site crawlers except that we include allowed_domains=['nasa.gov']:

    class Spider(scrapy.spiders.SitemapSpider):
        name = 'spider'
        sitemap_urls = ['https://www.nasa.gov/sitemap.xml']
        allowed_domains = ['nasa.gov']

        def parse(self, response):
            print("Parsing: ", response)

The NASA site is fairly consistent with staying within its root domain, but there are occasional links to other sites such as content on boeing.com. This code will prevent moving to those external sites.

Processing infinitely scrolling pages

Many websites have replaced "previous/next" pagination buttons with an infinite scrolling mechanism. These websites use this technique to load more data when the user has reached the bottom of the page. Because of this, strategies for crawling by following the "next page" link fall apart.
While this would seem to be a case for using browser automation to simulate the scrolling, it's actually quite easy to figure out the web pages' Ajax requests and use those for crawling instead of the actual page. Let's look at spidyquotes.herokuapp.com/scroll as an example.

Getting ready

Open http://spidyquotes.herokuapp.com/scroll in your browser. This page will load additional content when you scroll to the bottom of the page. Once the page is open, go into your developer tools and select the network panel. Then, scroll to the bottom of the page. You will see new content in the network panel. When we click on one of the links, we can see the following JSON:

    {
      "has_next": true,
      "page": 2,
      "quotes": [{
        "author": {
          "goodreads_link": "/author/show/82952.Marilyn_Monroe",
          "name": "Marilyn Monroe",
          "slug": "Marilyn-Monroe"
        },
        "tags": ["friends", "heartbreak", "inspirational", "life", "love", "sisters"],
        "text": "\u201cThis life is what you make it...."
      }, {
        "author": {
          "goodreads_link": "/author/show/1077326.J_K_Rowling",
          "name": "J.K. Rowling",
          "slug": "J-K-Rowling"
        },
        "tags": ["courage", "friends"],
        "text": "\u201cIt takes a great deal of bravery to stand up to our enemies, but just as much to stand up to our friends.\u201d"
      },

This is great because all we need to do is continually generate requests to /api/quotes?page=x, increasing x for as long as the reply document reports has_next as true. Once there are no more pages, the flag is no longer set and the crawl can stop.

How to do it

The 06/05_scrapy_continuous.py file contains a Scrapy spider, which crawls this set of pages. Run it with your Python interpreter and you will see output similar to the following (the following is multiple excerpts from the output):

    <200 http://spidyquotes.herokuapp.com/api/quotes?page=2>
    2017-10-29 16:17:37 [scrapy.core.scraper] DEBUG: Scraped from <200 http://spidyquotes.herokuapp.com/api/quotes?page=2>
    {'text': "“This life is what you make it. No matter what, you're going to mess up sometimes, it's a universal truth. But the good part is you get to decide how you're going to mess it up. Girls will be your friends - they'll act like it anyway. But just remember, some come, some go. The ones that stay with you through everything - they're your true best friends. Don't let go of them. Also remember, sisters make the best friends in the world. As for lovers, well, they'll come and go too. And baby, I hate to say it, most of them - actually pretty much all of them are going to break your heart, but you can't give up because if you give up, you'll never find your soulmate. You'll never find that half who makes you whole and that goes for everything. Just because you fail once, doesn't mean you're gonna fail at everything. Keep trying, hold on, and always, always, always believe in yourself, because if you don't, then who will, sweetie? So keep your head high, keep your chin up, and most importantly, keep smiling, because life's a beautiful thing and there's so much to smile about.”", 'author': 'Marilyn Monroe', 'tags': ['friends', 'heartbreak', 'inspirational', 'life', 'love', 'sisters']}
    2017-10-29 16:17:37 [scrapy.core.scraper] DEBUG: Scraped from <200 http://spidyquotes.herokuapp.com/api/quotes?page=2>
    {'text': '“It takes a great deal of bravery to stand up to our enemies, but just as much to stand up to our friends.”', 'author': 'J.K. Rowling', 'tags': ['courage', 'friends']}
    2017-10-29 16:17:37 [scrapy.core.scraper] DEBUG: Scraped from <200 http://spidyquotes.herokuapp.com/api/quotes?page=2>
    {'text': "“If you can't explain it to a six year old, you don't understand it yourself.”", 'author': 'Albert Einstein', 'tags': ['simplicity', 'understand']}

When this gets to page 10 it will stop, as it will see that there is no next page flag set in the content.

How it works

Let's walk through the spider to see how this works.
The spider starts with the following definition of the start URL:

    class Spider(scrapy.Spider):
        name = 'spidyquotes'
        quotes_base_url = 'http://spidyquotes.herokuapp.com/api/quotes'
        start_urls = [quotes_base_url]
        download_delay = 1.5

The parse method then prints the response and also parses the JSON into the data variable:

        def parse(self, response):
            print(response)
            data = json.loads(response.body)

Then it loops through all the items in the quotes element of the JSON objects. For each item, it yields a new Scrapy item back to the Scrapy engine:

            for item in data.get('quotes', []):
                yield {
                    'text': item.get('text'),
                    'author': item.get('author', {}).get('name'),
                    'tags': item.get('tags'),
                }

It then checks the 'has_next' property of the data JSON variable, and if it is set, it gets the next page and yields a new request back to Scrapy to parse the next page:

            if data['has_next']:
                next_page = data['page'] + 1
                yield scrapy.Request(self.quotes_base_url + "?page=%s" % next_page)

There's more...

It is also possible to process infinite, scrolling pages using Selenium. The following code is in 06/06_scrape_continuous_twitter.py:

    from selenium import webdriver
    import time

    driver = webdriver.PhantomJS()
    print("Starting")
    driver.get("https://twitter.com")
    scroll_pause_time = 1.5

    # Get scroll height
    last_height = driver.execute_script("return document.body.scrollHeight")
    while True:
        print(last_height)
        # Scroll down to bottom
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        # Wait to load page
        time.sleep(scroll_pause_time)
        # Calculate new scroll height and compare with last scroll height
        new_height = driver.execute_script("return document.body.scrollHeight")
        print(new_height, last_height)
        if new_height == last_height:
            break
        last_height = new_height

The output would be similar to the following:

    Starting
    4882
    8139 4882
    8139
    11630 8139
    11630
    15055 11630
    15055
    15055 15055
    Process finished with exit code 0

This code starts by loading the page from Twitter. The call to .get() will return when the page is fully loaded. The scrollHeight is then retrieved, and the program scrolls to that height and waits for a moment for the new content to load. The scrollHeight of the browser is retrieved again, and if it is different from last_height, the program loops and continues processing. If it is the same as last_height, no new content has loaded and you can then continue on and retrieve the HTML for the completed page.

We have discussed the common challenges faced in performing web scraping using Python and got to know their workarounds. If you liked this post, be sure to check out Web Scraping with Python, which consists of useful recipes to work with Python and perform efficient web scraping.
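The has_next pagination logic from the infinite-scrolling recipe can also be sketched without Scrapy. In the following minimal example, fetch_page is a hypothetical stand-in for an HTTP request to /api/quotes?page=x, and FAKE_PAGES is made-up data shaped like the JSON shown earlier; only the loop structure is the point.

```python
# Canned responses shaped like the JSON from /api/quotes?page=x (made-up data).
FAKE_PAGES = {
    1: {"has_next": True, "page": 1,
        "quotes": [{"author": {"name": "Author A"}, "text": "Quote 1", "tags": ["a"]}]},
    2: {"has_next": True, "page": 2,
        "quotes": [{"author": {"name": "Author B"}, "text": "Quote 2", "tags": []}]},
    3: {"has_next": False, "page": 3,
        "quotes": [{"author": {"name": "Author C"}, "text": "Quote 3", "tags": ["c"]}]},
}


def fetch_page(page):
    """Hypothetical stand-in for requesting and decoding one page of JSON."""
    return FAKE_PAGES[page]


def iter_quotes(fetch, first_page=1):
    """Yield flattened quote items, following pages while has_next is true."""
    page = first_page
    while True:
        data = fetch(page)
        for item in data.get("quotes", []):
            yield {
                "text": item.get("text"),
                "author": item.get("author", {}).get("name"),
                "tags": item.get("tags"),
            }
        # .get() treats a missing has_next key the same as a false one.
        if not data.get("has_next"):
            break
        page = data["page"] + 1


quotes = list(iter_quotes(fetch_page))
print(len(quotes), [q["author"] for q in quotes])
```

Using data.get("has_next") rather than data["has_next"] is a defensive choice for this sketch: the loop ends cleanly whether the last page reports the flag as false or omits it entirely.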
Guest Contributor
09 May 2019
10 min read

Why DeepMind AlphaGo Zero is a game changer for AI research

DeepMind, a London-based artificial intelligence (AI) company currently owned by Alphabet, recently made great strides in AI with its AlphaGo program. It all began in October 2015 when the program beat the European Go champion Fan Hui 5-0 in a five-game match. This was the very first time an AI defeated a professional Go player. Earlier, computers were only known to have played Go at the "amateur" level. Then, the company made headlines again in 2016 after its AlphaGo program beat Lee Sedol, a professional Go player (a world champion), with a score of 4-1 in a five-game match. Furthermore, in late 2017, an improved version of the program called AlphaGo Zero defeated AlphaGo 100 games to 0. The best part? AlphaGo Zero's strategies were self-taught, i.e. it was trained without any data from human games. AlphaGo Zero was able to defeat its predecessor after only three days of training, and with less processing power than AlphaGo. The original AlphaGo, on the other hand, required months to learn how to play.

All these facts raise the questions: what makes AlphaGo Zero so exceptional? Why is it such a big deal? How does it even work? So, without further ado, let’s dive into the what, why, and how of DeepMind’s AlphaGo Zero.

What is DeepMind AlphaGo Zero?

Simply put, AlphaGo Zero is the strongest Go program in the world (with the exception of AlphaZero). As mentioned before, it monumentally outperforms all previous versions of AlphaGo. Just check out the graph below, which compares the Elo rating of the different versions of AlphaGo.

Source: DeepMind

The Elo rating system is a method for calculating the relative skill levels of players in zero-sum games such as chess and Go. It is named after its creator Arpad Elo, a Hungarian-American physics professor.

Now, all previous versions of AlphaGo were trained using human data. The previous versions learned and improved upon the moves played by human experts/professional Go players. But AlphaGo Zero didn’t use any human data whatsoever.
Instead, it had to learn completely from playing against itself. According to DeepMind's Professor David Silver, the reason that playing against itself enables it to do so much better than using strong human data is that AlphaGo always has an opponent of just the right level. So it starts off extremely naive, with perfectly random play. And yet at every step of the learning process, it has an opponent (a “sparring partner”) that’s exactly calibrated to its current level of performance. That is, to begin with, these players are terribly weak but over time they become progressively stronger and stronger.

Why is reinforcement learning such a big deal?

People tend to assume that machine learning is all about big data and massive amounts of computation. But actually, with AlphaGo Zero, AI scientists at DeepMind realized that algorithms matter much more than computing power or data availability. AlphaGo Zero required less computation than previous versions and yet it was able to perform at a much higher level because it used much more principled algorithms than before. It is a system that is trained completely from scratch: starting from random behavior, it progresses from first principles and discovers, tabula rasa, how to play the game of Go. It is, therefore, no longer constrained by the limits of human knowledge. Note that AlphaGo Zero did not use zero-shot learning, which is essentially the ability of a machine to solve a task despite not having received any training for that task.

How does it work?

AlphaGo Zero is able to achieve all this by employing a novel form of reinforcement learning, in which AlphaGo Zero becomes its own teacher. As explained previously, the system starts off with a single neural network that knows absolutely nothing about the game of Go. By combining this neural network with a powerful search algorithm, it then plays games against itself.
As it plays more and more games, the neural network is updated and tuned to predict moves, and even the eventual winner of the games. This revised neural network is then recombined with the search algorithm to generate a new, stronger version of AlphaGo Zero, and the process repeats. With each iteration, the performance of the system improves and the quality of the self-play games advances, leading to increasingly accurate neural networks and ever-more powerful versions of AlphaGo Zero.

Now, let’s dive into some of the technical details that make this version of AlphaGo so much better than all its forerunners. AlphaGo Zero's neural network was trained using TensorFlow, with 64 GPU workers and 19 CPU parameter servers. Only four Tensor Processing Units (TPUs) were used for inference. And of course, the neural network initially knew nothing about Go beyond the rules.

Both AlphaGo and AlphaGo Zero took the same general approach to playing Go. Both evaluated the Go board and chose moves using a combination of two methods:

  • Conducting a “lookahead” search: This means looking ahead several moves by simulating games, and hence seeing which current move is most likely to lead to a “good” position in the future.
  • Assessing positions based on an “intuition” of whether a position is “good” or “bad” and is likely to result in a win or a loss.

Go is a truly intricate game, which means computers can’t merely search all possible moves using a brute force approach to discover the best one.

Method 1: Lookahead

Before AlphaGo, all the finest Go programs tackled this issue by using “Monte Carlo Tree Search” or MCTS. This process involves initially exploring numerous possible moves on the board and then focusing this search over time as certain moves are found to be more likely to result in wins than others.
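One common way to get this "explore widely, then focus" behaviour is the UCB1 selection rule used in the UCT variant of MCTS. The sketch below applies it to a single position only (a one-level tree) and replaces real playouts with coin flips, so it is a simplified illustration under those assumptions rather than AlphaGo's search. The move names and win probabilities are made up.

```python
import math
import random


def ucb1(wins, visits, total_visits, c=1.4):
    """UCB1 score: average win rate plus an exploration bonus."""
    if visits == 0:
        return float("inf")  # always try an unvisited move first
    return wins / visits + c * math.sqrt(math.log(total_visits) / visits)


def search(true_win_prob, n_simulations=2000, seed=0):
    """Spend simulations on the candidate moves at a single position.

    true_win_prob maps each move to a made-up chance that a random playout
    after that move ends in a win; a real program would play the game out.
    """
    rng = random.Random(seed)
    wins = {move: 0 for move in true_win_prob}
    visits = {move: 0 for move in true_win_prob}
    for t in range(1, n_simulations + 1):
        # Selection: pick the move with the highest UCB1 score.
        move = max(true_win_prob, key=lambda m: ucb1(wins[m], visits[m], t))
        # Simulation: a coin flip stands in for a random playout.
        if rng.random() < true_win_prob[move]:
            wins[move] += 1
        # Backpropagation (trivial here, because the tree has one level).
        visits[move] += 1
    return visits


visits = search({"A": 0.3, "B": 0.4, "C": 0.8})
print(visits)
```

Every move is tried at least once, but as evidence accumulates the exploration bonus shrinks and most simulations go to the move that keeps winning, which is the focusing effect described above.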
Source: LOC

Both AlphaGo and AlphaGo Zero apply a fairly elementary version of MCTS for their “lookahead”, to maintain the tradeoff between exploring new sequences of moves and exploring already-explored sequences more deeply. Although MCTS has been at the heart of all effective Go programs preceding AlphaGo, it was DeepMind’s smart coalescence of this method with a neural network-based “intuition” that enabled it to attain superhuman performance.

Method 2: Intuition

DeepMind’s pivotal innovation with AlphaGo was to utilize deep neural networks to identify the state of the game and then use this knowledge to effectively guide the search of the MCTS. In particular, they trained networks that could record:

  • The current board position
  • Which player was playing
  • The sequence of recent moves (in order to rule out certain moves as “illegal”)

With this data, the neural networks could propose:

  • Which move should be played
  • If the current player is likely to win or not

So how did DeepMind train neural networks to do this? Well, AlphaGo and AlphaGo Zero used rather different approaches in this case. AlphaGo had two separately trained neural networks: Policy Network and Value Network.

Source: AlphaGo’s Nature Paper

DeepMind then fused these two neural networks with MCTS (that is, the program’s “intuition” with its brute force “lookahead” search) in an ingenious way. It used the networks that had been trained to predict:

  • Moves, to guide which branches of the game tree to search
  • Whether a position was “winning”, to assess the positions it encountered during its search

This let AlphaGo intelligently search upcoming moves and eventually beat the world champion Lee Sedol. AlphaGo Zero, however, took this principle to the next level. Its neural network’s “intuition” was trained entirely differently from that of AlphaGo.
More specifically:

  • The neural network was trained to play moves that exhibited the improved evaluations from performing the “lookahead” search
  • The neural network was tweaked so that it was more likely to play moves like those that led to wins and less likely to play moves similar to those that led to losses during the self-play games

Much was made of the fact that no games between humans were used to train AlphaGo Zero. Thus, a Go agent in any given state can constantly be made smarter by performing MCTS-based lookahead and using the results of that lookahead to upgrade the agent. This is how AlphaGo Zero was able to perpetually improve, from when it was an “amateur” all the way up to when it was better than the best human players.

Moreover, AlphaGo Zero’s neural network architecture can be referred to as a “two-headed” architecture.

Source: Hacker Noon

Its first 20 layers were “blocks” of a type typically seen in modern neural net architectures. These layers were followed by two “heads”:

  • One head that took the output of the first 20 layers and presented probabilities of the Go agent making certain moves
  • Another head that took the output of the first 20 layers and generated a probability of the current player winning

What’s more, AlphaGo Zero used a more “state of the art” neural network architecture than AlphaGo. In particular, it used a “residual” neural network architecture rather than a plainly “convolutional” architecture. Deep residual learning was pioneered by Microsoft Research in late 2015, right around the time work on the first version of AlphaGo would have been concluded. So, it is quite reasonable that DeepMind did not use it in the initial AlphaGo program.
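The "two-headed" idea with residual connections can be sketched in a few lines of plain Python. This is a toy illustration under loud assumptions: it uses small dense layers instead of convolutions, random untrained weights, two blocks instead of twenty, and a made-up 3x3 board. The class name TwoHeadedNet and the helper functions are hypothetical.

```python
import math
import random


def matvec(weights, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def relu(vector):
    return [max(0.0, v) for v in vector]


def softmax(vector):
    peak = max(vector)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in vector]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng, scale=0.1):
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


class TwoHeadedNet:
    """Toy dense network: shared residual blocks, then policy and value heads."""

    def __init__(self, n_inputs, n_moves, n_blocks=2, seed=0):
        rng = random.Random(seed)
        # Each residual block holds two square weight matrices.
        self.blocks = [
            (random_matrix(n_inputs, n_inputs, rng),
             random_matrix(n_inputs, n_inputs, rng))
            for _ in range(n_blocks)
        ]
        self.policy_weights = random_matrix(n_moves, n_inputs, rng)
        self.value_weights = random_matrix(1, n_inputs, rng)

    def forward(self, board):
        hidden = list(board)
        for first, second in self.blocks:
            # Residual connection: add the block's input back to its output.
            transformed = matvec(second, relu(matvec(first, hidden)))
            hidden = relu([h + t for h, t in zip(hidden, transformed)])
        # Policy head: one probability per candidate move.
        move_probs = softmax(matvec(self.policy_weights, hidden))
        # Value head: a single probability that the current player wins.
        logit = matvec(self.value_weights, hidden)[0]
        win_prob = 1.0 / (1.0 + math.exp(-logit))
        return move_probs, win_prob


net = TwoHeadedNet(n_inputs=9, n_moves=9)
board = [1.0, 0.0, -1.0, 0.0, 1.0, 0.0, 0.0, -1.0, 0.0]  # made-up 3x3 position
move_probs, win_prob = net.forward(board)
print(len(move_probs), round(sum(move_probs), 6), 0.0 < win_prob < 1.0)
```

Both heads read the same shared representation, so whatever the body learns about a position serves move selection and position evaluation at once, instead of being learned twice in two separate networks.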
Notably, each of these two neural network-related changes (switching from a convolutional to the more advanced residual architecture, and using the “two-headed” neural network architecture instead of separate neural networks) would have resulted in nearly half of the increase in playing strength that was realized when both were coupled.

Source: AlphaGo’s Nature Paper

Wrapping it up

According to DeepMind: “After just three days of self-play training, AlphaGo Zero emphatically defeated the previously published version of AlphaGo - which had itself defeated 18-time world champion Lee Sedol - by 100 games to 0. After 40 days of self-training, AlphaGo Zero became even stronger, outperforming the version of AlphaGo known as “Master”, which has defeated the world's best players and world number one Ke Jie. Over the course of millions of AlphaGo vs AlphaGo games, the system progressively learned the game of Go from scratch, accumulating thousands of years of human knowledge during a period of just a few days. AlphaGo Zero also discovered new knowledge, developing unconventional strategies and creative new moves that echoed and surpassed the novel techniques it played in the games against Lee Sedol and Ke Jie.”

Further, the founder and CEO of DeepMind, Dr. Demis Hassabis, believes AlphaGo's algorithms are likely to be of most benefit to areas that need an intelligent search through an immense space of possibilities.

Author Bio

Gaurav is a Senior SEO and Content Marketing Analyst at The 20 Media, a Content Marketing agency that specializes in data-driven SEO. He has more than seven years of experience in Digital Marketing and loves to read and write about AI, Machine Learning, Data Science, and other emerging technologies. In his spare time, he enjoys watching movies and listening to music. Connect with him on Twitter and LinkedIn.

Amey Varangaonkar
27 Apr 2018
5 min read

Top 10 MySQL 8 performance benchmarking aspects to know

The following excerpt is taken from the book MySQL 8 Administrator’s Guide, co-authored by Chintan Mehta, Ankit Bhavsar, Hetal Oza and Subhash Shah. This book presents an in-depth view of the newly released features of MySQL 8 and how you can leverage them to administer a high-performance MySQL solution.

Following the best practices for the configuration of MySQL helps us design and manage an efficient database; they are the cherry on top, without which the setup might seem a bit incomplete. In addition to configuration, benchmarking helps us validate the database system, find its bottlenecks, and address them. In this article, we look at specific areas that will help us understand the best practices for configuration and performance benchmarking.

1. Resource utilization

IO activity, CPU, and memory usage are metrics that you should not miss. They help us know how the system is performing during benchmarking and at the time of scaling. They also help us derive the impact per transaction.

2. Stretching your benchmarking timelines

We may often like to have a quick glance at performance metrics; however, ensuring that MySQL behaves in the same way over a longer duration of testing is also a key element. Several basic factors might affect performance when you stretch your benchmark timelines, such as memory fragmentation, degradation of IO, impact after data accumulation, cache management, and so on. We don't want our database to get restarted just to clean up junk items, correct? Therefore, it is suggested to run benchmarks for a long duration to validate stability and performance.

3. Replicating production settings

Let's benchmark in a production-replicated environment. Wait! Let's disable database replication in a replica environment until we are done with benchmarking. Gotcha! We have got some good numbers!
It often happens that we don't completely simulate everything that we are going to configure in the production environment. That could prove to be costly, as we might unintentionally be benchmarking a setup that has an adverse impact once it's in production. Replicate production settings, data, workload, and so on in your replicated environment while you do benchmarking.

4. Consistency of throughput and latency

Throughput and latency go hand in hand. It is important to keep your eyes primarily focused on throughput; however, latency over time might be something to look out for. Performance dips, slowness, or stalls were noticed in InnoDB in its earlier days. It has improved a lot since then, but as there might be other cases depending on your workload, it is always good to keep an eye on throughput along with latency.

5. Sysbench can do more

Sysbench is a wonderful tool to simulate your workloads, whether they involve thousands of tables, intensive transactions, in-memory data, and so on. It simulates them well and gives you a nice representation of the results.

6. Virtualization world

I would like to keep this simple; bare metal and virtualized environments aren't the same. Hence, while doing benchmarking, measure your resources according to your environment. You might be surprised to see the difference in results if you compare both.

7. Concurrency

Big data rests on heavy data workloads, so high concurrency is important. MySQL 8 is extending its maximum CPU core support in every new release, so take care to optimize concurrency based on your requirements and hardware resources.

8. Hidden workloads

Do not miss factors that run in the background while you are benchmarking, such as reporting for big data analytics, backups, and on-the-fly operations. The impact of such hidden workloads or obsolete benchmarking workloads can make your days (and nights) miserable.

9. Nerves of your query

Oops! Did we miss the optimizer? Not yet.
An optimizer is a powerful tool that will read the nerves of your query and provide recommendations. It's a tool that I use before making changes to a query in production. It's a savior when you have complex queries to be optimized. These are a few areas that we should look out for. Let's now look at a few benchmarks that we did on MySQL 8 and compare them with the ones on MySQL 5.7.

10. Benchmarks

To start with, let's fetch all the column names from all the InnoDB tables. The following is the query that we executed:

    SELECT t.table_schema, t.table_name, c.column_name
    FROM information_schema.tables t, information_schema.columns c
    WHERE t.table_schema = c.table_schema
      AND t.table_name = c.table_name
      AND t.engine = 'InnoDB';

The benchmark figure for this query shows that MySQL 8 performed a thousand times faster when having four instances.

Following this, we also performed a benchmark to find static table metadata. The following is the query that we executed:

    SELECT TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE, ENGINE, ROW_FORMAT
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_SCHEMA LIKE 'chintan%';

The benchmark figure for this query shows that MySQL 8 performed around 30 times faster than MySQL 5.7.

It made us eager to go into a bit more detail. So, we thought of doing one last test to find dynamic table metadata. The following is the query that we executed:

    SELECT TABLE_ROWS
    FROM INFORMATION_SCHEMA.TABLES
    WHERE TABLE_SCHEMA LIKE 'chintan%';

The benchmark figure for this query shows that MySQL 8 performed around 30 times faster than MySQL 5.7.

MySQL 8.0 brings enormous performance improvement to the table. Scaling from one to a million tables is a need for big data requirements, and it is now achievable. We look forward to more benchmarks being officially released once MySQL 8 is generally available. If you found this post useful, make sure to check out the book MySQL 8 Administrator’s Guide for more tips and tricks to manage MySQL 8 effectively.
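The advice on stretching benchmark timelines and watching throughput alongside latency can be sketched as a small harness that reports each time window separately, so that drift over a long run becomes visible. This is a minimal sketch, not a replacement for a tool such as Sysbench. The names run_benchmark, summarise, and fake_query are hypothetical, and fake_query is only a CPU-bound stand-in for executing a real query.

```python
import statistics
import time


def summarise(latencies, elapsed_s):
    """Summarise one window of per-request latencies (in seconds)."""
    ordered = sorted(latencies)
    p95_index = int(0.95 * (len(ordered) - 1))  # simple rank-based estimate
    return {
        "requests": len(ordered),
        "throughput_per_s": len(ordered) / max(elapsed_s, 1e-9),
        "median_latency_s": statistics.median(ordered),
        "p95_latency_s": ordered[p95_index],
    }


def run_benchmark(workload, duration_s, window_s):
    """Run workload() repeatedly and summarise each time window separately."""
    windows = []
    latencies = []
    start = time.perf_counter()
    window_start = start
    while True:
        began = time.perf_counter()
        workload()
        ended = time.perf_counter()
        latencies.append(ended - began)
        if ended - window_start >= window_s:
            windows.append(summarise(latencies, ended - window_start))
            latencies = []
            window_start = ended
        if ended - start >= duration_s:
            break
    if latencies:  # flush the final, partial window
        windows.append(summarise(latencies, time.perf_counter() - window_start))
    return windows


def fake_query():
    """Stand-in for executing one query against the database."""
    return sum(range(1000))


windows = run_benchmark(fake_query, duration_s=0.2, window_s=0.05)
for index, window in enumerate(windows):
    print(index, window["requests"], round(window["throughput_per_s"]))
```

For a real run, the duration would be hours rather than a fraction of a second, and comparing the first and last windows shows whether throughput or tail latency degrades as data accumulates.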

Expert Network
30 Apr 2021
7 min read

Distributed training in TensorFlow 2.x

TensorFlow 2 is a rich development ecosystem composed of two main parts: Training and Serving. Training consists of a set of libraries for dealing with datasets (tf.data), a set of libraries for building models, including high-level libraries (tf.keras and Estimators), low-level libraries (tf.*), and a collection of pretrained models (TensorFlow Hub). Training can happen on CPUs, GPUs, and TPUs via distribution strategies, and the result can be saved using the appropriate libraries.

This article is an excerpt from the book, Deep Learning with TensorFlow 2 and Keras, Second Edition by Antonio Gulli, Amita Kapoor, and Sujit Pal. This book teaches deep learning techniques alongside TensorFlow (TF) and Keras. In this article, we’ll review the addition of the powerful new feature, distributed training, in TensorFlow 2.x.

One very useful addition to TensorFlow 2.x is the possibility to train models using distributed GPUs, multiple machines, and TPUs in a very simple way with very few additional lines of code. tf.distribute.Strategy is the TensorFlow API used in this case, and it supports both the tf.keras and tf.estimator APIs as well as eager execution. You can switch between GPUs, TPUs, and multiple machines by just changing the strategy instance. Strategies can be synchronous, where all workers train over different slices of input data in a form of sync data parallel computation, or asynchronous, where updates from the optimizers are not happening in sync. All strategies require that data is loaded in batches via the tf.data.Dataset API.

Note that the distributed training support is still experimental. A roadmap is given in Figure 1:

Figure 1: Distributed training support for different strategies and APIs

Let’s discuss in detail all the different strategies reported in Figure 1.

Multiple GPUs

TensorFlow 2.x can utilize multiple GPUs.
If we want to have synchronous distributed training on multiple GPUs on one machine, there are two things that we need to do: (1) we need to load the data in a way that can be distributed across the GPUs, and (2) we need to distribute some computations across the GPUs too.

In order to load our data in a way that can be distributed across the GPUs, we simply need a tf.data.Dataset (which has already been discussed in the previous paragraphs). If we do not have a tf.data.Dataset but we have a normal tensor, then we can easily convert the latter into the former using tf.data.Dataset.from_tensor_slices(). This will take a tensor in memory and return a source dataset, the elements of which are slices of the given tensor. In our toy example, we use NumPy to generate training data x and labels y, and we transform them into a tf.data.Dataset with tf.data.Dataset.from_tensor_slices(). Then we apply a shuffle to avoid bias in training across GPUs and then generate batches of size SIZE_BATCHES:

    import tensorflow as tf
    import numpy as np
    from tensorflow import keras

    N_TRAIN_EXAMPLES = 1024*1024
    N_FEATURES = 10
    SIZE_BATCHES = 256

    # 10 random floats in the half-open interval [0.0, 1.0).
    x = np.random.random((N_TRAIN_EXAMPLES, N_FEATURES))
    y = np.random.randint(2, size=(N_TRAIN_EXAMPLES, 1))
    x = tf.dtypes.cast(x, tf.float32)
    print(x)

    dataset = tf.data.Dataset.from_tensor_slices((x, y))
    dataset = dataset.shuffle(buffer_size=N_TRAIN_EXAMPLES).batch(SIZE_BATCHES)

In order to distribute some computations to GPUs, we instantiate a distribution = tf.distribute.MirroredStrategy() object, which supports synchronous distributed training on multiple GPUs on one machine. Then, we move the creation and compilation of the Keras model inside distribution.scope(). Note that each variable in the model is mirrored across all the replicas.
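To make the idea of synchronous, mirrored training more concrete, here is a minimal conceptual sketch in plain Python. It is not TensorFlow's implementation: the one-parameter model, the helper names mse_gradient and mirrored_step, and the plain average standing in for the cross-replica reduction are all assumptions made for illustration. It shows that sharding a batch evenly across replicas and averaging their gradients gives the same update as a single device processing the whole batch.

```python
import math


def mse_gradient(w, xs, ys):
    """Gradient of the mean squared error for the toy model y_hat = w * x."""
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)


def mirrored_step(w, xs, ys, num_replicas, learning_rate):
    """One synchronous step: shard the batch, then average per-replica gradients.

    Hypothetical helper; assumes the batch divides evenly across replicas.
    """
    shard = len(xs) // num_replicas
    grads = [
        mse_gradient(w, xs[i * shard:(i + 1) * shard], ys[i * shard:(i + 1) * shard])
        for i in range(num_replicas)
    ]
    avg_grad = sum(grads) / num_replicas  # stand-in for the cross-replica reduction
    # Every replica applies the same update, so the mirrored copies of w stay identical.
    return w - learning_rate * avg_grad, shard


xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
ys = [2.0 * x for x in xs]

w_mirrored, per_replica = mirrored_step(0.0, xs, ys, num_replicas=2, learning_rate=0.01)
w_single = 0.0 - 0.01 * mse_gradient(0.0, xs, ys)

print(per_replica, math.isclose(w_mirrored, w_single))
```

The equality holds here because the shards are the same size; with uneven shards a weighted average would be needed.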
Let’s see it in our toy example:

    # this is the distribution strategy
    distribution = tf.distribute.MirroredStrategy()

    # this piece of code is distributed to multiple GPUs
    with distribution.scope():
        model = tf.keras.Sequential()
        model.add(tf.keras.layers.Dense(16, activation='relu', input_shape=(N_FEATURES,)))
        model.add(tf.keras.layers.Dense(1, activation='sigmoid'))
        optimizer = tf.keras.optimizers.SGD(0.2)
        model.compile(loss='binary_crossentropy', optimizer=optimizer)

    model.summary()

    # Optimize in the usual way but in reality you are using GPUs.
    model.fit(dataset, epochs=5, steps_per_epoch=10)

Note that each batch of the given input is divided equally among the multiple GPUs. For instance, if using MirroredStrategy() with two GPUs, each batch of size 256 will be divided among the two GPUs, with each of them receiving 128 input examples for each step. In addition, note that each GPU will optimize on the received batches and the TensorFlow backend will combine all these independent optimizations on our behalf. In short, using multiple GPUs is very easy and requires minimal changes to the tf.keras code used for a single server.

MultiWorkerMirroredStrategy

This strategy implements synchronous distributed training across multiple workers, each one with potentially multiple GPUs. As of September 2019 the strategy works only with Estimators and it has experimental support for tf.keras. This strategy should be used if you are aiming at scaling beyond a single machine with high performance. Data must be loaded with tf.data.Dataset and shared across workers so that each worker can read a unique subset.

TPUStrategy

This strategy implements synchronous distributed training on TPUs. TPUs are Google’s specialized ASIC chips designed to significantly accelerate machine learning workloads in a way often more efficient than GPUs.
According to this public information (https://github.com/tensorflow/tensorflow/issues/24412):

“the gist is that we intend to announce support for TPUStrategy alongside Tensorflow 2.1. Tensorflow 2.0 will work under limited use-cases but has many improvements (bug fixes, performance improvements) that we’re including in Tensorflow 2.1, so we don’t consider it ready yet.”

ParameterServerStrategy

This strategy implements either multi-GPU synchronous local training or asynchronous multi-machine training. For local training on one machine, the variables of the models are placed on the CPU and operations are replicated across all local GPUs. For multi-machine training, some machines are designated as workers and some as parameter servers, with the variables of the model placed on parameter servers. Computation is replicated across all GPUs of all workers. Multiple workers can be set up with the environment variable TF_CONFIG as in the following example:

    import json
    import os

    os.environ["TF_CONFIG"] = json.dumps({
        "cluster": {
            "worker": ["host1:port", "host2:port", "host3:port"],
            "ps": ["host4:port", "host5:port"]
        },
        "task": {"type": "worker", "index": 1}
    })

In this article, we have seen how it is possible to train models using distributed GPUs, multiple machines, and TPUs in a very simple way with very few additional lines of code. Learn how to build machine and deep learning systems with the newly released TensorFlow 2 and Keras for the lab, production, and mobile devices with Deep Learning with TensorFlow 2 and Keras, Second Edition by Antonio Gulli, Amita Kapoor and Sujit Pal.

About the Authors

Antonio Gulli is a software executive and business leader with a passion for establishing and managing global technological talent, innovation, and execution. He is an expert in search engines, online services, machine learning, information retrieval, analytics, and cloud computing.
Amita Kapoor is an Associate Professor in the Department of Electronics, SRCASW, University of Delhi and has been actively teaching neural networks and artificial intelligence for the last 20 years. She is an active member of ACM, AAAI, IEEE, and INNS. She has co-authored two books.   Sujit Pal is a technology research director at Elsevier Labs, working on building intelligent systems around research content and metadata. His primary interests are information retrieval, ontologies, natural language processing, machine learning, and distributed processing. He is currently working on image classification and similarity using deep learning models. He writes about technology on his blog at Salmon Run. 

Vincy Davis
07 Jun 2019
6 min read

Worried about Deepfakes? Check out the new algorithm that manipulates talking-head videos by altering the transcripts

Last week, a team of researchers from Stanford University, Max Planck Institute for Informatics, Princeton University and Adobe Research published a paper titled “Text-based Editing of Talking-head Video”. This paper proposes a method to edit a talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified. Basically, the editor modifies a video using a text transcript to add new words, delete unwanted ones, or completely rearrange the pieces by dragging and dropping. The resulting video maintains a seamless audio-visual flow, without any jump cuts, and looks almost flawless to the untrained eye.

The researchers want this kind of text-based editing approach to lay the foundation for better editing tools in the post-production of movies and television. Actors often botch small bits of performance or leave out a critical word. This algorithm can help video editors fix that, which until now has involved expensive reshoots. It can also help in easy adaptation of audio-visual video content to specific target audiences. The tool supports three types of edit operations: adding new words, rearranging existing words, and deleting existing words. Ohad Fried, one of the researchers behind the paper, says that “This technology is really about better storytelling. Instructional videos might be fine-tuned to different languages or cultural backgrounds, for instance, or children’s stories could be adapted to different ages.” https://youtu.be/0ybLCfVeFL4

How does the application work?

The method uses an input talking-head video and a transcript to perform text-based editing. The first step is to align phonemes to the input audio and track each input frame to construct a parametric head model. Next, a 3D parametric face model is registered to each frame of the input talking-head video. This helps in selectively blending different aspects of the face. Then, a background sequence is selected and is used for pose data and background pixels.
The background sequence allows editors to edit challenging videos with hair movement and slight camera motion. As facial expressions are an important parameter, the researchers have tried to preserve the retrieved expression parameters as much as possible by smoothing out the transition between them. This produces an edited parameter sequence, which describes the new desired facial motion, and a corresponding retimed background video clip. This is forwarded to a ‘neural face rendering’ approach. This step changes the facial motion of the retimed background video to match the parameter sequence. Thus the rendering procedure produces photo-realistic video frames of the subject, appearing to speak the new phrase. These localized edits blend seamlessly into the original video, producing an edited result. Lastly, to add the audio, the resulting video is retimed to match the recording at the level of phones. The researchers have used the performer’s own voice in all their synthesis results.

Image Source: Text-based Editing of Talking-head Video

The researchers have tested the system with a series of complex edits including adding, removing and changing words, as well as translations to different languages. When the application was tried in a crowd-sourced study with 138 participants, the edits were rated as “real” almost 60% of the time. Fried said that “The visual quality is such that it is very close to the original, but there’s plenty of room for improvement.”

Ethical considerations: Erosion of truth, confusion and defamation

Even though the application is quite useful for video editors and producers, it raises important and valid concerns about its potential for misuse. The researchers have also agreed that such a technology might be used for illicit purposes. “We acknowledge that bad actors might use such technologies to falsify personal statements and slander prominent individuals.
We are concerned about such deception and misuse.” They have recommended certain precautions to avoid deception and misuse, such as watermarking. “The fact that the video is synthesized may be obvious by context, directly stated in the video or signaled via watermarking. We also believe that it is essential to obtain permission from the performers for any alteration before sharing a resulting video with a broad audience.” They urge the community to continue to develop forensics, fingerprinting and verification techniques to identify manipulated video. They also support the creation of appropriate regulations and laws that would balance the risks of misuse of these tools against the importance of creative, consensual use cases.

The public, however, remains dubious, pointing out valid arguments for why the ‘Ethical Concerns’ discussed in the paper fall short. A user on Hacker News comments, “The "Ethical concerns" section in the article feels like a punt. The author quoting "this technology is really about better storytelling" is aspirational -- the technology's story will be written by those who use it, and you can bet people will use this maliciously.” https://twitter.com/glenngabe/status/1136667296980701185

Another user feels that this kind of technology will only result in “slow erosion of video evidence being trustworthy”. Others have pointed out that the kind of transformation described in the paper does not fall under the broad category of ‘video-editing’: ‘We need more words to describe this new landscape’ https://twitter.com/BrianRoemmele/status/1136710962348617728

Another common argument is that the algorithm can be used to generate terrifyingly real Deepfake videos. A recent example of a ‘shallow fake’ is the altered video of Nancy Pelosi, which was slowed down to make it appear she was slurring her words. Facebook was criticized for not acting faster to slow the video’s spread.
Not just altering speeches of politicians, altered videos like these can also, for instance, be used to create fake emergency alerts, or disrupt elections by dropping a fake video of one of the candidates before voting starts. There is also the issue of defaming someone on a personal capacity. Sam Gregory, Program Director at Witness, tweets that one of the main steps in ensuring effective use of such tools would be to “ensure that any commercialization of synthetic media tools has equal $ invested in detection/safeguards as in detection.; and to have a grounded conversation on trade-offs in mitigation”. He has also listed more interesting recommendations. https://twitter.com/SamGregory/status/1136964998864015361 For more details, we recommend you to read the research paper. OpenAI researchers have developed Sparse Transformers, a neural network which can predict what comes next in a sequence ‘Facial Recognition technology is faulty, racist, biased, abusive to civil rights; act now to restrict misuse’ say experts to House Oversight and Reform Committee Now there’s a CycleGAN to visualize the effects of climate change. But is this enough to mobilize action?
Richard Gall
04 Sep 2019
6 min read

How to learn data science: from data mining to machine learning

Data science is a field that’s complex and diverse. If you’re trying to learn data science and become a data scientist it can be easy to fall down a rabbit hole of machine learning or data processing. To a certain extent, that’s good. To be an effective data scientist you need to be curious. You need to be prepared to take on a range of different tasks and challenges. But that’s not always that efficient: if you want to learn quickly and effectively, you need a clear structure - a curriculum - that you can follow. This post will show you what you need to learn and how to go about it. Statistics Statistics is arguably the cornerstone of data science. Nate Silver called data scientists “sexed up statisticians”, a comment that was perhaps unfair but still nevertheless contains a kernel of truth in it: that data scientists are always working in the domain of statistics. Once you understand this everything else you need to learn will follow easily. Machine learning, data manipulation, data visualization - these are all ultimately technological methods for performing statistical analysis really well. Best Packt books and videos content for learning statistics Statistics for Data Science R Statistics Cookbook Statistical Methods and Applied Mathematics in Data Science [Video] Before you go any deeper into data science, it’s critical that you gain a solid foundation in statistics. Data mining and wrangling This is an important element of data science that often gets overlooked with all the hype about machine learning. However, without effective data collection and cleaning, all your efforts elsewhere are going to be pointless at best. At worst they might even be misleading or problematic. Sometimes called data manipulation or data munging, it's really all about managing and cleaning data from different sources so it can be used for analytics projects. To do it well you need to have a clear sense of where you want to get to - do you need to restructure the data? 
Sort or remove certain parts of a data set? Once you understand this, it’s much easier to wrangle data effectively.

Data mining and wrangling tools

There are a number of different tools you can use for data wrangling. Python and R are the two key programming languages, and both have some useful tools for data mining and manipulation. Python in particular has a great range of tools for data mining and wrangling, such as pandas and NLTK (Natural Language Toolkit), but that isn’t to say R isn’t powerful in this domain. Other tools are available too - Weka and Apache Mahout, for example, are popular. Weka is written in Java so is a good option if you have experience with that programming language, while Mahout integrates well with the Hadoop ecosystem.

Data mining and data wrangling books and videos

If you need to learn data mining, wrangling and manipulation, Packt has a range of products. Here are some of the best:
Data Wrangling with R
Data Wrangling with Python
Python Data Mining Quick Start Guide
Machine Learning for Data Mining

Machine learning and artificial intelligence

Although machine learning and artificial intelligence are huge trends in their own right, they are nevertheless closely aligned with data science. Indeed, you might even say that their prominence today has grown out of the excitement around data science that we first witnessed just under a decade ago. It’s a data scientist’s job to use machine learning and artificial intelligence in a way that can drive business value. That could, for example, be to recommend products or services to customers, perhaps to gain a better understanding of existing products, or even to better manage strategic and financial risks through predictive modelling. So, while we can see machine learning in a massive range of digital products and platforms - all of which require smart development and design - for it to work successfully, it needs to be supported by a capable and creative data scientist.
Machine learning and artificial intelligence books for data scientists Machine Learning Algorithms Machine Learning with R - Third Edition Machine Learning with Apache Spark Quick Start Guide Machine Learning with TensorFlow 1.x Keras Deep Learning Cookbook Data visualization A talented data scientist isn’t just a great statistician and engineer, they’re also a great communicator. This means so-called soft skills are highly valuable - the ability to communicate insights and ideas with key stakeholders is essential. But great communication isn’t just about soft skills, it’s also about data visualization. Data visualization is, at a fundamental level, about organizing and presenting data in a way that tells a story, clarifies a problem, or illustrates a solution. It’s essential that you don’t overlook this step. Indeed, spending time learning about effective data visualization can also help you to develop your soft skills. The principles behind storytelling and communication through visualization are, in truth, exactly the same when applied to other scenarios. Data visualization tools There are a huge range of data visualization tools available. As with machine learning, understanding the differences between them and working out what solution will work for you is actually an important part of the learning process. For that reason, don’t be afraid to spend a little bit of time with a range of data visualization tools. Many of the most popular data visualization tools are paid for products. Perhaps the best known of these is Tableau (which, incidentally was bought by Salesforce earlier this year). Tableau and its competitors are very user friendly, which means the barrier to entry is pretty low. They allow you to create some pretty sophisticated data visualizations fairly easily. However, sticking to these tools is not only expensive, it can also limit your abilities. 
We’d recommend trying a number of different data visualization tools, such as Seaborn, D3.js, Matplotlib, and ggplot2.

Data visualization books and videos for data scientists

Applied Data Visualization with R and ggplot2
Tableau 2019.1 for Data Scientists [Video]
D3.js Data Visualization Projects [Video]
Tableau in 7 Steps [Video]
Data Visualization with Python

If you want to learn data science, just get started!

As we've seen, data science requires a number of very different skills and takes in a huge breadth of tools. That means that if you're going to be a data scientist, you need to be prepared to commit to learning forever: you're never going to reach a point where you know everything. While that might sound intimidating, it's important to have confidence. With a sense of direction and purpose, and a learning structure that works for you, it's possible to develop and build your data science capabilities in a way that could unlock new opportunities and act as the basis for some really exciting projects.

Natasha Mathur
12 Mar 2019
4 min read

Google confirms it paid $135 million as exit packages to senior execs accused of sexual harassment

According to a complaint filed in a lawsuit yesterday, Google paid $135 million in total as exit packages to two top senior execs, namely Andy Rubin (creator of Android) and Amit Singhal (former senior VP of Google search), after they were accused of sexual misconduct in the company. The lawsuit was filed by an Alphabet shareholder, James Martin, in the Santa Clara, California Court. Google also confirmed paying the exit packages to senior execs to The Verge yesterday.

The complaint is against certain directors and officers of Alphabet, Google’s parent company, for their active and direct participation in a “multi-year scheme” to hide sexual harassment and discrimination at Alphabet. It also states that the misconduct by these directors has caused severe financial and reputational damage to Alphabet. The exit packages for Rubin and Singhal were approved by the Leadership Development and Compensation Committee (LDCC).

The news of Google paying high exit packages to its top execs first came to light last October, after the New York Times released a report on Google stating that the firm paid $90 million to Rubin and $15 million to Singhal. Rubin had previously also received an offer for a $150 million stock grant, which he then used to negotiate the $90 million in severance pay, even though he should have been fired for cause without any pay, states the lawsuit.

To protest against the handling of sexual misconduct within Google, more than 20,000 Google employees, along with vendors, contractors, and temps, organized the Google “walkout for real change” and walked out of their offices in November 2018. Googlers also launched an industry-wide awareness campaign to fight against forced arbitration in January, where they shared information about arbitration on their Twitter and Instagram accounts throughout the day.
Last year in November, Google ended its forced arbitration ( a move that was soon followed by Facebook) for its employees (excluding temps, vendors, etc) and only in the case of sexual harassment. This led to contractors writing an open letter on Medium to Sundar Pichai, CEO, Google, in December, demanding him to address their demands of better conditions and equal benefits for contractors. In response to the Google walkout and the growing public pressure, Google finally decided to end its forced arbitration policy for all employees (including contractors) and for all kinds of discrimination within Google, last month. The changes will go into effect for all the Google employees starting March 21st, 2019. Yesterday, the Google walkout for real change group tweeted condemning the multi-million dollar payouts and has asked people to use the hashtag #Googlepayoutsforall to highlight other better ways that money could have been used. https://twitter.com/GoogleWalkout/status/1105450565193121792 “The conduct of Rubin and other executives was disgusting, illegal, immoral, degrading to women and contrary to every principle that Google claims it abides by”, reads the lawsuit. James Martin also filed a lawsuit against Alphabet’s board members, Larry Page, Sergey Brin, and Eric Schmidt earlier this year in January for covering up the sexual harassment allegations against the former top execs at Google. Martin had sued Alphabet for breaching its fiduciary duty to shareholders, unjust enrichment, abuse of power, and corporate waste. “The directors’ wrongful conduct allowed illegal conduct to proliferate and continue. As such, members of the Alphabet’s board were knowing direct enables of sexual harassment and discrimination”, reads the lawsuit. It also states that the board members not only violated the California and federal law but it also violated the ethical standards and guidelines set by Alphabet. 
Public reaction to the news is largely negative with people condemning Google’s handling of sexual misconduct: https://twitter.com/awesome/status/1105295877487263744 https://twitter.com/justkelly_ok/status/1105456081663225856 https://twitter.com/justkelly_ok/status/1105457965790707713 https://twitter.com/conradwt/status/1105386882135875584 https://twitter.com/mer__edith/status/1105464808831361025 For more information, check out the official lawsuit here. Recode Decode #GoogleWalkout interview shows why data and evidence don’t always lead to right decisions in even the world’s most data-driven company Liz Fong Jones, prominent ex-Googler shares her experience at Google and ‘grave concerns’ for the company Google’s pay equity analysis finds men, not women, are underpaid; critics call out design flaws in the analysis

Sugandha Lahoti
05 Sep 2019
7 min read

Key skills every database programmer should have

According to Robert Half Technology’s 2019 IT salary report, ‘Database programmer’ is one of the 13 most in-demand tech jobs for 2019. For an entry-level programmer, the average salary is $98,250, which goes up to $167,750 for a seasoned expert. A typical database programmer is responsible for designing, developing, testing, deploying, and maintaining databases. In this article, we will list the top critical tech skills essential to database programmers.

#1 Ability to perform Data Modelling

The first step is to learn to model the data. In data modeling, you create a conceptual model of how data items relate to each other. In order to efficiently plan a database design, you should know the organization you are designing the database for. This is because data models describe real-world entities such as ‘customer’, ‘service’, ‘products’, and the relation between these entities. Data models provide an abstraction for the relations in the database. They aid programmers in modeling business requirements and in translating business requirements into relations. They are also used for exchanging information between the developers and business owners. During the design phase, the database developer should pay great attention to the underlying design principles, run a benchmark stack to ensure performance, and validate user requirements. They should also avoid pitfalls such as data redundancy, null saturation, and tight coupling.

#2 Know a database programming language, preferably SQL

Database programmers need to design, write and modify programs to improve their databases. SQL is one of the top languages used to manipulate the data in a database and to query the database. It's also used to define and change the structure of the data, in other words, to implement the data model. Therefore it is essential that you learn SQL.
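As a rough illustration of using SQL to implement a data model, the sketch below uses Python's built-in sqlite3 module with an in-memory database. The customer, product, and purchase tables and their columns are hypothetical examples, not taken from any particular system. The first statements define the structure and the relation between entities; the later ones insert and query the data itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Structure: two hypothetical entities and the relation between them.
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE product  (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE purchase (
    customer_id INTEGER NOT NULL REFERENCES customer(id),
    product_id  INTEGER NOT NULL REFERENCES product(id),
    quantity    INTEGER NOT NULL CHECK (quantity > 0)
);
""")

# Data: insert rows, then query across the relation.
conn.execute("INSERT INTO customer (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO product (id, title) VALUES (10, 'Notebook')")
conn.execute("INSERT INTO purchase VALUES (1, 10, 3)")

rows = conn.execute("""
    SELECT c.name, p.title, pu.quantity
    FROM purchase pu
    JOIN customer c ON c.id = pu.customer_id
    JOIN product  p ON p.id = pu.product_id
""").fetchall()

# The model rejects a purchase that points at a customer who does not exist.
try:
    conn.execute("INSERT INTO purchase VALUES (99, 10, 1)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(rows, orphan_rejected)
```

Putting the constraints in the schema, rather than in application code, means every program that touches the database gets the same guarantees.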
In general, SQL has three parts:
Data Definition Language (DDL): used to create and manage the structure of the data
Data Manipulation Language (DML): used to manage the data itself
Data Control Language (DCL): controls access to the data

Because data is constantly inserted into the database, changed, or retrieved, DML is used more often in day-to-day operations than DDL, so you should have a strong grasp of DML. If you plan to grow into a database architect role in the near future, then having a good grasp of DDL will go a long way. Another reason why you should learn SQL is that almost every modern relational database supports SQL. Although different databases might support different features and implement their own dialect of SQL, the basics of the language remain the same. If you know SQL, you can quickly adapt to MySQL, for example.

At present, there are a number of categories of database models, predominantly relational, object-relational, and NoSQL databases. All of these are meant for different purposes. Relational databases often adhere to SQL. Object-relational databases (ORDs) are also similar to relational databases. NoSQL, which stands for "not only SQL," is an alternative to traditional relational databases useful for working with large sets of distributed data. NoSQL databases provide benefits such as availability, schema-free design, and horizontal scaling, but also have limitations such as performance, data retrieval constraints, and learning time. For beginners, it is advisable to start by experimenting with relational databases and learning SQL, gradually transitioning to NoSQL DBMS.

#3 Know how to Extract, Transform, Load various data types and sources

A database programmer should have a good working knowledge of ETL (Extract, Transform, Load) programming. ETL developers basically extract data from different databases, transform it and then load the data into the Data Warehouse system.
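The extract, transform, and load steps just described can be sketched in a few lines of Python using only the standard library. Everything specific here is an assumption for illustration: the RAW_CSV text stands in for a flat-file source, the in-memory SQLite database stands in for the data warehouse, and the sales table and cleaning rules are hypothetical.

```python
import csv
import io
import sqlite3

# Hypothetical source data standing in for a CSV export or flat file.
RAW_CSV = """customer,amount
 alice ,10.50
BOB,4.25
carol,not-a-number
"""


def extract(text):
    """Extract: read raw rows from the source as dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names, parse amounts, drop rows that cannot be parsed."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # skip rows whose amount is not a number
        clean.append((row["customer"].strip().lower(), amount))
    return clean


def load(conn, rows):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(extract(RAW_CSV)))
loaded = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```

Keeping the three steps as separate functions makes each one easy to unit test before changing an existing ETL process.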
A Data Warehouse provides a common data repository that is essential for business needs. A database programmer should know how to tune existing packages, tables, and queries for faster ETL processing. They should conduct unit tests before applying any change to the existing ETL process. Since ETL takes data from different data sources (SQL Server, CSV, and flat files), a database developer should know how to deal with different data sources.

#4 Design and test Database plans

Database programmers perform regular tests to identify ways to solve database usage concerns and malfunctions. As databases are usually found at the lowest level of the software architecture, testing is done in an extremely cautious fashion. This is because changes in the database schema affect many other software components. A database developer should make sure that when changing the database structure, they do not break existing applications and that they are using the new structures properly. You should be proficient in unit testing your database. Unit tests are typically used to check if small units of code are functioning properly. For databases, unit testing can be difficult, so the easiest way to do it is by writing the tests as SQL scripts. You should also know about System Integration Testing (SIT), which is done on the complete system after the hardware and software modules of that system have been integrated. SIT validates the behavior of the system and ensures that modules in the system are functioning suitably.

#5 Secure your Database

Data protection and security are essential for the continuity of business. Databases often store sensitive data, such as user information, email addresses, geographical addresses, and payment information. A robust security system to protect your database against any data breach is therefore necessary.
While a database architect is responsible for designing and implementing secure design options, a database admin must ensure that the right security and privacy policies are in place and are being observed. However, this does not absolve database programmers from adopting secure coding practices. Database programmers need to ensure that data integrity is maintained over time and that data is secure from unauthorized changes or theft. They need to be especially careful about table permissions, i.e., who can read and write to which tables. You should know who is allowed to perform the four basic operations (INSERT, UPDATE, DELETE, and SELECT) against which tables. Database programmers should also adopt authentication best practices that suit the infrastructure setup, the application's nature, the users' characteristics, and data sensitivity. If the database server is accessed from the outside world, it is beneficial to encrypt sessions using SSL certificates to avoid packet sniffing. Also, you should secure database servers that trust all localhost connections, as anyone who can access the localhost can then access the database server.

#6 Optimize your database performance

A database programmer should also know how to optimize database performance to achieve the best results. At the basic level, they should know how to rewrite SQL queries and maintain indexes. Other aspects of optimizing database performance include hardware configuration, network settings, and database configuration. Generally speaking, tuning database performance requires knowledge of the system's nature. Once the database server is configured, you should calculate the number of transactions per second (TPS) for the database server setup. Once the system is up and running, you should set up a monitoring system or log analysis that periodically finds slow queries, the most time-consuming queries, and so on.
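To make the point about indexes concrete, here is a small sketch that asks the database how it plans to run a query before and after an index is added. It uses SQLite through Python's standard `sqlite3` module purely for convenience; the `orders` table, its columns, and the index name are hypothetical, and other database systems expose their plans through their own `EXPLAIN` variants.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return the plan steps SQLite reports for a query, as text."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[3] for row in rows]  # fourth column is the description


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders "
        "(id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i % 100, float(i)) for i in range(1000)],
    )
    sql = "SELECT total FROM orders WHERE customer_id = ?"

    # Without an index the filter forces a scan of the whole table.
    before = query_plan(conn, sql, (42,))

    # With an index on the filtered column the engine can search it instead.
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    after = query_plan(conn, sql, (42,))
    return before, after


if __name__ == "__main__":
    plan_before, plan_after = demo()
    print("before:", plan_before)
    print("after: ", plan_after)
```

Checking the plan of the slowest queries found by monitoring is usually a good first step before rewriting them or adding indexes.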
#7 Develop your soft skills

Apart from the above technical skills, a database programmer needs to be comfortable communicating with developers, testers, and project managers while working on any software project. A keen eye for detail and critical thinking can often spot malfunctions and errors that might otherwise be overlooked. A database programmer should be able to quickly fix issues within the database and streamline the code. They should also be quick thinkers who can prioritize tasks and meet deadlines effectively. Database programmers are often required to work on documentation and technical user guides, so strong writing and technical skills are a must.

Get started

If you want to get started as a database programmer, Packt has a range of products. Here are some of the best:

PostgreSQL 11 Administration Cookbook
Learning PostgreSQL 11 - Third Edition
PostgreSQL 11 in 7 days [ Video ]
Using MySQL Databases With Python [ Video ]
Basic Relational Database Design [ Video ]

How to learn data science: from data mining to machine learning
How to ace a data science interview
5 barriers to learning and technology training for small software development teams
Fatema Patrawala
21 Aug 2019
6 min read

Bitbucket to no longer support Mercurial, users must migrate to Git by May 2020

Yesterday marked the end of an era for Mercurial users, as Bitbucket announced that it will no longer support Mercurial repositories after May 2020. Bitbucket, owned by Atlassian, is a web-based version control repository hosting service for source code and development projects. It has supported Mercurial since its launch in 2008 and Git since October 2011. Now, after roughly a decade with Mercurial, the Bitbucket team has decided to remove Mercurial support from Bitbucket Cloud and its API. The official announcement reads, “Mercurial features and repositories will be officially removed from Bitbucket and its API on June 1, 2020.”

The Bitbucket team also communicated the timeline for sunsetting the Mercurial functionality:

After February 1, 2020, users will no longer be able to create new Mercurial repositories.
All current Mercurial functionality in Bitbucket will remain available through May 31, 2020.
From June 1, 2020, users will not be able to use Mercurial features in Bitbucket or via its API, and all Mercurial repositories will be removed.

The team said the decision was not an easy one and that Mercurial holds a special place in their hearts. But according to a Stack Overflow Developer Survey, almost 90% of developers use Git, while Mercurial is the least popular version control system, with only about 3% developer adoption. In addition, Mercurial usage on Bitbucket has declined steadily, and the percentage of new Bitbucket users choosing Mercurial has fallen to less than 1%. Hence the team decided to remove Mercurial repos.

How can users migrate and export their Mercurial repos

The Bitbucket team recommends that users migrate their existing Mercurial repos to Git. The team has also extended support for migration and is keeping the available options open for discussion in a dedicated Community thread, where users can discuss conversion tools and migration tips and offer troubleshooting help.
If users prefer to continue using Mercurial, there are a number of free and paid Mercurial hosting services available to them. The Bitbucket team has also created a Git tutorial that covers everything from the basics of creating pull requests to rebasing and Git hooks.

Community shows anger and sadness over decision to discontinue Mercurial support

Mercurial users are upset and saddened by Bitbucket's decision, and they have voiced their anger on multiple forums and community discussions. Users feel that Bitbucket’s decision to stop offering Mercurial support is bad, but that the decision to also delete the repos is evil.

On Hacker News, users speculated that the decision was driven by marketing potential rather than by technically superior architecture and ease of use. They feel GitHub has successfully marketed Git, which is how the two have become synonymous in the developer community. One of them comments, “It's very sad to see bitbucket dropping mercurial support. Now only Facebook and volunteers are keeping mercurial alive. Sometimes technically better architecture and user interface lose to a non user friendly hard solutions due to inertia of mass adoption. So a lesson in Software development is similar to betamax and VHS, so marketing is still a winner over technically superior architecture and ease of use. GitHub successfully marketed git, so git and GitHub are synonymous for most developers. Now majority of open source projects are reliant on a single proprietary solution Github by Microsoft, for managing code and project. Can understand the difficulty of bitbucket, when Python language itself moved out of mercurial due to the same inertia. Hopefully gitlab can come out with mercurial support to migrate projects using it from bitbucket.”

Another user comments that Mercurial support was his only reason for using Bitbucket, given that GitHub is miles ahead of it.
Now that Bitbucket is dropping Mercurial as well, this user expects the service to die out. The comment reads, “Mercurial support was the one reason for me to still use Bitbucket: there is no other Bitbucket feature I can think of that Github doesn't already have, while Github's community is miles ahead since everyone and their dog is already there. More importantly, Bitbucket leaves the migration to you (if I read the article correctly). Once I download my repo and convert it to git, why would I stay with the company that just made me go through an annoying (and often painful) process, when I can migrate to Github with the exact same command? And why isn't there a "migrate this repo to git" button right there? I want to believe that Bitbucket has smart people and that this choice is a good one. But I'm with you there - to me, this definitely looks like Bitbucket will die.”

On Reddit, programmers see this as a big change, since Bitbucket is the major Mercurial hosting provider. They also feel that Bitbucket announced the change at rather short notice and that they need more time to migrate.

Beyond the developer community forums, users have also expressed displeasure on the Atlassian community blog. A team of scientists commented, “Let's get this straight : Bitbucket (offering hosting support for Mercurial projects) was acquired by Atlassian in September 2010. Nine years later Atlassian decides to drop Mercurial support and delete all Mercurial repositories. Atlassian, I hate you :-) The image you have for me is that of a harmful predator. We are a team of scientists working in a university. We don't have computer scientists, we managed to use a version control simple as Mercurial, and it was a hard work to make all scientists in our team to use a version control system (even as simple as Mercurial). We don't have the time nor the energy to switch to another version control system. But we will, forced and obliged.
I really don't want to check out Github or something else to migrate our projects there, but we will, forced and obliged.”

Atlassian Bitbucket, GitHub, and GitLab take collective steps against the Git ransomware attack
Attackers wiped many GitHub, GitLab, and Bitbucket repos with ‘compromised’ valid credentials leaving behind a ransom note
BitBucket goes down for over an hour
Vincy Davis
12 Dec 2019
8 min read

Master the art of face swapping with OpenCV and Python by Sylwek Brzęczkowski, developer at TrustStamp

No discussion of image processing is complete without talking about OpenCV. Its 2500+ algorithms, extensive documentation, and sample code are considered world-class for exploring real-time computer vision. OpenCV supports a wide variety of programming languages such as C++, Python, and Java, and is available on platforms including Windows, Linux, OS X, Android, and iOS.

OpenCV-Python, the Python API for OpenCV, is one of the most popular libraries used to solve computer vision problems. It combines the best qualities of the OpenCV C++ API and the Python language. The OpenCV-Python library uses NumPy, a highly optimized library for numerical operations with a MATLAB-style syntax. This makes it easier to integrate the Python API with other libraries that use NumPy, such as SciPy and Matplotlib, which is why many developers use it for computer vision experiments.

Want to know more about OpenCV with Python?

If you are interested in developing your computer vision skills, you should definitely master the algorithms in OpenCV 4 and Python explained in our book ‘Mastering OpenCV 4 with Python’ written by Alberto Fernández Villán. This book will help you build complete projects in relation to image processing, motion detection, image segmentation, and many other tasks by exploring the deep learning Python libraries and also by learning the OpenCV deep learning capabilities.

At the PyData Warsaw 2018 conference, Sylwek Brzęczkowski walked through how to implement a face swap using OpenCV and Python. Face swaps are used by apps like Snapchat to offer various face filters. Brzęczkowski is a Python developer at TrustStamp.
Steps to implement face swapping with OpenCV and Python

#1 Face detection using histogram of oriented gradients (HOG)

Histogram of oriented gradients (HOG) is a feature descriptor used to detect objects in computer vision and image processing. Brzęczkowski demonstrated how HOG works using square patches that are moved across the image, each producing a feature vector of oriented-gradient histograms. These feature vectors are then passed to a classifier, which returns the best-matching regions. To implement face detection using HOG in Python, the image is first loaded with OpenCV (imported in Python as cv2). Next, a frontal face detector object is created with detector = dlib.get_frontal_face_detector(). The detector then returns a vector containing the detected face.

#2 Facial landmark detection aka face alignment

Facial landmark detection is the process of finding points of interest in an image of a human face. When dlib is used for facial landmark detection, it returns 68 facial landmarks for the whole face. The algorithm refines its estimate iteratively: on the first iteration the value of T equals 0, and it increases linearly until T reaches 10 on the last iteration. By then the estimated shape matches the ‘ground truth’ closely enough for the iteration to stop. Because the landmarks are progressively fitted to the face in this way, this stage is also called face alignment. To implement this stage, Brzęczkowski showed how to add a predictor to the Python program by loading shape_predictor_68_face_landmarks.dat, a model file of around 100 megabytes. This step generally takes a long time, as the largest, clearest image tends to be picked for detection.

#3 Finding face border using convex hull

The convex hull of a set of points is the smallest convex polygon that encloses all of the points in the set.
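The convex hull definition can be made concrete with a short, dependency-free sketch. It is not the speaker's code: it implements Andrew's monotone chain algorithm in plain Python and returns indexes into the input list, which mirrors what OpenCV's `cv2.convexHull` returns when `returnPoints=False`. The sample points are made up for illustration.

```python
def cross(o, a, b):
    """Cross product of vectors OA and OB; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull_indices(points):
    """Return indexes of the hull vertices, counter-clockwise (y axis up)."""
    order = sorted(range(len(points)), key=lambda i: points[i])
    if len(order) < 3:
        return order

    def half_hull(indices):
        chain = []
        for i in indices:
            # Drop points that would make the boundary turn right or go straight.
            while len(chain) >= 2 and cross(
                points[chain[-2]], points[chain[-1]], points[i]
            ) <= 0:
                chain.pop()
            chain.append(i)
        return chain

    lower = half_hull(order)
    upper = half_hull(reversed(order))
    # The last index of each half is the first index of the other half.
    return lower[:-1] + upper[:-1]


if __name__ == "__main__":
    # Four corners of a square plus two interior points (hypothetical data).
    landmarks = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 3)]
    print(convex_hull_indices(landmarks))
```

Returning indexes rather than coordinates is convenient in a face swap, because the same indexes can then be used to pick the matching border points on the second face.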
This means that for a given set of points, the vertices of the convex hull are a subset of those points, chosen so that every given point lies inside or on the polygon they form. To find the face border in an image, we need to change the structure a bit. The landmark structure is first passed to the convex hull function with returnPoints set to False, which means that the output is a list of indexes. Brzęczkowski then showed the face border drawn on the image in blue using the find_convex_hull.py script.

#4 Approximating nonlinear operations with linear operations

In linear filtering of an image, the value of an output pixel is a linear combination of the values of the input pixels. Brzęczkowski gave the example of the affine transformation, a type of linear mapping that preserves points, straight lines, and planes. Non-linear filtering, on the other hand, produces an output that is not a linear function of its input. He then demonstrated both transformations on his own image. Brzęczkowski advised users to check the website learnOpenCV.com to learn how to approximate a nonlinear operation with linear ones.

#5 Finding triangles in an image using Delaunay triangulation

A Delaunay triangulation subdivides a set of points in a plane into triangles such that the points become the vertices of the triangles. The subdivision is chosen so that no point lies inside the circumcircle of any triangle, which also means that if you look at any triangle on the image, it will not have another point inside it. Brzęczkowski then demonstrated how the image developed in the previous stage contained “face points from which you can identify my teeth and then create sub div to the object, insert all these points that I created or all detected.” Next, he applies Delaunay triangulation to produce a list of triangles, which is then used to obtain the triangles in the image. After this step, he uses the delaunay_triangulation.py script to draw these triangles on the images.
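The property that defines a Delaunay triangulation, namely that the circumcircle of each triangle contains no other input point, can be checked directly. The sketch below is a deliberately naive brute-force version written for illustration only: it tests every triple of points, so it is far too slow for real images, and it is not how the talk or OpenCV computes the triangulation (OpenCV provides the `cv2.Subdiv2D` class for that). The sample points are hypothetical, and inputs with four or more points on one circle can yield overlapping triangles.

```python
from itertools import combinations


def orient(a, b, c):
    """Twice the signed area of triangle abc; positive if counter-clockwise."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_circumcircle(a, b, c, p):
    """True if p lies strictly inside the circle through a, b and c."""
    if orient(a, b, c) < 0:
        b, c = c, b  # the determinant test expects counter-clockwise order
    (ax, ay), (bx, by), (cx, cy) = [(q[0] - p[0], q[1] - p[1]) for q in (a, b, c)]
    det = (
        (ax * ax + ay * ay) * (bx * cy - cx * by)
        - (bx * bx + by * by) * (ax * cy - cx * ay)
        + (cx * cx + cy * cy) * (ax * by - bx * ay)
    )
    return det > 0


def delaunay_triangles(points):
    """Return index triples whose circumcircle contains no other point."""
    triangles = []
    for i, j, k in combinations(range(len(points)), 3):
        a, b, c = points[i], points[j], points[k]
        if orient(a, b, c) == 0:
            continue  # collinear points do not form a triangle
        others = (points[m] for m in range(len(points)) if m not in (i, j, k))
        if not any(in_circumcircle(a, b, c, p) for p in others):
            triangles.append((i, j, k))
    return triangles


if __name__ == "__main__":
    # Square corners plus the center point (hypothetical landmark positions).
    pts = [(0, 0), (4, 0), (4, 4), (0, 4), (2, 2)]
    print(delaunay_triangles(pts))
```

As with the convex hull, working with index triples means the same triangles can be looked up on both faces, so each triangle of one face can be warped onto the corresponding triangle of the other.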
#6 Blending one face into another

To recap, we started by detecting a face using HOG and finding its border using the convex hull, then added mouth points to indicate specific indexes. Next, Delaunay triangulation was used to obtain all the triangles on the images. Brzęczkowski then begins blending the images using seamless cloning. In image processing, seamless cloning inserts a region of a source image into a destination image so that no visible seam remains, and it adapts the colors of the inserted region to its surroundings, which helps the swapped face take on the skin tone of the target. Brzęczkowski explains that the technique comes from Poisson image editing, which works with the gradients of the image instead of the intensities, i.e. the values of its pixels. To implement the same method in OpenCV, he demonstrates that the source image, the destination image, a mask, and a center (the location where the cloned part should be placed) are required to blend the two faces. Brzęczkowski then shows a series of examples in which his face is swapped with those of popular artists like Jamie Foxx, Clint Eastwood, and others.

#7 Stabilization using optical flow with the Lucas-Kanade method

In computer vision, the Lucas-Kanade method is a widely used differential method for optical flow estimation. It assumes that the flow is essentially constant in a local neighborhood of the pixel under consideration, and solves the basic optical flow equations for all the pixels in that neighborhood by the least-squares criterion. By combining information from several nearby pixels, the Lucas-Kanade method resolves the inherent ambiguity of the optical flow equation. This method is also less sensitive to noise in an image.
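The least-squares step of the Lucas-Kanade method can be written out for a single window. Each pixel contributes one optical flow equation, Ix*u + Iy*v + It = 0, where Ix and Iy are the spatial image gradients and It is the change in brightness between two frames; the flow (u, v) is the pair that minimizes the sum of squared residuals over the window. The sketch below is a minimal plain-Python illustration with a hypothetical function name, not the speaker's code; in OpenCV the pyramidal version of the method is available as `cv2.calcOpticalFlowPyrLK`.

```python
def lucas_kanade_flow(ix, iy, it):
    """Least-squares flow (u, v) for one window.

    ix, iy: spatial gradients, one value per pixel in the window.
    it: temporal brightness differences for the same pixels.
    """
    # Entries of the 2x2 normal equations  A^T A [u, v]^T = -A^T it.
    sxx = sum(x * x for x in ix)
    sxy = sum(x * y for x, y in zip(ix, iy))
    syy = sum(y * y for y in iy)
    sxt = sum(x * t for x, t in zip(ix, it))
    syt = sum(y * t for y, t in zip(iy, it))

    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        # Gradients all point the same way (or vanish): flow is ambiguous.
        raise ValueError("window has too little texture to estimate flow")

    # Solve the 2x2 system with the closed-form inverse.
    u = (-syy * sxt + sxy * syt) / det
    v = (sxy * sxt - sxx * syt) / det
    return u, v


if __name__ == "__main__":
    # Synthetic 3x3 window (nine equations) generated from a known flow.
    true_u, true_v = 0.5, -0.25
    grad_x = [1, 2, 0, 1, 3, 1, 0, 2, 1]
    grad_y = [0, 1, 2, 1, 0, 3, 1, 1, 2]
    grad_t = [-(x * true_u + y * true_v) for x, y in zip(grad_x, grad_y)]
    print(lucas_kanade_flow(grad_x, grad_y, grad_t))
```

With nine equations and only two unknowns the system is overdetermined, which is why a least-squares solution is used and why the estimate tolerates some noise in individual pixels.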
In plain language, using this method to stabilize the face-swapped image means assuming that the optical flow is essentially constant in a local neighborhood of the pixel under consideration. This means that “if we have a red point in the center we assume that all the points around, let's say in this example is three on three pixels we assume that all of them have the same optical flow and thanks to that assumption we have nine equations and only two unknowns.” This makes the computation fairly easy to solve. Under this assumption, the optical flow can be computed smoothly as long as we have the previous grayscale frame of the image. This means that for face swapping with OpenCV, a user needs the previous points of the image along with the current points. By combining all this information, the actual point becomes a combination of the detected landmark and the predicted landmark. By applying the Lucas-Kanade method to stabilize the image in this way, Brzęczkowski obtains a non-shaky version of his face-swapped image.

Watch Brzęczkowski’s full video to see a step-by-step implementation of a face-swapping task. You can learn advanced applications like facial recognition, target tracking, or augmented reality from our book, ‘Mastering OpenCV 4 with Python’ written by Alberto Fernández Villán. This book will also help you understand the application of artificial intelligence and deep learning techniques using popular Python libraries like TensorFlow and Keras.

Getting to know PyMC3, a probabilistic programming framework for Bayesian Analysis in Python
How to perform exception handling in Python with ‘try, catch and finally’
Implementing color and shape-based object detection and tracking with OpenCV and CUDA [Tutorial]
OpenCV 4.0 releases with experimental Vulcan, G-API module and QR-code detector among others