How to build an options trading web app using Q-learning

Sunith Shetty
13 Apr 2018
19 min read
Today we will learn to develop an options trading web app using the Q-learning algorithm, and we will also evaluate the model.

Developing an options trading web app using Q-learning

Algorithmic trading is the process of using computers programmed to follow a defined set of instructions for placing a trade, in order to generate profits at a speed and frequency that is impossible for a human trader. The defined sets of rules are based on timing, price, quantity, or any mathematical model.

Problem description

Through this project, we will predict the price of an option on a security for N days in the future, according to the current set of observed features derived from the time to expiration, the price of the security, and volatility. The question is: what model should we use for such an option pricing model? There are actually many; the Black-Scholes stochastic partial differential equation (PDE) is one of the most recognized. In mathematical finance, the Black-Scholes equation is a PDE governing the price evolution of a European call or a European put under the Black-Scholes model. For a European call or put on an underlying stock paying no dividends, the equation is:

∂V/∂t + (1/2) σ² S² ∂²V/∂S² + r S ∂V/∂S - r V = 0

where V is the price of the option as a function of the stock price S and time t, r is the risk-free interest rate, and σ is the volatility of the stock. One of the key financial insights behind the equation is that anyone can perfectly hedge the option by buying and selling the underlying asset in just the right way, without any risk. This hedge implies that there is only one right price for the option, as returned by the Black-Scholes formula.

Consider a January-maturity call option on IBM with an exercise price of $95. You write a January IBM put option with an exercise price of $85. Let us focus on the call options of a given security, IBM. The following chart plots the daily price of the IBM stock and its derivative call option for May 2014, with a strike price of $190:

Figure 1: IBM stock and call $190 May 2014 pricing in May-Oct 2013

Now, what will the profit and loss be for this position if IBM is selling at $87 on the option maturity date? Alternatively, what if IBM is selling at $100? It is not easy to compute or predict the answer. However, in options trading, the price of an option depends on a few parameters, such as time decay, price, and volatility:

- Time to expiration of the option (time decay)
- The price of the underlying security
- The volatility of returns of the underlying asset

A pricing model usually does not consider the variation in trading volume of the underlying security, so some researchers have included it in the option trading model. As we have described, any RL-based algorithm should have an explicit state (or states), so let us define the state of an option using the following four normalized features:

- Time decay (timeToExp): the time to expiration, normalized in the range (0, 1).
- Relative volatility (volatility): the relative variation of the price of the underlying security within a trading session. It is different from the more complex volatility of returns defined in the Black-Scholes model, for example.
- Volatility relative to volume (vltyByVol): the relative volatility of the price of the security adjusted for its trading volume.
- Relative difference between the current price and the strike price (priceToStrike): the ratio of the difference between the price and the strike price to the strike price.

The following graph shows the four normalized features that can be used for the IBM option strategy:

Figure 2: Normalized relative stock price volatility, volatility relative to trading volume, and price relative to strike price for the IBM stock

Now let us look at the stock and option price datasets. Two files, IBM.csv and IBM_O.csv, contain the IBM stock prices and option prices, respectively. The stock price dataset has the date, the opening price, the high and low prices, the closing price, the trade volume, and the adjusted closing price. A snapshot of the dataset is given in the following diagram:

Figure 3: IBM stock data

On the other hand, IBM_O.csv has 127 option prices for IBM Call 190 Oct 18, 2014. A few values are 1.41, 2.24, 2.42, 2.78, 3.46, 4.11, 4.51, 4.92, 5.41, 6.01, and so on. Up to this point, can we develop a predictive model using the Q-learning algorithm that can help us answer the question mentioned earlier: can it tell us how to make the maximum profit on the IBM option by utilizing all the available features? Well, we know how to implement Q-learning, and we know what option trading is.

Implementing an options trading web application

The goal of this project is to create an options trading web application that creates a Q-learning model from the IBM stock data. The app then extracts the output from the model as a JSON object and shows the result to the user. Figure 4 shows the overall workflow:

Figure 4: Workflow of the options trading Scala web application

The compute API prepares the input for the Q-learning algorithm, and the algorithm starts by extracting the data from the files to build the option model. Then it performs operations on the data, such as normalization and discretization. It passes all of this to the Q-learning algorithm to train the model. After that, the compute API gets the model from the algorithm, extracts the best policy data, and puts it into JSON to be returned to the web browser.

The implementation of the options trading strategy using Q-learning consists of the following steps:

- Describing the property of an option
- Defining the function approximation
- Specifying the constraints on the state transition

Creating an option property

Considering the market volatility, we need to be a bit more realistic, because any longer-term prediction is quite unreliable: it would fall outside the constraint of the discrete Markov model. So, suppose we want to predict the price for the next two days, that is, N = 2. That means the price of the option two days in the future is the value of the reward, profit or loss. Let us encapsulate the following four parameters:

- timeToExp: Time left until expiration as a percentage of the overall duration of the option
- volatility: Normalized relative volatility of the underlying security for a given trading session
- vltyByVol: Volatility of the underlying security for a given trading session relative to the trading volume for the session
- priceToStrike: Price of the underlying security relative to the strike price for a given trading session

The OptionProperty class defines the property of a traded option on a security; its constructor follows after the short aside below.
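All four features are normalized over the (0, 1) interval before they reach the learning algorithm. The book relies on a library helper, normalize[Double], for this; as a point of reference only, here is a minimal Scala sketch of what such min-max normalization typically looks like. The object name, the sample prices, and the assumption that plain min-max scaling is used are illustrative and not taken from the book:

```scala
// Minimal min-max normalization sketch (illustrative; not the book's normalize[Double] helper)
object NormalizeSketch extends App {
  def normalize(xs: Seq[Double]): Seq[Double] = {
    val (lo, hi) = (xs.min, xs.max)
    xs.map(x => (x - lo) / (hi - lo))   // every value now lies in [0, 1]
  }

  // priceToStrike-style feature for a hypothetical $190 strike
  val prices = Seq(186.0, 188.5, 191.2, 194.7, 189.9)
  val priceToStrike = normalize(prices.map(p => 1.0 - 190.0 / p))

  println(priceToStrike.map(v => f"$v%.3f").mkString(", "))
}
```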
The constructor creates the property for an option:

```scala
class OptionProperty(
    timeToExp: Double,
    volatility: Double,
    vltyByVol: Double,
    priceToStrike: Double) {

  val toArray = Array[Double](timeToExp, volatility, vltyByVol, priceToStrike)

  require(timeToExp > 0.01,
    s"OptionProperty time to expiration found $timeToExp required 0.01")
}
```

Creating an option model

Now we need to create an OptionModel to act as the container and the factory for the properties of the option. It takes the following parameters and creates a list of option properties, propsList, by accessing the data source of the four features described earlier:

- The symbol of the security.
- The strike price for the option, strikePrice.
- The source of the data, src.
- The minimum time decay or time to expiration, minTDecay (minExpT in the constructor). Out-of-the-money options expire worthless, and in-the-money options have very different price behavior as they get closer to expiration. Therefore, the last minTDecay trading sessions prior to the expiration date are not used in the training process.
- The number of steps (or buckets), nSteps, used in approximating the values of each feature. For instance, an approximation of four steps creates four buckets: (0, 25), (25, 50), (50, 75), and (75, 100).

The model then assembles the OptionProperties and computes the normalized minimum time to the expiration of the option. It then computes an approximation of the value of the options by discretizing the actual values into multiple levels from an array of option prices; finally, it returns a map of an array of levels for the option price and accuracy. Here is the constructor of the class:

```scala
class OptionModel(
    symbol: String,
    strikePrice: Double,
    src: DataSource,
    minExpT: Int,
    nSteps: Int
)
```

Inside this class implementation, a validation is first performed by the check() method, which verifies the following:

- strikePrice: A positive price is required
- minExpT: This has to be between 2 and 16
- nSteps: Requires a minimum of two steps

Here is the invocation of this method:

```scala
check(strikePrice, minExpT, nSteps)
```

The signature of the preceding method is shown in the following code:

```scala
def check(strikePrice: Double, minExpT: Int, nSteps: Int): Unit = {
  require(strikePrice > 0.0,
    s"OptionModel.check price found $strikePrice required > 0")
  require(minExpT > 2 && minExpT < 16,
    s"OptionModel.check Minimum expiration time found $minExpT required ]2, 16[")
  require(nSteps > 1,
    s"OptionModel.check, number of steps found $nSteps required > 1")
}
```

Once the preceding constraints are satisfied, the list of option properties, named propsList, is created as follows:

```scala
val propsList = (for {
  price <- src.get(adjClose)
  volatility <- src.get(volatility)
  nVolatility <- normalize[Double](volatility)
  vltyByVol <- src.get(volatilityByVol)
  nVltyByVol <- normalize[Double](vltyByVol)
  priceToStrike <- normalize[Double](price.map(p => 1.0 - strikePrice / p))
} yield {
  nVolatility.zipWithIndex./:(List[OptionProperty]()) {
    case (xs, (v, n)) =>
      val normDecay = (n + minExpT).toDouble / (price.size + minExpT)
      new OptionProperty(normDecay, v, nVltyByVol(n), priceToStrike(n)) :: xs
  }.drop(2).reverse
}).get
```

In the preceding code block, the factory uses the zipWithIndex Scala method to represent the index of the trading sessions. All feature values are normalized over the interval (0, 1), including the time decay (or time to expiration), normDecay, of the option. The quantize() method of the OptionModel class converts the normalized value of each feature of an option property into an array of bucket indices.
It returns a map of profit and loss for each bucket, keyed on the array of bucket indices:

```scala
def quantize(o: Array[Double]): Map[Array[Int], Double] = {
  val mapper = new mutable.HashMap[Int, Array[Int]]

  val acc: NumericAccumulator[Int] = propsList.view
    .map(_.toArray)
    .map(toArrayInt(_))
    .map(ar => {
      val enc = encode(ar)
      mapper.put(enc, ar)
      enc
    })
    .zip(o)
    ./:(new NumericAccumulator[Int]) {
      case (_acc, (t, y)) => _acc += (t, y); _acc
    }

  acc.map { case (k, (v, w)) => (k, v / w) }
    .map { case (k, v) => (mapper(k), v) }.toMap
}
```

The method also creates a mapper instance to index the array of buckets. An accumulator, acc, of type NumericAccumulator extends Map[Int, (Int, Double)] and computes the tuple (number of occurrences of features in each bucket, sum of the increase or decrease of the option price). The toArrayInt method converts the value of each option property (timeToExp, volatility, and so on) into the index of the appropriate bucket. The array of indices is then encoded to generate the id or index of a state. The method updates the accumulator with the number of occurrences and the total profit and loss for a trading session for the option. It finally computes the reward on each action by averaging the profit and loss over each bucket. The signatures of encode() and toArrayInt() are given in the following code:

```scala
private def encode(arr: Array[Int]): Int =
  arr./:((1, 0)) { case ((s, t), n) => (s * nSteps, t + s * n) }._2

private def toArrayInt(feature: Array[Double]): Array[Int] =
  feature.map(x => (nSteps * x).floor.toInt)

final class NumericAccumulator[T] extends mutable.HashMap[T, (Int, Double)] {
  def +=(key: T, x: Double): Option[(Int, Double)] = {
    val newValue =
      if (contains(key)) (get(key).get._1 + 1, get(key).get._2 + x)
      else (1, x)
    super.put(key, newValue)
  }
}
```

Finally, if the preceding constraints are satisfied (you can modify these constraints, though), the instantiation of the OptionModel class generates a list of OptionProperty elements; otherwise, it generates an empty list.

Putting it all together

Because we have implemented the Q-learning algorithm, we can now develop the options trading application using Q-learning. First, however, we need to load the data using the DataSource class (we will see its implementation later on). Then we can create an option model from the data for a given stock, with default strike and minimum expiration time parameters, using OptionModel, which defines the model for a traded option on a security. Then we create the model for the profit and loss on an option given the underlying security. The profit and loss are adjusted to produce positive values. The application then instantiates an instance of the Q-learning class, that is, a generic parameterized class that implements the Q-learning algorithm. The Q-learning model is initialized and trained during the instantiation of the class, so that it is in the correct state for runtime prediction. Therefore, class instances have only two states: successfully trained and failed training. The model is then returned to be processed and visualized.

So, let us create a Scala object and name it QLearningMain.
Then, inside the QLearningMain object, define and initialize the following parameters:

- name: Used to indicate the reinforcement algorithm's name (in our case, Q-learning)
- STOCK_PRICES: File that contains the stock data
- OPTION_PRICES: File that contains the available option data
- STRIKE_PRICE: Option strike price
- MIN_TIME_EXPIRATION: Minimum expiration time for the option recorded
- QUANTIZATION_STEP: Step used in discretization or approximation of the value of the security
- ALPHA: Learning rate for the Q-learning algorithm
- DISCOUNT (gamma): Discount rate for the Q-learning algorithm
- MAX_EPISODE_LEN: Maximum number of states visited per episode
- NUM_EPISODES: Number of episodes used during training
- MIN_COVERAGE: Minimum coverage allowed during the training of the Q-learning model
- NUM_NEIGHBOR_STATES: Number of states accessible from any other state
- REWARD_TYPE: Maximum reward or Random

Tentative initializations for each parameter are given in the following code:

```scala
val name: String = "Q-learning"

// Files containing the historical prices for the stock and option
val STOCK_PRICES = "/static/IBM.csv"
val OPTION_PRICES = "/static/IBM_O.csv"

// Run configuration parameters
val STRIKE_PRICE = 190.0          // Option strike price
val MIN_TIME_EXPIRATION = 6       // Min expiration time for option recorded
val QUANTIZATION_STEP = 32        // Quantization step (Double => Int)
val ALPHA = 0.2                   // Learning rate
val DISCOUNT = 0.6                // Discount rate used in Q-Value update equation
val MAX_EPISODE_LEN = 128         // Max number of iterations for an episode
val NUM_EPISODES = 20             // Number of episodes used for training
val NUM_NEIGHBHBOR_STATES = 3     // No. of states accessible from any other state
```

Now the run() method accepts as input the reward type (Maximum reward in our case), the quantization step (QUANTIZATION_STEP in our case), alpha (the learning rate, ALPHA in our case) and gamma (DISCOUNT in our case, the discount rate for the Q-learning algorithm). It displays the distribution of values in the model. Additionally, it displays the estimated Q-value for the best policy on a scatter plot (we will see this later).
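Before walking through run(), it may help to see where ALPHA and DISCOUNT actually enter the learning process. The book's QLearning class encapsulates this, so the following is only a minimal, illustrative sketch of the tabular Q-value update; the table size, states, and reward below are made-up placeholders, not the book's code:

```scala
// Tabular Q-value update sketch (illustrative only; not the book's QLearning class)
object QUpdateSketch extends App {
  val ALPHA = 0.2      // learning rate
  val DISCOUNT = 0.6   // discount rate (gamma)

  // hypothetical 4-state, 4-action Q-table initialized to zero
  val qTable = Array.fill(4, 4)(0.0)

  def update(state: Int, action: Int, reward: Double, nextState: Int): Unit = {
    val bestNext = qTable(nextState).max
    qTable(state)(action) += ALPHA * (reward + DISCOUNT * bestNext - qTable(state)(action))
  }

  update(state = 0, action = 1, reward = 1.5, nextState = 1)
  println(qTable(0)(1))   // 0.2 * 1.5 = 0.3, since all other entries are still zero
}
```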
Here is the workflow of the run() method:

- First, it extracts the stock prices from the IBM.csv file
- Then it creates an option model, createOptionModel, using the stock prices and the quantization step, quantizeR (see the quantize() method and the main method invocation later)
- The option prices are extracted from the IBM_O.csv file
- After that, another model, model, is created using the option model, to evaluate it on the option prices, oPrices
- Finally, the estimated Q-value (that is, Q-value = value * probability) is displayed on a scatter plot using the display method

By amalgamating the preceding steps, here is the signature of the run() method:

```scala
private def run(rewardType: String, quantizeR: Int, alpha: Double, gamma: Double): Int = {
  val sPath = getClass.getResource(STOCK_PRICES).getPath
  val src = DataSource(sPath, false, false, 1).get
  val option = createOptionModel(src, quantizeR)

  val oPricesSrc = DataSource(OPTION_PRICES, false, false, 1).get
  val oPrices = oPricesSrc.extract.get

  val model = createModel(option, oPrices, alpha, gamma)
  model.map(m => {
    if (rewardType != "Random")
      display(m.bestPolicy.EQ, m.toString, s"$rewardType with quantization order $quantizeR")
    1
  }).getOrElse(-1)
}
```

Now here is the signature of the createOptionModel() method, which creates an option model (see the OptionModel class):

```scala
private def createOptionModel(src: DataSource, quantizeR: Int): OptionModel =
  new OptionModel("IBM", STRIKE_PRICE, src, MIN_TIME_EXPIRATION, quantizeR)
```

Then the createModel() method creates a model for the profit and loss on an option given the underlying security. Note that the option prices are quantized using the quantize() method defined earlier. Then a constraining method is used to limit the number of actions available from any given state. This simple implementation computes the list of all the states within a radius of a given state and identifies the neighboring states within that predefined radius. Finally, it uses the input data to train the Q-learning model and computes the minimum value for the profit and loss, so that the maximum loss is converted to a null profit. Note that the profit and loss are adjusted to produce positive values.
Now let us see the signature of this method:

```scala
def createModel(
    ibmOption: OptionModel,
    oPrice: Seq[Double],
    alpha: Double,
    gamma: Double): Try[QLModel] = {

  val qPriceMap = ibmOption.quantize(oPrice.toArray)
  val numStates = qPriceMap.size

  val neighbors = (n: Int) => {
    def getProximity(idx: Int, radius: Int): List[Int] = {
      val idx_max = if (idx + radius >= numStates) numStates - 1 else idx + radius
      val idx_min = if (idx < radius) 0 else idx - radius
      Range(idx_min, idx_max + 1).filter(_ != idx)./:(List[Int]())((xs, n) => n :: xs)
    }
    getProximity(n, NUM_NEIGHBHBOR_STATES)
  }

  val qPrice: DblVec = qPriceMap.values.toVector
  val profit: DblVec = normalize(zipWithShift(qPrice, 1).map { case (x, y) => y - x }).get
  val maxProfitIndex = profit.zipWithIndex.maxBy(_._1)._2

  val reward = (x: Double, y: Double) => Math.exp(30.0 * (y - x))
  val probabilities = (x: Double, y: Double) => if (y < 0.3 * x) 0.0 else 1.0

  println(s"$name Goal state index: $maxProfitIndex")

  if (!QLearning.validateConstraints(profit.size, neighbors))
    throw new IllegalStateException("QLearningEval Incorrect states transition constraint")

  val instances = qPriceMap.keySet.toSeq.drop(1)
  val config = QLConfig(alpha, gamma, MAX_EPISODE_LEN, NUM_EPISODES, 0.1)
  val qLearning = QLearning[Array[Int]](
    config,
    Array[Int](maxProfitIndex),
    profit,
    reward,
    probabilities,
    instances,
    Some(neighbors))

  val modelO = qLearning.getModel
  if (modelO.isDefined) {
    val numTransitions = numStates * (numStates - 1)
    println(s"$name Coverage ${modelO.get.coverage} for $numStates states and $numTransitions transitions")
    val profile = qLearning.dump
    println(s"$name Execution profile\n$profile")
    display(qLearning)
    Success(modelO.get)
  } else Failure(new IllegalStateException(s"$name model undefined"))
}
```

Note that if the preceding invocation cannot create an option model, the code fails with a message saying that the model creation failed. Nonetheless, remember that the minCoverage used in the following line is important, considering the small dataset we used (because the algorithm will converge very quickly):

```scala
val config = QLConfig(alpha, gamma, MAX_EPISODE_LEN, NUM_EPISODES, 0.0)
```

Although we have already stated that it is not assured that the model creation and training will be successful, a naive clue would be to use a very small minCoverage value, between 0.0 and 0.22. If the preceding invocation is successful, the model is trained and ready for making predictions. If so, the display method is used to display the estimated Q-value (value * probability) on a scatter plot. Here is the signature of the method:

```scala
private def display(eq: Vector[DblPair], results: String, params: String): Unit = {
  import org.scalaml.plots.{ScatterPlot, BlackPlotTheme, Legend}
  val labels = Legend(name, s"Q-learning config: $params", "States", "States")
  ScatterPlot.display(eq, labels, new BlackPlotTheme)
}
```

Hang on and do not lose patience! We are finally ready to see a simple run and inspect the result.
So let us do it:

```scala
def main(args: Array[String]): Unit = {
  run("Maximum reward", QUANTIZATION_STEP, ALPHA, DISCOUNT)
}
```

The (partial) output looks as follows:

```
Action: state 71 => state 74
Action: state 71 => state 73
Action: state 71 => state 72
Action: state 71 => state 70
Action: state 71 => state 69
Action: state 71 => state 68
...
Instance: [I@1f021e6c - state: 124
Action: state 124 => state 125
Action: state 124 => state 123
Action: state 124 => state 122
Action: state 124 => state 121
Q-learning Coverage 0.1 for 126 states and 15750 transitions
Q-learning Execution profile
Q-Value -> 5.572310105096295, 0.013869013819834967, 4.5746487300071825,
0.4037703812585325, 0.17606260549479869, 0.09205272504875522,
0.023205692430068765, 0.06363082458984902, 50.405283888218435 ... 6.5530411130514015
Model: Success(Optimal policy: Reward - 1.00,204.28,115.57,6.05,637.58,71.99,12.34,0.10,4939.71,521.30,402.73, with coverage: 0.1)
```

Evaluating the model

The preceding output shows the transitions from one state to another and, for 0.1 coverage, the Q-learning model performed 15,750 transitions across 126 states to reach goal state 37 with optimal rewards. The training set is quite small, so only a few buckets have actual values. We can therefore understand that the size of the training set has an impact on the number of states. Q-learning converges too fast for a small training set (like the one in this example). For a larger training set, however, Q-learning will take time to converge, and it will provide at least one value for each bucket created by the approximation.

Also, by looking at those raw values, it is difficult to understand the relation between Q-values and states. So what if we could see the Q-values per state? Why not! We can see them on a scatter plot:

Figure 5: Q-value per state

Now let us display the profile of the log of the Q-value (QLData.value) as the recursive search (or training) progresses over different episodes or epochs. The test uses a learning rate α = 0.1 and a discount rate γ = 0.9 (see more in the deployment section):

Figure 6: Profile of the logarithmic Q-value for different epochs during Q-learning training

The preceding chart illustrates the fact that the Q-value for each profile is independent of the order of the epochs during training. However, the number of iterations to reach the goal state depends on the initial state, which is selected randomly in this example. To get more insights, inspect the output in your editor or access the API endpoint at http://localhost:9000/api/compute (see the following). Now, what if we display the distribution of values in the model and show the estimated Q-value for the best policy on a scatter plot for the given configuration parameters?

Figure 7: Maximum reward with quantization 32 with Q-learning

The final evaluation consists of evaluating the impact of the learning rate and discount rate on the coverage of the training:

Figure 8: Impact of the learning rate and discount rate on the coverage of the training

The coverage decreases as the learning rate increases. This result confirms the general rule of using a learning rate < 0.2. A similar test evaluating the impact of the discount rate on the coverage is inconclusive.

We learned to develop a real-life application for options trading using a reinforcement learning algorithm called Q-learning. You read an excerpt from a book written by Md. Rezaul Karim, titled Scala Machine Learning Projects. In this book, you will learn to develop, build, and deploy research or commercial machine learning projects in a production-ready environment.
Check out other related posts:

- Getting started with Q-learning using TensorFlow
- How to implement Reinforcement Learning with TensorFlow
- How Reinforcement Learning works

Image filtering techniques in OpenCV

Vijin Boricha
12 Apr 2018
15 min read
In the world of computer vision, image filtering is used to modify images. These modifications essentially allow you to clarify an image in order to get the information you want. This could involve anything from extracting edges from an image, blurring it, or removing unwanted objects.

There are, of course, lots of reasons why you might want to use image filtering to modify an image. For example, taking a picture in sunlight or darkness will impact an image's clarity, and you can use image filters to modify the image to get what you want from it. Similarly, you might have a blurred or 'noisy' image that needs clarification and focus. Let's use an example to see how to do image filtering in OpenCV. This image filtering tutorial is an extract from Practical Computer Vision.

Here's an example with considerable salt and pepper noise. This occurs when there is a disturbance in the quality of the signal that's used to generate the image. The image above can be easily generated using OpenCV as follows:

```python
# initialize noise image with zeros
noise = np.zeros((400, 600))

# fill the image with random numbers in given range
cv2.randu(noise, 0, 256)
```

Let's add weighted noise to a grayscale image (on the left) so the resulting image will look like the one on the right. The code for this is as follows:

```python
# add noise to existing image
noisy_gray = gray + np.array(0.2*noise, dtype=np.int)
```

Here, 0.2 is used as the weighting parameter; increase or decrease this value to create noise of different intensity.

In several applications, noise plays an important role in improving a system's capabilities. This is particularly true when you're using deep learning models. The noise becomes a way of testing the precision of the deep learning application, and of building it into the computer vision algorithm.

Linear image filtering

The simplest filter is a point operator, where each pixel value is multiplied by a scalar value. This operation can be written as follows:

g(i,j) = K × f(i,j)

Here:

- The input image is F and the value of the pixel at (i,j) is denoted as f(i,j)
- The output image is G and the value of the pixel at (i,j) is denoted as g(i,j)
- K is a scalar constant

This type of operation on an image is what is known as a linear filter. In addition to multiplication by a scalar value, each pixel can also be increased or decreased by a constant value L. So the overall point operation can be written like this:

g(i,j) = K × f(i,j) + L

This operation can be applied both to grayscale images and RGB images. For RGB images, each channel is modified with this operation separately. The following is the result of varying both K and L. The first image is the input on the left. In the second image, K=0.5 and L=0.0, while in the third image, K is set to 1.0 and L is 10. For the final image on the right, K=0.7 and L=25.
As you can see, varying K changes the brightness of the image and varying L changes the contrast of the image. This image can be generated with the following code:

```python
import numpy as np
import matplotlib.pyplot as plt
import cv2

def point_operation(img, K, L):
    """
    Applies point operation to given grayscale image
    """
    img = np.asarray(img, dtype=np.float)
    img = img*K + L

    # clip pixel values
    img[img > 255] = 255
    img[img < 0] = 0
    return np.asarray(img, dtype=np.int)

def main():
    # read an image
    img = cv2.imread('../figures/flower.png')
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # k = 0.5, l = 0
    out1 = point_operation(gray, 0.5, 0)

    # k = 1., l = 10
    out2 = point_operation(gray, 1., 10)

    # k = 0.7, l = 25
    out3 = point_operation(gray, 0.7, 25)

    res = np.hstack([gray, out1, out2, out3])
    plt.imshow(res, cmap='gray')
    plt.axis('off')
    plt.show()

if __name__ == '__main__':
    main()
```

2D linear image filtering

While the preceding filter is a point-based filter, image pixels also carry information from the region around them. In the previous image of the flower, the pixel values in the petal are all yellow. If we choose a pixel of the petal and move around it, the values will be quite close. This gives some more information about the image. To extract this information in filtering, there are several neighborhood filters.

In neighborhood filters, there is a kernel matrix which captures local region information around a pixel. To explain these filters, let's start with an input image: a simple binary image of the number 2. To get certain information from this image, we could directly use all the pixel values. But instead, to simplify, we can apply filters on it. We define a matrix smaller than the given image which operates in the neighborhood of a target pixel; this matrix is termed the kernel. The operation is defined by first superimposing the kernel matrix on the original image, then taking the product of the corresponding pixels and returning the summation of all the products. In the following figure, the lower 3 x 3 area in the original image is superimposed with the given kernel matrix, and the corresponding pixel values from the kernel and image are multiplied. The resulting image is shown on the right and is the summation of all the previous pixel products.

This operation is repeated by sliding the kernel along image rows and then image columns. This can be implemented as in the following code; we will see the effects of applying this on an image in the coming sections:

```python
# design a kernel matrix, here is uniform 5x5
kernel = np.ones((5,5),np.float32)/25

# apply on the input image, here grayscale input
dst = cv2.filter2D(gray,-1,kernel)
```

However, as you can see previously, the corner pixels have a drastic impact, and the result is a smaller image, because the kernel, while overlapping, extends outside the image region. This causes a black region, or holes, along the boundary of an image. To rectify this, there are some common techniques (a short sketch of how OpenCV exposes them follows at the end of this section):

- Padding the borders with constant values, maybe 0 or 255
- Mirroring the pixels along the edge into the external area
- Creating a pattern of pixels around the image

The choice between these depends on the task at hand. In common cases, padding will generate satisfactory results. The effect of the kernel is the most crucial factor, as changing these values changes the output significantly. We will first see simple kernel-based filters and also see their effects on the output when changing the size.
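As mentioned above, OpenCV exposes these border strategies directly through copyMakeBorder (and through the borderType argument of its filtering functions). The following is a small illustrative sketch, not from the book; it reuses the book's sample image path, and the 10-pixel border width is an arbitrary choice:

```python
# Illustrative sketch of the border strategies listed above (not from the book)
import cv2

img = cv2.imread('../figures/flower.png')   # sample image path used throughout the book
pad = 10                                    # arbitrary border width in pixels

# constant padding (here with zeros, i.e. black)
constant = cv2.copyMakeBorder(img, pad, pad, pad, pad, cv2.BORDER_CONSTANT, value=0)

# mirroring the pixels along the edge
mirrored = cv2.copyMakeBorder(img, pad, pad, pad, pad, cv2.BORDER_REFLECT)

# repeating the edge pixels outward
replicated = cv2.copyMakeBorder(img, pad, pad, pad, pad, cv2.BORDER_REPLICATE)

# each padded image is 2*pad larger in both height and width
print(constant.shape, mirrored.shape, replicated.shape)
```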
Box filtering

This filter averages out the pixel values; its kernel matrix is uniform (every entry equal, summing to 1), like the 5 x 5 kernel of ones divided by 25 shown earlier. Applying this filter results in blurring the image. The results are shown as follows:

In frequency domain analysis of the image, this filter is a low pass filter. The frequency domain analysis is done using a Fourier transformation of the image, which is beyond the scope of this introduction. We can see that on changing the kernel size, the image gets more and more blurred. As we increase the size of the kernel, the resulting image gets more blurred. This is due to the averaging out of peak values in the small neighborhood where the kernel is applied. The result of applying a kernel of size 20 x 20 can be seen in the following image. However, if we use a very small filter of size (3, 3), there is a negligible effect on the output, because the kernel size is quite small compared to the photo size. In most applications, the kernel size is set heuristically according to the image size.

The complete code to generate box filtered photos is as follows:

```python
def plot_cv_img(input_image, output_image):
    """
    Converts an image from BGR to RGB and plots
    """
    fig, ax = plt.subplots(nrows=1, ncols=2)

    ax[0].imshow(cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB))
    ax[0].set_title('Input Image')
    ax[0].axis('off')

    ax[1].imshow(cv2.cvtColor(output_image, cv2.COLOR_BGR2RGB))
    ax[1].set_title('Box Filter (5,5)')
    ax[1].axis('off')
    plt.show()

def main():
    # read an image
    img = cv2.imread('../figures/flower.png')

    # To try different kernel, change size here.
    kernel_size = (5,5)

    # opencv has implementation for kernel based box blurring
    blur = cv2.blur(img,kernel_size)

    # Do plot
    plot_cv_img(img, blur)

if __name__ == '__main__':
    main()
```

Properties of linear filters

Several computer vision applications are composed of step-by-step transformations of an input photo to an output. This is easily done due to several properties associated with this common type of filter, that is, linear filters:

- Linear filters are commutative, so we can apply the filters in any order and the result still remains the same: a * b = b * a
- They are associative in nature, which means the order in which the filters are grouped does not affect the outcome: (a * b) * c = a * (b * c)
- Even in the case of summing two filters, we can perform the summation first and then apply the filter, or apply each filter individually and then sum the results; the overall outcome still remains the same
- Applying a scaling factor to one filter and then convolving it with another filter is equivalent to first convolving both filters and then applying the scaling factor

These properties play a significant role in other computer vision tasks such as object detection and segmentation. A suitable combination of these filters enhances the quality of information extraction and, as a result, improves the accuracy.

Non-linear image filtering

While in many cases linear filters are sufficient to get the required results, in several other use cases performance can be significantly increased by using non-linear image filtering. Non-linear image filtering is more complex than linear filtering, but this complexity can give you more control and better results in your computer vision tasks. Let's take a look at how non-linear image filtering works when applied to different images.

Smoothing a photo

Applying a box filter with hard edges doesn't result in a smooth blur on the output photo. To improve this, the filter can be made smoother around the edges.
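To make "smoother around the edges" concrete before introducing the filter by name, here is a small sketch (illustrative, not from the book) that prints a 5 x 5 smoothing kernel built with OpenCV's getGaussianKernel and checks that filtering with it matches OpenCV's built-in blur for the same kernel. The sigma value of 1.0 and the reuse of the book's sample image path are assumptions of this sketch:

```python
# Illustrative sketch (not from the book): a smoothing kernel whose weights
# fall off from the center, unlike the uniform box kernel.
import cv2
import numpy as np

g1d = cv2.getGaussianKernel(5, 1.0)        # 5x1 column vector of weights
g2d = g1d @ g1d.T                          # 5x5 kernel via outer product
print(np.round(g2d, 4))                    # the center weight is the largest

img = cv2.imread('../figures/flower.png').astype(np.float32)  # book's sample image path
a = cv2.filter2D(img, -1, g2d)             # generic kernel-based filtering
b = cv2.GaussianBlur(img, (5, 5), 1.0)     # OpenCV's dedicated smoothing function
print(float(np.abs(a - b).max()))          # expected to be tiny (floating-point rounding only)
```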
One popular smoothing filter of this kind is the Gaussian filter. Its kernel enhances the effect of the center pixel, with weights that gradually decrease for pixels farther from the center. Mathematically, a Gaussian function is given as:

G(x) = (1 / (σ√(2π))) exp(-(x - μ)² / (2σ²))

where μ is the mean and σ² is the variance. An example kernel matrix for this kind of filter in the 2D discrete domain is the 5 x 5 kernel obtained as the outer product of [1, 4, 6, 4, 1] with itself, divided by 256. This 2D array is used in normalized form, and the effect of the filter depends on its width: changing the kernel width has varying effects on the output, as discussed in a later section. Applying a Gaussian kernel as a filter removes high-frequency components, which results in removing strong edges and hence a blurred photo.

While this filter performs better blurring than a box filter, the implementation is also quite simple with OpenCV:

```python
def plot_cv_img(input_image, output_image):
    """
    Converts an image from BGR to RGB and plots
    """
    fig, ax = plt.subplots(nrows=1, ncols=2)

    ax[0].imshow(cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB))
    ax[0].set_title('Input Image')
    ax[0].axis('off')

    ax[1].imshow(cv2.cvtColor(output_image, cv2.COLOR_BGR2RGB))
    ax[1].set_title('Gaussian Blurred')
    ax[1].axis('off')
    plt.show()

def main():
    # read an image
    img = cv2.imread('../figures/flower.png')

    # apply gaussian blur,
    # kernel of size 5x5,
    # change here for other sizes
    kernel_size = (5,5)

    # sigma values are same in both direction
    blur = cv2.GaussianBlur(img,(5,5),0)

    plot_cv_img(img, blur)

if __name__ == '__main__':
    main()
```

The histogram equalization technique

The basic point operations for changing brightness and contrast help in improving photo quality, but require manual tuning. Using the histogram equalization technique, these adjustments can be found algorithmically, creating a better-looking photo. Intuitively, this method tries to set the brightest pixels to white and the darkest pixels to black. The remaining pixel values are rescaled accordingly, by transforming the original intensity distribution so that the whole intensity range is used. An example of this equalization is as follows:

The preceding image is an example of histogram equalization. On the right is the output and, as you can see, the contrast is increased significantly. The input histogram is shown in the bottom-left figure, and it can be observed that not all intensities are present in the image. After applying equalization, the resulting histogram plot is as shown in the bottom-right figure. To visualize the results of equalization in the image, the input and the results are stacked together in the following figure. The code for the preceding photos is as follows:

```python
def plot_gray(input_image, output_image):
    """
    Converts an image from BGR to RGB and plots
    """
    # change color channels order for matplotlib
    fig, ax = plt.subplots(nrows=1, ncols=2)

    ax[0].imshow(input_image, cmap='gray')
    ax[0].set_title('Input Image')
    ax[0].axis('off')

    ax[1].imshow(output_image, cmap='gray')
    ax[1].set_title('Histogram Equalized ')
    ax[1].axis('off')

    plt.savefig('../figures/03_histogram_equalized.png')
    plt.show()

def main():
    # read an image
    img = cv2.imread('../figures/flower.png')

    # grayscale image is used for equalization
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

    # following function performs equalization on input image
    equ = cv2.equalizeHist(gray)

    # for visualizing input and output side by side
    plot_gray(gray, equ)

if __name__ == '__main__':
    main()
```

Median image filtering

Median image filtering is a technique similar to neighborhood filtering.
The key technique here, of course, is the use of a median value. As such, the filter is non-linear. It is quite useful in removing sharp noise such as salt and pepper. Instead of using a product or sum of neighborhood pixel values, this filter computes the median value of the region. This results in the removal of random peak values in the region, which can be due to noise like salt and pepper noise. This is further shown in the following figure, with different kernel sizes used to create the output. In this figure, the input is first corrupted with channel-wise random noise, as follows:

```python
# read the image
flower = cv2.imread('../figures/flower.png')

# initialize noise image with zeros
noise = np.zeros(flower.shape[:2])

# fill the image with random numbers in given range
cv2.randu(noise, 0, 256)

# add noise to existing image, apply channel wise
noise_factor = 0.1
noisy_flower = np.zeros(flower.shape)
for i in range(flower.shape[2]):
    noisy_flower[:,:,i] = flower[:,:,i] + np.array(noise_factor*noise, dtype=np.int)

# convert data type for use
noisy_flower = np.asarray(noisy_flower, dtype=np.uint8)
```

The created noisy image is used for median image filtering as:

```python
# apply median filter of kernel size 5
kernel_5 = 5
median_5 = cv2.medianBlur(noisy_flower,kernel_5)

# apply median filter of kernel size 3
kernel_3 = 3
median_3 = cv2.medianBlur(noisy_flower,kernel_3)
```

In the following photo, you can see the resulting photos after varying the kernel size (indicated in brackets). The rightmost photo is the smoothest of them all. The most common application of median blur is in smartphone apps that filter an input image and add additional artifacts for artistic effects. The code to generate the preceding photograph is as follows:

```python
def plot_cv_img(input_image, output_image1, output_image2, output_image3):
    """
    Converts an image from BGR to RGB and plots
    """
    fig, ax = plt.subplots(nrows=1, ncols=4)

    ax[0].imshow(cv2.cvtColor(input_image, cv2.COLOR_BGR2RGB))
    ax[0].set_title('Input Image')
    ax[0].axis('off')

    ax[1].imshow(cv2.cvtColor(output_image1, cv2.COLOR_BGR2RGB))
    ax[1].set_title('Median Filter (3,3)')
    ax[1].axis('off')

    ax[2].imshow(cv2.cvtColor(output_image2, cv2.COLOR_BGR2RGB))
    ax[2].set_title('Median Filter (5,5)')
    ax[2].axis('off')

    ax[3].imshow(cv2.cvtColor(output_image3, cv2.COLOR_BGR2RGB))
    ax[3].set_title('Median Filter (7,7)')
    ax[3].axis('off')
    plt.show()

def main():
    # read an image
    img = cv2.imread('../figures/flower.png')

    # compute median filtered image varying kernel size
    median1 = cv2.medianBlur(img,3)
    median2 = cv2.medianBlur(img,5)
    median3 = cv2.medianBlur(img,7)

    # Do plot
    plot_cv_img(img, median1, median2, median3)

if __name__ == '__main__':
    main()
```

Image filtering and image gradients

Image gradients capture edges, or sharp changes, in a photograph. They are widely used in object detection and segmentation tasks. In this section, we will look at how to compute image gradients. First, the image derivative is computed by applying a kernel matrix that measures the change in a given direction. The Sobel filter is one such filter; its kernel in the x-direction is:

[[-1, 0, 1],
 [-2, 0, 2],
 [-1, 0, 1]]

and in the y-direction:

[[-1, -2, -1],
 [ 0,  0,  0],
 [ 1,  2,  1]]

This is applied in a similar fashion to the linear box filter, by computing values on the kernel superimposed on the photo. The filter is then shifted along the image to compute all values. The following are some example results, where X and Y denote the direction of the Sobel kernel. This is also termed an image derivative with respect to the given direction (here X or Y).
The lighter regions in the resulting photographs (middle and right) are positive gradients, the darker regions denote negative gradients, and gray is zero. While Sobel filters correspond to the first-order derivatives of a photo, the Laplacian filter gives the second-order derivative of a photo. The Laplacian filter is applied in a similar way to Sobel. The code to get Sobel and Laplacian filtered outputs is as follows:

```python
# sobel
x_sobel = cv2.Sobel(img,cv2.CV_64F,1,0,ksize=5)
y_sobel = cv2.Sobel(img,cv2.CV_64F,0,1,ksize=5)

# laplacian
lapl = cv2.Laplacian(img,cv2.CV_64F, ksize=5)

# gaussian blur
blur = cv2.GaussianBlur(img,(5,5),0)

# laplacian of gaussian
log = cv2.Laplacian(blur,cv2.CV_64F, ksize=5)
```

We learned about types of filters and how to perform image filtering in OpenCV. To know more about image transformation and 3D computer vision, check out the book Practical Computer Vision.

Check out for more:

- Fingerprint detection using OpenCV 3
- 3 ways to deploy a QT and OpenCV application
- OpenCV 4.0 is on schedule for July release

Understanding SQL Server recovery models to effectively backup and restore your database

Vijin Boricha
11 Apr 2018
9 min read
Before you even think about your backups, you need to understand the SQL Server recovery model that is used internally while the database is in operational mode. In this regard, today we will learn about SQL Server recovery models, backup, and restore.

SQL Server recovery models

A recovery model is about maintaining data in the event of a server failure. It also defines the amount of information that SQL Server writes to the log file for the purpose of recovery. SQL Server has three database recovery models:

- Simple recovery model
- Full recovery model
- Bulk-logged recovery model

Simple recovery model

This model is typically used for small databases and scenarios where data changes are infrequent. It is limited to restoring the database to the point when the last backup was created. This means that all changes made after the backup are lost, and you will need to recreate them manually. The major benefit of this model is that the log file takes only a small amount of storage space. How and when to use it depends on the business scenario.

Full recovery model

This model is recommended when recovery from damaged storage is the highest priority and data loss should be minimal. SQL Server uses copies of the database and log files to restore the database. The database engine logs all changes to the database, including bulk operations and most DDL statements. If the transaction log file is not damaged, SQL Server can recover all data except any transactions that are in process at the time of failure (that is, not yet committed to the database file). All logged transactions give you the opportunity of point-in-time recovery, which is a really cool feature (a short point-in-time restore sketch appears a little later in this section). A major limitation of this model is the large size of the log files, which can lead to performance and storage issues. Use it only in scenarios where every insert is important and loss of data is not an option.

Bulk-logged recovery model

This model is somewhere between simple and full. It uses database and log backups to recreate the database. Compared to the full recovery model, it uses less log space for CREATE INDEX and bulk load operations, such as SELECT INTO. Consider this example: SELECT INTO can load a table with 1,000,000 records in a single statement, and the log will only record the occurrence of these operations, not the details. This approach uses less storage space compared to the full recovery model. The bulk-logged recovery model is good for databases which are used for ETL processes and data migrations.

SQL Server has a system database named model. This database is the template for each new database you create. If you use only the CREATE DATABASE statement without any additional parameters, it simply copies the model database with all its properties and metadata. It also inherits the default recovery model, which is full. So, the conclusion is that each new database will be in full recovery mode. This can be changed during and after the creation process.
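As promised above, here is a brief illustrative sketch of what point-in-time recovery looks like under the full recovery model. It is not part of the original walkthrough: the backup file names match the ones created later in this article, and the STOPAT timestamp is a made-up placeholder.

```sql
-- Illustrative point-in-time restore sketch (not part of the walkthrough below).
-- Restore the full backup without recovering, then roll the log forward
-- to a specific moment in time. The timestamp is a placeholder.
RESTORE DATABASE University
FROM DISK = '/var/opt/mssql/data/University.bak'
WITH NORECOVERY, REPLACE;

RESTORE LOG University
FROM DISK = '/var/opt/mssql/data/University-log.bak'
WITH STOPAT = '2018-04-11 10:30:00', RECOVERY;
```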
Here is a SQL statement to check the recovery models of all your databases on a SQL Server on Linux instance:

```sql
1> SELECT name, recovery_model, recovery_model_desc
2> FROM sys.databases
3> GO

name                     recovery_model recovery_model_desc
------------------------ -------------- -------------------
master                   3              SIMPLE
tempdb                   3              SIMPLE
model                    1              FULL
msdb                     3              SIMPLE
AdventureWorks           3              SIMPLE
WideWorldImporters       3              SIMPLE

(6 rows affected)
```

The following DDL statement will change the recovery model of the model database from full to simple:

```sql
1> USE master
2> ALTER DATABASE model
3> SET RECOVERY SIMPLE
4> GO
```

If you now execute the SELECT statement again to check the recovery models, you will notice that model now has different properties.

Backup and restore

Now it's time for SQL coding and implementing backup/restore operations in our own environments.

1. First, let's create a full database backup of our University database:

```sql
1> BACKUP DATABASE University
2> TO DISK = '/var/opt/mssql/data/University.bak'
3> GO

Processed 376 pages for database 'University', file 'University' on file 1.
Processed 7 pages for database 'University', file 'University_log' on file 1.
BACKUP DATABASE successfully processed 383 pages in 0.562 seconds (5.324 MB/sec)
```

2. Now let's check the content of the Students table:

```sql
1> USE University
2> GO
Changed database context to 'University'
1> SELECT LastName, FirstName
2> FROM Students
3> GO

LastName        FirstName
--------------- ----------
Azemovic        Imran
Avdic           Selver
Azemovic        Sara
Doe             John

(4 rows affected)
```

3. As you can see, there are four records. Let's now simulate a large import from the AdventureWorks database, Person.Person table. We will adjust the PhoneNumber data to fit our 13-character nvarchar column. But first we will drop the unique index UQ_user_name so that we can quickly import a large amount of data:

```sql
1> DROP INDEX UQ_user_name
2> ON dbo.Students
3> GO

1> INSERT INTO Students (LastName, FirstName, Email, Phone, UserName)
2> SELECT T1.LastName, T1.FirstName, T2.PhoneNumber, NULL, 'user.name'
3> FROM AdventureWorks.Person.Person AS T1
4> INNER JOIN AdventureWorks.Person.PersonPhone AS T2
5> ON T1.BusinessEntityID = T2.BusinessEntityID
6> WHERE LEN (T2.PhoneNumber) < 13
7> AND LEN (T1.LastName) < 15 AND LEN (T1.FirstName) < 10
8> GO

(10661 rows affected)
```

4. Let's check the new row count:

```sql
1> SELECT COUNT (*) FROM Students
2> GO

-----------
10665

(1 rows affected)
```

Note: As you can see, the table now has 10,665 rows (10,661 + 4). But don't forget that we created a full database backup before the import procedure.

5. Now, we will create a differential backup of the University database:

```sql
1> BACKUP DATABASE University
2> TO DISK = '/var/opt/mssql/data/University-diff.bak'
3> WITH DIFFERENTIAL
4> GO

Processed 216 pages for database 'University', file 'University' on file 1.
Processed 3 pages for database 'University', file 'University_log' on file 1.
BACKUP DATABASE WITH DIFFERENTIAL successfully processed 219 pages in 0.365 seconds (4.676 MB/sec).
```

6. If you want to see the state of the .bak files on the disk, first enter superuser mode with sudo su. This is necessary because a regular user does not have access to the data folder.

7. Now let's test the transaction log backup of the University database log file.
However, first you will need to make some changes inside the Students table:

```sql
1> UPDATE Students
2> SET Phone = 'N/A'
3> WHERE Phone IS NULL
4> GO

1> BACKUP LOG University
2> TO DISK = '/var/opt/mssql/data/University-log.bak'
3> GO

Processed 501 pages for database 'University', file 'University_log' on file 1.
BACKUP LOG successfully processed 501 pages in 0.620 seconds (6.313 MB/sec)
```

Note: The next steps test the restore options for the full and differential backup procedures.

8. First, restore the full database backup of the University database. Remember that the Students table had four records before the first backup, and it currently has 10,665 (as we checked in step 4):

```sql
1> ALTER DATABASE University
2> SET SINGLE_USER WITH ROLLBACK IMMEDIATE
3> RESTORE DATABASE University
4> FROM DISK = '/var/opt/mssql/data/University.bak'
5> WITH REPLACE
6> ALTER DATABASE University SET MULTI_USER
7> GO

Nonqualified transactions are being rolled back. Estimated rollback completion: 0%.
Nonqualified transactions are being rolled back. Estimated rollback completion: 100%.
Processed 376 pages for database 'University', file 'University' on file 1.
Processed 7 pages for database 'University', file 'University_log' on file 1.
RESTORE DATABASE successfully processed 383 pages in 0.520 seconds (5.754 MB/sec).
```

Note: Before the restore procedure, the database is switched to single-user mode. This way we close all connections that could abort the restore procedure. In the last step, we switch the database back to multi-user mode.

9. Let's check the number of rows again. You will see that the database has been restored to its initial state, before the import of more than 10,000 records from the AdventureWorks database:

```sql
1> SELECT COUNT (*) FROM Students
2> GO

-------
4

(1 rows affected)
```

10. Now it's time to restore the content of the differential backup and return the University database to its state after the import procedure:

```sql
1> USE master
2> ALTER DATABASE University
3> SET SINGLE_USER WITH ROLLBACK IMMEDIATE
4> RESTORE DATABASE University
5> FROM DISK = N'/var/opt/mssql/data/University.bak'
6> WITH FILE = 1, NORECOVERY, NOUNLOAD, REPLACE, STATS = 5
7> RESTORE DATABASE University
8> FROM DISK = N'/var/opt/mssql/data/University-diff.bak'
9> WITH FILE = 1, NOUNLOAD, STATS = 5
10> ALTER DATABASE University SET MULTI_USER
11> GO

Processed 376 pages for database 'University', file 'University' on file 1.
Processed 7 pages for database 'University', file 'University_log' on file 1.
RESTORE DATABASE successfully processed 383 pages in 0.529 seconds (5.656 MB/sec).
Processed 216 pages for database 'University', file 'University' on file 1.
Processed 3 pages for database 'University', file 'University_log' on file 1.
RESTORE DATABASE successfully processed 219 pages in 0.309 seconds (5.524 MB/sec).
```

Now we'll look at a really cool feature of SQL Server: backup compression. A backup can be a very large file, and if a company creates backups on a daily basis, you can do the math on the amount of storage required. Disk space is cheap today, but it is not free. As a database administrator on SQL Server on Linux, you should consider any possible option to optimize and save money. Backup compression is just that kind of feature: it compresses the backup as it is created, much as a ZIP or RAR archive would, so you save time, space, and money. Let's consider a full database backup of the University database. The uncompressed file is about 3 MB; after we create a new one with compression, the size should be reduced.
The compression ratio mostly depends on the data types inside the database. It is not a magic wand, but it can save space. The following SQL command will create a full database backup of the University database and compress it:

```sql
1> BACKUP DATABASE University
2> TO DISK = '/var/opt/mssql/data/University-compress.bak'
3> WITH NOFORMAT, INIT, SKIP, NOREWIND, NOUNLOAD, COMPRESSION, STATS = 10
4> GO
```

Now exit to bash, enter superuser mode, and type the following ls command to compare the sizes of the backup files:

```
tumbleweed:/home/dba # ls -lh /var/opt/mssql/data/U*.bak
```

As you can see, the compressed backup is 676 KB, around five times smaller. That is a huge space saving without any additional tools. SQL Server on Linux also has a security feature related to backup.

We learned about SQL Server recovery models and how to efficiently back up and restore our database. You can learn more about transaction logs and the elements of a backup strategy from the book SQL Server on Linux.

Also, check out:

- Getting to know SQL Server options for disaster recovery
- Get SQL Server user management right
- How to integrate SharePoint with SQL Server Reporting Services

Recurrent neural networks and the LSTM architecture

Richard Gall
11 Apr 2018
2 min read
A recurrent neural network is a class of artificial neural networks that contains a network-like series of nodes, each with a directed, or one-way, connection to every other node. These nodes can be classified as either input, output, or hidden. Input nodes receive data from outside of the network, hidden nodes modify the input data, and output nodes provide the intended results. RNNs are well known for their extensive usage in NLP tasks. The accompanying video tutorial has been taken from Natural Language Processing with Python.

Why are recurrent neural networks well suited for NLP?

What makes RNNs so popular and effective for natural language processing tasks is that they operate sequentially over data sets. For example, a movie review is an arbitrary sequence of letters and characters, which the RNN can take as an input. The subsequent hidden and output layers are also capable of working with sequences. In a basic sentiment analysis example, you might just have a binary output, like classifying movie reviews as positive or negative. RNNs can do more than this: they are capable of generating a sequential output, such as taking an input sentence in English and translating it into Spanish. This ability to sequentially process data is what makes recurrent neural networks so well suited for NLP tasks.

RNNs and long short-term memory

Recurrent neural networks can sometimes become unstable due to the complexity of the connections they are built upon. That's where the LSTM architecture helps. LSTM introduces something called a memory cell. The memory cell simplifies what could be an incredibly complex process by using a series of different gates to govern the way it changes within the network:

- The input gate manages inputs
- The output gate manages outputs
- A self-recurrent connection keeps the memory cell in a consistent state between different steps
- The forget gate simply allows the memory cell to 'forget' its previous state

Read next

High-level concepts

- Neural Network Architectures 101: Understanding Perceptrons
- 4 ways to enable Continual learning into Neural Networks

Tutorials

- Build a generative chatbot using recurrent neural networks (LSTM RNNs)
- Training RNNs for Time Series Forecasting
- Implement Long-short Term Memory (LSTM) with TensorFlow
- How to auto-generate texts from Shakespeare writing using deep recurrent neural networks

Research in this area

- Paper in Two minutes: Attention Is All You Need

article-image-mark-zuckerberg-congressional-testimony-5-things-learned

Mark Zuckerberg's Congressional testimony: 5 things we learned

Richard Gall
11 Apr 2018
8 min read
Mark Zuckerberg yesterday (April 10 2018) testified in front of congress. That's a pretty big deal. Congress has been waiting some time for the chance to grill the Facebook chief, with "Zuck" resisting. So the fact that he finally had his day in D.C. indicates the level of pressure currently on him. Some have lamented the fact that senators were given so little time to respond to Zuckerberg - there was no time to really get deep into the issues at hand. However, although it's true that there was a lot that was superficial about the event, if you looked closely, there was plenty to take away from it. Here are the 5 of the most important things we learned from Mark Zuckerberg's testimony in front of Congress. Policy makers don't really understand that much about tech The most shocking thing to come out of Zuckerberg's testimony was unsurprising; the fact that some of the most powerful people in the U.S. don't really understand the technology that's being discussed. More importantly this is technology they're going to have to be making decisions on. One Senator brought printouts of Facebook pages and asked Zuckerberg if these were examples of Russian propaganda groups. Another was confused about Facebook's business model - how could it run a free service and still make money? Those are just two pretty funny examples, but the senators' lack of understanding could be forgiven due to their age. However, there surely isn't any excuse for 45 year old Senator Brian Schatz to misunderstand the relationship between Whatsapp and Facebook. https://twitter.com/pdmcleod/status/983809717116993537 Chris Cillizza argued on CNN that "the senate's tech illiteracy saved Zuckerberg". He explained: The problem was that once Zuckerberg responded - and he largely stuck to a very strict script in doing so - the lack of tech knowledge among those asking him questions was exposed. The result? Zuckerberg was rarely pressed, rarely forced off his talking points, almost never made to answer for the very real questions his platform faces. This lack of knowledge led to proceedings being less than satisfactory for onlookers. Until this knowledge gap is tackled, it's always going to be a challenge for political institutions to keep up with technological innovators. Ultimately, that's what makes regulation hard. Zuckerberg is still held up as the gatekeeper of tech in 2018 Zuckerberg is still held up as a gatekeeper or oracle of modern technology. That is probably a consequence of the point above. Because there's such a knowledge gap within the institutions that govern and regulate, it's more manageable for them to look to a figurehead. That, of course, goes both ways - on the one hand Zuckerberg is a fountain of knowledge, someone who can solve these problems. On the other hand is part of a Silicon Valley axis of evil, nefariously plotting the downfall of democracy and how to read your WhatsApp messages. Most people know that neither is true. The key point, though, is that however you feel about Zuckerberg, he is not the man you need to ask about regulation. This is something that Zephy Teachout argues on the Guardian. "We shouldn’t be begging for Facebook’s endorsement of laws, or for Mark Zuckerberg’s promises of self-regulation" she writes. In fact, one of the interesting subplots of the hearing was the fact that Zuckerberg didn't actually know that much. For example, a lot has been made of how extensive his notes were. 
And yes, you certainly would expect someone facing a panel of Senators in Washington to be well-briefed. But it nevertheless underlines an important point - the fact that Facebook is a complex and multi-faceted organization that far exceeds the knowledge of its founder and CEO. In turn, this tells you something about technology that's often lost within the discourse: the fact that its hard to consider what's happening at a superficial or abstract level without completely missing the point. There's a lot you could say about Zuckerberg's notes. One of the most interesting was the point around GDPR. The note is very prescriptive: it says "Don't say we already do what GDPR requires." Many have noted that this throws up a lot of issues, not least how Facebook plan to tackle GDPR in just over a month if they haven't moved on it already. But it's the suggestion that Zuckerberg was completely unaware of the situation that is most remarkable here. He doesn't even know where his company is on one of the most important pieces of data legislation for decades. Facebook is incredibly naive If senators were often naive - or plain ignorant - on matters of technology - during the hearing, there was plenty of evidence to indicate that Zuckerberg is just as naive. The GDPR issue mentioned above is just one example. But there are other problems too. You can't, for example, get much more naive than thinking that Cambridge Analytica had deleted the data that Facebook had passed to it. Zuckerberg's initial explanation was that he didn't realize that Cambridge Analytica was "not an app developer or advertiser", but he corrected this saying that his team told him they were an advertiser back in 2015, which meant they did have reason to act on it but chose not to. Zuckerberg apologized for this mistake, but it's really difficult to see how this would happen. There almost appears to be a culture of naivety within Facebook, whereby the organization generally, and Zuckerberg specifically, don't fully understand the nature of the platform it has built and what it could be used for. It's only now, with Zuckerberg talking about an "arms race" with Russia that this naivety is disappearing. But its clear there was an organizational blindspot that has got us to where we are today. Facebook still thinks AI can solve all of its problems The fact that Facebook believes AI is the solution to so many of its problems is indicative of this ingrained naivety. When talking to Congress about the 'arms race' with Russian intelligence, and the wider problem of hate speech, Zuckerberg signaled that the solution lies in the continued development of better AI systems. However, he conceded that building systems actually capable of detecting such speech could be 5 to 10 years away. This is a problem. It's proving a real challenge for Facebook to keep up with the 'misuse' of its platform. Foreign Policy reports that: "...just last week, the company took down another 70 Facebook accounts, 138 Facebook pages, and 65 Instagram accounts controlled by Russia’s Internet Research Agency, a baker’s dozen of whose executives and operatives have been indicted by Special Counsel Robert Mueller for their role in Russia’s campaign to propel Trump into the White House." However, the more AI comes to be deployed on Facebook, the more that the company is going to have to rethink how it describes itself. By using algorithms to regulate the way the platform is used, there comes to be an implicit editorializing of content. 
That's not necessarily a bad thing, but it does mean we again return to this final problem... There's still confusion about the difference between a platform and a publisher Central to every issue that was raised in Zuckerberg's testimony was the fact that Facebook remains confused about whether it is a platform or a publisher. Or, more specifically, the extent to which it is responsible for the content on the platform. It's hard to single out Zuckerberg here because everyone seems to be confused on this point. But it's interesting that he seems to have never really thought about the problem. That does seem to be changing, however. In his testimony, Zuckerberg said that "Facebook was responsible" for the content on its platforms. This statement marks a big change from the typical line used by every social media platform - that platforms are just platforms, they bear no responsibility for what is published on them. However, just when you think Zuckerberg is making a definitive statement, he steps back. He went on to say that "I agree that we are responsible for the content, but we don't produce the content." This statement hints that he still wants to keep the distinction between platform and publisher. Unfortunately for Zuckerberg, that might be too late. Read Next OpenAI charter puts safety, standards, and transparency first ‘If tech is building the future, let’s make that future inclusive and representative of all of society’ – An interview with Charlotte Jee What your organization needs to know about GDPR 20 lessons on bias in machine learning systems by Kate Crawford at NIPS 2017

article-image-what-is-lstm

What is LSTM?

Richard Gall
11 Apr 2018
3 min read
What does LSTM stand for?

LSTM stands for long short-term memory. It is a model or architecture that extends the memory of recurrent neural networks. Typically, recurrent neural networks have 'short-term memory' in that they use persistent previous information in the current neural network. Essentially, the previous information is used in the present task. That means we do not have a list of all of the previous information available to the neural node. Find out how LSTM works alongside recurrent neural networks. Watch this short video tutorial.

How does LSTM work?

LSTM introduces long-term memory into recurrent neural networks. It mitigates the vanishing gradient problem, which is where the neural network stops learning because the updates to the various weights within a given neural network become smaller and smaller. It does this by using a series of 'gates', contained in memory blocks which are connected through layers, like this:

There are three types of gates within a unit:

Input Gate: Scales input to cell (write)
Output Gate: Scales output to cell (read)
Forget Gate: Scales old cell value (reset)

Each gate is like a switch that controls the read/write, thus incorporating the long-term memory function into the model. A minimal sketch of how these gates combine in a single LSTM step appears after the related links below.

Applications of LSTM

There is a huge range of ways that LSTM can be used, including:

Handwriting recognition
Time series anomaly detection
Speech recognition
Learning grammar
Composing music

The difference between LSTM and GRU

There are many similarities between LSTM and GRU (Gated Recurrent Units). However, there are some important differences that are worth remembering:

A GRU has two gates, whereas an LSTM has three gates.
GRUs don't possess any internal memory that is different from the exposed hidden state. They don't have the output gate, which is present in LSTMs.
There is no second nonlinearity applied when computing the output in a GRU.

GRU, as a concept, is a little newer than LSTM. It is generally more efficient - it trains models at a quicker rate than LSTM. It is also easier to use: any modifications you need to make to a model can be done fairly easily. However, LSTM should perform better than GRU where longer-term memory is required. Ultimately, comparing performance is going to depend on the data set you are using.

4 ways to enable Continual learning into Neural Networks
Build a generative chatbot using recurrent neural networks (LSTM RNNs)
How Deep Neural Networks can improve Speech Recognition and generation
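To make the gate descriptions above concrete, here is a minimal NumPy sketch of a single LSTM step; the weight matrices, dimensions, and variable names are illustrative assumptions, not code from this article:

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    # W, U, b hold the parameters for the input (i), forget (f),
    # output (o) gates and the candidate cell value (g).
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # write gate
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # reset/forget gate
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # read gate
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate cell value
    c = f * c_prev + i * g                               # updated memory cell
    h = o * np.tanh(c)                                   # exposed hidden state
    return h, c

# Illustrative sizes: 4-dimensional input, 3-dimensional hidden state.
rng = np.random.default_rng(0)
W = {k: rng.standard_normal((3, 4)) for k in "ifog"}
U = {k: rng.standard_normal((3, 3)) for k in "ifog"}
b = {k: np.zeros(3) for k in "ifog"}
h, c = lstm_step(rng.standard_normal(4), np.zeros(3), np.zeros(3), W, U, b)

A GRU step looks similar but uses only update and reset gates and exposes its state directly, which is why it has fewer parameters and often trains faster.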
article-image-sql-server-user-management

Get SQL Server user management right

Vijin Boricha
10 Apr 2018
8 min read
The question who are you sounds pretty simple, right? Well, possibly not where philosophy is concerned, and neither is it where databases are concerned either. But user management is essential for anyone managing databases. In this tutorial, learn how SQL server user management works - and how to configure it in the right way. SQL Server user management: the authentication process During the setup procedure, you have to select a password which actually uses the SQL Server authentication process. This database engine comes from Windows and it is tightly connected with Active Directory and internal Windows authentication. In this phase of development, SQL Server on Linux only supports SQL authentication. SQL Server has a very secure entry point. This means no access without the correct credentials. Every information system has some way of checking a user's identity, but SQL Server has three different ways of verifying identity, and the ability to select the most appropriate method, based on individual or business needs. When using SQL Server authentication, logins are created on SQL Server. Both the user name and the password are created by using SQL Server and stored in SQL Server. Users connecting through SQL Server authentication must provide their credentials every time that they connect (user name and password are transmitted through the network). Note: When using SQL Server authentication, it is highly recommended to set strong passwords for all SQL Server accounts.  As you'll have noticed, so far you have not had any problems accessing SQL Server resources. The reason for this is very simple. You are working under the sa login. This login has unlimited SQL Server access. In some real-life scenarios, sa is not something to play with. It is good practice to create a login under a different name with the same level of access. Now let's see how to create a new SQL Server login. But, first, we'll check the list of current SQL Server logins. To do this, access the sys.sql_logins system catalog view and three attributes: name, is_policy_checked, and is_expiration_checked. The attribute name is clear; the second one will show the login enforcement password policy; and the third one is for enforcing account expiration. Both attributes have a Boolean type of value: TRUE or FALSE (1 or 0). Type the following command to list all SQL logins: 1> SELECT name, is_policy_checked, is_expiration_checked 2> FROM sys.sql_logins 3> WHERE name = 'sa' 4> GO name is_policy_checked is_expiration_checked -------------- ----------------- --------------------- sa 1 0 (1 rows affected) 2. If you want to see what your password for the sa login looks like, just type this version of the same statement. This is the result of the hash function: 1> SELECT password_hash 2> FROM sys.sql_logins 3> WHERE name = 'sa' 4> GO password_hash ------------------------------------------------------------- 0x0200110F90F4F4057F1DF84B2CCB42861AE469B2D43E27B3541628 B72F72588D36B8E0DDF879B5C0A87FD2CA6ABCB7284CDD0871 B07C58D0884DFAB11831AB896B9EEE8E7896 (1 rows affected) 3. Now let's create the login dba, which will require a strong password and will not expire: 1> USE master 2> GO Changed database context to 'master'. 1> CREATE LOGIN dba 2> WITH PASSWORD ='S0m3c00lPa$$', 3> CHECK_EXPIRATION = OFF, 4> CHECK_POLICY = ON 5> GO 4. 
Re-check the dba on the login list: 1> SELECT name, is_policy_checked, is_expiration_checked 2> FROM sys.sql_logins 3> WHERE name = 'dba' 4> GO name is_policy_checked is_expiration_checked ----------------- ----------------- --------------------- dba 1 0 (1 rows affected) Notice that dba logins do not have any kind of privilege. Let's check that part. First close your current sqlcmd session by typing exit. Now, connect again but, instead of using sa, you will connect with the dba login. After the connection has been successfully created, try to change the content of the active database to AdventureWorks. This process, based on the login name, should looks like this: # dba@tumbleweed:~> sqlcmd -S suse -U dba Password: 1> USE AdventureWorks 2> GO Msg 916, Level 14, State 1, Server tumbleweed, Line 1 The server principal "dba" is not able to access the database "AdventureWorks" under the current security context As you can see, the authentication process will not grant you anything. Simply, you can enter the building but you can't open any door. You will need to pass the process of authorization first. Authorization process After authenticating a user, SQL Server will then determine whether the user has permission to view and/or update data, view metadata, or perform administrative tasks (server-side level, database-side level, or both). If the user, or a group to which the user is amember, has some type of permission within the instance and/or specific databases, SQL Server will let the user connect. In a nutshell, authorization is the process of checking user access rights to specific securables. In this phase, SQL Server will check the login policy to determine whether there are any access rights to the server and/or database level. Login can have successful authentication, but no access to the securables. This means that authentication is just one step before login can proceed with any action on SQL Server. SQL Server will check the authorization process on every T-SQL statement. In other words, if a user has SELECT permissions on some database, SQL Server will not check once and then forget until the next authentication/authorization process. Every statement will be verified by the policy to determine whether there are any changes. Permissions are the set of rules that govern the level of access that principals have to securables. Permissions in an SQL Server system can be granted, revoked, or denied. Each of the SQL Server securables has associated permissions that can be granted to each Principal. The only way a principal can access a resource in an SQL Server system is if it is granted permission to do so. At this point, it is important to note that authentication and authorization are two different processes, but they work in conjunction with one another. Furthermore, the terms login and user are to be used very carefully, as they are not the same: Login is the authentication part User is the authorization part Prior to accessing any database on SQL Server, the login needs to be mapped as a user. Each login can have one or many user instances in different databases. For example, one login can have read permission in AdventureWorks and write permission in WideWorldImporters. This type of granular security is a great SQL Server security feature. A login name can be the same or different from a user name in different databases. In the following lines, we will create a database user dba based on login dba. The process will be based on the AdventureWorks database. 
After that we will try to enter the database and execute a SELECT statement on the Person.Person table: dba@tumbleweed:~> sqlcmd -S suse -U sa Password: 1> USE AdventureWorks 2> GO Changed database context to 'AdventureWorks'. 1> CREATE USER dba 2> FOR LOGIN dba 3> GO 1> exit dba@tumbleweed:~> sqlcmd -S suse -U dba Password: 1> USE AdventureWorks 2> GO Changed database context to 'AdventureWorks'. 1> SELECT * 2> FROM Person.Person 3> GO Msg 229, Level 14, State 5, Server tumbleweed, Line 1 The SELECT permission was denied on the object 'Person', database 'AdventureWorks', schema 'Person' We are making progress. Now we can enter the database, but we still can't execute SELECT or any other SQL statement. The reason is very simple. Our dba user still is not authorized to access any types of resources. Schema separation In Microsoft SQL Server, a schema is a collection of database objects that are owned by a single principal and form a single namespace. All objects within a schema must be uniquely named and a schema itself must be uniquely named in the database catalog. SQL Server (since version 2005) breaks the link between users and schemas. In other words, users do not own objects; schemas own objects, and principals own schemas. Users can now have a default schema assigned using the DEFAULT_SCHEMA option from the CREATE USER and ALTER USER commands. If a default schema is not supplied for a user, then the dbo will be used as the default schema. If a user from a different default schema needs to access objects in another schema, then the user will need to type a full name. For example, Denis needs to query the Contact tables in the Person schema, but he is in Sales. To resolve this, he would type: SELECT * FROM Person.Contact Keep in mind that the default schema is dbo. When database objects are created and not explicitly put in schemas, SQL Server will assign them to the dbo default database schema. Therefore, there is no need to type dbo because it is the default schema. You read a book excerpt from SQL Server on Linux written by Jasmin Azemović.  From this book, you will be able to recognize and utilize the full potential of setting up an efficient SQL Server database solution in the Linux environment. Check out other posts on SQL Server: How SQL Server handles data under the hood How to implement In-Memory OLTP on SQL Server in Linux How to integrate SharePoint with SQL Server Reporting Services

article-image-how-greedy-algorithms-work

How greedy algorithms work

Richard Gall
10 Apr 2018
2 min read
What is a greedy algorithm?

Greedy algorithms are useful for optimization problems. They make the optimal choice at a localized and immediate level, with the aim being that you'll find the optimal solution you want. It's important to note that they don't always find you the best solution for the data science problem you're trying to solve - so apply them wisely. In the video tutorial above, from Fundamental Algorithms in Scala, you'll learn when and how to apply a simple greedy algorithm, and see examples of both an iterative and a recursive algorithm in action.

The advantages and disadvantages of greedy algorithms

Greedy algorithms have a number of advantages and disadvantages. While on the one hand it's relatively easy to come up with them, it is actually pretty challenging to identify the issues around the 'correctness' of your algorithm. That means that ultimately the optimization problem you're trying to solve by using greedy algorithms isn't really a technical issue as such. Instead, it's more of an issue with the scope and definition of your data analysis project. It's a human problem, not a mechanical one.

Different ways to apply greedy algorithms

There are a number of areas where greedy algorithms can most successfully be applied. In fact, it's worth exploring some of these problems if you want to get to know them in more detail. They should give you a clearer indication of how they work, what makes them useful, and their potential drawbacks. Some of the best examples are:

Huffman coding
Dijkstra's algorithm
Continuous knapsack problem

A short sketch of a greedy solution to the continuous knapsack problem follows the links below.

To learn more about other algorithms check out these articles: 4 popular algorithms for Distance-based outlier detection; 10 machine learning algorithms every engineer needs to know; 4 Clustering Algorithms every Data Scientist should know; Backpropagation Algorithm

To learn to implement specific algorithms, use these tutorials: Creating a reference generator for a job portal using Breadth First Search (BFS) algorithm; Implementing gradient descent algorithm to solve optimization problems; Implementing face detection using the Haar Cascades and AdaBoost algorithm; Getting started with big data analysis using Google's PageRank algorithm; Implementing the k-nearest neighbors' algorithm in Python; Machine Learning Algorithms: Implementing Naive Bayes with Spark MLlib
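The continuous (fractional) knapsack problem mentioned above is one of the cleanest illustrations of the greedy strategy: always take as much as possible of the item with the best value-to-weight ratio. Here is a short, illustrative Python sketch; the item data and function name are assumptions, not taken from the course:

def fractional_knapsack(items, capacity):
    """items: list of (value, weight) pairs; capacity: maximum total weight."""
    # Greedy choice: sort by value per unit of weight, best first.
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total_value = 0.0
    for value, weight in items:
        if capacity <= 0:
            break
        take = min(weight, capacity)            # take as much of this item as fits
        total_value += value * (take / weight)  # fractional share of its value
        capacity -= take
    return total_value

# Example: three items (value, weight) and a knapsack of capacity 50.
print(fractional_knapsack([(60, 10), (100, 20), (120, 30)], 50))  # prints 240.0

Because items can be taken fractionally, this local, immediate choice is provably optimal here; for the 0/1 knapsack problem the same greedy rule can fail, which is exactly the 'correctness' issue discussed above.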

article-image-opencv-primer-what-can-you-do-with-computer-vision-and-how-to-get-started

OpenCV Primer: What can you do with Computer Vision and how to get started?

Vijin Boricha
10 Apr 2018
11 min read
Computer vision applications have become quite ubiquitous in our lives. The applications are varied, ranging from apps that play Virtual Reality (VR) or Augmented Reality (AR) games to applications for scanning documents using smartphone cameras. On our smartphones, we have QR code scanning and face detection, and now we even have facial recognition techniques. Online, we can now search using images and find similar looking images. Photo sharing applications can identify people and make an album based on the friends or family found in the photos. Due to improvements in image stabilization techniques, even with shaky hands, we can create stable videos. In this context, we will learn about basic computer vision, reading an image and image color conversion. With the recent advancements in deep learning techniques, applications like image classification, object detection, tracking, and so on have become more accurate and this has led to the development of more complex autonomous systems, such as drones, self-driving cars, humanoids, and so on. Using deep learning, images can be transformed into more complex details; for example, images can be converted into Van Gogh style paintings.  Such progress in several domains makes a non-expert wonder, how computer vision is capable of inferring this information from images. The motivation lies in human perception and the way we can perform complex analyzes of the environment around us. We can estimate the closeness of, structure and shape of objects, and estimate the textures of a surface too. Even under different lights, we can identify objects and even recognize something if we have seen it before. Considering these advancements and motivations, one of the basic questions that arise is what is computer vision? In this article, we will begin by answering this question and then provide a broader overview of the various sub-domains and applications within computer vision. Later in the article, we will start with basic image operations. What is computer vision? In order to begin the discussion on computer vision, observe the following image: Even if we have never done this activity before, we can clearly tell that the image is of people skiing in the snowy mountains on a cloudy day. This information that we perceive is quite complex and can be subdivided into more basic inferences for a computer vision System. The most basic observation that we can get from an image is of the things or objects in it. In the previous image, the various things that we can see are trees, mountains, snow, sky, people, and so on. Extracting this information is often referred to as image classification, where we would like to label an image with a predefined set of categories. In this case, the labels are the things that we see in the image. A wider observation that we can get from the previous image is landscape. We can tell that the image consists of snow, mountains, and sky, as shown in the following image: Although it is difficult to create exact boundaries for where the snow, mountain, and sky are in the image, we can still identify approximate regions of the image for each of them. This is often termed as segmentation of an image, where we break it up into regions according to object occupancy. 
Making our observation more concrete, we can further identify the exact boundaries of objects in the image, as shown in the following figure: In the image, we see that people are doing different activities and as such have different shapes; some are sitting, some are standing, some are skiing. Even with this many variations, we can detect objects and can create bounding boxes around them. Only a few bounding boxes are shown in the image for understanding—we can observe much more than these. While, in the image, we show rectangular bounding boxes around some objects, we are not categorizing what object is in the box. The next step would be to say the box contains a person. This combined observation of detecting and categorizing the box is often referred to as object detection. Extending our observation of people and surroundings, we can say that different people in the image have different heights, even though some are nearer and others are farther from the camera. This is due to our intuitive understanding of image formation and the relations of objects. We know that a tree is usually much taller than a person, even if the trees in the image are shorter than the people nearer to the camera. Extracting the information about geometry in the image is another sub-field of computer vision, often referred to as image reconstruction. Computer vision is everywhere In the previous section, we developed an initial understanding of computer vision. With this understanding, there are several algorithms that have been developed and are used in industrial applications. Studying these not only improve our understanding of the system but can also seed newer ideas to improve overall systems. In this section, we will extend our understanding of computer vision by looking at various applications and their problem formulations: Image classification: In the past few years, categorizing images based on the objects within has gained popularity. This is due to advances in algorithms as well as the availability of large datasets. Deep learning algorithms for image classification have significantly improved the accuracy while being trained on datasets like Imagenet. The trained model is often further used to improve other recognition algorithms like object detection, as well as image categorization in online applications. In this book, we will see how to create a simple algorithm to classify images using deep learning models. [box type="note" align="" class="" width=""]Here is a simple tutorial to see how to perform image classification in OpenCV to see object detection in action.[/box] Object detection: Not just self-driving cars, but robotics, automated retail stores, traffic detection, smartphone camera apps, image filters and many more applications use object detection. These also benefit from deep learning and vision techniques as well as the availability of large, annotated datasets. We saw an introduction to object detection in the previous section that produces bounding boxes around objects and also categorizes what object is inside the box. [box type="note" align="" class="" width=""]Check out this tutorial on fingerprint detection in OpenCV to see object detection in action.[/box] Object Tracking: Following robots, surveillance cameras and people interaction are few of the several applications of object tracking. This consists of defining the location and keeps track of corresponding objects across a sequence of images. 
Image geometry: This is often referred to as computing the depth of objects from the camera. There are several applications in this domain too. Smartphone apps are now capable of computing three-dimensional structures from video captured onboard. Using the three-dimensional reconstructed digital models, further extensions like AR or VR applications are developed to interface the image world with the real world.

Image segmentation: This is creating cluster regions in images, such that each cluster has similar properties. The usual approach is to cluster image pixels belonging to the same object. Recent applications have grown in self-driving cars and healthcare analysis using image regions.

Image generation: These have a greater impact in the artistic domain, merging different image styles or generating completely new ones. Now, we can mix and merge Van Gogh's painting style with smartphone camera images to create images that appear as if they were painted in a similar style to Van Gogh's.

The field is quickly evolving, not only through making newer methods of image analysis but also by finding newer applications where computer vision can be used. Therefore, applications are not just limited to those explained above. [box type="note" align="" class="" width=""]Check out this post on Image filtering techniques in OpenCV.[/box]

Getting started with image operations

In this section, we will see basic image operations for reading and writing images. We will also see how images are represented digitally. Before we proceed further with image IO, let's see what an image is made up of in the digital world. An image is simply a two-dimensional array, with each cell of the array containing intensity values. A simple image is a black and white image with 0s representing white and 1s representing black. This is also referred to as a binary image. A further extension of this is dividing black and white into a broader grayscale with a range of 0 to 255. An image of this type, in the three-dimensional view, is as follows, where x and y are pixel locations and z is the intensity value: This is a top view, but on viewing sideways we can see the variation in the intensities that make up the image: We can see that there are several peaks and image intensities that are not smooth. Let's apply a smoothing algorithm. As we can see, pixel intensities then form more continuous formations, even though there is no significant change in the object representation.

import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D
import cv2
# loads and reads an image from path to file
img = cv2.imread('../figures/building_sm.png')
# convert the color to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# resize the image (optional)
gray = cv2.resize(gray, (160, 120))
# apply smoothing operation
gray = cv2.blur(gray, (3, 3))
# create grid to plot using numpy
xx, yy = np.mgrid[0:gray.shape[0], 0:gray.shape[1]]
# create the figure
fig = plt.figure()
ax = fig.gca(projection='3d')
ax.plot_surface(xx, yy, gray, rstride=1, cstride=1, cmap=plt.cm.gray, linewidth=1)
# show it
plt.show()

This code uses the following libraries: NumPy, OpenCV, and matplotlib. In the further sections of this article, we will see operations on images using their color properties. Please download the relevant images from the website to view them clearly.

Reading an image

We can use the OpenCV library to read an image, as follows.
Here, change the path to the image file according to your setup:

import cv2
# loads and reads an image from path to file
img = cv2.imread('../figures/flower.png')
# displays previous image
cv2.imshow("Image", img)
# keeps the window open until a key is pressed
cv2.waitKey(0)
# clears all window buffers
cv2.destroyAllWindows()

The resulting image is shown in the following screenshot: Here, we read the image in BGR color format where B is blue, G is green, and R is red. Each pixel in the output is collectively represented using the values of each of the colors. An example of the pixel location and its color values is shown in the previous figure bottom.

Image color conversions

An image is made up of pixels and is usually visualized according to the value stored. There is also an additional property that makes different kinds of images. Each of the values stored in a pixel is linked to a fixed representation. For example, a pixel value of 10 can represent gray intensity value 10 or blue color intensity value 10 and so on. It is therefore important to understand different color types and their conversion. In this section, we will see color types and conversions using OpenCV. [box type="note" align="" class="" width=""]Did you know OpenCV 4 is on schedule for July release, check out this news piece to know about it in detail.[/box]

Grayscale: This is a simple one-channel image with values ranging from 0 to 255 that represent the intensity of pixels. The previous image can be converted to grayscale, as follows:

import cv2
# loads and reads an image from path to file
img = cv2.imread('../figures/flower.png')
# convert the color to grayscale
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# displays previous image
cv2.imshow("Image", gray)
# keeps the window open until a key is pressed
cv2.waitKey(0)
# clears all window buffers
cv2.destroyAllWindows()

The resulting image is as shown in the following screenshot:

HSV and HLS: These are other representations of color, where H is hue, S is saturation, V is value, and L is lightness. These are motivated by the human perception system. An example of image conversion for these is as follows:

# convert the color to hsv
hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
# convert the color to hls
hls = cv2.cvtColor(img, cv2.COLOR_BGR2HLS)

This conversion is as shown in the following figure, where an input image read in BGR format is converted to each of the HLS (on left) and HSV (on right) color types:

LAB color space: Denoted L for lightness, A for green-red colors, and B for blue-yellow colors, this consists of all perceivable colors. This is used to convert from one type of color space (for example, RGB) to others (such as CMYK) because of its device-independence properties. On devices where the format is different from that of the image that is sent, the incoming image color space is first converted to LAB and then to the corresponding space available on the device. The output of converting an RGB image is as follows. A minimal code sketch for this conversion appears after the excerpt note below.

This article is an excerpt from the book Practical Computer Vision written by Abhinav Dadhich. This book will teach you different computer vision techniques and show how to apply them in practical applications.
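Completing the color conversion examples above, here is a minimal sketch of the BGR-to-LAB conversion in OpenCV; the file path is an assumption carried over from the earlier snippets, and the snippet itself is not part of the original excerpt:

import cv2
# loads and reads an image from path to file (BGR channel order by default)
img = cv2.imread('../figures/flower.png')
# convert the color to the LAB color space
lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
# split the channels: L (lightness), A (green-red), B (blue-yellow)
l_channel, a_channel, b_channel = cv2.split(lab)
print(l_channel.shape, a_channel.shape, b_channel.shape)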

article-image-predicting-bitcoin-price-from-historical-and-live-data

Predicting Bitcoin price from historical and live data

Sunith Shetty
06 Apr 2018
17 min read
Today, you will learn how to collect Bitcoin historical and live-price data. You will also learn to transform data into time series and train your model to make insightful predictions.

Historical and live-price data collection

We will be using the Bitcoin historical price data from Kaggle. For the real-time data, the Cryptocompare API will be used.

Historical data collection

For training the ML algorithm, there is a Bitcoin Historical Price Data dataset available to the public on Kaggle (version 10). The dataset can be downloaded here. It has 1-minute OHLC data for BTC-USD pairs from several exchanges. At the beginning of the project, for most of them, data was available from January 1, 2012 to May 31, 2017; but for the Bitstamp exchange, it's available until October 20, 2017 (as well as for Coinbase, but that dataset became available later):

Figure 1: The Bitcoin historical dataset on Kaggle

Note that you need to be a registered user and be logged in in order to download the file. The file that we are using is bitstampUSD_1-min_data_2012-01-01_to_2017-10-20.csv. Now, let us get the data we have. It has eight columns:

Timestamp: The time elapsed in seconds since January 1, 1970. It is 1,325,317,920 for the first row and 1,325,317,980 for the second one. (Sanity check! The difference is 60 seconds.)
Open: The price at the opening of the time interval. It is 4.39 dollars. Therefore it is the price of the first trade that happened after Timestamp (1,325,317,920 in the first row's case).
Close: The price at the closing of the time interval.
High: The highest price from all orders executed during the interval.
Low: The same as High but it is the lowest price.
Volume_(BTC): The sum of all Bitcoins that were transferred during the time interval. So, take all transactions that happened during the selected interval and sum up the BTC values of each of them.
Volume_(Currency): The sum of all dollars transferred.
Weighted_Price: This is derived from the volumes of BTC and USD. By dividing all dollars traded by all bitcoins, we can get the weighted average price of BTC during this minute. So Weighted_Price = Volume_(Currency) / Volume_(BTC).

One of the most important parts of the data-science pipeline after data collection (which is in a sense outsourced; we use data collected by others) is data preprocessing—cleaning a dataset and transforming it to suit our needs.

Transformation of historical data into a time series

Stemming from our goal—predict the direction of price change—we might ask ourselves, does having an actual price in dollars help to achieve this? Historically, the price of Bitcoin was usually rising, so if we try to fit a linear regression, it will show further exponential growth (whether in the long run this will be true is yet to be seen).

Assumptions and design choices

One of the assumptions of this project is as follows: whether we are thinking about Bitcoin trading in November 2016 with a price of about $700, or trading in November 2017 with a price in the $6500-7000 range, patterns in how people trade are similar. Now, we have several other assumptions, as described in the following points:

Assumption one: From what has been said previously, we can ignore the actual price and rather look at its change. As a measure of this, we can take the delta between opening and closing prices. If it is positive, it means the price grew during that minute; the price went down if it is negative and stayed the same if delta = 0.
In the following figure, we can see that Delta was -1.25 for the first minute observed, -12.83 for the second one, and -0.23 for the third one. Sometimes, the open price can differ significantly from the close price of the previous minute (although Delta is negative during all three of the observed minutes, for the third minute the shown price was actually higher than close for a second). But such things are not very common, and usually the open price doesn't change significantly compared to the close price of the previous minute. Assumption two: The next need to consider...  is predicting the price change in a black box environment. We do not use other sources of knowledge such as news, Twitter feeds, and others to predict how the market would react to them. This is a more advanced topic. The only data we use is price and volume. For simplicity of the prototype, we can focus on price only and construct time series data. Time series prediction is a prediction of a parameter based on the values of this parameter in the past. One of the most common examples is temperature prediction. Although there are many supercomputers using satellite and sensor data to predict the weather, a simple time series analysis can lead to some valuable results. We predict the price at T+60 seconds, for instance, based on the price at T, T-60s, T-120s and so on. Assumption three: Not all data in the dataset is valuable. The first 600,000 records are not informative, as price changes are rare and trading volumes are small. This can affect the model we are training and thus make end results worse. That is why the first 600,000 of rows are eliminated from the dataset. Assumption four: We need to Label  our data so that we can use a supervised ML algorithm. This is the easiest measure, without concerns about transaction fees. Data preprocessing Taking into account the goals of data preparation, Scala was chosen as an easy and interactive way to manipulate data: val priceDataFileName: String = "bitstampUSD_1- min_data_2012-01-01_to_2017-10-20.csv" val spark = SparkSession .builder() .master("local[*]") .config("spark.sql.warehouse.dir", "E:/Exp/") .appName("Bitcoin Preprocessing") .getOrCreate() val data = spark.read.format("com.databricks.spark.csv").option("header", "true").load(priceDataFileName) data.show(10) >>> println((data.count(),  data.columns.size)) >>> (3045857,  8) In the preceding code, we load data from the file downloaded from Kaggle and look at what is inside. There are 3045857 rows in the dataset and 8 columns, described before. Then we create the Delta column, containing the difference between closing and opening prices (that is, to consider only that data where meaningful trading has started to occur): val dataWithDelta  = data.withColumn("Delta",  data("Close") - data("Open")) The following code labels our data by assigning 1 to the rows the Delta value of which was positive; it assigns 0 otherwise: import org.apache.spark.sql.functions._  import spark.sqlContext.implicits._  val dataWithLabels  = dataWithDelta.withColumn("label",    when($"Close"  -  $"Open"  > 0, 1).otherwise(0))  rollingWindow(dataWithLabels,  22, outputDataFilePath,  outputLabelFilePath) This code transforms the original dataset into time series data. It takes the Delta values of WINDOW_SIZE rows (22 in this experiment) and makes a new row out of them. In this way, the first row has Delta values from t0 to t21, and the second one has values from t1 to t22. Then we create the corresponding array with labels (1 or 0). 
Finally, we save X and Y into files where 612000 rows were cut off from the original dataset; 22 means rolling window size and 2 classes represents that labels are binary 0 and 1: val dropFirstCount: Int = 612000 def rollingWindow(data: DataFrame, window: Int, xFilename: String, yFilename: String): Unit = { var i = 0 val xWriter = new BufferedWriter(new FileWriter(new File(xFilename))) val yWriter = new BufferedWriter(new FileWriter(new File(yFilename))) val zippedData = data.rdd.zipWithIndex().collect() System.gc() val dataStratified = zippedData.drop(dropFirstCount)//slice 612K while (i < (dataStratified.length - window)) { val x = dataStratified .slice(i, i + window) .map(r => r._1.getAs[Double]("Delta")).toList val y = dataStratified.apply(i + window)._1.getAs[Integer]("label") val stringToWrite = x.mkString(",") xWriter.write(stringToWrite + "n") yWriter.write(y + "n") i += 1 if (i % 10 == 0) { xWriter.flush() yWriter.flush() } } xWriter.close() yWriter.close() } In the preceding code segment: val outputDataFilePath:  String = "output/scala_test_x.csv" val outputLabelFilePath:  String = "output/scala_test_y.csv" Real-time data through the Cryptocompare API For real-time data, the Cryptocompare API is used, more specifically HistoMinute, which gives us access to OHLC data for the past seven days at most. The details of the API will be discussed in a section devoted to implementation, but the API response is very similar to our historical dataset, and this data is retrieved using a regular HTTP request. For example, a simple JSON response has the following structure: {  "Response":"Success", "Type":100, "Aggregated":false, "Data":  [{"time":1510774800,"close":7205,"high":7205,"low":7192.67,"open":7198,  "volumefrom":81.73,"volumeto":588726.94},  {"time":1510774860,"close":7209.05,"high":7219.91,"low":7205,"open":7205, "volumefrom":16.39,"volumeto":118136.61},    ... (other  price data)    ], "TimeTo":1510776180,    "TimeFrom":1510774800,    "FirstValueInArray":true,    "ConversionType":{"type":"force_direct","conversionSymbol":""} } Through Cryptocompare HistoMinute, we can get open, high, low, close, volumefrom, and volumeto from each minute of historical data. This data is stored for 7 days only; if you need more, use the hourly or daily path. It uses BTC conversion if data is not available because the coin is not being traded in the specified currency: Now, the following method fetches the correctly formed URL of the Cryptocompare API, which is a fully formed URL with all parameters, such as currency, limit, and aggregation specified. 
It finally returns the future that will have a response body parsed into the data model, with the price list to be processed at an upper level: import javax.inject.Inject import play.api.libs.json.{JsResult, Json} import scala.concurrent.Future import play.api.mvc._ import play.api.libs.ws._ import processing.model.CryptoCompareResponse class RestClient @Inject() (ws: WSClient) { def getPayload(url : String): Future[JsResult[CryptoCompareResponse]] = { val request: WSRequest = ws.url(url) val future = request.get() implicit val context = play.api.libs.concurrent.Execution.Implicits.defaultContext future.map { response => response.json.validate[CryptoCompareResponse] } } } In the preceding code segment, the CryptoCompareResponse class is the model of API, which takes the following parameters: Response Type Aggregated\ Data FirstValueInArray TimeTo TimeFrom Now, it has the following signature: case class CryptoCompareResponse(Response  : String, Type : Int,  Aggregated  : Boolean, Data  :  List[OHLC], FirstValueInArray  :  Boolean, TimeTo  : Long,  TimeFrom:  Long) object CryptoCompareResponse  {  implicit  val cryptoCompareResponseReads  = Json.reads[CryptoCompareResponse] } Again, the preceding two code segments the open-high-low-close (also known as OHLC), are a model class for mapping with CryptoAPI response data array internals. It takes these parameters: Time: Timestamp in seconds, 1508818680, for instance. Open: Open price at a given minute interval. High: Highest price. Low: Lowest price. Close: Price at the closing of the interval. Volumefrom: Trading volume in the from currency. It's BTC in our case. Volumeto: The trading volume in the to currency, USD in our case. Dividing Volumeto by Volumefrom gives us the weighted price of BTC. Now, it has the following signature: case class OHLC(time:  Long,  open:  Double,  high:  Double,  low:  Double,  close:  Double,  volumefrom:  Double,  volumeto:  Double)  object OHLC  {    implicit  val implicitOHLCReads  = Json.reads[OHLC]  } Model training for prediction Inside the project, in the package folder prediction.training, there is a Scala object called TrainGBT.scala. Before launching, you have to specify/change four things: In the code, you need to set up spark.sql.warehouse.dir in some actual place on your computer that has several gigabytes of free space: set("spark.sql.warehouse.dir",  "/home/user/spark") The RootDir is the main folder, where all files and train models will be stored:rootDir  = "/home/user/projects/btc-prediction/" Make sure that the x filename matches the one produced by the Scala script in the preceding step: x  = spark.read.format("com.databricks.spark.csv").schema(xSchema).load(rootDir  + "scala_test_x.csv") Make sure that the y filename matches the one produced by Scala script: y_tmp=spark.read.format("com.databricks.spark.csv").schema (ySchema).load(rootDir + "scala_test_y.csv") The code for training uses the Apache Spark ML library (and libraries required for it) to train the classifier, which means they have to be present in your class path to be able to run it. The easiest way to do that (since the whole project uses SBT) is to run it from the project root folder by typing sbtrun-main  prediction.training.TrainGBT, which will resolve all dependencies and launch training. Depending on the number of iterations and depth, it can take several hours to train the model. Now let us see how training is performed on the example of the gradient-boosted trees model. 
First, we need to create a SparkSession object: val xSchema = StructType(Array( StructField("t0", DoubleType, true), StructField("t1", DoubleType, true), StructField("t2", DoubleType, true), StructField("t3", DoubleType, true), StructField("t4", DoubleType, true), StructField("t5", DoubleType, true), StructField("t6", DoubleType, true), StructField("t7", DoubleType, true), StructField("t8", DoubleType, true), StructField("t9", DoubleType, true), StructField("t10", DoubleType, true), StructField("t11", DoubleType, true), StructField("t12", DoubleType, true), StructField("t13", DoubleType, true), StructField("t14", DoubleType, true), StructField("t15", DoubleType, true), StructField("t16", DoubleType, true), StructField("t17", DoubleType, true), StructField("t18", DoubleType, true), StructField("t19", DoubleType, true), StructField("t20", DoubleType, true), StructField("t21", DoubleType, true)) ) Then we read the files we defined for the schema. It was more convenient to generate two separate files in Scala for data and labels, so here we have to join them into a single DataFrame: Then we read the files we defined for the schema. It was more convenient to generate two separate files in Scala for data and labels, so here we have to join them into a single DataFrame: import spark.implicits._ val y = y_tmp.withColumn("y", 'y.cast(IntegerType)) import org.apache.spark.sql.functions._ val x_id = x.withColumn("id", monotonically_increasing_id()) val y_id = y.withColumn("id", monotonically_increasing_id()) val data = x_id.join(y_id, "id") The next step is required by Spark—we need to vectorize the features: val featureAssembler = new VectorAssembler() .setInputCols(Array("t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10", "t11", "t12", "t13", "t14", "t15", "t16", "t17", "t18", "t19", "t20", "t21")) .setOutputCol("features") We split the data into train and test sets randomly in the proportion of 75% to 25%. We set the seed so that the splits would be equal among all times we run the training: val Array(trainingData,testData) = dataWithLabels.randomSplit(Array(0.75, 0.25), 123) We then define the model. It tells which columns are features and which are labels. It also sets parameters: val gbt = new GBTClassifier() .setLabelCol("label") .setFeaturesCol("features") .setMaxIter(10) .setSeed(123) Create a pipeline of steps—vector assembling of features and running GBT: val pipeline = new Pipeline() .setStages(Array(featureAssembler, gbt)) Defining evaluator function—how the model knows whether it is doing well or not. 
As we have only two classes that are imbalanced, accuracy is a bad measurement; area under the ROC curve is better: val rocEvaluator = new BinaryClassificationEvaluator() .setLabelCol("label") .setRawPredictionCol("rawPrediction") .setMetricName("areaUnderROC") K-fold cross-validation is used to avoid overfitting; it takes out one-fifth of the data at each iteration, trains the model on the rest, and then tests on this one-fifth: val cv = new CrossValidator() .setEstimator(pipeline) .setEvaluator(rocEvaluator) .setEstimatorParamMaps(paramGrid) .setNumFolds(numFolds) .setSeed(123) val cvModel = cv.fit(trainingData) After we get the trained model (which can take an hour or more depending on the number of iterations and parameters we want to iterate on, specified in paramGrid), we then compute the predictions on the test data: val predictions = cvModel.transform(testData) In addition, evaluate quality of predictions: val roc = rocEvaluator.evaluate(predictions) The trained model is saved for later usage by the prediction service: val gbtModel = cvModel.bestModel.asInstanceOf[PipelineModel] gbtModel.save(rootDir + "__cv__gbt_22_binary_classes_" + System.nanoTime()/ 1000000 + ".model") In summary, the code for model training is given as follows: import org.apache.spark.{ SparkConf, SparkContext } import org.apache.spark.ml.{ Pipeline, PipelineModel } import org.apache.spark.ml.classification.{ GBTClassificationModel, GBTClassifier, RandomForestClassificationModel, RandomForestClassifier} import org.apache.spark.ml.evaluation.{BinaryClassificationEvaluator, MulticlassClassificationEvaluator} import org.apache.spark.ml.feature.{IndexToString, StringIndexer, VectorAssembler, VectorIndexer} import org.apache.spark.ml.tuning.{CrossValidator, ParamGridBuilder} import org.apache.spark.sql.types.{DoubleType, IntegerType, StructField, StructType} import org.apache.spark.sql.SparkSession object TrainGradientBoostedTree { def main(args: Array[String]): Unit = { val maxBins = Seq(5, 7, 9) val numFolds = 10 val maxIter: Seq[Int] = Seq(10) val maxDepth: Seq[Int] = Seq(20) val rootDir = "output/" val spark = SparkSession .builder() .master("local[*]") .config("spark.sql.warehouse.dir", ""/home/user/spark/") .appName("Bitcoin Preprocessing") .getOrCreate() val xSchema = StructType(Array( StructField("t0", DoubleType, true), StructField("t1", DoubleType, true), StructField("t2", DoubleType, true), StructField("t3", DoubleType, true), StructField("t4", DoubleType, true), StructField("t5", DoubleType, true), StructField("t6", DoubleType, true), StructField("t7", DoubleType, true), StructField("t8", DoubleType, true), StructField("t9", DoubleType, true), StructField("t10", DoubleType, true), StructField("t11", DoubleType, true), StructField("t12", DoubleType, true), StructField("t13", DoubleType, true), StructField("t14", DoubleType, true), StructField("t15", DoubleType, true), StructField("t16", DoubleType, true), StructField("t17", DoubleType, true), StructField("t18", DoubleType, true), StructField("t19", DoubleType, true), StructField("t20", DoubleType, true), StructField("t21", DoubleType, true))) val ySchema = StructType(Array(StructField("y", DoubleType, true))) val x = spark.read.format("csv").schema(xSchema).load(rootDir + "scala_test_x.csv") val y_tmp = spark.read.format("csv").schema(ySchema).load(rootDir + "scala_test_y.csv") import spark.implicits._ val y = y_tmp.withColumn("y", 'y.cast(IntegerType)) import org.apache.spark.sql.functions._ //joining 2 separate datasets in single Spark dataframe val 
    val x_id = x.withColumn("id", monotonically_increasing_id())
    val y_id = y.withColumn("id", monotonically_increasing_id())
    val data = x_id.join(y_id, "id")

    val featureAssembler = new VectorAssembler()
      .setInputCols(Array("t0", "t1", "t2", "t3", "t4", "t5", "t6", "t7", "t8", "t9", "t10", "t11", "t12", "t13", "t14", "t15", "t16", "t17", "t18", "t19", "t20", "t21"))
      .setOutputCol("features")

    val encodeLabel = udf[Double, String] {
      case "1" => 1.0
      case "0" => 0.0
    }
    val dataWithLabels = data.withColumn("label", encodeLabel(data("y")))

    // 123 is the seed number to get the same data split so we can tune params
    val Array(trainingData, testData) = dataWithLabels.randomSplit(Array(0.75, 0.25), 123)

    val gbt = new GBTClassifier()
      .setLabelCol("label")
      .setFeaturesCol("features")
      .setMaxIter(10)
      .setSeed(123)

    val pipeline = new Pipeline()
      .setStages(Array(featureAssembler, gbt))

    // ***********************************************************
    println("Preparing K-fold Cross Validation and Grid Search")
    // ***********************************************************
    val paramGrid = new ParamGridBuilder()
      .addGrid(gbt.maxIter, maxIter)
      .addGrid(gbt.maxDepth, maxDepth)
      .addGrid(gbt.maxBins, maxBins)
      .build()

    val cv = new CrossValidator()
      .setEstimator(pipeline)
      .setEvaluator(new BinaryClassificationEvaluator())
      .setEstimatorParamMaps(paramGrid)
      .setNumFolds(numFolds)
      .setSeed(123)

    // ************************************************************
    println("Training model with GradientBoostedTrees algorithm")
    // ************************************************************
    // Train model. This also runs the indexers.
    val cvModel = cv.fit(trainingData)
    cvModel.save(rootDir + "cvGBT_22_binary_classes_" + System.nanoTime() / 1000000 + ".model")

    println("Evaluating model on test data and calculating AUC metrics")
    // **********************************************************************
    // Make a sample prediction
    val predictions = cvModel.transform(testData)

    // Select (prediction, true label) and compute test error.
    val rocEvaluator = new BinaryClassificationEvaluator()
      .setLabelCol("label")
      .setRawPredictionCol("rawPrediction")
      .setMetricName("areaUnderROC")
    val roc = rocEvaluator.evaluate(predictions)

    val prEvaluator = new BinaryClassificationEvaluator()
      .setLabelCol("label")
      .setRawPredictionCol("rawPrediction")
      .setMetricName("areaUnderPR")
    val pr = prEvaluator.evaluate(predictions)

    val gbtModel = cvModel.bestModel.asInstanceOf[PipelineModel]
    gbtModel.save(rootDir + "__cv__gbt_22_binary_classes_" + System.nanoTime() / 1000000 + ".model")

    println("Area under ROC curve = " + roc)
    println("Area under PR curve = " + pr)
    predictions.show(1)
    spark.stop()
  }
}

Now let us see how the training went:

>>>
Area under ROC curve = 0.6045355104779828
Area under PR curve = 0.3823834607704922

Therefore, the model is not especially strong yet: the area under the ROC curve for the best GBT model is only about 60.5%. Nevertheless, tuning the hyperparameters further may improve this result. We learned how a complete ML pipeline can be implemented, from collecting historical data to transforming it into a format suitable for testing hypotheses, and we performed machine learning model training to carry out predictions. You read an excerpt from a book written by Md. Rezaul Karim, titled Scala Machine Learning Projects. In this book, you will learn to build powerful machine learning applications for performing advanced numerical computing and functional programming.
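As a small, illustrative addendum (not part of the original excerpt): the pipeline saved above is meant to be reloaded later by a prediction service. The following sketch shows one plausible way to load the persisted PipelineModel and score a fresh row of the t0–t21 features, using PySpark for brevity; the model directory name and the feature values are made-up placeholders, and the same could of course be done from Scala with PipelineModel.load.

# Hypothetical sketch: loading the saved GBT pipeline and scoring a new feature row.
# The model path below is an assumed example; substitute the directory produced by gbtModel.save(...).
from pyspark.sql import SparkSession
from pyspark.ml import PipelineModel

spark = SparkSession.builder.master("local[*]").appName("GBT scoring sketch").getOrCreate()

# Load the persisted pipeline (VectorAssembler + GBTClassifier) from disk.
model = PipelineModel.load("output/__cv__gbt_22_binary_classes_1234567890.model")

# Build a single row with the 22 time-window features t0..t21 (values here are made up).
cols = ["t{}".format(i) for i in range(22)]
rows = [tuple(0.0 for _ in range(22))]
new_data = spark.createDataFrame(rows, cols)

# The pipeline assembles the features and appends a prediction column.
predictions = model.transform(new_data)
predictions.select("prediction").show(truncate=False)

spark.stop()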

Datasets and deep learning methodologies to extend image-based applications to videos

Sunith Shetty
05 Apr 2018
4 min read
In today’s tutorial, we will extend image based application to videos, which will include pose estimation, captioning, and generating videos. Extending image-based application to videos Images can be used for pose estimation, style transfer, image generation, segmentation, captioning, and so on. Similarly, these applications find a place in videos too. Using the temporal information may improve the predictions from images and vice versa. In this section, we will see how to extend these applications to videos. Regressing the human pose Human pose estimation is an important application of video data and can improve other tasks such as action recognition. First, let's see a description of the datasets available for pose estimation: Poses in the wild dataset: Contains 30 videos annotated with the human pose. The dataset is annotated with human upper body joints. Frames Labeled In Cinema (FLIC): A human pose dataset obtained from 30 Movies. Pfister et al. proposed a method to predict the human pose in videos. The following is the pipeline for regressing the human pose: The frames from the video are taken and passed through a convolutional network. The layers are fused, and the pose heatmaps are obtained. The pose heatmaps are combined with optical flow to get the warped heatmaps. The warped heatmaps across a timeframe are pooled to produce the pooled heatmap, getting the final pose. Tracking facial landmarks Face analysis in videos requires face detection, landmark detection, pose estimation, verification, and so on. Computing landmarks are especially crucial for capturing facial animation, human-computer interaction, and human activity recognition. Instead of computing over frames, it can be computed over video. Gu et al. proposed a method to use a joint estimation of detection and tracking of facial landmarks in videos using RNN. The results outperform frame wise predictions and other previous models. The landmarks are computed by CNN, and the temporal aspect is encoded in an RNN. Synthetic data was used for training. Segmenting videos Videos can be segmented in a better way when temporal information is used. Gadde et al. proposed a method to combine temporal information by warping. The following image demonstrates the solution, which segments two frames and combines the warping: The warping net is shown in the following image: Reproduced from Gadde et al The optical flow is computed between two frames, which are combined with warping. The warping module takes the optical flow, transforms it, and combines it with the warped representations. Captioning videos Captions can be generated for videos, describing the context. Let's see a list of the datasets available for captioning videos: Microsoft Research - Video To Text (MSR-VTT) has 200,000 video clip and sentence pairs. MPII Movie Description Corpus (MPII-MD) has 68,000 sentences with 94 movies. Montreal Video Annotation Dataset (M-VAD) has 49,000 clips. YouTube2Text has 1,970 videos with 80,000 descriptions. Yao et al. proposed a method for captioning videos. A 3D convolutional network trained for action recognition is used to extract the local temporal features. An attention mechanism is then used on the features to generate text using an RNN. The process is shown here: Reproduced from Yao et al Donahue et al. proposed another method for video captioning or description, which uses LSTM with convolution features. 
This is similar to the preceding approach, except that we use 2D convolution features here, as shown in the following image (reproduced from Donahue et al.). There are several ways to combine text with images, such as activity recognition, image description, and video description techniques; the following image (also reproduced from Donahue et al.) illustrates these techniques.

Venugopalan et al. proposed a method for video captioning using an encoder-decoder approach. The following is a visualization of the technique they proposed (reproduced from Venugopalan et al.). For this method, the CNN can be computed on the frames or on the optical flow of the images.

Generating videos

Videos can be generated using generative models in an unsupervised manner, and future frames can be predicted from the current frame. Ranzato et al. proposed a method for generating videos, inspired by language models: an RNN model takes a patch of the image and predicts the next patch.

To summarize, we learned about video-based solutions in various scenarios, such as pose estimation, facial landmark tracking, segmentation, captioning, and video generation. You read an excerpt from a book written by Rajalingappaa Shanmugamani titled, Deep Learning for Computer Vision. This book will help you learn to model and train advanced neural networks for implementation of Computer Vision tasks.
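As an illustrative aside (not from the book excerpt): the captioning pipelines described above all follow the same broad pattern of encoding per-frame CNN features and decoding a word sequence. The sketch below shows a bare-bones Keras encoder-decoder in that spirit, written with the tf.keras API; it is not the architecture of Yao et al., Donahue et al., or Venugopalan et al., and the number of frames, feature dimension, caption length, vocabulary size, and layer sizes are all arbitrary assumptions.

# Illustrative sketch only: a minimal encoder-decoder for video captioning.
# Frame features are assumed to be pre-extracted by a CNN; all shapes are arbitrary.
import numpy as np
from tensorflow.keras.layers import Input, LSTM, RepeatVector, TimeDistributed, Dense
from tensorflow.keras.models import Model

num_frames, feature_dim = 40, 2048      # assumed: 40 frames, 2048-d CNN features per frame
caption_len, vocab_size = 15, 5000      # assumed caption length and vocabulary size

# Encoder: summarize the sequence of frame features into a single vector.
frame_features = Input(shape=(num_frames, feature_dim))
encoded_video = LSTM(256)(frame_features)

# Decoder: unroll the video vector over the caption time steps and predict a word per step.
repeated = RepeatVector(caption_len)(encoded_video)
decoder_states = LSTM(256, return_sequences=True)(repeated)
word_probs = TimeDistributed(Dense(vocab_size, activation="softmax"))(decoder_states)

model = Model(frame_features, word_probs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
model.summary()

# Dummy tensors only to show the expected shapes; real training would use dataset features/captions.
x = np.random.rand(2, num_frames, feature_dim).astype("float32")
y = np.random.rand(2, caption_len, vocab_size).astype("float32")
model.train_on_batch(x, y)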

10 reasons why data scientists love Jupyter notebooks

Aarthi Kumaraswamy
04 Apr 2018
5 min read
In the last twenty years, Python has been increasingly used for scientific computing and data analysis as well. Today, the main advantage of Python and one of the main reasons why it is so popular is that it brings scientific computing features to a general-purpose language that is used in many research areas and industries. This makes the transition from research to production much easier. IPython is a Python library that was originally meant to improve the default interactive console provided by Python and to make it scientist-friendly. In 2011, ten years after the first release of IPython, the IPython Notebook was introduced. This web-based interface to IPython combines code, text, mathematical expressions, inline plots, interactive figures, widgets, graphical interfaces, and other rich media within a standalone sharable web document. This platform provides an ideal gateway to interactive scientific computing and data analysis. IPython has become essential to researchers, engineers, data scientists, teachers and their students. Within a few years, IPython gained an incredible popularity among the scientific and engineering communities. The Notebook started to support more and more programming languages beyond Python. In 2014, the IPython developers announced the Jupyter project, an initiative created to improve the implementation of the Notebook and make it language-agnostic by design. The name of the project reflects the importance of three of the main scientific computing languages supported by the Notebook: Julia, Python, and R. Today, Jupyter is an ecosystem by itself that comprehends several alternative Notebook interfaces (JupyterLab, nteract, Hydrogen, and others), interactive visualization libraries, authoring tools compatible with notebooks. Jupyter has its own conference named JupyterCon. The project received funding from several companies as well as the Alfred P. Sloan Foundation and the Gordon and Betty Moore Foundation. Apart from the rich legacy that Jupyter notebooks come from and the richer ecosystem that it provides developers, here are ten more reasons for you to start using it for your next data science project if aren’t already using it now. All in one place: The Jupyter Notebook is a web-based interactive environment that combines code, rich text, images, videos, animations, mathematical equations, plots, maps, interactive figures and widgets, and graphical user interfaces, into a single document. Easy to share: Notebooks are saved as structured text files (JSON format), which makes them easily shareable. Easy to convert: Jupyter comes with a special tool, nbconvert, which converts notebooks to other formats such as HTML and PDF. Another online tool, nbviewer, allows us to render a publicly-available notebook directly in the browser. Language independent: The architecture of Jupyter is language independent. The decoupling between the client and kernel makes it possible to write kernels in any language. Easy to create kernel wrappers: Jupyter brings a lightweight interface for kernel languages that can be wrapped in Python. Wrapper kernels can implement optional methods, notably for code completion and code inspection. Easy to customize: Jupyter interface can be used to create an entirely customized experience in the Jupyter Notebook (or another client application such as the console). Extensions with custom magic commands: Create IPython extensions with custom magic commands to make interactive computing even easier. 
Many third-party extensions and magic commands exist, for example, the %%cython magic that allows one to write Cython code directly in a notebook. Stress-free Reproducible experiments: Jupyter notebooks can help you conduct efficient and reproducible interactive computing experiments with ease. It lets you keep a detailed record of your work. Also, the ease of use of the Jupyter Notebook means that you don't have to worry about reproducibility; just do all of your interactive work in notebooks, put them under version control, and commit regularly. Don't forget to refactor your code into independent reusable components. Effective teaching-cum-learning tool: The Jupyter Notebook is not only a tool for scientific research and data analysis but also a great tool for teaching. An example is IPython Blocks - a library that allows you or your students to create grids of colorful blocks. Interactive code and data exploration: The ipywidgets package provides many common user interface controls for exploring code and data interactively. You enjoyed excerpts from Cyrille Rossant’s latest book, IPython Cookbook, Second Edition. This book contains 100+ recipes for high-performance scientific computing and data analysis, from the latest IPython/Jupyter features to the most advanced tricks, to help you write better and faster code. For free recipes from the book, head over to the Ipython Cookbook Github page. If you loved what you saw, support Cyrille’s work by buying a copy of the book today! Related Jupyter articles: Latest Jupyter news updates: Is JupyterLab all set to phase out Jupyter Notebooks? What’s new in Jupyter Notebook 5.3.0 3 ways JupyterLab will revolutionize Interactive Computing Jupyter notebooks tutorials: Getting started with the Jupyter notebook (part 1) Jupyter and Python Scripting Jupyter as a Data Laboratory: Part 1    
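To make the point about custom magic commands above a little more concrete, here is a tiny, hedged example of registering a line magic. It must be run inside an IPython or Jupyter session (register_line_magic relies on the running interactive shell), and the magic name %shout is just a made-up illustration, not a standard extension.

# Minimal sketch: registering a custom line magic inside a running IPython/Jupyter session.
from IPython.core.magic import register_line_magic

@register_line_magic
def shout(line):
    """Echo the argument in upper case - a toy example of a custom magic."""
    return line.upper()

# In a notebook cell you could then run:
#   %shout hello jupyter
# which would display 'HELLO JUPYTER'.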

Generative Models in action: How to create a Van Gogh with Neural Artistic Style Transfer

Sunith Shetty
03 Apr 2018
14 min read
In today’s tutorial, we will learn the principles behind neural artistic style transfer and show a working example to transfer the style of Van Gogh art onto an image. Neural artistic style transfer An image can be considered as a combination of style and content. The artistic style transfer technique transforms an image to look like a painting with a specific painting style. We will see how to code this idea up. The loss function will compare the generated image with the content of the photo and style of the painting. Hence, the optimization is carried out for the image pixel, rather than for the weights of the network. Two values are calculated by comparing the content of the photo with the generated image followed by the style of the painting and the generated image. Content loss Since pixels are not a good choice, we will use the CNN features of various layers, as they are a better representation of the content. The initial layers have high-frequency such as edges, corners, and textures but the later layers represent objects, and hence are better for content. The latter layer can compare the object to object better than the pixel. But for this, we need to first import the required libraries, using the following code: import  numpy as  np from PIL  import  Image from  scipy.optimize  import fmin_l_bfgs_b from  scipy.misc  import imsave from  vgg16_avg  import VGG16_Avg from  keras import  metrics from  keras.models  import Model from  keras import  backend as K  Now, let's load the required image, using the following command: content_image = Image.open(work_dir + 'bird_orig.png') We will use the following image for this instance: As we are using the VGG architecture for extracting the features, the mean of all the ImageNet images has to be subtracted from all the images, as shown in the following code: imagenet_mean = np.array([123.68, 116.779, 103.939], dtype=np.float32) def subtract_imagenet_mean(image):  return (image - imagenet_mean)[:, :, :, ::-1] Note that the channels are different. The preprocess function takes the generated image and subtracts the mean and then reverses the channel. The deprocess function reverses that effect because of the preprocessing step, as shown in the following code: def add_imagenet_mean(image, s):  return np.clip(image.reshape(s)[:, :, :, ::-1] + imagenet_mean, 0,    255) First, we will see how to create an image with the content from another image. This is a process of creating an image from random noise. The content used here is the sum of the activation in some layer. We will minimize the loss of the content between the random noise and image, which is termed as the content loss. This loss is similar to pixel-wise loss but applied on layer activations, hence will capture the content leaving out the noise. Any CNN architecture can be used to do forward inference of content image and random noise. The activations are taken and the mean squared error is calculated, comparing the activations of these two outputs. The pixel of the random image is updated while the CNN weights are frozen. We will freeze the VGG network for this case. Now, the VGG model can be loaded. Generative images are very sensitive to subsampling techniques such as max pooling. Getting back the pixel values from max pooling is not possible. Hence, average pooling is a smoother method than max pooling. 
The function to convert VGG model with average pooling is used for loading the model, as shown here: vgg_model = VGG16_Avg(include_top=False) Note that the weights are the same for this model as the original, even though the pooling type has been changed. The ResNet and Inception models are not suited for this because of their inability to provide various abstractions. We will take the activations from the last convolutional layer of the VGG model namely block_conv1, while the model was frozen. This is the third last layer from the VGG, with a wide receptive field. The code for the same is given here for your reference: content_layer = vgg_model.get_layer('block5_conv1').output Now, a new model is created with a truncated VGG, till the layer that was giving good features. Hence, the image can be loaded now and can be used to carry out the forward inference, to get the actually activated layers. A TensorFlow variable is created to capture the activation, using the following code: content_model = Model(vgg_model.input, content_layer) content_image_array = subtract_imagenet_mean(np.expand_dims(np.array(content_image), 0)) content_image_shape = content_image_array.shape target = K.variable(content_model.predict(content_image_array)) Let's define an evaluator class to compute the loss and gradients of the image. The following class returns the loss and gradient values at any point of the iteration: class ConvexOptimiser(object): def __init__(self, cost_function, tensor_shape): self.cost_function = cost_function self.tensor_shape = tensor_shape self.gradient_values = None def loss(self, point): loss_value, self.gradient_values = self.cost_function([point.reshape(self.tensor_shape)]) return loss_value.astype(np.float64) def gradients(self, point): return self.gradient_values.flatten().astype(np.float64) Loss function can be defined as the mean squared error between the values of activations at specific convolutional layers. The loss will be computed between the layers of generated image and the original content photo, as shown here: mse_loss = metrics.mean_squared_error(content_layer, target) The gradients of the loss can be computed by considering the input of the model, as shown: grads = K.gradients(mse_loss, vgg_model.input) The input to the function is the input of the model and the output will be the array of loss and gradient values as shown: cost_function = K.function([vgg_model.input], [mse_loss]+grads) This function is deterministic to optimize, and hence SGD is not required: optimiser = ConvexOptimiser(cost_function, content_image_shape) This function can be optimized using a simple optimizer, as it is convex and hence is deterministic. We can also save the image at every step of the iteration. We will define it in such a way that the gradients are accessible, as we are using the scikit-learn's optimizer, for the final optimization. Note that this loss function is convex and so, a simple optimizer is good enough for the computation. The optimizer can be defined using the following code: def optimise(optimiser, iterations, point, tensor_shape, file_name): for i in range(iterations): point, min_val, info = fmin_l_bfgs_b(optimiser.loss, point.flatten(), fprime=optimiser.gradients, maxfun=20) point = np.clip(point, -127, 127) print('Loss:', min_val) imsave(work_dir + 'gen_'+file_name+'_{i}.png', add_imagenet_mean(point.copy(), tensor_shape)[0]) return point The optimizer takes loss function, point, and gradients, and returns the updates. 
A random image needs to be generated so that the content loss will be minimized, using the following code: def generate_rand_img(shape):  return np.random.uniform(-2.5, 2.5, shape)/1 generated_image = generate_rand_img(content_image_shape) Here is the random image that is created: The optimization can be run for 10 iterations to see the results, as shown: iterations = 10 generated_image = optimise(optimiser, iterations, generated_image, content_image_shape, 'content') If everything goes well, the loss should print as shown here, over the iterations: Current loss value: 73.2010421753 Current loss value: 22.7840042114 Current loss value: 12.6585302353 Current loss value: 8.53817081451 Current loss value: 6.64649534225 Current loss value: 5.56395864487 Current loss value: 4.83072710037 Current loss value: 4.32800722122 Current loss value: 3.94804215431 Current loss value: 3.66387653351 Here is the image that is generated and now, it almost looks like a bird. The optimization can be run for further iterations to have this done: An optimizer took the image and updated the pixels so that the content is the same. Though the results are worse, it can reproduce the image to a certain extent with the content. All the images through iterations give a good intuition on how the image is generated. There is no batching involved in this process. In the next section, we will see how to create an image in the style of a painting. Style loss using the Gram matrix After creating an image that has the content of the original image, we will see how to create an image with just the style. Style can be thought of as a mix of colour and texture of an image. For that purpose, we will define style loss. First, we will load the image and convert it to an array, as shown in the following code: style_image = Image.open(work_dir + 'starry_night.png') style_image = style_image.resize(np.divide(style_image.size, 3.5).astype('int32')) Here is the style image we have loaded: Now, we will preprocess this image by changing the channels, using the following code: style_image_array = subtract_imagenet_mean(np.expand_dims(style_image, 0)[:, :, :, :3]) style_image_shape = style_image_array.shape For this purpose, we will consider several layers, like we have done in the following code: model = VGG16_Avg(include_top=False, input_shape=shp[1:]) outputs = {l.name: l.output for l in model.layers} Now, we will take multiple layers as an array output of the first four blocks, using the following code: layers = [outputs['block{}_conv1'.format(o)] for o in range(1,3)] A new model is now created, that can output all those layers and assign the target variables, using the following code: layers_model = Model(model.input, layers) targs = [K.variable(o) for o in layers_model.predict(style_arr)] Style loss is calculated using the Gram matrix. The Gram matrix is the product of a matrix and its transpose. The activation values are simply transposed and multiplied. This matrix is then used for computing the error between the style and random images. The Gram matrix loses the location information but will preserve the texture information. 
We will define the Gram matrix using the following code: def grammian_matrix(matrix):  flattened_matrix = K.batch_flatten(K.permute_dimensions(matrix, (2, 0, 1)))  matrix_transpose_dot = K.dot(flattened_matrix, K.transpose(flattened_matrix))  element_count = matrix.get_shape().num_elements()  return matrix_transpose_dot / element_count As you might be aware now, it is a measure of the correlation between the pair of columns. The height and width dimensions are flattened out. This doesn't include any local pieces of information, as the coordinate information is disregarded. Style loss computes the mean squared error between the Gram matrix of the input image and the target, as shown in the following code def style_mse_loss(x, y):  return metrics.mse(grammian_matrix(x), grammian_matrix(y)) Now, let's compute the loss by summing up all the activations from the various layers, using the following code: style_loss = sum(style_mse_loss(l1[0], l2[0]) for l1, l2 in zip(style_features, style_targets)) grads = K.gradients(style_loss, vgg_model.input) style_fn = K.function([vgg_model.input], [style_loss]+grads) optimiser = ConvexOptimiser(style_fn, style_image_shape) We then solve it as the same way we did before, by creating a random image. But this time, we will also apply a Gaussian filter, as shown in the following code: generated_image = generate_rand_img(style_image_shape) The random image generated will look like this: The optimization can be run for 10 iterations to see the results, as shown below: generated_image = optimise(optimiser, iterations, generated_image, style_image_shape) If everything goes well, the solver should print the loss values similar to the following: Current loss value: 5462.45556641 Current loss value: 189.738555908 Current loss value: 82.4192581177 Current loss value: 55.6530838013 Current loss value: 37.215713501 Current loss value: 24.4533748627 Current loss value: 15.5914745331 Current loss value: 10.9425945282 Current loss value: 7.66888141632 Current loss value: 5.84042310715 Here is the image that is generated: Here, from a random noise, we have created an image with a particular painting style without any location information. In the next section, we will see how to combine both—the content and style loss. Style transfer Now we know how to reconstruct an image, as well as how to construct an image that captures the style of an original image. The obvious idea may be to just combine these two approaches by weighting and adding the two loss functions, as shown in the following code: w,h = style.size src = img_arr[:,:h,:w] Like before, we're going to grab a sequence of layer outputs to compute the style loss. However, we still only need one layer output to compute the content loss. How do we know which layer to grab? As we discussed earlier, the lower the layer, the more exact the content reconstruction will be. In merging content reconstruction with style, we might expect that a looser reconstruction of the content will allow more room for the style to affect (re: inspiration). Furthermore, a later layer ensures that the image looks like the same subject, even if it doesn't have the same details. 
The following code is used for this process:

style_layers = [outputs['block{}_conv2'.format(o)] for o in range(1,6)]
content_name = 'block4_conv2'
content_layer = outputs[content_name]

Now, a separate model for style is created with the required output layers, using the following code:

style_model = Model(model.input, style_layers)
style_targs = [K.variable(o) for o in style_model.predict(style_arr)]

We will also create another model for the content with the content layer, using the following code:

content_model = Model(model.input, content_layer)
content_targ = K.variable(content_model.predict(src))

Now, merging the two approaches is as simple as merging their respective loss functions. Note that, as opposed to our previous functions, this function is producing three separate types of outputs:

One for the original image
One for the image whose style we're emulating
One for the random image whose pixels we are training

One way for us to tune how the reconstructions mix is by changing the factor on the content loss, which we have here as 1/10. If we increase that denominator, the style will have a larger effect on the image, and if it's too large, the original content of the image will be obscured by an unstructured style. Likewise, if it is too small, then the image will not have enough style. We will use the following code for this process:

style_wgts = [0.05, 0.2, 0.2, 0.25, 0.3]

The loss function takes both style and content layers, as shown here (note that we reuse the style_mse_loss function and the ConvexOptimiser class defined earlier):

loss = sum(style_mse_loss(l1[0], l2[0])*w
           for l1, l2, w in zip(style_layers, style_targs, style_wgts))
loss += metrics.mse(content_layer, content_targ)/10
grads = K.gradients(loss, model.input)
transfer_fn = K.function([model.input], [loss]+grads)
optimiser = ConvexOptimiser(transfer_fn, shp)

We will run the solver for 10 iterations as before, using the following code:

iterations = 10
x = generate_rand_img(shp)
x = optimise(optimiser, iterations, x, shp, 'transfer')

The loss values should be printed as shown here:

Current loss value: 2557.953125
Current loss value: 732.533630371
Current loss value: 488.321166992
Current loss value: 385.827178955
Current loss value: 330.915924072
Current loss value: 293.238189697
Current loss value: 262.066864014
Current loss value: 239.34185791
Current loss value: 218.086700439
Current loss value: 203.045211792

These results are remarkable. Each one of them does a fantastic job of recreating the original image in the style of the artist. The generated image will look like the following:

We will now conclude the style transfer section. This operation is really slow but can work with any images. In the next section of the book, a similar idea is used to create a superresolution network. There are several ways to make this better, such as:

Adding a Gaussian filter to the random starting image (a short sketch of this follows at the end of this excerpt)
Adding different weights to the layers
Using different layers and weights for the content
Initializing with an image rather than random noise
Preserving the color
Using masks to specify what is required
Converting any sketch to a painting
Drawing a sketch and creating the image

Any image can be converted to artistic style by training a CNN to output such an image. To summarize, we learned how to transfer style from one image to another while preserving the content as is. You read an excerpt from a book written by Rajalingappaa Shanmugamani titled Deep Learning for Computer Vision. In this book, you will learn how to model and train advanced neural networks to implement a variety of Computer Vision tasks.
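As a small illustration of the first improvement listed above (smoothing the random starting image), the following sketch applies a Gaussian filter from SciPy to the noise image before optimization. It assumes SciPy is available and reuses the shapes and helper names from the earlier snippets; the sigma value is an arbitrary choice, not taken from the book.

# Sketch of the first suggested improvement: smoothing the random starting image
# with a Gaussian filter before optimization. Assumes SciPy is available; the sigma
# value and the generate_rand_img-style noise range follow the earlier snippets.
import numpy as np
from scipy.ndimage import gaussian_filter

def generate_smoothed_rand_img(shape, sigma=1.0):
    noise = np.random.uniform(-2.5, 2.5, shape)
    # Smooth only the spatial axes (height and width), leaving batch and channel axes alone.
    return gaussian_filter(noise, sigma=(0, sigma, sigma, 0))

# Example usage with the helpers defined earlier in this excerpt:
# generated_image = generate_smoothed_rand_img(content_image_shape)
# generated_image = optimise(optimiser, iterations, generated_image, content_image_shape, 'smooth')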

Top 5 free Business Intelligence tools

Amey Varangaonkar
02 Apr 2018
7 min read
There is no shortage of business intelligence tools available to modern businesses today. But they're not always easy on the pocket. Great functionality, stylish UI and ease of use always comes with a price tag. If you can afford it, great - if not, it's time to start thinking about open source and free business intelligence tools.  Free business intelligence tools can power your business Take a look at 5 of the best free or open source business intelligence tools. They're all as effective and powerful as anything you'd pay a premium for. You simply need to know what you're doing with them. BIRT BIRT (Business Intelligence and Reporting Tools) is an open-source project that offers industry-standard reporting and BI capabilities. It's available as both a desktop and web application. As a top-level project within the umbrella of the Eclipse Foundation, it's got a good pedigree that means you can be confident in its potency. BIRT is especially useful for businesses which have a working environment built around Java and Java EE, as its reporting and charting engines can integrate seamlessly with Java. From creating a range of reports to different types of charts and graphs, BIRT can also be used for advanced analytical tasks. You can learn about the impressive reporting capabilities that BIRT offers on its official features page. Pros: The BIRT platform is one of the most popularly used open source business intelligence tools across the world, with more than 12 million downloads and 2.5 million users across more than 150 countries. With a large community of users, getting started with this tool, or getting solutions to problems that you might come across should be easy. Cons: Some programming experience, preferably in Java, is required to make the best use of this tool. The complex functions and features may not be easy to grasp for absolute beginners. Jaspersoft Community Jaspersoft, formerly known as Panscopic, is one of the leading open source suites of tools for a variety of reporting and business intelligence tasks. It was acquired by TIBCO in 2014 in a deal worth approximately $185 million, and has grown in popularity ever since. Jaspersoft began with the promise of “saving the world from the oppression of complex, heavyweight business intelligence”, and the Community edition offers the following set of tools for easier reporting and analytics: JasperReports Server: This tool is used for designing standalone or embeddable reports which can be used across third party applications JasperReports Library: You can design pixel-perfect reports from different kinds of datasets Jaspersoft ETL: This is a popular warehousing tool powered by Talend for extracting useful insights from a variety of data sources Jaspersoft Studio: Eclipse-based report designer for JasperReports and JasperReports Server Visualize.js: A JavaScript-based framework to embed Jaspersoft applications Pros: Jaspersoft, like BIRT, has a large community of developers looking to actively solve any problem they might come across. More often than not, your queries are bound to be answered satisfactorily. Cons: Absolute beginners might struggle with the variety of offerings and their applications. The suite of Jaspersoft tools is more suited for someone with an intermediate programming experience. KNIME KNIME is a free, open-source data analytics and business intelligence company that offers a robust platform for reporting and data integration. 
Used commonly by data scientists and analysts, KNIME offers features for data mining, machine learning and data visualization in order to build effective end-to-end data pipelines. There are 2 major product offerings from KNIME: KNIME Analytics Platform KNIME Cloud Analytics Platform Considered to be one of the most established players in the Analytics and business intelligence market, KNIME has customers in over 60 countries worldwide. You can often find KNIME featured as a ‘Leader’ in the Gartner Magic Quadrant. It finds applications in a variety of enterprise use-cases, including pharma, CRM, finance, and more. Pros: If you want to leverage the power of predictive analytics and machine learning, KNIME offers you just the perfect environment to build industry-standard, accurate models. You can create a wide variety of visualizations including complex plots and charts, and perform complex ETL tasks with relative ease. Cons: KNIME is not suited for beginners. It's built instead for established professionals such as data scientists and analysts who want to conduct analyses quickly and efficiently. Tableau Public Tableau Public’s promise is simple - “Visualize and share your data in minutes - for free”. Tableau is one of the most popular business intelligence tools out there, rivalling the likes of Qlik, Spotfire, Power BI among others. Along with its enterprise edition which offers premium analytics, reporting and dashboarding features, Tableau also offers a freely available Public version for effective visual analytics. Last year, Tableau released an announcement that the interactive stories and reports published on the Tableau Public platform had received more than 1 billion views worldwide. Leading news organizations around the world, including BBC and CNBC, use Tableau Public for data visualization. Pros: Tableau Public is a very popular tool with a very large community of users. If you find yourself struggling to understand or execute any feature on this platform, there are ample number of solutions available on the community forums and also on forums such as Stack Overflow. The quality of visualizations is industry-standard, and you can publish them anywhere on the web without any hassle. Cons:It’s quite difficult to think of any drawback of using Tableau Public, to be honest. Having limited features as compared to the enterprise edition of Tableau is obviously a shortcoming, though. [box type="info" align="" class="" width=""]Editor’s tip: If you want to get started with Tableau Public and create interesting data stories using it, Creating Data Stories with Tableau Public is one book you do not want to miss out on![/box] Microsoft Power BI Microsoft Power BI is a paid, enterprise-ready offering by Microsoft to empower businesses to find intuitive data insights across a variety of data formats. Microsoft also offers a stripped-down version of Power BI with limited Business Intelligence capabilities called as Power BI Desktop. In this free version, users are offered up to 1 GB of data to work on, and the ability to create different kinds of visualizations on CSV data as well as Excel spreadsheets. The reports and visualizations built using Power BI Desktop can be viewed on mobile devices as well as on browsers, and can be updated on the go. Pros: Free, very easy to use. Power BI Desktop allows you to create intuitive visualizations and reports. For beginners looking to learn the basics of Business Intelligence and data visualization, this is a great tool to use. 
You can also work with any kind of data and connect it to the Power BI Desktop effortlessly. Cons: You don’t get the full suite of features on Power BI Desktop which make Power BI such an elegant and wonderful Business Intelligence tool. Also, new reports and dashboards cannot be created via the mobile platform. [box type="info" align="" class="" width=""]Editor’s Tip: If you want to get started with Microsoft Power BI, or want handy tips on using Power BI effectively, our Microsoft Power BI Cookbook will prove to be of great use! [/box] There are a few other free and open source tools which are quite effective and find a honorary mention in this article. We were absolutely spoilt for choices, and choosing the top 5 tools list among all these options was a lot of hard work! Some other tools which deserve a honorary mention are - Dataiku Free Edition, Pentaho Community Edition, QlikView Personal Edition, Rapidminer, among others. You may want to check them out as well. What do you think about this list? Are there any other free/open source business intelligence tools which should’ve made it into list?

3 ways to deploy a QT and OpenCV application

Gebin George
02 Apr 2018
16 min read
[box type="note" align="" class="" width=""]This article is an excerpt from the book, Computer Vision with OpenCV 3 and Qt5 written by Amin Ahmadi Tazehkandi.  This book covers how to build, test, and deploy Qt and OpenCV apps, either dynamically or statically.[/box] Today, we will learn three different methods to deploy a QT + OpenCV application. It is extremely important to provide the end users with an application package that contains everything it needs to be able to run on the target platform. And demand very little or no effort at all from the users in terms of taking care of the required dependencies. Achieving this kind of works-out-of-the-box condition for an application relies mostly on the type of the linking (dynamic or static) that is used to create an application, and also the specifications of the target operating system. Deploying using static linking Deploying an application statically means that your application will run on its own and it eliminates having to take care of almost all of the needed dependencies, since they are already inside the executable itself. It is enough to simply make sure you select the Release mode while building your application, as seen in the following screenshot: When your application is built in the Release mode, you can simply pick up the produced executable file and ship it to your users. If you try to deploy your application to Windows users, you might face an error similar to the following when your application is executed: The reason for this error is that on Windows, even when building your Qt application statically, you still need to make sure that Visual C++ Redistributables exist on the target system. This is required for C++ applications that are built by using Microsoft Visual C++, and the version of the required redistributables correspond to the Microsoft Visual Studio installed on your computer. In our case, the official title of the installer for these libraries is Visual C++ Redistributables for Visual Studio 2015, and it can be downloaded from the following link: https:/ / www. microsoft. com/en- us/ download/ details. aspx? id= 48145. It is a common practice to include the redistributables installer inside the installer for our application and perform a silent installation of them if they are not already installed. This process happens with most of the applications you use on your Windows PCs, most of the time, without you even noticing it. We already quite briefly talked about the advantages (fewer files to deploy) and disadvantages (bigger executable size) of static linking. But when it is meant in the context of deployment, there are some more complexities that need to be considered. So, here is another (more complete) list of disadvantages, when using static linking to deploy your applications: The building takes more time and the executable size gets bigger and bigger. You can't mix static and shared (dynamic) Qt libraries, which means you can't use the power of plugins and extending your application without building everything from scratch. Static linking, in a sense, means hiding the libraries used to build an application. Unfortunately, this option is not offered with all libraries, and failing to comply with it can lead to licensing issues with your application. This complexity arises partly because of the fact that Qt Framework uses some third-party libraries that do not offer the same set of licensing options as Qt itself. 
Talking about licensing issues is not a discussion suitable for this book, so we'll suffice with mentioning that you must be careful when you plan to create commercial applications using static linking of Qt libraries. For a detailed list of licenses used by third-party libraries within Qt, you can always refer to the Licenses Used in Qt web page at the following link: http://doc.qt.io/qt-5/licenses-used-in-qt.html

Static linking, even with all of the disadvantages we just mentioned, is still an option, and a good one in some cases, provided that you can comply with the licensing options of the Qt Framework. For instance, on Linux operating systems, where creating an installer for our application requires some extra work and care, static linking can greatly reduce the effort needed to deploy applications (merely a copy and paste). So, the final decision of whether to use static linking or not is mostly up to you and how you plan to deploy your application. Making this important decision will be much easier by the end of this chapter, when you have an overview of the possible linking and deployment methods.

Deploying using dynamic linking

When you deploy an application built with Qt and OpenCV using shared libraries (or dynamic linking), you need to make sure that the executable of your application is able to reach the runtime libraries of Qt and OpenCV, in order to load and use them. This reachability or visibility of runtime libraries can have different meanings depending on the operating system. For instance, on Windows, you need to copy the runtime libraries to the same folder where your application executable resides, or put them in a folder that is appended to the PATH environment value.

Qt Framework offers command-line tools to simplify the deployment of Qt applications on Windows and macOS. As mentioned before, the first thing you need to do is to make sure your application is built in the Release mode, and not Debug mode. Then, if you are on Windows, first copy the executable (let us assume it is called app.exe) from the build folder into a separate folder (which we will refer to as deploy_path) and execute the following commands using a command-line instance:

cd deploy_path
QT_PATH\bin\windeployqt app.exe

The windeployqt tool is a deployment helper that simplifies the process of copying the required Qt runtime libraries into the same folder as the application executable. It simply takes an executable as a parameter and, after determining the modules used to create it, copies all required runtime libraries and any additional required dependencies, such as Qt plugins, translations, and so on.

This takes care of all the required Qt runtime libraries, but we still need to take care of the OpenCV runtime libraries. If you followed all of the steps in Chapter 1, Introduction to OpenCV and Qt, for building OpenCV libraries dynamically, then you only need to manually copy the opencv_world330.dll and opencv_ffmpeg330.dll files from the OpenCV installation folder (inside the x86\vc14\bin folder) into the same folder where your application executable resides. We didn't really go into the benefits of turning on the BUILD_opencv_world option when we built OpenCV in the early chapters of the book; however, it should be clear now that this simplifies the deployment and usage of the OpenCV libraries, by requiring only a single entry for LIBS in the *.pro file and manually copying only a single file (not counting the ffmpeg library) when deploying OpenCV applications.
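The manual DLL copy described above can also be scripted. The following is a minimal sketch (not from the book) that copies the two OpenCV runtime DLLs next to the deployed executable; the OpenCV install path and the deploy folder are assumed examples that you would need to adapt to your own setup.

# Rough helper sketch for the manual step described above: copying the OpenCV runtime
# DLLs next to the deployed executable on Windows. All paths are examples and must be
# adapted to your own OpenCV install directory and deploy folder.
import shutil
from pathlib import Path

OPENCV_BIN = Path(r"C:\opencv\install\x86\vc14\bin")   # assumed OpenCV install location
DEPLOY_PATH = Path(r"C:\deploy_path")                   # folder that already contains app.exe

def copy_opencv_runtime(dlls=("opencv_world330.dll", "opencv_ffmpeg330.dll")):
    DEPLOY_PATH.mkdir(parents=True, exist_ok=True)
    for name in dlls:
        src = OPENCV_BIN / name
        if src.exists():
            shutil.copy2(src, DEPLOY_PATH / name)
            print("copied", name)
        else:
            print("missing", src)

if __name__ == "__main__":
    copy_opencv_runtime()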
It should be also noted that this method has the disadvantage of copying all OpenCV codes (in a single library) along your application even when you do not need or use all of its modules in a project. Also note that on Windows, as mentioned in the Deploying using static linking section, you still need to similarly provide the end users of your application with Microsoft Visual C++ Redistributables. On a macOS operating system, it is also possible to easily deploy applications written using Qt Framework. For this reason, you can use the macdeployqt command-line tool provided by Qt. Similar to windeployqt, which accepts a Windows executable and fills the same folder with the required libraries, macdeployqt accepts a macOS application bundle and makes it deployable by copying all of the required Qt runtimes as private frameworks inside the bundle itself. Here is an example: cd deploy_path QT_PATH/bin/macdeployqt my_app_bundle Optionally, you can also provide an additional -dmg parameter, which leads to the creation of a macOS *.dmg (disk image) file. As for the deployment of OpenCV libraries when dynamic linking is used, you can create an installer using Qt Installer Framework (which we will learn about in the next section), a third-party provider, or a script that makes sure the required runtime libraries are copied to their required folders. This is because of the fact that simply copying your runtime libraries (whether it is OpenCV or anything else) to the same folder as the application executable does not help with making them visible to an application on macOS. The same also applies to the Linux operating system, where unfortunately even a tool for deploying Qt runtime libraries does not exist (at least for the moment), so we also need to take care of Qt libraries in addition to OpenCV libraries, either by using a trusted third-party provider (which you can search for online) or by using the cross-platform installer provided by Qt itself, combined with some scripting to make sure everything is in place when our application is executed. Deploy using Qt Installer Framework Qt Installer Framework allows you to create cross-platform installers of your Qt applications for Windows, macOS, and Linux operating systems. It allows for creating standard installer wizards where the user is taken through consecutive dialogs that provide all the necessary information, and finally display the progress for when the application is being installed and so on, similar to most of installations you have probably faced, and especially the installation of Qt Framework itself. Qt Installer Framework is based on Qt Framework itself but is provided as a different package and does not require Qt SDK (Qt Framework, Qt Creator, and so on) to be present on a computer. It is also possible to use Qt Installer Framework in order to create installer packages for any application, not just Qt applications. In this section, we are going to learn how to create a basic installer using Qt Installer Framework, which takes care of installing your application on a target computer and copying all the necessary dependencies. The result will be a single executable installer file that you can put on a web server to be downloaded or provide it in a USB stick or CD, or any other media type. This example project will help you get started with working your way around the many great capabilities of Qt Installer Framework by yourself. You can use the following link to download and install the Qt Installer Framework. 
Make sure to simply download the latest version when you use this link, or any other source for downloading it. At the moment, the latest version is 3.0.2: https://download.qt.io/official_releases/qt-installer-framework After you have downloaded and installed Qt Installer Framework, you can start creating the required files that Qt Installer Framework needs in order to create an installer. You can do this by simply browsing to the Qt Installer Framework, and from the examples folder copying the tutorial folder, which is also a template in case you want to quickly rename and re-edit all of the files and create your installer quickly. We will go the other way and create them manually; first because we want to understand the structure of the required files and folders for the Qt Installer Framework, and second, because it is still quite easy and simple. Here are the required steps for creating an installer: Assuming that you have already finished developing your Qt and OpenCV application, you can start by creating a new folder that will contain the installer files. Let's assume this folder is called deploy. Create an XML file inside the deploy folder and name it config.xml. This XML file must contain the following: <?xml version="1.0" encoding="UTF-8"?> <Installer> <Name>Your application</Name> <Version>1.0.0</Version> <Title>Your application Installer</Title> <Publisher>Your vendor</Publisher> <StartMenuDir>Super App</StartMenuDir> <TargetDir>@HomeDir@/InstallationDirectory</TargetDir> </Installer> Make sure to replace the required XML fields in the preceding code with information relevant to your application and then save and close this file: Now, create a folder named packages inside the deploy folder. This folder will contain the individual packages that you want the user to be able to install, or make them mandatory or optional so that the user can review and decide what will be installed. In the case of simpler Windows applications that are written using Qt and OpenCV, usually it is enough to have just a single package that includes the required files to run your application, and even do silent installation of Microsoft Visual C++ Redistributables. But for more complex cases, and especially when you want to have more control over individual installable elements of your application, you can also go for two or more packages, or even sub-packages. This is done by using domain-like folder names for each package. Each package folder can have a name like com.vendor.product, where vendor and product are replaced by the developer name or company and the application. A subpackage (or sub-component) of a package can be identified by adding. subproduct to the name of the parent package. For instance, you can have the following folders inside the packages folder: com.vendor.product com.vendor.product.subproduct1 com.vendor.product.subproduct2 com.vendor.product.subproduct1.subsubproduct1 … This can go on for as many products (packages) and sub-products (sub-packages) as we like. For our example case, let's create a single folder that contains our executable, since it describes it all and you can create additional packages by simply adding them to the packages folder. Let's name it something like com.amin.qtcvapp. Now, follow these required steps: Now, create two folders inside the new package folder that we created, the com.amin.qtcvapp folder. Rename them to data and meta. These two folders must exist inside all packages. Copy your application files inside the data folder. 
This folder will be extracted into the target folder exactly as it is (we will talk about setting the target folder of a package in the later steps). In case you are planning to create more than one package, then make sure to separate their data correctly and in a way that it makes sense. Of course, you won't be faced with any errors if you fail to do so, but the users of your application will probably be confused, for instance by skipping a package that should be installed at all times and ending up with an installed application that does not work. Now, switch to the meta folder and create the following two files inside that folder, and fill them with the codes provided for each one of them. The package.xml file should contain the following. There's no need to mention that you must fill the fields inside the XML with values relevant to your package: <?xml version="1.0" encoding="UTF-8"?> <Package> <DisplayName>The component</DisplayName> <Description>Install this component.</Description> <Version>1.0.0</Version> <ReleaseDate>1984-09-16</ReleaseDate> <Default>script</Default> <Script>installscript.qs</Script> </Package> The script in the previous XML file, which is probably the most important part of the creation of an installer, refers to a Qt Installer Script (*.qs file), which is named installerscript.qs and can be used to further customize the package, its target folder, and so on. So, let us create a file with the same name (installscript.qs) inside the meta folder, and use the following code inside it: function Component() { // initializations go here } Component.prototype.isDefault = function() { // select (true) or unselect (false) the component by default return true; } Component.prototype.createOperations = function() { try { // call the base create operations function component.createOperations(); } catch (e) { console.log(e); } } This is the most basic component script, which customizes our package (well, it only performs the default actions) and it can optionally be extended to change the target folder, create shortcuts in the Start menu or desktop (on Windows), and so on. It is a good idea to keep an eye on the Qt Installer Framework documentation and learn about its scripting to be able to create more powerful installers that can put all of the required dependencies of your app in place, and automatically. You can also browse through all of the examples inside the examples folder of the Qt Installer Framework and learn how to deal with different deployment cases. For instance, you can try to create individual packages for Qt and OpenCV dependencies and allow the users to deselect them, in case they already have the Qt runtime libraries on their computer. The last step is to use the binarycreator tool to create our single and standalone installer. Simply run the following command by using a Command Prompt (or Terminal) instance: binarycreator -p packages -c config.xml myinstaller The binarycreator is located inside the Qt Installer Framework bin folder. It requires two parameters that we have already prepared. -p must be followed by our packages folder and -c must be followed by the configuration file (or config.xml) file. After executing this command, you will get myinstaller (on Windows, you can append *.exe to it), which you can execute to install your application. This single file should contain all of the required files needed to run your application, and the rest is taken care of. 
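As an optional convenience (not part of the original text), the deploy folder layout described in the steps above can be scaffolded with a short script before you fill in the data and meta folders and run binarycreator. The package name, the paths, and the truncated config.xml below are placeholders mirroring the example, not a complete configuration.

# Optional convenience sketch: scaffolding the Qt Installer Framework folder layout
# described above (deploy/config.xml plus packages/<package>/data and meta).
# The package name and XML snippet are placeholders mirroring the example text.
from pathlib import Path

deploy = Path("deploy")
package = deploy / "packages" / "com.amin.qtcvapp"

# data/ holds the application files; meta/ holds package.xml and installscript.qs
for sub in ("data", "meta"):
    (package / sub).mkdir(parents=True, exist_ok=True)

(deploy / "config.xml").write_text(
    "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    "<Installer>\n"
    "  <Name>Your application</Name>\n"
    "  <Version>1.0.0</Version>\n"
    "</Installer>\n"
)

print("Created", package, "- now copy your application files into", package / "data")
# After filling data/ and meta/ as described above, build the installer with:
#   binarycreator -p packages -c config.xml myinstaller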
You only need to provide a download link to this file, or provide it on a CD to your users. The following are the dialogs you will face in this default and most basic installer, which contains most of the usual dialogs you would expect when installing an application: If you go to the installation folder, you will notice that it contains a few more files than you put inside the data folder of your package. Those files are required by the installer to handle modifications and uninstall your application. For instance, the users of your application can easily uninstall your application by executing the maintenance tool executable, which would produce another simple and user-friendly dialog to handle the uninstall process: We saw how to deploy a QT + OpenCV applications using static linking, dynamic linking, and QT installer. If you found our post useful, do check out this book Computer Vision with OpenCV 3 and Qt5  to accentuate your OpenCV applications by developing them with Qt.  