Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials - Data

1229 Articles
article-image-mastering-transfer-learning-fine-tuning-bert-and-vision-transformers
Sinan Ozdemir
27 Nov 2024
15 min read
Save for later

Mastering Transfer Learning: Fine-Tuning BERT and Vision Transformers

Sinan Ozdemir
27 Nov 2024
15 min read
DataPro is a weekly, expert-curated newsletter trusted by 120k+ global data professionals. Built by data practitioners, it blends first-hand industry experience with practical insights and peer-driven learning.Make sure to subscribe here so you never miss a key update in the data world. This article is an excerpt from the book, "Principles of Data Science", by Sinan Ozdemir. This book provides an end-to-end framework for cultivating critical thinking about data, performing practical data science, building performant machine learning models, and mitigating bias in AI pipelines. Learn the fundamentals of computational math and stats while exploring modern machine learning and large pre-trained models.IntroductionTransfer learning (TL) has revolutionized the field of deep learning by enabling pre-trained models to adapt their broad, generalized knowledge to specific tasks with minimal labeled data. This article delves into TL with BERT and GPT, demonstrating how to fine-tune these advanced models for text classification and image classification tasks. Through hands-on examples, we illustrate how TL leverages pre-trained architectures to simplify complex problems and achieve high accuracy with limited data.TL with BERT and GPTIn this article, we will take some models that have already learned a lot from their pre-training and fine-tune them to perform a new, related task. This process involves adjusting the model’s parameters to better suit the new task, much like fine-tuning a musical instrument:Figure 12.8 – ITLITL takes a pre-trained model that was generally trained on a semi-supervised (or unsupervised) task and then is given labeled data to learn a specific task.Examples of TLLet’s take a look at some examples of TL with specific pre-trained models.Example – Fine-tuning a pre-trained model for text classificationConsider a simple text classification problem. Suppose we need to analyze customer reviews and determine whether they’re positive or negative. We have a dataset of reviews, but it’s not nearly large enough to train a deep learning (DL) model from scratch. We will fine-tune BERT on a text classification task, allowing the model to adapt its existing knowledge to our specific problem.We will have to move away from the popular scikit-learn library to another popular library called transformers, which was created by HuggingFace (the pre-trained model repository I mentioned earlier) as scikit-learn does not (yet) support Transformer models.Figure 12.9 shows how we will have to take the original BERT model and make some minor modifications to it to perform text classification. Luckily, the transformers package has a built-in class to do this for  us called BertForSequenceClassification:Figure 12.9 – Simplest text classification caseIn many TL cases, we need to architect additional layers. In the simplest text classification case, we add a classification layer on top of a pre-trained BERT model so that it can perform the kind of classification we want.The following code block shows an end-to-end code example of fine-tuning BERT on a text classification task. Note that we are also using a package called datasets, also made by HuggingFace, to load a sentiment classification task from IMDb reviews. Let’s  begin by loading up the dataset:# Import necessary libraries from datasets import load_dataset from transformers import BertTokenizer, BertForSequenceClassification, Trainer, TrainingArguments # Load the dataset imdb_data = load_dataset('imdb', split='train[:1000]') # Loading only 1000 samples for a toy example # Define the tokenizer tokenizer = BertTokenizer.from_pretrained('bert-base-uncased') # Preprocess the data def encode(examples): return tokenizer(examples['text'], truncation=True, padding='max_ length', max_length=512) imdb_data = imdb_data.map(encode, batched=True) # Format the dataset to PyTorch tensors imdb_data.set_format(type='torch', columns=['input_ids', 'attention_ mask', 'label'])With our dataset loaded up, we can run some training code to update our BERT model on our labeled data:# Define the model model = BertForSequenceClassification.from_pretrained( 'bert-base-uncased', num_labels=2) # Define the training arguments training_args = TrainingArguments( output_dir='./results', num_train_epochs=1, per_device_train_batch_size=4 ) # Define the trainer trainer = Trainer(model=model, args=training_args, train_dataset=imdb_ data) # Train the model trainer.train() # Save the model model.save_pretrained('./my_bert_model')Once we have our saved model, we can use the following code to run the model against unseen data:from transformers import pipeline # Define the sentiment analysis pipeline nlp = pipeline('sentiment-analysis', model=model, tokenizer=tokenizer) # Use the pipeline to predict the sentiment of a new review review = "The movie was fantastic! I enjoyed every moment of it." result = nlp(review) # Print the result print(f"label: {result[0]['label']}, with score: {round(result[0] ['score'], 4)}") # "The movie was fantastic! I enjoyed every moment of it." # POSITIVE: 99%Example – TL for image classificationWe could take a pre-trained model such as ResNet or the Vision Transformer (shown in Figure 12.10), initially trained on a large-scale image dataset such as ImageNet. This model has already learned to detect various features from images, from simple shapes to complex objects. We can take advantage of this knowledge, fi ne-tuning  the model on a custom image classification task:Figure 12.10 – The Vision TransformerThe Vision Transformer is like a BERT model for images. It relies on many of the same principles, except instead of text tokens, it uses segments of images as “tokens” instead.The following code block shows an end-to-end code example of fine-tuning the Vision Transformer on an image classification task. The code should look very similar to the BERT code from the previous section because the aim of the transformers library is to standardize training and usage of modern pre-trained models so that no matter what task you are performing, they can offer a relatively unified training and inference experience.Let’s begin by loading up our data and taking a look at the kinds of images we have (seen in Figure 12.11). Note that we are only going to use 1% of the dataset to show that you really don’t need that much data to get a lot out of pre-trained models!# Import necessary libraries from datasets import load_dataset from transformers import ViTImageProcessor, ViTForImageClassification from torch.utils.data import DataLoader import matplotlib.pyplot as plt import torch from torchvision.transforms.functional import to_pil_image # Load the CIFAR10 dataset using Hugging Face datasets # Load only the first 1% of the train and test sets train_dataset = load_dataset("cifar10", split="train[:1%]") test_dataset = load_dataset("cifar10", split="test[:1%]") # Define the feature extractor feature_extractor = ViTImageProcessor.from_pretrained('google/vitbase-patch16-224') # Preprocess the data def transform(examples): # print(examples) # Convert to list of PIL Images examples['pixel_values'] = feature_ extractor(images=examples["img"], return_tensors="pt")["pixel_values"] return examples # Apply the transformations train_dataset = train_dataset.map( transform, batched=True, batch_size=32 ).with_format('pt') test_dataset = test_dataset.map( transform, batched=True, batch_size=32 ).with_format('pt')We can similarly use the model using the following code:Figure 12.11 – A single example from CIFAR10 showing an airplaneNow, we can train our pre-trained Vision Transformer:# Define the model model = ViTForImageClassification.from_pretrained( 'google/vit-base-patch16-224', num_labels=10, ignore_mismatched_sizes=True ) LABELS = ['airplane', 'automobile', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck'] model.config.id2label = LABELS # Define a function for computing metrics def compute_metrics(p): predictions, labels = p preds = np.argmax(predictions, axis=1) return {"accuracy": accuracy_score(labels, preds)} # Define the training arguments training_args = TrainingArguments( output_dir='./results', num_train_epochs=5, per_device_train_batch_size=4, load_best_model_at_end=True, # Save and evaluate at the end of each epoch evaluation_strategy='epoch', save_strategy='epoch' ) # Define the trainer trainer = Trainer( model=model, args=training_args, train_dataset=train_dataset, eval_dataset=test_dataset )Our final model has about 95% accuracy on 1% of the test set. We can now use our new classifier on unseen images, as in this next code block:from PIL import Image from transformers import pipeline # Define an image classification pipeline classification_pipeline = pipeline( 'image-classification', model=model, feature_extractor=feature_extractor ) # Load an image image = Image.open('stock_image_plane.jpg') # Use the pipeline to classify the image result = classification_pipeline(image)Figure 12.12 shows the result of this single classification, and it looks like it did pretty well:Figure 12.12 – Our classifier predicting a stock image of a plane correctlyWith minimal labeled data, we can leverage TL to turn models off the shelf into powerhouse predictive models.ConclusionTransfer learning is a transformative technique in deep learning, empowering developers to harness the power of pre-trained models like BERT and the Vision Transformer for specialized tasks. From sentiment analysis to image classification, these models can be fine-tuned with minimal labeled data, offering impressive performance and adaptability. By using libraries like HuggingFace’s transformers, TL streamlines model training, making state-of-the-art AI accessible and versatile across domains. As demonstrated in this article, TL is not only efficient but also a practical way to achieve powerful predictive capabilities with limited resources.Author BioSinan is an active lecturer focusing on large language models and a former lecturer of data science at the Johns Hopkins University. He is the author of multiple textbooks on data science and machine learning including "Quick Start Guide to LLMs". Sinan is currently the founder of LoopGenius which uses AI to help people and businesses boost their sales and was previously the founder of the acquired Kylie.ai, an enterprise-grade conversational AI platform with RPA capabilities. He holds a Master’s Degree in Pure Mathematics from Johns Hopkins University and is based in San Francisco.
Read more
  • 0
  • 0
  • 38736

article-image-implementing-autoencoders-using-h2o
Amey Varangaonkar
27 Oct 2017
4 min read
Save for later

Implementing Autoencoders using H2O

Amey Varangaonkar
27 Oct 2017
4 min read
[box type="note" align="" class="" width=""]This excerpt is taken from the book Neural Networks with R, Chapter 7, Use Cases of Neural Networks - Advanced Topics, written by Giuseppe Ciaburro and Balaji Venkateswaran. In this article, we see how R is an effective tool for neural network modelling, by implementing autoencoders using the popular H2O library.[/box] An autoencoder is an ANN used for learning without efficient coding control. The purpose of an autoencoder is to learn coding for a set of data, typically to reduce dimensionality. Architecturally, the simplest form of autoencoder is an advanced and non-recurring neural network very similar to the MLP, with an input level, an output layer, and one or more hidden layers that connect them, but with the layer outputs having the same number of input level nodes for rebuilding their inputs. In this section, we present an example of implementing Autoencoders using H2O on a movie dataset. The dataset used in this example is a set of movies and genre taken from https://grouplens.org/datasets/movielens We use the movies.csv file, which has three columns: movieId title genres There are 164,979 rows of data for clustering. We will use h2o.deeplearning to have the autoencoder parameter fix the clusters. The objective of the exercise is to cluster the movies based on genre, which can then be used to recommend similar movies or same genre movies to the users. The program uses h20.deeplearning, with the autoencoder parameter set to T: library("h2o") setwd ("c://R") #Load the training dataset of movies movies=read.csv ( "movies.csv", header=TRUE) head(movies) model=h2o.deeplearning(2:3, training_frame=as.h2o(movies), hidden=c(2), autoencoder = T, activation="Tanh") summary(model) features=h2o.deepfeatures(model, as.h2o(movies), layer=1) d=as.matrix(features[1:10,]) labels=as.vector(movies[1:10,2]) plot(d,pch=17) text(d,labels,pos=3) Now, let's go through the code: library("h2o") setwd ("c://R") These commands load the library in the R environment and set the working directory where we will have inserted the dataset for the next reading. Then we load the data: movies=read.csv( "movies.csv", header=TRUE) To visualize the type of data contained in the dataset, we analyze a preview of one of these variables: head(movies) The following figure shows the first 20 rows of the movie dataset: Now we build and train model: model=h2o.deeplearning(2:3, training_frame=as.h2o(movies), hidden=c(2), autoencoder = T, activation="Tanh") Let's analyze some of the information contained in model: summary(model) This is an extract from the results of the summary() function: In the next command, we use the h2o.deepfeatures() function to extract the nonlinear feature from an h2o dataset using an H2O deep learning model: features=h2o.deepfeatures(model, as.h2o(movies), layer=1) In the following code, the first six rows of the features extracted from the model are shown: > features DF.L1.C1 DF.L1.C2 1 0.2569208 -0.2837829 2 0.3437048 -0.2670669 3 0.2969089 -0.4235294 4 0.3214868 -0.3093819 5 0.5586608 0.5829145 6 0.2479671 -0.2757966 [9125 rows x 2 columns] Finally, we plot a diagram where we want to see how the model grouped the movies through the results obtained from the analysis: d=as.matrix(features[1:10,]) labels=as.vector(movies[1:10,2]) plot(d,pch=17) text(d,labels,pos=3) The plot of the movies, once clustering is done, is shown next. We have plotted only 100 movie titles due to space issues. We can see some movies being closely placed, meaning they are of the same genre. The titles are clustered based on distances between them, based on genre. Given a large number of titles, the movie names cannot be distinguished, but what appears to be clear is that the model has grouped the movies into three distinct groups. If you found this excerpt useful, make sure you check out the book Neural Networks with R, containing an interesting coverage of many such useful and insightful topics.
Read more
  • 0
  • 1
  • 38655

article-image-getting-to-know-tensorflow
Kartikey Pandey
29 Nov 2017
7 min read
Save for later

Getting to know TensorFlow

Kartikey Pandey
29 Nov 2017
7 min read
[box type="note" align="" class="" width=""] The following book excerpt is from the title Machine Learning Algorithms by Guiseppe Bonaccorso. The book describes important Machine Learning algorithms commonly used in the field of data science. These algorithms can be used for supervised as well as unsupervised learning, reinforcement learning, and semi-supervised learning. Few famous ones covered in the book are Linear regression, Logistic Regression, SVM, Naive Bayes, K-Means, Random Forest, TensorFlow, and Feature engineering. [/box] Here, in the article, we look at understanding most important Deep learning library-Tensorflow with contextual examples. Brief Introduction to TensorFlow TensorFlow is a computational framework created by Google and has become one of the most diffused deep-learning toolkits. It can work with both CPUs and GPUs and already implements most of the operations and structures required to build and train a complex model. TensorFlow can be installed as a Python package on Linux, Mac, and Windows (with or without GPU support); however, we suggest you follow the instructions provided on the website to avoid common mistakes. The main concept behind TensorFlow is the computational graph or a set of subsequent operations that transform an input batch into the desired output. In the following figure, there's a schematic representation of a graph: Starting from the bottom, we have two input nodes (a and b), a transpose operation (that works on b), a matrix multiplication and a mean reduction. The init block is a separate operation, which is formally part of the graph, but it's not directly connected to any other node; therefore it's autonomous (indeed, it's a global initializer). As this one is only a brief introduction, it's useful to list all of the most important strategic elements needed to work with TensorFlow so as to be able to build a few simple examples that can show the enormous potential of this framework: Graph: This represents the computational structure that connects a generic input batch with the output tensors through a directed network made of operations. It's defined as a tf.Graph() instance and normally used with a Python context Manager. Placeholder: This is a reference to an external variable, which must be explicitly supplied when it's requested for the output of an operation that uses it directly or indirectly. For example, a placeholder can represent a variable x, which is first transformed into its squared value and then summed to a constant value. The output is thenx2+c, which is materialized by passing a concrete value for x. It's defined as a tf.placeholder() instance. Variable: An internal variable used to store values which are updated by the algorithm. For example, a variable can be a vector containing the weights of a logistic regression. It's normally initialized before a training process and automatically modified by the built-in optimizers. It's defined as a tf.Variable() instance. A variable can also be used to store elements which must not be considered during training processes; in this case, it must be declared with the parameter trainable=False Constant: A constant value defined as a tf.constant() instance. Operation: A mathematical operation that can work with placeholders, variables, and constants. For example, the multiplication of two matrices is an operation defined a tf.constant(). Among all operations, gradient calculation is one of the most important. TensorFlow allows determining the gradients starting from a determined point in the computational graph, until the origin or another point that must be logically before it. We're going to see an example of this Operation. Session: This is a sort of wrapper-interface between TensorFlow and our working environment (for example, Python or C++). When the evaluation of a graph is needed, this macro-operation will be managed by a session, which must be fed with all placeholder values and will produce the required outputs using the requested devices. For our purposes, it's not necessary to go deeper into this concept; however, I invite the reader to retrieve further information from the website or from one of the resources listed at the end of this chapter. It's declared as an instance of tf.Session() or, as we're going to do, an instance of tf.InteractiveSession(). This type of session is particularly useful when working with notebooks or shell commands, because it places itself automatically as the default one. Device: A physical computational device, such as a CPU or a GPU. It's declared explicitly through an instance of the class tf.device()and used with a context manager. When the architecture contains more computational devices, it's possible to split the jobs so as to parallelize many operations. If no device is specified, TensorFlow will use the default one (which is the main CPU or a suitable GPU if all the necessary components are installed). Let’s now analyze this with a simple example here about computing gradients: Computing gradients The option to compute the gradients of all output tensors with respect to any connected input or node is one of the most interesting features of TensorFlow because it allows us to create learning algorithms without worrying about the complexity of all transformations. In this example, we first define a linear dataset representing the function f(x) = x in the range (-100, 100): import numpy as np >>> nb_points = 100 >>> X = np.linspace(-nb_points, nb_points, 200, dtype=np.float32) The corresponding plot is shown in the following figure: Now we want to use TensorFlow to compute: The first step is defining a graph: import tensorflow as tf >>> graph = tf.Graph() Within the context of this graph, we can define our input placeholder and other operations: >>> with graph.as_default(): >>> Xt = tf.placeholder(tf.float32, shape=(None, 1), name='x') >>> Y = tf.pow(Xt, 3.0, name='x_3') >>> Yd = tf.gradients(Y, Xt, name='dx') >>> Yd2 = tf.gradients(Yd, Xt, name='d2x') A placeholder is generally defined with a type (first parameter), a shape, and an optional name. We've decided to use a tf.float32 type because this is the only type also supported by GPUs. Selecting shape=(None, 1) means that it's possible to use any bidimensional vectors with the second dimension equal to 1. The first operation computes the third power if Xt is working on all elements. The second operation computes all the gradients of Y with respect to the input placeholder Xt. The last operation will repeat the gradient computation, but in this case, it uses Yd, which is the output of the first gradient operation. We can now pass some concrete data to see the results. The first thing to do is create a session connected to this graph: >>> session = tf.InteractiveSession(graph=graph) By using this session, we ask any computation using the method run(). All the input parameters must be supplied through a feed-dictionary, where the key is the placeholder, while the value is the actual array: >>> X2, dX, d2X = session.run([Y, Yd, Yd2], feed_dict={Xt: X.reshape((nb_points*2, 1))}) We needed to reshape our array to be compliant with the placeholder. The first argument of run() is a list of tensors that we want to be computed. In this case, we need all operation outputs. The plot of each of them is shown in the following figure: As expected, they represent respectively: x3, 3x2, and 6x. Further in the book, we look at a slightly more complex example of Logistic Regression to implement a logistic regression algorithm. Refer to Chapter 14, Brief Introduction to Deep Learning and Tensorflow of Machine Learning Algorithms to read the complete chapter.    
Read more
  • 0
  • 0
  • 38620

article-image-how-to-install-keras-on-docker-and-cloud-ml
Amey Varangaonkar
20 Dec 2017
3 min read
Save for later

How to Install Keras on Docker and Cloud ML

Amey Varangaonkar
20 Dec 2017
3 min read
[box type="note" align="" class="" width=""]The following extract is taken from the book Deep Learning with Keras, written by Antonio Gulli and Sujit Pal. It contains useful techniques to train effective deep learning models using the highly popular Keras library.[/box] Keras is a deep learning library which can be used on the enterprise platform, by deploying it on a container. In this article, we see how to install Keras on Docker and Google’s Cloud ML. Installing Keras on Docker One of the easiest ways to get started with TensorFlow and Keras is running in a Docker container. A convenient solution is to use a predefined Docker image for deep learning created by the community that contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, and so on). Refer to the GitHub repository at https://github.com/saiprashanths/dl-docker for the code files. Assuming that you already have Docker up and running (for more information, refer to https://www.docker.com/products/overview), installing it is pretty simple and is shown as follows: The following screenshot, says something like, after getting the image from Git, we build the Docker image: In this following screenshot, we see how to run it: From within the container, it is possible to activate support for Jupyter Notebooks (for more information, refer to http://jupyter.org/): Access it directly from the host machine on port: It is also possible to access TensorBoard (for more information, refer to https://www.tensorflow.org/how_tos/summaries_and_tensorboard/) with the help of the command in the screenshot that follows, which is discussed in the next section: After running the preceding command, you will be redirected to the following page: Installing Keras on Google Cloud ML Installing Keras on Google Cloud is very simple. First, we can install Google Cloud (for the downloadable file, refer to https://cloud.google.com/sdk/), a command-line interface for Google Cloud Platform; then we can use CloudML, a managed service that enables us to easily build machine, learning models with TensorFlow. Before using Keras, let's use Google Cloud with TensorFlow to train an MNIST example available on GitHub. The code is local and training happens in the cloud: In the following screenshot, you can see how to run a training session: We can use TensorBoard to show how cross-entropy decreases across iterations: In the next screenshot, we see the graph of cross-entropy: Now, if we want to use Keras on the top of TensorFlow, we simply download the Keras source from PyPI (for the downloadable file, refer to https://pypi.python.org/pypi/Keras/1.2.0 or later versions) and then directly use Keras as a CloudML package solution, as in the following example: Here, trainer.task2.py is an example script: from keras.applications.vgg16 import VGG16 from keras.models import Model from keras.preprocessing import image from keras.applications.vgg16 import preprocess_input import numpy as np # pre-built and pre-trained deep learning VGG16 model base_model = VGG16(weights='imagenet', include_top=True) for i, layer in enumerate(base_model.layers): print (i, layer.name, layer.output_shape) Thus we saw, how fairly easy it is to set up and run Keras on a Docker container and Cloud ML. If this article interested you, make sure to check out our book Deep Learning with Keras, where you can learn to install Keras on other popular platforms such as Amazon Web Services and Microsoft Azure.  
Read more
  • 0
  • 0
  • 38206

article-image-introducing-generative-adversarial-networks
Amey Varangaonkar
11 Dec 2017
6 min read
Save for later

Implementing a simple Generative Adversarial Network (GANs)

Amey Varangaonkar
11 Dec 2017
6 min read
[box type="note" align="" class="" width=""]The following excerpt is taken from Chapter 2 - Learning Features with Unsupervised Generative Networks of the book Deep Learning with Theano, written by Christopher Bourez. This book talks about modeling and training effective deep learning models with Theano, a popular Python-based deep learning library. [/box] In this article, we introduce you to the concept of Generative Adversarial Networks, a popular class of Artificial Intelligence algorithms used in unsupervised machine learning. Code files for this particular chapter are available for download towards the end of the post. Generative adversarial networks are composed of two models that are alternatively trained to compete with each other. The generator network G is optimized to reproduce the true data distribution, by generating data that is difficult for the discriminator D to differentiate from real data. Meanwhile, the second network D is optimized to distinguish real data and synthetic data generated by G. Overall, the training procedure is similar to a two-player min-max game with the following objective function: Here, x is real data sampled from real data distribution, and z the noise vector of the generative model. In some ways, the discriminator and the generator can be seen as the police and the thief: to be sure the training works correctly, the police is trained twice as much as the thief. Let's illustrate GANs with the case of images as data. In particular, let's again take our example from Chapter 2, Classifying Handwritten Digits with a Feedforward Network about MNIST digits, and consider training a generative adversarial network, to generate images, conditionally on the digit we want. The GAN method consists of training the generative model using a second model, the discriminative network, to discriminate input data between real and fake. In this case, we can simply reuse our MNIST image classification model as discriminator, with two classes, real or fake, for the prediction output, and also condition it on the label of the digit that is supposed to be generated. To condition the net on the label, the digit label is concatenated with the inputs: def conv_cond_concat(x, y): return T.concatenate([x, y*T.ones((x.shape[0], y.shape[1], x.shape[2], x.shape[3]))], axis=1) def discrim(X, Y, w, w2, w3, wy): yb = Y.dimshuffle(0, 1, 'x', 'x') X = conv_cond_concat(X, yb) h = T.nnet.relu(dnn_conv(X, w, subsample=(2, 2), border_mode=(2, 2)), alpha=0.2 ) h = conv_cond_concat(h, yb) h2 = T.nnet.relu(batchnorm(dnn_conv(h, w2, subsample=(2, 2), border_mode=(2, 2))), alpha=0.2) h2 = T.flatten(h2, 2) h2 = T.concatenate([h2, Y], axis=1) h3 = T.nnet.relu(batchnorm(T.dot(h2, w3))) h3 = T.concatenate([h3, Y], axis=1) y = T.nnet.sigmoid(T.dot(h3, wy)) return y Note the use of two leaky rectified linear units, with a leak of 0.2, as activation for the first two convolutions. To generate an image given noise and label, the generator network consists of a stack of deconvolutions, using an input noise vector z that consists of 100 real numbers ranging from 0 to 1: To create a deconvolution in Theano, a dummy convolutional forward pass is created, which gradient is used as deconvolution: def deconv(X, w, subsample=(1, 1), border_mode=(0, 0), conv_ mode='conv'): img = gpu_contiguous(T.cast(X, 'float32')) kerns = gpu_contiguous(T.cast(w, 'float32')) desc = GpuDnnConvDesc(border_mode=border_mode, subsample=subsample, conv_mode=conv_mode)(gpu_alloc_empty(img.shape[0], kerns.shape[1], img.shape[2]*subsample[0], img.shape[3]*subsample[1]).shape, kerns. shape) out = gpu_alloc_empty(img.shape[0], kerns.shape[1], img. shape[2]*subsample[0], img.shape[3]*subsample[1]) d_img = GpuDnnConvGradI()(kerns, img, out, desc) return d_img def gen(Z, Y, w, w2, w3, wx): yb = Y.dimshuffle(0, 1, 'x', 'x') Z = T.concatenate([Z, Y], axis=1) h = T.nnet.relu(batchnorm(T.dot(Z, w))) h = T.concatenate([h, Y], axis=1) h2 = T.nnet.relu(batchnorm(T.dot(h, w2))) h2 = h2.reshape((h2.shape[0], ngf*2, 7, 7)) h2 = conv_cond_concat(h2, yb) h3 = T.nnet.relu(batchnorm(deconv(h2, w3, subsample=(2, 2), border_mode=(2, 2)))) h3 = conv_cond_concat(h3, yb) x = T.nnet.sigmoid(deconv(h3, wx, subsample=(2, 2), border_ mode=(2, 2))) return x Real data is given by the tuple (X,Y), while generated data is built from noise and label (Z,Y): X = T.tensor4() Z = T.matrix() Y = T.matrix() gX = gen(Z, Y, *gen_params) p_real = discrim(X, Y, *discrim_params) p_gen = discrim(gX, Y, *discrim_params) Generator and discriminator models compete during adversarial learning: The discriminator is trained to label real data as real (1) and label generated data as generated (0), hence minimizing the following cost function: d_cost = T.nnet.binary_crossentropy(p_real, T.ones(p_real.shape)).mean() + T.nnet.binary_crossentropy(p_gen, T.zeros(p_gen.shape)). mean() The generator is trained to deceive the discriminator as much as possible. The training signal for the generator is provided by the discriminator network (p_gen) to the generator: g_cost = T.nnet.binary_crossentropy(p_gen,T.ones(p_gen.shape)). mean() The same as usual follows. Cost with respect to the parameters for each model is computed and training optimizes the weights of each model alternatively, with two times more the discriminator. In the case of GANs, competition between discriminator and generator does not lead to decreases in each loss. From the first epoch: To the 45th epoch: Generated examples look closer to real ones: Generative models, and especially Generative Adversarial Networks are currently the trending areas of Deep Learning. It has also found its way in a few practical applications as well.  For example, a generative model can successfully be trained to generate the next most likely video frames by learning the features of the previous frames. Another popular example where GANs can be used is, search engines that predict the next likely word before it is even entered by the user, by studying the sequence of the previously entered words. If you found this excerpt useful, do check out more comprehensive coverage of popular deep learning topics in our book  Deep Learning with Theano. [box type="download" align="" class="" width=""] Download files [/box]  
Read more
  • 0
  • 0
  • 38189

article-image-4-important-business-intelligence-considerations-for-the-rest-of-2019
Richard Gall
16 Sep 2019
7 min read
Save for later

4 important business intelligence considerations for the rest of 2019

Richard Gall
16 Sep 2019
7 min read
Business intelligence occupies a strange position, often overshadowed by fields like data science and machine learning. But it remains a critical aspect of modern business - indeed, the less attention the world appears to pay to it, the more it is becoming embedded in modern businesses. Where analytics and dashboards once felt like a shiny and exciting interruption in our professional lives, today it is merely the norm. But with business intelligence almost baked into the day to day routines and activities of many individuals, teams, and organizations, what does this actually mean in practice. For as much as we’d like to think that we’re all data-driven now, the reality is that there’s much we can do to use data more effectively. Research confirms that data-driven initiatives often fail - so with that in mind here’s what’s important when it comes to business intelligence in 2019. Popular business intelligence eBooks and videos Oracle Business Intelligence Enterprise Edition 12c - Second Edition Microsoft Power BI Quick Start Guide Implementing Business Intelligence with SQL Server 2019 [Video] Hands-On Business Intelligence with Qlik Sense Hands-On Dashboard Development with QlikView Getting the balance between self-service business intelligence and centralization Self-service business intelligence is one of the biggest trends to emerge in the last two years. In practice, this means that a diverse range of stakeholders (marketers and product managers for example) have access to analytics tools. They’re no longer purely the preserve of data scientists and analysts. Self-service BI makes a lot of sense in the context of today’s data rich and data-driven environment. The best way to empower team members to actually use data is to remove any bottlenecks (like a centralized data team) and allow them to go directly to the data and tools they need to make decisions. In essence, self-service business intelligence solutions are a step towards the democratization of data. However, while the notion of democratizing data sounds like a noble cause, the reality is a little more complex. There are a number of different issues that make self-service BI a challenging thing to get right. One of the biggest pain points, for example, are the skill gaps of teams using these tools. Although self-service BI should make using data easy for team members, even the most user-friendly dashboards need a level of data literacy to be useful. Read next: What are the limits of self-service BI? Many analytics products are being developed with this problem in mind. But it’s still hard to get around - you don’t, after all, want to sacrifice the richness of data for simplicity and accessibility. Another problem is the messiness of data itself - and this ultimately points to one of the paradoxes of self-service BI. You need strong alignment - centralization even - if you’re to ensure true democratization. The answer to all this isn’t to get tied up in decentralization or centralization. Instead, what’s important is striking a balance between the two. Decentralization needs centralization - there needs to be strong governance and clarity over what data exists, how it’s used, how it’s accessed - someone needs to be accountable for that for decentralized, self-service BI to actually work. Read next: How Qlik Sense is driving self-service Business Intelligence Self-service business intelligence: recommended viewing Power BI Masterclass - Beginners to Advanced [Video] Data storytelling that makes an impact Data storytelling is a phrase that’s used too much without real consideration as to what it means or how it can be done. Indeed, all too often it’s used to refer to stylish graphs and visualizations. And yes, stylish graphs and data visualizations are part of data storytelling, but you can’t just expect some nice graphics to communicate in depth data insights to your colleagues and senior management. To do data storytelling well, you need to establish a clear sense of objectives and goals. By that I’m not referring only to your goals, but also those of the people around you. It goes without saying that data and insight needs context, but what that context should be, exactly, is often the hard part - objectives and aims are perhaps the straightforward way of establishing that context and ensuring your insights are able to establish the scope of a problem and propose a way forward. Data storytelling can only really make an impact if you are able to strike a balance between centralization and self-service. Stakeholders that use self-service need confidence that everything they need is both available and accurate - this can only really be ensured by a centralized team of data scientists, architects, and analysts. Data storytelling: recommend viewing Data Storytelling with Qlik Sense [Video] Data Storytelling with Power BI [Video] The impact of cloud It’s impossible to properly appreciate the extent to which cloud is changing the data landscape. Not only is it easier than ever to store and process data, it’s also easy to do different things with it. This means that it’s now possible to do machine learning, or artificial intelligence projects with relative ease (the word relative being important, of course). For business intelligence, this means there needs to be a clear strategy that joins together every piece of the puzzle, from data collection to analysis. This means there needs to be buy-in and input from stakeholders before a solution is purchased - or built - and then the solution needs to be developed with every individual use case properly understood and supported. Indeed, this requires a combination of business acumen, soft skills, and technical expertise. A large amount of this will rest on the shoulders of an organization’s technical leadership team, but it’s also worth pointing out that those in other departments still have a part to play. If stakeholders are unable to present a clear vision of what their needs and goals are it’s highly likely that the advantages of cloud will pass them by when it comes to business intelligence. Cloud and business intelligence: recommended viewing Going beyond Dashboards with IBM Cognos Analytics [Video] Business intelligence ethics Ethics has become a huge issue for organizations over the last couple of years. With the Cambridge Analytica scandal placing the spotlight on how companies use customer data, and GDPR forcing organizations to take a new approach to (European) user data, it’s undoubtedly the case that ethical considerations have added a new dimension to business intelligence. But what does this actually mean in practice? Ethics manifests itself in numerous ways in business intelligence. Perhaps the most obvious is data collection - do you have the right to use someone’s data in a certain way? Sometimes the law will make it clear. But other times it will require individuals to exercise judgment and be sensitive to the issues that could arise. But there are other ways in which individuals and organizations need to think about ethics. Being data-driven is great, especially if you can approach insight in a way that is actionable and proactive. But at the same time it’s vital that business intelligence isn’t just seen as a replacement for human intelligence. Indeed, this is true not just in an ethical sense, but also in terms of sound strategic thinking. Business intelligence without human insight and judgment is really just the opposite of intelligence. Conclusion: business intelligence needs organizational alignment and buy-in There are many issues that have been slowly emerging in the business intelligence world for the last half a decade. This might make things feel confusing, but in actual fact it underlines the very nature of the challenges organizations, leadership teams, and engineers face when it comes to business intelligence. Essentially, doing business intelligence well requires you - and those around you - to tie all these different elements. It's certainly not straightforward, but with focus and a clarity of thought, it's possible to build a really effective BI program that can fulfil organizational needs well into the future.
Read more
  • 0
  • 0
  • 37845
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at $19.99/month. Cancel anytime
article-image-implementing-decision-trees
Packt
22 Sep 2015
4 min read
Save for later

Implementing Decision Trees

Packt
22 Sep 2015
4 min read
 In this article by the author, Sunila Gollapudi, of this book, Practical Machine Learning, we will outline a business problem that can be addressed by building a decision tree-based model, and see how it can be implemented in Apache Mahout, R, Julia, Apache Spark, and Python. This can happen many, many times. So, building a website or an app will take a bit longer than it used to. (For more resources related to this topic, see here.) Implementing decision trees Here, we will explore implementing decision trees using various frameworks and tools. The R example We will use the rpart and ctree packages in R to build decision tree-based models: Import the packages for data import and decision tree libraries as shown here: Start data manipulation: Create a categorical variable on Sales and append to the existing dataset as shown here: Using random functions, split data into training and testing datasets; Fit the tree model with training data and check how the model is working with testing data, measure the error: Prune the tree; Plotting the pruned tree will look like the following: The Spark example Java-based example using MLib is shown here: import java.util.HashMap; import scala.Tuple2; import org.apache.spark.api.java.JavaPairRDD; import org.apache.spark.api.java.JavaRDD; import org.apache.spark.api.java.JavaSparkContext; import org.apache.spark.api.java.function.Function; import org.apache.spark.api.java.function.PairFunction; import org.apache.spark.mllib.regression.LabeledPoint; import org.apache.spark.mllib.tree.DecisionTree; import org.apache.spark.mllib.tree.model.DecisionTreeModel; import org.apache.spark.mllib.util.MLUtils; import org.apache.spark.SparkConf; SparkConf sparkConf = new SparkConf().setAppName("JavaDecisionTree"); JavaSparkContext sc = new JavaSparkContext(sparkConf); // Load and parse the data file. String datapath = "data/mllib/sales.txt"; JavaRDD<LabeledPoint> data = MLUtils.loadLibSVMFile(sc.sc(), datapath).toJavaRDD(); // Split the data into training and test sets (30% held out for testing) JavaRDD<LabeledPoint>[] splits = data.randomSplit(new double[]{0.7, 0.3}); JavaRDD<LabeledPoint> trainingData = splits[0]; JavaRDD<LabeledPoint> testData = splits[1]; // Set parameters. // Empty categoricalFeaturesInfo indicates all features are continuous. Integer numClasses = 2; Map<Integer, Integer> categoricalFeaturesInfo = new HashMap<Integer, Integer>(); String impurity = "gini"; Integer maxDepth = 5; Integer maxBins = 32; // Train a DecisionTree model for classification. final DecisionTreeModel model = DecisionTree.trainClassifier(trainingData, numClasses, categoricalFeaturesInfo, impurity, maxDepth, maxBins); // Evaluate model on test instances and compute test error JavaPairRDD<Double, Double> predictionAndLabel = testData.mapToPair(new PairFunction<LabeledPoint, Double, Double>() { @Override public Tuple2<Double, Double> call(LabeledPoint p) { return new Tuple2<Double, Double>(model.predict(p.features()), p.label()); } }); Double testErr = 1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() { @Override public Boolean call(Tuple2<Double, Double> pl) { return !pl._1().equals(pl._2()); } }).count() / testData.count(); System.out.println("Test Error: " + testErr); System.out.println("Learned classification tree model:n" + model.toDebugString()); The Julia example We will use the DecisionTree package in Julia as shown here; julia> Pkg.add("DecisionTree")julia> using DecisionTree We will use the RDatasets package to load the dataset for the example in context; julia> Pkg.add("RDatasets"); using RDatasets julia> sales = data("datasets", "sales"); julia> features = array(sales[:, 1:4]); # use matrix() for Julia v0.2 julia> labels = array(sales[:, 5]); # use vector() for Julia v0.2 julia> stump = build_stump(labels, features); julia> print_tree(stump) Feature 3, Threshold 3.0 L-> price : 50/50 R-> shelvelock : 50/100 Pruning the tree julia> length(tree) 11 julia> pruned = prune_tree(tree, 0.9); julia> length(pruned) 9 Summary In this article, we implemented decision trees using R, Spark, and Julia. Resources for Article: Further resources on this subject: An overview of common machine learning tasks[article] How to do Machine Learning with Python[article] Modeling complex functions with artificial neural networks [article]
Read more
  • 0
  • 0
  • 37536

article-image-getting-know-sql-server-options-disaster-recovery
Sunith Shetty
27 Feb 2018
10 min read
Save for later

Getting to know SQL Server options for disaster recovery

Sunith Shetty
27 Feb 2018
10 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book written by Marek Chmel and Vladimír Mužný titled SQL Server 2017 Administrator's Guide. This book will help you learn to implement and administer successful database solutions with SQL Server 2017.[/box] Today, we will explore the disaster recovery basics to understand the common terms in high availability and disaster recovery. We will then discuss SQL Server offering for HA/DR options. Disaster recovery basics Disaster recovery (DR) is a set of tools, policies, and procedures, which help us during the recovery of your systems after a disastrous event. Disaster recovery is just a subset of a more complex discipline called business continuity planning, where more variables come in place and you expect more sophisticated plans on how to recover the business operations. With careful planning, you can minimize the effects of the disaster, because you have to keep in mind that it's nearly impossible to completely avoid disasters. The main goal of a disaster recovery plan is to minimize the downtime of our service and to minimize the data loss. To measure these objectives, we use special metrics: Recovery Point and Time Objectives. Recovery Time Objective (RTO) is the maximum time that you can use to recover the system. This time includes your efforts to fix the problem without starting the disaster recovery procedures, the recovery itself, proper testing after the disaster recovery, and the communication to the stakeholders. Once a disaster strikes, clocks are started to measure the disaster recovery actions and the Recovery Time Actual (RTA) metric is calculated. If you manage to recover the system within the Recovery Time Objective, which means that RTA < RTO, then you have met the metrics with a proper combination of the plan and your ability to restore the system. Recovery Point Objective (RPO) is the maximum tolerable period for acceptable data loss. This defines how much data can be lost due to disaster. The Recovery Point Objective has an impact on your implementation of backups, because you plan for a recovery strategy that has specific requirements for your backups. If you can avoid to lose one day of work, you can properly plan your backup types and the frequency of the backups that you need to take. The following image is an illustration of the very concepts that we discussed in the preceding paragraph: When we talk about system availability, we usually use a percentage of the availability time. This availability is a calculated uptime in a given year or month (any date metric that you need) and is usually compared to a following table of "9s". Availability also expresses a tolerable downtime in a given time frame so that the system still meets the availability metric. In the following table, we'll see some basic availability options with tolerable downtime a year and a day: Availability % Downtime a year Downtime a day 90% 36.5 days 2.4 hours 98% 7.3 days 28.8 minutes 99% 3.65 days 14.4 minutes 99.9% 8.76 hours 1.44 minutes 99.99% 52.56 minutes 8.64 seconds 99.999% 5.26 minutes less than 1 second This tolerable downtime consists of the unplanned downtime and can be caused by many factors: Natural Disasters Hardware failures Human errors (accidental deletes, code breakdowns, and so on) Security breaches Malware For these, we can have a mitigation plan in place that will help us reduce the downtime to a tolerable range, and we usually deploy a combination of high availability solutions and disaster recovery solutions so that we can quickly restore the operations. On the other hand, there's a reasonable set of events that require a downtime on your service due to the maintenance and regular operations, which does not affect the availability on your system. These can include the following: New releases of the software Operating system patching SQL Server patching Database maintenance and upgrades Our goal is to have the database online as much as possible, but there will be times when the database will be offline and, from the perspective of the management and operation, we're talking about several keywords such as uptime, downtime, time to repair, and time between failures, as you can see in the following image: It's really critical not only to have a plan for disaster recovery, but also to practice the disaster recovery itself. Many companies follow the procedure of proper disaster recovery plan testing with different types of exercise where each and every aspect of the disaster recovery is carefully evaluated by teams who are familiar with the tools and procedures for a real disaster event. This exercise may have different scope and frequency, as listed in the following points: Tabletop exercises usually involve only a small number of people and focus on a specific aspect of the DR plan. This would be a DBA team drill to recover a single SQL Server or a small set of servers with simulated outage. Medium-sized exercises will involve several teams to practice team communication and interaction. Complex exercises usually simulate larger events such as data center loss, where a new virtual data center is built and all new servers and services are provisioned by the involved teams. Such exercises should be run on a periodic basis so that all the teams and team personnel are up to speed with the disaster recovery plans. SQL Server options for high availability and disaster recovery SQL Server has many features that you can put in place to implement a HA/DR solution that will fit your needs. These features include the following: Always On Failover Cluster Always On Availability Groups Database mirroring Log shipping Replication In many cases, you will combine more of the features together, as your high availability and disaster recovery needs will overlap. HA/DR does not have to be limited to just one single feature. In complex scenarios, you'll plan for a primary high availability solution and secondary high availability solution that will work as your disaster recovery solution at the same time. Always On Failover Cluster An Always On Failover Cluster (FCI) is an instance-level protection mechanism, which is based on top of a Windows Failover Cluster Feature (WFCS). SQL Server instance will be installed across multiple WFCS nodes, where it will appear in the network as a single computer. All the resources that belong to one SQL Server instance (disk, network, names) can be owned by one node of the cluster and, during any planned or unplanned event like a failure of any server component, these can be moved to another node in the cluster to preserve operations and minimize downtime, as shown in the following image: Always On Availability Groups Always On Availability Groups were introduced with SQL Server 2012 to bring a database-level protection to the SQL Server. As with the Always On Failover Cluster, Availability Groups utilize the Windows Failover Cluster feature, but in this case, single SQL Server is not installed as a clustered instance but runs independently on several nodes. These nodes can be configured as Always On Availability Group nodes to host a database, which will be synchronized among the hosts. The replica can be either synchronous or asynchronous, so Always On Availability Groups are a good fit either as a solution for one data center or even distant data centers to keep your data safe. With new SQL Server versions, Always On Availability Groups were enhanced and provide many features for database high availability and disaster recovery scenarios. You can refer to the following image for a better understanding: Database mirroring Database mirroring is an older HA/DR feature available in SQL Server, which provides database-level protection. Mirroring allows synchronizing the databases between two servers, where you can include one more server as a witness server as a failover quorum. Unlike the previous two features, database mirroring does not require any special setup such as Failover Cluster and the configuration can be achieved via SSMS using a wizard available via database properties. Once a transaction occurs on the primary node, it's copied to the second node to the mirrored database. With proper configuration, database mirroring can provide failover options for high availability with automatic client redirection. Database mirroring is not preferred solution for HA/DR, since it's marked as a deprecated feature from SQL Server 2012 and is replaced by Basic Availability Groups on current versions. Log shipping Log shipping configuration, as the name suggests, is a mechanism to keep a database in sync by copying the logs to the remote server. Log shipping, unlike mirroring, is not copying each single transaction, but copies the transactions in batches via transaction log backup on the primary node and log restore on the secondary node. Unlike all previously mentioned features, log shipping does not provide an automatic failover option, so it's considered more as a disaster recovery option than a high availability one. Log shipping operates on regular intervals where three jobs have to run: Backup job to backup the transaction log on the primary system Copy job to copy the backups to the secondary system Restore job to restore the transaction log backup on the secondary system Log shipping supports multiple standby databases, which is quite an advantage compared to database mirroring. One more advantage is the standby configuration for log shipping, which allows read-only access to the secondary database. This is mainly used for many reporting scenarios, where the reporting applications use read-only access and such configuration allows performance offload to the secondary system. Replication Replication is a feature for data movement from one server to another that allows many different scenarios and topologies. Replication uses a model of publisher/subscriber, where the Publisher is the server offering the content via a replication article and subscribers are getting the data. The configuration is more complex compared to mirroring and log shipping features, but allows you much more variety in the configuration for security, performance, and topology. Replication has many benefits and a few of them are as follows: Works on the object level (whereas other features work on database or instance level) Allows merge replication, where more servers synchronize data between each other Allows bi-directional synchronization of data Allows other than SQL Server partners (Oracle, for example) There are several different replication types that can be used with SQL Server, and you can choose them based on the needs for HA/DR options and the data availability requirements on the secondary servers. These options include the following: Snapshot replication Transactional replication Peer-to-peer replication Merge replication We introduced the disaster recovery discipline with the whole big picture of business continuity on SQL Server. Disaster recovery is not only about having backups, but more about the ability to bring the service back to operation after severe failures. We have seen several options that can be used to implement part of disaster recovery on SQL Server--log shipping, replication, and mirroring. To know more about how to design and use an optimal database management strategy, do checkout the book SQL Server 2017 Administrator's Guide.  
Read more
  • 0
  • 0
  • 37420

article-image-decrypting-bitcoin-2017-journey
Ashwin Nair
28 Dec 2017
7 min read
Save for later

There and back again: Decrypting Bitcoin`s 2017 journey from $1000 to $20000

Ashwin Nair
28 Dec 2017
7 min read
Lately, Bitcoin has emerged as the most popular topic of discussion amongst colleagues, friends and family. The conversations more or less have a similar theme - filled with inane theories around what Bitcoin is and and the growing fear of missing out on Bitcoin mania. Well to be fair, who would want to miss out on an opportunity to grow their money by more than 1000%. That`s the return posted by Bitcoin since the start of the year. Bitcoin at the time of writing this article is at $15,000 with a marketcap over $250 billion. To put the hype in context, Bitcoin is now valued higher than 90% of companies in S&P 500 list. Supposedly, invented by an anonymous group or an individual under the alias Satoshi Nakamoto in 2009, Bitcoin has been seen as a digital currency that the internet might not need but one that it deserves. Satoshi`s vision was to create a robust electronic payment system that functions smoothly without a need for a trusted third party. This was achievable with the help of Blockchain, a digital ledger which records transactions in an indelible fashion and is distributed across multiple nodes. This ensured that no transaction is altered or deleted, thus completely eliminating the need for a trusted third party. For all the excitement around Bitcoin and the increase in interest level towards owning one, the thought of dissecting the roller coaster journey from sub $1000 level to $15,000 this year seemed pretty exciting. Started year with a bang / Global uncertainties leading to Bitcoin rally - Oct 2016 to Jan 2017 Global uncertainties played a big role in driving Bitcoin`s price and boy 2016 was full of it!  From Brexit to Trump winning the US elections and major economic commotion in terms of Devaluation of China`s Yuan and India`s Demonetization drive all leading to investors seeking shelter in Bitcoin. The first blip - Jan 2017 to March 2017 China as a country has had a major impact in determining Bitcoin`s fate. Early 2017, China contributed to over 80% of Bitcoin transactions indicating the amount of power Chinese traders and investors had in controlling Bitcoin`s price. However, People's Bank of China`s closer inspection of the exchanges revealed several irregularities in the transactions and business practices. Which eventually led to officials halting withdrawals using Bitcoin transactions and taking stringent steps towards cryptocurrency exchanges. Source - Bloomberg On the path tobecoming  mainstream and gaining support from Ethereum - Mar 2017 to June 2017 During this phase, we saw the rise of another cryptocurrency, Ethereum, a close digital cousin of Bitcoin. Similar to Bitcoin, Ethereum is also built on top of Blockchain technology allowing it to build a decentralized public network. However, Ethereum’s capabilities extend beyond being a cryptocurrency and help developers to build and deploy any kind of decentralized applications. Ethereum`s valuation in this period rose from $20 to $375 which was quite beneficial for Bitcoin. As every reference of Ethereum had Bitcoin mentioned as well, whether it was to explain what Ethereum is or how it can take the number 1 cryptocurrency spot in the future. This, coupled with the rise in Blockchain`s popularity, increased Bitcoin`s visibility within USA. The media started observing politicians, celebrities and other prominent personalities speaking on Bitcoin as well. Bitcoin also received a major boost from Chinese exchanges, wherein withdrawals of the cryptocurrency resumed after nearly a four-month freeze. All these factors led to Bitcoin crossing an all time high of $2500, up by more than 150% since the start of the year. The Curious case of Fork - June 2017 to September 2017 The month of July saw the Cryptocurrencies market cap witnessing a sharp decline, with questions being raised on the price volatility and whether Bitcoin`s rally for the year was over? We can of-course now confidently debunk that question. Though, there hasn’t been any proven rationale behind the decline, one of the reasons seems to be profit booking following months of steep rally in valuations witnessed by Bitcoin. Another major factor, which might have driven the price collapse may be an unresolved dispute among leading members of the bitcoin community over how to overcome the growing problem of Bitcoin being slow and expensive. With growing usage of Bitcoin, its performance in terms of transaction time has slowed down. Bitcoin`s network due to its limitation in terms of block size could only execute around 7 transactions per second compared to the VISA Network which could do over 1600 transactions. This also led to transaction fees being increased substantially to $5.00 per transaction and settlement time often taking hours and even days. This eventually put Bitcoin`s flaw in the spotlight when compared with services offered by competitors such as Paypal in terms of cost and transaction time. Source - Coinmarketcap The poor performance of Bitcoin led to investors opting for other cryptocurrencies. The above graph, shows how Bitcoin`s dominance fell substantially compared to other cryptocurrencies such as Ethereum and Litecoin during this time.   With the core-community still unable to come to a consensus on how to improve the performance and update the software, the prospect of a “fork” was raised.  Fork highlights change in underlying software protocol of Bitcoin to make previous rules valid or invalid. There are two types of blockchain forks - soft fork and hard fork. Around August, the community announced to go ahead with a hard fork in the form of Bitcoin Cash. This news was surprisingly taken in a positive manner leading to Bitcoin rebounding strongly and reaching new heights of around $4000 in price. Once bitten, Twice Shy (China) September 2017 to October 2017 Source - Zerohedge The month of September saw another setback for Bitcoin due to measures taken from People`s Bank of China. This time, PBoC banned initial coin offerings (ICO), thus prohibiting the practice of building and selling cryptocurrency to any investors or to finance any startup projects within China. Based on a  report by National Committee of Experts on Internet Financial Security Technology, Chinese Investors were involved in 2.6 Billion Yuan worth of ICOs in January-June, 2017 reflecting China`s exposure towards Bitcoin. My Precious (October 2017 to December 2017) Source - Manmanroe During the last quarter Bitcoin`s surge has shocked even hardcore Bitcoin fanatics.  Everything seems to be going right for Bitcoin at the moment.  While at the start of the year China was the major contributor towards the hike in Bitcoin`s valuation, now the momentum seemed to have shifted to a much sensible and responsible market in terms of Japan who have embraced Bitcoin in quite a positive manner. As you can see from the below graph, Japan now holds more than 50% of transaction compared to USA which is much lesser in size. Besides Japan, we are also seeing positive developments in country such as Russia and India, who are looking to legalize cryptocurrency usage. Moreover, the level of interests towards Bitcoin from institutional investors is at its peak. All these factors have resulted in Bitcoin to cross the 5 digit mark for the first time in Nov, 2017 and touching an all time high figure of close to $20,000 in December, 2017. Post the record high, Bitcoin has been witnessing a crash and rebound phenomenon in the last two weeks of December. From a record high of $20,000 to $11,000 and now at $15,000, Bitcoin is still a volatile digital currency if one is looking for a quick price appreciation. Despite the valuation dilemma and the price volatility, one thing is sure: the world is warming up to the idea of cryptocurrencies and even owning one. There are already several predictions being made on how Bitcoin`s astronomical growth is going to continue in 2018. However, Bitcoin needs to overcome several challenges before it can replace the traditional currency and and be widely accepting in banking practices. Besides, the rise of other cryptocurriencies such as Ethereum, LiteCoin or bitcoin cash who are looking to dethrone Bitcoin from the #1 spot, there are broader issues at hand which the Bitcoin community should prioritize such as how to curb the effect of Bitcoin`s mining activities on the environment and on having smoother reforms as well as building regulatory roadmap from countries before people actually start using instead of just looking it as a tool for making a quick buck.  
Read more
  • 0
  • 33
  • 37275

article-image-3-ways-to-deploy-a-qt-and-opencv-application
Gebin George
02 Apr 2018
16 min read
Save for later

3 ways to deploy a QT and OpenCV application

Gebin George
02 Apr 2018
16 min read
[box type="note" align="" class="" width=""]This article is an excerpt from the book, Computer Vision with OpenCV 3 and Qt5 written by Amin Ahmadi Tazehkandi.  This book covers how to build, test, and deploy Qt and OpenCV apps, either dynamically or statically.[/box] Today, we will learn three different methods to deploy a QT + OpenCV application. It is extremely important to provide the end users with an application package that contains everything it needs to be able to run on the target platform. And demand very little or no effort at all from the users in terms of taking care of the required dependencies. Achieving this kind of works-out-of-the-box condition for an application relies mostly on the type of the linking (dynamic or static) that is used to create an application, and also the specifications of the target operating system. Deploying using static linking Deploying an application statically means that your application will run on its own and it eliminates having to take care of almost all of the needed dependencies, since they are already inside the executable itself. It is enough to simply make sure you select the Release mode while building your application, as seen in the following screenshot: When your application is built in the Release mode, you can simply pick up the produced executable file and ship it to your users. If you try to deploy your application to Windows users, you might face an error similar to the following when your application is executed: The reason for this error is that on Windows, even when building your Qt application statically, you still need to make sure that Visual C++ Redistributables exist on the target system. This is required for C++ applications that are built by using Microsoft Visual C++, and the version of the required redistributables correspond to the Microsoft Visual Studio installed on your computer. In our case, the official title of the installer for these libraries is Visual C++ Redistributables for Visual Studio 2015, and it can be downloaded from the following link: https:/ / www. microsoft. com/en- us/ download/ details. aspx? id= 48145. It is a common practice to include the redistributables installer inside the installer for our application and perform a silent installation of them if they are not already installed. This process happens with most of the applications you use on your Windows PCs, most of the time, without you even noticing it. We already quite briefly talked about the advantages (fewer files to deploy) and disadvantages (bigger executable size) of static linking. But when it is meant in the context of deployment, there are some more complexities that need to be considered. So, here is another (more complete) list of disadvantages, when using static linking to deploy your applications: The building takes more time and the executable size gets bigger and bigger. You can't mix static and shared (dynamic) Qt libraries, which means you can't use the power of plugins and extending your application without building everything from scratch. Static linking, in a sense, means hiding the libraries used to build an application. Unfortunately, this option is not offered with all libraries, and failing to comply with it can lead to licensing issues with your application. This complexity arises partly because of the fact that Qt Framework uses some third-party libraries that do not offer the same set of licensing options as Qt itself. Talking about licensing issues is not a discussion suitable for this book, so we'll suffice with mentioning that you must be careful when you plan to create commercial applications using static linking of Qt libraries. For a detailed list of licenses used by third-party libraries within Qt, you can always refer to the Licenses Used in Qt web page from the following link: http://doc.qt.io/qt-5/ licenses-used-in-qt.html Static linking, even with all of its disadvantages that we just mentioned, is still an option, and a good one in some cases, provided that you can comply with the licensing options of the Qt Framework. For instance, in Linux operating systems where creating an installer for our application requires some extra work and care, static linking can help extremely reduce the effort needed to deploy applications (merely a copy and paste). So, the final decision of whether to use static linking or not is mostly on you and how you plan to deploy your application. Making this important decision will be much easier by the end of this chapter, when you have an overview of the possible linking and deployment methods. Deploying using dynamic linking When you deploy an application built with Qt and OpenCV using shared libraries (or dynamic linking), you need to make sure that the executable of your application is able to reach the runtime libraries of Qt and OpenCV, in order to load and use them. This reachability or visibility of runtime libraries can have different meanings depending on the operating system. For instance, on Windows, you need to copy the runtime libraries to the same folder where your application executable resides, or put them in a folder that is appended to the PATH environment value. Qt Framework offers command-line tools to simplify the deployment of Qt applications on Windows and macOS. As mentioned before, the first thing you need to do is to make sure your application is built in the Release mode, and not Debug mode. Then, if you are on Windows, first copy the executable (let us assume it is called app.exe) from the build folder into a separate folder (which we will refer to as deploy_path) and execute the following commands using a command-line instance: cd deploy_path QT_PATHbinwindeployqt app.exe The windeployqt tool is a deployment helper tool that simplifies the process of copying the required Qt runtime libraries into the same folder as the application executable. Itsimply takes an executable as a parameter and after determining the modules used to create it, copies all required runtime libraries and any additional required dependencies, such as Qt plugins, translations, and so on. This takes care of all the required Qt runtime libraries, but we still need to take care of OpenCV runtime libraries. If you followed all of the steps in Chapter 1, Introduction to OpenCV and Qt, for building OpenCV libraries dynamically, then you only need to manually copy the opencv_world330.dll and opencv_ffmpeg330.dll files from OpenCV installation folder (inside the x86vc14bin folder) into the same folder where your application executable resides. We didn't really go into the benefits of turning on the BUILD_opencv_world option when we built OpenCV in the early chapters of the book; however, it should be clear now that this simplifies the deployment and usage of the OpenCV libraries, by requiring only a single entry for LIBS in the *.pro file and manually copying only a single file (not counting the ffmpeg library) when deploying OpenCV applications. It should be also noted that this method has the disadvantage of copying all OpenCV codes (in a single library) along your application even when you do not need or use all of its modules in a project. Also note that on Windows, as mentioned in the Deploying using static linking section, you still need to similarly provide the end users of your application with Microsoft Visual C++ Redistributables. On a macOS operating system, it is also possible to easily deploy applications written using Qt Framework. For this reason, you can use the macdeployqt command-line tool provided by Qt. Similar to windeployqt, which accepts a Windows executable and fills the same folder with the required libraries, macdeployqt accepts a macOS application bundle and makes it deployable by copying all of the required Qt runtimes as private frameworks inside the bundle itself. Here is an example: cd deploy_path QT_PATH/bin/macdeployqt my_app_bundle Optionally, you can also provide an additional -dmg parameter, which leads to the creation of a macOS *.dmg (disk image) file. As for the deployment of OpenCV libraries when dynamic linking is used, you can create an installer using Qt Installer Framework (which we will learn about in the next section), a third-party provider, or a script that makes sure the required runtime libraries are copied to their required folders. This is because of the fact that simply copying your runtime libraries (whether it is OpenCV or anything else) to the same folder as the application executable does not help with making them visible to an application on macOS. The same also applies to the Linux operating system, where unfortunately even a tool for deploying Qt runtime libraries does not exist (at least for the moment), so we also need to take care of Qt libraries in addition to OpenCV libraries, either by using a trusted third-party provider (which you can search for online) or by using the cross-platform installer provided by Qt itself, combined with some scripting to make sure everything is in place when our application is executed. Deploy using Qt Installer Framework Qt Installer Framework allows you to create cross-platform installers of your Qt applications for Windows, macOS, and Linux operating systems. It allows for creating standard installer wizards where the user is taken through consecutive dialogs that provide all the necessary information, and finally display the progress for when the application is being installed and so on, similar to most of installations you have probably faced, and especially the installation of Qt Framework itself. Qt Installer Framework is based on Qt Framework itself but is provided as a different package and does not require Qt SDK (Qt Framework, Qt Creator, and so on) to be present on a computer. It is also possible to use Qt Installer Framework in order to create installer packages for any application, not just Qt applications. In this section, we are going to learn how to create a basic installer using Qt Installer Framework, which takes care of installing your application on a target computer and copying all the necessary dependencies. The result will be a single executable installer file that you can put on a web server to be downloaded or provide it in a USB stick or CD, or any other media type. This example project will help you get started with working your way around the many great capabilities of Qt Installer Framework by yourself. You can use the following link to download and install the Qt Installer Framework. Make sure to simply download the latest version when you use this link, or any other source for downloading it. At the moment, the latest version is 3.0.2: https://download.qt.io/official_releases/qt-installer-framework After you have downloaded and installed Qt Installer Framework, you can start creating the required files that Qt Installer Framework needs in order to create an installer. You can do this by simply browsing to the Qt Installer Framework, and from the examples folder copying the tutorial folder, which is also a template in case you want to quickly rename and re-edit all of the files and create your installer quickly. We will go the other way and create them manually; first because we want to understand the structure of the required files and folders for the Qt Installer Framework, and second, because it is still quite easy and simple. Here are the required steps for creating an installer: Assuming that you have already finished developing your Qt and OpenCV application, you can start by creating a new folder that will contain the installer files. Let's assume this folder is called deploy. Create an XML file inside the deploy folder and name it config.xml. This XML file must contain the following: <?xml version="1.0" encoding="UTF-8"?> <Installer> <Name>Your application</Name> <Version>1.0.0</Version> <Title>Your application Installer</Title> <Publisher>Your vendor</Publisher> <StartMenuDir>Super App</StartMenuDir> <TargetDir>@HomeDir@/InstallationDirectory</TargetDir> </Installer> Make sure to replace the required XML fields in the preceding code with information relevant to your application and then save and close this file: Now, create a folder named packages inside the deploy folder. This folder will contain the individual packages that you want the user to be able to install, or make them mandatory or optional so that the user can review and decide what will be installed. In the case of simpler Windows applications that are written using Qt and OpenCV, usually it is enough to have just a single package that includes the required files to run your application, and even do silent installation of Microsoft Visual C++ Redistributables. But for more complex cases, and especially when you want to have more control over individual installable elements of your application, you can also go for two or more packages, or even sub-packages. This is done by using domain-like folder names for each package. Each package folder can have a name like com.vendor.product, where vendor and product are replaced by the developer name or company and the application. A subpackage (or sub-component) of a package can be identified by adding. subproduct to the name of the parent package. For instance, you can have the following folders inside the packages folder: com.vendor.product com.vendor.product.subproduct1 com.vendor.product.subproduct2 com.vendor.product.subproduct1.subsubproduct1 … This can go on for as many products (packages) and sub-products (sub-packages) as we like. For our example case, let's create a single folder that contains our executable, since it describes it all and you can create additional packages by simply adding them to the packages folder. Let's name it something like com.amin.qtcvapp. Now, follow these required steps: Now, create two folders inside the new package folder that we created, the com.amin.qtcvapp folder. Rename them to data and meta. These two folders must exist inside all packages. Copy your application files inside the data folder. This folder will be extracted into the target folder exactly as it is (we will talk about setting the target folder of a package in the later steps). In case you are planning to create more than one package, then make sure to separate their data correctly and in a way that it makes sense. Of course, you won't be faced with any errors if you fail to do so, but the users of your application will probably be confused, for instance by skipping a package that should be installed at all times and ending up with an installed application that does not work. Now, switch to the meta folder and create the following two files inside that folder, and fill them with the codes provided for each one of them. The package.xml file should contain the following. There's no need to mention that you must fill the fields inside the XML with values relevant to your package: <?xml version="1.0" encoding="UTF-8"?> <Package> <DisplayName>The component</DisplayName> <Description>Install this component.</Description> <Version>1.0.0</Version> <ReleaseDate>1984-09-16</ReleaseDate> <Default>script</Default> <Script>installscript.qs</Script> </Package> The script in the previous XML file, which is probably the most important part of the creation of an installer, refers to a Qt Installer Script (*.qs file), which is named installerscript.qs and can be used to further customize the package, its target folder, and so on. So, let us create a file with the same name (installscript.qs) inside the meta folder, and use the following code inside it: function Component() { // initializations go here } Component.prototype.isDefault = function() { // select (true) or unselect (false) the component by default return true; } Component.prototype.createOperations = function() { try { // call the base create operations function component.createOperations(); } catch (e) { console.log(e); } } This is the most basic component script, which customizes our package (well, it only performs the default actions) and it can optionally be extended to change the target folder, create shortcuts in the Start menu or desktop (on Windows), and so on. It is a good idea to keep an eye on the Qt Installer Framework documentation and learn about its scripting to be able to create more powerful installers that can put all of the required dependencies of your app in place, and automatically. You can also browse through all of the examples inside the examples folder of the Qt Installer Framework and learn how to deal with different deployment cases. For instance, you can try to create individual packages for Qt and OpenCV dependencies and allow the users to deselect them, in case they already have the Qt runtime libraries on their computer. The last step is to use the binarycreator tool to create our single and standalone installer. Simply run the following command by using a Command Prompt (or Terminal) instance: binarycreator -p packages -c config.xml myinstaller The binarycreator is located inside the Qt Installer Framework bin folder. It requires two parameters that we have already prepared. -p must be followed by our packages folder and -c must be followed by the configuration file (or config.xml) file. After executing this command, you will get myinstaller (on Windows, you can append *.exe to it), which you can execute to install your application. This single file should contain all of the required files needed to run your application, and the rest is taken care of. You only need to provide a download link to this file, or provide it on a CD to your users. The following are the dialogs you will face in this default and most basic installer, which contains most of the usual dialogs you would expect when installing an application: If you go to the installation folder, you will notice that it contains a few more files than you put inside the data folder of your package. Those files are required by the installer to handle modifications and uninstall your application. For instance, the users of your application can easily uninstall your application by executing the maintenance tool executable, which would produce another simple and user-friendly dialog to handle the uninstall process: We saw how to deploy a QT + OpenCV applications using static linking, dynamic linking, and QT installer. If you found our post useful, do check out this book Computer Vision with OpenCV 3 and Qt5  to accentuate your OpenCV applications by developing them with Qt.  
Read more
  • 0
  • 0
  • 37237
article-image-getting-know-generative-models-types
Sunith Shetty
20 Feb 2018
9 min read
Save for later

Getting to know Generative Models and their types

Sunith Shetty
20 Feb 2018
9 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book written by Rajdeep Dua and Manpreet Singh Ghotra titled Neural Network Programming with Tensorflow. In this book, you will use TensorFlow to build and train neural networks of varying complexities, without any hassle.[/box] In today’s tutorial, we will learn about generative models, and their types. We will also look into how discriminative models differs from generative models. Introduction to Generative models Generative models are the family of machine learning models that are used to describe how data is generated. To train a generative model we first accumulate a vast amount of data in any domain and later train a model to create or generate data like it. In other words, these are the models that can learn to create data that is similar to data that we give them. One such approach is using Generative Adversarial Networks (GANs). There are two kinds of machine learning models: generative models and discriminative models. Let's examine the following list of classifiers: decision trees, neural networks, random forests, generalized boosted models, logistic regression, naive bayes, and Support Vector Machine (SVM). Most of these are classifiers and ensemble models. The odd one out here is Naive Bayes. It's the only generative model in the list. The others are examples of discriminative models. The fundamental difference between generative and discriminative models lies in the underlying probability inference structure. Let's go through some of the key differences between generative and discriminative models. Discriminative versus generative models Discriminative models learn P(Y|X), which is the conditional relationship between the target variable Y and features X. This is how least squares regression works, and it is the kind of inference pattern that gets used. It is an approach to sort out the relationship among variables. Generative models aim for a complete probabilistic description of the dataset. With generative models, the goal is to develop the joint probability distribution P(X, Y), either directly or by computing P(Y | X) and P(X) and then inferring the conditional probabilities required to classify newer data. This method requires more solid probabilistic thought than regression demands, but it provides a complete model of the probabilistic structure of the data. Knowing the joint distribution enables you to generate the data; hence, Naive Bayes is a generative model. Suppose we have a supervised learning task, where xi is the given features of the data points and yi is the corresponding labels. One way to predict y on future x is to learn a function f() from (xi,yi) that takes in x and outputs the most likely y. Such models fall in the category of discriminative models, as you are learning how to discriminate between x's from different classes. Methods like SVMs and neural networks fall into this category. Even if you're able to classify the data very accurately, you have no notion of how the data might have been generated. The second approach is to model how the data might have been generated and learn a function f(x,y) that gives a score to the configuration determined by x and y together. Then you can predict y for a new x by finding the y for which the score f(x,y) is maximum. A canonical example of this is Gaussian mixture models. Another example of this is: you can imagine x to be an image and y to be a kind of object like a dog, namely in the image. The probability written as p(y|x) tells us how much the model believes that there is a dog, given an input image compared to all possibilities it knows about. Algorithms that try to model this probability map directly are called discriminative models. Generative models, on the other hand, try to learn a function called the joint probability p(y, x). We can read this as how much the model believes that x is an image and there is a dog y in it at the same time. These two probabilities are related and that could be written as p(y, x) = p(x) p(y|x), with p(x) being how likely it is that the input x is an image. The p(x) probability is usually called a density function in literature. The main reason to call these models generative ultimately connects to the fact that the model has access to the probability of both input and output at the same time. Using this, we can generate images of animals by sampling animal kinds y and new images x from p(y, x). We can mainly learn the density function p(x) which only depends on the input space. Both models are useful; however, comparatively, generative models have an interesting advantage over discriminative models, namely, they have the potential to understand and explain the underlying structure of the input data even when there are no labels available. This is very desirable when working in the real world. Types of generative models Discriminative models have been at the forefront of the recent success in the field of machine learning. Models make predictions that depend on a given input, although they are not able to generate new samples or data. The idea behind the recent progress of generative modeling is to convert the generation problem to a prediction one and use deep learning algorithms to learn such a problem. Autoencoders One way to convert a generative to a discriminative problem can be by learning the mapping from the input space itself. For example, we want to learn an identity map that, for each image x, would ideally predict the same image, namely, x = f(x), where f is the predictive model. This model may not be of use in its current form, but from this, we can create a generative model. Here, we create a model formed of two main components: an encoder model q(h|x) that maps the input to another space, which is referred to as hidden or the latent space represented by h, and a decoder model q(x|h) that learns the opposite mapping from the hidden input space. These components--encoder and decoder--are connected together to create an end-to-end trainable model. Both the encoder and decoder models are neural networks of different architectures, for example, RNNs and Attention Nets, to get desired outcomes. As the model is learned, we can remove the decoder from the encoder and then use them separately. To generate a new data sample, we can first generate a sample from the latent space and then feed that to the decoder to create a new sample from the output space. GAN As seen with autoencoders, we can think of a general concept to create networks that will work together in a relationship, and training them will help us learn the latent spaces that allow us to generate new data samples. Another type of generative network is GAN, where we have a generator model q(x|h) to map the small dimensional latent space of h (which is usually represented as noise samples from a simple distribution) to the input space of x. This is quite similar to the role of decoders in autoencoders. The deal is now to introduce a discriminative model p(y| x), which tries to associate an input instance x to a yes/no binary answer y, about whether the generator model generated the input or was a genuine sample from the dataset we were training on. Let's use the image example done previously. Assume that the generator model creates a new image, and we also have the real image from our actual dataset. If the generator model was right, the discriminator model would not be able to distinguish between the two images easily. If the generator model was poor, it would be very simple to tell which one was a fake or fraud and which one was real. When both these models are coupled, we can train them end to end by assuring that the generator model is getting better over time to fool the discriminator model, while the discriminator model is trained to work on the harder problem of detecting frauds. Finally, we desire a generator model with outputs that are indistinguishable from the real data that we used for the training. Through the initial parts of the training, the discriminator model can easily detect the samples coming from the actual dataset versus the ones generated synthetically by the generator model, which is just beginning to learn. As the generator gets better at modeling the dataset, we begin to see more and more generated samples that look similar to the dataset. The following example depicts the generated images of a GAN model learning over time: Sequence models If the data is temporal in nature, then we can use specialized algorithms called Sequence Models. These models can learn the probability of the form p(y|x_n, x_1), where i is an index signifying the location in the sequence and x_i is the ith  input sample. As an example, we can consider each word as a series of characters, each sentence as a series of words, and each paragraph as a series of sentences. Output y could be the sentiment of the sentence. Using a similar trick from autoencoders, we can replace y with the next item in the series or sequence, namely y = x_n + 1, allowing the model to learn. To summarize, we learned generative models are a fast advancing area of study and research. As we proceed to advance these models and grow the training and datasets, we can expect to generate data examples that depict completely believable images. This can be used in several applications such as image denoising, painting, structured prediction, and exploration in reinforcement learning. To know more about how to build and optimize neural networks using TensorFlow, do checkout this book Neural Network Programming with Tensorflow.    
Read more
  • 0
  • 0
  • 37175

article-image-microsoft-build-2019-microsoft-showcases-new-updates-to-ms-365-platfrom-with-focus-on-ai-and-developer-productivity
Sugandha Lahoti
07 May 2019
10 min read
Save for later

Microsoft Build 2019: Microsoft showcases new updates to MS 365 platform with focus on AI and developer productivity

Sugandha Lahoti
07 May 2019
10 min read
At the ongoing Microsoft Build 2019 conference, Microsoft has announced a ton of new features and tool releases with a focus on innovation using AI and mixed reality with the intelligent cloud and the intelligent edge. In his opening keynote, Microsoft CEO Satya Nadella outlined the company’s vision and developer opportunity across Microsoft Azure, Microsoft Dynamics 365 and IoT Platform, Microsoft 365, and Microsoft Gaming. “As computing becomes embedded in every aspect of our lives, the choices developers make will define the world we live in,” said Satya Nadella, CEO, Microsoft. “Microsoft is committed to providing developers with trusted tools and platforms spanning every layer of the modern technology stack to build magical experiences that create new opportunity for everyone.” https://youtu.be/rIJRFHDr1QE Increasing developer productivity in Microsoft 365 platform Microsoft Graph data connect Microsoft Graphs are now powered with data connectivity, a service that combines analytics data from the Microsoft Graph with customers’ business data. Microsoft Graph data connect will provide Office 365 data and Microsoft Azure resources to users via a toolset. The migration pipelines are deployed and managed through Azure Data Factory. Microsoft Graph data connect can be used to create new apps shared within enterprises or externally in the Microsoft Azure Marketplace. It is generally available as a feature in Workplace Analytics and also as a standalone SKU for ISVs. More information here. Microsoft Search Microsoft Search works as a unified search experience across all Microsoft apps-  Office, Outlook, SharePoint, OneDrive, Bing and Windows. It applies AI technology from Bing and deep personalized insights surfaced by the Microsoft Graph to personalized searches. Other features included in Microsoft Search are: Search box displacement Zero query typing and key-phrase suggestion feature Query history feature, and personal search query history Administrator access to the history of popular searches for their organizations, but not to search history for individual users Files/people/site/bookmark suggestions Microsoft Search will begin publicly rolling out to all Microsoft 365 and Office 365 commercial subscriptions worldwide at the end of May. Read more on MS Search here. Fluid Framework As the name suggests Microsoft's newly launched Fluid framework allows seamless editing and collaboration between different applications. Essentially, it is a web-based platform and componentized document model that allows users to, for example, edit a document in an application like Word and then share a table from that document in Microsoft Teams (or even a third-party application) with real-time syncing. Microsoft says Fluid can translate text, fetch content, suggest edits, perform compliance checks, and more. The company will launch the software developer kit and the first experiences powered by the Fluid Framework later this year on Microsoft Word, Teams, and Outlook. Read more about Fluid framework here. Microsoft Edge new features Microsoft Build 2019 paved way for a bundle of new features to Microsoft’s flagship web browser, Microsoft Edge. New features include: Internet Explorer mode: This mode integrates Internet Explorer directly into the new Microsoft Edge via a new tab. This allows businesses to run legacy Internet Explorer-based apps in a modern browser. Privacy Tools: Additional privacy controls which allow customers to choose from 3 levels of privacy in Microsoft Edge—Unrestricted, Balanced, and Strict. These options limit third parties to track users across the web.  “Unrestricted” allows all third-party trackers to work on the browser. “Balanced” prevents third-party trackers from sites the user has not visited before. And “Strict” blocks all third-party trackers. Collections: Collections allows users to collect, organize, share and export content more efficiently and with Office integration. Microsoft is also migrating Edge as a whole over to Chromium. This will make Edge easier to develop for by third parties. For more details, visit Microsoft’s developer blog. New toolkit enhancements in Microsoft 365 Platform Windows Terminal Windows Terminal is Microsoft’s new application for Windows command-line users. Top features include: User interface with emoji-rich fonts and graphics-processing-unit-accelerated text rendering Multiple tab support and theming and customization features Powerful command-line user experience for users of PowerShell, Cmd, Windows Subsystem for Linux (WSL) and all forms of command-line application Windows Terminal will arrive in mid-June and will be delivered via the Microsoft Store in Windows 10. Read more here. React Native for Windows Microsoft announced a new open-source project for React Native developers at Microsoft Build 2019. Developers who prefer to use the React/web ecosystem to write user-experience components can now leverage those skills and components on Windows by using “React Native for Windows” implementation. React for Windows is under the MIT License and will allow developers to target any Windows 10 device, including PCs, tablets, Xbox, mixed reality devices and more. The project is being developed on GitHub and is available for developers to test. More mature releases will follow soon. Windows Subsystem for Linux 2 Microsoft rolled out a new architecture for Windows Subsystem for Linux: WSL 2 at the MSBuild 2019. Microsoft will also be shipping a fully open-source Linux kernel with Windows specially tuned for WSL 2. New features include massive file system performance increases (twice as much speed for file-system heavy operations, such as Node Package Manager install). WSL also supports running Linux Docker containers. The next generation of WSL arrives for Insiders in mid-June. More information here. New releases in multiple Developer Tools .NET 5 arrives in 2020 .NET 5 is the next major version of the .NET Platform which will be available in 2020. .NET 5 will have all .NET Core features as well as more additions: One Base Class Library containing APIs for building any type of application More choice on runtime experiences Java interoperability will be available on all platforms. Objective-C and Swift interoperability will be supported on multiple operating systems .NET 5 will provide both Just-in-Time (JIT) and Ahead-of-Time (AOT) compilation models to support multiple compute and device scenarios. .NET 5 also will offer one unified toolchain supported by new SDK project types as well as a flexible deployment model (side-by-side and self-contained EXEs) Detailed information here. ML.NET 1.0 ML.NET is Microsoft’s open-source and cross-platform framework that runs on Windows, Linux, and macOS and makes machine learning accessible for .NET developers. Its new version, ML.NET 1.0, was released at the Microsoft Build Conference 2019 yesterday. Some new features in this release are: Automated Machine Learning Preview: Transforms input data by selecting the best performing ML algorithm with the right settings. AutoML support in ML.NET is in preview and currently supports Regression and Classification ML tasks. ML.NET Model Builder Preview: Model Builder is a simple UI tool for developers which uses AutoML to build ML models. It also generates model training and model consumption code for the best performing model. ML.NET CLI Preview: ML.NET CLI is a dotnet tool which generates ML.NET Models using AutoML and ML.NET. The ML.NET CLI quickly iterates through a dataset for a specific ML Task and produces the best model. Visual Studio IntelliCode, Microsoft’s tool for AI-assisted coding Visual Studio IntelliCode, Microsoft’s AI-assisted coding is now generally available. It is essentially an enhanced IntelliSense, Microsoft’s extremely popular code completion tool. Intellicode is trained by using the code of thousands of open-source projects from GitHub that have at least 100 stars. It is available for C# and XAML for Visual Studio and Java, JavaScript, TypeScript, and Python for Visual Studio Code. IntelliCode also is included by default in Visual Studio 2019, starting in version 16.1 Preview 2. Additional capabilities, such as custom models, remain in public preview. Visual Studio 2019 version 16.1 Preview 2 Visual Studio 2019 version 16.1 Preview 2 release includes IntelliCode and the GitHub extensions by default. It also brings out of preview the Time Travel Debugging feature introduced with version 16.0. Also includes multiple performances and productivity improvements for .NET and C++ developers. Gaming and Mixed Reality Minecraft AR game for mobile devices At the end of Microsoft’s Build 2019 keynote yesterday, Microsoft teased a new Minecraft game in augmented reality, running on a phone. The teaser notes that more information will be coming on May 17th, the 10-year anniversary of Minecraft. https://www.youtube.com/watch?v=UiX0dVXiGa8 HoloLens 2 Development Edition and unreal engine support The HoloLens 2 Development Edition includes a HoloLens 2 device, $500 in Azure credits and three-months free trials of Unity Pro and Unity PiXYZ Plugin for CAD data, starting at $3,500 or as low as $99 per month. The HoloLens 2 Development Edition will be available for preorder soon and will ship later this year. Unreal Engine support for streaming and native platform integration will be available for HoloLens 2 by the end of May. Intelligent Edge and IoT Azure IoT Central new features Microsoft Build 2019 also featured new additions to Azure IoT Central, an IoT software-as-a-service solution. Better rules processing and customs rules with services like Azure Functions or Azure Stream Analytics Multiple dashboards and data visualization options for different types of users Inbound and outbound data connectors, so that operators can integrate with   systems Ability to add custom branding and operator resources to an IoT Central application with new white labeling options New Azure IoT Central features are available for customer trials. IoT Plug and Play IoT Plug and Play is a new, open modeling language to connect IoT devices to the cloud seamlessly without developers having to write a single line of embedded code. IoT Plug and Play also enable device manufacturers to build smarter IoT devices that just work with the cloud. Cloud developers will be able to find IoT Plug and Play enabled devices in Microsoft’s Azure IoT Device Catalog. The first device partners include Compal, Kyocera, and STMicroelectronics, among others. Azure Maps Mobility Service Azure Maps Mobility Service is a new API which provides real-time public transit information, including nearby stops, routes and trip intelligence. This API also will provide transit services to help with city planning, logistics, and transportation. Azure Maps Mobility Service will be in public preview in June. Read more about Azure Maps Mobility Service here. KEDA: Kubernetes-based event-driven autoscaling Microsoft and Red Hat collaborated to create KEDA, which is an open-sourced project that supports the deployment of serverless, event-driven containers on Kubernetes. It can be used in any Kubernetes environment — in any public/private cloud or on-premises such as Azure Kubernetes Service (AKS) and Red Hat OpenShift. KEDA has support for built-in triggers to respond to events happening in other services or components. This allows the container to consume events directly from the source, instead of routing through HTTP. KEDA also presents a new hosting option for Azure Functions that can be deployed as a container in Kubernetes clusters. Securing elections and political campaigns ElectionGuard SDK and Microsoft 365 for Campaigns ElectionGuard, is a free open-source software development kit (SDK) as an extension of Microsoft’s Defending Democracy Program to enable end-to-end verifiability and improved risk-limiting audit capabilities for elections in voting systems. Microsoft365 for Campaigns provides security capabilities of Microsoft 365 Business to political parties and individual candidates. More details here. Microsoft Build is in its 6th year and will continue till 8th May. The conference hosts over 6,000 attendees with early 500 student-age developers and over 2,600 customers and partners in attendance. Watch it live here! Microsoft introduces Remote Development extensions to make remote development easier on VS Code Docker announces a collaboration with Microsoft’s .NET at DockerCon 2019 How Visual Studio Code can help bridge the gap between full-stack development and DevOps [Sponsered by Microsoft]
Read more
  • 0
  • 0
  • 37033

article-image-getting-started-with-data-visualization-in-tableau
Amarabha Banerjee
13 Feb 2018
5 min read
Save for later

Getting started with Data Visualization in Tableau

Amarabha Banerjee
13 Feb 2018
5 min read
[box type="note" align="" class="" width=""]This article is an book extract from Mastering Tableau, written by David Baldwin. Tableau has emerged as one of the most popular Business Intelligence solutions in recent times, thanks to its powerful and interactive data visualization capabilities. This book will empower you to become a master in Tableau by exploiting the many new features introduced in Tableau 10.0.[/box] In today’s post, we shall explore data visualization basics with Tableau and explore a real world example using these techniques. Tableau Software has a focused vision resulting in a small product line. The main product (and hence the center of the Tableau universe) is Tableau Desktop. Assuming you are a Tableau author, that's where almost all your time will be spent when working with Tableau. But of course you must be able to connect to data and output the results. Thus, as shown in the following figure, the Tableau universe encompasses data sources, Tableau Desktop, and output channels, which include the Tableau Server family and Tableau Reader: Worksheet and dashboard creation At the heart of Tableau are worksheets and dashboards. Worksheets contain individual visualizations and dashboards contain one or more worksheets. Additionally, worksheets and dashboards may be combined into stories to communicate specific insights to the end user via a presentation environment. Lastly, all worksheets, dashboards, and stories are organized in workbooks that can be accessed via Tableau Desktop, Server, or Reader. In this section, we will look at worksheet and dashboard creation with the intent of not only communicating the basics, but also providing some insight that may prove helpful to even more seasoned Tableau authors. Worksheet creation At the most fundamental level, a visualization in Tableau is created by placing one or more fields on one or more shelves. To state this as a pseudo-equation: Field(s) + shelf(s) = Viz As an example, note that the visualization created in the following screenshot is generated by placing the Sales field on the Text shelf. Although the results are quite simple – a single number – this does qualify as a view. In other words, a Field (Sales) placed on a shelf (Text) has generated a Viz: Exercise – fundamentals of visualizations Let's explore the basics of creating a visualization via an exercise: Navigate to h t t p s ://p u b l i c . t a b l e a u . c o m /p r o f i l e /d a v i d 1. . b a l d w i n #!/ to locate and download the workbook associated with this chapter. In the workbook, find the tab labeled Fundamentals of Visualizations: Locate Region within the Dimensions portion of the Data pane: Drag Region to the Color shelf; that is, Region + Color shelf = what is shown in the following screenshot: Click on the Color shelf and then on Edit Colors… to adjust colors as desired: Next, move Region to the Size, Label/Text, Detail, Columns, and Rows shelves. After placing Region on each shelf, click on the shelf to access additional options. Lastly, choose other fields to drop on various shelves to continue exploring Tableau's behavior. As you continue exploring Tableau's behavior by dragging and dropping different fields onto different shelves, you will notice that Tableau responds with default behaviors. These defaults, however, can be overridden, which we will explore in the following section. Dashboard Creation Although, as stated previously, a dashboard contains one or more worksheets, dashboards are much more than static presentations. They are an essential part of Tableau's interactivity. In this section, we will populate a dashboard with worksheets and then deploy actions for interactivity. Exercise – building a dashboard In the workbook associated with this chapter, navigate to the tab entitled Building a Dashboard. Within the Dashboard pane located on the left-hand portion of the screen, double-click on each of the following worksheets (in the order in which they are listed) to add them to the dashboard: US Sales Customer Segment Scatter Plot Customers In the lower right-hand corner of the dashboard, click in the blank area below Profit Ratio to select the vertical container: After clicking in the blank area, you should see a blue border around the filter and the legends. This indicates that the vertical container is selected. As shown in the following screenshot, select the vertical container handle and drag it to the left-hand side of the Customers worksheet. Note the gray shading, which communicates where the container will be placed: The gray shading (provided by Tableau when dragging elements such as worksheets and containers onto a dashboard) helpfully communicates where the element will be placed. Take your time and observe carefully when placing an element on a dashboard or the results may be Unexpected. 5. Format the dashboard as desired. The following tips may prove helpful: Adjust the sizes of the elements on the screen by hovering over the edges between each element and then clicking and dragging as Desired. 2. Note that the Sales and Profit legends in the following screenshot are floating elements. Make an element float by right-clicking on the element handle and selecting Floating. (See the previous screenshot and note that the handle is located immediately above Region, in the upper-right-hand corner). 3. Create Horizontal and Vertical containers by dragging those objects from the bottom portion of the Dashboard pane. 4. Drag the edges of containers to adjust the size of each worksheet. 5. Display the dashboard title via Dashboard | Show Title…: If you enjoyed our post, be sure to check out Mastering Tableau which consists of many useful data visualization and data analysis techniques.  
Read more
  • 0
  • 0
  • 36691
article-image-f8-pytorch-announcements-pytorch-1-1-releases-with-new-ai-toolsopen-sourcing-botorch-and-ax-and-more
Bhagyashree R
03 May 2019
4 min read
Save for later

F8 PyTorch announcements: PyTorch 1.1 releases with new AI tools, open sourcing BoTorch and Ax, and more

Bhagyashree R
03 May 2019
4 min read
Despite Facebook’s frequent appearance in the news for all the wrong reasons, we cannot deny that its open source contributions to AI have been its one redeeming quality. At its F8 annual developer conference showcasing its exceptional AI prowess, Facebook shared how the production-ready PyTorch 1.0 is being adopted by the community and also the release of PyTorch 1.1. Facebook introduced PyTorch in 2017, and since then it has been well-received by developers. It partnered with the AI community for further development in PyTorch and released the stable version last year in December. Along with optimizing and fixing other parts of PyTorch, the team introduced Just-in-time compilation for production support that allows seamless transitions between eager mode and graph mode. PyTorch 1.0 in leading businesses, communities, and universities Facebook is leveraging end-to-end workflows of PyTorch 1.0 for building and deploying translation and NLP at large scale. These NLP systems are delivering a staggering 6 billion translations for applications such as Messenger. PyTorch has also enabled Facebook to quickly iterate their ML systems. It has helped them accelerate their research-to-production cycle. Other leading organizations and businesses are also now using PyTorch for speeding up the development of AI features. Airbnb’s Smart Reply feature is backed by PyTorch libraries and APIs for conversational AI. ATOM (Accelerating Therapeutics for Opportunities in Medicine) has come up with a variational autoencoder that represents diverse chemical structures and designs new drug candidates. Microsoft has built large-scale distributed language models that are now in production in offerings such as Cognitive Services. PyTorch 1.1 releases with new model understanding and visualization tools Along with showcasing how the production-ready version is being accepted by the community, the PyTorch team further announced the release of PyTorch 1.1. This release focuses on improved performance, brings new model understanding and visualization tools for improved usability, and more. Following are some of the key feature PyTorch 1.1 comes with: Support for TensorBoard: TensorBoard, a suite of visualization tools, is now natively supported in PyTorch. You can use it through the  “from torch.utils.tensorboard import SummaryWriter” command. Improved JIT compiler: Along with some bug fixes, the team has expanded capabilities in TorchScript such as support for dictionaries, user classes, and attributes. Introducing new APIs: New APIs are introduced to support Boolean tensors and custom recurrent neural networks. Distributed training: This release comes with improved performance for common models such as CNNs. Multi-device modules support and the ability to split models across GPUs while still using Distributed Data Parallel is added. Ax, BoTorch, and more: Open source tools for Machine Learning engineers Facebook announced that it is open sourcing two new tools, Ax and BoTorch that are aimed at solving large scale exploration problems both in research and production environment. Built on top of PyTorch, BoTorch leverages its features such as auto-differentiation, massive parallelism, and deep learning to help in researches related Bayesian optimization. Ax is a general purpose ML platform for managing adaptive experiments. Both Ax and BoTorch use probabilistic models that efficiently use data and meaningfully quantify the costs and benefits of exploring new regions of problem space. Facebook has also open sourced PyTorch-BigGraph (PBG), a tool that makes it easier and faster to produce graph embeddings for extremely large graphs with billions of entities and trillions of edges. PBG comes with support for sharding and negative sampling and also offers sample use cases based on Wikidata embedding. As a result of its collaboration with Google, AI Platform Notebooks, a new histed JupyterLab service from Google Cloud Platform, now comes preinstalled with PyTorch. It also comes integrated with other GCP services such as BigQuery, Cloud Dataproc, Cloud Dataflow, and AI Factory. The broader PyTorch community has also come up with some impressive open source tools. BigGAN-Torch is basically a full reimplementation of PyTorch that uses gradient accumulation to provide the benefits of big batches by only using a few GPUs. GeomLoss is an API written in Python that defines PyTorch layers for geometric loss functions between sampled measures, images, and volumes. It provides efficient GPU implementations for Kernel norms, Hausdorff divergences, and unbiased Sinkhorn divergences. PyTorch Geometric is a geometric deep learning extension library for PyTorch consisting of various methods for deep learning on graphs and other irregular structures. Read the official announcement on Facebook’s AI  blog. Facebook open-sources F14 algorithm for faster and memory-efficient hash tables “Is it actually possible to have a free and fair election ever again?,” Pulitzer finalist, Carole Cadwalladr on Facebook’s role in Brexit F8 Developer Conference Highlights: Redesigned FB5 app, Messenger update, new Oculus Quest and Rift S, Instagram shops, and more
Read more
  • 0
  • 0
  • 36491

article-image-getting-to-know-different-big-data-characteristics
Gebin George
05 Jan 2018
4 min read
Save for later

Getting to know different Big data Characteristics

Gebin George
05 Jan 2018
4 min read
[box type="note" align="" class="" width=""]This article is an excerpt from a book written by Osvaldo Martin titled Mastering Predictive Analytics with R, Second Edition. This book will help you leverage the flexibility and modularity of R to experiment with a range of different techniques and data types.[/box] Our article will quickly walk you through all the fundamental characteristics of Big Data. For you to determine if your data source qualifies as big data or as needing special handling, you can start by examining your data source in the following areas: The volume (amount) of data. The variety of data. The number of different sources and spans of the data. Let's examine each of these areas. Volume If you are talking about the number of rows or records, then most likely your data source is not a big data source since big data is typically measured in gigabytes, terabytes, and petabytes. However, space doesn't always mean big, as these size measurements can vary greatly in terms of both volume and functionality. Additionally, data sources of several million records may qualify as big data, given their structure (or lack of structure). Varieties Data used in predictive models may be structured or unstructured (or both) and include transactions from databases, survey results, website logs, application messages, and so on (by using a data source consisting of a higher variety of data, you are usually able to cover a broader context for the analytics you derive from it). Variety, much like volume, is considered a normal qualifier for big data. Sources and spans If the data source for your predictive analytics project is the result of integrating several sources, you most likely hit on both criteria of volume and variety and your data qualifies as big data. If your project uses data that is affected by governmental mandates, consumer requests is a historical analysis, you are almost certainty using big data. Government regulations usually require that certain types of data need to be stored for several years. Products can be consumer driven over the lifetime of the product and with today's trends, historical analysis data is usually available for more than five years. Again, all examples of big data sources. Structure You will often find that data sources typically fall into one of the following three categories: 1. Sources with little or no structure in the data (such as simple text files). 2. Sources containing both structured and unstructured data (like data that is sourced from document management systems or various websites, and so on). 3. Sources containing highly structured data (like transactional data stored in a relational database example). How your data source is categorized will determine how you prepare and work with your data in each phase of your predictive analytics project. Although data sources with structure can obviously still fall into the category of big data, it's data containing both structured and unstructured data (and of course totally unstructured data) that fit as big data and will require special handling and or pre-processing. Statistical noise Finally, we should take a note here that other factors (other than those discussed already in the chapter) can qualify your project data source as being unwieldy, overly complex, or a big data source. These include (but are not limited to): Statistical noise (a term for recognized amounts of unexplained variations within the data) Data suffering from mismatched understandings (the differences in interpretations of the data by communities, cultures, practices, and so on) Once you have determined that the data source that you will be using in your predictive analytics project seems to qualify as big (again as we are using the term here) then you can proceed with the process of deciding how to manage and manipulate that data source, based upon the known challenges this type of data demands, so as to be most effective. In the next section, we will review some of these common problems, before we go on to offer useable solutions. We have learned fundamental characteristics which define Big Data, to further use them for Analytics. If you enjoyed our post, check out the book Mastering Predictive Analytics with R, Second Edition to learn complex machine learning models using R.    
Read more
  • 0
  • 0
  • 36368
Modal Close icon
Modal Close icon