
How-To Tutorials - Data

1210 Articles
Amey Varangaonkar
20 Dec 2017
3 min read

How to Install Keras on Docker and Cloud ML

Note: The following extract is taken from the book Deep Learning with Keras, written by Antonio Gulli and Sujit Pal. It contains useful techniques to train effective deep learning models using the highly popular Keras library.

Keras is a deep learning library that can be used on an enterprise platform by deploying it in a container. In this article, we see how to install Keras on Docker and Google’s Cloud ML.

Installing Keras on Docker

One of the easiest ways to get started with TensorFlow and Keras is to run them in a Docker container. A convenient solution is to use a predefined, community-created Docker image for deep learning that contains all the popular DL frameworks (TensorFlow, Theano, Torch, Caffe, and so on). Refer to the GitHub repository at https://github.com/saiprashanths/dl-docker for the code files. Assuming that you already have Docker up and running (for more information, refer to https://www.docker.com/products/overview), installation is simple.

After getting the image source from Git, we build the Docker image:

[Screenshot: building the Docker image]

We then run it:

[Screenshot: running the Docker container]

From within the container, it is possible to activate support for Jupyter Notebooks (for more information, refer to http://jupyter.org/):

[Screenshot: starting Jupyter Notebook inside the container]

Access it directly from the host machine on port:

[Screenshot: Jupyter Notebook opened from the host machine]

It is also possible to access TensorBoard (for more information, refer to https://www.tensorflow.org/how_tos/summaries_and_tensorboard/) with the help of the following command, which is discussed in the next section:

[Screenshot: TensorBoard command]

After running the preceding command, you will be redirected to the following page:

[Screenshot: TensorBoard page]

Installing Keras on Google Cloud ML

Installing Keras on Google Cloud is very simple.
First, we install the Google Cloud SDK (for the downloadable file, refer to https://cloud.google.com/sdk/), a command-line interface for Google Cloud Platform; then we can use CloudML, a managed service that enables us to easily build machine learning models with TensorFlow. Before using Keras, let's use Google Cloud with TensorFlow to train an MNIST example available on GitHub. The code is local and training happens in the cloud:

[Screenshot: submitting the MNIST training job]

In the following screenshot, you can see how to run a training session:

[Screenshot: running a training session]

We can use TensorBoard to show how cross-entropy decreases across iterations:

[Screenshot: TensorBoard command]

In the next screenshot, we see the graph of cross-entropy:

[Screenshot: cross-entropy graph in TensorBoard]

Now, if we want to use Keras on top of TensorFlow, we simply download the Keras source from PyPI (for the downloadable file, refer to https://pypi.python.org/pypi/Keras/1.2.0 or later versions) and then directly use Keras as a CloudML package solution, as in the following example:

[Screenshot: submitting a CloudML job with the Keras package]

Here, trainer.task2.py is an example script:

    from keras.applications.vgg16 import VGG16
    from keras.models import Model
    from keras.preprocessing import image
    from keras.applications.vgg16 import preprocess_input
    import numpy as np

    # pre-built and pre-trained deep learning VGG16 model
    base_model = VGG16(weights='imagenet', include_top=True)

    # list the index, name, and output shape of every layer
    for i, layer in enumerate(base_model.layers):
        print(i, layer.name, layer.output_shape)

As we have seen, it is fairly easy to set up and run Keras on a Docker container and Cloud ML. If this article interested you, make sure to check out our book Deep Learning with Keras, where you can learn to install Keras on other popular platforms such as Amazon Web Services and Microsoft Azure.

Amey Varangaonkar
11 Dec 2017
6 min read

Implementing a simple Generative Adversarial Network (GANs)

Note: The following excerpt is taken from Chapter 2 - Learning Features with Unsupervised Generative Networks of the book Deep Learning with Theano, written by Christopher Bourez. This book talks about modeling and training effective deep learning models with Theano, a popular Python-based deep learning library.

In this article, we introduce you to the concept of Generative Adversarial Networks, a popular class of Artificial Intelligence algorithms used in unsupervised machine learning. Code files for this particular chapter are available for download towards the end of the post.

Generative adversarial networks are composed of two models that are alternately trained to compete with each other. The generator network G is optimized to reproduce the true data distribution by generating data that is difficult for the discriminator D to differentiate from real data. Meanwhile, the second network D is optimized to distinguish real data from synthetic data generated by G. Overall, the training procedure is similar to a two-player min-max game with the following objective function (written here in its standard form):

    \min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{data}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]

Here, x is real data sampled from the real data distribution, and z is the noise vector of the generative model. In some ways, the discriminator and the generator can be seen as the police and the thief: to be sure the training works correctly, the police are trained twice as much as the thief.

Let's illustrate GANs with the case of images as data. In particular, let's again take our example from Chapter 2, Classifying Handwritten Digits with a Feedforward Network, about MNIST digits, and consider training a generative adversarial network to generate images conditionally on the digit we want. The GAN method consists of training the generative model using a second model, the discriminative network, which discriminates input data between real and fake.
In this case, we can simply reuse our MNIST image classification model as the discriminator, with two classes, real or fake, for the prediction output, and also condition it on the label of the digit that is supposed to be generated. To condition the net on the label, the digit label is concatenated with the inputs:

    # T is theano.tensor; dnn_conv and batchnorm come with the chapter's code files
    import theano.tensor as T

    def conv_cond_concat(x, y):
        # broadcast the label y over the spatial dimensions of x and append it as extra channels
        return T.concatenate(
            [x, y*T.ones((x.shape[0], y.shape[1], x.shape[2], x.shape[3]))],
            axis=1)

    def discrim(X, Y, w, w2, w3, wy):
        yb = Y.dimshuffle(0, 1, 'x', 'x')
        X = conv_cond_concat(X, yb)
        h = T.nnet.relu(dnn_conv(X, w, subsample=(2, 2), border_mode=(2, 2)), alpha=0.2)
        h = conv_cond_concat(h, yb)
        h2 = T.nnet.relu(batchnorm(dnn_conv(h, w2, subsample=(2, 2), border_mode=(2, 2))), alpha=0.2)
        h2 = T.flatten(h2, 2)
        h2 = T.concatenate([h2, Y], axis=1)
        h3 = T.nnet.relu(batchnorm(T.dot(h2, w3)))
        h3 = T.concatenate([h3, Y], axis=1)
        y = T.nnet.sigmoid(T.dot(h3, wy))
        return y

Note the use of two leaky rectified linear units, with a leak of 0.2, as activation for the first two convolutions.

To generate an image given noise and label, the generator network consists of a stack of deconvolutions, using an input noise vector z that consists of 100 real numbers ranging from 0 to 1.

To create a deconvolution in Theano, a dummy convolutional forward pass is created, whose gradient is used as the deconvolution:

    def deconv(X, w, subsample=(1, 1), border_mode=(0, 0), conv_mode='conv'):
        img = gpu_contiguous(T.cast(X, 'float32'))
        kerns = gpu_contiguous(T.cast(w, 'float32'))
        # describe the dummy forward convolution whose input gradient we want
        desc = GpuDnnConvDesc(border_mode=border_mode, subsample=subsample,
                              conv_mode=conv_mode)(
            gpu_alloc_empty(img.shape[0], kerns.shape[1],
                            img.shape[2]*subsample[0],
                            img.shape[3]*subsample[1]).shape,
            kerns.shape)
        out = gpu_alloc_empty(img.shape[0], kerns.shape[1],
                              img.shape[2]*subsample[0],
                              img.shape[3]*subsample[1])
        d_img = GpuDnnConvGradI()(kerns, img, out, desc)
        return d_img

    def gen(Z, Y, w, w2, w3, wx):
        yb = Y.dimshuffle(0, 1, 'x', 'x')
        Z = T.concatenate([Z, Y], axis=1)
        h = T.nnet.relu(batchnorm(T.dot(Z, w)))
        h = T.concatenate([h, Y], axis=1)
        h2 = T.nnet.relu(batchnorm(T.dot(h, w2)))
        h2 = h2.reshape((h2.shape[0], ngf*2, 7, 7))
        h2 = conv_cond_concat(h2, yb)
        h3 = T.nnet.relu(batchnorm(deconv(h2, w3, subsample=(2, 2), border_mode=(2, 2))))
        h3 = conv_cond_concat(h3, yb)
        x = T.nnet.sigmoid(deconv(h3, wx, subsample=(2, 2), border_mode=(2, 2)))
        return x

Real data is given by the tuple (X,Y), while generated data is built from noise and label (Z,Y):

    X = T.tensor4()
    Z = T.matrix()
    Y = T.matrix()
    gX = gen(Z, Y, *gen_params)
    p_real = discrim(X, Y, *discrim_params)
    p_gen = discrim(gX, Y, *discrim_params)

Generator and discriminator models compete during adversarial learning.

The discriminator is trained to label real data as real (1) and generated data as generated (0), hence minimizing the following cost function:

    d_cost = T.nnet.binary_crossentropy(p_real, T.ones(p_real.shape)).mean() \
        + T.nnet.binary_crossentropy(p_gen, T.zeros(p_gen.shape)).mean()

The generator is trained to deceive the discriminator as much as possible. The training signal for the generator is provided by the discriminator network (p_gen):

    g_cost = T.nnet.binary_crossentropy(p_gen, T.ones(p_gen.shape)).mean()

The rest follows as usual. The gradient of the cost with respect to the parameters of each model is computed, and training optimizes the weights of each model alternately, with twice as many updates for the discriminator. In the case of GANs, competition between discriminator and generator does not lead to decreases in each loss.

From the first epoch:

[Figure]

To the 45th epoch:

[Figure]

Generated examples look closer to real ones:

[Figure]

Generative models, and especially Generative Adversarial Networks, are currently a trending area of deep learning.
They have also found their way into a few practical applications. For example, a generative model can successfully be trained to generate the next most likely video frames by learning the features of the previous frames. Another popular example where GANs can be used is search engines that predict the next likely word before the user has even entered it, by studying the sequence of previously entered words.

If you found this excerpt useful, do check out more comprehensive coverage of popular deep learning topics in our book Deep Learning with Theano.

Download files
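
As a rough numerical illustration of the d_cost and g_cost expressions in the article above, the following plain-Python sketch evaluates both costs for hand-picked discriminator outputs. It does not use Theano. The helper names, the clipping constant, and the sample probabilities are assumptions made for this illustration and are not part of the book's code.

```python
import math


def binary_crossentropy(p, target):
    """Binary cross-entropy between a predicted probability p and a 0/1 target."""
    eps = 1e-12  # assumed clipping constant, only here to avoid log(0)
    p = min(max(p, eps), 1.0 - eps)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def mean(values):
    return sum(values) / len(values)


def discriminator_cost(p_real, p_gen):
    # Real samples should be scored 1, generated samples should be scored 0.
    return (mean([binary_crossentropy(p, 1.0) for p in p_real])
            + mean([binary_crossentropy(p, 0.0) for p in p_gen]))


def generator_cost(p_gen):
    # The generator wants the discriminator to score its samples as real (1).
    return mean([binary_crossentropy(p, 1.0) for p in p_gen])


# Hypothetical discriminator outputs for a batch of three samples.
undecided = [0.5, 0.5, 0.5]
confident_real = [0.9, 0.8, 0.95]
confident_fake = [0.1, 0.2, 0.05]

print("undecided D:", discriminator_cost(undecided, undecided), generator_cost(undecided))
print("confident D:", discriminator_cost(confident_real, confident_fake),
      generator_cost(confident_fake))
```

With these made-up numbers, the discriminator cost drops when the discriminator becomes confident and correct, while the generator cost rises. This is one way to see why the two losses do not both decrease during adversarial training.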

Richard Gall
16 Sep 2019
7 min read

4 important business intelligence considerations for the rest of 2019

Business intelligence occupies a strange position, often overshadowed by fields like data science and machine learning. But it remains a critical aspect of modern business - indeed, the less attention the world appears to pay to it, the more it is becoming embedded in modern businesses. Where analytics and dashboards once felt like a shiny and exciting interruption in our professional lives, today they are merely the norm.

But with business intelligence almost baked into the day-to-day routines and activities of many individuals, teams, and organizations, what does this actually mean in practice? For as much as we’d like to think that we’re all data-driven now, the reality is that there’s much we can do to use data more effectively. Research confirms that data-driven initiatives often fail - so with that in mind, here’s what’s important when it comes to business intelligence in 2019.

Popular business intelligence eBooks and videos

  • Oracle Business Intelligence Enterprise Edition 12c - Second Edition
  • Microsoft Power BI Quick Start Guide
  • Implementing Business Intelligence with SQL Server 2019 [Video]
  • Hands-On Business Intelligence with Qlik Sense
  • Hands-On Dashboard Development with QlikView

Getting the balance between self-service business intelligence and centralization

Self-service business intelligence is one of the biggest trends to emerge in the last two years. In practice, this means that a diverse range of stakeholders (marketers and product managers, for example) have access to analytics tools. They’re no longer purely the preserve of data scientists and analysts.

Self-service BI makes a lot of sense in the context of today’s data-rich and data-driven environment. The best way to empower team members to actually use data is to remove any bottlenecks (like a centralized data team) and allow them to go directly to the data and tools they need to make decisions. In essence, self-service business intelligence solutions are a step towards the democratization of data.
However, while the notion of democratizing data sounds like a noble cause, the reality is a little more complex. There are a number of different issues that make self-service BI a challenging thing to get right. One of the biggest pain points, for example, is the skill gap in the teams using these tools. Although self-service BI should make using data easy for team members, even the most user-friendly dashboards need a level of data literacy to be useful.

Read next: What are the limits of self-service BI?

Many analytics products are being developed with this problem in mind. But it’s still hard to get around - you don’t, after all, want to sacrifice the richness of data for simplicity and accessibility.

Another problem is the messiness of data itself - and this ultimately points to one of the paradoxes of self-service BI. You need strong alignment - centralization even - if you’re to ensure true democratization.

The answer to all this isn’t to get tied up in decentralization or centralization. Instead, what’s important is striking a balance between the two. Decentralization needs centralization: there needs to be strong governance and clarity over what data exists, how it’s used, and how it’s accessed, and someone needs to be accountable for that if decentralized, self-service BI is to actually work.

Read next: How Qlik Sense is driving self-service Business Intelligence

Self-service business intelligence: recommended viewing

  • Power BI Masterclass - Beginners to Advanced [Video]

Data storytelling that makes an impact

Data storytelling is a phrase that’s used too much without real consideration as to what it means or how it can be done. Indeed, all too often it’s used to refer to stylish graphs and visualizations. And yes, stylish graphs and data visualizations are part of data storytelling, but you can’t just expect some nice graphics to communicate in-depth data insights to your colleagues and senior management.
To do data storytelling well, you need to establish a clear sense of objectives and goals. By that I’m not referring only to your goals, but also those of the people around you. It goes without saying that data and insight need context, but what that context should be, exactly, is often the hard part. Objectives and aims are perhaps the most straightforward way of establishing that context and ensuring your insights are able to establish the scope of a problem and propose a way forward.

Data storytelling can only really make an impact if you are able to strike a balance between centralization and self-service. Stakeholders that use self-service need confidence that everything they need is both available and accurate - this can only really be ensured by a centralized team of data scientists, architects, and analysts.

Data storytelling: recommended viewing

  • Data Storytelling with Qlik Sense [Video]
  • Data Storytelling with Power BI [Video]

The impact of cloud

It’s impossible to properly appreciate the extent to which cloud is changing the data landscape. Not only is it easier than ever to store and process data, it’s also easy to do different things with it. This means that it’s now possible to do machine learning or artificial intelligence projects with relative ease (the word relative being important, of course).

For business intelligence, this means there needs to be a clear strategy that joins together every piece of the puzzle, from data collection to analysis. There needs to be buy-in and input from stakeholders before a solution is purchased - or built - and then the solution needs to be developed with every individual use case properly understood and supported.

Indeed, this requires a combination of business acumen, soft skills, and technical expertise. A large amount of this will rest on the shoulders of an organization’s technical leadership team, but it’s also worth pointing out that those in other departments still have a part to play.
If stakeholders are unable to present a clear vision of what their needs and goals are, it’s highly likely that the advantages of cloud will pass them by when it comes to business intelligence.

Cloud and business intelligence: recommended viewing

  • Going beyond Dashboards with IBM Cognos Analytics [Video]

Business intelligence ethics

Ethics has become a huge issue for organizations over the last couple of years. With the Cambridge Analytica scandal placing the spotlight on how companies use customer data, and GDPR forcing organizations to take a new approach to (European) user data, it’s undoubtedly the case that ethical considerations have added a new dimension to business intelligence. But what does this actually mean in practice?

Ethics manifests itself in numerous ways in business intelligence. Perhaps the most obvious is data collection - do you have the right to use someone’s data in a certain way? Sometimes the law will make it clear. But other times it will require individuals to exercise judgment and be sensitive to the issues that could arise.

But there are other ways in which individuals and organizations need to think about ethics. Being data-driven is great, especially if you can approach insight in a way that is actionable and proactive. But at the same time it’s vital that business intelligence isn’t just seen as a replacement for human intelligence. Indeed, this is true not just in an ethical sense, but also in terms of sound strategic thinking. Business intelligence without human insight and judgment is really just the opposite of intelligence.

Conclusion: business intelligence needs organizational alignment and buy-in

There are many issues that have been slowly emerging in the business intelligence world for the last half a decade. This might make things feel confusing, but in actual fact it underlines the very nature of the challenges organizations, leadership teams, and engineers face when it comes to business intelligence.
Essentially, doing business intelligence well requires you - and those around you - to tie all these different elements together. It's certainly not straightforward, but with focus and clarity of thought, it's possible to build a really effective BI program that can fulfil organizational needs well into the future.

Packt
22 Sep 2015
4 min read

Implementing Decision Trees

In this article by Sunila Gollapudi, author of the book Practical Machine Learning, we will outline a business problem that can be addressed by building a decision tree-based model, and see how it can be implemented in Apache Mahout, R, Julia, Apache Spark, and Python.

Implementing decision trees

Here, we will explore implementing decision trees using various frameworks and tools.

The R example

We will use the rpart package and the ctree function (from the party package) in R to build decision tree-based models:

1. Import the packages for data import and the decision tree libraries.
2. Start data manipulation: create a categorical variable on Sales and append it to the existing dataset.
3. Using random functions, split the data into training and testing datasets.
4. Fit the tree model with the training data, check how the model performs on the testing data, and measure the error.
5. Prune the tree.
6. Plot the pruned tree.

[Figure: plot of the pruned tree]

The Spark example

A Java-based example using MLlib is shown here:

    import java.util.HashMap;
    import java.util.Map;

    import scala.Tuple2;

    import org.apache.spark.api.java.JavaPairRDD;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.api.java.function.Function;
    import org.apache.spark.api.java.function.PairFunction;
    import org.apache.spark.mllib.regression.LabeledPoint;
    import org.apache.spark.mllib.tree.DecisionTree;
    import org.apache.spark.mllib.tree.model.DecisionTreeModel;
    import org.apache.spark.mllib.util.MLUtils;
    import org.apache.spark.SparkConf;

    public class JavaDecisionTree {
      public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("JavaDecisionTree");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Load and parse the data file.
        String datapath = "data/mllib/sales.txt";
        JavaRDD<LabeledPoint> data = MLUtils.loadLibSVMFile(sc.sc(), datapath).toJavaRDD();

        // Split the data into training and test sets (30% held out for testing)
        JavaRDD<LabeledPoint>[] splits = data.randomSplit(new double[]{0.7, 0.3});
        JavaRDD<LabeledPoint> trainingData = splits[0];
        JavaRDD<LabeledPoint> testData = splits[1];

        // Set parameters.
        // Empty categoricalFeaturesInfo indicates all features are continuous.
        Integer numClasses = 2;
        Map<Integer, Integer> categoricalFeaturesInfo = new HashMap<Integer, Integer>();
        String impurity = "gini";
        Integer maxDepth = 5;
        Integer maxBins = 32;

        // Train a DecisionTree model for classification.
        final DecisionTreeModel model = DecisionTree.trainClassifier(trainingData, numClasses,
            categoricalFeaturesInfo, impurity, maxDepth, maxBins);

        // Evaluate model on test instances and compute test error
        JavaPairRDD<Double, Double> predictionAndLabel =
            testData.mapToPair(new PairFunction<LabeledPoint, Double, Double>() {
              @Override
              public Tuple2<Double, Double> call(LabeledPoint p) {
                return new Tuple2<Double, Double>(model.predict(p.features()), p.label());
              }
            });
        Double testErr =
            1.0 * predictionAndLabel.filter(new Function<Tuple2<Double, Double>, Boolean>() {
              @Override
              public Boolean call(Tuple2<Double, Double> pl) {
                return !pl._1().equals(pl._2());
              }
            }).count() / testData.count();

        System.out.println("Test Error: " + testErr);
        System.out.println("Learned classification tree model:\n" + model.toDebugString());
      }
    }

The Julia example

We will use the DecisionTree package in Julia as shown here:

    julia> Pkg.add("DecisionTree")
    julia> using DecisionTree

We will use the RDatasets package to load the dataset for the example in context:

    julia> Pkg.add("RDatasets"); using RDatasets
    julia> sales = data("datasets", "sales");
    julia> features = array(sales[:, 1:4]); # use matrix() for Julia v0.2
    julia> labels = array(sales[:, 5]); # use vector() for Julia v0.2
    julia> stump = build_stump(labels, features);
    julia> print_tree(stump)
    Feature 3, Threshold 3.0
    L-> price : 50/50
    R-> shelvelock : 50/100

Pruning the tree

    julia> length(tree)
    11
    julia> pruned = prune_tree(tree, 0.9);
    julia> length(pruned)
    9

Summary

In this article, we implemented decision trees using R, Spark, and Julia.

Resources for Article: Further resources on this subject:

  • An overview of common machine learning tasks [article]
  • How to do Machine Learning with Python [article]
  • Modeling complex functions with artificial neural networks [article]
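
The Spark example sets impurity to "gini" and the Julia example builds a one-split stump. As a minimal sketch of what those two ideas mean, the plain-Python code below computes Gini impurity and searches for the single best threshold split. It is not how MLlib or DecisionTree.jl implement them, since MLlib bins feature values, for instance. The function names and the toy sales-like data are assumptions made for this illustration.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_stump(rows, labels):
    """Return (feature_index, threshold, weighted_gini) for the best single split.

    A row goes left when row[feature] < threshold, otherwise right.
    """
    best = None
    n = len(rows)
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] < threshold]
            right = [y for row, y in zip(rows, labels) if row[f] >= threshold]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[2]:
                best = (f, threshold, score)
    return best


# Hypothetical toy data: each row is [price, shelf_location_code].
rows = [[10.0, 1], [12.0, 1], [30.0, 2], [35.0, 3]]
labels = ["low", "low", "high", "high"]

feature, threshold, impurity = best_stump(rows, labels)
print("split on feature", feature, "at", threshold, "weighted Gini", impurity)
```

A full tree would apply the same search recursively to each side of the split until a depth limit such as maxDepth is reached.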

Sunith Shetty
27 Feb 2018
10 min read

Getting to know SQL Server options for disaster recovery

Note: This article is an excerpt from a book written by Marek Chmel and Vladimír Mužný titled SQL Server 2017 Administrator's Guide. This book will help you learn to implement and administer successful database solutions with SQL Server 2017.

Today, we will explore the disaster recovery basics to understand the common terms in high availability and disaster recovery. We will then discuss the HA/DR options that SQL Server offers.

Disaster recovery basics

Disaster recovery (DR) is a set of tools, policies, and procedures that help you recover your systems after a disastrous event. Disaster recovery is just a subset of a more complex discipline called business continuity planning, where more variables come into play and you expect more sophisticated plans on how to recover business operations. It is nearly impossible to avoid disasters completely, but with careful planning you can minimize their effects.

The main goal of a disaster recovery plan is to minimize the downtime of the service and to minimize data loss. To measure these objectives, we use two special metrics: the Recovery Time Objective and the Recovery Point Objective.

Recovery Time Objective (RTO) is the maximum time that you can use to recover the system. This time includes your efforts to fix the problem without starting the disaster recovery procedures, the recovery itself, proper testing after the disaster recovery, and the communication to the stakeholders. Once a disaster strikes, clocks are started to measure the disaster recovery actions, and the Recovery Time Actual (RTA) metric is calculated. If you manage to recover the system within the Recovery Time Objective, which means that RTA < RTO, then you have met the metrics with a proper combination of the plan and your ability to restore the system.

Recovery Point Objective (RPO) is the maximum tolerable period for acceptable data loss.
This defines how much data can be lost due to a disaster. The Recovery Point Objective has an impact on your implementation of backups, because you plan for a recovery strategy that has specific requirements for your backups. If you can afford to lose one day of work, you can plan your backup types and the frequency of your backups accordingly.

[Image: illustration of the recovery time and recovery point concepts discussed above]

When we talk about system availability, we usually use a percentage of the availability time. This availability is a calculated uptime in a given year or month (any date metric that you need) and is usually compared against a table of "9s". Availability also expresses a tolerable downtime in a given time frame so that the system still meets the availability metric. In the following table, we'll see some basic availability options with tolerable downtime a year and a day:

    Availability %   Downtime a year   Downtime a day
    90%              36.5 days         2.4 hours
    98%              7.3 days          28.8 minutes
    99%              3.65 days         14.4 minutes
    99.9%            8.76 hours        1.44 minutes
    99.99%           52.56 minutes     8.64 seconds
    99.999%          5.26 minutes      less than 1 second

This tolerable downtime consists of the unplanned downtime and can be caused by many factors:

  • Natural disasters
  • Hardware failures
  • Human errors (accidental deletes, code breakdowns, and so on)
  • Security breaches
  • Malware

For these, we can have a mitigation plan in place that will help us reduce the downtime to a tolerable range, and we usually deploy a combination of high availability solutions and disaster recovery solutions so that we can quickly restore operations. On the other hand, there's a reasonable set of events that require downtime on your service due to maintenance and regular operations, and this planned downtime does not count against the availability of your system.
These can include the following:

  • New releases of the software
  • Operating system patching
  • SQL Server patching
  • Database maintenance and upgrades

Our goal is to have the database online as much as possible, but there will be times when the database will be offline. From the perspective of management and operations, we're talking about several keywords such as uptime, downtime, time to repair, and time between failures, as you can see in the following image:

[Image: uptime, downtime, time to repair, and time between failures]

It's really critical not only to have a plan for disaster recovery, but also to practice the disaster recovery itself. Many companies test their disaster recovery plan with different types of exercise, where each and every aspect of the disaster recovery is carefully evaluated by teams who are familiar with the tools and procedures for a real disaster event. These exercises may have different scope and frequency, as listed in the following points:

  • Tabletop exercises usually involve only a small number of people and focus on a specific aspect of the DR plan. This would be a DBA team drill to recover a single SQL Server or a small set of servers with a simulated outage.
  • Medium-sized exercises will involve several teams to practice team communication and interaction.
  • Complex exercises usually simulate larger events such as data center loss, where a new virtual data center is built and all new servers and services are provisioned by the involved teams.

Such exercises should be run on a periodic basis so that all the teams and team personnel are up to speed with the disaster recovery plans.

SQL Server options for high availability and disaster recovery

SQL Server has many features that you can put in place to implement a HA/DR solution that will fit your needs.
These features include the following:

  • Always On Failover Cluster
  • Always On Availability Groups
  • Database mirroring
  • Log shipping
  • Replication

In many cases, you will combine several of these features, as your high availability and disaster recovery needs will overlap. HA/DR does not have to be limited to just one single feature. In complex scenarios, you'll plan for a primary high availability solution and a secondary high availability solution that will work as your disaster recovery solution at the same time.

Always On Failover Cluster

An Always On Failover Cluster Instance (FCI) is an instance-level protection mechanism, which is built on top of the Windows Server Failover Clustering (WSFC) feature. The SQL Server instance is installed across multiple WSFC nodes, where it appears in the network as a single computer. All the resources that belong to one SQL Server instance (disk, network, names) can be owned by one node of the cluster and, during any planned or unplanned event such as a failure of any server component, these can be moved to another node in the cluster to preserve operations and minimize downtime, as shown in the following image:

[Image: Always On Failover Cluster Instance]

Always On Availability Groups

Always On Availability Groups were introduced with SQL Server 2012 to bring database-level protection to SQL Server. As with the Always On Failover Cluster, Availability Groups utilize the Windows Failover Cluster feature, but in this case SQL Server is not installed as a single clustered instance; instead, independent instances run on several nodes. These nodes can be configured as Always On Availability Group nodes to host a database, which will be synchronized among the hosts. The replica can be either synchronous or asynchronous, so Always On Availability Groups are a good fit either as a solution for one data center or even for distant data centers to keep your data safe.
With newer SQL Server versions, Always On Availability Groups were enhanced and now provide many features for database high availability and disaster recovery scenarios. You can refer to the following image for a better understanding:

[Image: Always On Availability Groups]

Database mirroring

Database mirroring is an older HA/DR feature available in SQL Server, which provides database-level protection. Mirroring synchronizes a database between two servers, and you can include one more server as a witness to provide the failover quorum. Unlike the previous two features, database mirroring does not require any special setup such as a Failover Cluster, and the configuration can be achieved via SSMS using a wizard available via the database properties. Once a transaction occurs on the primary node, it's copied to the mirrored database on the second node. With proper configuration, database mirroring can provide failover options for high availability with automatic client redirection.

Database mirroring is not a preferred solution for HA/DR, since it has been marked as a deprecated feature since SQL Server 2012 and is replaced by Basic Availability Groups on current versions.

Log shipping

Log shipping, as the name suggests, is a mechanism to keep a database in sync by copying the logs to a remote server. Log shipping, unlike mirroring, does not copy each single transaction, but copies the transactions in batches via a transaction log backup on the primary node and a log restore on the secondary node. Unlike all previously mentioned features, log shipping does not provide an automatic failover option, so it's considered more of a disaster recovery option than a high availability one.
Log shipping operates at regular intervals, in which three jobs have to run:

  • A backup job to back up the transaction log on the primary system
  • A copy job to copy the backups to the secondary system
  • A restore job to restore the transaction log backup on the secondary system

Log shipping supports multiple standby databases, which is quite an advantage compared to database mirroring. One more advantage is the standby configuration for log shipping, which allows read-only access to the secondary database. This is mainly used in reporting scenarios, where the reporting applications use read-only access and such a configuration offloads work to the secondary system.

Replication

Replication is a feature for data movement from one server to another that allows many different scenarios and topologies. Replication uses a publisher/subscriber model, where the publisher is the server offering the content via a replication article and the subscribers receive the data. The configuration is more complex compared to mirroring and log shipping, but it allows much more variety in the configuration for security, performance, and topology. Replication has many benefits, and a few of them are as follows:

  • Works on the object level (whereas other features work on the database or instance level)
  • Allows merge replication, where several servers synchronize data between each other
  • Allows bi-directional synchronization of data
  • Allows partners other than SQL Server (Oracle, for example)

There are several different replication types that can be used with SQL Server, and you can choose among them based on your HA/DR needs and the data availability requirements on the secondary servers. These options include the following:

  • Snapshot replication
  • Transactional replication
  • Peer-to-peer replication
  • Merge replication

We introduced the disaster recovery discipline within the big picture of business continuity on SQL Server.
Disaster recovery is not only about having backups, but more about the ability to bring the service back to operation after severe failures. We have seen several options that can be used to implement part of disaster recovery on SQL Server: log shipping, replication, and mirroring. To know more about how to design and use an optimal database management strategy, check out the book SQL Server 2017 Administrator's Guide.

Ashwin Nair
28 Dec 2017
7 min read

There and back again: Decrypting Bitcoin's 2017 journey from $1000 to $20000

Lately, Bitcoin has emerged as the most popular topic of discussion amongst colleagues, friends and family. The conversations more or less share a theme: inane theories about what Bitcoin is, and a growing fear of missing out on Bitcoin mania. To be fair, who would want to miss an opportunity to grow their money by more than 1000%? That's the return posted by Bitcoin since the start of the year. At the time of writing this article, Bitcoin is at $15,000 with a market cap of over $250 billion. To put the hype in context, Bitcoin is now valued higher than 90% of the companies in the S&P 500 list.

Supposedly invented in 2009 by an anonymous group or individual under the alias Satoshi Nakamoto, Bitcoin has been seen as a digital currency that the internet might not need but one that it deserves. Satoshi's vision was to create a robust electronic payment system that functions smoothly without the need for a trusted third party. This was achievable with the help of Blockchain, a digital ledger that records transactions in an indelible fashion and is distributed across multiple nodes. This ensures that no transaction is altered or deleted, thus eliminating the need for a trusted third party.

Given all the excitement around Bitcoin and the rising interest in owning one, dissecting its roller coaster journey from below $1000 to $15,000 this year seemed pretty exciting.

Started year with a bang / Global uncertainties leading to Bitcoin rally - Oct 2016 to Jan 2017

Global uncertainties played a big role in driving Bitcoin's price, and boy, 2016 was full of them! From Brexit to Trump winning the US elections, to major economic commotion in the form of the devaluation of China's Yuan and India's demonetization drive, all of it led investors to seek shelter in Bitcoin.

The first blip - Jan 2017 to March 2017

China as a country has had a major impact in determining Bitcoin's fate.
In early 2017, China accounted for over 80% of Bitcoin transactions, which indicates how much power Chinese traders and investors had over Bitcoin's price. However, a closer inspection of the exchanges by the People's Bank of China revealed several irregularities in their transactions and business practices. This eventually led officials to halt Bitcoin withdrawals and take stringent steps against cryptocurrency exchanges.

Source - Bloomberg

On the path to becoming mainstream and gaining support from Ethereum - Mar 2017 to June 2017

During this phase, we saw the rise of another cryptocurrency, Ethereum, a close digital cousin of Bitcoin. Like Bitcoin, Ethereum is built on top of Blockchain technology, which allows it to run a decentralized public network. However, Ethereum's capabilities extend beyond being a cryptocurrency and help developers build and deploy any kind of decentralized application. Ethereum's valuation in this period rose from $20 to $375, which was quite beneficial for Bitcoin, as every reference to Ethereum mentioned Bitcoin as well, whether to explain what Ethereum is or how it could take the number 1 cryptocurrency spot in the future. This, coupled with the rise in Blockchain's popularity, increased Bitcoin's visibility within the USA. The media started observing politicians, celebrities and other prominent personalities speaking about Bitcoin as well. Bitcoin also received a major boost from Chinese exchanges, where withdrawals of the cryptocurrency resumed after a freeze of nearly four months. All these factors led to Bitcoin reaching an all-time high of over $2500, up by more than 150% since the start of the year.

The Curious case of Fork - June 2017 to September 2017

The month of July saw the cryptocurrency market cap decline sharply, with questions being raised about price volatility and whether Bitcoin's rally for the year was over. We can of course now confidently debunk that question.
Though there hasn't been any proven rationale behind the decline, one of the reasons seems to be profit booking following months of steep rally in Bitcoin's valuation. Another major factor that might have driven the price collapse is an unresolved dispute among leading members of the Bitcoin community over how to overcome the growing problem of Bitcoin being slow and expensive. With the growing usage of Bitcoin, its transaction times have slowed down. Because of its block size limit, Bitcoin's network could only execute around 7 transactions per second, compared to the VISA network, which could do over 1600 transactions per second. This also led to transaction fees increasing substantially to $5.00 per transaction, and to settlement often taking hours or even days. This eventually put Bitcoin's flaws in the spotlight when compared with the cost and transaction time of services offered by competitors such as PayPal.

Source - Coinmarketcap

The poor performance of Bitcoin led investors to opt for other cryptocurrencies. The above graph shows how Bitcoin's dominance fell substantially compared to other cryptocurrencies such as Ethereum and Litecoin during this time.

With the core community still unable to come to a consensus on how to improve the performance and update the software, the prospect of a "fork" was raised. A fork is a change in the underlying software protocol of Bitcoin that makes previous rules valid or invalid. There are two types of blockchain forks: the soft fork and the hard fork. Around August, the community announced that it would go ahead with a hard fork in the form of Bitcoin Cash. This news was, surprisingly, received positively, leading to Bitcoin rebounding strongly and reaching new heights of around $4000.

Once bitten, Twice Shy (China) September 2017 to October 2017

Source - Zerohedge

The month of September saw another setback for Bitcoin due to measures taken by the People's Bank of China.
This time, the PBoC banned initial coin offerings (ICOs), thus prohibiting the practice of building and selling cryptocurrency to investors or to finance startup projects within China. According to a report by the National Committee of Experts on Internet Financial Security Technology, Chinese investors were involved in 2.6 billion Yuan worth of ICOs in January-June 2017, reflecting China's exposure to Bitcoin.

My Precious (October 2017 to December 2017)

Source - Manmanroe

During the last quarter, Bitcoin's surge has shocked even hardcore Bitcoin fanatics. Everything seems to be going right for Bitcoin at the moment. While at the start of the year China was the major contributor to the hike in Bitcoin's valuation, the momentum now seems to have shifted to a more sensible and responsible market in Japan, which has embraced Bitcoin in quite a positive manner. As you can see from the below graph, Japan now accounts for more than 50% of transactions, compared to the USA, whose share is much smaller.

Besides Japan, we are also seeing positive developments in countries such as Russia and India, which are looking to legalize cryptocurrency usage. Moreover, the level of interest in Bitcoin from institutional investors is at its peak. All these factors resulted in Bitcoin crossing the five-digit mark for the first time in November 2017 and touching an all-time high of close to $20,000 in December 2017.

After the record high, Bitcoin has gone through a crash and rebound in the last two weeks of December. Having moved from a record high of $20,000 to $11,000 and now to $15,000, Bitcoin is still a volatile digital currency if one is looking for quick price appreciation. Despite the valuation dilemma and the price volatility, one thing is sure: the world is warming up to the idea of cryptocurrencies and even owning one. There are already several predictions being made on how Bitcoin's astronomical growth is going to continue in 2018.
However, Bitcoin needs to overcome several challenges before it can replace traditional currency and be widely accepted in banking practices. Besides the rise of other cryptocurrencies such as Ethereum, Litecoin or Bitcoin Cash, which are looking to dethrone Bitcoin from the #1 spot, there are broader issues that the Bitcoin community should prioritize: how to curb the effect of Bitcoin's mining activities on the environment, and how to achieve smoother reforms and a regulatory roadmap from countries, before people actually start using it instead of just treating it as a tool for making a quick buck.
Gebin George
02 Apr 2018
16 min read

3 ways to deploy a Qt and OpenCV application

This article is an excerpt from the book, Computer Vision with OpenCV 3 and Qt5 written by Amin Ahmadi Tazehkandi. This book covers how to build, test, and deploy Qt and OpenCV apps, either dynamically or statically.

Today, we will learn three different methods to deploy a Qt + OpenCV application. It is extremely important to provide the end users with an application package that contains everything it needs to be able to run on the target platform, and that demands very little or no effort from the users in terms of taking care of the required dependencies. Achieving this kind of works-out-of-the-box condition for an application relies mostly on the type of linking (dynamic or static) that is used to create the application, and also on the specifications of the target operating system.

Deploying using static linking

Deploying an application statically means that your application will run on its own, and it eliminates having to take care of almost all of the needed dependencies, since they are already inside the executable itself. It is enough to simply make sure you select the Release mode while building your application, as seen in the following screenshot:

When your application is built in the Release mode, you can simply pick up the produced executable file and ship it to your users. If you try to deploy your application to Windows users, you might face an error similar to the following when your application is executed:

The reason for this error is that on Windows, even when building your Qt application statically, you still need to make sure that the Visual C++ Redistributables exist on the target system. This is required for C++ applications that are built by using Microsoft Visual C++, and the version of the required redistributables corresponds to the Microsoft Visual Studio installed on your computer.
In our case, the official title of the installer for these libraries is Visual C++ Redistributables for Visual Studio 2015, and it can be downloaded from the following link: https://www.microsoft.com/en-us/download/details.aspx?id=48145. It is a common practice to include the redistributables installer inside the installer for our application and perform a silent installation of them if they are not already installed. This process happens with most of the applications you use on your Windows PCs, most of the time, without you even noticing it.

We already quite briefly talked about the advantages (fewer files to deploy) and disadvantages (bigger executable size) of static linking. But in the context of deployment, there are some more complexities that need to be considered. So, here is another (more complete) list of disadvantages of using static linking to deploy your applications:

  • The build takes more time and the executable size gets bigger and bigger.
  • You can't mix static and shared (dynamic) Qt libraries, which means you can't use the power of plugins and extend your application without building everything from scratch.
  • Static linking, in a sense, means hiding the libraries used to build an application. Unfortunately, this option is not offered with all libraries, and failing to comply with it can lead to licensing issues with your application. This complexity arises partly because the Qt Framework uses some third-party libraries that do not offer the same set of licensing options as Qt itself. Talking about licensing issues is not a discussion suitable for this book, so we'll suffice with mentioning that you must be careful when you plan to create commercial applications using static linking of Qt libraries.
For a detailed list of licenses used by third-party libraries within Qt, you can always refer to the Licenses Used in Qt web page from the following link: http://doc.qt.io/qt-5/licenses-used-in-qt.html

Static linking, even with all of the disadvantages that we just mentioned, is still an option, and a good one in some cases, provided that you can comply with the licensing options of the Qt Framework. For instance, on Linux operating systems, where creating an installer for our application requires some extra work and care, static linking can greatly reduce the effort needed to deploy applications (merely a copy and paste). So, the final decision of whether to use static linking or not is mostly up to you and how you plan to deploy your application. Making this important decision will be much easier by the end of this chapter, when you have an overview of the possible linking and deployment methods.

Deploying using dynamic linking

When you deploy an application built with Qt and OpenCV using shared libraries (or dynamic linking), you need to make sure that the executable of your application is able to reach the runtime libraries of Qt and OpenCV, in order to load and use them. This reachability or visibility of runtime libraries can have different meanings depending on the operating system. For instance, on Windows, you need to copy the runtime libraries to the same folder where your application executable resides, or put them in a folder that is appended to the PATH environment variable. The Qt Framework offers command-line tools to simplify the deployment of Qt applications on Windows and macOS. As mentioned before, the first thing you need to do is to make sure your application is built in the Release mode, and not Debug mode.
Then, if you are on Windows, first copy the executable (let us assume it is called app.exe) from the build folder into a separate folder (which we will refer to as deploy_path) and execute the following commands using a command-line instance:

    cd deploy_path
    QT_PATH\bin\windeployqt app.exe

The windeployqt tool is a deployment helper tool that simplifies the process of copying the required Qt runtime libraries into the same folder as the application executable. It simply takes an executable as a parameter and, after determining the modules used to create it, copies all required runtime libraries and any additional required dependencies, such as Qt plugins, translations, and so on. This takes care of all the required Qt runtime libraries, but we still need to take care of the OpenCV runtime libraries. If you followed all of the steps in Chapter 1, Introduction to OpenCV and Qt, for building OpenCV libraries dynamically, then you only need to manually copy the opencv_world330.dll and opencv_ffmpeg330.dll files from the OpenCV installation folder (inside the x86\vc14\bin folder) into the same folder where your application executable resides.

We didn't really go into the benefits of turning on the BUILD_opencv_world option when we built OpenCV in the early chapters of the book; however, it should be clear now that this simplifies the deployment and usage of the OpenCV libraries, by requiring only a single entry for LIBS in the *.pro file and manually copying only a single file (not counting the ffmpeg library) when deploying OpenCV applications. It should also be noted that this method has the disadvantage of shipping all of the OpenCV code (in a single library) along with your application, even when you do not need or use all of its modules in a project. Also note that on Windows, as mentioned in the Deploying using static linking section, you still need to provide the end users of your application with the Microsoft Visual C++ Redistributables.
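The Windows steps above can also be scripted. What follows is only a minimal Python sketch and not something taken from the book: the function names, the argument layout, and the `OPENCV_DLLS` list are assumptions based on the file names mentioned in this section, and it presumes that windeployqt sits in the bin folder of your Qt installation.

```python
import shutil
import subprocess
from pathlib import Path

# DLL names taken from the text above; adjust them to your OpenCV version.
OPENCV_DLLS = ["opencv_world330.dll", "opencv_ffmpeg330.dll"]


def windeployqt_command(qt_path, exe_path):
    """Build the windeployqt command line for the given executable."""
    tool = Path(qt_path) / "bin" / "windeployqt"
    return [str(tool), str(exe_path)]


def stage_files(exe_path, opencv_bin, deploy_path):
    """Copy the executable and the OpenCV runtime DLLs into deploy_path."""
    deploy_dir = Path(deploy_path)
    deploy_dir.mkdir(parents=True, exist_ok=True)
    staged_exe = Path(shutil.copy2(exe_path, deploy_dir))
    for name in OPENCV_DLLS:
        shutil.copy2(Path(opencv_bin) / name, deploy_dir)
    return staged_exe


def deploy(qt_path, exe_path, opencv_bin, deploy_path):
    """Stage the files, then let windeployqt add the Qt runtime libraries."""
    staged_exe = stage_files(exe_path, opencv_bin, deploy_path)
    # check=True raises CalledProcessError if windeployqt reports a failure.
    subprocess.run(windeployqt_command(qt_path, staged_exe), check=True)
    return staged_exe
```

A call such as `deploy(qt_path, "build/app.exe", opencv_bin, "deploy_path")` would then mirror the manual procedure, with `qt_path` and `opencv_bin` pointing at your own installation folders. Keeping the copy step separate from the windeployqt call makes it possible to check the staged folder without having Qt installed.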
On a macOS operating system, it is also possible to easily deploy applications written using the Qt Framework. For this purpose, you can use the macdeployqt command-line tool provided by Qt. Similar to windeployqt, which accepts a Windows executable and fills the same folder with the required libraries, macdeployqt accepts a macOS application bundle and makes it deployable by copying all of the required Qt runtimes as private frameworks inside the bundle itself. Here is an example:

    cd deploy_path
    QT_PATH/bin/macdeployqt my_app_bundle

Optionally, you can also provide an additional -dmg parameter, which leads to the creation of a macOS *.dmg (disk image) file. As for the deployment of OpenCV libraries when dynamic linking is used, you can create an installer using the Qt Installer Framework (which we will learn about in the next section), a third-party provider, or a script that makes sure the required runtime libraries are copied to their required folders. This is because simply copying your runtime libraries (whether it is OpenCV or anything else) to the same folder as the application executable does not make them visible to an application on macOS. The same also applies to the Linux operating system, where unfortunately even a tool for deploying Qt runtime libraries does not exist (at least for the moment), so we also need to take care of Qt libraries in addition to OpenCV libraries, either by using a trusted third-party provider (which you can search for online) or by using the cross-platform installer provided by Qt itself, combined with some scripting to make sure everything is in place when our application is executed.

Deploy using Qt Installer Framework

Qt Installer Framework allows you to create cross-platform installers of your Qt applications for Windows, macOS, and Linux operating systems.
It allows for creating standard installer wizards where the user is taken through consecutive dialogs that provide all the necessary information and finally display the progress while the application is being installed, similar to most of the installations you have probably faced, and especially the installation of the Qt Framework itself. Qt Installer Framework is based on the Qt Framework itself but is provided as a different package and does not require the Qt SDK (Qt Framework, Qt Creator, and so on) to be present on a computer. It is also possible to use Qt Installer Framework to create installer packages for any application, not just Qt applications.

In this section, we are going to learn how to create a basic installer using Qt Installer Framework, which takes care of installing your application on a target computer and copying all the necessary dependencies. The result will be a single executable installer file that you can put on a web server to be downloaded, or provide on a USB stick, a CD, or any other media type. This example project will help you get started with working your way around the many great capabilities of Qt Installer Framework by yourself.

You can use the following link to download and install the Qt Installer Framework. Make sure to download the latest version when you use this link, or any other source for downloading it. At the moment, the latest version is 3.0.2: https://download.qt.io/official_releases/qt-installer-framework

After you have downloaded and installed Qt Installer Framework, you can start creating the files that it needs in order to create an installer. You can do this by simply browsing to the Qt Installer Framework folder and, from the examples folder, copying the tutorial folder, which also serves as a template in case you want to quickly rename and re-edit all of the files and create your installer quickly.
We will go the other way and create them manually; first because we want to understand the structure of the required files and folders for the Qt Installer Framework, and second, because it is still quite easy and simple. Here are the required steps for creating an installer:

  • Assuming that you have already finished developing your Qt and OpenCV application, you can start by creating a new folder that will contain the installer files. Let's assume this folder is called deploy.
  • Create an XML file inside the deploy folder and name it config.xml. This XML file must contain the following:

    <?xml version="1.0" encoding="UTF-8"?>
    <Installer>
        <Name>Your application</Name>
        <Version>1.0.0</Version>
        <Title>Your application Installer</Title>
        <Publisher>Your vendor</Publisher>
        <StartMenuDir>Super App</StartMenuDir>
        <TargetDir>@HomeDir@/InstallationDirectory</TargetDir>
    </Installer>

    Make sure to replace the required XML fields in the preceding code with information relevant to your application and then save and close this file.
  • Now, create a folder named packages inside the deploy folder. This folder will contain the individual packages that you want the user to be able to install; you can make them mandatory or optional so that the user can review and decide what will be installed.

In the case of simpler Windows applications that are written using Qt and OpenCV, it is usually enough to have just a single package that includes the required files to run your application, and even performs a silent installation of the Microsoft Visual C++ Redistributables. But for more complex cases, and especially when you want to have more control over individual installable elements of your application, you can also go for two or more packages, or even sub-packages. This is done by using domain-like folder names for each package. Each package folder can have a name like com.vendor.product, where vendor and product are replaced by the developer name or company and the application.
A subpackage (or sub-component) of a package can be identified by adding .subproduct to the name of the parent package. For instance, you can have the following folders inside the packages folder:

    com.vendor.product
    com.vendor.product.subproduct1
    com.vendor.product.subproduct2
    com.vendor.product.subproduct1.subsubproduct1
    …

This can go on for as many products (packages) and sub-products (sub-packages) as we like. For our example case, let's create a single folder that contains our executable, since it describes it all and you can create additional packages by simply adding them to the packages folder. Let's name it something like com.amin.qtcvapp. Now, follow these required steps:

  • Create two folders inside the new package folder that we created, the com.amin.qtcvapp folder. Name them data and meta. These two folders must exist inside all packages.
  • Copy your application files inside the data folder. This folder will be extracted into the target folder exactly as it is (we will talk about setting the target folder of a package in the later steps). In case you are planning to create more than one package, make sure to separate their data correctly and in a way that makes sense. Of course, you won't be faced with any errors if you fail to do so, but the users of your application will probably be confused, for instance by skipping a package that should be installed at all times and ending up with an installed application that does not work.
  • Now, switch to the meta folder and create the following two files inside that folder, and fill them with the code provided for each one of them.

The package.xml file should contain the following.
There's no need to mention that you must fill the fields inside the XML with values relevant to your package:

    <?xml version="1.0" encoding="UTF-8"?>
    <Package>
        <DisplayName>The component</DisplayName>
        <Description>Install this component.</Description>
        <Version>1.0.0</Version>
        <ReleaseDate>1984-09-16</ReleaseDate>
        <Default>script</Default>
        <Script>installscript.qs</Script>
    </Package>

The script in the previous XML file, which is probably the most important part of the creation of an installer, refers to a Qt Installer Script (*.qs file), which is named installscript.qs and can be used to further customize the package, its target folder, and so on. So, let us create a file with the same name (installscript.qs) inside the meta folder, and use the following code inside it:

    function Component()
    {
        // initializations go here
    }

    Component.prototype.isDefault = function()
    {
        // select (true) or unselect (false) the component by default
        return true;
    }

    Component.prototype.createOperations = function()
    {
        try {
            // call the base create operations function
            component.createOperations();
        } catch (e) {
            console.log(e);
        }
    }

This is the most basic component script, which customizes our package (well, it only performs the default actions), and it can optionally be extended to change the target folder, create shortcuts in the Start menu or on the desktop (on Windows), and so on. It is a good idea to keep an eye on the Qt Installer Framework documentation and learn about its scripting to be able to create more powerful installers that can put all of the required dependencies of your app in place automatically. You can also browse through all of the examples inside the examples folder of the Qt Installer Framework and learn how to deal with different deployment cases. For instance, you can try to create individual packages for Qt and OpenCV dependencies and allow the users to deselect them, in case they already have the Qt runtime libraries on their computer.
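The folder layout described in the steps above can also be generated by a short script. The following Python sketch is an illustration only and is not part of the book or of Qt Installer Framework: `scaffold_installer` is a hypothetical helper name, and the file contents are simply the placeholder XML and script from this section, which you would still need to edit for your own application.

```python
from pathlib import Path

CONFIG_XML = """<?xml version="1.0" encoding="UTF-8"?>
<Installer>
    <Name>Your application</Name>
    <Version>1.0.0</Version>
    <Title>Your application Installer</Title>
    <Publisher>Your vendor</Publisher>
    <StartMenuDir>Super App</StartMenuDir>
    <TargetDir>@HomeDir@/InstallationDirectory</TargetDir>
</Installer>
"""

PACKAGE_XML = """<?xml version="1.0" encoding="UTF-8"?>
<Package>
    <DisplayName>The component</DisplayName>
    <Description>Install this component.</Description>
    <Version>1.0.0</Version>
    <ReleaseDate>1984-09-16</ReleaseDate>
    <Default>script</Default>
    <Script>installscript.qs</Script>
</Package>
"""

INSTALL_SCRIPT = """function Component()
{
    // initializations go here
}

Component.prototype.isDefault = function()
{
    return true;
}

Component.prototype.createOperations = function()
{
    component.createOperations();
}
"""


def scaffold_installer(deploy_dir, package="com.amin.qtcvapp"):
    """Create config.xml plus the data and meta folders of one package."""
    deploy = Path(deploy_dir)
    package_dir = deploy / "packages" / package
    data_dir = package_dir / "data"
    meta_dir = package_dir / "meta"
    data_dir.mkdir(parents=True, exist_ok=True)
    meta_dir.mkdir(parents=True, exist_ok=True)
    (deploy / "config.xml").write_text(CONFIG_XML, encoding="utf-8")
    (meta_dir / "package.xml").write_text(PACKAGE_XML, encoding="utf-8")
    (meta_dir / "installscript.qs").write_text(INSTALL_SCRIPT, encoding="utf-8")
    # The caller copies the application files into this folder afterwards.
    return data_dir
```

Calling `scaffold_installer("deploy")` returns the path of the data folder, so the application executable and its runtime libraries can be copied there as the next step.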
The last step is to use the binarycreator tool to create our single and standalone installer. Simply run the following command by using a Command Prompt (or Terminal) instance:

    binarycreator -p packages -c config.xml myinstaller

The binarycreator tool is located inside the Qt Installer Framework bin folder. It requires two parameters that we have already prepared: -p must be followed by our packages folder and -c must be followed by the configuration file (config.xml). After executing this command, you will get myinstaller (on Windows, you can append .exe to it), which you can execute to install your application. This single file should contain all of the files needed to run your application, and the rest is taken care of. You only need to provide a download link to this file, or provide it on a CD to your users.

The following are the dialogs you will face in this default and most basic installer, which contains most of the usual dialogs you would expect when installing an application:

If you go to the installation folder, you will notice that it contains a few more files than you put inside the data folder of your package. Those files are required by the installer to handle modifications and uninstall your application. For instance, the users of your application can easily uninstall it by executing the maintenance tool executable, which produces another simple and user-friendly dialog to handle the uninstall process:

We saw how to deploy a Qt + OpenCV application using static linking, dynamic linking, and the Qt Installer Framework. If you found our post useful, do check out the book Computer Vision with OpenCV 3 and Qt5 to accentuate your OpenCV applications by developing them with Qt.

Sunith Shetty
20 Feb 2018
9 min read

Getting to know Generative Models and their types

This article is an excerpt from a book written by Rajdeep Dua and Manpreet Singh Ghotra titled Neural Network Programming with Tensorflow. In this book, you will use TensorFlow to build and train neural networks of varying complexities, without any hassle.

In today's tutorial, we will learn about generative models and their types. We will also look into how discriminative models differ from generative models.

Introduction to Generative models

Generative models are the family of machine learning models that are used to describe how data is generated. To train a generative model, we first accumulate a vast amount of data in any domain and then train a model to create or generate data like it. In other words, these are models that can learn to create data that is similar to the data we give them. One such approach is using Generative Adversarial Networks (GANs).

There are two kinds of machine learning models: generative models and discriminative models. Let's examine the following list of classifiers: decision trees, neural networks, random forests, generalized boosted models, logistic regression, naive Bayes, and Support Vector Machine (SVM). Most of these are classifiers and ensemble models. The odd one out here is Naive Bayes. It's the only generative model in the list. The others are examples of discriminative models. The fundamental difference between generative and discriminative models lies in the underlying probability inference structure. Let's go through some of the key differences between generative and discriminative models.

Discriminative versus generative models

Discriminative models learn P(Y|X), which is the conditional relationship between the target variable Y and the features X. This is how least squares regression works, and it is the kind of inference pattern that gets used. It is an approach to sort out the relationship among variables.
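A toy discrete example may help fix the notation. The Python sketch below is an illustration only: the joint table `JOINT` and its probabilities are made up, and the function names are hypothetical. It shows that once a joint distribution P(X, Y) is known, the conditional P(Y|X) that a discriminative model targets can be derived from it, and samples of (x, y) pairs can be drawn as well.

```python
import random

# Hypothetical joint distribution P(X, Y): X is a weather feature, Y a label.
JOINT = {
    ("sunny", "play"): 0.4,
    ("sunny", "stay"): 0.1,
    ("rainy", "play"): 0.1,
    ("rainy", "stay"): 0.4,
}


def marginal_x(joint, x):
    """P(X = x), obtained by summing the joint over all labels y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def conditional_y_given_x(joint, x):
    """P(Y | X = x) = P(X = x, Y) / P(X = x)."""
    px = marginal_x(joint, x)
    return {y: p / px for (xi, y), p in joint.items() if xi == x}


def predict(joint, x):
    """Classify x by picking the label with the highest conditional probability."""
    cond = conditional_y_given_x(joint, x)
    return max(cond, key=cond.get)


def sample(joint, rng):
    """Draw one (x, y) pair from the joint distribution."""
    pairs = list(joint)
    weights = [joint[pair] for pair in pairs]
    return rng.choices(pairs, weights=weights)[0]
```

For instance, `predict(JOINT, "rainy")` picks the label with the larger conditional probability for a rainy day, while `sample(JOINT, random.Random(0))` draws a new (x, y) pair. A table holding only P(Y|X) would support the first call but not the second, because it says nothing about how likely each x is.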
Generative models aim for a complete probabilistic description of the dataset. With generative models, the goal is to develop the joint probability distribution P(X, Y), either directly or by computing P(X | Y) and P(Y), and then to infer the conditional probabilities required to classify new data. This method requires more solid probabilistic thought than regression demands, but it provides a complete model of the probabilistic structure of the data. Knowing the joint distribution enables you to generate the data; hence, Naive Bayes is a generative model.

Suppose we have a supervised learning task, where x_i are the given features of the data points and y_i are the corresponding labels. One way to predict y for a future x is to learn a function f() from the pairs (x_i, y_i) that takes in x and outputs the most likely y. Such models fall in the category of discriminative models, as you are learning how to discriminate between x's from different classes. Methods like SVMs and neural networks fall into this category. Even if you are able to classify the data very accurately, you have no notion of how the data might have been generated.

The second approach is to model how the data might have been generated and learn a function f(x, y) that gives a score to the configuration determined by x and y together. You can then predict y for a new x by finding the y for which the score f(x, y) is maximum. A canonical example of this is Gaussian mixture models.

As another example, imagine x to be an image and y to be a kind of object, such as a dog, in the image. The probability written as p(y|x) tells us how much the model believes that there is a dog, given an input image, compared to all the possibilities it knows about. Algorithms that try to model this probability map directly are called discriminative models. Generative models, on the other hand, try to learn a function called the joint probability p(y, x).
We can read this as how much the model believes that x is an image and there is a dog y in it at the same time. These two probabilities are related, which can be written as p(y, x) = p(x) p(y|x), with p(x) being how likely it is that the input x is an image. The probability p(x) is usually called a density function in the literature.

The main reason to call these models generative is that the model has access to the probability of both input and output at the same time. Using this, we can generate images of animals by sampling animal kinds y and new images x from p(y, x). We can also learn only the density function p(x), which depends solely on the input space.

Both kinds of model are useful; however, generative models have an interesting advantage over discriminative models: they have the potential to understand and explain the underlying structure of the input data even when there are no labels available. This is very desirable when working in the real world.

Types of generative models

Discriminative models have been at the forefront of the recent success in the field of machine learning. These models make predictions that depend on a given input, although they are not able to generate new samples or data. The idea behind the recent progress in generative modeling is to convert the generation problem into a prediction problem and use deep learning algorithms to learn it.

Autoencoders

One way to convert a generative problem into a discriminative one is to learn a mapping from the input space to itself. For example, we want to learn an identity map that, for each image x, would ideally predict the same image, namely, x = f(x), where f is the predictive model. This model may not be of use in its current form, but from this, we can create a generative model.
Here, we create a model formed of two main components: an encoder model q(h|x) that maps the input to another space, referred to as the hidden or latent space and represented by h, and a decoder model q(x|h) that learns the opposite mapping, from the hidden space back to the input space. These components, the encoder and the decoder, are connected together to create an end-to-end trainable model. Both the encoder and decoder models are neural networks, possibly of different architectures (for example, RNNs and attention networks), chosen to get the desired outcomes. Once the model is learned, we can detach the decoder from the encoder and use them separately. To generate a new data sample, we first draw a sample from the latent space and then feed it to the decoder to create a new sample in the output space.

GAN

As seen with autoencoders, we can think of a general concept of creating networks that work together, such that training them helps us learn latent spaces that allow us to generate new data samples. Another type of generative network is the GAN, where we have a generator model q(x|h) that maps the low-dimensional latent space of h (usually represented as noise samples from a simple distribution) to the input space of x. This is quite similar to the role of the decoder in autoencoders.

The idea is now to introduce a discriminative model p(y|x), which tries to associate an input instance x with a yes/no binary answer y about whether the input was produced by the generator model or was a genuine sample from the dataset we are training on.

Let's use the image example from earlier. Assume that the generator model creates a new image, and that we also have a real image from our actual dataset. If the generator model is good, the discriminator model will not be able to distinguish between the two images easily. If the generator model is poor, it will be very simple to tell which one is fake and which one is real.
When both these models are coupled, we can train them end to end by ensuring that the generator model gets better over time at fooling the discriminator model, while the discriminator model is trained on the increasingly hard problem of detecting fakes. Ultimately, we want a generator model whose outputs are indistinguishable from the real data used for training.

In the initial parts of training, the discriminator model can easily tell the samples coming from the actual dataset apart from the ones generated synthetically by the generator model, which is just beginning to learn. As the generator gets better at modeling the dataset, we begin to see more and more generated samples that look similar to the dataset.

[Figure: generated images of a GAN model learning over time]

Sequence models

If the data is temporal in nature, we can use specialized algorithms called sequence models. These models can learn probabilities of the form p(y | x_n, ..., x_1), where i is an index signifying the location in the sequence and x_i is the ith input sample. As an example, we can consider each word as a series of characters, each sentence as a series of words, and each paragraph as a series of sentences. The output y could be the sentiment of the sentence.

Using a trick similar to the one from autoencoders, we can replace y with the next item in the series or sequence, namely y = x_{n+1}, allowing the model to learn from the sequence itself.

To summarize, generative models are a fast-advancing area of study and research. As we continue to advance these models and grow the training sets, we can expect to generate data examples that depict completely believable images. This can be used in several applications such as image denoising, painting, structured prediction, and exploration in reinforcement learning.

To know more about how to build and optimize neural networks using TensorFlow, check out the book Neural Network Programming with Tensorflow.
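To make the joint-distribution idea from the discriminative versus generative discussion concrete, the following is a minimal sketch of a one-feature Gaussian class-conditional model, which is the single-feature case of Gaussian Naive Bayes. It estimates p(y) and p(x|y), scores a pair with the joint p(x, y) = p(y) p(x|y), classifies by picking the y with the highest joint score, and samples new (x, y) pairs. The toy data and all function names (fit_joint, joint, classify, sample) are assumptions made for this illustration.

```python
import math
import random
from collections import defaultdict


def fit_joint(xs, ys):
    """Estimate p(y) and a Gaussian p(x | y) for every class label y."""
    groups = defaultdict(list)
    for x, y in zip(xs, ys):
        groups[y].append(x)
    model = {}
    for y, values in groups.items():
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        # Store (prior, mean, variance); floor the variance to avoid division by zero.
        model[y] = (len(values) / len(xs), mean, max(var, 1e-6))
    return model


def joint(x, y, model):
    """Joint density p(x, y) = p(y) * p(x | y)."""
    prior, mean, var = model[y]
    density = math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    return prior * density


def classify(x, model):
    """Predict the label whose joint score with x is highest."""
    return max(model, key=lambda y: joint(x, y, model))


def sample(model, rng):
    """Generate a new (x, y) pair: draw y from p(y), then x from p(x | y)."""
    labels = list(model)
    y = rng.choices(labels, weights=[model[label][0] for label in labels])[0]
    _, mean, var = model[y]
    return rng.gauss(mean, math.sqrt(var)), y


# Hypothetical toy data: small x values belong to class 0, large ones to class 1.
xs = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]

model = fit_joint(xs, ys)
print("predicted label for x=1.0:", classify(1.0, model))
print("three generated (x, y) pairs:", [sample(model, random.Random(seed)) for seed in range(3)])
```

The same fitted model supports both uses described in the article: classification through the joint score, and generation of new data by sampling, which a purely discriminative model cannot do.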

Sugandha Lahoti
07 May 2019
10 min read

Microsoft Build 2019: Microsoft showcases new updates to MS 365 platform with focus on AI and developer productivity

At the ongoing Microsoft Build 2019 conference, Microsoft has announced a ton of new features and tool releases with a focus on innovation using AI and mixed reality with the intelligent cloud and the intelligent edge. In his opening keynote, Microsoft CEO Satya Nadella outlined the company’s vision and developer opportunity across Microsoft Azure, Microsoft Dynamics 365 and IoT Platform, Microsoft 365, and Microsoft Gaming.

“As computing becomes embedded in every aspect of our lives, the choices developers make will define the world we live in,” said Satya Nadella, CEO, Microsoft. “Microsoft is committed to providing developers with trusted tools and platforms spanning every layer of the modern technology stack to build magical experiences that create new opportunity for everyone.”

https://youtu.be/rIJRFHDr1QE

Increasing developer productivity in Microsoft 365 platform

Microsoft Graph data connect

Microsoft Graph now offers data connect, a service that combines analytics data from the Microsoft Graph with customers’ business data. Microsoft Graph data connect will provide Office 365 data and Microsoft Azure resources to users via a toolset. The migration pipelines are deployed and managed through Azure Data Factory. Microsoft Graph data connect can be used to create new apps shared within enterprises or externally in the Microsoft Azure Marketplace. It is generally available as a feature in Workplace Analytics and also as a standalone SKU for ISVs.

Microsoft Search

Microsoft Search works as a unified search experience across all Microsoft apps: Office, Outlook, SharePoint, OneDrive, Bing, and Windows. It applies AI technology from Bing and deep personalized insights surfaced by the Microsoft Graph to personalize searches.
Other features included in Microsoft Search are:
  • Search box displacement
  • Zero query typing and key-phrase suggestion feature
  • Query history feature, and personal search query history
  • Administrator access to the history of popular searches for their organizations, but not to search history for individual users
  • Files/people/site/bookmark suggestions

Microsoft Search will begin publicly rolling out to all Microsoft 365 and Office 365 commercial subscriptions worldwide at the end of May.

Fluid Framework

As the name suggests, Microsoft's newly launched Fluid Framework allows seamless editing and collaboration between different applications. Essentially, it is a web-based platform and componentized document model that allows users to, for example, edit a document in an application like Word and then share a table from that document in Microsoft Teams (or even a third-party application) with real-time syncing. Microsoft says Fluid can translate text, fetch content, suggest edits, perform compliance checks, and more. The company will launch the software developer kit and the first experiences powered by the Fluid Framework later this year on Microsoft Word, Teams, and Outlook.

Microsoft Edge new features

Microsoft Build 2019 paved the way for a bundle of new features in Microsoft’s flagship web browser, Microsoft Edge. New features include:
  • Internet Explorer mode: This mode integrates Internet Explorer directly into the new Microsoft Edge via a new tab. This allows businesses to run legacy Internet Explorer-based apps in a modern browser.
  • Privacy Tools: Additional privacy controls allow customers to choose from three levels of privacy in Microsoft Edge: Unrestricted, Balanced, and Strict. These options limit the ability of third parties to track users across the web. “Unrestricted” allows all third-party trackers to work on the browser. “Balanced” prevents third-party trackers from sites the user has not visited before. “Strict” blocks all third-party trackers.
  • Collections: Collections allows users to collect, organize, share, and export content more efficiently and with Office integration.

Microsoft is also migrating Edge as a whole over to Chromium. This will make Edge easier for third parties to develop for. For more details, visit Microsoft’s developer blog.

New toolkit enhancements in Microsoft 365 Platform

Windows Terminal

Windows Terminal is Microsoft’s new application for Windows command-line users. Top features include:
  • User interface with emoji-rich fonts and graphics-processing-unit-accelerated text rendering
  • Multiple tab support and theming and customization features
  • Powerful command-line user experience for users of PowerShell, Cmd, Windows Subsystem for Linux (WSL), and all forms of command-line application

Windows Terminal will arrive in mid-June and will be delivered via the Microsoft Store in Windows 10.

React Native for Windows

Microsoft announced a new open-source project for React Native developers at Microsoft Build 2019. Developers who prefer to use the React/web ecosystem to write user-experience components can now leverage those skills and components on Windows by using the “React Native for Windows” implementation. React Native for Windows is under the MIT License and will allow developers to target any Windows 10 device, including PCs, tablets, Xbox, mixed reality devices, and more. The project is being developed on GitHub and is available for developers to test. More mature releases will follow soon.

Windows Subsystem for Linux 2

Microsoft rolled out a new architecture for Windows Subsystem for Linux, WSL 2, at Microsoft Build 2019. Microsoft will also be shipping a fully open-source Linux kernel with Windows, specially tuned for WSL 2. New features include massive file system performance increases (twice as much speed for file-system heavy operations, such as Node Package Manager install). WSL 2 also supports running Linux Docker containers.
The next generation of WSL arrives for Insiders in mid-June.

New releases in multiple Developer Tools

.NET 5 arrives in 2020

.NET 5 is the next major version of the .NET Platform and will be available in 2020. .NET 5 will have all .NET Core features as well as more additions:
  • One Base Class Library containing APIs for building any type of application
  • More choice on runtime experiences
  • Java interoperability will be available on all platforms
  • Objective-C and Swift interoperability will be supported on multiple operating systems

.NET 5 will provide both Just-in-Time (JIT) and Ahead-of-Time (AOT) compilation models to support multiple compute and device scenarios. .NET 5 will also offer one unified toolchain supported by new SDK project types, as well as a flexible deployment model (side-by-side and self-contained EXEs).

ML.NET 1.0

ML.NET is Microsoft’s open-source and cross-platform framework that runs on Windows, Linux, and macOS and makes machine learning accessible for .NET developers. Its new version, ML.NET 1.0, was released at the Microsoft Build Conference 2019 yesterday. Some new features in this release are:
  • Automated Machine Learning Preview: Transforms input data by selecting the best performing ML algorithm with the right settings. AutoML support in ML.NET is in preview and currently supports Regression and Classification ML tasks.
  • ML.NET Model Builder Preview: Model Builder is a simple UI tool for developers which uses AutoML to build ML models. It also generates model training and model consumption code for the best performing model.
  • ML.NET CLI Preview: ML.NET CLI is a dotnet tool which generates ML.NET models using AutoML and ML.NET. The ML.NET CLI quickly iterates through a dataset for a specific ML task and produces the best model.

Visual Studio IntelliCode, Microsoft’s tool for AI-assisted coding

Visual Studio IntelliCode, Microsoft’s AI-assisted coding tool, is now generally available.
It is essentially an enhanced IntelliSense, Microsoft’s extremely popular code completion tool. IntelliCode is trained on the code of thousands of open-source projects from GitHub that have at least 100 stars. It is available for C# and XAML in Visual Studio, and for Java, JavaScript, TypeScript, and Python in Visual Studio Code. IntelliCode is also included by default in Visual Studio 2019, starting in version 16.1 Preview 2. Additional capabilities, such as custom models, remain in public preview.

Visual Studio 2019 version 16.1 Preview 2

The Visual Studio 2019 version 16.1 Preview 2 release includes IntelliCode and the GitHub extensions by default. It also brings out of preview the Time Travel Debugging feature introduced with version 16.0, and includes multiple performance and productivity improvements for .NET and C++ developers.

Gaming and Mixed Reality

Minecraft AR game for mobile devices

At the end of Microsoft’s Build 2019 keynote yesterday, Microsoft teased a new Minecraft game in augmented reality, running on a phone. The teaser notes that more information will be coming on May 17th, the 10-year anniversary of Minecraft.

https://www.youtube.com/watch?v=UiX0dVXiGa8

HoloLens 2 Development Edition and Unreal Engine support

The HoloLens 2 Development Edition includes a HoloLens 2 device, $500 in Azure credits, and three-month free trials of Unity Pro and the Unity PiXYZ Plugin for CAD data, starting at $3,500 or as low as $99 per month. The HoloLens 2 Development Edition will be available for preorder soon and will ship later this year. Unreal Engine support for streaming and native platform integration will be available for HoloLens 2 by the end of May.

Intelligent Edge and IoT

Azure IoT Central new features

Microsoft Build 2019 also featured new additions to Azure IoT Central, an IoT software-as-a-service solution.
  • Better rules processing and custom rules with services like Azure Functions or Azure Stream Analytics
  • Multiple dashboards and data visualization options for different types of users
  • Inbound and outbound data connectors, so that operators can integrate with systems
  • Ability to add custom branding and operator resources to an IoT Central application with new white labeling options

New Azure IoT Central features are available for customer trials.

IoT Plug and Play

IoT Plug and Play is a new, open modeling language to connect IoT devices to the cloud seamlessly without developers having to write a single line of embedded code. IoT Plug and Play also enables device manufacturers to build smarter IoT devices that just work with the cloud. Cloud developers will be able to find IoT Plug and Play enabled devices in Microsoft’s Azure IoT Device Catalog. The first device partners include Compal, Kyocera, and STMicroelectronics, among others.

Azure Maps Mobility Service

Azure Maps Mobility Service is a new API which provides real-time public transit information, including nearby stops, routes, and trip intelligence. This API will also provide transit services to help with city planning, logistics, and transportation. Azure Maps Mobility Service will be in public preview in June.

KEDA: Kubernetes-based event-driven autoscaling

Microsoft and Red Hat collaborated to create KEDA, an open-source project that supports the deployment of serverless, event-driven containers on Kubernetes. It can be used in any Kubernetes environment, in any public or private cloud or on-premises, including Azure Kubernetes Service (AKS) and Red Hat OpenShift. KEDA has support for built-in triggers to respond to events happening in other services or components. This allows the container to consume events directly from the source, instead of routing through HTTP.
KEDA also presents a new hosting option for Azure Functions that can be deployed as a container in Kubernetes clusters.

Securing elections and political campaigns

ElectionGuard SDK and Microsoft 365 for Campaigns

ElectionGuard is a free, open-source software development kit (SDK), released as an extension of Microsoft’s Defending Democracy Program, that enables end-to-end verifiability and improved risk-limiting audit capabilities for elections in voting systems. Microsoft 365 for Campaigns provides the security capabilities of Microsoft 365 Business to political parties and individual candidates.

Microsoft Build is in its 6th year and will continue till 8th May. The conference hosts over 6,000 attendees, with nearly 500 student-age developers and over 2,600 customers and partners in attendance.

  • Microsoft introduces Remote Development extensions to make remote development easier on VS Code
  • Docker announces a collaboration with Microsoft’s .NET at DockerCon 2019
  • How Visual Studio Code can help bridge the gap between full-stack development and DevOps [Sponsored by Microsoft]

Amarabha Banerjee
13 Feb 2018
5 min read

Getting started with Data Visualization in Tableau

This article is a book extract from Mastering Tableau, written by David Baldwin. Tableau has emerged as one of the most popular Business Intelligence solutions in recent times, thanks to its powerful and interactive data visualization capabilities. This book will empower you to become a master in Tableau by exploiting the many new features introduced in Tableau 10.0.

In today’s post, we shall explore data visualization basics with Tableau and work through a real-world example using these techniques.

Tableau Software has a focused vision resulting in a small product line. The main product (and hence the center of the Tableau universe) is Tableau Desktop. Assuming you are a Tableau author, that's where almost all your time will be spent when working with Tableau. But of course you must be able to connect to data and output the results. Thus, as shown in the following figure, the Tableau universe encompasses data sources, Tableau Desktop, and output channels, which include the Tableau Server family and Tableau Reader:

Worksheet and dashboard creation

At the heart of Tableau are worksheets and dashboards. Worksheets contain individual visualizations, and dashboards contain one or more worksheets. Additionally, worksheets and dashboards may be combined into stories to communicate specific insights to the end user via a presentation environment. Lastly, all worksheets, dashboards, and stories are organized in workbooks that can be accessed via Tableau Desktop, Server, or Reader. In this section, we will look at worksheet and dashboard creation with the intent of not only communicating the basics, but also providing some insight that may prove helpful to even more seasoned Tableau authors.

Worksheet creation

At the most fundamental level, a visualization in Tableau is created by placing one or more fields on one or more shelves.
To state this as a pseudo-equation:

Field(s) + shelf(s) = Viz

As an example, note that the visualization created in the following screenshot is generated by placing the Sales field on the Text shelf. Although the result is quite simple (a single number), this does qualify as a view. In other words, a Field (Sales) placed on a shelf (Text) has generated a Viz:

Exercise: fundamentals of visualizations

Let's explore the basics of creating a visualization via an exercise:

1. Navigate to https://public.tableau.com/profile/david.baldwin#!/ to locate and download the workbook associated with this chapter.
2. In the workbook, find the tab labeled Fundamentals of Visualizations:
3. Locate Region within the Dimensions portion of the Data pane:
4. Drag Region to the Color shelf; that is, Region + Color shelf = what is shown in the following screenshot:
5. Click on the Color shelf and then on Edit Colors… to adjust colors as desired:
6. Next, move Region to the Size, Label/Text, Detail, Columns, and Rows shelves. After placing Region on each shelf, click on the shelf to access additional options.
7. Lastly, choose other fields to drop on various shelves to continue exploring Tableau's behavior.

As you continue exploring Tableau's behavior by dragging and dropping different fields onto different shelves, you will notice that Tableau responds with default behaviors. These defaults, however, can be overridden, which we will explore in the following section.

Dashboard creation

Although, as stated previously, a dashboard contains one or more worksheets, dashboards are much more than static presentations. They are an essential part of Tableau's interactivity. In this section, we will populate a dashboard with worksheets and then deploy actions for interactivity.

Exercise: building a dashboard

1. In the workbook associated with this chapter, navigate to the tab entitled Building a Dashboard.
2. Within the Dashboard pane located on the left-hand portion of the screen, double-click on each of the following worksheets (in the order in which they are listed) to add them to the dashboard:
  • US Sales
  • Customer Segment
  • Scatter Plot
  • Customers
3. In the lower right-hand corner of the dashboard, click in the blank area below Profit Ratio to select the vertical container: After clicking in the blank area, you should see a blue border around the filter and the legends. This indicates that the vertical container is selected.
4. As shown in the following screenshot, select the vertical container handle and drag it to the left-hand side of the Customers worksheet. Note the gray shading, which communicates where the container will be placed: The gray shading (provided by Tableau when dragging elements such as worksheets and containers onto a dashboard) helpfully communicates where the element will be placed. Take your time and observe carefully when placing an element on a dashboard, or the results may be unexpected.
5. Format the dashboard as desired. The following tips may prove helpful:
  1. Adjust the sizes of the elements on the screen by hovering over the edges between each element and then clicking and dragging as desired.
  2. Note that the Sales and Profit legends in the following screenshot are floating elements. Make an element float by right-clicking on the element handle and selecting Floating. (See the previous screenshot and note that the handle is located immediately above Region, in the upper-right-hand corner.)
  3. Create Horizontal and Vertical containers by dragging those objects from the bottom portion of the Dashboard pane.
  4. Drag the edges of containers to adjust the size of each worksheet.
  5. Display the dashboard title via Dashboard | Show Title…:

If you enjoyed our post, be sure to check out Mastering Tableau, which covers many useful data visualization and data analysis techniques.
Bhagyashree R
03 May 2019
4 min read

F8 PyTorch announcements: PyTorch 1.1 releases with new AI tools, open sourcing BoTorch and Ax, and more

Despite Facebook’s frequent appearance in the news for all the wrong reasons, we cannot deny that its open source contributions to AI have been its one redeeming quality. At its F8 annual developer conference showcasing its exceptional AI prowess, Facebook shared how the production-ready PyTorch 1.0 is being adopted by the community and also announced the release of PyTorch 1.1.

Facebook introduced PyTorch in 2017, and since then it has been well-received by developers. It partnered with the AI community for further development of PyTorch and released the stable version last year in December. Along with optimizing and fixing other parts of PyTorch, the team introduced Just-in-time compilation for production support that allows seamless transitions between eager mode and graph mode.

PyTorch 1.0 in leading businesses, communities, and universities

Facebook is leveraging end-to-end workflows of PyTorch 1.0 for building and deploying translation and NLP at large scale. These NLP systems are delivering a staggering 6 billion translations for applications such as Messenger. PyTorch has also enabled Facebook to quickly iterate on their ML systems. It has helped them accelerate their research-to-production cycle.

Other leading organizations and businesses are also now using PyTorch for speeding up the development of AI features. Airbnb’s Smart Reply feature is backed by PyTorch libraries and APIs for conversational AI. ATOM (Accelerating Therapeutics for Opportunities in Medicine) has come up with a variational autoencoder that represents diverse chemical structures and designs new drug candidates. Microsoft has built large-scale distributed language models that are now in production in offerings such as Cognitive Services.

PyTorch 1.1 releases with new model understanding and visualization tools

Along with showcasing how the production-ready version is being accepted by the community, the PyTorch team further announced the release of PyTorch 1.1.
This release focuses on improved performance, brings new model understanding and visualization tools for improved usability, and more. Following are some of the key features PyTorch 1.1 comes with:
  • Support for TensorBoard: TensorBoard, a suite of visualization tools, is now natively supported in PyTorch. You can use it through the “from torch.utils.tensorboard import SummaryWriter” import.
  • Improved JIT compiler: Along with some bug fixes, the team has expanded capabilities in TorchScript, such as support for dictionaries, user classes, and attributes.
  • Introducing new APIs: New APIs are introduced to support Boolean tensors and custom recurrent neural networks.
  • Distributed training: This release comes with improved performance for common models such as CNNs. Support for multi-device modules and the ability to split models across GPUs while still using Distributed Data Parallel have been added.

Ax, BoTorch, and more: Open source tools for Machine Learning engineers

Facebook announced that it is open sourcing two new tools, Ax and BoTorch, that are aimed at solving large-scale exploration problems in both research and production environments. Built on top of PyTorch, BoTorch leverages its features, such as auto-differentiation, massive parallelism, and deep learning, to help with research related to Bayesian optimization. Ax is a general-purpose ML platform for managing adaptive experiments. Both Ax and BoTorch use probabilistic models that efficiently use data and meaningfully quantify the costs and benefits of exploring new regions of problem space.

Facebook has also open sourced PyTorch-BigGraph (PBG), a tool that makes it easier and faster to produce graph embeddings for extremely large graphs with billions of entities and trillions of edges. PBG comes with support for sharding and negative sampling and also offers sample use cases based on Wikidata embedding.
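The TensorBoard support listed among the PyTorch 1.1 features can be tried with a few lines. The sketch below is a minimal example rather than an official recipe: the helper name log_scalars, the tag "train/loss", and the loss values are hypothetical, while SummaryWriter, add_scalar, and close are the calls exposed by torch.utils.tensorboard. The import is guarded so that the snippet still runs, and simply logs nothing, on a machine without PyTorch and TensorBoard installed.

```python
import tempfile


def log_scalars(log_dir, values, tag="train/loss"):
    """Write one scalar per step to TensorBoard event files in log_dir.

    Returns the number of values logged, or 0 if PyTorch/TensorBoard
    are not installed (hypothetical helper for illustration only).
    """
    try:
        from torch.utils.tensorboard import SummaryWriter
    except ImportError:
        return 0
    writer = SummaryWriter(log_dir=log_dir)
    for step, value in enumerate(values):
        # Each call records one (step, value) point under the given tag.
        writer.add_scalar(tag, value, global_step=step)
    writer.close()
    return len(values)


# Hypothetical loss values standing in for a real training loop.
run_dir = tempfile.mkdtemp()
logged = log_scalars(run_dir, [1.0, 0.5, 0.25])
print(f"logged {logged} points to {run_dir}")
```

If the points were written, running `tensorboard --logdir` on the printed directory should show the curve under the chosen tag.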
As a result of its collaboration with Google, AI Platform Notebooks, a new hosted JupyterLab service from Google Cloud Platform, now comes preinstalled with PyTorch. It also comes integrated with other GCP services such as BigQuery, Cloud Dataproc, Cloud Dataflow, and AI Factory.

The broader PyTorch community has also come up with some impressive open source tools. BigGAN-PyTorch is a full PyTorch reimplementation of BigGAN that uses gradient accumulation to provide the benefits of big batches while using only a few GPUs. GeomLoss is an API written in Python that defines PyTorch layers for geometric loss functions between sampled measures, images, and volumes. It provides efficient GPU implementations for Kernel norms, Hausdorff divergences, and unbiased Sinkhorn divergences. PyTorch Geometric is a geometric deep learning extension library for PyTorch consisting of various methods for deep learning on graphs and other irregular structures.

Read the official announcement on Facebook’s AI blog.

  • Facebook open-sources F14 algorithm for faster and memory-efficient hash tables
  • “Is it actually possible to have a free and fair election ever again?,” Pulitzer finalist, Carole Cadwalladr on Facebook’s role in Brexit
  • F8 Developer Conference Highlights: Redesigned FB5 app, Messenger update, new Oculus Quest and Rift S, Instagram shops, and more

Gebin George
05 Jan 2018
4 min read

Getting to know different Big data Characteristics

Note: This article is an excerpt from a book written by Osvaldo Martin titled Mastering Predictive Analytics with R, Second Edition. This book will help you leverage the flexibility and modularity of R to experiment with a range of different techniques and data types.

This article quickly walks you through the fundamental characteristics of big data. To determine whether your data source qualifies as big data or needs special handling, you can start by examining it in the following areas:

The volume (amount) of data.
The variety of data.
The number of different sources and spans of the data.

Let's examine each of these areas.

Volume

If you are talking about the number of rows or records, then most likely your data source is not a big data source, since big data is typically measured in gigabytes, terabytes, and petabytes. However, space doesn't always mean big, as these size measurements can vary greatly in terms of both volume and functionality. Additionally, data sources of several million records may qualify as big data, given their structure (or lack of structure).

Varieties

Data used in predictive models may be structured or unstructured (or both) and include transactions from databases, survey results, website logs, application messages, and so on. By using a data source consisting of a higher variety of data, you are usually able to cover a broader context for the analytics you derive from it. Variety, much like volume, is considered a normal qualifier for big data.

Sources and spans

If the data source for your predictive analytics project is the result of integrating several sources, you most likely hit both the volume and variety criteria, and your data qualifies as big data. If your project uses data that is affected by governmental mandates or consumer requests, or is a historical analysis, you are almost certainly using big data. Government regulations usually require that certain types of data be stored for several years. Products can be consumer driven over the lifetime of the product, and with today's trends, historical analysis data is usually available for more than five years. Again, these are all examples of big data sources.

Structure

You will often find that data sources typically fall into one of the following three categories:

1. Sources with little or no structure in the data (such as simple text files).
2. Sources containing both structured and unstructured data (like data that is sourced from document management systems or various websites, and so on).
3. Sources containing highly structured data (like transactional data stored in a relational database, for example).

How your data source is categorized will determine how you prepare and work with your data in each phase of your predictive analytics project. Although data sources with structure can obviously still fall into the category of big data, it's data containing both structured and unstructured data (and of course totally unstructured data) that fits as big data and will require special handling and/or pre-processing.

Statistical noise

Finally, we should note that other factors (other than those already discussed) can qualify your project data source as being unwieldy, overly complex, or a big data source. These include (but are not limited to):

Statistical noise (a term for recognized amounts of unexplained variations within the data).
Data suffering from mismatched understandings (the differences in interpretations of the data by communities, cultures, practices, and so on).

Once you have determined that the data source that you will be using in your predictive analytics project seems to qualify as big (again, as we are using the term here), you can proceed with deciding how to manage and manipulate that data source, based upon the known challenges this type of data presents, so as to be most effective. In the next section, we will review some of these common problems, before we go on to offer usable solutions.

We have learned the fundamental characteristics that define big data, so that we can use them further for analytics. If you enjoyed our post, check out the book Mastering Predictive Analytics with R, Second Edition to learn complex machine learning models using R.

Aaron Lazar
06 Feb 2018
8 min read

How to Create a Neural Network in TensorFlow

Note: This article has been extracted from the book Principles of Data Science authored by Sinan Ozdemir. With a unique approach that bridges the gap between mathematics and computer science, the book takes you through the entire data science pipeline, beginning with cleaning and preparing data, and moving on to effective data mining strategies and techniques to help you get to grips with machine learning.

In this article, we’re going to learn how to create a neural network whose goal will be to classify images. Tensorflow is an open-source machine learning module that is used primarily for its simplified deep learning and neural network abilities. I would like to take some time to introduce the module and solve a few quick problems using tensorflow.

Let’s begin with some imports:

    from sklearn import datasets, metrics
    import tensorflow as tf
    import numpy as np
    # train_test_split lives in sklearn.model_selection (formerly sklearn.cross_validation)
    from sklearn.model_selection import train_test_split
    %matplotlib inline

Loading our iris dataset:

    # Our data set of iris flowers
    iris = datasets.load_iris()
    # Load datasets and split them for training and testing
    X_train, X_test, y_train, y_test = train_test_split(iris.data, iris.target)

Creating the Neural Network:

    # Specify that all features have real-value data
    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=4)]
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.1)
    # Build 3 layer DNN with 10, 20, 10 units respectively.
    classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                                hidden_units=[10, 20, 10],
                                                optimizer=optimizer,
                                                n_classes=3)
    # Fit model.
    classifier.fit(x=X_train, y=y_train, steps=2000)

Notice that our code really hasn't changed from the last segment. We still have our feature_columns from before, but now we introduce, instead of a linear classifier, a DNNClassifier, which stands for Deep Neural Network Classifier. This is TensorFlow's syntax for implementing a neural network. Let's take a closer look:

    tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                   hidden_units=[10, 20, 10],
                                   optimizer=optimizer,
                                   n_classes=3)

We see that we are inputting the same feature_columns, n_classes, and optimizer, but see how we have a new parameter called hidden_units? This list represents the number of nodes to have in each layer between the input and the output layer. All in all, this neural network will have five layers:

The first layer will have four nodes, one for each of the iris feature variables. This layer is the input layer.
A hidden layer of 10 nodes.
A hidden layer of 20 nodes.
A hidden layer of 10 nodes.
The final layer will have three nodes, one for each possible outcome of the network. This is called our output layer.

Now that we've trained our model, let's evaluate it on our test set:

    # Evaluate accuracy.
    accuracy_score = classifier.evaluate(x=X_test, y=y_test)["accuracy"]
    print('Accuracy: {0:f}'.format(accuracy_score))

    Accuracy: 0.921053

Hmm, our neural network didn't do so well on this dataset, but perhaps it is because the network is a bit too complicated for such a simple dataset. Let's introduce a new dataset that has a bit more to it…

The MNIST dataset consists of over 50,000 handwritten digits (0-9), and the goal is to recognize the handwritten digits and output which digit was written. Tensorflow has a built-in mechanism for downloading and loading these images.

    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)

    Extracting MNIST_data/train-images-idx3-ubyte.gz
    Extracting MNIST_data/train-labels-idx1-ubyte.gz
    Extracting MNIST_data/t10k-images-idx3-ubyte.gz
    Extracting MNIST_data/t10k-labels-idx1-ubyte.gz

Notice that one of our inputs for downloading mnist is called one_hot. This parameter either brings in the dataset's target variable (which is the digit itself) as a single number or as a dummy (one-hot) vector. For example, if the first digit were a 7, the target would either be:

7: If one_hot was false
0 0 0 0 0 0 0 1 0 0: If one_hot was true (notice that starting from 0, the seventh index is a 1)

We will encode our target the former way, as this is what our tensorflow neural network and our sklearn logistic regression will expect. The dataset is split up already into a training and test set, so let's create new variables to hold them:

    x_mnist = mnist.train.images
    y_mnist = mnist.train.labels.astype(int)

For the y_mnist variable, I specifically cast every target as an integer (by default they come in as floats) because otherwise tensorflow would throw an error at us. Out of curiosity, let's take a look at a single image:

    import matplotlib.pyplot as plt
    plt.imshow(x_mnist[10].reshape(28, 28))

And hopefully our target variable matches at the 10th index as well:

    y_mnist[10]
    0

Excellent! Let's now take a peek at how big our dataset is:

    x_mnist.shape
    (55000, 784)
    y_mnist.shape
    (55000,)

Our training size then is 55000 images and target variables. Let's fit a deep neural network to our images and see if it will be able to pick up on the patterns in our inputs:

    # Specify that all features have real-value data
    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=784)]
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.1)
    # Build 3 layer DNN with 10, 20, 10 units respectively.
    classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                                hidden_units=[10, 20, 10],
                                                optimizer=optimizer,
                                                n_classes=10)
    # Fit model.
    classifier.fit(x=x_mnist, y=y_mnist, steps=1000)
    # Warning this is veryyyyyyyy slow

This code is very similar to our previous segment using DNNClassifier; however, look how in our first line of code, I have changed the number of columns to be 784, while in the classifier itself, I changed the number of output classes to be 10. These are manual inputs that tensorflow must be given to work.
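To make those two manual inputs more concrete, here is a minimal sketch in plain Python (not TensorFlow) that counts the trainable parameters implied by the layer sizes used above. The helper name count_dense_parameters is hypothetical, and the sketch assumes every layer is fully connected with one weight per input-output pair plus one bias per output node; it is an illustration of the layer arithmetic, not a description of DNNClassifier's internals.

```python
def count_dense_parameters(layer_sizes):
    """Count weights and biases of a fully connected network.

    layer_sizes lists the number of nodes per layer, input layer first.
    Assumption: each layer is dense, with one bias per output node.
    """
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weights + biases for this layer
    return total


# 784 input pixels, hidden_units=[10, 20, 10], 10 output classes
mnist_layers = [784, 10, 20, 10, 10]
# 4 iris features, hidden_units=[10, 20, 10], 3 output classes
iris_layers = [4, 10, 20, 10, 3]

print(count_dense_parameters(mnist_layers))
print(count_dense_parameters(iris_layers))
```

Under these assumptions, most of the MNIST network's parameters sit in the first layer, because each of the 784 input pixels connects to every node of the first hidden layer.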
The preceding code runs very slowly. It is little by little adjusting itself in order to get the best possible performance from our training set. Of course, we know that the ultimate test here is testing our network on an unknown test set, which is also given to us from tensorflow:

    x_mnist_test = mnist.test.images
    y_mnist_test = mnist.test.labels.astype(int)
    x_mnist_test.shape
    (10000, 784)
    y_mnist_test.shape
    (10000,)

So we have 10,000 images to test on; let's see how our network was able to adapt to the dataset:

    # Evaluate accuracy.
    accuracy_score = classifier.evaluate(x=x_mnist_test, y=y_mnist_test)["accuracy"]
    print('Accuracy: {0:f}'.format(accuracy_score))

    Accuracy: 0.920600

Not bad, 92% accuracy on our dataset. Let's take a second and compare this performance to a standard sklearn logistic regression now:

    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import accuracy_score

    logreg = LogisticRegression()
    logreg.fit(x_mnist, y_mnist)
    # Warning this is slow
    # predict on our test set, to avoid overfitting!
    y_predicted = logreg.predict(x_mnist_test)
    # get our accuracy score
    accuracy = accuracy_score(y_mnist_test, y_predicted)

    Accuracy 0.91969

Success! Our neural network performed better than the standard logistic regression, although only slightly (92.06% against 91.97%). This is likely because the network is attempting to find relationships between the pixels themselves and using these relationships to map them to what digit we are writing down. In logistic regression, the model assumes that every single input is independent of one another, and therefore has a tough time finding relationships between them.

There are ways of making our neural network learn differently:

We could make our network wider, that is, increase the number of nodes in the hidden layers instead of having several layers of a smaller number of nodes:

    # A wider network
    feature_columns = [tf.contrib.layers.real_valued_column("", dimension=784)]
    optimizer = tf.train.GradientDescentOptimizer(learning_rate=.1)
    # Build a DNN with a single hidden layer of 1500 units.
    classifier = tf.contrib.learn.DNNClassifier(feature_columns=feature_columns,
                                                hidden_units=[1500],
                                                optimizer=optimizer,
                                                n_classes=10)
    # Fit model.
    classifier.fit(x=x_mnist, y=y_mnist, steps=100)
    # Warning this is veryyyyyyyy slow
    # Evaluate accuracy.
    accuracy_score = classifier.evaluate(x=x_mnist_test, y=y_mnist_test)["accuracy"]
    print('Accuracy: {0:f}'.format(accuracy_score))

    Accuracy: 0.898400

We could increase our learning rate, forcing the network to attempt to converge into an answer faster. As mentioned before, we run the risk of the model skipping the answer entirely if we go down this route. It is usually better to stick with a smaller learning rate.

We can change the method of optimization. Gradient descent is very popular; however, there are other algorithms for doing so. One example is called the Adam Optimizer. The difference is in the way they traverse the error function, and therefore the way that they approach the optimization point. Different problems in different domains call for different optimizers.

There is no replacement for a good old fashioned feature selection phase instead of attempting to let the network figure everything out for us. We can take the time to find relevant and meaningful features that actually will allow our network to find an answer quicker!

There you go! You’ve now learned how to build a neural net in Tensorflow! If you liked this tutorial and would like to learn more, head over and grab a copy of Principles of Data Science. If you want to take things a bit further and learn how to classify Irises using multi-layer perceptrons, head over here.
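The warning about larger learning rates "skipping the answer" can be illustrated with a toy problem. The following is a minimal sketch in plain Python, not the article's TensorFlow code: it runs ordinary gradient descent on f(x) = x**2, whose minimum is at x = 0. The function name gradient_descent, the starting point, and the two learning rates are assumptions chosen only for illustration.

```python
def gradient_descent(learning_rate, steps=20, start=1.0):
    """Minimise f(x) = x**2 with plain gradient descent."""
    x = start
    for _ in range(steps):
        gradient = 2 * x                  # derivative of x**2
        x = x - learning_rate * gradient  # step against the gradient
    return x


small = gradient_descent(learning_rate=0.1)
large = gradient_descent(learning_rate=1.1)

print(abs(small))  # ends close to the minimum at 0
print(abs(large))  # ends far from 0: every step overshoots the minimum
```

With the smaller rate each step shrinks x toward the minimum, while with the larger rate each update jumps past the minimum and lands further away than it started. Real loss surfaces are far less regular than this parabola, so treat this only as intuition for why a smaller learning rate is usually the safer choice.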
Kunal Chaudhari
07 May 2018
12 min read

Distributed TensorFlow: Working with multiple GPUs and servers

Some neural network models are so large that they cannot fit in the memory of a single device (GPU). Such models need to be split over many devices, carrying out the training in parallel on the devices. This means anyone can now scale out distributed training to 100s of GPUs using TensorFlow. But that’s not the only advantage of distributed TensorFlow: you can also massively reduce your experimentation time by running many experiments in parallel on many GPUs and servers. Today, we will discuss distributed TensorFlow and present a number of recipes to work with TensorFlow, GPUs, and multiple servers.

Working with TensorFlow and GPUs

We will learn how to use TensorFlow with GPUs: the operation performed is a simple matrix multiplication either on CPU or on GPU.

Getting ready

The first step is to install a version of TensorFlow that supports GPUs. The official TensorFlow Installation Instruction is your starting point. Remember that you need to have an environment supporting GPUs, with CUDA and cuDNN installed.

How to do it...

We proceed with the recipe as follows:

Start by importing a few modules:

    import sys
    import numpy as np
    import tensorflow as tf
    from datetime import datetime

Get from the command line the type of processing unit that you desire to use (either "gpu" or "cpu") and the size of the square matrix:

    device_name = sys.argv[1]  # Choose device from cmd line. Options: gpu or cpu
    shape = (int(sys.argv[2]), int(sys.argv[2]))
    if device_name == "gpu":
        device_name = "/gpu:0"
    else:
        device_name = "/cpu:0"

Execute the matrix multiplication either on GPU or on CPU. The key instruction is with tf.device(device_name). It creates a new context manager, telling TensorFlow to perform those actions on either the GPU or the CPU:

    with tf.device(device_name):
        random_matrix = tf.random_uniform(shape=shape, minval=0, maxval=1)
        dot_operation = tf.matmul(random_matrix, tf.transpose(random_matrix))
        sum_operation = tf.reduce_sum(dot_operation)

    startTime = datetime.now()
    with tf.Session(config=tf.ConfigProto(log_device_placement=True)) as session:
        result = session.run(sum_operation)
        print(result)

Print some debug timing, just to verify the difference between CPU and GPU:

    print("Shape:", shape, "Device:", device_name)
    print("Time taken:", datetime.now() - startTime)

How it works...

This recipe explains how to assign TensorFlow computations either to CPUs or to GPUs. The code is pretty simple and it will be used as a basis for the next recipe.

Playing with Distributed TensorFlow: multiple GPUs and one CPU

We will show an example of data parallelism where data is split across multiple GPUs.

Getting ready

This recipe is inspired by a good blog posting written by Neil Tenenholtz and available online: https://clindatsci.com/blog/2017/5/31/distributed-tensorflow

How to do it...

We proceed with the recipe as follows:

Consider this piece of code which runs a matrix multiplication on a single GPU:

    # single GPU (baseline)
    import tensorflow as tf

    # place the initial data on the cpu
    with tf.device('/cpu:0'):
        input_data = tf.Variable([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.], [10., 11., 12.]])
        b = tf.Variable([[1.], [1.], [2.]])

    # compute the result on the 0th gpu
    with tf.device('/gpu:0'):
        output = tf.matmul(input_data, b)

    # create a session and run
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(output))

Partition the code with in-graph replication between 2 different GPUs, as in the following snippet. Note that the CPU is acting as the master node distributing the graph and collecting the final results.
    # in-graph replication
    import tensorflow as tf

    num_gpus = 2

    # place the initial data on the cpu
    with tf.device('/cpu:0'):
        input_data = tf.Variable([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.], [10., 11., 12.]])
        b = tf.Variable([[1.], [1.], [2.]])

    # split the data into chunks for each gpu
    inputs = tf.split(input_data, num_gpus)
    outputs = []

    # loop over available gpus and pass input data
    for i in range(num_gpus):
        with tf.device('/gpu:'+str(i)):
            outputs.append(tf.matmul(inputs[i], b))

    # merge the results of the devices
    with tf.device('/cpu:0'):
        output = tf.concat(outputs, axis=0)

    # create a session and run
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        print(sess.run(output))

How it works...

This is a very simple recipe where the graph is split in two parts by the CPU acting as master and distributed to two GPUs acting as distributed workers. The result of the computation is collected back to the CPU.

Playing with Distributed TensorFlow: multiple servers

We will learn how to distribute a TensorFlow computation across multiple servers. The key assumption is that the code is the same for both the workers and the parameter servers. Therefore the role of each computation node is passed by a command line argument.

Getting ready

Again, this recipe is inspired by a good blog posting written by Neil Tenenholtz and available online: https://clindatsci.com/blog/2017/5/31/distributed-tensorflow

How to do it...

We proceed with the recipe as follows:

Consider this piece of code where we specify the cluster architecture with one parameter server running on 192.168.1.1:1111 and two workers running on 192.168.1.2:1111 and 192.168.1.3:1111 respectively:

    import sys
    import tensorflow as tf

    # specify the cluster's architecture
    cluster = tf.train.ClusterSpec({'ps': ['192.168.1.1:1111'],
                                    'worker': ['192.168.1.2:1111',
                                               '192.168.1.3:1111']})

Note that the code is replicated on multiple machines, and therefore it is important to know the role of the current execution node. We get this information from the command line. A machine can be either a worker or a parameter server (ps):

    # parse command-line to specify machine
    job_type = sys.argv[1]       # job type: "worker" or "ps"
    task_idx = int(sys.argv[2])  # index of the job in the worker or ps list
                                 # as defined in the ClusterSpec

Run the training server where, given a cluster, we bless each computational node with a role (either worker or ps) and an id:

    # create TensorFlow Server. This is how the machines communicate.
    server = tf.train.Server(cluster, job_name=job_type, task_index=task_idx)

The computation is different according to the role of the specific computation node:

If the role is a parameter server, then the only thing to do is to join the server. Note that in this case there is no code to execute, because the workers will continuously push updates and the only thing that the parameter server has to do is wait.

Otherwise the worker code is executed on a specific device within the cluster. This part of the code is similar to the one executed on a single machine, where we first build the model and then we train it locally. Note that all the distribution of the work and the collection of the updated results is done transparently by TensorFlow. Note that TensorFlow provides a convenient tf.train.replica_device_setter that automatically assigns operations to devices.

    # parameter server is updated by remote clients.
    # will not proceed beyond this if statement.
    if job_type == 'ps':
        server.join()
    else:
        # workers only
        with tf.device(tf.train.replica_device_setter(
                worker_device='/job:worker/task:%d' % task_idx,
                cluster=cluster)):
            # build your model here as if you only were using a single machine
            pass

        with tf.Session(server.target):
            # train your model here
            pass

How it works...

We have seen how to create a cluster with multiple computation nodes. A node can play the role of either a parameter server or a worker. In both cases the code executed is the same, but the execution of the code is different according to parameters collected from the command line. The parameter server only needs to wait until the workers send updates. Note that tf.train.replica_device_setter(..) takes the role of automatically assigning operations to available devices, while tf.train.ClusterSpec(..) is used for cluster setup.

There is more...

An example of distributed training for MNIST is available online. In addition to that, you can decide to have more than one parameter server for efficiency reasons. Using multiple parameter servers can provide better network utilization, and it allows scaling models to more parallel machines. The interested reader can have a look in here.

Training a Distributed TensorFlow MNIST classifier

In this recipe we train a full MNIST classifier in a distributed way. This recipe is inspired by the blog post in http://ischlag.github.io/2016/06/12/async-distributed-tensorflow/ and the code running on TensorFlow 1.2 is available here: https://github.com/ischlag/distributed-tensorflow-example

Getting ready

This recipe is based on the previous one, so it might be convenient to read them in order.

How to do it...

We proceed with the recipe as follows:

Import a few standard modules and define the TensorFlow cluster where the computation is run.
Then start a server for a specific task:

    import tensorflow as tf
    import sys
    import time

    # cluster specification
    parameter_servers = ["pc-01:2222"]
    workers = ["pc-02:2222",
               "pc-03:2222",
               "pc-04:2222"]
    cluster = tf.train.ClusterSpec({"ps": parameter_servers, "worker": workers})

    # input flags
    tf.app.flags.DEFINE_string("job_name", "", "Either 'ps' or 'worker'")
    tf.app.flags.DEFINE_integer("task_index", 0, "Index of task within the job")
    FLAGS = tf.app.flags.FLAGS

    # start a server for a specific task
    server = tf.train.Server(
        cluster,
        job_name=FLAGS.job_name,
        task_index=FLAGS.task_index)

Read MNIST data and define the hyperparameters used for training:

    # config
    batch_size = 100
    learning_rate = 0.0005
    training_epochs = 20
    logs_path = "/tmp/mnist/1"

    # load mnist data set
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

Check if your role is parameter server or worker. If worker, then define a simple dense neural network, an optimizer, and the metric used for evaluating the classifier (for example, accuracy):

    if FLAGS.job_name == "ps":
        server.join()
    elif FLAGS.job_name == "worker":
        # Between-graph replication
        with tf.device(tf.train.replica_device_setter(
                worker_device="/job:worker/task:%d" % FLAGS.task_index,
                cluster=cluster)):
            # count the number of updates
            global_step = tf.get_variable(
                'global_step', [],
                initializer=tf.constant_initializer(0),
                trainable=False)

            # input images
            with tf.name_scope('input'):
                # None -> batch size can be any size, 784 -> flattened mnist image
                x = tf.placeholder(tf.float32, shape=[None, 784], name="x-input")
                # target 10 output classes
                y_ = tf.placeholder(tf.float32, shape=[None, 10], name="y-input")

            # model parameters will change during training so we use tf.Variable
            tf.set_random_seed(1)
            with tf.name_scope("weights"):
                W1 = tf.Variable(tf.random_normal([784, 100]))
                W2 = tf.Variable(tf.random_normal([100, 10]))

            # bias
            with tf.name_scope("biases"):
                b1 = tf.Variable(tf.zeros([100]))
                b2 = tf.Variable(tf.zeros([10]))

            # implement model
            with tf.name_scope("softmax"):
                # y is our prediction
                z2 = tf.add(tf.matmul(x, W1), b1)
                a2 = tf.nn.sigmoid(z2)
                z3 = tf.add(tf.matmul(a2, W2), b2)
                y = tf.nn.softmax(z3)

            # specify cost function
            with tf.name_scope('cross_entropy'):
                # this is our cost
                cross_entropy = tf.reduce_mean(
                    -tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))

            # specify optimizer
            with tf.name_scope('train'):
                # optimizer is an "operation" which we can execute in a session
                grad_op = tf.train.GradientDescentOptimizer(learning_rate)
                train_op = grad_op.minimize(cross_entropy, global_step=global_step)

            with tf.name_scope('Accuracy'):
                # accuracy
                correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
                accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

            # create a summary for our cost and accuracy
            tf.summary.scalar("cost", cross_entropy)
            tf.summary.scalar("accuracy", accuracy)

            # merge all summaries into a single "operation" which we can execute in a session
            summary_op = tf.summary.merge_all()
            init_op = tf.global_variables_initializer()
            print("Variables initialized ...")

Start a supervisor which acts as a chief machine for the distributed setting. The chief is the worker machine which manages all the rest of the cluster. The session is maintained by the chief and the key instruction is sv = tf.train.Supervisor(is_chief=(FLAGS.task_index == 0)). Also, with prepare_or_wait_for_session(server.target) the supervisor will wait for the model to be ready for use. Note that each worker will take care of different batches and the final model is then available for the chief.

        # (still inside the worker branch)
        sv = tf.train.Supervisor(is_chief=(FLAGS.task_index == 0),
                                 global_step=global_step,
                                 init_op=init_op)
        begin_time = time.time()
        frequency = 100
        with sv.prepare_or_wait_for_session(server.target) as sess:
            # create log writer object (this will log on every machine)
            writer = tf.summary.FileWriter(logs_path, graph=tf.get_default_graph())

            # perform training cycles
            start_time = time.time()
            for epoch in range(training_epochs):
                # number of batches in one epoch
                batch_count = int(mnist.train.num_examples/batch_size)
                count = 0
                for i in range(batch_count):
                    batch_x, batch_y = mnist.train.next_batch(batch_size)
                    # perform the operations we defined earlier on batch
                    _, cost, summary, step = sess.run(
                        [train_op, cross_entropy, summary_op, global_step],
                        feed_dict={x: batch_x, y_: batch_y})
                    writer.add_summary(summary, step)
                    count += 1
                    if count % frequency == 0 or i+1 == batch_count:
                        elapsed_time = time.time() - start_time
                        start_time = time.time()
                        print("Step: %d," % (step+1),
                              " Epoch: %2d," % (epoch+1),
                              " Batch: %3d of %3d," % (i+1, batch_count),
                              " Cost: %.4f," % cost,
                              "AvgTime:%3.2fms" % float(elapsed_time*1000/frequency))
                        count = 0

            print("Test-Accuracy: %2.2f" % sess.run(
                accuracy, feed_dict={x: mnist.test.images, y_: mnist.test.labels}))
            print("Total Time: %3.2fs" % float(time.time() - begin_time))
            print("Final Cost: %.4f" % cost)
        sv.stop()
        print("done")

How it works...

This recipe describes an example of a distributed MNIST classifier. In this example, TensorFlow allows us to define a cluster of four machines. One acts as parameter server and three more machines are used as workers working on separate batches of the training data.

If you enjoyed this excerpt, check out the book TensorFlow 1.x Deep Learning Cookbook, to become an expert in implementing deep learning techniques in real-world applications.

Emoji Scavenger Hunt showcases TensorFlow.js
The 5 biggest announcements from TensorFlow Developer Summit 2018
Setting up Logistic Regression model using TensorFlow
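The data-parallel idea behind the in-graph replication recipe earlier in this article (split the input rows, multiply each chunk separately, then concatenate the partial results) can be checked without any GPU. The following is a minimal sketch in plain Python, not TensorFlow: the helper names matmul and split_rows are hypothetical, the "workers" are simulated by a simple loop, and the input values are the same matrix and vector used in the recipe.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def split_rows(rows, num_chunks):
    """Split rows into equal chunks (assumes len(rows) % num_chunks == 0)."""
    size = len(rows) // num_chunks
    return [rows[i * size:(i + 1) * size] for i in range(num_chunks)]


input_data = [[1., 2., 3.], [4., 5., 6.], [7., 8., 9.], [10., 11., 12.]]
b = [[1.], [1.], [2.]]

num_workers = 2  # stands in for num_gpus in the recipe
chunks = split_rows(input_data, num_workers)

# each simulated worker multiplies only its own chunk of rows
partial_outputs = [matmul(chunk, b) for chunk in chunks]

# the "master" concatenates the partial results along the row axis
output = [row for part in partial_outputs for row in part]

print(output)
print(output == matmul(input_data, b))
```

The second print compares the merged result with the product computed in one piece. The split is safe here because each output row depends on only one input row, which is what makes this kind of matrix multiplication easy to distribute.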

Richard Gall
23 Sep 2019
10 min read

Machine learning ethics: what you need to know and what you can do

Ethics is, without a doubt, one of the most important topics to emerge in machine learning and artificial intelligence over the last year. While the reasons for this are complex, it nevertheless underlines that the area has reached technological maturity. After all, if artificial intelligence systems weren’t having a real, demonstrable impact on wider society, why would anyone be worried about its ethical implications? It’s easy to dismiss the debate around machine learning and artificial intelligence as abstract and irrelevant to engineers’ and developers’ immediate practical concerns. However this is wrong. Ethics needs to be seen as an important practical consideration for anyone using and building machine learning systems. If we fail to do so the consequences could be serious. The last 12 months has been packed with stories of artificial intelligence not only showing theoretical bias, but also causing discriminatory outcomes in the real world. Amazon scrapped its AI tool for hiring last October because it showed significant bias against female job applicants. Even more recently, last month it emerged that algorithms built to detect hate speech online have in-built biases against black people. Although these might seem like edge cases, it’s vital that everyone in the industry takes responsibility. This isn’t something we can leave up to regulation or other organizations the people who can really affect change are the developers and engineers on the ground. It’s true that machine learning and artificial intelligence systems will be operating in ways where ethics isn’t really an issue - and that’s fine. But by focusing on machine learning ethics, and thinking carefully about the impact of your work you will ultimately end up building better systems that are more robust and have better outcomes. So with that in mind, let’s look at the practical ways to start thinking about ethics in machine learning and artificial intelligence. 
Machine learning ethics and bias The first step towards thinking seriously about ethics in machine learning is to think about bias. Once you are aware of how bias can creep into machine learning systems, and how that can have ethical implications, it becomes much easier to identify issues and make changes - or, even better, stop them before they arise. Bias isn’t strictly an ethical issue. It could be a performance issue that’s affecting the effectiveness of your system. But in the conversation around AI and machine learning ethics, it’s the most practical way of starting to think seriously about the issue. Types of machine learning and algorithmic bias Although there are a range of different types of bias, the best place to begin is with two top level concepts. You may have read lists of numerous different biases, but for the purpose of talking about ethics there are two important things to think about. Pre-existing and data set biases Pre-existing biases are embedded in the data on which we choose to train algorithms. While it’s true that just about every data set will be ‘biased’ in some way (data is a representation, after all - there will always be something ‘missing), the point here is that we need to be aware of the extent of the bias and the potential algorithmic consequences. You might have heard terms like ‘sampling bias’, ‘exclusion bias’ and ‘prejudice bias’ - these aren’t radically different. They all result from pre-existing biases about how a data set looks or what it represents. Technical and contextual biases Technical machine learning bias is about how an algorithm is programmed. It refers to the problems that arise when an algorithm is built to operate in a specific way. Essentially, it occurs when the programmed elements of an algorithm fail to properly account for the context in which it is being used. 
A good example is the plagiarism checker Turnitin. It used an algorithm that was trained to identify strings of text, which meant it would flag non-native English speakers more readily than native speakers, who were able to make changes to avoid detection.

Although there are, as I’ve said, many different biases in the field of machine learning, by thinking about the data on which your algorithm is trained and the context in which the system is working, you will be in a much better place to think about the ethical implications of your work. Equally, you will also be building better systems that don’t cause unforeseen issues.

The importance of context in machine learning

The most important thing for anyone working in machine learning and artificial intelligence is context. Put another way, you need to have a clear sense of why you are trying to do something and what the possible implications could be.

If this is unclear, think about it this way: when you use an algorithm, you’re essentially automating away decision making. That’s a good thing when you want to make lots of decisions at a huge scale. But the one thing you lose when turning decision making into a mathematical formula is context. The decisions an algorithm makes lack context because it is programmed to react in a very specific way. This means contextual awareness is your problem. That’s part of the bargain of using an algorithm.

Context in data collection

Let’s look at what thinking about context means when it comes to your data set.

Step 1: what are you trying to achieve?

The first thing you’ll want to consider is what you’re trying to achieve. Do you want to train an algorithm to recognise faces? Do you want it to understand language in some way?

Step 2: why are you doing this?

What’s the point of doing what you’re doing?
Sometimes the answer is straightforward, but be cautious if it comes too easily. Making something work more efficiently or faster isn’t really a satisfactory reason. What’s the point of making something more efficient? This is often where you’ll start to see ethical issues emerge more clearly. Sometimes they’re not easily resolved. You might not even be in a position to resolve them yourself (if you’re employed by a company, after all, you’re quite literally contracted to perform a specific task).

But even if you do feel like there’s little room to maneuver, it’s important to ensure that these discussions actually take place and that you consider the impact of an algorithm. That will make it easier for you to put safeguarding steps in place.

Step 3: understanding the data set

Think about how your data set fits alongside the what and the why. Is there anything missing? How was the data collected? Could it be biased or skewed in some way? In some cases, this might not matter. But if it does, it’s essential that you pay close attention to the data you’re using. It’s worth recording any potential limitations or issues, so if a problem arises at a later stage in your machine learning project, the causes are documented and visible to others.

The context of algorithm implementation

The other aspect of thinking about context is to think carefully about how your machine learning or artificial intelligence system is being implemented. Is it working how you thought it would? Is it showing any signs of bias?

Many articles about the limitations of artificial intelligence and machine learning ethics cite the case of Microsoft’s Tay. Tay was a chatbot that ‘learned’ from its interactions with users on Twitter. Tay was built with considerable naivety, and Twitter users turned it racist in less than a day.
Users ‘spoke’ to Tay using racist language, and because Tay learned through interactions with Twitter users, the chatbot quickly became a reflection of the language and attitudes of those around it. This is a good example of designers failing to consider how the real-world implementation of an algorithm could have negative consequences. Despite, you’d think, the best of intentions, the developers didn’t have the foresight to consider the reality of the world into which they were releasing their algorithmic progeny.

Algorithmic impact assessments

It’s true that ethics isn’t always going to be an urgent issue for engineers. But in certain domains, it’s going to be crucial, particularly in public services and other aspects of government, like justice. Maybe there should be a debate about whether artificial intelligence and machine learning should be used in those contexts at all. But if we can’t have that debate, at the very least we can have tools that help us to think about the ethical implications of the machine learning systems we build.

This is where Algorithmic Impact Assessments come in. The idea was developed by the AI Now Institute and outlined in a paper published last year, and was recently implemented by the Canadian government. There’s no one way to do an algorithmic impact assessment - the Canadian government uses a questionnaire “designed to help you assess and mitigate the risks associated with deploying an automated decision system.” This essentially provides a framework for those using and building algorithms to understand the scope of their project and to identify any potential issues or problems that could arise.
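An assessment questionnaire is qualitative, but some of the questions it raises, such as whether a system’s outcomes differ between groups, can be checked with a few lines of code. The sketch below is not taken from the AI Now proposal or the Canadian questionnaire. It is a hypothetical illustration that compares selection rates between two groups, and the function names (`selection_rates`, `disparate_impact_ratio`), the group labels and the data are all made up for the example.

```python
from collections import defaultdict


def selection_rates(records):
    """Return the share of positive decisions for each group.

    `records` is an iterable of (group, selected) pairs, where
    `selected` is True when the system made a positive decision.
    """
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, selected in records:
        totals[group] += 1
        if selected:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Ratio of the lowest selection rate to the highest (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was selected in any group, so there is no gap to measure
    return min(rates.values()) / highest


# Made-up decisions from a hypothetical screening model.
decisions = (
    [("group_a", True)] * 30 + [("group_a", False)] * 70
    + [("group_b", True)] * 15 + [("group_b", False)] * 85
)

rates = selection_rates(decisions)
ratio = disparate_impact_ratio(rates)
print(rates)
print(f"lowest / highest selection rate: {ratio:.2f}")
```

A ratio well below 1.0 doesn’t prove that a system is unfair, but it is exactly the kind of signal worth recording and investigating as part of an assessment.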
Tools for assessing bias and supporting ethical engineering

Algorithmic impact assessments can provide you with a solid conceptual grounding for thinking about the ethical implications of artificial intelligence and machine learning systems, but there are also a number of tools that can help you better understand the ways in which algorithms could be perpetuating biases or prejudices.

One of these is FairML, “an end-to-end toolbox for auditing predictive models by quantifying the relative significance of the model's inputs”, which helps engineers to identify the extent to which algorithmic inputs could cause harm or bias. Another is LIME (Local Interpretable Model-Agnostic Explanations). LIME is not dissimilar to FairML. It aims to understand why an algorithm makes the decisions it does by ‘perturbing’ inputs and seeing how this affects its outputs.

There’s also Deon, which is a lot like a more lightweight, developer-friendly version of an algorithmic impact assessment. It’s a command line tool that allows you to add an ethics checklist to your projects.

All these tools underline some of the most important elements in the fight for machine learning ethics. FairML and LIME are both attempting to make interpretability easier, while Deon is making it possible for engineers to bring a holistic and critical approach directly into their day-to-day work. Deon aims to promote transparency and improve communication between engineers and others.

The future of artificial intelligence and machine learning depends on developers taking responsibility

Machine learning and artificial intelligence are hitting maturity. They’re technologies that are now, after decades incubated in computer science departments and military intelligence organizations, transforming and having an impact in a truly impressive range of domains. With this maturity comes more responsibility.
Ethical questions arise as machine learning effects change everywhere, spilling out into everything from marketing to justice systems. If we can’t get machine learning ethics right, then we’ll never properly leverage the benefits of artificial intelligence and machine learning. People won’t trust these systems, and legislation will start to severely curb what they can do. It’s only by taking responsibility for their effects and consequences that we can be sure they will not only have a transformative impact on the world, but also one that’s safe and for the benefit of everyone.