Convolutional Neural Networks with Reinforcement Learning

Packt
06 Apr 2017
9 min read
In this article by Antonio Gulli and Sujit Pal, the authors of the book Deep Learning with Keras, we will learn about reinforcement learning, or more specifically deep reinforcement learning, that is, the application of deep neural networks to reinforcement learning. We will also see how convolutional neural networks leverage spatial information, which makes them very well suited for classifying images.

Deep convolutional neural networks

A Deep Convolutional Neural Network (DCNN) consists of many neural network layers. Two different types of layers, convolutional and pooling, are typically alternated. The depth of each filter increases from left to right in the network, and the last stage is typically made of one or more fully connected layers, as shown here:

There are three key intuitions behind ConvNets:

Local receptive fields
Shared weights
Pooling

Let's review them together.

Local receptive fields

If we want to preserve spatial information, it is convenient to represent each image as a matrix of pixels. A simple way to encode the local structure is then to connect a submatrix of adjacent input neurons to one single hidden neuron in the next layer. That single hidden neuron represents one local receptive field. Note that this operation is called convolution, and it is what gives this type of network its name. Of course, we can encode more information by using overlapping submatrices. For instance, suppose that each submatrix is 5 x 5 and that the submatrices are applied to MNIST images of 28 x 28 pixels: we can then generate 24 x 24 local receptive field neurons in the next hidden layer, because a 5 x 5 submatrix can be slid by only 23 positions before touching the border of the image, which gives 24 valid positions along each axis. In Keras, the number of pixels by which the submatrix is shifted at each step is called the stride, and it is a hyperparameter that can be fine-tuned during the construction of our nets.

Let's call the output produced by applying one such filter from one layer to the next a feature map. Of course, we can have multiple feature maps, each of which learns independently from the same hidden layer. For instance, we can start with 28 x 28 input neurons to process MNIST images and then have k feature maps of size 24 x 24 neurons each (again with 5 x 5 receptive fields) in the next hidden layer.

Shared weights and bias

Suppose that we want to move away from the raw pixel representation and gain the ability to detect the same feature independently of the location where it appears in the input image. A simple intuition is to use the same set of weights and bias for all the neurons in a hidden layer. In this way, each layer learns a set of position-independent latent features derived from the image.

Assuming that the input image has shape (256, 256) with 3 channels and tf (TensorFlow) ordering, it is represented as (256, 256, 3). Note that with th (Theano) ordering the channels dimension (the depth) is at index 1, while in tf ordering it is at index 3. In Keras, if we want to add a convolutional layer with an output dimensionality of 32 and a 3 x 3 extension for each filter, we write:

model = Sequential()
model.add(Convolution2D(32, 3, 3, input_shape=(256, 256, 3)))

This means that we are applying a 3 x 3 convolution on a 256 x 256 image with 3 input channels (or input filters), resulting in 32 output channels (or output filters).
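As a quick sanity check on the arithmetic above (this helper is not part of the original article, just an illustration), the spatial size of a convolution output without padding is (input size - kernel size) / stride + 1. Applied to the two examples discussed, the 5 x 5 receptive fields over a 28 x 28 MNIST image and the 3 x 3 filters over a 256 x 256 image:

def conv_output_size(input_size, kernel_size, stride=1):
    # "valid" convolution: the window never crosses the image border
    return (input_size - kernel_size) // stride + 1

# 5 x 5 receptive fields over a 28 x 28 MNIST image, stride 1
print(conv_output_size(28, 5))   # 24 -> a 24 x 24 feature map

# 3 x 3 filters over a 256 x 256 image, stride 1
print(conv_output_size(256, 3))  # 254 -> 254 x 254 spatial output, 32 channels deep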
An example of convolution is provided in the following diagram:

Pooling layers

Suppose that we want to summarize the output of a feature map. Again, we can use the spatial contiguity of the output produced by a single feature map and aggregate the values of a submatrix into one single output value that synthetically describes the meaning associated with that physical region.

Max pooling

One easy and common choice is the so-called max pooling operator, which simply outputs the maximum activation observed in the region. In Keras, if we want to define a max pooling layer of size 2 x 2, we write:

model.add(MaxPooling2D(pool_size=(2, 2)))

An example of the max pooling operation is given in the following diagram:

Average pooling

Another choice is average pooling, which simply aggregates a region into the average of the activations observed in that region. Keras implements a large number of pooling layers, and a complete list is available online. In short, all the pooling operations are nothing more than a summary operation on a given region.

Reinforcement learning

Our objective is to build a neural network to play the game of catch. Each game starts with a ball being dropped from a random position at the top of the screen. The objective is to move a paddle at the bottom of the screen, using the left and right arrow keys, to catch the ball by the time it reaches the bottom. As games go, this is quite simple. At any point in time, the state of this game is given by the (x, y) coordinates of the ball and the paddle. Most arcade games tend to have many more moving parts, so a general solution is to provide the entire current game screen image as the state. The following diagram shows four consecutive screenshots of our catch game:

Astute readers might note that our problem could be modeled as a classification problem, where the input to the network is the game screen image and the output is one of three actions: move left, stay, or move right. However, this would require us to provide the network with training examples, possibly from recordings of games played by experts. An alternative and simpler approach is to build a network and have it play the game repeatedly, giving it feedback based on whether or not it succeeds in catching the ball. This approach is also more intuitive and is closer to the way humans and animals learn.

The most common way to represent such a problem is through a Markov Decision Process (MDP). Our game is the environment within which the agent is trying to learn. The state of the environment at time step t is given by st (and contains the location of the ball and the paddle). The agent can perform certain actions at (such as moving the paddle left or right). These actions can sometimes result in a reward rt, which can be positive or negative (such as an increase or decrease in the score). Actions change the environment and can lead to a new state st+1, where the agent can perform another action at+1, and so on. The set of states, actions, and rewards, together with the rules for transitioning from one state to the next, make up a Markov decision process. A single game is one episode of this process, and is represented by a finite sequence of states, actions, and rewards:

s0, a0, r1, s1, a1, r2, ..., sn-1, an-1, rn, sn

Since this is a Markov decision process, the probability of state st+1 depends only on the current state st and action at.

Maximizing future rewards

As an agent, our objective is to maximize the total reward from each game.
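To make the episode structure described above concrete, here is a small, self-contained Python sketch of one episode of such an MDP loop. The CatchEnv class below is a deliberately simplified stand-in for the catch game (single-pixel paddle, random policy) and is not the article's actual game implementation; it only illustrates the sequence of states, actions, and rewards.

import random

class CatchEnv:
    """Toy stand-in for the catch game: the ball falls one row per step."""
    def __init__(self, width=10, height=10):
        self.width, self.height = width, height

    def reset(self):
        self.ball_x = random.randrange(self.width)
        self.ball_y = 0
        self.paddle_x = self.width // 2
        return (self.ball_x, self.ball_y, self.paddle_x)   # initial state s0

    def step(self, action):
        # action: -1 move left, 0 stay, +1 move right
        self.paddle_x = min(max(self.paddle_x + action, 0), self.width - 1)
        self.ball_y += 1
        done = self.ball_y == self.height - 1
        reward = 0
        if done:  # positive reward for a catch, negative for a miss
            reward = 1 if self.ball_x == self.paddle_x else -1
        return (self.ball_x, self.ball_y, self.paddle_x), reward, done

env = CatchEnv()
state = env.reset()
episode, done = [], False
while not done:                          # one episode: s0, a0, r1, s1, a1, r2, ...
    action = random.choice([-1, 0, 1])   # random policy, just for illustration
    next_state, reward, done = env.step(action)
    episode.append((state, action, reward))
    state = next_state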
The total reward for one game can be represented as follows:

R = r1 + r2 + ... + rn

In order to maximize the total reward, the agent should try to maximize the total reward from any time point t in the game. The total reward at time step t is given by Rt and is represented as:

Rt = rt + rt+1 + rt+2 + ... + rn

However, the further we go into the future, the harder it is to predict the value of the rewards. To take this into consideration, our agent should instead try to maximize the total discounted future reward at time t. This is done by discounting the reward at each future time step by a factor γ over the previous time step:

Rt = rt + γrt+1 + γ²rt+2 + ... + γ^(n-t)rn

If γ is 0, our network does not consider future rewards at all, and if γ is 1, future rewards are weighted just as heavily as immediate ones. A good value for γ is around 0.9. Factoring the equation allows us to express the total discounted future reward at a given time step recursively, as the sum of the current reward and the total discounted future reward at the next time step:

Rt = rt + γRt+1

Q-learning

Deep reinforcement learning utilizes a model-free reinforcement learning technique called Q-learning. Q-learning can be used to find an optimal action for any given state in a finite Markov decision process. Q-learning tries to maximize the value of the Q-function, which represents the maximum discounted future reward when we perform action a in state s:

Q(st, at) = max Rt+1

Once we know the Q-function, the optimal action a at a state s is the one with the highest Q-value. We can then define a policy π(s) that gives us the optimal action at any state:

π(s) = argmaxa Q(s, a)

We can define the Q-function for a transition point (st, at, rt, st+1) in terms of the Q-function at the next point (st+1, at+1, rt+1, st+2), similar to how we did with the total discounted future reward:

Q(st, at) = rt + γ maxat+1 Q(st+1, at+1)

This equation is known as the Bellman equation, and the Q-function can be approximated using it. You can think of the Q-function as a lookup table (called a Q-table) in which the states (denoted by s) are rows, the actions (denoted by a) are columns, and the elements (denoted by Q(s, a)) are the rewards that you get if you are in the state given by the row and take the action given by the column. The best action to take in any state is the one with the highest reward. We start by randomly initializing the Q-table, then carry out random actions and observe the rewards to update the Q-table iteratively, according to the following algorithm:

initialize Q-table Q
observe initial state s
repeat
  select and carry out action a
  observe reward r and move to new state s'
  Q(s, a) = Q(s, a) + α(r + γ maxa' Q(s', a') - Q(s, a))
  s = s'
until game over

You will realize that the algorithm is basically doing stochastic gradient descent on the Bellman equation, backpropagating the reward through the state space (or episode) and averaging over many trials (or epochs). Here α is the learning rate that determines how much of the difference between the previous Q-value and the discounted new maximum Q-value should be incorporated.
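As an illustration of the update rule above (this is not code from the article), the following self-contained Python sketch implements tabular Q-learning with a dictionary-based Q-table. It adds an epsilon-greedy exploration strategy on top of the purely random actions described in the text, and it assumes an environment object with reset() and step(action) methods such as the toy CatchEnv sketched earlier.

import random
from collections import defaultdict

def q_learning(env, actions=(-1, 0, 1), episodes=1000, alpha=0.1, gamma=0.9, epsilon=0.1):
    Q = defaultdict(float)                    # Q-table: Q[(state, action)], initialized to 0
    for _ in range(episodes):
        state = env.reset()                   # observe initial state s
        done = False
        while not done:                       # repeat until game over
            if random.random() < epsilon:     # explore with a random action...
                action = random.choice(actions)
            else:                             # ...otherwise pick the action with the highest Q-value
                action = max(actions, key=lambda a: Q[(state, a)])
            next_state, reward, done = env.step(action)   # observe reward r and new state s'
            best_next = max(Q[(next_state, a)] for a in actions)
            # Q(s, a) = Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
            Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])
            state = next_state                # s = s'
    return Q

# Q = q_learning(CatchEnv())   # for example, using the toy environment from the earlier sketch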
Summary

We have seen deep convolutional neural networks and why they are well suited for classifying images, and we have looked at how deep neural networks can be applied to reinforcement learning.

Synchronization – An Approach to Delivering Successful Machine Learning Projects

Packt
06 Apr 2017
15 min read
"In the midst of chaos, there is also opportunity."
- Sun Tzu

In this article, Cory Lesmeister, the author of the book Mastering Machine Learning with R - Second Edition, provides insights on ensuring the success and value of your machine learning endeavors.

Framing the problem

Raise your hand if any of the following has happened or is currently happening to you:

You've been part of a project team that failed to deliver anything of business value
You attend numerous meetings, but they don't seem productive; maybe they are even complete time wasters
Different teams are not sharing information with each other; thus, you are struggling to understand what everyone else is doing, and they have no idea what you are doing or why you are doing it
An unknown stakeholder, feeling threatened by your project, comes out of nowhere and disparages you and/or your work
The Executive Committee congratulates your team on its great effort but decides not to implement the work, or even worse, tells you to go back and do it all over again, only this time solve the real problem

OK, you can put your hand down now. If you didn't raise your hand, please send me your contact information, because you are about as rare as a unicorn. All organizations, regardless of their size, struggle with integrating different functions, current operations, and other projects. In short, the real world is filled with chaos. It doesn't matter how many advanced degrees people have, how experienced they are, how much money is thrown at the problem, what technology is used, or how brilliant and powerful the machine learning algorithm is: problems such as those listed above will happen. The bottom line is that implementing machine learning projects in the business world is complicated and prone to failure.

However, out of this chaos you have the opportunity to influence your organization by integrating disparate people and teams and fostering a collaborative environment that can adapt to unforeseen changes. But be warned: this is not easy. If it were easy, everyone would be doing it. However, it works, and it works well. By "it", I'm talking about the methodology I developed about a dozen years ago, a method I refer to as the "Synchronization Process".

If we ask ourselves what the challenges to implementation are, the following blog post sums it up clearly and succinctly: https://www.capgemini.com/blog/capping-it-off/2012/04/four-key-challenges-for-business-analytics

It enumerates four challenges:

Strategic alignment
Agility
Commitment
Information maturity

The post addresses business analytics, but it can be extended to machine learning projects; one could even say machine learning is becoming the analytics tool of choice in many organizations. As such, I will make the case below that the Synchronization Process can effectively deal with the first three challenges. Not only that, the process provides additional benefits. By overcoming the challenges, you can deliver an effective project; by delivering an effective project, you can increase actionable insights; and by increasing actionable insights, you will improve decision-making, and that is where the real business value resides.
Defining the process

"In preparing for battle, I have always found that plans are useless, but planning is indispensable."
- Dwight D. Eisenhower

I adopted the term synchronization from the US Army's operations manual, FM 3-0, where it is described as a battlefield tenet and force multiplier. The manual defines synchronization as "…arranging activities in time, space and purpose to mass maximum relative combat power at a decisive place and time". If we overlay this military definition onto the context of a competitive marketplace, we come up with a definition I find more relevant. For our purpose, synchronization is defined as "arranging business functions and/or tasks in time and purpose to produce the proper amount of focus on a critical event or events".

These definitions put synchronization in the context of an endstate based on a plan and a vision. However, it is in the process of seeking to achieve that endstate that the true benefits come to fruition. So we can look at synchronization not only as an endstate, but also as a process. The military's solution to synchronizing operations before implementing a plan is the wargame. Like the military, businesses and corporations have utilized wargaming to facilitate decision-making and create integration of different business functions. Following the synchronization process techniques explained below, you can take the concept of business wargaming to a new level. I will discuss and provide specific ideas, steps, and deliverables that you can implement immediately. Before we begin that discussion, I want to cover the benefits that the process will deliver.

Exploring the benefits of the process

When I created this methodology about a dozen years ago, I was part of a market research team struggling to commit our limited resources to numerous projects, all of which were someone's top priority, in a highly uncertain environment. Or, as I like to refer to it, just another day at the office. I knew from my military experience that I had the tools and techniques to successfully tackle these challenges. It worked then and has been working for me ever since. I have found that it delivers the following benefits to an organization:

Integration of business partners and stakeholders
Timely and accurate measurement of performance and effectiveness
Anticipation of and planning for possible events
Adaptation to unforeseen threats
Exploitation of unforeseen opportunities
Improvement in teamwork
Fostering of a collaborative environment
Improved focus and prioritization

In market research, and I believe this applies to all analytical endeavors, including machine learning, we focused on three specific questions about what to measure:

What are we measuring?
When do we measure it?
How will we measure it?

We found that successfully answering those questions facilitated improved decision-making by informing leadership what to STOP doing, what to START doing, and what to KEEP doing. I have found myself in many meetings going nowhere until I would ask a question like, "What are you looking to stop doing?" Ask leadership what they want to stop, start, or continue to do, and you will get to the core of the problem. Then your job will be to frame the business decision as the measurement/analytical problem. The Synchronization Process can bring this all together in a coherent fashion.
I'm often asked what triggers, in my mind, that a project requires going through the Synchronization Process. Here are some of the questions you should consider; if you answer "yes" to any of them, it may be a good idea to implement the process:

Are resources constrained to the point that several projects will suffer poor results or not be done at all?
Do you face multiple, conflicting priorities?
Could the external environment change and dramatically influence the project(s)?
Are numerous stakeholders involved in or influenced by a project's result?
Is the project complex and facing a high level of uncertainty?
Does the project involve new technology?
Does the project face actual or potential organizational change?

You may be thinking, "Hey, we have a project manager for all this." OK, how is that working out? Let me be crystal clear here: this is not just project management! This is about improving decision-making! A Gantt chart or task management software won't do that. You must be the agent of change. With that, let's turn our attention to the process itself.

Exploring the process

Any team can take the methods elaborated below and incorporate them into their specific situation with their specific business partners. If executed properly, one can expect the initial investment in time and effort to provide a substantial payoff within weeks of initiating the process. There are just four steps to incorporate, each with several tasks for you and your team members to complete. The four steps are as follows:

Project kick-off
Project analysis
Synchronization exercise
Project execution

Let's cover each of these in detail. I will provide what I like to refer to as a "Quad Chart" for each process step, along with appropriate commentary.

Project kick-off

I recommend you lead the kick-off meeting to ensure all team members understand and agree to the upcoming process steps. You should place emphasis on the importance of completing the pre-work and on understanding key definitions, particularly around facts and critical assumptions. The operational definitions are as follows:

Facts: Data or information that will likely have an impact on the project
Critical assumptions: Valid and necessary suppositions in the absence of facts that, if proven false, would adversely impact planning or execution

It is an excellent practice to link facts and assumptions. Here is an example of how that would work: It is a FACT that the Information Technology department is beta-testing cloud-based solutions. We must ASSUME, for planning purposes, that we can operate machine learning solutions on the cloud by the fourth quarter of this year.

See, we've linked a fact and an assumption together, and if this cloud-based solution turns out not to be available, let's say it would negatively impact our ability to scale up our machine learning solutions. If so, then you may want to have a contingency plan of some sort already thought through and prepared for implementation. Don't worry if you haven't thought of all possible assumptions or if you end up with a list of dozens; the synchronization exercise will help in identifying and prioritizing them. In my experience, identifying and tracking 10 critical assumptions at the project level is adequate. The following is the quad chart for this process step:

Figure 1: Project kick-off quad chart

Notice what is likely a new term, "Synchronization Matrix". That is merely the tool used by the team to capture notes during the Synchronization Exercise.
What you are doing is capturing time and events on the X-axis, and functions and teams on the Y-axis. Of course, this is highly customizable based on the specific circumstances, and we will discuss it more in process step number 3, the synchronization exercise, but here is an abbreviated example:

Figure 2: Synchronization matrix example

You can see in the matrix that I've included a row to capture critical assumptions. I can't overstate how important it is to articulate, capture, and track them. In fact, this is probably my favorite quote on the subject:

"…flawed assumptions are the most common cause of flawed execution."
- Harvard Business Review, The High Performance Organization, July-August 2005

OK, I think I've made my point, so let's look at the next process step.

Project analysis

At this step, the participants prepare by analyzing the situation, collecting data, and making judgements as necessary. The goal is for each participant in the Synchronization Exercise to come to that meeting fully prepared. A good technique is to provide project participants with a worksheet template to use for completing the pre-work. A team can complete this step individually, collectively, or both. Here is the quad chart for the process step:

Figure 3: Project analysis quad chart

Let me expand on a couple of points. The idea of a team member creating information requirements is quite important. These are often tied back to your critical assumptions. Take the example above of the assumption around fielding a cloud-based capability. Can you think of some information requirements that you might have as a potential end user? Furthermore, can you prioritize them? Having done that, can you think of a plan to acquire that information and confirm or deny the underlying critical assumption? Notice also how that ties together with decision points you or others may have to make, and how those decisions may trigger contingency plans. This may sound rather basic and simplistic, but unless people are asked to think like this, articulate their requirements, and share the information, don't expect anything to change anytime soon. It will be business as usual, and let me ask again, "How is that working out for you?" There is opportunity in all that chaos, so embrace it, and in the next step you will see the magic happen.

Synchronization exercise

The focus and discipline of the participants determine the success of this process step. This is a wargame-type exercise in which team members portray their plan over time. Now everyone gets to see how their plan relates to, or even inhibits, someone else's plan, and vice versa. I've done this step several different ways, including building the matrix in software, but the method that has consistently produced the best results is to build the matrix on large paper and put it along a conference room wall. Then have the participants, one at a time, use post-it notes to portray their key events. For example, the marketing manager gets up to the wall and posts "Marketing Campaign One" in the first time phase and "Marketing Campaign Two" in the final time phase, along with "Propensity Models" in the information requirements block. Iterating by participant and by time/event leads to coordination and cooperation like nothing you've ever seen. Another method to facilitate the success of the meeting is to have a disinterested and objective third party referee the meeting. This will help to ensure that any issues are captured or resolved and the process products are updated accordingly.
After the exercise, team members can incorporate the findings into their individual plans. This is an example quad chart for the process step:

Figure 4: Synchronization exercise quad chart

I really like the idea of execution and performance metrics. Here is how to think about them:

Execution metrics: are we doing things right?
Performance metrics: are we doing the right things?

As you can see, execution is about plan implementation, while performance metrics are about determining whether the plan is making a difference (yes, I know that can be quite a dangerous thing to measure). Finally, we come to the fourth step, where everything comes together during the execution of the project plan.

Project execution

This is a continual step in the process, during which a team can utilize the synchronization products to maintain situational understanding of itself, key stakeholders, and the competitive environment. The team can determine how plans are progressing and quickly react to opportunities and threats as necessary. I recommend you update and communicate changes to the documentation on a regular basis. When I was in pharmaceutical forecasting, it was imperative that I end the business week by updating the matrices on SharePoint, where they were available to all pertinent team members. The following is the quad chart for this process step:

Figure 5: Project execution quad chart

Keeping up with the documentation is a quick and simple process for the most part, and by doing so you will keep people aligned and cooperating. Be aware that, like everything else that is new in the world, initial exuberance and enthusiasm will start to wane after several weeks. That is fine as long as you keep the documentation alive and maintain systematic communication. You will soon find that behavior is changing without anyone even taking heed, which is probably the best way to actually change behavior.

A couple of words of warning. Don't expect everyone to embrace the process wholeheartedly, which is to say that office politics may create a few obstacles. Often an individual, or even an entire business function, will withhold information because "information is power", and by sharing information they may feel they are losing power. Another issue may arise where some people feel the process is needlessly complex or unnecessary. A solution to these problems is to scale back the number of core team members and use stakeholder analysis and a communication plan to bring the naysayers slowly into the fold. Change is never easy, but it is necessary nonetheless.

Summary

In this article, I've covered, at a high level, a successful and proven process for delivering machine learning projects that will drive business value. I developed it from my numerous years of planning and evaluating military operations, including a one-year stint as a strategic advisor to the Iraqi Oil Police, and adapted it to the needs of any organization. Utilizing the Synchronization Process will help any team avoid the common pitfalls of projects and improve efficiency and decision-making. It will help you become an agent of change and create influence in an organization without positional power.

Integrating with Messages App

Packt
06 Apr 2017
17 min read
In this article by Hossam Ghareeb, the author of the book iOS Programming Cookbook, we will cover the recipe Integrating your app with the Messages app.

Integrating your app with the Messages app

iMessage apps let users use your app seamlessly from iMessage without having to leave the conversation. Your app can share content in the conversation, make a payment, or do any specific job that is important or appropriate to do within the Messages app.

Getting ready

Similar to the Stickers app we created earlier, you need Xcode 8.0 or a later version to create an iMessage app extension, and you can test it easily in the iOS simulator. The app that we are going to build is a Google Drive picker: it will be used from an iMessage extension to send a file to your friends straight from Google Drive. Before starting, ensure that you follow the instructions in the Google Drive API for iOS quickstart at https://developers.google.com/drive/ios/quickstart to get a client key to be used in our app. Installing the SDK in Xcode will be done via CocoaPods. To get more information about CocoaPods and how to use it to manage dependencies, visit https://cocoapods.org/.

How to do it…

Open Xcode, create a new iMessage app as shown, and name it Files Picker:

Now, let's install the Google Drive SDK using CocoaPods. Open a terminal and navigate to the directory that contains your Xcode project by running this command:

cd path_to_directory

Run the following command to create a Podfile in which to declare your dependencies:

pod init

It will create a Podfile for you. Open it via TextEdit and edit it to look like this:

use_frameworks!

target 'PDFPicker' do
end

target 'MessagesExtension' do
  pod 'GoogleAPIClient/Drive', '~> 1.0.2'
  pod 'GTMOAuth2', '~> 1.1.0'
end

Then close the Xcode app completely and run the pod install command to install the SDK for you. A new workspace will be created; open it instead of the Xcode project itself.

Have the client key ready from the Google Drive app you created, as mentioned in the Getting ready section, because we are going to use it in the Xcode project.

Open MessagesViewController.swift and add the following import statements:

import GoogleAPIClient
import GTMOAuth2

Add the following private variables just below the class declaration and embed your client key in the kClientID constant, as shown:

private let kKeychainItemName = "Drive API"
private let kClientID = "Client_Key_Goes_HERE"
private let scopes = [kGTLAuthScopeDrive]
private let service = GTLServiceDrive()

Add the following code to your class to request authentication to Google Drive if it's not authenticated and to load file info:

override func viewDidLoad() {
    super.viewDidLoad()
    // Do any additional setup after loading the view.
    if let auth = GTMOAuth2ViewControllerTouch.authForGoogleFromKeychain(
        forName: kKeychainItemName,
        clientID: kClientID,
        clientSecret: nil) {
        service.authorizer = auth
    }
}

// When the view appears, ensure that the Drive API service is authorized
// and perform API calls
override func viewDidAppear(_ animated: Bool) {
    if let authorizer = service.authorizer, authorizer.canAuthorize {
        fetchFiles()
    } else {
        present(createAuthController(), animated: true, completion: nil)
    }
}

// Construct a query to get names and IDs of 10 files using the Google Drive API
func fetchFiles() {
    print("Getting files...")
    if let query = GTLQueryDrive.queryForFilesList() {
        query.fields = "nextPageToken, files(id, name, webViewLink, webContentLink, fileExtension)"
        service.executeQuery(query,
                             delegate: self,
                             didFinish: #selector(MessagesViewController.displayResultWithTicket(ticket:finishedWithObject:error:)))
    }
}

// Parse results and display
func displayResultWithTicket(ticket: GTLServiceTicket, finishedWithObject response: GTLDriveFileList, error: NSError?) {
    if let error = error {
        showAlert(title: "Error", message: error.localizedDescription)
        return
    }
    var filesString = ""
    let files = response.files as! [GTLDriveFile]
    if !files.isEmpty {
        filesString += "Files:\n"
        for file in files {
            filesString += "\(file.name) (\(file.identifier) \(file.webViewLink) \(file.webContentLink))\n"
        }
    } else {
        filesString = "No files found."
    }
    print(filesString)
}

// Creates the auth controller for authorizing access to Drive API
private func createAuthController() -> GTMOAuth2ViewControllerTouch {
    let scopeString = scopes.joined(separator: " ")
    return GTMOAuth2ViewControllerTouch(
        scope: scopeString,
        clientID: kClientID,
        clientSecret: nil,
        keychainItemName: kKeychainItemName,
        delegate: self,
        finishedSelector: #selector(MessagesViewController.viewController(vc:finishedWithAuth:error:))
    )
}

// Handle completion of the authorization process, and update the Drive API
// with the new credentials.
func viewController(vc: UIViewController, finishedWithAuth authResult: GTMOAuth2Authentication, error: NSError?) {
    if let error = error {
        service.authorizer = nil
        showAlert(title: "Authentication Error", message: error.localizedDescription)
        return
    }
    service.authorizer = authResult
    dismiss(animated: true, completion: nil)
    fetchFiles()
}

// Helper for showing an alert
func showAlert(title: String, message: String) {
    let alert = UIAlertController(
        title: title,
        message: message,
        preferredStyle: UIAlertControllerStyle.alert
    )
    let ok = UIAlertAction(
        title: "OK",
        style: UIAlertActionStyle.default,
        handler: nil
    )
    alert.addAction(ok)
    self.present(alert, animated: true, completion: nil)
}

The code now requests authentication, loads files, and then prints them in the debug area. Now try to build and run; you will see the following:

Click on the arrow button in the bottom-right corner to maximize the screen and try to log in with any Google account you have. Once the authentication is done, you will see the files' information printed in the debug area.

Now, let's add a table view that will display the files' information; once a user selects a file, we will download it to send it as an attachment to the conversation. Open MainInterface.storyboard, drag a table view from the Object Library, and add the following constraints:

Set the delegate and data source of the table view from Interface Builder by dragging while holding down the Ctrl key to the MessagesViewController.
Then add an outlet to the table view, as follows, to be used to refresh the table with the files.

Drag a UITableViewCell from the Object Library and drop it in the table view. In the Attributes Inspector, set the cell style to Basic and the identifier to cell.

Now return to MessagesViewController.swift. Add the following property to hold the currently displayed files:

private var currentFiles = [GTLDriveFile]()

Edit the displayResultWithTicket function to look like this:

// Parse results and display
func displayResultWithTicket(ticket: GTLServiceTicket, finishedWithObject response: GTLDriveFileList, error: NSError?) {
    if let error = error {
        showAlert(title: "Error", message: error.localizedDescription)
        return
    }
    var filesString = ""
    let files = response.files as! [GTLDriveFile]
    self.currentFiles = files
    if !files.isEmpty {
        filesString += "Files:\n"
        for file in files {
            filesString += "\(file.name) (\(file.identifier) \(file.webViewLink) \(file.webContentLink))\n"
        }
    } else {
        filesString = "No files found."
    }
    print(filesString)
    self.filesTableView.reloadData()
}

Now add the following methods for the table view delegate and data source:

// MARK: - Table View methods -
func tableView(_ tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
    return self.currentFiles.count
}

func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
    let cell = tableView.dequeueReusableCell(withIdentifier: "cell")
    let file = self.currentFiles[indexPath.row]
    cell?.textLabel?.text = file.name
    return cell!
}

func tableView(_ tableView: UITableView, didSelectRowAt indexPath: IndexPath) {
    let file = self.currentFiles[indexPath.row]
    // Download the file here to send it as an attachment.
    if let downloadURLString = file.webContentLink {
        let url = NSURL(string: downloadURLString)
        if let name = file.name {
            let downloadedPath = (documentsPath() as NSString).appendingPathComponent("\(name)")
            let fetcher = service.fetcherService.fetcher(with: url as! URL)
            let destinationURL = NSURL(fileURLWithPath: downloadedPath) as URL
            fetcher.destinationFileURL = destinationURL
            fetcher.beginFetch(completionHandler: { (data, error) in
                if error == nil {
                    self.activeConversation?.insertAttachment(destinationURL, withAlternateFilename: name, completionHandler: nil)
                }
            })
        }
    }
}

private func documentsPath() -> String {
    let paths = NSSearchPathForDirectoriesInDomains(.documentDirectory, .userDomainMask, true)
    return paths.first ?? ""
}

Now build and run the app, and you will see the magic: select any file and the app will download it, save it to the local disk, and send it as an attachment to the conversation, as illustrated:

How it works…

We started by installing the Google Drive SDK into the Xcode project. This SDK has all the APIs that we need to manage Drive files and user authentication. When you visit the Google developers' website, you will see two options for installing the SDK: manually or using CocoaPods. I totally recommend using CocoaPods to manage your dependencies, as it is simple and efficient. Once the SDK had been installed via CocoaPods, we added some variables to be used with the Google Drive API, the most important one being the client key. You can access this value from the project you created in the Google Developers Console. In the viewDidLoad function, we first check whether we have authentication saved in the Keychain and, if so, use it. We do that by calling GTMOAuth2ViewControllerTouch.authForGoogleFromKeychain, which takes the Keychain item name and the client key as parameters to search for the authentication.
It's useful as it helps you remember the last authentication; there is no need to ask for user authentication again if a user has already authenticated before. In viewDidAppear, we check whether the user is already authenticated; if so, we start fetching files from the Drive, and, if not, we display the authentication controller, which asks the user to enter their Google account credentials.

To display the authentication controller, we present the authentication view controller created in the createAuthController() function. In this function, the Google Drive API provides us with the GTMOAuth2ViewControllerTouch class, which encapsulates all the logic of Google account authentication for your app. You need to pass the client key for your project, the Keychain item name in which to save the authentication details, and the finished selector viewController(vc: UIViewController, finishedWithAuth authResult: GTMOAuth2Authentication, error: NSError?), which will be called after the authentication is complete. In that function, we check for errors, and if something goes wrong, we display an alert message to the user. If no error occurs, we start fetching files using the fetchFiles() function.

In the fetchFiles() function, we first create a query by calling GTLQueryDrive.queryForFilesList(). The GTLQueryDrive class has all the information you need about your query, such as which fields to read, for example, name, fileExtension, and a lot of other fields that you can fetch from Google Drive. You can specify the page size if you are going to call with pagination, for example, 10 files at a time. Once you are happy with your query, execute it by calling service.executeQuery, which takes the query and the finished selector to be called when it is done. In our example, it will call the displayResultWithTicket function, which prepares the files to be displayed in the table view. Then we call self.filesTableView.reloadData() to refresh the table view and display the list of files.

In the table view delegate function didSelectRowAt indexPath:, we first read the webContentLink property from the GTLDriveFile instance, which is a download link for the selected file. To fetch a file from Google Drive, the API provides us with GTMSessionFetcher, which can fetch a file and write it directly to the device's disk when you pass it a local path. To create a GTMSessionFetcher, use the service.fetcherService factory class, which gives you a fetcher instance for the file URL. Then we create a local path for the downloaded file by appending the filename to the documents path of the app and pass it to the fetcher via the following command:

fetcher.destinationFileURL = destinationURL

Once you have set up everything, call fetcher.beginFetch and pass a completion handler to be executed after the fetching finishes. Once the fetching has completed successfully, you can get a reference to the current conversation and insert the file into it as an attachment. To do this, just call the following function:

self.activeConversation?.insertAttachment(destinationURL, withAlternateFilename: name, completionHandler: nil)

There's more…

Yes, there is more that you can do in the preceding example to make it fancier and more appealing to users. Consider the following options to make it better:

You can show a loading indicator or progress bar while a file is downloading.
Check whether the file is already downloaded; if so, there is no need to download it again.
Add pagination to request only 10 files at a time.
Filter documents by type, such as PDF or images, or even by date.
Search for a file in your Drive.

Showing a progress indicator

As we said, one of the features we can add to the preceding example is the ability to show a progress bar indicating the download progress of a file. Before looking at how to show a progress bar, let's install a library that is very helpful for managing and showing HUD indicators: MBProgressHUD. This library is available on GitHub at https://github.com/jdg/MBProgressHUD. As we agreed before, all packages are managed via CocoaPods, so let's install the library via CocoaPods, as shown. Open the Podfile and update it to be as follows:

use_frameworks!

target 'PDFPicker' do
end

target 'MessagesExtension' do
  pod 'GoogleAPIClient/Drive', '~> 1.0.2'
  pod 'GTMOAuth2', '~> 1.1.0'
  pod 'MBProgressHUD', '~> 1.0.0'
end

Run the following command to install the dependencies:

pod install

Now, at the top of the MessagesViewController.swift file, add the following import statement to import the library:

import MBProgressHUD

Now, let's edit the didSelectRowAt indexPath function to look like this:

func tableView(_ tableView: UITableView, didSelectRowAt indexPath: IndexPath) {
    let file = self.currentFiles[indexPath.row]
    // Download the file here to send it as an attachment.
    if let downloadURLString = file.webContentLink {
        let url = NSURL(string: downloadURLString)
        if let name = file.name {
            let downloadedPath = (documentsPath() as NSString).appendingPathComponent("\(name)")
            let fetcher = service.fetcherService.fetcher(with: url as! URL)
            let destinationURL = NSURL(fileURLWithPath: downloadedPath) as URL
            fetcher.destinationFileURL = destinationURL
            var progress = Progress()
            let hud = MBProgressHUD.showAdded(to: self.view, animated: true)
            hud.mode = .annularDeterminate
            hud.progressObject = progress
            fetcher.beginFetch(completionHandler: { (data, error) in
                if error == nil {
                    hud.hide(animated: true)
                    self.activeConversation?.insertAttachment(destinationURL, withAlternateFilename: name, completionHandler: nil)
                }
            })
            fetcher.downloadProgressBlock = { (bytes, written, expected) in
                let p = Double(written) * 100.0 / Double(expected)
                print(p)
                progress.totalUnitCount = expected
                progress.completedUnitCount = written
            }
        }
    }
}

First, we create an instance of MBProgressHUD and set its mode to annularDeterminate, which displays a circular progress bar. The HUD updates its progress by keeping a reference to an NSProgress object. Progress has two important variables that determine the progress value: totalUnitCount and completedUnitCount. These two values are set inside the download progress block, downloadProgressBlock, on the fetcher instance. The HUD is hidden in the completion block that is called once the download is complete. Now build and run; after authentication, when you click on a file, you will see something like this:

As you can see, the progress view is updated with the download percentage to give the user an overview of what is going on.

Request files with pagination

Loading all files at once is easy from the development side, but it is wrong from the user experience side. It takes too much time at the beginning, when you get the list of all the files, and it would be better if we could request only 10 files at a time with pagination. In this section, we will see how to add the pagination concept to our example and request only 10 files at a time.
When the user scrolls to the end of the list, we will display a loading indicator, call the next page, and append the results to our current results. Implementing pagination is pretty easy and requires only a few changes to our code. Let's see how to do it.

We start by adding the progress cell design in MainInterface.storyboard. Open the design of MessagesViewController and drag in a new cell alongside our default cell. Drag a UIActivityIndicatorView from the Object Library and place it as a subview of the new cell. Add center constraints to center it horizontally and vertically, as shown:

Now select the new cell and go to the Attributes Inspector to give the cell an identifier and disable selection, as illustrated:

From the design side, we are now ready. Open MessagesViewController.swift to add some tweaks to it. Add the following two variables to our list of current variables:

private var doneFetchingFiles = false
private var nextPageToken: String!

The doneFetchingFiles flag will be used to hide the progress cell when an attempt to load the next page from Google Drive returns an empty list. In that case, we know that we are done fetching files and there is no need to display the progress cell any more. The nextPageToken contains the token to be passed to the GTLQueryDrive query to ask it to load the next page.

Now go to the fetchFiles() function and update it as shown:

func fetchFiles() {
    print("Getting files...")
    if let query = GTLQueryDrive.queryForFilesList() {
        query.fields = "nextPageToken, files(id, name, webViewLink, webContentLink, fileExtension)"
        query.mimeType = "application/pdf"
        query.pageSize = 10
        query.pageToken = nextPageToken
        service.executeQuery(query,
                             delegate: self,
                             didFinish: #selector(MessagesViewController.displayResultWithTicket(ticket:finishedWithObject:error:)))
    }
}

The only difference you will note between the preceding code and the earlier version is that we now set pageSize and pageToken. For pageSize, we set how many files we want for each call, and for pageToken, we pass the token to get the next page. We receive this token in the response to the previous page call, which means that on the first call we don't have a token and it is passed as nil.

Now open the displayResultWithTicket function and update it like this:

// Parse results and display
func displayResultWithTicket(ticket: GTLServiceTicket, finishedWithObject response: GTLDriveFileList, error: NSError?) {
    if let error = error {
        showAlert(title: "Error", message: error.localizedDescription)
        return
    }
    var filesString = ""
    nextPageToken = response.nextPageToken
    let files = response.files as! [GTLDriveFile]
    doneFetchingFiles = files.isEmpty
    self.currentFiles += files
    if !files.isEmpty {
        filesString += "Files:\n"
        for file in files {
            filesString += "\(file.name) (\(file.identifier) \(file.webViewLink) \(file.webContentLink))\n"
        }
    } else {
        filesString = "No files found."
    }
    print(filesString)
    self.filesTableView.reloadData()
}

As you can see, we first get the token to be used to load the next page. We get it from response.nextPageToken and assign it to our new nextPageToken property so that we can use it when loading the next page. doneFetchingFiles will be true only if the page we are currently loading has no files, which means that we are done. Then we append the new files we received to the current files we have.

We don't know when to fire the call for the next page; we will do this once the user scrolls down to the refresh cell that we have.
To do so, we will implement one of the UITableViewDelegate methods, willDisplay cell, as illustrated:

func tableView(_ tableView: UITableView, willDisplay cell: UITableViewCell, forRowAt indexPath: IndexPath) {
    if !doneFetchingFiles && indexPath.row == self.currentFiles.count {
        // Refresh cell
        fetchFiles()
        return
    }
}

This function is triggered, with the indexPath of the cell, for every cell that is about to be displayed. First, we check that we are not done fetching files and that the row is equal to the last row; if so, we fire fetchFiles() again to load the next page. As we added a new refresh cell at the bottom, we should also update our UITableViewDataSource functions, numberOfRowsInSection and cellForRowAt. Check our updated functions, shown as follows:

func tableView(_ tableView: UITableView, numberOfRowsInSection section: Int) -> Int {
    return doneFetchingFiles ? self.currentFiles.count : self.currentFiles.count + 1
}

func tableView(_ tableView: UITableView, cellForRowAt indexPath: IndexPath) -> UITableViewCell {
    if !doneFetchingFiles && indexPath.row == self.currentFiles.count {
        return tableView.dequeueReusableCell(withIdentifier: "progressCell")!
    }
    let cell = tableView.dequeueReusableCell(withIdentifier: "cell")
    let file = self.currentFiles[indexPath.row]
    cell?.textLabel?.text = file.name
    return cell!
}

As you can see, the number of rows is equal to the current files' count plus one for the refresh cell. If we are done fetching files, we return only the number of files. Now everything is in place: when you build and run, you will see only 10 files listed, as shown; and when you scroll down, you will see the progress cell and 10 more files will be requested.

Summary

In this article, we learned how to integrate an app with the Messages app by building an iMessage extension that picks files from Google Drive.

Supervised Learning: Classification and Regression

Packt
06 Apr 2017
30 min read
In this article by Alexey Grigorev, author of the book Mastering Java for Data Science, we build on the earlier material covering pre-processing of data in Java and exploratory data analysis inside and outside Java. Now that we have covered that foundation, we are ready to start creating machine learning models, beginning with supervised learning. In the supervised setting we have some information attached to each observation, called labels, and we want to learn from it and predict it for observations without labels. There are two types of labels: the first are discrete and finite, such as True/False or Buy/Sell, and the second are continuous, such as salary or temperature. These types correspond to the two types of supervised learning, classification and regression, which we will talk about in this article. This article covers:

Classification problems
Regression problems
Evaluation metrics for each type
An overview of available implementations in Java

Classification

In machine learning, the classification problem deals with discrete targets with a finite set of possible values. Binary classification is the most common type of classification problem: the target variable can have only two possible values, such as True/False, Relevant/Not Relevant, Duplicate/Not Duplicate, Cat/Dog, and so on. Sometimes the target variable can have more than two outcomes, for example, colors, the category of an item, the model of a car, and so on, and we call this multi-class classification. Typically, each observation can have only one label, but in some settings an observation can be assigned several values. Multi-class classification can usually be converted into a set of binary classification problems, which is why we will mostly concentrate on binary classification.

Binary classification models

There are many models for solving the binary classification problem, and it is not possible to cover all of them. We will briefly cover the ones that are most often used in practice. They include:

Logistic regression
Support Vector Machines
Decision trees
Neural networks

We assume that you are already familiar with these methods. Deep familiarity is not required, but for more information you can check the following books:

An Introduction to Statistical Learning by G. James, D. Witten, T. Hastie and R. Tibshirani
Python Machine Learning by S. Raschka

When it comes to libraries, we will cover Smile, JSAT, LIBSVM, LIBLINEAR, and Encog.

Smile

Smile (Statistical Machine Intelligence and Learning Engine) is a library with a large set of classification and other machine learning algorithms. You can see the full list here: https://github.com/haifengl/smile. The library is available on Maven Central, and the latest version at the moment of writing is 1.1.0. To include it in your project, add the following dependency:

<dependency>
  <groupId>com.github.haifengl</groupId>
  <artifactId>smile-core</artifactId>
  <version>1.1.0</version>
</dependency>

It is being actively developed; new features and bug fixes are added quite often, but not released as frequently. We recommend using the latest available version of Smile, and to get it, you will need to build it from the sources. To do this:

Install sbt, a tool for building Scala projects.
You can follow the instructions at http://www.scala-sbt.org/release/docs/Manual-Installation.html
Use git to clone the project from https://github.com/haifengl/smile
To build and publish the library to the local Maven repository, run the following command:

sbt core/publishM2

The Smile library consists of several sub-modules, such as smile-core, smile-nlp, smile-plot, and so on. So, after building it, add the following dependency to your pom:

<dependency>
  <groupId>com.github.haifengl</groupId>
  <artifactId>smile-core</artifactId>
  <version>1.2.0</version>
</dependency>

The models from Smile expect the data to be in the form of two-dimensional arrays of doubles, and the label information to be a one-dimensional array of integers. For binary models, the values should be 0 or 1. Some models in Smile can handle multi-class classification problems, so it is possible to have more labels. The models are built using the Builder pattern: you create a special class, set some parameters, and at the end it returns the object it builds. In Smile this builder class is typically called Trainer, and all models should have a trainer for them. For example, consider training a Random Forest model:

double[][] X = ... // training data
int[] y = ...      // 0 and 1 labels
RandomForest model = new RandomForest.Trainer()
    .setNumTrees(100)
    .setNodeSize(4)
    .setSamplingRates(0.7)
    .setSplitRule(SplitRule.ENTROPY)
    .setNumRandomFeatures(3)
    .train(X, y);

The RandomForest.Trainer class takes in a set of parameters and the training data, and at the end produces the trained Random Forest model. The implementation of Random Forest from Smile has the following parameters:

numTrees: the number of trees to train in the model
nodeSize: the minimum number of items in the leaf nodes
samplingRate: the ratio of training data used to grow each tree
splitRule: the impurity measure used for selecting the best split
numRandomFeatures: the number of features the model randomly chooses for selecting the best split

Similarly, a logistic regression is trained as follows:

LogisticRegression lr = new LogisticRegression.Trainer()
    .setRegularizationFactor(lambda)
    .train(X, y);

Once we have a model, we can use it for predicting the label of previously unseen items. For that we use the predict method:

double[] row = ... // data
int prediction = model.predict(row);

This code outputs the most probable class for the given item. However, often we are interested not in the label itself, but in the probability of the label. If a model implements the SoftClassifier interface, then it is possible to get these probabilities like this:

double[] probs = new double[2];
model.predict(row, probs);

After running this code, the probs array will contain the probabilities.

JSAT

JSAT (Java Statistical Analysis Tool) is another Java library that contains a lot of implementations of common machine learning algorithms. You can check the full list of implemented models at https://github.com/EdwardRaff/JSAT/wiki/Algorithms. To include JSAT in a Java project, add this to your pom:

<dependency>
  <groupId>com.edwardraff</groupId>
  <artifactId>JSAT</artifactId>
  <version>0.0.5</version>
</dependency>

Unlike Smile, which just takes arrays of doubles, JSAT requires a special wrapper class for data instances. If we have an array, it is converted to the JSAT representation like this:

double[][] X = ... // data
int[] y = ...      // labels
// labels // change to more classes for more classes for multi-classification CategoricalData binary = new CategoricalData(2); List<DataPointPair<Integer>> data = new ArrayList<>(X.length); for (int i = 0; i < X.length; i++) { int target = y[i]; DataPoint row = new DataPoint(new DenseVector(X[i])); data.add(new DataPointPair<Integer>(row, target)); } ClassificationDataSet dataset = new ClassificationDataSet(data, binary); Once we have prepared the dataset, we can train a model. Let us consider the Random Forest classifier again: RandomForest model = new RandomForest(); model.setFeatureSamples(4); model.setMaxForestSize(150); model.trainC(dataset); First, we set some parameters for the model, and then, at we end, we call the trainC method (which means “train a classifier”). In the JSAT implementation, Random Forest has fewer options for tuning: only the number of features to select and the number of trees to grow. There are several implementations of Logistic Regression. The usual Logistic Regression model does not have any parameters, and it is trained like this: LogisticRegression model = new LogisticRegression(); model.trainC(dataset); If we want to have a regularized model, then we need to use the LogisticRegressionDCD class (DCD stands for “Dual Coordinate Descent” - this is the optimization method used to train the logistic regression). We train it like this: LogisticRegressionDCD model = new LogisticRegressionDCD(); model.setMaxIterations(maxIterations); model.setC(C); model.trainC(fold.toJsatDataset()); In this code, C is the regularization parameter, and the smaller values of C correspond to stronger regularization effect. Finally, for outputting the probabilities, we can do the following: double[] row = // data DenseVector vector = new DenseVector(row); DataPoint point = new DataPoint(vector); CategoricalResults out = model.classify(point); double probability = out.getProb(1); The class CategoricalResults contains a lot of information, including probabilities for each class and the most likely label. LIBSVM and LIBLINEAR Next we consider two similar libraries: LIBSVM and LIBLINEAR. LIBSVM (https://www.csie.ntu.edu.tw/~cjlin/libsvm/) is a library with implementation of Support Vector Machine models, which include support vector classifiers. LIBLINEAR (https://www.csie.ntu.edu.tw/~cjlin/liblinear/) is a library for fast linear classification algorithms such as Liner SVM and Logistic Regression. Both these libraries come from the same research group and have very similar interfaces. LIBSVM is implemented in C++ and has an officially supported java version. It is available on Maven Central: <dependency> <groupId>tw.edu.ntu.csie</groupId> <artifactId>libsvm</artifactId> <version>3.17</version> </dependency> Note that the Java version of LIBSVM is updated not as often as the C++ version. Nevertheless, the version above is stable and should not contain bugs, but it might be slower than its C++ version. To use SVM models from LIBSVM, you first need to specify the parameters. For this, you create a svm_parameter class. Inside, you can specify many parameters, including: the kernel type (RBF, POLY or LINEAR), the regularization parameter C, probability, which you can set to 1 to be able to get probabilities, the svm_type should be set to C_SVC: this tells that the model should be a classifier. 
Here is an example of how you can configure an SVM classifier with the Linear kernel which can output probabilities:
svm_parameter param = new svm_parameter();
param.svm_type = svm_parameter.C_SVC;
param.kernel_type = svm_parameter.LINEAR;
param.probability = 1;
param.C = C;
// default parameters
param.cache_size = 100;
param.eps = 1e-3;
param.p = 0.1;
param.shrinking = 1;
The polynomial kernel is specified by the following formula:
K(u, v) = (gamma * u'v + coef0)^degree
It has three additional parameters: gamma, coef0 and degree; and also C – the regularization parameter. You can specify it like this:
svm_parameter param = new svm_parameter();
param.svm_type = svm_parameter.C_SVC;
param.kernel_type = svm_parameter.POLY;
param.C = C;
param.degree = degree;
param.gamma = 1;
param.coef0 = 1;
param.probability = 1;
// plus defaults from the above
Finally, the Gaussian kernel (or RBF) has the following formula:
K(u, v) = exp(-gamma * ||u - v||^2)
So there is one parameter, gamma, which controls the width of the Gaussians. We can specify the model with the RBF kernel like this:
svm_parameter param = new svm_parameter();
param.svm_type = svm_parameter.C_SVC;
param.kernel_type = svm_parameter.RBF;
param.C = C;
param.gamma = gamma;
param.probability = 1;
// plus defaults from the above
Once we have created the configuration object, we need to convert the data into the right format. The LIBSVM command line application reads files in the SVMLight format, so the library also expects a sparse data representation. For a single array, the conversion is done as follows:
double[] dataRow = ... // single row vector
svm_node[] svmRow = new svm_node[dataRow.length];
for (int j = 0; j < dataRow.length; j++) {
    svm_node node = new svm_node();
    node.index = j;
    node.value = dataRow[j];
    svmRow[j] = node;
}
For a matrix, we do this for every row:
double[][] X = ... // data
int n = X.length;
svm_node[][] nodes = new svm_node[n][];
for (int i = 0; i < n; i++) {
    nodes[i] = wrapAsSvmNode(X[i]);
}
Here, wrapAsSvmNode is a function that wraps a vector into an array of svm_node objects (a possible implementation of this helper is sketched at the end of this section). Now we can put the data and the labels together into an svm_problem object:
double[] y = ... // labels
svm_problem prob = new svm_problem();
prob.l = n;
prob.x = nodes;
prob.y = y;
Now, using the data and the parameters, we can train the SVM model:
svm_model model = svm.svm_train(prob, param);
Once the model is trained, we can use it for classifying unseen data. Getting probabilities is done this way:
double[][] X = ... // test data
int n = X.length;
double[] results = new double[n];
double[] probs = new double[2];
for (int i = 0; i < n; i++) {
    svm_node[] row = wrapAsSvmNode(X[i]);
    svm.svm_predict_probability(model, row, probs);
    results[i] = probs[1];
}
Since we set param.probability = 1, we can use the svm.svm_predict_probability method to predict probabilities. As in Smile, the method takes an array of doubles and writes the probabilities into it, so we can read them from the probs array afterwards. Finally, while training, LIBSVM outputs a lot of things on the console. If we are not interested in this output, we can disable it with the following code snippet:
svm.svm_set_print_string_function(s -> {});
Just add this at the beginning of your code and you will not see this output anymore.
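The wrapAsSvmNode helper used above is referenced in the text but not listed. The following is a minimal sketch of how it could be implemented, based on the single-row conversion shown earlier; the svm_node class comes from LIBSVM, while the helper class and method here are our assumption rather than code from the book:
import libsvm.svm_node;

public class SvmNodes {
    // Wraps a dense row of doubles into the sparse svm_node[] representation
    // expected by LIBSVM. This mirrors the single-row conversion shown above.
    public static svm_node[] wrapAsSvmNode(double[] dataRow) {
        svm_node[] svmRow = new svm_node[dataRow.length];
        for (int j = 0; j < dataRow.length; j++) {
            svm_node node = new svm_node();
            node.index = j;          // feature index
            node.value = dataRow[j]; // feature value
            svmRow[j] = node;
        }
        return svmRow;
    }
}
With such a helper in place, both the training matrix and the test rows can be converted with a single call, which keeps the conversion logic in one place.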
The next library is LIBLINEAR, which provides very fast and high-performing linear classifiers such as SVM with the Linear kernel and Logistic Regression. It can easily scale to tens and hundreds of millions of data points. Unlike LIBSVM, there is no official Java version of LIBLINEAR, but there is an unofficial Java port available at http://liblinear.bwaldvogel.de/. To use it, include the following dependency:
<dependency>
  <groupId>de.bwaldvogel</groupId>
  <artifactId>liblinear</artifactId>
  <version>1.95</version>
</dependency>
The interface is very similar to LIBSVM. First, you define the parameters:
SolverType solverType = SolverType.L1R_LR;
double C = 0.001;
double eps = 0.0001;
Parameter param = new Parameter(solverType, C, eps);
We have to specify three parameters here:
solverType: defines the model which will be used
C: the amount of regularization; the smaller C is, the stronger the regularization
epsilon: the tolerance for stopping the training process; a reasonable default is 0.0001
For classification there are the following solvers:
Logistic Regression: L1R_LR or L2R_LR
SVM: L1R_L2LOSS_SVC or L2R_L2LOSS_SVC
According to the official FAQ (which can be found here: https://www.csie.ntu.edu.tw/~cjlin/liblinear/FAQ.html) you should:
Prefer SVM to Logistic Regression as it trains faster and usually gives higher accuracy.
Try L2 regularization first unless you need a sparse solution – in this case use L1.
The default solver is the L2-regularized linear support vector classifier for the dual problem. If it is slow, try solving the primal problem instead.
Then you define the dataset. Like previously, let's first see how to wrap a row:
double[] row = ... // data
int m = row.length;
Feature[] result = new Feature[m];
for (int i = 0; i < m; i++) {
    result[i] = new FeatureNode(i + 1, row[i]);
}
Please note that we add 1 to the index – the 0 index is the bias term, so the actual features should start from 1. We can put this into a function wrapRow and then wrap the entire dataset as follows:
double[][] X = ... // data
int n = X.length;
Feature[][] matrix = new Feature[n][];
for (int i = 0; i < n; i++) {
    matrix[i] = wrapRow(X[i]);
}
Now we can create the Problem object with the data and labels:
double[] y = ... // labels
Problem problem = new Problem();
problem.x = wrapMatrix(X);
problem.y = y;
problem.n = X[0].length + 1;
problem.l = X.length;
Note that here we also need to provide the dimensionality of the data, and it is the number of features plus one. We need to add one because it includes the bias term. Now we are ready to train the model:
Model model = Linear.train(problem, param);
When the model is trained, we can use it to classify unseen data. In the following example we will output probabilities:
double[] dataRow = ... // data
double[] probs = new double[2];
Feature[] row = wrapRow(dataRow);
Linear.predictProbability(model, row, probs);
double result = probs[1];
The code above works fine for the Logistic Regression model, but it will not work for SVM: SVM cannot output probabilities, so the code above will throw an error for solvers like L1R_L2LOSS_SVC. What you can do instead is get the raw output:
double[] values = new double[1];
Feature[] row = wrapRow(dataRow);
Linear.predictValues(model, row, values);
double result = values[0];
In this case the result will not contain a probability, but some real value. When this value is greater than zero, the model predicts that the class is positive. If we would like to map this value to the [0, 1] range, we can use the sigmoid function for that:
public static double[] sigmoid(double[] scores) {
    double[] result = new double[scores.length];
    for (int i = 0; i < result.length; i++) {
        result[i] = 1 / (1 + Math.exp(-scores[i]));
    }
    return result;
}
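The wrapRow and wrapMatrix helpers used above are described in the text but not listed as complete methods. Below is one possible sketch of both; the Feature and FeatureNode classes belong to the LIBLINEAR Java port, while the helper class and method bodies are an assumption about how such code might look:
import de.bwaldvogel.liblinear.Feature;
import de.bwaldvogel.liblinear.FeatureNode;

public class LibLinearWrappers {
    // Wraps a single dense row into LIBLINEAR's sparse Feature[] representation.
    // Feature indexes start at 1 because index 0 is reserved for the bias term.
    public static Feature[] wrapRow(double[] row) {
        Feature[] result = new Feature[row.length];
        for (int i = 0; i < row.length; i++) {
            result[i] = new FeatureNode(i + 1, row[i]);
        }
        return result;
    }

    // Wraps the whole data matrix by converting it row by row.
    public static Feature[][] wrapMatrix(double[][] X) {
        Feature[][] matrix = new Feature[X.length][];
        for (int i = 0; i < X.length; i++) {
            matrix[i] = wrapRow(X[i]);
        }
        return matrix;
    }
}
Keeping both helpers together makes it easy to reuse them later, for example inside the cross-validation code shown further below.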
Finally, like LIBSVM, LIBLINEAR also outputs a lot of things to standard output. If you do not wish to see it, you can mute it with the following code:
PrintStream devNull = new PrintStream(new NullOutputStream());
Linear.setDebugOutput(devNull);
Here, we use NullOutputStream from Apache Commons IO, which does nothing, so the screen stays clean.
When should you use LIBSVM and when LIBLINEAR? For large datasets, it is often not possible to use any kernel methods. In this case you should prefer LIBLINEAR. Additionally, LIBLINEAR is especially good for Text Processing purposes such as Document Classification.
Encog
Finally, we consider a library for training Neural Networks: Encog. It is available on Maven Central and can be added with the following snippet:
<dependency>
  <groupId>org.encog</groupId>
  <artifactId>encog-core</artifactId>
  <version>3.3.0</version>
</dependency>
With this library you first need to specify the network architecture:
BasicNetwork network = new BasicNetwork();
network.addLayer(new BasicLayer(new ActivationSigmoid(), true, noInputNeurons));
network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 30));
network.addLayer(new BasicLayer(new ActivationSigmoid(), true, 1));
network.getStructure().finalizeStructure();
network.reset();
Here we create a network with one input layer, one hidden layer with 30 neurons, and one output layer with 1 neuron. In each layer we use sigmoid as the activation function and add the bias input (the true parameter). The last line randomly initializes the weights in the network. For both input and output Encog expects two-dimensional double arrays. In the case of binary classification we typically have a one-dimensional array of labels, so we need to convert it:
double[][] X = ... // data
double[] y = ...   // labels
double[][] y2d = new double[y.length][];
for (int i = 0; i < y.length; i++) {
    y2d[i] = new double[] { y[i] };
}
Once the data is converted, we wrap it into a special wrapper class:
MLDataSet dataset = new BasicMLDataSet(X, y2d);
Then this dataset can be used for training:
MLTrain trainer = new ResilientPropagation(network, dataset);
double lambda = 0.01;
trainer.addStrategy(new RegularizationStrategy(lambda));
int noEpochs = 101;
for (int i = 0; i < noEpochs; i++) {
    trainer.iteration();
}
There are a lot of other Machine Learning libraries available in Java, for example, Weka, H2O, JavaML, and others. It is not possible to cover all of them, but you can also try them and see if you like them more than the ones we have covered.
Evaluation
We have covered many Machine Learning libraries, and many of them implement the same algorithms, such as Random Forest or Logistic Regression. Also, each individual model can have many different parameters: a Logistic Regression has the regularization coefficient, and an SVM is configured by setting the kernel and its parameters. How do we select the best single model out of so many possible variants? For that we first define some evaluation metric and then select the model which achieves the best possible performance with respect to this metric. For binary classification we can select one of the following metrics:
Accuracy and Error
Precision, Recall and F1
AUC (AU ROC)
We will also look at how to validate the results: hold out, K-Fold Cross-Validation, and the Training, Validation and Testing split.
Accuracy
Accuracy tells us for how many examples the model predicted the correct label. Calculating it is trivial:
double[] actual = ... // true labels
double[] proba = ...  // predicted probabilities
int n = actual.length;
double[] prediction = Arrays.stream(proba).map(p -> p > threshold ? 1.0 : 0.0).toArray();
int correct = 0;
for (int i = 0; i < n; i++) {
    if (actual[i] == prediction[i]) {
        correct++;
    }
}
double accuracy = 1.0 * correct / n;
Accuracy is the simplest evaluation metric and everybody understands it.
Precision, Recall and F1
In some cases accuracy is not the best measure of model performance. For example, suppose we have an unbalanced dataset: only 1% of the examples are positive. Then a model which always predicts negative is right in 99% of cases, and hence will have an accuracy of 0.99. But this model is not useful. There are alternatives to accuracy that can overcome this problem. Precision and Recall are among these metrics: they both focus on how well the model recognizes the positive items. They can be calculated using the Confusion Matrix: a table which summarizes the performance of a binary classifier in terms of True Positives (TP), False Positives (FP), True Negatives (TN) and False Negatives (FN).
Precision is the fraction of correctly predicted positive items among all items the model predicted positive. In terms of the confusion matrix, Precision is TP / (TP + FP).
Recall is the fraction of correctly predicted positive items among items that are actually positive. With values from the confusion matrix, Recall is TP / (TP + FN).
It is often hard to decide whether one should optimize Precision or Recall. But there is another metric which combines both Precision and Recall into one number, and it is called the F1 score. For calculating Precision and Recall, we first need to calculate the values for the cells of the confusion matrix:
int tp = 0, tn = 0, fp = 0, fn = 0;
for (int i = 0; i < actual.length; i++) {
    if (actual[i] == 1.0 && proba[i] > threshold) {
        tp++;
    } else if (actual[i] == 0.0 && proba[i] <= threshold) {
        tn++;
    } else if (actual[i] == 0.0 && proba[i] > threshold) {
        fp++;
    } else if (actual[i] == 1.0 && proba[i] <= threshold) {
        fn++;
    }
}
Then we can use the values to calculate Precision and Recall:
double precision = 1.0 * tp / (tp + fp);
double recall = 1.0 * tp / (tp + fn);
Finally, F1 can be calculated using the following formula:
double f1 = 2 * precision * recall / (precision + recall);
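Since Accuracy, Precision, Recall and F1 all depend on the threshold used to turn probabilities into hard labels, it can be useful to compute them for several candidate thresholds and pick the one with the best trade-off. The snippet below is only an illustration of this idea: the toy labels, probabilities, and threshold values are assumptions made for the example, not data from the book.
public class ThresholdSweep {
    public static void main(String[] args) {
        // toy data: true labels and predicted probabilities (assumed for illustration)
        double[] actual = {1, 0, 1, 1, 0, 0, 1, 0};
        double[] proba  = {0.9, 0.4, 0.65, 0.8, 0.3, 0.55, 0.45, 0.2};

        double[] thresholds = {0.3, 0.4, 0.5, 0.6, 0.7};
        for (double threshold : thresholds) {
            int tp = 0, fp = 0, fn = 0;
            for (int i = 0; i < actual.length; i++) {
                boolean predictedPositive = proba[i] > threshold;
                if (actual[i] == 1.0 && predictedPositive) tp++;
                else if (actual[i] == 0.0 && predictedPositive) fp++;
                else if (actual[i] == 1.0 && !predictedPositive) fn++;
            }
            // guard against division by zero when nothing is predicted positive
            double precision = (tp + fp) == 0 ? 0.0 : 1.0 * tp / (tp + fp);
            double recall = (tp + fn) == 0 ? 0.0 : 1.0 * tp / (tp + fn);
            double f1 = (precision + recall) == 0 ? 0.0
                    : 2 * precision * recall / (precision + recall);
            System.out.printf("threshold=%.2f precision=%.2f recall=%.2f f1=%.2f%n",
                    threshold, precision, recall, f1);
        }
    }
}
In practice, you would run such a sweep on a validation set rather than on the training data.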
ROC and AU ROC (AUC)
The metrics above are good for binary classifiers which produce hard output: they only tell if the class should be assigned a positive or a negative label. If instead our model outputs some score such that the higher the value of the score, the more likely the item is to be positive, then the binary classifier is called a ranking classifier. Most models can output probabilities of belonging to a certain class, and we can use them to rank examples so that the positive ones are likely to come first. The ROC Curve visually tells us how well a ranking classifier separates positive examples from negative ones. The way a ROC curve is built is the following: we sort the observations by their score, and then, starting from the origin, we go up if the observation is positive and right if it is negative. This way, in the ideal case, we first always go up, and then always go right – and this will result in the best possible ROC curve. In this case we can say that the separation between positive and negative examples is perfect. If the separation is not perfect, but still OK, the curve will go up for positive examples, but will sometimes turn right when a misclassification occurs. Finally, a bad classifier will not be able to tell positive and negative examples apart, and the curve will alternate between going up and right. The diagonal line on the plot represents the baseline – the performance that a random classifier would achieve. The further the curve is from the baseline, the better. Unfortunately, there is no easy-to-use implementation of ROC curves available in Java, so the algorithm for drawing a ROC curve is the following:
Let POS be the number of positive labels, and NEG be the number of negative labels.
Order the data by score, in decreasing order.
Start from (0, 0).
For each example in the sorted order, if the example is positive, move 1 / POS up in the graph; otherwise, move 1 / NEG right in the graph.
This is a simplified algorithm and assumes that the scores are distinct. If the scores aren't distinct, and there are different actual labels for the same score, some adjustment needs to be made. It is implemented in the class RocCurve which you will find in the source code. You can use it as follows:
RocCurve.plot(actual, prediction);
Calling it will create a plot similar to this one:
The area under the curve says how good the separation is. If the separation is very good, then the area will be close to one. But if the classifier cannot distinguish between positive and negative examples, the curve will go around the random baseline curve, and the area will be close to 0.5. The Area Under the Curve is often abbreviated as AUC, or, sometimes, AU ROC – to emphasize that the curve is a ROC Curve. AUC has a very nice interpretation: the value of AUC corresponds to the probability that a randomly selected positive example is scored higher than a randomly selected negative example. Naturally, if this probability is high, our classifier does a good job separating positive and negative examples. This makes AUC a go-to evaluation metric for many cases, especially when the dataset is unbalanced – in the sense that there are a lot more examples of one class than of another. Luckily, there are implementations of AUC in Java. For example, it is implemented in Smile. You can use it like this:
double[] predicted = ... // predicted scores
int[] truth = ...        // true labels
double auc = AUC.measure(truth, predicted);
Result Validation
When learning from data there is always the danger of overfitting. Overfitting occurs when the model starts learning the noise in the data instead of detecting useful patterns. It is always important to check if a model overfits – otherwise it will not be useful when applied to unseen data. The typical and most practical way of checking whether a model overfits or not is to emulate "unseen data" – that is, take a part of the available labeled data and not use it for training. This technique is called "hold out": we hold out a part of the data and use it only for evaluation. Often we shuffle the original data set before splitting. In many cases we make a simplifying assumption that the order of the data is not important – that is, one observation has no influence on another. In this case shuffling the data prior to splitting will remove effects that the order of items might have. On the other hand, if the data is time series data, then shuffling it is not a good idea, because there is some dependence between observations. So, let us implement the hold-out split. We assume that the data we have is already represented by X – a two-dimensional array of doubles with features – and y – a one-dimensional array of labels.
First, let us create a helper class for holding the data:
public class Dataset {
    private final double[][] X;
    private final double[] y;
    // constructor and getters are omitted
}
Splitting our dataset should produce two datasets, so let us create a class for that as well:
public class Split {
    private final Dataset train;
    private final Dataset test;
    // constructor and getters are omitted
}
Now suppose we want to split the data into two parts: train and test. We also want to control the sizes of these parts; we will do it using a testRatio parameter: the fraction of items that should go to the test set. So the first thing we do is generate an array of indexes and then split it according to testRatio:
int[] indexes = IntStream.range(0, dataset.length()).toArray();
int trainSize = (int) (indexes.length * (1 - testRatio));
int[] trainIndex = Arrays.copyOfRange(indexes, 0, trainSize);
int[] testIndex = Arrays.copyOfRange(indexes, trainSize, indexes.length);
We can also shuffle the indexes if we want:
Random rnd = new Random(seed);
for (int i = indexes.length - 1; i > 0; i--) {
    int index = rnd.nextInt(i + 1);
    int tmp = indexes[index];
    indexes[index] = indexes[i];
    indexes[i] = tmp;
}
Then we can select instances for the training set as follows:
int trainSize = trainIndex.length;
double[][] trainX = new double[trainSize][];
double[] trainY = new double[trainSize];
for (int i = 0; i < trainSize; i++) {
    int idx = trainIndex[i];
    trainX[i] = X[idx];
    trainY[i] = y[idx];
}
And then finally wrap it into our Dataset class:
Dataset train = new Dataset(trainX, trainY);
If we repeat the same for the test set, we can put both train and test sets into a Split object:
Split split = new Split(train, test);
And now we can use the train part for training and the test part for testing the models. If we put all the code above into a function of the Dataset class, for example, trainTestSplit, we can use it as follows:
Split split = dataset.trainTestSplit(0.2);
Dataset train = split.getTrain();
// train the model using train.getX() and train.getY()
Dataset test = split.getTest();
// test the model using test.getX() and test.getY()
K-Fold Cross Validation
Holding out only one part of the data may not always be the best option. What we can do instead is to split it into K parts and then test the models only on 1/Kth of the data. This is called K-Fold Cross-Validation: it not only gives a performance estimation, but also the possible spread of the error. Typically we are interested in models which give good and consistent performance, and K-Fold Cross-Validation helps us to select such models. The way we prepare the data for K-Fold Cross-Validation is the following:
First, split the data into K parts.
Then, for each of these parts:
Take one part as the validation set.
Take the remaining K-1 parts as the training set.
If we translate this into Java, the first step will look like this:
int[] indexes = IntStream.range(0, dataset.length()).toArray();
int[][] foldIndexes = new int[k][];
int step = indexes.length / k;
int beginIndex = 0;
for (int i = 0; i < k - 1; i++) {
    foldIndexes[i] = Arrays.copyOfRange(indexes, beginIndex, beginIndex + step);
    beginIndex = beginIndex + step;
}
foldIndexes[k - 1] = Arrays.copyOfRange(indexes, beginIndex, indexes.length);
This creates an array of indexes for each fold. You can also shuffle the indexes array as previously.
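The next step uses a Split.fromIndexes helper that builds a Split from a Dataset and the train/test index arrays. It is not listed in the text, so here is a possible sketch of it, reusing the selection logic from the hold-out split above; the method bodies are our assumption, only the Dataset and Split classes (with their constructors and getX()/getY() getters) come from the text:
// Could live inside the Split class defined above.
public static Split fromIndexes(Dataset dataset, int[] trainIndex, int[] testIndex) {
    return new Split(subset(dataset, trainIndex), subset(dataset, testIndex));
}

// Selects the rows of the dataset that correspond to the given indexes.
private static Dataset subset(Dataset dataset, int[] indexes) {
    double[][] X = dataset.getX();
    double[] y = dataset.getY();
    double[][] subX = new double[indexes.length][];
    double[] subY = new double[indexes.length];
    for (int i = 0; i < indexes.length; i++) {
        int idx = indexes[i];
        subX[i] = X[idx];
        subY[i] = y[idx];
    }
    return new Dataset(subX, subY);
}
With this helper in place, creating a split for each fold becomes a one-liner, as shown next.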
Now we can create splits from each fold:
List<Split> result = new ArrayList<>();
for (int i = 0; i < k; i++) {
    int[] testIdx = folds[i];
    int[] trainIdx = combineTrainFolds(folds, indexes.length, i);
    result.add(Split.fromIndexes(dataset, trainIdx, testIdx));
}
In the code above we use two additional methods: combineTrainFolds takes K-1 arrays with indexes and combines them into one, and Split.fromIndexes creates a split given the train and test indexes. We have already covered the logic of the second function when we created a simple hold-out test set. The first function, combineTrainFolds, is implemented like this:
private static int[] combineTrainFolds(int[][] folds, int totalSize, int excludeIndex) {
    int size = totalSize - folds[excludeIndex].length;
    int result[] = new int[size];
    int start = 0;
    for (int i = 0; i < folds.length; i++) {
        if (i == excludeIndex) {
            continue;
        }
        int[] fold = folds[i];
        System.arraycopy(fold, 0, result, start, fold.length);
        start = start + fold.length;
    }
    return result;
}
Again, we can put the code above into a function of the Dataset class and call it as follows:
List<Split> folds = train.kfold(3);
Now that we have a list of Split objects, we can create a special function for performing Cross-Validation:
public static DescriptiveStatistics crossValidate(List<Split> folds,
        Function<Dataset, Model> trainer) {
    double[] aucs = folds.parallelStream().mapToDouble(fold -> {
        Dataset foldTrain = fold.getTrain();
        Dataset foldValidation = fold.getTest();
        Model model = trainer.apply(foldTrain);
        return auc(model, foldValidation);
    }).toArray();
    return new DescriptiveStatistics(aucs);
}
This function takes a list of folds and a callback which creates a model. After the model is trained on each fold, we calculate the AUC for it. Additionally, we take advantage of Java's ability to parallelize loops and train the models on each fold at the same time. Finally, we put the AUCs calculated on each fold into a DescriptiveStatistics object, which can later be used to return the mean and the standard deviation of the AUCs. As you probably remember, the DescriptiveStatistics class comes from the Apache Commons Math library. Let us consider an example. Suppose we want to use Logistic Regression from LIBLINEAR and select the best value for the regularization parameter C. We can use the function above this way:
double[] Cs = { 0.01, 0.05, 0.1, 0.5, 1.0, 5.0, 10.0 };
for (double C : Cs) {
    DescriptiveStatistics summary = crossValidate(folds, fold -> {
        Parameter param = new Parameter(SolverType.L1R_LR, C, 0.0001);
        return LibLinear.train(fold, param);
    });
    double mean = summary.getMean();
    double std = summary.getStandardDeviation();
    System.out.printf("L1 logreg C=%7.3f, auc=%.4f ± %.4f%n", C, mean, std);
}
Here, LibLinear.train is a helper method which takes a Dataset object and a Parameter object and then trains a LIBLINEAR model. This will print the AUC for all provided values of C, so you can see which one is the best and pick the one with the highest mean AUC.
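The crossValidate function above relies on an auc helper that evaluates a trained model on the validation part of a fold. It is not listed in the text, so the sketch below shows one possible way to implement it for LIBLINEAR models, reusing the wrapRow helper and Smile's AUC.measure shown earlier; the class, method body, and exact imports are assumptions, not the book's code:
import de.bwaldvogel.liblinear.Feature;
import de.bwaldvogel.liblinear.Linear;
import de.bwaldvogel.liblinear.Model;
import smile.validation.AUC;

public class Evaluation {
    // Scores every row of the validation set with the LIBLINEAR model and
    // computes the AUC of the resulting probabilities against the true labels.
    public static double auc(Model model, Dataset dataset) {
        double[][] X = dataset.getX();
        double[] y = dataset.getY();

        int[] truth = new int[y.length];
        double[] scores = new double[y.length];
        double[] probs = new double[2];

        for (int i = 0; i < X.length; i++) {
            Feature[] row = LibLinearWrappers.wrapRow(X[i]); // sparse representation
            Linear.predictProbability(model, row, probs);
            scores[i] = probs[1];   // probability of the positive class
            truth[i] = (int) y[i];
        }
        return AUC.measure(truth, scores);
    }
}
This only works for solvers that support probability estimates, such as the logistic regression solvers; for the SVM solvers you would score with Linear.predictValues instead, as discussed above.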
Training, Validation and Testing
When doing Cross-Validation there is still a danger of overfitting. Since we try a lot of different experiments on the same validation set, we might accidentally pick the model which just happened to do well on the validation set – but it may later fail to generalize to unseen data. The solution to this problem is to hold out a test set at the very beginning and not touch it at all until we have selected what we think is the best model; we then use it only for evaluating the final model. So how do we select the best model? What we can do is perform Cross-Validation on the remaining train data. It can be either hold out or K-Fold Cross-Validation. In general you should prefer K-Fold Cross-Validation because it also gives you the spread of the performance, and you may use it for model selection as well. The typical workflow should be the following:
(0) Select some metric for validation, e.g. accuracy or AUC.
(1) Split all the data into train and test sets.
(2) Split the training data further and hold out a validation dataset, or split it into K folds.
(3) Use the validation data for model selection and parameter optimization.
(4) Select the best model according to the validation set and evaluate it against the held-out test set.
It is important to avoid looking at the test set too often. It should be used only occasionally for final evaluation to make sure the selected model does not overfit. If the validation scheme is set up properly, the validation score should correspond to the final test score. If this happens, we can be sure that the model does not overfit and is able to generalize to unseen data. Using the classes and the code we created previously, this translates to the following Java code:
Dataset data = new Dataset(X, y);
Split split = data.trainTestSplit(0.2);
Dataset train = split.getTrain();
List<Split> folds = train.kfold(3);
// now use crossValidate(folds, ...) to select the best model
Dataset test = split.getTest();
// do the final evaluation of the best model on test
With this information we are ready to do a project on Binary Classification.
Summary
In this article we spoke about supervised machine learning and about two common supervised problems: Classification and Regression. We also covered the libraries in which the commonly used algorithms are implemented, and saw how to evaluate the performance of these algorithms. There is another family of Machine Learning algorithms that do not require the label information: these methods are called Unsupervised Learning.
Resources for Article:
Further resources on this subject:
Supervised Machine Learning [article]
Clustering and Other Unsupervised Learning Methods [article]
Data Clustering [article]
Hands on with Service Fabric

Packt
06 Apr 2017
12 min read
In this article by Rahul Rai and Namit Tanasseri, authors of the book Microservices with Azure, explains that Service Fabric as a platform supports multiple programming models. Each of which is best suited for specific scenarios. Each programming model offers different levels of integration with the underlying management framework. Better integration leads to more automation and lesser overheads. Picking the right programming model for your application or services is the key to efficiently utilize the capabilities of Service Fabric as a hosting platform. Let's take a deeper look into these programming models. (For more resources related to this topic, see here.) To start with, let's look at the least integrated hosting option: Guest Executables. Native windows applications or application code using Node.js or Java can be hosted on Service Fabric as a guest executable. These executables can be packaged and pushed to a Service Fabric cluster like any other services. As the cluster manager has minimal knowledge about the executable, features like custom health monitoring, load reporting, state store and endpoint registration cannot be leveraged by the hosted application. However, from a deployment standpoint, a guest executable is treated like any other service. This means that for a guest executable, Service Fabric cluster manager takes care of high availability, application lifecycle management, rolling updates, automatic failover, high density deployment and load balancing. As an orchestration service, Service Fabric is responsible for deploying and activating an application or application services within a cluster. It is also capable of deploying services within a container image. This programming model is addressed as Guest Containers. The concept of containers is best explained as an implementation of operating system level virtualization. They are encapsulated deployable components running on isolated process boundaries sharing the same kernel. Deployed applications and their runtime dependencies are bundles within the container with an isolated view of all operating system constructs. This makes containers highly portable and secure. Guest container programming model is usually chosen when this level of isolation is required for the application. As containers don't have to boot an operating system, they have fast boot up time and are comparatively small in size. A prime benefit of using Service Fabric as a platform is the fact that it supports heterogeneous operating environments. Service Fabric supports two types of containers to be deployed as guest containers: Docker containers on Linux and Windows server containers. Container images for Docker containers are stored in Docker Hub and Docker APIs are used to create and manage the containers deployed on Linux kernel. Service Fabric supports two different types of containers in Windows Server 2016 with different levels of isolation. They are: Windows Server containers and Windows Hyper-V containers Windows Server containers are similar to Docker containers in terms of the isolation they provide. Windows Hyper-V containers offer higher degree of isolation and security by not sharing the operating system kernel across instances. These are ideally used when a higher level of security isolation is required such as systems requiring hostile multitenant hosts. The following figure illustrates the different isolation levels achieved by using these containers. 
Container isolation levels Service Fabric application model treats containers as an application host which can in turn host service replicas. There are three ways of utilizing containers within a Service Fabric application mode. Existing applications like Node.js, JavaScript application of other executables can be hosted within a container and deployed on Service Fabric as a Guest Container. A Guest Container is treated similar to a Guest Executable by Service Fabric runtime. The second scenario supports deploying stateless services inside a container hosted on Service Fabric. Stateless services using Reliable Services and Reliable actors can be deployed within a container. The third option is to deploy stateful services in containers hosted on Service Fabric. This model also supports Reliable Services and Reliable Actors. Service Fabric offers several features to manage containerized Microservices. These include container deployment and activation, resource governance, repository authentication, port mapping, container discovery and communication and ability to set environment variables. While containers offer a good level of isolation it is still heavy in terms of deployment footprint. Service Fabric offers a simpler, powerful programming model to develop your services which they call Reliable Services. Reliable services let you develop stateful and stateless services which can be directly deployed on Service Fabric clusters. For stateful services, the state can be stored close to the compute by using Reliable Collections. High availability of the state store and replication of the state is taken care by the Service Fabric cluster management services. This contributes substantially to the performance of the system by improving the latency of data access. Reliable services come with a built-in pluggable communication model which supports HTTP with Web API, WebSockets and custom TCP protocols out of the box. A Reliable service is addressed as stateless if it does not maintain any state within it or if the scope of the state stored is limited to a service call and is entirely disposable. This means that a stateless service does not require to persist, synchronize or replicate state. A good example for this service is a weather service like MSN weather service. A weather service can be queried to retrieve weather conditions associated with a specific geographical location. The response is totally based on the parameters supplied to the service. This service does not store any state. Although stateless services are simpler to implement, most of the services in real life are not stateless. They either store state in an external state store or an internal one. Web front end hosting APIs or web applications are good use cases to be hosted as stateless services. A stateful service persists states. The outcome of a service call made to a stateful service is usually influenced by the state persisted by the service. A service exposed by a bank to return the balance on an account is a good example for a stateful service. The state may be stored in an external data store such as Azure SQL Database, Azure Blobs or Azure Table store. Most services prefer to store the state externally considering the challenges around reliability, availability, scalability and consistency of the data store. With Service Fabric, state can be stored close to the compute by using reliable collections. To makes things more lightweight, Service Fabric also offers a programming model based on Virtual actor pattern. 
This programming model is called Reliable Actors. The Reliable Actors programming model is built on top of Reliable Services. This guarantees the scalability and reliability of the services. An Actor can be defined as an isolated, independent unit of compute and state with single-threaded execution. Actors can be created, managed, and disposed of independently of each other. A large number of actors can coexist and execute at a time. Service Fabric Reliable Actors are a good fit for systems which are highly distributed and dynamic by nature. Every actor is defined as an instance of an actor type, the same way an object is an instance of a class. Each actor is uniquely identified by an actor ID. The lifetime of Service Fabric Actors is not tied to their in-memory state. As a result, Actors are automatically created the first time a request for them is made. The Reliable Actors garbage collector takes care of disposing of unused Actors in memory. Now that we understand the programming models, let's take a look at how the services deployed on Service Fabric are discovered and how the communication between services takes place.
Service Fabric discovery and communication
An application built on top of Microservices is usually composed of multiple services, each of which runs multiple replicas. Each service is specialized in a specific task. To achieve an end-to-end business use case, multiple services will need to be stitched together. This requires services to communicate with each other. A simple example would be a web front-end service communicating with middle-tier services, which in turn connect to back-end services to handle a single user request. Some of these middle-tier services can also be invoked by external applications. Services deployed on Service Fabric are distributed across multiple nodes in a cluster of virtual machines. The services can move across nodes dynamically. This distribution of services can either be triggered by a manual action or be the result of the Service Fabric cluster manager re-balancing services to achieve optimal resource utilization. This makes communication a challenge, as services are not tied to a particular machine. Let's understand how Service Fabric solves this challenge for its consumers.
Service protocols
Service Fabric, as a hosting platform for Microservices, does not interfere with the implementation of the service. On top of this, it also lets services decide on the communication channels they want to open. These channels are addressed as service endpoints. During service initiation, Service Fabric provides the opportunity for the services to set up the endpoints for incoming requests on any protocol or communication stack. The endpoints are defined according to common industry standards, that is, IP:Port. It is possible that multiple service instances share a single host process, in which case they either have to use different ports or a port-sharing mechanism. This will ensure that every service instance is uniquely addressable.
Service endpoints
Service discovery
Service Fabric can rebalance services deployed on a cluster as a part of orchestration activities. This can be caused by resource balancing activities, failovers, upgrades, scale outs, or scale ins. This will result in a change of service endpoint addresses as the services move across different virtual machines.
Service distribution
The Service Fabric Naming Service is responsible for abstracting this complexity from the consuming service or application.
The Naming Service takes care of service discovery and resolution. All service instances in Service Fabric are identified by a unique URL like fabric:/MyMicroServiceApp/AppService1. This name stays constant across the lifetime of the service, although the endpoint addresses which physically host the service may change. Internally, Service Fabric manages a map between the service names and the physical locations where the services are hosted. This is similar to the DNS service, which is used to resolve website URLs to IP addresses. The following figure illustrates the name resolution process for a service hosted on Service Fabric:
Name resolution
Connections from applications external to Service Fabric
Service communications to or between services hosted in Service Fabric can be categorized as internal or external. Internal communication among services hosted on Service Fabric is easily achieved using the Naming Service. External communication, originating from an application or a user outside the boundaries of Service Fabric, will need some extra work. To understand how this works, let's dive deeper into the logical network layout of a typical Service Fabric cluster. A Service Fabric cluster is always placed behind an Azure Load Balancer. The Load Balancer acts like a gateway for all traffic which needs to pass to the Service Fabric cluster. The Load Balancer is aware of every port open on every node of a cluster. When a request hits the Load Balancer, it identifies the port the request is looking for and randomly routes the request to one of the nodes which has the requested port open. The Load Balancer is not aware of the services running on the nodes or the ports associated with the services. The following figure illustrates request routing in action.
Request routing
Configuring ports and protocols
The protocol and the ports to be opened by a Service Fabric cluster can be easily configured through the portal. Let's take an example to understand the configuration in detail. If we need a web application to be hosted on a Service Fabric cluster which should have port 80 opened on HTTP to accept incoming traffic, the following steps should be performed.
Configuring the service manifest
Once a service listening on port 80 is authored, we need to configure port 80 in the service manifest to open a listener in the service. This can be done by editing the ServiceManifest.xml:
<Resources>
  <Endpoints>
    <Endpoint Name="WebEndpoint" Protocol="http" Port="80" />
  </Endpoints>
</Resources>
Configuring a custom endpoint
On the Service Fabric cluster, configure port 80 as a custom endpoint. This can be easily done through the Azure Management portal.
Configuring custom port
Configure Azure Load Balancer
Once the cluster is configured and created, the Azure Load Balancer can be instructed to forward the traffic to port 80. If the Service Fabric cluster is created through the portal, this step is automatically taken care of for every port which is configured in the cluster configuration.
Configuring Azure Load Balancer
Configure health check
Azure Load Balancer probes the ports on the nodes for their availability to ensure reliability of the service. The probes can be configured on the Azure portal. This is an optional step, as a default probe configuration is applied for each endpoint when a cluster is created.
Configuring probe
Built-in Communication API
Service Fabric offers many built-in communication options to support inter-service communication. Service Remoting is one of them.
This option allows strongly typed remote procedure calls between Reliable Services and Reliable Actors. This option is very easy to set up and operate with, as Service Remoting handles the resolution of service addresses, connection, retry, and error handling. Service Fabric also supports HTTP for language-agnostic communication. The Service Fabric SDK exposes the ICommunicationClient and ServicePartitionClient classes for service resolution, HTTP connections, and retry loops. WCF is also supported by Service Fabric as a communication channel to enable legacy workloads to be hosted on it. The SDK exposes the WcfCommunicationListener class for the server side and the WcfCommunicationClient and ServicePartitionClient classes for the client side to ease programming hurdles.
Resources for Article:
Further resources on this subject:
Installing Neutron [article]
Designing and Building a vRealize Automation 6.2 Infrastructure [article]
Insight into Hyper-V Storage [article]
Systems and Logics

Packt
06 Apr 2017
19 min read
In this article by Priya Kuber, Rishi Gaurav Bhatnagar, and Vijay Varada, authors of the book Arduino for Kids explains structure and various components of a code: How does a code work What is code What is a System How to download, save and access a file in the Arduino IDE (For more resources related to this topic, see here.) What is a System? Imagine system as a box which in which a process is completed. Every system is solving a larger problem, and can be broken down into smaller problems that can be solved and assembled. Sort of like a Lego set! Each small process has 'logic' as the backbone of the solution. Logic, can be expressed as an algorithm and implemented in code. You can design a system to arrive at solutions to a problem. Another advantage to breaking down a system into small processes is that in case your solution fails to work, you can easily spot the source of your problem, by checking if your individual processes work. What is Code? Code is a simple set of written instructions, given to a specific program in a computer, to perform a desired task. Code is written in a computer language. As we all know by now, a computer is an intelligent, electronic device capable of solving logical problems with a given set of instructions. Some examples of computer languages are Python, Ruby, C, C++ and so on. Find out some more examples of languages from the internet and write it down in your notebook. What is an Algorithm? A logical set by step process, guided by the boundaries (or constraints) defined by a problem, followed to find a solution is called an algorithm. In a better and more pictorial form, it can be represented as follows: (Logic + Control = Algorithm) (A picture depicting this equation) What does that even mean? Look at the following example to understand the process. Let's understand what an algorithm means with the help of an example. It's your friend's birthday and you have been invited for the party (Isn't this exciting already?). You decide to gift her something. Since it's a gift, let's wrap it. What would you do to wrap the gift? How would you do it? Look at the size of the gift Fetch the gift wrapping paper Fetch the scissors Fetch the tape Then you would proceed to place the gift inside the wrapping paper. You will start start folding the corners in a way that it efficiently covers the Gift. In the meanwhile, to make sure that your wrapping is tight, you would use a scotch tape. You keep working on the wrapper till the whole gift is covered (and mind you, neatly! you don't want mommy scolding you, right?). What did you just do? You used a logical step by step process to solve a simple task given to you. Again coming back to the sentence: (Logic + Control = Algorithm) 'Logic' here, is the set of instructions given to a computer to solve the problem. 'Control' are the words making sure that the computer understands all your boundaries. Logic Logic is the study of reasoning and when we add this to the control structures, they become algorithms. Have you ever watered the plants using a water pipe or washed a car with it? How do you think it works? The pipe guides the water from the water tap to the car. It makes sure optimum amount of water reaches the end of the pipe. A pipe is a control structure for water in this case. We will understand more about control structures in the next topic. How does a control structure work? A very good example to understand how a control structure works, is taken from wikiversity. 
(https://en.wikiversity.org/wiki/Control_structures) A precondition is the state of a variable before entering a control structure. In the gift wrapping example, the size of the gift determines the amount of gift wrapping paper you will use. Hence, it is a condition that you need to follow to successfully finish the task. In programming terms, such condition is called precondition. Similarly, a post condition is the state of the variable after exiting the control structure. And a variable, in code, is an alphabetic character, or a set of alphabetic characters, representing or storing a number, or a value. Some examples of variables are x, y, z, a, b, c, kitten, dog, robot Let us analyze flow control by using traffic flow as a model. A vehicle is arriving at an intersection. Thus, the precondition is the vehicle is in motion. Suppose the traffic light at the intersection is red. The control structure must determine the proper course of action to assign to the vehicle. Precondition: The vehicle is in motion. Control Structure: Is the traffic light green? If so, then the vehicle may stay in motion. Is the traffic light red? If so, then the vehicle must stop. End of Control Structure: Post condition: The vehicle comes to a stop. Thus, upon exiting the control structure, the vehicle is stopped. If you wonder where you learnt to wrap the gift, you would know that you learnt it by observing other people doing a similar task through your eyes. Since our microcontroller does not have eyes, we need to teach it to have a logical thinking using Code. The series of logical steps that lead to a solution is called algorithm as we saw in the previous task. Hence, all the instructions we give to a micro controller are in the form of an algorithm. A good algorithm solves the problem in a fast and efficient way. Blocks of small algorithms form larger algorithms. But algorithm is just code! What will happen when you try to add sensors to your code? A combination of electronics and code can be called a system. Picture: (block diagram of sensors + Arduino + code written in a dialogue box ) Logic is universal. Just like there can be multiple ways to fold the wrapping paper, there can be multiple ways to solve a problem too! A micro controller takes the instructions only in certain languages. The instructions then go to a compiler that translates the code that we have written to the machine. What language does your Arduino Understand? For Arduino, we will use the language 'processing'. Quoting from processing.org, Processing is a flexible software sketchbook and a language for learning how to code within the context of the visual arts. Processing is an open source programming language and integrated development environment (IDE). Processing was originally built for designers and it was extensively used in electronics arts and visual design communities with the sole purpose of teaching the fundamentals of computer sciences in a visual context. This also served as the foundations of electronic sketchbooks. From the previous example of gift wrapping, you noticed that before you need to bring in the paper and other stationery needed, you had to see the size of the problem at hand (the gift). What is a Library? In computer language, the stationery needed to complete your task, is called "Library". A library is a collection of reusable code that a programmer can 'call' instead of writing everything again. 
Now imagine if you had to cut a tree, make paper, then color the paper into the beautiful wrapping paper that you used, when I asked you to wrap the gift. How tiresome would it be? (If you are inventing a new type of paper, sure, go ahead chop some wood!) So before writing a program, you make sure that you have 'called' all the right libraries. Can you search the internet and make a note of a few arduino libraries in your inventor's diary? Please remember, that libraries are also made up of code! As your next activity, we will together learn more about how a library is created. Activity: Understanding the Morse Code During the times before the two-way mobile communication, people used a one-way communication called the Morse code. The following image is the experimental setup of a Morse code. Do not worry; we will not get into how you will perform it physically, but by this example, you will understand how your Arduino will work. We will show you the bigger picture first and then dissect it systematically so that you understand what a code contains. The Morse code is made up of two components "Short" and "Long" signals. The signals could be in the form of a light pulse or sound. The following image shows how the Morse code looks like. A dot is a short signal and a dash is a long signal. Interesting, right? Try encrypting your message for your friend with this dots and dashes. For example, "Hello" would be: The image below shows how the Arduino code for Morse code will looks like. The piece of code in dots and dashes is the message SOS that I am sure you all know, is an urgent appeal for help. SOS in Morse goes: dot dot dot; dash dash dash; dot dot dot. Since this is a library, which is being created using dots and dashes, it is important that we define how the dot becomes dot, and dash becomes dash first. The following sections will take smaller sections or pieces of main code and explain you how they work. We will also introduce some interesting concepts using the same. What is a function? Functions have instructions in a single line of code, telling the values in the bracket how to act. Let us see which one is the function in our code. Can you try to guess from the following screenshot? No? Let me help you! digitalWrite() in the above code is a Function, that as you understand, 'writes' on the correct pin of the Arduino. delay is a Function that tells the controller how frequently it should send the message. The higher the delay number, the slower will be the message (Imagine it as a way to slow down your friend who speaks too fast, helping you to understand him better!) Look up the internet to find out what is the maximum number that you stuff into delay. What is a constant? A constant is an identifier with pre-defined, non-changeable values. What is an identifier you ask? An identifier is a name that labels the identity of a unique object or value. As you can see from the above piece of code, HIGH and LOW are Constants. Q: What is the opposite of Constant? Ans: Variable The above food for thought brings us to the next section. What is a variable? A variable is a symbolic name for information. In plain English, a 'teacher' can have any name; hence, the 'teacher' could be a variable. A variable is used to store a piece of information temporarily. The value of a variable changes, if any action is taken on it for example; Add, subtract, multiply etc. (Imagine how your teacher praises you when you complete your assignment on time and scolds you when you do not!) What is a Datatype? 
Datatypes are sets of data that have a pre-defined value. Now look at the first block of the example program in the following image: int, as shown in the above screenshot, is a Datatype. The following table shows some examples of Datatypes.
Datatype | Use | Example
int | describes an integer number; used to represent whole numbers | 1, 2, 13, 99, and so on
float | used to represent numbers with decimals | 0.66, 1.73, and so on
char | represents any character; strings are written in single quotes | 'A', 65, and so on
str | represents a string | "This is a good day!"
With the above definition, can we recognize what pinMode is? Every time you have a doubt about a command or you want to learn more about it, you can always look it up on the Arduino.cc website. You could do the same for digitalWrite() as well! From the pinMode page of Arduino.cc we can define it as a command that configures the specified pin to behave either as an input or an output. Let us now see something more interesting.
What is a control structure?
We have already seen the working of a control structure. In this section, we will be more specific to our code. Now I draw your attention towards this specific block from the main example above: Do you see void setup() followed by code in the brackets? Similarly void loop()? These make up the basics of the structure of an Arduino program sketch. A structure holds the program together and helps the compiler to make sense of the commands entered. A compiler is a program that turns code understood by humans into code that is understood by machines. There are other loop and control structures, as you can see in the following screenshot. These control structures are explained next.
How do you use Control Structures?
Imagine you are teaching your friend to build a 6 cm high Lego wall. You ask her to place one layer of Lego bricks, and then you further ask her to place another layer of Lego bricks on top of the bottom layer. You ask her to repeat the process until the wall is 6 cm high. This process of repeating instructions until a desired result is achieved is called a loop. A micro-controller is only as smart as you program it to be. Hence, we will move on to the different types of loops.
While loop: Like the name suggests, it repeats a statement (or group of statements) while the given condition is true. The condition is tested before executing the loop body.
For loop: Executes a sequence of statements multiple times and abbreviates the code that manages the loop variable.
Do while loop: Like a while statement, except that it tests the condition at the end of the loop body.
Nested loop: You can use one or more loops inside any other while, for or do..while loop.
Now you were able to successfully tell your friend when to stop, but how do you control the micro-controller? Do not worry, the magic is on its way! You introduce control statements.
Break statements: Break the flow of the loop or switch statement and transfer execution to the statement immediately following the loop or switch.
Continue statements: This statement causes the loop to skip the remainder of its body and immediately retest its condition before reiterating.
Goto statements: This transfers control to a statement which is labeled. It is not advised to use goto statements in your programs.
Quiz time: What is an infinite loop? Look up the internet and note it in your inventor's notebook.
The Arduino IDE
The full form of IDE is Integrated Development Environment.
The IDE uses a compiler to translate your code into a simple language that the computer understands. The compiler is the program that reads all your code and translates your instructions for your microcontroller. In the case of the Arduino IDE, it also verifies whether your code makes sense to it or not. The Arduino IDE is like a friend who helps you finish your homework and reviews it before you hand it in for submission; if there are any errors, it helps you identify and resolve them.
Why should you love the Arduino IDE?
I am sure by now things look too technical. You have been introduced to SO many new terms to learn and understand. The important thing here is not to forget to have fun while learning. Understanding how the IDE works is very useful when you are trying to modify or write your own code. If you make a mistake, it will tell you which line is giving you trouble. Isn't that cool? The Arduino IDE also comes with loads of cool examples that you can plug and play. It also has a long list of libraries for you to access. Now let us learn how to get the IDE onto your computer. Ask an adult to help you with this section if you are unable to succeed. Make a note of the answers to the following questions in your inventor's notebook before downloading the IDE. Get your answers from Google or ask an adult.
What is an operating system?
What is the name of the operating system running on your computer?
What is the version of your current operating system?
Is your operating system 32 bit or 64 bit?
What is the name of the Arduino board that you have?
Now that we have done our homework, let us start playing!
How to download the IDE?
Let us now go further and understand how to download something that's going to be our playground. I am sure you'd be eager to see the place you'll be working in for building new and interesting stuff! For those of you wanting to learn and do everything by yourselves, open any browser and search for "Arduino IDE" followed by the name of your operating system with "32 bits" or "64 bits" as learnt in the previous section. Click to download the latest version and install! Otherwise, the step-by-step instructions are here: Open your browser (Firefox, Chrome, Safari). Go to www.arduino.cc as shown in the following screenshot. Click on the 'Download' section of the homepage, which is the third option from your left, as shown in the following screenshot. From the options, locate the name of your operating system and click on the right version (32 bits or 64 bits). Then click on 'Just Download' after the new page appears. After clicking on the desired link and saving the files, you should be able to double-click on the Arduino icon and install the software. If you have managed to install it successfully, you should see the following screens. If not, go back to step 1 and follow the procedure again. The next screenshot shows you what the program looks like while it is loading. This is how the IDE looks when no code is written into it.
Your first program
Now that you have your IDE ready and open, it is time to start exploring. As promised before, the Arduino IDE comes with many examples, libraries, and helping tools to get curious minds such as you started soon. Let us now look at how you can access your first program via the Arduino IDE. A large number of examples can be accessed in the File | Examples option, as shown in the following screenshot. Just like we all have nicknames in school, a program written in the Arduino IDE is called a 'sketch'.
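Before we open the real example in the next section, here is a small illustrative sketch in the same shape as the classic Blink example, so you can see how the functions, constants, and variables we discussed fit together (the pin number is an assumption for a board with an LED on pin 13; this is not the exact listing from the IDE):

int ledPin = 13;                 // a variable that stores the pin number

void setup() {
  pinMode(ledPin, OUTPUT);       // runs once: configure the pin as an output
}

void loop() {
  digitalWrite(ledPin, HIGH);    // HIGH is a constant: switch the LED on
  delay(1000);                   // wait for 1000 milliseconds
  digitalWrite(ledPin, LOW);     // LOW is a constant: switch the LED off
  delay(1000);                   // wait again, then loop() repeats forever
}

When you open the built-in Blink example shortly, you should recognize this same setup()/loop() structure and the digitalWrite() and delay() calls from the earlier sections.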
Whenever you write any program for Arduino, it is important that you save your work. Programs written in the Arduino IDE are saved with the extension .ino. The name .ino is derived from the last three letters of the word ArduINO. What other extensions are you aware of? (Hint: .doc, .ppt, and so on.) Make a note in your inventor's notebook. Now ask yourself why so many extensions exist. An extension tells the computer which software should open the file, so that when the contents are displayed, they make sense. As we learnt above, a program written in the Arduino IDE is called a 'sketch'. Your first sketch is named 'Blink'. What does it do? Well, it makes your Arduino blink! For now, we can concentrate on the code. Click on File | Examples | Basics | Blink. Refer to the next image for this. When you load an example sketch, this is how it will look. In the image below you will be able to identify the structure of the code and recall the meaning of functions and integers from the previous section. We learnt that the Arduino IDE is a compiler too! After opening your first example, we can now learn how to hide information from the compiler. If you want to insert any extra information in plain English, you can do so by using symbols as follows: /* your text here */ OR // your text here. Comments can also be inserted individually above lines of code, explaining what those lines do. It is good practice to write comments, as they will be useful when you revisit your old code to modify it at a later date. Try editing the contents of the comment section by spelling out your name. The following screenshot shows how your edited code will look.
Verifying your first sketch
Now that you have your first complete sketch in the IDE, how do you confirm that your microcontroller will understand it? You do this by clicking on the easy-to-locate Verify button, marked with a small tick. You will now see that the IDE informs you "Done compiling", as shown in the screenshot below. It means that you have successfully written and verified your first code. Congratulations, you did it!
Saving your first sketch
As we learnt above, it is very important to save your work. We will now learn the steps to make sure that your work does not get lost. Now that you have your first code inside your IDE, click on File | Save As. The following screenshot shows how to save the sketch. Give an appropriate name to your project file, and save it just like you would save your files from Paintbrush or any other software that you use. The file will be saved in the .ino format.
Accessing your first sketch
Open the folder where you saved the sketch. Double-click on the .ino file. The program will open in a new window of the IDE. The following screenshot has been taken on macOS; the window will look slightly different on a Linux or Windows system.
Summary
Now we know about systems and how logic is used to solve problems. We can write and modify simple code. We also know the basics of the Arduino IDE and have studied how to verify, save, and access your program.
Resources for Article:
Further resources on this subject: Getting Started with Arduino [article] Connecting Arduino to the Web [article] Functions with Arduino [article]
article-image-introduction-network-security
Packt
06 Apr 2017
18 min read
Save for later

Introduction to Network Security

In this article by Warun Levesque, Michael McLafferty, and Arthur Salmon, the authors of the book Applied Network Security, we will be covering the following topics, which will give an introduction to network security: Murphy's law The definition of a hacker and the types The hacking process Recent events and statistics of network attacks Security for individual versus company Mitigation against threats Building an assessment This world is changing rapidly with advancing network technologies. Unfortunately, sometimes the convenience of technology can outpace its security and safety. Technologies like the Internet of things is ushering in a new era of network communication. We also want to change the mindset of being in the field of network security. Most current cyber security professionals practice defensive and passive security. They mostly focus on mitigation and forensic tactics to analyze the aftermath of an attack. We want to change this mindset to one of Offensive security. This article will give insight on how a hacker thinks and what methods they use. Having knowledge of a hacker's tactics, will give the reader a great advantage in protecting any network from attack. (For more resources related to this topic, see here.) Murphy's law Network security is much like Murphy's law in the sense that, if something can go wrong it will go wrong. To be successful at understanding and applying network security, a person must master the three Ps. The three Ps are, persistence, patience, and passion. A cyber security professional must be persistent in their pursuit of a solution to a problem. Giving up is not an option. The answer will be there; it just may take more time than expected to find it. Having patience is also an important trait to master. When dealing with network anomalies, it is very easy to get frustrated. Taking a deep breath and keeping a cool head goes a long way in finding the correct solution to your network security problems. Finally, developing a passion for cyber security is critical to being a successful network security professional. Having that passion will drive you to learn more and evolve yourself on a daily basis to be better. Once you learn, then you will improve and perhaps go on to inspire others to reach similar aspirations in cyber security. The definition of a hacker and the types A hacker is a person who uses computers to gain unauthorized access to data. There are many different types of hackers. There are white hat, grey hat, and black hat hackers. Some hackers are defined for their intention. For example, a hacker that attacks for political reasons may be known as a hacktivist. A white hat hackers have no criminal intent, but instead focuses on finding and fixing network vulnerabilities. Often companies will hire a white hat hacker to test the security of their network for vulnerabilities. A grey hat hacker is someone who may have criminal intent but not often for personal gain. Often a grey hat will seek to expose a network vulnerability without the permission from the owner of the network. A black hat hacker is purely criminal. They sole objective is personal gain. Black hat hackers take advantage of network vulnerabilities anyway they can for maximum benefit. A cyber-criminal is another type of black hat hacker, who is motivated to attack for illegal financial gain. A more basic type of hacker is known as a script kiddie. A script kiddie is a person who knows how to use basic hacking tools but doesn't understand how they work. 
They often lack the knowledge to launch any kind of real attack, but can still cause problems on a poorly protected network. Hackers tools There are a range of many different hacking tools. A tool like Nmap for example, is a great tool for both reconnaissance and scanning for network vulnerabilities. Some tools are grouped together to make toolkits and frameworks, such as the Social Engineering Toolkit and Metasploit framework. The Metasploit framework is one of the most versatile and supported hacking tool frameworks available. Metasploit is built around a collection of highly effective modules, such as MSFvenom and provides access to an extensive database of exploits and vulnerabilities. There are also physical hacking tools. Devices like the Rubber Ducky and Wi-Fi Pineapple are good examples. The Rubber Ducky is a usb payload injector, that automatically injects a malicious virus into the device it's plugged into. The Wi-Fi Pineapple can act as a rogue router and be used to launch man in the middle attacks. The Wi-Fi Pineapple also has a range of modules that allow it to execute multiple attack vectors. These types of tools are known as penetration testing equipment. The hacking process There are five main phases to the hacking process: Reconnaissance: The reconnaissance phase is often the most time consuming. This phase can last days, weeks, or even months sometimes depending on the target. The objective during the reconnaissance phase is to learn as much as possible about the potential target. Scanning: In this phase the hacker will scan for vulnerabilities in the network to exploit. These scans will look for weaknesses such as, open ports, open services, outdated applications (including operating systems), and the type of equipment being used on the network. Access: In this phase the hacker will use the knowledge gained in the previous phases to gain access to sensitive data or use the network to attack other targets. The objective of this phase is to have the attacker gain some level of control over other devices on the network. Maintaining access: During this phase a hacker will look at various options, such as creating a backdoor to maintain access to devices they have compromised. By creating a backdoor, a hacker can maintain a persistent attack on a network, without fear of losing access to the devices they have gained control over. Although when a backdoor is created, it increases the chance of a hacker being discovered. Backdoors are noisy and often leave a large footprint for IDS to follow. Covering your tracks: This phase is about hiding the intrusion of the network by the hacker as to not alert any IDS that may be monitoring the network. The objective of this phase is to erase any trace that an attack occurred on the network. Recent events and statistics of network attacks The news has been full of cyber-attacks in recent years. The number and scale of attacks are increasing at an alarming rate. It is important for anyone in network security to study these attacks. Staying current with this kind of information will help in defending your network from similar attacks. Since 2015, the medical and insurance industry have been heavily targeted for cyber-attacks. On May 5th, 2015 Premera Blue Cross was attacked. This attack is said to have compromised at least 11 million customer accounts containing personal data. The attack exposed customer's names, birth dates, social security numbers, phone numbers, bank account information, mailing, and e-mail addresses. 
Another attack that was on a larger scale, was the attack on Anthem. It is estimated that 80 million personal data records were stolen from customers, employees, and even the Chief Executive Officer of Anthem. Another more infamous cyber-attack recently was the Sony hack. This hack was a little different than the Anthem and Blue Cross attacks, because it was carried out by hacktivist instead of cyber-criminals. Even though both types of hacking are criminal, the fundamental reasoning and objectives of the attacks are quite different. The objective in the Sony attack was to disrupt and embarrass the executives at Sony as well as prevent a film from being released. No financial data was targeted. Instead the hackers went after personal e-mails of top executives. The hackers then released the e-mails to the public, causing humiliation to Sony and its executives. Many apologies were issued by Sony in the following weeks of the attack. Large commercial retailers have also been a favorite target for hackers. An attack occurred against Home Depot in September 2014. That attack was on a large scale. It is estimated that over 56 million credit cards were compromised during the Home Depot attack. A similar attack but on a smaller scale was carried out against Staples in October 2014. During this attack, over 1.4 million credit card numbers were stolen. The statistics on cyber security attacks are eye opening. It is estimated by some experts that cybercrime has a worldwide cost of 110 billion dollars a year. In a given year, over 15 million Americans will have their identified stolen through cyber-attacks, it is also estimated that 1.5 million people fall victim to cybercrime every day. These statistics are rapidly increasing and will continue to do so until more people take an active interest in network security. Our defense The baseline for preventing a potential security issues typically begins with hardening the security infrastructure, including firewalls, DMZ, and physical security platform. Entrusting only valid sources or individuals with personal data and or access to that data. That also includes being compliant with all regulations that apply to a given situation or business. Being aware of the types of breaches as well as your potential vulnerabilities. Also understanding is an individual or an organization a higher risk target for attacks. The question has to be asked, does one's organization promote security? This is done both at the personal and the business level to deter cyber-attacks? After a decade of responding to incidents and helping customers recover from and increase their resilience against breaches. Organization may already have a security training and awareness (STA) program, or other training and program could have existed. As the security and threat landscape evolves organizations and individuals need to continually evaluate practices that are required and appropriate for the data they collect, transmit, retain, and destroy. Encryption of data at rest/in storage and in transit is a fundamental security requirement and the respective failure is frequently being cited as the cause for regulatory action and lawsuits. Enforce effective password management policies. Least privilege user access (LUA) is a core security strategy component, and all accounts should run with as few privileges and access levels as possible. Conduct regular security design and code reviews including penetration tests and vulnerability scans to identify and mitigate vulnerabilities. 
Require e-mail authentication on all inbound and outbound mail servers to help detect malicious e-mail including spear phishing and spoofed e-mail. Continuously monitor in real-time the security of your organization's infrastructure including collecting and analyzing all network traffic, and analyzing centralized logs (including firewall, IDS/IPS, VPN, and AV) using log management tools, as well as reviewing network statistics. Identify anomalous activity, investigate, and revise your view of anomalous activity accordingly. User training would be the biggest challenge, but is arguably the most important defense. Security for individual versus company One of the fundamental questions individuals need to ask themselves, is there a difference between them the individual and an organization? Individual security is less likely due to attack service area. However, there are tools and sites on the internet that can be utilized to detect and mitigate data breaches for both.https://haveibeenpwned.com/ or http://map.norsecorp.com/ are good sites to start with. The issue is that individuals believe they are not a target because there is little to gain from attacking individuals, but in truth everyone has the ability to become a target. Wi-Fi vulnerabilities Protecting wireless networks can be very challenging at times. There are many vulnerabilities that a hacker can exploit to compromise a wireless network. One of the basic Wi-Fi vulnerabilities is broadcasting the Service Set Identifier (SSID) of your wireless network. Broadcasting the SSID makes the wireless network easier to find and target. Another vulnerability in Wi-Fi networks is using Media Access Control (MAC) addressesfor network authentication. A hacker can easily spoof or mimic a trusted MAC address to gain access to the network. Using weak encryption such as Wired Equivalent Privacy (WEP) will make your network an easy target for attack. There are many hacking tools available to crack any WEP key in under five minutes. A major physical vulnerability in wireless networks are the Access Points (AP). Sometimes APs will be placed in poor locations that can be easily accessed by a hacker. A hacker may install what is called a rogue AP. This rogue AP will monitor the network for data that a hacker can use to escalate their attack. Often this tactic is used to harvest the credentials of high ranking management personnel, to gain access to encrypted databases that contain the personal/financial data of employees and customers or both. Peer-to-peer technology can also be a vulnerability for wireless networks. A hacker may gain access to a wireless network by using a legitimate user as an accepted entry point. Not using and enforcing security policies is also a major vulnerability found in wireless networks. Using security tools like Active Directory (deployed properly) will make it harder for a hacker to gain access to a network. Hackers will often go after low hanging fruit (easy targets), so having at least some deterrence will go a long way in protecting your wireless network. Using Intrusion Detection Systems (IDS) in combination with Active Directory will immensely increase the defense of any wireless network. Although the most effective factor is, having a well-trained and informed cyber security professional watching over the network. The more a cyber security professional (threat hunter) understands the tactics of a hacker, the more effective that Threat hunter will become in discovering and neutralizing a network attack. 
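As a small illustration of the kind of check a threat hunter might automate for the rogue AP problem described above, the following sketch watches Wi-Fi beacon frames and flags access points that are not on an approved list. It is not taken from this book, and it assumes Python with the Scapy library installed, a wireless card already in monitor mode, and an interface named wlan0mon:

# Hypothetical rogue-AP watcher; the approved BSSID list and interface name are assumptions.
from scapy.all import sniff, Dot11, Dot11Beacon, Dot11Elt

APPROVED_BSSIDS = {"aa:bb:cc:dd:ee:ff"}  # replace with the MAC addresses of your real APs

def check_beacon(pkt):
    if pkt.haslayer(Dot11Beacon):
        bssid = pkt[Dot11].addr2
        ssid = pkt[Dot11Elt].info.decode(errors="ignore")
        if bssid and bssid.lower() not in APPROVED_BSSIDS:
            print("Possible rogue AP: SSID=%r BSSID=%s" % (ssid, bssid))

# Needs root privileges and a monitor-mode interface to capture 802.11 frames.
sniff(iface="wlan0mon", prn=check_beacon, store=False)

In practice, alerts like this would be fed into the IDS and centralized log management tools mentioned earlier rather than just printed to the console.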
Although there are many challenges in protecting a wireless network, with the proper planning and deployment those challenges can be overcome.
Knowns and unknowns
The toughest thing about unknown risks to security is that they are unknown. Unless they are found, they can stay hidden. A common practice to determine an unknown risk is to identify all the known risks and attempt to mitigate them as well as possible. There are many sites available that can assist in this venture. The most helpful are reports from CVE sites that identify vulnerabilities.
False positives
      | Positive                   | Negative
True  | TP: correctly identified   | TN: correctly rejected
False | FP: incorrectly identified | FN: incorrectly rejected
As it relates to detection, there are four situations that exist in this context, corresponding to the relation between the result of the detection and the true nature of the analyzed event. Each of the situations mentioned in the preceding table is outlined as follows: True positive (TP): The analyzed event is correctly classified as an intrusion or as harmful/malicious. For example, a network security administrator enters their credentials into the Active Directory server and is granted administrator access. True negative (TN): The analyzed event is correctly classified and correctly rejected. For example, an attacker uses a port like 4444 to communicate with a victim's device. An intrusion detection system detects network traffic on the unauthorized port and alerts the cyber security team to this potentially malicious activity. The cyber security team quickly closes the port and isolates the infected device from the network. False positive (FP): The analyzed event is innocuous or otherwise clean from a security perspective; however, the system classifies it as malicious or harmful. For example, a user types their password into a website's login text field. Instead of being granted access, the user is flagged for an SQL injection attempt by input sanitation. This is often caused when input sanitation is misconfigured. False negative (FN): The analyzed event is malicious, but it is classified as normal/innocuous. For example, an attacker inputs an SQL injection string into a text field found on a website to gain unauthorized access to database information. The website accepts the SQL injection as normal user behavior and grants access to the attacker. As it relates to detection, having systems correctly identify the given situations is paramount.
Mitigation against threats
There are many threats that a network faces. New network threats are emerging all the time. As a network security professional, it would be wise to have a good understanding of effective mitigation techniques. For example, a hacker using a packet sniffer can be mitigated by only allowing the network admin to run a network analyzer (packet sniffer) on the network. A packet sniffer can usually detect another packet sniffer on the network right away, although there are ways a knowledgeable hacker can disguise a packet sniffer as another piece of software. A hacker will not usually go to such lengths unless it is a highly secured target. It is alarming that most businesses do not monitor their network properly, or even at all. It is important for any business to have a business continuity/disaster recovery plan. This plan is intended to allow a business to continue to operate and recover from a serious network attack.
The most common deployment of the continuity/disaster recovery plan is after a DDoS attack. A DDoS attack could potentially cost a business or organization millions of dollars is lost revenue and productivity. One of the most effective and hardest to mitigate attacks is social engineering. All the most devastating network attacks have begun with some type of social engineering attack. One good example is the hack against Snapchat on February 26th, 2016. "Last Friday, Snapchat's payroll department was targeted by an isolated e-mail phishing scam in which a scammer impersonated our Chief Executive Officer and asked for employee payroll information," Snapchat explained in a blog post. Unfortunately, the phishing e-mail wasn't recognized for what it was — a scam — and payroll information about some current and former employees was disclosed externally. Socially engineered phishing e-mails, same as the one that affected Snapchat are common attack vectors for hackers. The one difference between phishing e-mails from a few years ago, and the ones in 2016 is, the level of social engineering hackers are putting into the e-mails. The Snapchat HR phishing e-mail, indicated a high level of reconnaissance on the Chief Executive Officer of Snapchat. This reconnaissance most likely took months. This level of detail and targeting of an individual (Chief Executive Officer) is more accurately know as a spear-phishing e-mail. Spear-phishing campaigns go after one individual (fish) compared to phishing campaigns that are more general and may be sent to millions of users (fish). It is like casting a big open net into the water and seeing what comes back. The only real way to mitigate against social engineering attacks is training and building awareness among users. By properly training the users that access the network, it will create a higher level of awareness against socially engineered attacks. Building an assessment Creating a network assessment is an important aspect of network security. A network assessment will allow for a better understanding of where vulnerabilities may be found within the network. It is important to know precisely what you are doing during a network assessment. If the assessment is done wrong, you could cause great harm to the network you are trying to protect. Before you start the network assessment, you should determine the objectives of the assessment itself. Are you trying to identify if the network has any open ports that shouldn't be? Is your objective to quantify how much traffic flows through the network at any given time or a specific time? Once you decide on the objectives of the network assessment, you will then be able to choose the type of tools you will use. Network assessment tools are often known as penetration testing tools. A person who employs these tools is known as a penetration tester or pen tester. These tools are designed to find and exploit network vulnerabilities, so that they can be fixed before a real attack occurs. That is why it is important to know what you are doing when using penetration testing tools during an assessment. Sometimes network assessments require a team. It is important to have an accurate idea of the scale of the network before you pick your team. In a large enterprise network, it can be easy to become overwhelmed by tasks to complete without enough support. Once the scale of the network assessment is complete, the next step would be to ensure you have written permission and scope from management. 
All parties involved in the network assessment must be clear on what can and cannot be done to the network during the assessment. After the assessment is completed, the last step is creating a report to educate concerned parties about the findings. Providing detailed information and solutions to vulnerabilities will help keep the network's defenses up to date. The report will also be able to determine whether there are any viruses lying dormant, waiting for an opportune time to attack the network. Network assessments should be conducted routinely and frequently to help ensure strong network security.
Summary
In this article we covered the fundamentals of network security. It began by explaining the importance of having network security and what should be done to secure the network. It also covered the different ways physical security can be applied. The importance of having security policies in place and wireless security was discussed. This article also spoke about wireless security policies and why they are important.
Resources for Article:
Further resources on this subject: API and Intent-Driven Networking [article] Deploying First Server [article] Point-to-Point Networks [article]

article-image-pros-and-cons-serverless-model
Packt
06 Apr 2017
22 min read
Save for later

Pros and Cons of the Serverless Model

In this article by Diego Zanon, author of the book Building Serverless Web Applications, we give you an introduction to the Serverless model and the pros and cons that you should consider before building a serverless application. (For more resources related to this topic, see here.) Introduction to the Serverless model Serverless can be a model, a kind of architecture, a pattern or anything else you prefer to call it. For me, serverless is an adjective, a word that qualifies a way of thinking. It's a way to abstract how the code that you write will be executed. Thinking serverless is to not think in servers. You code, you test, you deploy and that's (almost) enough. Serverless is a buzzword. You still need servers to run your applications, but you should not worry about them that much. Maintaining a server is none of your business. The focus is on the development, writing code, and not in the operation. DevOps is still necessary, although with a smaller role. You need to automate deployment and have at least a minimal monitoring of how your application is operating and costing, but you won't start or stop machines to match the usage and you'll neither replace failed instances nor apply security patches to the operating system. Thinking serverless A serverless solution is entirely request-driven. Every time that a user requests some information, a trigger will notify your cloud vendor to pick your code and execute it to retrieve the answer. In contrast, a traditional solution also works to answer requests, but the code is always up and running, consuming machine resources that were reserved specifically for you, even when no one is using your website. In a serverless architecture, it's not necessary to load the entire codebase into a running machine to process a single request. For a faster loading step, only the code that is necessary to answer the request is selected to run. This small piece of the solution is referenced to as a function. So we only run functions on demand. Although we call it simply as a function, it's usually a zipped package that contains a piece of code that runs as an entry point and all the modules that this code depends to execute. What makes the Serverless model so interesting is that you are only billed for the time that was needed to execute your function, which is usually measured in fractions of seconds, not hours of use. If no one is using your service, you pay nothing. Also, if you have a sudden peak of users accessing your app, the cloud service will load different instances to handle all simultaneous requests. If one of those cloud machines fails, another one will be made available automatically, without your interference. Serverless and PaaS Serverless is often confused with Platform as a Service (PaaS). PaaS is a kind of cloud computing model that allows developers to launch applications without worrying about the infrastructure. According to this definition, they have the same objective, which is true. Serverless is like a rebranding of PaaS or you can call it as the next generation of PaaS. The main difference between PaaS and Serverless is that in PaaS you don't manage machines, but you know that they exist. You are billed by provisioning them, even if there is no user actively browsing your website. In PaaS, your code is always running and waiting for new requests. In Serverless, there is a service that is listening for requests and will trigger your code to run only when necessary. This is reflected in your bill. 
You will pay only for the fractions of seconds that your code was executed and the number of requests that were made to this listener. IaaS and on-premises Besides PaaS, Serverless is frequently compared to Infrastructure as a Service (IaaS) and the on-premises solutions to expose its advantages. IaaS is another strategy to deploy cloud solutions where you hire virtual machines and is allowed to connect to them to configure everything that you need in the guest operating system. It gives you greater flexibility, but comes along with more responsibilities. You need to apply security patches, handle occasional failures and set up new servers to handle usage peaks. Also, you pay the same per hour if you are using 5% or 100% of the machine's CPU. On-premises are the traditional kind of solution where you buy the physical computers and run them inside your company. You get total flexibility and control with this approach. Hosting your own solution can be cheaper, but it happens only when your traffic usage is extremely stable. Over or under provisioning computers is so frequent that it's hard to have real gains using this approach, also when you add the risks and costs to hire a team to manage those machines. Cloud providers may look expensive, but several detailed use cases prove that the return on investment (ROI) is larger running on the cloud than on-premises. When using the cloud, you benefit from the economy of scale of many gigantic data centers. Running by your own exposes your business to a wide range of risks and costs that you'll never be able to anticipate. The main goals of serverless To define a service as serverless, it must have at least the following features: Scale as you need: There is no under or over provisioning Highly available: Fault-tolerant and always online Cost-efficient: Never pay for idle servers Scalability Regarding IaaS, you can achieve infinite scalability with any cloud service. You just need to hire new machines as your usage grows. You can also automate the process of starting and stopping servers as your demand changes. But this is not a fast way to scale. When you start a new machine, you usually need something like 5 minutes before it can be usable to process new requests. Also, as starting and stopping machines is costly, you only do this after you are certain that you need. So, your automated process will wait for some minutes to confirm that your demand has changed before taking any action. IaaS is able to handle well-behaved usage changes, but can't handle unexpected high peaks that happen after announcements or marketing campaigns. With serverless, your scalability is measured in seconds and not minutes. Besides being scalable, it's very fast to scale. Also, it scales per request, without needing to provision capacity. When you consider a high usage frequency in a scale of minutes, IaaS suffers to satisfy the needed capacity while serverless meets even higher usages in less time. In the following figure, the left graph shows how scalability occurs with IaaS. The right graph shows how well the demand can be satisfied using a serverless solution: With an on-premises approach, this is an even bigger problem. As the usage grows, new machines must be bought and prepared. However, increasing the infrastructure requires purchase orders to be created and approved, delay waiting the new servers to arrive and time for the team to configure and test them. 
It can take weeks to grow, or even months if the company is very big and requests many steps and procedures to be filled in. Availability A highly available solution is one that is fault-tolerant to hardware failures. If one machine goes out, you can keep running with a satisfactory performance. However, if you lose an entire data center due to a power outage, you need machines in another data center to keep the service online. It generally means duplicating your entire infrastructure by placing each half in a different data center. Highly available solutions are usually very expensive in IaaS and on-premises. If you have multiple machines to handle your workload, placing them in different physical places and running a load balance service can be enough. If one data center goes out, you keep the traffic in the remaining machines and scale to compensate. However, there are cases where you pay extra without using those machines. For example, if you have a huge relational database that scaled vertically, you will end up paying for another expensive machine just as a slave to keep the availability. Even for NoSQL databases, if you set a MongoDB replica set in a consistent model, you pay for instances that will act only as secondaries, without serving to alleviate read requests. Instead of running idle machines, you can set them in a cold start state, meaning that the machine is prepared, but is off to reduce costs. However, if you run a website that sells products or services, you can lose customers even in small downtimes. Cold start for web servers can take a few minutes to recover, but needs several minutes for databases. Considering these scenarios, you get high availability for free in serverless. The cost is already considered in what you pay to use. Another aspect of availability is how to handle Distributed Denial of Service (DDoS) attacks. When you receive a huge load of requests in a very short time, how do you handle it? There are some tools and techniques that help mitigate the problem, for example, blacklisting IPs that go over a specific request rate, but before those tools start to work, you need to scale the solution, and it needs to scale really fast to prevent the availability to be compromised. In this, again, serverless has the best scaling speed. Cost efficiency It's impossible to match the traffic usage with what you have provisioned. Considering IaaS or on-premises, as a rule of thumb, CPU and RAM usage must always be lower than 90% to consider the machine healthy, and it is desirable to have a CPU using less than 20% of the capacity on normal traffic. In this case, you are paying for 80% of the waste, where the capacity is in an idle state. Paying for computer resources that you don't use is not efficient. Many cloud providers advertise that you just pay for what you use, but they usually offer significant discounts when you provision for 24 hours of usage in long term (one year or more). This means that you pay for machines that will keep running even in very low traffic hours. Also, even if you want to shut down machines in hours with very low traffic, you need to keep at least a minimum infrastructure 24/7 to keep you web server and databases always online. Regarding high availability, you need extra machines to add redundancy. Again, it's a waste of resources. Another efficiency problem is related with the databases, especially relational ones. Scaling vertically is a very troublesome task, so relational databases are always provisioned considering max peaks. 
It means that you pay for an expensive machine when most of the time you don't need one. In serverless, you shouldn't worry about provisioning or idle times. You should pay exactly for the CPU and RAM time that is used, measured in fractions of seconds and not hours. If it's a serverless database, you need to store data permanently, so this represents a cost even if no one is using your system. However, storage is usually very cheap. The higher cost, which is related with the database engine that runs queries and manipulates data, will only be billed by the amount of time used without considering idle times. Running a serverless system continuously for one hour has a much higher cost than one hour in a traditional system. However, the difference is that it will never keep one machine with 100% for one hour straight. The cost efficiency of serverless is perceived clearer in websites with varying traffic and not in the ones with flat traffic. Pros We can list the following strengths for the Serverless model: Fast scalability High availability Efficient usage of resources Reduced operational costs Focus on business, not on infrastructure System security is outsourced Continuous delivery Microservices friendly Cost model is startup friendly Let's skip the first three benefits, since they were already covered in the previous section and take a look into the others. Reduced operational costs As the infrastructure is fully managed by the cloud vendor, it reduces the operational costs since you don't need to worry about hardware failures, neither applying security patches to the operating system, nor fixing network issues. It effectively means that you need to spend less sysadmin hours to keep your application running. Also, it helps reducing risks. If you make an investment to deploy a new service and that ends up as a failure, you don't need to worry about selling machines or how to dispose of the data center that you have built. Focus on business Lean software development advocates that you must spend time in what aggregates value to the final product. In a serverless project, the focus is on business. Infrastructure is a second-class citizen. Configuring a large infrastructure is a costly and time consuming task. If you want to validate an idea through a Minimum Viable Product (MVP), without losing time to market, consider using serverless to save time. There are tools that automate the deployment, such as the Serverless Framework, which helps the developer to launch a prototype with minimum effort. If the idea fails, infrastructure costs are minimized since there are no payments in advance. System security The cloud vendor is responsible to manage the security of the operating system, runtime, physical access, networking, and all related technologies that enable the platform to operate. The developer still needs to handle authentication, authorization, and code vulnerabilities, but the rest is outsourced to the cloud provider. It's a positive feature if you consider that a large team of specialists is focused to implement the best security practices and patches new bug fixes as soon as possible to serve hundreds of their customers. That's economy of scale. Continuous delivery Serverless is based on breaking a big project into dozens of packages, each one represented by a top-level function that handles requests. 
Deploying a new version of a function means uploading a ZIP package to replacing the previous one and updating the endpoint configuration that specifies how this function can be triggered. Executing this task manually, for dozens of functions, is an exhaustive task. Automation is a must-have feature when working in a serverless project. For this task, we can use the Serverless Framework, which helps developers to manage and organize solutions, making a deployment task as simple as executing a one-line command. With automation, continuous delivery is a consequence that brings many benefits like the ability to deploy short development cycles and easier rollbacks anytime. Another related benefit when the deployment is automated is the creation of different environments. You can create a new test environment, which is an exact duplicate of the development environment, using simple commands. The ability to replicate the environment is very important to favor acceptance tests and deployment to production. Microservices friendly A microservices architecture is encouraged in a serverless project. As your functions are single units of deployment, you can have different teams working concurrently on different use cases. You can also use different programming languages in the same project and take advantage of emerging technologies or team skills. Cost model Suppose that you have built a serverless store. The average user will make some requests to see a few products and a few more requests to decide if they will buy something or not. In serverless, a single unit of code has a predictable time to execute for a given input. After collecting some data, you can predict how much a single user costs in average, and this unit cost will almost be constant as your application grows in usage. Knowing how much a single user cost and keeping this number fixed is very important for a startup. It helps to decide how much you need to charge for a service or earn through ads or sales to make a profit. In a traditional infrastructure, you need to make payments in advance, and scaling your application means increasing your capacity in steps. So, calculating the unit cost of a user is more difficult and it's a variable number. The following chart makes this difference easier to understand: Cons Serverless is great, but no technology is a silver bullet. You should be aware of the following issues: Higher latency Constraints Hidden inefficiencies Vendor dependency Debugging Atomic deploys Uncertainties We will discuss these drawbacks in detail now. Higher latency Serverless is request-driven and your code is not running all the time. When a request is made, it triggers a service that finds your function, unzips the package, loads into a container, and makes it available to be executed. The problem is that those steps takes time—up to a few hundreds of milliseconds. This issue is called cold start delay and is a trade-off that exists between serverless cost-effective model and a lower latency of traditional hosting. There are some solutions to fix this performance problem. For example, you can configure your function to reserve more RAM memory. It gives a faster start and overall performance. The programming language is also important. Java has a much higher cold start time then JavaScript (Node.js). Another solution is to benefit from the fact that the cloud provider may cache the loaded code, which means that the first execution will have a delay but further requests will benefit by a smaller latency. 
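One practical way to benefit from that caching is to do expensive initialization at module level, outside the handler function, so it runs only on a cold start and is reused by warm invocations. The following minimal Node.js sketch is an illustration of the pattern, not code from this book (the "expensive" table build simply stands in for things like opening database connections):

// Module-level work runs once per container (on a cold start) and is cached afterwards.
const startedAt = Date.now();      // captured when the container first loads this module

function buildTable() {            // stand-in for expensive initialization (assumption)
  const table = {};
  for (let i = 0; i < 100000; i++) {
    table[i] = i * i;
  }
  return table;
}

const lookupTable = buildTable();  // built once, reused on every warm invocation

exports.handler = function (event, context, callback) {
  // Warm invocations reuse startedAt and lookupTable, so they skip the slow setup.
  callback(null, { containerStartedAt: startedAt, squareOfTen: lookupTable[10] });
};

The first request to a new container still pays the cold start price, but subsequent requests served by the same container respond noticeably faster.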
You can optimize a serverless function by aggregating a large number of functionalities into a single function. The consequence is that this package will be executed with a higher frequency and will frequently skip the cold start issue. The problem is that a big package will take more time to load and provoke a higher first start time. At a last resort, you could schedule another service to ping your functions periodically, like one time per couple of minutes, to prevent them to be put to sleep. It will add costs, but remove the cold start problem. There is also a concept of serverless databases that references services where the database is fully managed by the vendor and it charges only the storage and the time to execute the database engine. Those solutions are wonderful, but they add even more delay for your requests when compared with traditional databases. Proceed with caution when selecting those. Constraints If you go serverless, you need to know what the vendor constraints are. For example, on Amazon Web Services (AWS), you can't run a Lambda function for more than 5 minutes. It makes sense because if you're doing this, you are using it wrong. Serverless was designed to be cost efficient in short bursts. For constant and predictable processing, it will be expensive. Another constraint on AWS Lambda is the number of concurrent executions across all functions within a given region. Amazon limits this to 100. Suppose that your functions need 100 milliseconds in average to execute. In this scenario, you can handle up to 1000 users per second. The reasoning behind this restriction is to avoid excessive costs due to programming errors that may create potential runways or recursive iterations. AWS Lambda has a default limit of 100 concurrent executions. However, you can file a case into AWS Support Center to raise this limit. If you say that your application is ready for production and that you understand the risks, they will happily increase this value. When monitoring your Lambda functions using Amazon CloudWatch, there is an option called Throttles. Each invocation that exceeds the safety limit of concurrent calls is counted as one throttle. Hidden inefficiencies Some people are calling serverless as a NoOps solution. That's not true. DevOps is still necessary. You don't need to worry much about servers because they are second-class citizens and the focus is on your business. However, adding metrics and monitoring your applications will always be a good practice. It's so easy to scale that a specific function may be deployed with a poor performance that takes much more time than necessary and remains unnoticed forever because no one is monitoring the operation. Also, over or under provisioning is also possible (in a smaller sense) since you need to configure your function setting the amount of RAM memory that it will reserve and the threshold to timeout the execution. It's a very different scale of provisioning, but you need to keep it in mind to avoid mistakes. Vendor dependency When you build a serverless solution, you trust your business in a third-party vendor. You should be aware that companies fail and you can suffer downtimes, security breaches, and performance issues. Also, the vendor may change the billing model, increase costs, introduce bugs into their services, have poor documentations, modify an API forcing you to upgrade, and terminate services. A whole bunch of bad things may happen. 
What you need to weigh is whether it's worth trusting in another company or making a big investment to build all by yourself. You can mitigate these problems by doing a market search before selecting a vendor. However, you still need to count on luck. For example, Parse was a vendor that offered managed services with really nice features. It was bought by Facebook in 2013, which gave more reliability due to the fact that it was backed by a big company. Unfortunately, Facebook decided to shutdown all servers in 2016 giving one year of notice for customers to migrate to other vendors. Vendor lock-in is another big issue. When you use cloud services, it's very likely that one specific service has a completely different implementation than another vendor, making those two different APIs. You need to rewrite code in case you decide to migrate. It's already a common problem. If you use a managed service to send e-mails, you need to rewrite part of your code before migrating to another vendor. What raises a red flag here is that a serverless solution is entirely based into one vendor and migrating the entire codebase can be much more troublesome. To mitigate this problem, some tools like the Serverless Framework are moving to include multivendor support. Currently, only AWS is supported but they expect to include Microsoft, Google, and IBM clouds in the future, without requiring code rewrites to migrate. Multivendor support represents safety for your business and gives power to competitiveness. Debugging Unit testing a serverless solution is fairly simple because any code that your functions relies on can be separated into modules and unit tested. Integration tests are a little bit more complicated because you need to be online to test with external services. You can build stubs, but you lose some fidelity in your tests. When it comes to debugging to test a feature or fix an error, it's a whole different problem. You can't hook into an external service and make slow processing steps to see how your code behaves. Also, those serverless APIs are not open source, so you can't run them in-house for testing. All you have is the ability to log steps, which is a slow debugging approach, or you can extract the code and adapt it to host into your own servers and make local calls. Atomic deploys Deploying a new version of a serverless function is easy. You update the code and the next time that a trigger requests this function, your newly deployed code will be selected to run. This means that, for a brief moment, two instances of the same function can be executed concurrently with different implementations. Usually, that's not a problem, but when you deal with persistent storage and databases, you should be aware that a new piece of code can insert data into a format that an old version can't understand. Also, if you want to deploy a function that relies on a new implementation of another function, you need to be careful in the order that you deploy those functions. Ordering is often not secured by the tools that automate the deployment process. The problem here is that current serverless implementations consider that deployment is an atomic process for each function. You can't batch deploy a group of functions atomically. You can mitigate this issue by disabling the event source while you deploy a specific group, but that means introducing a downtime into the deployment process, or you can use a monolith approach instead of a microservices architecture for serverless applications. 
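To make the deployment workflow discussed above more concrete, here is a minimal Serverless Framework configuration. It is an illustrative sketch rather than a listing from this book; the service name, runtime version, and handler paths are assumptions:

# serverless.yml -- illustrative only; names, paths, and the runtime version are assumptions
service: online-store

provider:
  name: aws
  runtime: nodejs6.10

functions:
  listProducts:
    handler: handlers/products.list     # handlers/products.js must export a 'list' function
    events:
      - http:
          path: products
          method: get
  checkout:
    handler: handlers/checkout.create   # handlers/checkout.js must export a 'create' function
    events:
      - http:
          path: checkout
          method: post

With this file in place, serverless deploy publishes the whole service in one command, while serverless deploy function --function checkout updates a single function, which is exactly where the ordering concerns described above come into play.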
Uncertainties
Serverless is still a pretty new concept. Early adopters are braving this field, testing what works and which kinds of patterns and technologies can be used. Emerging tools are defining the development process. Vendors are releasing and improving new services. There are high expectations for the future, but the future hasn't come yet. Some uncertainties still worry developers when it comes to building large applications. Being a pioneer can be rewarding, but risky. Technical debt is a concept that compares software development with finances. The easiest solution in the short run is not always the best overall solution. When you make a bad decision in the beginning, you pay later with extra hours to fix it. Software is not perfect. Every single architecture has pros and cons that add technical debt in the long run. The question is: how much technical debt does serverless add to the software development process? Is it more, less, or equivalent to that of the kind of architecture that you are using today?
Summary
In this article, you learned about the Serverless model and how it is different from traditional approaches. You already know the main benefits and advantages that it may offer for your next application. Also, you are aware that no technology is a silver bullet. You know what kinds of problems you may have with serverless and how to mitigate some of them.
Resources for article
Refer to Building Serverless Architectures, available at https://www.packtpub.com/application-development/building-serverless-architectures for further resources on this subject.
Resources for Article:
Further resources on this subject: Deployment and DevOps [article] Customizing Xtext Components [article] Heads up to MvvmCross [article]

article-image-using-raspberry-pi-camera-module
Packt
06 Apr 2017
6 min read
Save for later

Using the Raspberry Pi Camera Module

This article by Shervin Emami, co-author of the book, Mastering OpenCV 3 - Second Edition, explains how to use the Raspberry Pi Camera Module for your Cartoonifier and Skin changer applications. While using a USB webcam on Raspberry Pi has the convenience of supporting identical behavior & code on desktop as on embedded device, you might consider using one of the official Raspberry Pi Camera Modules (referred to as the RPi Cams). They have some advantages and disadvantages over USB webcams: (For more resources related to this topic, see here.) RPi Cams uses the special MIPI CSI camera format, designed for smartphone cameras to use less power, smaller physical size, faster bandwidth, higher resolutions, higher framerates, and reduced latency, compared to USB. Most USB2.0 webcams can only deliver 640x480 or 1280x720 30 FPS video, since USB2.0 is too slow for anything higher (except for some expensive USB webcams that perform onboard video compression) and USB3.0 is still too expensive. Whereas smartphone cameras (including the RPi Cams) can often deliver 1920x1080 30 FPS or even Ultra HD / 4K resolutions. The RPi Cam v1 can in fact deliver up to 2592x1944 15 FPS or 1920x1080 30 FPS video even on a $5 Raspberry Pi Zero, thanks to the use of MIPI CSI for the camera and a compatible video processing ISP & GPU hardware inside the Raspberry Pi. The RPi Cam also supports 640x480 in 90 FPS mode (such as for slow-motion capture), and this is quite useful for real-time computer vision so you can see very small movements in each frame, rather than large movements that are harder to analyze. However the RPi Cam is a plain circuit board that is highly sensitive to electrical interference, static electricity or physical damage (simply touching the small orange flat cable with your finger can cause video interference or even permanently damage your camera!). The big flat white cable is far less sensitive but it is still very sensitive to electrical noise or physical damage. The RPi Cam comes with a very short 15cm cable. It's possible to buy third-party cables on eBay with lengths between 5cm to 1m, but cables 50cm or longer are less reliable, whereas USB webcams can use 2m to 5m cables and can be plugged into USB hubs or active extension cables for longer distances. There are currently several different RPi Cam models, notably the NoIR version that doesn't have an internal Infrared filter, therefore a NoIR camera can easily see in the dark (if you have an invisible Infrared light source), or see Infrared lasers or signals far clearer than regular cameras that include an Infrared filter inside them. There are also 2 different versions of RPi Cam (shown above): RPi Cam v1.3 and RPi Cam v2.1, where the v2.1 uses a wider angle lens with a Sony 8 Mega-Pixel sensor instead of a 5 Mega-Pixel OmniVision sensor, and has better support for motion in low lighting conditions, and adds support for 3240x2464 15 FPS video and potentially upto 120 FPS video at 720p. However, USB webcams come in thousands of different shapes & versions, making it easy to find specialized webcams such as waterproof or industrial-grade webcams, rather than requiring you to create your own custom housing for an RPi Cam. IP Cameras are also another option for a camera interface that can allow 1080p or higher resolution videos with Raspberry Pi, and IP cameras support not just very long cables, but potentially even work anywhere in the world using Internet. 
But IP cameras aren't quite as easy to interface with OpenCV as USB webcams or the RPi Cam. In the past, RPi Cams and the official drivers weren't directly compatible with OpenCV; you often had to use custom drivers and modify your code in order to grab frames from RPi Cams, but it's now possible to access an RPi Cam in OpenCV in the exact same way as a USB webcam! Thanks to recent improvements in the v4l2 drivers, once you load the v4l2 driver the RPi Cam will appear as the file /dev/video0 or /dev/video1, like a regular USB webcam. So traditional OpenCV webcam code such as cv::VideoCapture(0) will be able to use it just like a webcam.

Installing the Raspberry Pi Camera Module driver

First let's temporarily load the v4l2 driver for the RPi Cam to make sure our camera is plugged in correctly:

sudo modprobe bcm2835-v4l2

If the command failed (that is, it printed an error message to the console, it froze, or it returned a number other than 0), then perhaps your camera is not plugged in correctly. Shut down, then unplug power from your RPi and try attaching the flat white cable again, looking at photos on the Web to make sure it's plugged in the correct way around. If it is the correct way around, it's possible the cable wasn't fully inserted before you closed the locking tab on the RPi. Also check whether you forgot to click Enable Camera when configuring your Raspberry Pi earlier, using the sudo raspi-config command.

If the command worked (that is, the command returned 0 and no error was printed to the console), then we can make sure the v4l2 driver for the RPi Cam is always loaded on bootup by adding it to the bottom of the /etc/modules file:

sudo nano /etc/modules
# Load the Raspberry Pi Camera Module v4l2 driver on bootup:
bcm2835-v4l2

After you save the file and reboot your RPi, you should be able to run ls /dev/video* to see a list of cameras available on your RPi. If the RPi Cam is the only camera plugged into your board, you should see it as the default camera (/dev/video0); if you also have a USB webcam plugged in, then it will be either /dev/video0 or /dev/video1.

Let's test the RPi Cam using the starter_video sample program:

cd ~/opencv-3.*/samples/cpp
DISPLAY=:0 ./starter_video 0

If it's showing the wrong camera, try DISPLAY=:0 ./starter_video 1. Now that we know the RPi Cam is working in OpenCV, let's try Cartoonifier!

cd ~/Cartoonifier
DISPLAY=:0 ./Cartoonifier 0

(or DISPLAY=:0 ./Cartoonifier 1 for the other camera). A minimal capture-loop sketch is included after the resource list below. Resources for Article: Further resources on this subject: Video Surveillance, Background Modeling [article] Performance by Design [article] Getting started with Android Development [article]
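The following is a minimal sketch of the kind of capture-and-display loop that starter_video and Cartoonifier build on, shown here so you can verify the RPi Cam from your own C++ code. It is not code from the book: the window title and the requested resolution are illustrative assumptions, and the property constants use the OpenCV 3 names.

#include <opencv2/opencv.hpp>
#include <iostream>

int main() {
    // Open camera 0 (/dev/video0); with the bcm2835-v4l2 driver loaded,
    // the RPi Cam is accessed exactly like a USB webcam.
    cv::VideoCapture camera(0);
    if (!camera.isOpened()) {
        std::cerr << "ERROR: could not open camera 0" << std::endl;
        return 1;
    }
    // Request a modest resolution; the driver will pick the closest mode.
    camera.set(cv::CAP_PROP_FRAME_WIDTH, 640);
    camera.set(cv::CAP_PROP_FRAME_HEIGHT, 480);

    cv::Mat frame;
    while (true) {
        camera >> frame;                   // grab the next frame
        if (frame.empty()) break;          // read error or camera unplugged
        cv::imshow("RPi Cam test", frame);
        if (cv::waitKey(20) == 27) break;  // press Esc to quit
    }
    return 0;
}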

Building Components Using Angular

Packt
06 Apr 2017
11 min read
In this article by Shravan Kumar Kasagoni, the author of the book Angular UI Development, we will learn how to use new features of Angular framework to build web components. After going through this article you will understand the following: What are web components How to setup the project for Angular application development Data binding in Angular (For more resources related to this topic, see here.) Web components In today's web world if we need to use any of the UI components provided by libraries like jQuery UI, YUI library and so on. We write lot of imperative JavaScript code, we can't use them simply in declarative fashion like HTML markup. There are fundamental problems with the approach. There is no way to define custom HTML elements to use them in declarative fashion. The JavaScript, CSS code inside UI components can accidentally modify other parts of our web pages, our code can also accidentally modify UI components, which is unintended. There is no standard way to encapsulate code inside these UI components. Web Components provides solution to all these problems. Web Components are set of specifications for building reusable UI components. Web Components specifications is comprised of four parts: Templates: Allows us to declare fragments of HTML that can be cloned and inserted in the document by script Shadow DOM: Solves DOM tree encapsulation problem Custom elements: Allows us to define custom HTML tags for UI components HTML imports: Allows to us add UI components to web page using import statement More information on web components can be found at: https://www.w3.org/TR/components-intro/. Component are the fundament building blocks of any Angular application. Components in Angular are built on top of the web components specification. Web components specification is still under development and might change in future, not all browsers supports it. But Angular provides very high abstraction so that we don't need to deal with multiple technologies in web components. Even if specification changes Angular can take care of internally, it provides much simpler API to write web components. Getting started with Angular We know Angular is completely re-written from scratch, so everything is new in Angular. In this article we will discuss few important features like data binding, new templating syntax and built-in directives. We are going use more practical approach to learn these new features. In the next section we are going to look at the partially implemented Angular application. We will incrementally use Angular new features to implement this application. Follow the instruction specified in next section to setup sample application. Project Setup Here is a sample application with required Angular configuration and some sample code. Application Structure Create directory structure, files as mentioned below and copy the code into files from next section. Source Code package.json We are going to use npm as our package manager to download libraries and packages required for our application development. Copy the following code to package.json file. 
{ "name": "display-data", "version": "1.0.0", "description": "", "main": "index.js", "scripts": { "tsc": "tsc", "tsc:w": "tsc -w", "lite": "lite-server", "start": "concurrent "npm run tsc:w" "npm run lite" " }, "author": "Shravan", "license": "ISC", "dependencies": { "angular2": "^2.0.0-beta.1", "es6-promise": "^3.0.2", "es6-shim": "^0.33.13", "reflect-metadata": "^0.1.2", "rxjs": "^5.0.0-beta.0", "systemjs": "^0.19.14", "zone.js": "^0.5.10" }, "devDependencies": { "concurrently": "^1.0.0", "lite-server": "^1.3.2", "typescript": "^1.7.5" } } The package.json file holds metadata for npm, in the preceding code snippet there are two important sections: dependencies: It holds all the packages required for an application to run devDependencies: It holds all the packages required only for development Once we add the preceding package.json file to our project we should run the following command at the root of our application. $ npm install The preceding command will create node_modules directory in the root of project and downloads all the packages mentioned in dependencies, devDependencies sections into node_modules directory. There is one more important section, that is scripts. We will discuss about scripts section, when we are ready to run our application. tsconfig.json Copy the below code to tsconfig.json file. { "compilerOptions": { "target": "es5", "module": "system", "moduleResolution": "node", "sourceMap": true, "emitDecoratorMetadata": true, "experimentalDecorators": true, "removeComments": false, "noImplicitAny": false }, "exclude": [ "node_modules" ] } We are going to use TypeScript for developing our Angular applications. The tsconfig.json file is the configuration file for TypeScript compiler. Options specified in this file are used while transpiling our code into JavaScript. This is totally optional, if we don't use it TypeScript compiler use are all default flags during compilation. But this is the best way to pass the flags to TypeScript compiler. Following is the expiation for each flag specified in tsconfig.json: target: Specifies ECMAScript target version: 'ES3' (default), 'ES5', or 'ES6' module: Specifies module code generation: 'commonjs', 'amd', 'system', 'umd' or 'es6' moduleResolution: Specifies module resolution strategy: 'node' (Node.js) or 'classic' (TypeScript pre-1.6) sourceMap: If true generates corresponding '.map' file for .js file emitDecoratorMetadata: If true enables the output JavaScript to create the metadata for the decorators experimentalDecorators: If true enables experimental support for ES7 decorators removeComments: If true, removes comments from output JavaScript files noImplicitAny: If true raise error if we use 'any' type on expressions and declarations exclude: If specified, the compiler will not compile the TypeScript files in the containing directory and subdirectories index.html Copy the following code to index.html file. 
<!DOCTYPE html> <html lang="en"> <head> <meta charset="UTF-8"> <title>Top 10 Fastest Cars in the World</title> <link rel="stylesheet" href="app/site.css"> <script src="node_modules/angular2/bundles/angular2-polyfills.js"></script> <script src="node_modules/systemjs/dist/system.src.js"></script> <script src="node_modules/rxjs/bundles/Rx.js"></script> <script src="node_modules/angular2/bundles/angular2.dev.js"></script> <script> System.config({ transpiler: 'typescript', typescriptOptions: {emitDecoratorMetadata: true}, map: {typescript: 'node_modules/typescript/lib/typescript.js'}, packages: { 'app' : { defaultExtension: 'ts' } } }); System.import('app/boot').then(null, console.error.bind(console)); </script> </head> <body> <cars-list>Loading...</cars-list> </body> </html> This is startup page of our application, it contains required angular scripts, SystemJS configuration for module loading. Body tag contains <cars-list> tag which renders the root component of our application. However, I want to point out one specific statement: The System.import('app/boot') statement will import boot module from app package. Physically it loading boot.js file under app folder. car.ts Copy the following code to car.ts file. export interface Car { make: string; model: string; speed: number; } We are defining a car model using TypeScript interface, we are going to use this car model object in our components. app.component.ts Copy the following code to app.component.ts file. import {Component} from 'angular2/core'; @Component({ selector: 'cars-list', template: '' }) export class AppComponent { public heading = "Top 10 Fastest Cars in the World"; } Important points about AppComponent class: The AppComponent class is our application root component, it has one public property named 'heading' The AppComponent class is decorated with @Component() function with selector, template properties in its configuration object The @Component() function is imported using ES2015 module import syntax from 'angular2/core' module in Angular library We are also exporting the AppComponent class as module using export keyword Other modules in application can also import the AppComponent class using module name (app.component – file name without extension) using ES2015 module import syntax boot.ts Copy the following code to boot.ts file. import {bootstrap} from 'angular2/platform/browser' import {AppComponent} from './app.component'; bootstrap(AppComponent); In this file we are importing bootstrap() function from 'angular2/platform/browser' module and the AppComponent class from 'app.component' module. Next we are invoking bootstrap() function with the AppComponent class as parameter, this will instantiate an Angular application with the AppComponent as root component. site.css Copy the following code to site.css file. * { font-family: 'Segoe UI Light', 'Helvetica Neue', 'Segoe UI', 'Segoe'; color: rgb(51, 51, 51); } This file contains some basic styles for our application. Working with data in Angular In any typical web application, we need to display data on a HTML page and read the data from input controls on a HTML page. In Angular everything is a component, HTML page is represented as template and it is always associated with a component class. Application data lives on component's class properties. Either push values to template or pull values from template, to do this we need to bind the properties of component class to the controls on the template. This mechanism is known as data binding. 
Data binding in angular allows us to use simple syntax to push or pull data. When we bind the properties of component class to the controls on the template, if the data on the properties changes, Angular will automatically update the template to display the latest data and vice versa. We can also control the direction of data flow (from component to template, from template to component). Displaying Data using Interpolation If we go back to our AppComponent class in sample application, we have heading property. We need to display this heading property on the template. Here is the revised AppComponent class: app/app.component.ts import {Component} from 'angular2/core'; @Component({ selector: 'cars-list', template: '<h1>{{heading}}</h1>' }) export class AppComponent { public heading = "Top 10 Fastest Cars in the World"; } In @Component() function we updated template property with expression {{heading}} surrounded by h1 tag. The double curly braces are the interpolation syntax in Angular. Any property on the class we need to display on the template, use the property name surrounded by double curly braces. Angular will automatically render the value of property on the browser screen. Let's run our application, go to command line and navigate to the root of the application structure, then run the following command. $ npm start The preceding start command is part of scripts section in package.json file. It is invoking two other commands npm run tsc:w, npm run lite. npm run tsc:w: This command is performing the following actions: It is invoking TypeScript compiler in watch mode TypeScript compiler will compile all our TypeScript files to JavaScript using configuration mentioned in tsconfig.json TypeScript compiler will not exit after the compilation is over, it will wait for changes in TypeScript files Whenever we modify any TypeScript file, on the fly compiler will compile them to JavaScript npm run lite: This command will start a lite weight Node.js web server and launches our application in browser Now we can continue to make the changes in our application. Changes are detected and browser will refresh automatically with updates. Output in the browser: Let's further extend this simple application, we are going to bind the heading property to a textbox, here is revised template: template: ` <h1>{{heading}}</h1> <input type="text" value="{{heading}}"/> ` If we notice the template it is a multiline string and it is surrounded by ` (backquote/ backtick) symbols instead of single or double quotes. The backtick (``) symbols are new multi-line string syntax in ECMAScript 2015. We don't need start our application again, as mentioned earlier it will automatically refresh the browser with updated output until we stop 'npm start' command is at command line. Output in the browser: Now textbox also displaying the same value in heading property. Let's change the value in textbox by typing something, then hit the tab button. We don't see any changes happening on the browser. But as mentioned earlier in data binding whenever we change the value of any control on the template, which is bind to a property of component class it should update the property value. Then any other controls bind to same property should also display the updated value. In browser h1 tag should also display the same text whatever we type in textbox, but it won't happen. Summary We started this article by covering introduction to web components. Next we discussed a sample application which is the foundation for this article. 
Then we discussed how to write components using new features in Angular, such as data binding and the new templating syntax, with plenty of examples. By the end of this article, you should have a good understanding of Angular's new concepts and should be able to write basic components. Resources for Article: Further resources on this subject: Get Familiar with Angular [article] Gearing Up for Bootstrap 4 [article] Angular's component architecture [article]

Creating Fabric Policies

Packt
06 Apr 2017
9 min read
In this article by Stuart Fordham, the author of the book Cisco ACI Cookbook, helps us to understand ACI and the APIC and also explains us how to create fabric policies. (For more resources related to this topic, see here.) Understanding ACI and the APIC ACI is for the data center. It is a fabric (which is just a fancy name for the layout of the components) that can span data centers using OTV or similar overlay technologies, but it is not for the WAN. We can implement a similar level of programmability on our WAN links through APIC-EM (Application Policy Infrastructure Controller Enterprise Module), which uses ISR or ASR series routers along with the APIC-EM virtual machine to control and program them. APIC and APIC-EM are very similar; just the object of their focus is different. APIC-EM is outside the scope of this book, as we will be looking at data center technologies. The APIC is our frontend. Through this, we can create and manage our policies, manage the fabric, create tenants, and troubleshoot. Most importantly, the APIC is not associated with the data path. If we lose the APIC for any reason, the fabric will continue to forward the traffic. To give you the technical elevator pitch, ACI uses a number of APIs (application programming interfaces) such as REST (Representational State Transfer) using languages such as JSON (JavaScript Object Notation) and XML (eXtensible Markup Language), as well as the CLI and the GUI to manage the fabric and other protocols such as OpFlex to supply the policies to the network devices. The first set (those that manage the fabric) are referred to as northbound protocols. Northbound protocols allow lower-level network components talk to higher-level ones. OpFlex is a southbound protocol. Southbound protocols (such as OpFlex and OpenFlow, which is another protocol you will hear in relation to SDN) allow the controllers to push policies down to the nodes (the switches). Figure 1 This is a very brief introduction to the how. Now let's look at the why. What does ACI give us that the traditional network does not? In a multi-tenant environment, we have defined goals. The primary purpose is that one tenant remain separate from another. We can achieve this in a number of ways. We could have each of the tenants in their own DMZ (demilitarized zone), with firewall policies to permit or restrict traffic as required. We could use VLANs to provide a logical separation between tenants. This approach has two drawbacks. It places a greater onus on the firewall to direct traffic, which is fine for northbound traffic (traffic leaving the data center) but is not suitable when the majority of the traffic is east-west bound (traffic between applications within the data center; see figure 2). Figure 2 We could use switches to provide layer-3 routing and use access lists to control and restrict traffic; these are well designed for that purpose. Also, in using VLANs, we are restricted to a maximum of 4,096 potential tenants (due to the 12-bit VLAN ID). An alternative would be to use VRFs (virtual routing and forwarding). VRFs are self-contained routing tables, isolated from each other unless we instruct the router or switch to share the routes by exporting and importing route targets (RTs). This approach is much better for traffic isolation, but when we need to use shared services, such as an Internet pipe, VRFs can become much harder to keep secure. One way around this would be to use route leaking. 
Instead of having a separate VRF for the Internet, this is kept in the global routing table and then leaked to both tenants. This maintains the security of the tenants, and as we are using VRFs instead of VLANs, we have a service that we can offer to more than 4,096 potential customers. However, we also have a much bigger administrative overhead. For each new tenant, we need more manual configuration, which increases our chances of human error. ACI allows us to mitigate all of these issues. By default, ACI tenants are completely separated from each other. To get them talking to each other, we need to create contracts, which specify what network resources they can and cannot see. There are no manual steps required to keep them separate from each other, and we can offer Internet access rapidly during the creation of the tenant. We also aren't bound by the 4,096 VLAN limit. Communication is through VXLAN, which raises the ceiling of potential segments (per fabric) to 16 million (by using a 24-bit segment ID). Figure 3 VXLAN is an overlay mechanism that encapsulates layer-2 frames within layer-4 UDP packets, also known as MAC-in-UDP (figure 3). Through this, we can achieve layer-2 communication across a layer-3 network. Apart from the fact that through VXLAN, tenants can be placed anywhere in the data center and that the number of endpoints far outnumbers the traditional VLAN approach, the biggest benefit of VXLAN is that we are no longer bound by the Spanning Tree Protocol. With STP, the redundant paths in the network are blocked (until needed). VXLAN, by contrast, uses layer-3 routing, which enables it to use equal-cost multipathing (ECMP) and link aggregation technologies to make use of all the available links, with recovery (in the event of a link failure) in the region of 125 microseconds. With VXLAN, we have endpoints, referred to as VXLAN Tunnel Endpoints (VTEPs), and these can be physical or virtual switchports. Head-End Replication (HER) is used to forward broadcast, unknown destination address, and multicast traffic, which is referred to (quite amusingly) as BUM traffic. This 16M limit with VXLAN is more theoretical, however. Truthfully speaking, we have a limit of around 1M entries in terms of MAC addresses, IPv4 addresses, and IPv6 addresses due to the size of the TCAM (ternary content-addressable memory). The TCAM is high-speed memory, used to speed up the reading of routing tables and performing matches against access control lists. The amount of available TCAM became a worry back in 2014 when the BGP routing table first exceeded 512 thousand routes, which was the maximum number supported by many of the Internet routers. The likelihood of having 1M entries within the fabric is also pretty rare, but even at 1M entries, ACI remains scalable in that the spine switches let the leaf switches know about only the routes and endpoints they need to know about. If you are lucky enough to be scaling at this kind of magnitude, however, it would be time to invest in more hardware and split the load onto separate fabrics. Still, a data center with thousands of physical hosts is very achievable. Creating fabric policies In this recipe, we will create an NTP policy and assign it to our pod. NTP is a good place to start, as having a common and synced time source is critical for third-party authentication, such as LDAP and logging. In this recipe, we will use the Quick Start menu to create an NTP policy, in which we will define our NTP servers. 
We will then create a pod policy group and attach our NTP policy to it. Lastly, we'll create a pod profile, which calls the policy group and applies it to our pod (our fabric). We can assign pods to different profiles, and we can share policies between policy groups, so we may have one NTP policy but different SNMP policies for different pods: the ACI fabric is very flexible in this respect.

How to do it...

1. From the Fabric menu, select Fabric Policies.
2. From the Quick Start menu, select Create an NTP Policy. A new window will pop up, and here we'll give our new policy a name and an (optional) description and enable it. We can also define any authentication keys, if the servers use them.
3. Clicking on Next takes us to the next page, where we specify our NTP servers. Click on the plus sign on the right-hand side, and enter the IP address or Fully Qualified Domain Name (FQDN) of the NTP server(s). We can also select a management EPG, which is useful if the NTP servers are outside of our network. Then click on OK.
4. Click on Finish. We can now see our custom policy under Pod Policies. At the moment, though, we are not using it: clicking on Show Usage at the bottom of the screen shows that no nodes or policies are using the policy.
5. To use the policy, we must assign it to a pod, as we can see from the Quick Start menu (clicking on the arrow in the circle will show us a handy video on how to do this). We need to go into the policy groups under Pod Policies and create a new policy group.
6. To create it, click on the Actions menu, and select Create Pod Policy Group. Name the new policy group PoD-Policy.
7. From here, we can attach our NTP-POLICY to the PoD-Policy. To attach the policy, click on the drop-down next to Date Time Policy, and select NTP-POLICY from the list of options. We can see our new policy group.
8. We have not finished yet, as we still need to create a pod profile and assign the policy group to the profile. The process is similar to before: we go to Profiles (under the Pod Policies menu), select Actions, and then Create Pod Profile. We give it a name and associate our policy group with it.

How it works...

Once we create a policy, we must associate it with a pod policy group. The pod policy group must then be associated with a pod profile. Our APIC is set to use the new profile, which will be pushed down to the spine and leaf nodes. We can also check the NTP status from the APIC CLI, using the command show ntp:

apic1# show ntp
 nodeid   remote          refid     st   t   when   poll   reach   delay    offset   jitter
 ------   -------------   -------   --   -   ----   ----   -----   ------   ------   ------
 1        216.239.35.4    .INIT.    16   u   -      16     0       0.000    0.000    0.000
apic1#

Summary

In this article, we have introduced ACI and the APIC and shown how to create fabric policies. (A sketch of the equivalent northbound REST calls follows the resource list below.)

Resources for Article: Further resources on this subject: The Fabric library – the deployment and development task manager [article] Web Services and Forms [article] The Alfresco Platform [article]
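As mentioned in the Understanding ACI and the APIC section, everything the GUI does is also exposed through the APIC's northbound REST API. The sketch below is illustrative only and is not taken from the book: it authenticates against the APIC and creates a tenant object. The IP address, credentials, and tenant name are placeholder assumptions, and the NTP policy itself can be pushed the same way once you confirm its object class (for example by saving the GUI configuration as XML or JSON and inspecting it).

# Authenticate to the APIC; the response sets a session cookie (APIC-cookie)
curl -sk -X POST https://10.0.0.1/api/aaaLogin.json \
     -d '{"aaaUser": {"attributes": {"name": "admin", "pwd": "password"}}}' \
     -c cookie.txt

# Create a tenant named Tenant1 under the policy universe (uni)
curl -sk -X POST https://10.0.0.1/api/mo/uni.json \
     -b cookie.txt \
     -d '{"fvTenant": {"attributes": {"name": "Tenant1"}}}'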

Learning Cassandra

Packt
06 Apr 2017
17 min read
In this article by Sandeep Yarabarla, the author of the book Learning Apache Cassandra - Second Edition, we assume you have probably built products using relational databases like MySQL and PostgreSQL, and perhaps experimented with NoSQL databases including a document store like MongoDB or a key-value store like Redis. While each of these tools has its strengths, you will now consider whether a distributed database like Cassandra might be the best choice for the task at hand. In this article, we'll begin with the need for NoSQL databases to satisfy the conundrum of ever-growing data. We will see why NoSQL databases are becoming the de facto choice for big data and real-time web applications. We will also talk about the major reasons to choose Cassandra from among the many database options available to you. Having established that Cassandra is a great choice, we'll go through the nuts and bolts of getting a local Cassandra installation up and running. (For more resources related to this topic, see here.)

What is Big Data

Big Data is a relatively new term which has been gathering steam over the past few years. It is used for data sets that are too large to be stored in a traditional database system or processed by traditional data processing pipelines. This data could be structured, semi-structured or unstructured, and the data sets that belong to this category usually scale to terabytes or petabytes. Big Data usually involves one or more of the following:

Velocity: Data moves at an unprecedented speed and must be dealt with in a timely manner.
Volume: Organizations collect data from a variety of sources, including business transactions, social media and information from sensor or machine-to-machine data. This could involve terabytes to petabytes of data. In the past, storing it would've been a problem – but new technologies have eased the burden.
Variety: Data comes in all sorts of formats, ranging from structured data stored in traditional databases to unstructured data (blobs) such as images, audio files and text files.

These are known as the 3 Vs of Big Data. In addition to these, we tend to associate another term with Big Data:

Complexity: Today's data comes from multiple sources, which makes it difficult to link, match, cleanse and transform data across systems. However, it's necessary to connect and correlate relationships, hierarchies and multiple data linkages, or your data can quickly spiral out of control. It must be able to traverse multiple data centers, cloud and geographical zones.

Challenges of modern applications

Before we delve into the shortcomings of relational systems in handling Big Data, let's take a look at some of the challenges faced by modern web-facing and Big Data applications. Later, this will give an insight into how NoSQL data stores, and Cassandra in particular, help solve these issues. One of the most important challenges faced by a web-facing application is that it should be able to handle a large number of concurrent users. Think of a search engine like Google, which handles millions of concurrent users at any given point of time, or a large online retailer. The response from these applications should be swift even as the number of users keeps growing. Modern applications also need to be able to handle large amounts of data, which can scale to several petabytes and beyond.
Consider a large social network with a few hundred million users: Think of the amount of data generated in tracking and managing those users Think of how this data can be used for analytics Business critical applications should continue running without much impact even when there is a system failure or multiple system failures (server failure, network failure, and so on.). The applications should be able to handle failures gracefully without any data loss or interruptions. These applications should be able to scale across multiple data centers and geographical regions to support customers from different regions around the world with minimum delay. Modern applications should be implementing fully distributed architectures and should be capable of scaling out horizontally to support any data size or any number of concurrent users. Why not relational Relational database systems (RDBMS) have been the primary data store for enterprise applications for 20 years. Lately, NoSQL databases have been picking a lot of steam and businesses are slowly seeing a shift towards non-relational databases. There are a few reasons why relational databases don't seem like a good fit for modern web applications. Relational databases are not designed for clustered solutions. There are some solutions that shard data across servers but these are fragile, complex and generally don't work well. Sharding solutions implemented by RDBMS MySQL's product MySQL cluster provides clustering support which adds many capabilities of non-relational systems. It is actually a NoSQL solution that integrates with MySQL relational database. It partitions the data onto multiple nodes and the data can be accessed via different APIs. Oracle provides a clustering solution, Oracle RAC which involves multiple nodes running an oracle process accessing the same database files. This creates a single point of failure as well as resource limitations in accessing the database itself. They are not a good fit current hardware and architectures. Relational databases are usually scaled up by using larger machines with more powerful hardware and maybe clustering and replication among a small number of nodes. Their core architecture is not a good fit for commodity hardware and thus doesn't work with scale out architectures. Scale out vs Scale up architecture Scaling out means adding more nodes to a system such as adding more servers to a distributed database or filesystem. This is also known as horizontal scaling. Scaling up means adding more resources to a single node within the system such as adding more CPU, memory or disks to a server. This is also known as vertical scaling. How to handle Big Data Now that we are convinced the relational model is not a good fit for Big Data, let's try to figure out ways to handle Big Data. These are the solutions that paved the way for various NoSQL databases. Clustering: The data should be spread across different nodes in a cluster. The data should be replicated across multiple nodes in order to sustain node failures. This helps spread the data across the cluster and different nodes contain different subsets of data. This improves performance and provides fault tolerance. A node is an instance of database software running on a server. Multiple instances of the same database could be running on the same server. Flexible schema: Schemas should be flexible unlike the relational model and should evolve with data. 
Relax consistency: We should embrace the concept of eventual consistency which means data will eventually be propagated to all the nodes in the cluster (in case of replication). Eventual consistency allows data replication across nodes with minimum overhead. This allows for fast writes with the need for distributed locking. De-normalization of data: De-normalize data to optimize queries. This has to be done at the cost of writing and maintaining multiple copies of the same data. What is Cassandra and why Cassandra Cassandra is a fully distributed, masterless database, offering superior scalability and fault tolerance to traditional single master databases. Compared with other popular distributed databases like Riak, HBase, and Voldemort, Cassandra offers a uniquely robust and expressive interface for modeling and querying data. What follows is an overview of several desirable database capabilities, with accompanying discussions of what Cassandra has to offer in each category. Horizontal scalability Horizontal scalability refers to the ability to expand the storage and processing capacity of a database by adding more servers to a database cluster. A traditional single-master database's storage capacity is limited by the capacity of the server that hosts the master instance. If the data set outgrows this capacity, and a more powerful server isn't available, the data set must be sharded among multiple independent database instances that know nothing of each other. Your application bears responsibility for knowing to which instance a given piece of data belongs. Cassandra, on the other hand, is deployed as a cluster of instances that are all aware of each other. From the client application's standpoint, the cluster is a single entity; the application need not know, nor care, which machine a piece of data belongs to. Instead, data can be read or written to any instance in the cluster, referred to as a node; this node will forward the request to the instance where the data actually belongs. The result is that Cassandra deployments have an almost limitless capacity to store and process data. When additional capacity is required, more machines can simply be added to the cluster. When new machines join the cluster, Cassandra takes careof rebalancing the existing data so that each node in the expanded cluster has a roughly equal share. Also, the performance of a Cassandra cluster is directly proportional to the number of nodes within the cluster. As you keep on adding instances, the read and write throughput will keep increasing proportionally. Cassandra is one of the several popular distributed databases inspired by the Dynamo architecture, originally published in a paper by Amazon. Other widely used implementations of Dynamo include Riak and Voldemort. You can read the original paper at http://s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf High availability The simplest database deployments are run as a single instance on a single server. This sort of configuration is highly vulnerable to interruption: if the server is affected by a hardware failure or network connection outage, the application's ability to read and write data is completely lost until the server is restored. If the failure is catastrophic, the data on that server might be lost completely. Master-slave architectures improve this picture a bit. The master instance receives all write operations, and then these operations are replicated to follower instances. 
The application can read data from the master or any of the follower instances, so a single host becoming unavailable will not prevent the application from continuing to read data. A failure of the master, however, will still prevent the application from performing any write operations, so while this configuration provides high read availability, it doesn't completely provide high availability. Cassandra, on the other hand, has no single point of failure for reading or writing data. Each piece of data is replicated to multiple nodes, but none of these nodes holds the authoritative master copy. All the nodes in a Cassandra cluster are peers without a master node. If a machine becomes unavailable, Cassandra will continue writing data to the other nodes that share data with that machine, and will queue the operations and update the failed node when it rejoins the cluster. This means in a typical configuration, multiple nodes must fail simultaneously for there to be any application-visible interruption in Cassandra's availability. How many copies? When you create a keyspace - Cassandra's version of a database, you specify how many copies of each piece of data should be stored; this is called the replication factor. A replication factor of 3 is a common choice for most use cases. Write optimization Traditional relational and document databases are optimized for read performance. Writing data to a relational database will typically involve making in-place updates to complicated data structures on disk, in order to maintain a data structure that can be read efficiently and flexibly. Updating these data structures is a very expensive operation from a standpoint of disk I/O, which is often the limiting factor for database performance. Since writes are more expensive than reads, you'll typically avoid any unnecessary updates to a relational database, even at the expense of extra read operations. Cassandra, on the other hand, is highly optimized for write throughput, and in fact never modifies data on disk; it only appends to existing files or creates new ones. This is much easier on disk I/O and means that Cassandra can provide astonishingly high write throughput. Since both writing data to Cassandra, and storing data in Cassandra, are inexpensive, denormalization carries little cost and is a good way to ensure that data can be efficiently read in various access scenarios. Because Cassandra is optimized for write volume; you shouldn't shy away from writing data to the database. In fact, it's most efficient to write without reading whenever possible, even if doing so might result in redundant updates. Just because Cassandra is optimized for writes doesn't make it bad at reads; in fact, a well-designed Cassandra database can handle very heavy read loads with no problem. Structured records The first three database features we looked at are commonly found in distributed data stores. However, databases like Riak and Voldemort are purely key-value stores; these databases have no knowledge of the internal structure of a record that's stored at a particular key. This means useful functions like updating only part of a record, reading only certain fields from a record, or retrieving records that contain a particular value in a given field are not possible. 
Relational databases like PostgreSQL, document stores like MongoDB, and, to a limited extent, newer key-value stores like Redis do have a concept of the internal structure of their records, and most application developers are accustomed to taking advantage of the possibilities this allows. None of these databases, however, offer the advantages of a masterless distributed architecture. In Cassandra, records are structured much in the same way as they are in a relational database—using tables, rows, and columns. Thus, applications using Cassandra can enjoy all the benefits of masterless distributed storage while also getting all the advanced data modeling and access features associated with structured records. Secondary indexes A secondary index, commonly referred to as an index in the context of a relational database, is a structure allowing efficient lookup of records by some attribute other than their primary key. This is a widely useful capability: for instance, when developing a blog application, you would want to be able to easily retrieve all of the posts written by a particular author. Cassandra supports secondary indexes; while Cassandra's version is not as versatile as indexes in a typical relational database, it's a powerful feature in the right circumstances. Materialized views Data modelling principles in Cassandra compel us to denormalize data as much as possible. Prior to Cassandra 3.0, the only way to query on a non-primary key column was to create a secondary index and query on it. However, secondary indexes have a performance tradeoff if it contains high cardinality data. Often, high cardinality secondary indexes have to scan data on all the nodes and aggregate them to return the query results. This defeats the purpose of having a distributed system. To avoid secondary indexes and client-side denormalization, Cassandra introduced the feature of Materialized views which does server-side denormalization. You can create views for a base table and Cassandra ensures eventual consistency between the base and view. This lets us do very fast lookups on each view following the normal Cassandra read path. Materialized views maintain a correspondence of one CQL row each in the base and the view, so we need to ensure that each CQL row which is required for the views will be reflected in the base table's primary keys. Although, materialized view allows for fast lookups on non-primary key indexes; this comes at a performance hit to writes. Efficient result ordering It's quite common to want to retrieve a record set ordered by a particular field; for instance, a photo sharing service will want to retrieve the most recent photographs in descending order of creation. Since sorting data on the fly is a fundamentally expensive operation, databases must keep information about record ordering persisted on disk in order to efficiently return results in order. In a relational database, this is one of the jobs of a secondary index. In Cassandra, secondary indexes can't be used for result ordering, but tables can be structured such that rows are always kept sorted by a given column or columns, called clustering columns. Sorting by arbitrary columns at read time is not possible, but the capacity to efficiently order records in any way, and to retrieve ranges of records based on this ordering, is an unusually powerful capability for a distributed database. 
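To tie the last few sections together, here is a short, hedged CQL sketch showing structured records, a clustering column used for efficient result ordering, a secondary index, and a materialized view. The keyspace, table, and column names are illustrative assumptions rather than examples taken from the book.

CREATE KEYSPACE IF NOT EXISTS blog
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

-- Rows are structured into columns; posts for an author stay sorted by
-- created_at because it is a clustering column.
CREATE TABLE blog.posts (
  author     text,
  created_at timeuuid,
  title      text,
  category   text,
  PRIMARY KEY (author, created_at)
) WITH CLUSTERING ORDER BY (created_at DESC);

-- A secondary index allows lookups by a non-primary-key column.
CREATE INDEX posts_by_category ON blog.posts (category);

-- A materialized view (Cassandra 3.0 and later) denormalizes on the server side.
CREATE MATERIALIZED VIEW blog.posts_by_category_mv AS
  SELECT author, created_at, title, category
  FROM blog.posts
  WHERE category IS NOT NULL AND author IS NOT NULL AND created_at IS NOT NULL
  PRIMARY KEY (category, author, created_at);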
Immediate consistency

When we write a piece of data to a database, it is our hope that that data is immediately available to any other process that may wish to read it. From another point of view, when we read some data from a database, we would like to be guaranteed that the data we retrieve is the most recently updated version. This guarantee is called immediate consistency, and it's a property of most common single-master databases like MySQL and PostgreSQL. Distributed systems like Cassandra typically do not provide an immediate consistency guarantee. Instead, developers must be willing to accept eventual consistency, which means that when data is updated, the system will reflect that update at some point in the future. Developers are willing to give up immediate consistency precisely because it is a direct tradeoff with high availability. In the case of Cassandra, that tradeoff is made explicit through tunable consistency. Each time you design a write or read path for data, you have the option of immediate consistency with less resilient availability, or eventual consistency with extremely resilient availability.

Discretely writable collections

While it's useful for records to be internally structured into discrete fields, a given property of a record isn't always a single value like a string or an integer. One simple way to handle fields that contain collections of values is to serialize them using a format like JSON, and then save the serialized collection into a text field. However, in order to update collections stored in this way, the serialized data must be read from the database, decoded, modified, and then written back to the database in its entirety. If two clients try to perform this kind of modification to the same record concurrently, one of the updates will be overwritten by the other. For this reason, many databases offer built-in collection structures that can be discretely updated: values can be added to, and removed from, collections without reading and rewriting the entire collection. Neither the client nor Cassandra itself needs to read the current state of the collection in order to update it, meaning collection updates are also blazingly efficient.

Relational joins

In real-world applications, different pieces of data relate to each other in a variety of ways. Relational databases allow us to perform queries that make these relationships explicit, for instance, to retrieve a set of events whose location is in the state of New York (this is assuming events and locations are different record types). Cassandra, however, is not a relational database, and does not support anything like joins. Instead, applications using Cassandra typically denormalize data and make clever use of clustering in order to perform the sorts of data access that would use a join in a relational database. For datasets that aren't already denormalized, applications can also perform client-side joins, which mimic the behavior of a relational database by performing multiple queries and joining the results at the application level. Client-side joins are less efficient than reading data that has been denormalized in advance, but offer more flexibility.

Summary

In this article, you explored the reasons to choose Cassandra from among the many databases available, and having determined that Cassandra is a great choice, you installed it on your development machine.
You had your first taste of the Cassandra Query Language when you issued your first few commands through the CQL shell in order to create a keyspace, table, and insert and read data. You're now poised to begin working with Cassandra in earnest. Resources for Article: Further resources on this subject: Setting up a Kubernetes Cluster [article] Dealing with Legacy Code [article] Evolving the data model [article]

Layout Management for Python GUI

Packt
05 Apr 2017
13 min read
In this article written by Burkhard A. Meier, the author of the book Python GUI Programming Cookbook - Second Edition, we will lay out our GUI using Python 3.6 and above, covering:

Arranging several labels within a label frame widget
Using padding to add space around widgets
How widgets dynamically expand the GUI
Aligning the GUI widgets by embedding frames within frames

(For more resources related to this topic, see here.)

In this article, we will explore how to arrange widgets within widgets to create our Python GUI. Learning the fundamentals of GUI layout design will enable us to create great-looking GUIs. There are certain techniques that will help us to achieve this layout design. The grid layout manager is one of the most important layout tools built into tkinter that we will be using. We can very easily create menu bars, tabbed controls (aka Notebooks) and many more widgets using tkinter.

Arranging several labels within a label frame widget

The LabelFrame widget allows us to design our GUI in an organized fashion. We are still using the grid layout manager as our main layout design tool, yet by using LabelFrame widgets we get much more control over our GUI design.

Getting ready

We are starting to add more and more widgets to our GUI, and we will make the GUI fully functional in the coming recipes. Here we are starting to use the LabelFrame widget. Add the following code just above the main event loop towards the bottom of the Python module. Running the code will result in the GUI looking like this: Uncomment line 111 and notice the different alignment of the LabelFrame. We can easily align the labels vertically by changing our code, as shown next. Note that the only change we had to make was in the column and row numbering. Now the GUI LabelFrame looks as such:

How it works...

In line 109 we create our first ttk LabelFrame widget and assign the resulting instance to the variable buttons_frame. The parent container is win, our main window. In lines 114-116, we create labels and place them in the LabelFrame. buttons_frame is the parent of the labels. We are using the important grid layout tool to arrange the labels within the LabelFrame. The column and row properties of this layout manager give us the power to control our GUI layout. The parent of our labels is the buttons_frame instance variable of the LabelFrame, not the win instance variable of the main window. We can see the beginning of a layout hierarchy here, and we can see how easy it is to change our layout via the column and row properties. Note how we change the column to 0, and how we layer our labels vertically by numbering the row values sequentially. The name ttk stands for themed tk; the tk-themed widget set was introduced in Tk 8.5.

There's more...

In a recipe later in this article we will embed LabelFrame widgets within LabelFrame widgets, nesting them to control our GUI layout.

Using padding to add space around widgets

Our GUI is being created nicely. Next, we will improve the visual aspects of our widgets by adding a little space around them, so they can breathe.

Getting ready

While tkinter might have had a reputation for creating ugly GUIs, this has dramatically changed since version 8.5. You just have to know how to use the tools and techniques that are available. That's what we will do next. tkinter version 8.6 ships with Python 3.6.

How to do it...

The procedural way of adding spacing around widgets is shown first, and then we will use a loop to achieve the same thing in a much better way.
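Since the book's numbered code listing did not survive here, the following is a minimal, hedged sketch of the layout the previous recipe describes: a ttk LabelFrame assigned to buttons_frame and placed into the main window, with three labels arranged via grid. The window title, label texts, and exact grid positions are illustrative assumptions rather than the book's listing, but this is the layout that the padding recipe below then improves.

import tkinter as tk
from tkinter import ttk

win = tk.Tk()                                   # the main window
win.title("Python GUI")

# A LabelFrame acting as a container; win is its parent.
buttons_frame = ttk.LabelFrame(win, text='Labels in a Frame')
buttons_frame.grid(column=0, row=7)

# Three labels whose parent is buttons_frame, laid out horizontally.
# Changing column=0/1/2 to column=0 with row=0/1/2 stacks them vertically.
ttk.Label(buttons_frame, text="Label1").grid(column=0, row=0)
ttk.Label(buttons_frame, text="Label2").grid(column=1, row=0)
ttk.Label(buttons_frame, text="Label3").grid(column=2, row=0)

win.mainloop()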
Our LabelFrame looks a bit tight as it blends into the main window towards the bottom. Let's fix this now. Modify line 110 by adding padx and pady: And now our LabelFrame got some breathing space: How it works... In tkinter, adding space horizontally and vertically is done by using the built-in properties named padx and pady. These can be used to add space around many widgets, improving horizontal and vertical alignments, respectively. We hard-coded 20 pixels of space to the left and right of the LabelFrame, and we added 40 pixels to the top and bottom of the frame. Now our LabelFrame stands out more than it did before. The screenshot above only shows the relevant change. We can use a loop to add space around the labels contained within the LabelFrame: Now the labels within the LabelFrame widget have some space around them too: The grid_configure() function enables us to modify the UI elements before the main loop displays them. So, instead of hard-coding values when we first create a widget, we can work on our layout and then arrange spacing towards the end of our file, just before the GUI is being created. This is a neat technique to know. The winfo_children() function returns a list of all the children belonging to the buttons_frame variable. This enables us to loop through them and assign the padding to each label. One thing to notice is that the spacing to the right of the labels is not really visible. This is because the title of the LabelFrame is longer than the names of the labels. We can experiment with this by making the names of the labels longer. Now our GUI looks like this. Note how there is now some space added to the right of the long label next to the dots. The last dot does not touch the LabelFrame which it otherwise would without the added space. We can also remove the name of the LabelFrame to see the effect padx has on positioning our labels. By setting the text property to an empty string, we remove the name that was previously displayed for the LabelFrame. How widgets dynamically expand the GUI You probably noticed in previous screenshots and by running the code that widgets have a capability to extend themselves to the space they need to visually display their text. Java introduced the concept of dynamic GUI layout management. In comparison, visual development IDEs like VS.NET lay out the GUI in a visual manner, and are basically hard-coding the x and y coordinates of UI elements. Using tkinter, this dynamic capability creates both an advantage and a little bit of a challenge, because sometimes our GUI dynamically expands when we would prefer it rather not to be so dynamic! Well, we are dynamic Python programmers so we can figure out how to make the best use of this fantastic behavior! Getting ready At the beginning of the previous recipe we added a LabelFrame widget. This moved some of our controls to the center of column 0. We might not wish this modification to our GUI layout. Next we will explore some ways to solve this. How to do it... Let us first become aware of the subtle details that are going on in our GUI layout in order to understand it better. We are using the grid layout manager widget and it lays out our widgets in a zero-based grid. This is very similar to an Excel spreadsheet or a database table. 
Grid Layout Manager Example with 2 Rows and 3 Columns:

Row 0; Col 0    Row 0; Col 1    Row 0; Col 2
Row 1; Col 0    Row 1; Col 1    Row 1; Col 2

Using the grid layout manager, what is happening is that the width of any given column is determined by the longest name or widget in that column. This affects all rows. By adding our LabelFrame widget and giving it a title that is longer than some hard-coded size widget like the top-left label and the text entry below it, we dynamically move those widgets to the center of column 0, adding space to the left and right sides of those widgets. Incidentally, because we used the sticky property for the Checkbutton and ScrolledText widgets, those remain attached to the left side of the frame.

Let's look in more detail at the screenshot from the first recipe of this article. We added the following code to create the LabelFrame and then placed labels into this frame. Since the text property of the LabelFrame, which is displayed as the title of the LabelFrame, is longer than both our Enter a name: label and the text box entry below it, those two widgets are dynamically centered within the new width of column 0. The Checkbutton and Radiobutton widgets in column 0 did not get centered because we used the sticky=tk.W property when we created those widgets. For the ScrolledText widget we used sticky=tk.WE, which binds the widget to both the west (aka left) and east (aka right) side of the frame.

Let's remove the sticky property from the ScrolledText widget and observe the effect this change has. Now our GUI has new space around the ScrolledText widget both on the left and right sides. Because we used the columnspan=3 property, our ScrolledText widget still spans all three columns. If we remove columnspan=3 we get the following GUI which is not what we want. Now our ScrolledText only occupies column 0 and, because of its size, it stretches the layout. One way to get our layout back to where we were before adding the LabelFrame is to adjust the grid column position. Change the column value from 0 to 1. Now our GUI looks like this:

How it works...

Because we are still using individual widgets, our layout can get messed up. By moving the column value of the LabelFrame from 0 to 1, we were able to get the controls back to where they used to be and where we prefer them to be. At least the left-most label, text, Checkbutton, ScrolledText, and Radiobutton widgets are now located where we intended them to be. The second label and text Entry located in column 1 aligned themselves to the center of the length of the Labels in a Frame widget so we basically moved our alignment challenge one column to the right. It is not so visible because the size of the Choose a number: label is almost the same as the size of the Labels in a Frame title, and so the column width was already close to the new width generated by the LabelFrame.

There's more...

In the next recipe we will embed frames within frames to avoid the accidental misalignment of widgets we just experienced in this recipe.

Aligning the GUI widgets by embedding frames within frames

We have much better control of our GUI layout if we embed frames within frames. This is what we will do in this recipe.

Getting ready

The dynamic behavior of Python and its GUI modules can create a little bit of a challenge to really get our GUI looking the way we want. Here we will embed frames within frames to get more control of our layout.
There's more...

In the next recipe we will embed frames within frames to avoid the accidental misalignment of widgets we just experienced in this recipe.

Aligning the GUI widgets by embedding frames within frames

We have much better control of our GUI layout if we embed frames within frames. This is what we will do in this recipe.

Getting ready

The dynamic behavior of Python and its GUI modules can create a little bit of a challenge to really get our GUI looking the way we want. Here we will embed frames within frames to get more control of our layout. This will establish a stronger hierarchy among the different UI elements, making the visual appearance easier to achieve. We will continue to use the GUI we created in the previous recipe.

How to do it...

Here, we will create a top-level frame that will contain other frames and widgets. This will help us to get our GUI layout just the way we want. In order to do so, we will have to embed our current controls within a central ttk.LabelFrame. This ttk.LabelFrame is a child of the main parent window, and all controls will be children of this ttk.LabelFrame.

Up to this point in our recipes we have assigned all widgets to our main GUI frame directly. Now we will only assign our LabelFrame to our main window, and after that we will make this LabelFrame the parent container for all the widgets. This creates the following hierarchy in our GUI layout: win is the variable that holds a reference to our main GUI tkinter window frame; mighty is the variable that holds a reference to our LabelFrame and is a child of the main window frame (win); and Label and all other widgets are now placed into the LabelFrame container (mighty).

Add the code that creates this LabelFrame towards the top of our Python module, and then modify all the following controls to use mighty as the parent, replacing win. Note how all the widgets are now contained in the Mighty Python LabelFrame, which surrounds all of them with a barely visible thin line. Next, we can reset the Labels in a Frame widget to the left without messing up our GUI layout.

Oops - maybe not. While our frame within another frame aligned nicely to the left, it again pushed our top widgets into the center (a default). In order to align them to the left, we have to force our GUI layout by using the sticky property. By assigning it 'W' (West) we can control the widget to be left-aligned.

How it works...

Note how we aligned the label, but not the text box below it. We have to use the sticky property for all the controls we want to left-align. We can do that in a loop, using the winfo_children() and grid_configure(sticky='W') properties, as we did before in the second recipe of this article. The winfo_children() function returns a list of all the children belonging to the parent. This enables us to loop through all of the widgets and change their properties.

When using tkinter to force widgets left, right, top, or bottom, the naming is very similar to Java: west, east, north, and south, abbreviated to 'W' and so on. We can also use the syntax tk.W instead of 'W'. This requires having imported the tkinter module aliased as tk. In a previous recipe we combined both 'W' and 'E' to make our ScrolledText widget attach itself to both the left and right sides of its container using 'WE'. We can add more combinations: 'NSE' will stretch our widget to the top, bottom, and right side. If we have only one widget in our form, for example a button, we can make it fill the entire frame by using all the options: 'NSWE'. We can also use tuple syntax: sticky=(tk.N, tk.S, tk.W, tk.E).

Let's align the entry in column 0 to the left as well. Now both the label and the Entry are aligned towards the West (left). In order to separate the influence that the length of our Labels in a Frame LabelFrame has on the rest of our GUI layout, we must not place this LabelFrame into the same LabelFrame as the other widgets, but assign it directly to the main GUI form (win).
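Since the recipe's code appears only as screenshots in the original, here is a minimal, hedged sketch of the idea: a mighty LabelFrame parented to win, a couple of widgets parented to mighty instead of win, and a loop that left-aligns the children with sticky. The widget set is trimmed down, and names such as a_label and name_entered are illustrative assumptions rather than the book's exact listing.

# A minimal sketch of embedding widgets in a central LabelFrame
# (illustrative, not the book's exact listing).
import tkinter as tk
from tkinter import ttk

win = tk.Tk()                       # main window
win.title("Python GUI")

# The central container; all other widgets become children of 'mighty'.
mighty = ttk.LabelFrame(win, text=' Mighty Python ')
mighty.grid(column=0, row=0, padx=8, pady=4)

# Widgets are now parented to 'mighty' instead of 'win'.
a_label = ttk.Label(mighty, text="Enter a name:")
a_label.grid(column=0, row=0)

name = tk.StringVar()
name_entered = ttk.Entry(mighty, width=12, textvariable=name)
name_entered.grid(column=0, row=1)

# Left-align every child of the container in one go.
for child in mighty.winfo_children():
    child.grid_configure(sticky=tk.W)

win.mainloop()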
Summary

We have learned the layout management for Python GUI using the following recipes:

Arranging several labels within a label frame widget
Using padding to add space around widgets
How widgets dynamically expand the GUI
Aligning the GUI widgets by embedding frames within frames

Resources for Article:

Further resources on this subject:

Python Scripting Essentials [article]
An Introduction to Python Lists and Dictionaries [article]
Test all the things with Python [article]
article-image-getting-started-docker-storage
Packt
05 Apr 2017
12 min read
Save for later

Getting Started with Docker Storage

Packt
05 Apr 2017
12 min read
In this article by Scott Gallagher, author of the book Mastering Docker – Second Edition, we will cover the places where you can store your container images, such as Docker Hub and Docker Hub Enterprise. We will also cover Docker Registry, which you can use to run your own local storage for Docker images. We will review the differences between them all, and when and how to use each of them. The article also covers how to set up automated builds using web hooks, as well as the pieces required to set them up. Lastly, we will run through an example of how to set up your own Docker Registry.

Let's take a quick look at the topics we will be covering in this article:

Docker Hub
Docker Hub Enterprise
Docker Registry
Automated builds

(For more resources related to this topic, see here.)

Docker Hub

In this section, we will focus on Docker Hub, which is a free public option that also has a private option you can use to secure your images. We will focus on the web aspect of Docker Hub and the management you can do there. The login page is like the one shown in the following screenshot:

Dashboard

After logging into Docker Hub, you will be taken to the following landing page. This page is known as the Dashboard of Docker Hub. From here, you can get to all the other sub pages of Docker Hub. In the upcoming sections, we will go through everything you see on the dashboard, starting with the dark blue bar you have on the top.

Exploring the repositories page

The following is a screenshot of the Explore link you see next to Dashboard at the top of the screen. As you can see in the screenshot, this is a link that shows you all the official repositories that Docker has to offer. Official repositories are those that come directly from Docker or from the company responsible for the product. They are regularly updated and patched as needed.

Organizations

Organizations are those that you have either created or have been added to. Organizations allow you to layer on control for, say, a project that multiple people are collaborating on. The organization gets its own settings, such as whether to store repositories as public or private by default, changing plans that allow for different numbers of private repositories, and repositories that are kept entirely separate from the ones you or others have. You can also access or switch between accounts or organizations from the Dashboard, just below the Docker logo, where you will typically see your username when you log in. This is a drop-down list where you can switch between all the organizations you belong to.

The Create menu

The Create menu is the new item along the top bar of the Dashboard. From this drop-down menu, you can perform three actions:

Create repository
Create automated build
Create organization

A pictorial representation is shown in the following screenshot:

The Settings page

Probably the first section everyone jumps to once they have created an account on Docker Hub is the Settings page. I know that's what I did, at least. The Account Settings page can be found under the drop-down menu that is accessed in the upper-right corner of the dashboard, on selecting Settings.
The page allows you to set up your public profile, change your password, see what organizations you belong to, manage your subscriptions for e-mail updates, choose what specific notifications you would like to receive, see what authorized services have access to your information, manage linked accounts (such as your GitHub or Bitbucket accounts), and view your enterprise licenses, billing, and global settings. The only global setting as of now is the choice between having your repositories default to public or private upon creation. The default is to create them as public repositories.

The Stars page

Below the dark blue bar at the top of the Dashboard page are two more areas that are yet to be covered. The first, the Stars page, allows you to see which repositories you yourself have starred. This is very useful if you come across some repositories that you prefer to use and want to check whether they have been updated recently or whether any other changes have occurred on these repositories. The second is a new setting in the new version of Docker Hub called Contributed. In this section, there will be a list of repositories you have contributed to outside of the ones within your Repositories list.

Docker Hub Enterprise

Docker Hub Enterprise, as it is currently known, will eventually be called Docker Subscription. We will focus on Docker Subscription, as it's the new and shiny piece. We will view the differences between Docker Hub and Docker Subscription (as we will call it moving forward) and view the options to deploy Docker Subscription. Let's first start off by comparing Docker Hub to Docker Subscription and see why each is unique and what purpose each serves:

Docker Hub:
Shareable images, which can also be kept private
No hassle of self-hosting
Free (except for a certain number of private images)

Docker Subscription:
Integrated into your authentication services (that is, AD/LDAP)
Deployed on your own infrastructure (or cloud)
Commercial support

Docker Subscription for server

Docker Subscription for server allows you to deploy both Docker Trusted Registry and Docker Engine on the infrastructure that you manage. Docker Trusted Registry is the location where you store the Docker images that you have created. You can set these up to be internal only or share them out publicly as well. Docker Subscription gives you all the benefits of running your own dedicated Docker hosted registry, with the added benefit of getting support in case you need it.

Docker Subscription for cloud

As we saw in the previous section, we can also deploy Docker Subscription to a cloud provider if we wish. This allows us to leverage our existing cloud environments without having to roll our own server infrastructure to host our Docker images. The setup is the same as we reviewed in the previous section, but this time we will be targeting our existing cloud environment instead.

Docker Registry

In this section, we will be looking at Docker Registry. Docker Registry is an open source application that you can run anywhere you please and store your Docker images in. We will look at the comparison between Docker Registry and Docker Hub and how to choose between the two. By the end of the section, you will learn how to run your own Docker Registry and see whether it's a true fit for you.

An overview of Docker Registry

Docker Registry, as stated earlier, is an open source application that you can utilize to store your Docker images on a platform of your choice.
This allows you to keep them 100% private if you wish, or share them as needed. The registry documentation can be found at https://docs.docker.com/registry/. It will run you through the setup and the steps to follow when pushing images to Docker Registry compared to Docker Hub. Docker Registry makes a lot of sense if you want to roll your own registry without having to pay for all the private features of Docker Hub. Next, let's take a look at some comparisons between Docker Hub and Docker Registry, so you can make an educated decision as to which platform to choose to store your images.

Docker Registry will allow you to do the following:

Host and manage your own registry, from which you can serve all the repositories as private, public, or a mix between the two
Scale the registry as needed, based on how many images you host or how many pull requests you are serving out
Do everything from the command line, for those that live on the command line

Docker Hub will allow you to:

Get a GUI-based interface that you can use to manage your images
Use a location already set up on the cloud that is ready to handle public and/or private images
Have the peace of mind of not having to manage a server that hosts all your images

Automated builds

In this section, we will look at automated builds. Automated builds are builds that you can link to your GitHub or Bitbucket account(s) so that, as you update the code in your code repository, the image is automatically built on Docker Hub. We will look at all the pieces required to do so and, by the end, you'll be automating all your builds.

Setting up your code

The first step to create automated builds is to set up your GitHub or Bitbucket code. These are the two options you have when selecting where to store your code. For our example, I will be using GitHub, but the setup will be the same for GitHub and Bitbucket. First, we set up our GitHub repository, which contains just a simple README file that we will edit for our purpose. The repository could contain anything, such as a script or even multiple files that you want to manipulate for your automated builds. One key point is that we can't just leave the repository with a README file alone: a Dockerfile is required in the repository for the builds to be automated (a minimal, hypothetical example is sketched at the end of this section). Next, we need to set up the link between our code and Docker Hub.

Setting up Docker Hub

On Docker Hub, we are going to use the Create drop-down menu and select Create Automated Build. After selecting it, we will be taken to a screen that shows the accounts you have linked to either GitHub or Bitbucket. You then need to search for and select the repository you want to create the automated build from. This essentially creates a web hook, so that when a commit is made to the selected code repository, a new build is created on Docker Hub.

After you select the repository you would like to use, you will be taken to a screen similar to the following one. For the most part, the defaults will be used by most. You can select a different branch if you want to use one, say a testing branch that you use before the code goes to the master branch. The one thing that will not be filled out, but is required, is the description field. You must enter something here or you will not be able to continue past this page. Upon clicking Create, you will be taken to a screen that shows a lot of information on the automated build you have set up: tags, the Dockerfile in the code repository, build details, build settings, collaborators on the code, web hooks, and settings that include making the repository public or private, as well as deleting the automated build repository.
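To make the Dockerfile requirement concrete, here is a minimal, hypothetical Dockerfile that such a repository could carry for its automated build. The base image, the copied files, and the command are illustrative assumptions and are not taken from the original example repository.

# Hypothetical minimal Dockerfile for an automated build; the base image,
# files, and command are illustrative assumptions.
FROM debian:jessie

# Copy the repository contents (for example, the README and any scripts) into the image.
COPY . /app
WORKDIR /app

# Default command run by containers started from the built image.
CMD ["cat", "README.md"]

When a commit lands in the linked repository, Docker Hub uses whatever Dockerfile it finds there to build and tag the image automatically.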
Putting all the pieces together

So, let's take a run at doing a Docker automated build and see what happens when we have all the pieces in place, and exactly what we have to do to kick off this automated build and create our own magic:

Update the code or any file inside your GitHub or Bitbucket repository.
Upon committing the update, the automated build will be kicked off and logged in Docker Hub for that automated repository.

Creating your own registry

To create a registry of your own, use the following command:

$ docker-machine create --driver vmwarefusion registry
Creating SSH key...
Creating VM...
Starting registry...
Waiting for VM to come online...
To see how to connect Docker to this machine, run the following command:

$ docker-machine env registry
export DOCKER_TLS_VERIFY="1"
export DOCKER_HOST="tcp://172.16.9.142:2376"
export DOCKER_CERT_PATH="/Users/scottpgallagher/.docker/machine/machines/registry"
export DOCKER_MACHINE_NAME="registry"
# Run this command to configure your shell:
# eval "$(docker-machine env registry)"

$ eval "$(docker-machine env registry)"
$ docker pull registry
$ docker run -p 5000:5000 -v <HOST_DIR>:/tmp/registry-dev registry:2

This specifies to use version 2 of the registry.

For AWS (as shown in the example from https://hub.docker.com/_/registry/):

$ docker run -e SETTINGS_FLAVOR=s3 -e AWS_BUCKET=acme-docker -e STORAGE_PATH=/registry -e AWS_KEY=AKIAHSHB43HS3J92MXZ -e AWS_SECRET=xdDowwlK7TJajV1Y7EoOZrmuPEJlHYcNP2k4j49T -e SEARCH_BACKEND=sqlalchemy -p 5000:5000 registry:2

Again, this will use version 2 of the self-hosted registry.

Then, you need to modify your Docker startup settings to point to the newly set up registry. Add the following options to the Docker startup line in the /etc/init.d/docker file:

-H tcp://127.0.0.1:2375 -H unix:///var/run/docker.sock --insecure-registry <REGISTRY_HOSTNAME>:5000

Most of these settings might already be there, and you might only need to add --insecure-registry <REGISTRY_HOSTNAME>:5000. To access this file, you will need to use docker-machine:

$ docker-machine ssh <docker-host_name>

Now, you can pull an image from the public Docker Hub as follows:

$ docker pull debian

Tag it, so that when we do a push, it will go to the registry we set up:

$ docker tag debian <REGISTRY_URL>:5000/debian

Then, we can push it to our registry:

$ docker push <REGISTRY_URL>:5000/debian

We can also pull it on any future clients (or after any updates we have pushed for it):

$ docker pull <REGISTRY_URL>:5000/debian

Summary

In this article, we dove deep into Docker Hub and also reviewed the new shiny Docker Subscription as well as the self-hosted Docker Registry. We went through an extensive review of each of them. You learned the differences between them all and how to utilize each one. We also looked deep into setting up automated builds and took a look at how to set up your own Docker Registry. We have covered a lot in this article, and I hope you have learned a lot and will put it all to good use.

Resources for Article:

Further resources on this subject:

Docker in Production [article]
Docker Hosts [article]
Hands On with Docker Swarm [article]

article-image-conditional-statements-functions-and-lists
Packt
05 Apr 2017
14 min read
Save for later

Conditional Statements, Functions, and Lists

Packt
05 Apr 2017
14 min read
In this article, by Sai Yamanoor and Srihari Yamanoor, authors of the book Python Programming with Raspberry Pi Zero, you will learn about conditional statements and how to make use of logical operators to check conditions using conditional statements. Next, you will learn to write simple functions in Python and discuss interfacing inputs to the Raspberry Pi's GPIO header using a tactile switch (momentary push button). We will also discuss motor control (this is a run-up to the final project) using the Raspberry Pi Zero and control the motors using the switch inputs. Let's get to it!

In this article, we will discuss the following topics:

Conditional statements in Python
Using conditional inputs to take actions based on GPIO pin states
Breaking out of loops using conditional statements
Functions in Python
GPIO callback functions
Motor control in Python

(For more resources related to this topic, see here.)

Conditional statements

In Python, conditional statements are used to determine if a specific condition is met by testing whether a condition is True or False. Conditional statements are used to determine how a program is executed. For example, conditional statements could be used to determine whether it is time to turn on the lights. The syntax is as follows:

if condition_is_true:
    do_something()

The condition is usually tested using a logical operator, and the set of tasks under the indented block is executed. Let's consider the example check_address_if_statement.py, where the user input to a program needs to be verified using a yes or no question:

check_address = input("Is your address correct(yes/no)? ")
if check_address == "yes":
    print("Thanks. Your address has been saved")
if check_address == "no":
    del(address)
    print("Your address has been deleted. Try again")
    check_address = input("Is your address correct(yes/no)? ")

In this example, the program expects a yes or no input. If the user provides the input yes, the condition if check_address == "yes" is true and the message Your address has been saved is printed on the screen. Likewise, if the user input is no, the program executes the indented code block under the logical test condition if check_address == "no" and deletes the variable address.

An if-else statement

In the preceding example, we used an if statement to test each condition. In Python, there is an alternative option named the if-else statement. The if-else statement enables testing an alternative condition if the main condition is not true:

check_address = input("Is your address correct(yes/no)? ")
if check_address == "yes":
    print("Thanks. Your address has been saved")
else:
    del(address)
    print("Your address has been deleted. Try again")

In this example, if the user input is yes, the indented code block under if is executed. Otherwise, the code block under else is executed.

An if-elif-else statement

In the preceding example, the program executes the code under the else block for any user input other than yes, that is, even if the user pressed the return key without providing any input or provided random characters instead of no. To handle such cases, the if-elif-else statement works as follows:

check_address = input("Is your address correct(yes/no)? ")
if check_address == "yes":
    print("Thanks. Your address has been saved")
elif check_address == "no":
    del(address)
    print("Your address has been deleted. Try again")
else:
    print("Invalid input. Try again")

If the user input is yes, the indented code block under the if statement is executed.
If the user input is no, the indented code block under elif (else-if) is executed. If the user input is something else, the program prints the message Invalid input. Try again.

It is important to note that the code block indentation determines the block of code that needs to be executed when a specific condition is met. We recommend modifying the indentation of the conditional statement blocks and finding out what happens to the program execution. This will help you understand the importance of indentation in Python. In the three examples that we discussed so far, it can be noted that an if statement does not need to be complemented by an else statement. The else and elif statements need to have a preceding if statement, or the program execution would result in an error.

Breaking out of loops

Conditional statements can be used to break out of a loop execution (a for loop or a while loop). When a specific condition is met, an if statement can be used to break out of the loop:

i = 0
while True:
    print("The value of i is ", i)
    i += 1
    if i > 100:
        break

In the preceding example, the while loop is executed as an infinite loop. The value of i is incremented and printed on the screen. The program breaks out of the while loop when the value of i is greater than 100, so the values of i from 0 to 100 are printed.
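As a small illustration that combines the ideas above (this snippet is not from the book), we can wrap the earlier address prompt in a while loop and use an if-elif-else block to keep asking until the user types yes or no, breaking out of the loop once a valid answer is received. The del(address) step is left out here so the sketch stays self-contained:

# Illustrative example (not from the book): combining a while loop,
# if-elif-else, and break to validate user input.
while True:
    check_address = input("Is your address correct(yes/no)? ")
    if check_address == "yes":
        print("Thanks. Your address has been saved")
        break
    elif check_address == "no":
        print("Your address has been deleted. Try again")
        break
    else:
        print("Invalid input. Try again")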
The applications of conditional statements: executing tasks using GPIO

Let's discuss an example where a simple push button is pressed. A button press is detected by reading the GPIO pin state. We are going to make use of conditional statements to execute a task based on the GPIO pin state.

Let us connect a button to the Raspberry Pi's GPIO. All you need to get started are a button, a pull-up resistor, and a few jumper wires. The figure given later shows an illustration of connecting the push button to the Raspberry Pi Zero. One of the push button's terminals is connected to the ground pin of the Raspberry Pi Zero's GPIO header. The schematic of the button's interface is shown here:

Raspberry Pi GPIO schematic

The other terminal of the push button is pulled up to 3.3V using a 10K resistor. The junction of the push button terminal and the 10K resistor is connected to GPIO pin 2.

Interfacing the push button to the Raspberry Pi Zero's GPIO—an image generated using Fritzing

Let's review the code required to read the button state. We make use of loops and conditional statements to read the button inputs using the Raspberry Pi Zero. We will be making use of the gpiozero library. For now, let's briefly discuss the concept of classes for this example. A class in Python is a blueprint that contains all the attributes that define an object. For example, the Button class of the gpiozero library contains all the attributes required to interface a button to the Raspberry Pi Zero's GPIO interface. These attributes include button states, the functions required to check the button states, and so on. In order to interface a button and read its states, we need to make use of this blueprint. The process of creating a copy of this blueprint is called instantiation.

Let's get started by importing the gpiozero library and instantiating the Button class of gpiozero. The button is interfaced to GPIO pin 2. We need to pass the pin number as an argument during instantiation:

from gpiozero import Button

#button is interfaced to GPIO 2
button = Button(2)

The gpiozero library's documentation is available at http://gpiozero.readthedocs.io/en/v1.2.0/api_input.html. According to the documentation, there is a variable named is_pressed in the Button class that could be tested using a conditional statement to determine if the button is pressed:

if button.is_pressed:
    print("Button pressed")

Whenever the button is pressed, the message Button pressed is printed on the screen. Let's stick this code snippet inside an infinite loop:

from gpiozero import Button

#button is interfaced to GPIO 2
button = Button(2)

while True:
    if button.is_pressed:
        print("Button pressed")

In an infinite while loop, the program constantly checks for a button press and prints the message as long as the button is being pressed. Once the button is released, it goes back to checking whether the button is pressed.

Breaking out a loop by counting button presses

Let's review another example where we would like to count the number of button presses and break out of the infinite loop when the button has received a predetermined number of presses:

i = 0
while True:
    if button.is_pressed:
        button.wait_for_release()
        i += 1
        print("Button pressed")
    if i >= 10:
        break

In this example, the program checks the state of the is_pressed variable. On receiving a button press, the program is paused until the button is released, using the wait_for_release method. When the button is released, the variable used to store the number of presses is incremented by 1. The program breaks out of the infinite loop when the button has received 10 presses.

A red momentary push button interfaced to Raspberry Pi Zero GPIO pin 2

Functions in Python

We briefly discussed functions in Python. Functions execute a predefined set of tasks. print is one example of a function in Python: it enables printing something to the screen. Let's discuss writing our own functions in Python.

A function can be declared in Python using the def keyword. A function could be defined as follows:

def my_func():
    print("This is a simple function")

In this function, my_func, the print statement is written under an indented code block. Any block of code that is indented under the function definition is executed when the function is called during code execution. The function could be executed as my_func().

Passing arguments to a function:

A function is always defined with parentheses. The parentheses are used to pass any requisite arguments to a function. Arguments are parameters required to execute a function. In the earlier example, there are no arguments passed to the function. Let's review an example where we pass an argument to a function:

def add_function(a, b):
    c = a + b
    print("The sum of a and b is ", c)

In this example, a and b are arguments to the function. The function adds a and b and prints the sum on the screen. The function add_function is called by passing the arguments 3 and 2 as add_function(3, 2), where a=3 and b=2, respectively. Hence, the arguments a and b are required to execute the function; calling the function without the arguments would result in an error. Errors related to missing arguments could be avoided by setting default values for the arguments:

def add_function(a=0, b=0):
    c = a + b
    print("The sum of a and b is ", c)

The preceding function expects two arguments. If we pass only one argument to this function, the other defaults to zero.
For example, in add_function(a=3), b defaults to 0, and in add_function(b=2), a defaults to 0. When an argument is not furnished while calling the function, it defaults to zero (as declared in the function definition). Similarly, the print function prints any variable passed as an argument; if the print function is called without any arguments, a blank line is printed.

Returning values from a function

Functions can perform a set of defined operations and finally return a value at the end. Let's consider the following example:

def square(a):
    return a**2

In this example, the function returns the square of the argument. In Python, the return keyword is used to return a value upon completion of execution.

The scope of variables in a function

There are two types of variables in a Python program: local and global variables. Local variables are local to a function; that is, a variable declared within a function is accessible only within that function. An example is as follows:

def add_function():
    a = 3
    b = 2
    c = a + b
    print("The sum of a and b is ", c)

In this example, the variables a and b are local to the function add_function. Let's consider an example of a global variable:

a = 3
b = 2

def add_function():
    c = a + b
    print("The sum of a and b is ", c)

add_function()

In this case, the variables a and b are declared in the main body of the Python script. They are accessible across the entire program. Now, let's consider this example:

a = 3

def my_function():
    a = 5
    print("The value of a is ", a)

my_function()
print("The value of a is ", a)

In this case, when my_function is called, the value of a printed inside the function is 5, while the print statement in the main body of the script still prints 3. In Python, assigning to a variable inside a function creates a new local variable, so a global variable cannot be modified inside a function unless it is declared using the global keyword:

a = 3

def my_function():
    global a
    a = 5
    print("The value of a is ", a)

my_function()
print("The value of a is ", a)

In general, it is not recommended to modify variables inside functions, as it is not a very safe practice. The best practice is to pass variables as arguments and return the modified value. Consider the following example:

a = 3

def my_function(a):
    a = 5
    print("The value of a is ", a)
    return a

a = my_function(a)
print("The value of a is ", a)

In the preceding program, the value of a is 3. It is passed as an argument to my_function. The function returns 5, which is saved to a. We were able to safely modify the value of a.

GPIO callback functions

Let's review some uses of functions with the GPIO example. Functions can be used to handle specific events related to the GPIO pins of the Raspberry Pi. For example, the gpiozero library provides the capability of calling a function either when a button is pressed or released:

from gpiozero import Button

def button_pressed():
    print("button pressed")

def button_released():
    print("button released")

#button is interfaced to GPIO 2
button = Button(2)

button.when_pressed = button_pressed
button.when_released = button_released

while True:
    pass

In this example, we make use of the when_pressed and when_released attributes of the library's Button class. When the button is pressed, the function button_pressed is executed. Likewise, when the button is released, the function button_released is executed. We make use of the while loop to avoid exiting the program and to keep listening for button events.
The pass keyword is used to avoid an error; nothing happens when a pass statement is executed. This capability of being able to execute different functions for different events is useful in applications like home automation. For example, it could be used to turn on the lights when it is dark and vice versa.

DC motor control in Python

In this section, we will discuss motor control using the Raspberry Pi Zero. Why discuss motor control? In order to control a motor, we need an H-bridge motor driver (discussing H-bridges is beyond our scope; there are several resources on H-bridge motor drivers, such as http://www.mcmanis.com/chuck/robotics/tutorial/h-bridge/). There are several motor driver kits designed for the Raspberry Pi. In this section, we will make use of the following kit: https://www.pololu.com/product/2753. The Pololu product page also provides instructions on how to connect the motor. Let's get to writing some Python code to operate the motor:

from gpiozero import Motor
from gpiozero import OutputDevice
import time

motor_1_direction = OutputDevice(13)
motor_2_direction = OutputDevice(12)

motor = Motor(5, 6)

motor_1_direction.on()
motor_2_direction.on()

motor.forward()
time.sleep(10)
motor.stop()

motor_1_direction.off()
motor_2_direction.off()

Raspberry Pi based motor control

In order to control the motor, let's declare the pins: the motor's speed pins and direction pins. As per the motor driver's documentation, the motors are controlled by GPIO pins 12, 13 and 5, 6, respectively:

from gpiozero import Motor
from gpiozero import OutputDevice
import time

motor_1_direction = OutputDevice(13)
motor_2_direction = OutputDevice(12)

motor = Motor(5, 6)

Controlling the motor is as simple as turning on the motor using the on() method and moving the motor in the forward direction using the forward() method:

motor.forward()

Similarly, reversing the motor direction could be done by calling the method reverse(). Stopping the motor could be done by:

motor.stop()

Some mini-project challenges for the reader:

In this article, we discussed interfacing inputs to the Raspberry Pi and controlling motors. Think about a project where we drive a mobile robot that reads inputs from whisker switches. Is it possible to build a wall-following robot by combining limit switches and motors?
We discussed controlling a DC motor in this article. How do we control a stepper motor using a Raspberry Pi?
How can we interface a motion sensor to control the lights at home using a Raspberry Pi Zero?

Summary

In this article, we discussed conditional statements and the applications of conditional statements in Python. We also discussed functions in Python, passing arguments to a function, returning values from a function, and the scope of variables in a Python program. We discussed callback functions and motor control in Python.

Resources for Article:

Further resources on this subject:

Sending Notifications using Raspberry Pi Zero [article]
Raspberry Pi Gaming Operating Systems [article]
Raspberry Pi LED Blueprints [article]