
How-To Tutorials

7019 Articles

How to Denoise Images with Neural Networks

Graham Annett
26 Sep 2016
8 min read
Denoising is a useful technique that can be applied to images, sound, text, and more. While deep learning is possibly not the best approach here, it is an interesting one and shows how versatile deep learning can be.

Get the data

The data we will be using is a dataset of faces from GitHub user hromi. It's a fun dataset to play around with because it contains both smiling and non-smiling faces, and it lends itself to a lot of different scenarios, such as training a model to detect a smile or to fill in missing parts of an image. The data is neatly packaged in a zip file and is easily downloaded with the following:

    import os
    import numpy as np
    import zipfile
    from urllib import request
    import matplotlib.pyplot as plt
    import matplotlib.image as mpimg
    import random
    %matplotlib inline

    url = 'https://github.com/hromi/SMILEsmileD/archive/master.zip'
    request.urlretrieve(url, 'data.zip')
    zipfile.ZipFile('data.zip').extractall()

This downloads all of the images to a folder, along with a variety of peripheral information that we will not be using, but that would be fun to incorporate into a model in other ways.

Preview images

First, let's load the data and preview a few images:

    x_pos = []
    base_path = 'SMILEsmileD-master/SMILEs/'
    positive_smiles = base_path + 'positives/positives7/'
    negative_smiles = base_path + 'negatives/negatives7/'

    for img in os.listdir(positive_smiles):
        x_pos.append(mpimg.imread(positive_smiles + img))

    # convert to np.array and scale to [0, 1] by dividing by the maximum value, 255
    x_pos = np.array(x_pos) / 255.
    # reshape to (samples, channels, height, width), which is explained later
    x_pos = x_pos.reshape(len(x_pos), 1, 64, 64)

    # plot 3 random images
    plt.figure(figsize=(8, 6))
    n = 3
    for i in range(n):
        ax = plt.subplot(2, 3, i + 1)  # i + 1 because subplot index 0 is deprecated in future matplotlib
        plt.imshow(random.choice(x_pos)[0], cmap=plt.cm.gray)
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)

Below is what you should get: [three sample face images]

Visualize noise

From here, let's add a random amount of noise and visualize it:

    plt.figure(figsize=(8, 10))
    plt.subplot(3, 2, 1).set_title('normal')
    plt.subplot(3, 2, 2).set_title('noisy')
    plt.tight_layout()

    n = 6
    for i in range(1, n + 1, 2):
        # 2 columns, with the clean image on the left and the noisy image on the right
        ax = plt.subplot(3, 2, i)
        rand_img = random.choice(x_pos)[0]
        random_factor = 0.05 * np.random.normal(loc=0., scale=1., size=rand_img.shape)
        # plot the normal image
        plt.imshow(rand_img, cmap=plt.cm.gray)
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        # plot the noisy image
        ax = plt.subplot(3, 2, i + 1)
        plt.imshow(rand_img + random_factor, cmap=plt.cm.gray)
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_visible(False)

Below is a comparison of a normal image on the left and a noisy image on the right: [comparison figure]

As you can see, the noisy images are still visually similar to the originals, but this technique can be very useful when an image is blurry or grainy, for example due to a high ISO setting on a traditional camera.
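The snippets above add Gaussian noise inline each time it is needed. If you want to reuse the same corruption step for visualization, training, and testing, a small helper is handy. The following is a minimal sketch of my own, not from the article; the add_noise name and the clipping step are additions, and clipping simply keeps pixel values inside the [0, 1] range the images were scaled to.

```python
import numpy as np

def add_noise(images, stddev=0.05, clip=True, seed=None):
    """Return a copy of `images` corrupted with additive Gaussian noise.

    `images` is assumed to be a float array scaled to [0, 1], as in the article.
    Clipping keeps pixel values valid after the noise is added; the article's
    code skips this step, so treat it as an optional extra.
    """
    rng = np.random.default_rng(seed)
    noisy = images + stddev * rng.normal(loc=0.0, scale=1.0, size=images.shape)
    return np.clip(noisy, 0.0, 1.0) if clip else noisy

# usage sketch: x_pos_noisy = add_noise(x_pos_train, stddev=0.05, seed=42)
```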
Prepare the dataset

From here, it is good practice to split the dataset if we intend to evaluate the model later, so we will split the data into a train set and a test set. We will also shuffle the images, since there is no requirement for the data to stay in any particular order.

    # shuffle the images in case there was some underlying order
    np.random.shuffle(x_pos)

    # split into test and train sets; we will also use Keras' built-in validation_split
    x_pos_train = x_pos[int(x_pos.shape[0] * .20):]
    x_pos_test = x_pos[:int(x_pos.shape[0] * .20)]

    x_pos_noisy = x_pos_train + 0.05 * np.random.normal(loc=0., scale=1., size=x_pos_train.shape)

Training the model

The model we are using is based on the Keras functional API, with a Sequential version for comparison.

Quick intro to the Keras functional API

While Keras previously offered both a graph model and a sequential model, almost all models used the Sequential form. This is the standard style of modeling in deep learning and consists of a linear ordering of layers (that is, no merges or splits). Using the Sequential model is incredibly modular and understandable, since the model is composed by adding layer upon layer. For example, our Keras model in Sequential form looks like the following:

    from keras.models import Sequential
    from keras.layers import Dense, Activation, Convolution2D, MaxPooling2D, UpSampling2D

    seqmodel = Sequential()
    seqmodel.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(1, 64, 64)))
    seqmodel.add(Activation('relu'))
    seqmodel.add(MaxPooling2D((2, 2), border_mode='same'))
    seqmodel.add(Convolution2D(32, 3, 3, border_mode='same'))
    seqmodel.add(Activation('relu'))
    seqmodel.add(UpSampling2D((2, 2)))
    seqmodel.add(Convolution2D(1, 3, 3, border_mode='same'))
    seqmodel.add(Activation('sigmoid'))

    seqmodel.compile(optimizer='adadelta', loss='binary_crossentropy')

Versus the functional model format:

    from keras.layers import Input, Dense, Activation, Convolution2D, MaxPooling2D, UpSampling2D
    from keras.models import Model

    input_img = Input(shape=(1, 64, 64))
    x = Convolution2D(32, 3, 3, border_mode='same')(input_img)
    x = Activation('relu')(x)
    x = MaxPooling2D((2, 2), border_mode='same')(x)
    x = Convolution2D(32, 3, 3, border_mode='same')(x)
    x = Activation('relu')(x)
    x = UpSampling2D((2, 2))(x)
    x = Convolution2D(1, 3, 3, activation='sigmoid', border_mode='same')(x)

    funcmodel = Model(input_img, x)
    funcmodel.compile(optimizer='adadelta', loss='binary_crossentropy')

While these models look very similar, the functional form is more versatile at the cost of being slightly more confusing. Let's fit both and compare the results to show that they are equivalent:

    seqmodel.fit(x_pos_noisy, x_pos_train, nb_epoch=10, batch_size=32, shuffle=True, validation_split=.20)
    funcmodel.fit(x_pos_noisy, x_pos_train, nb_epoch=10, batch_size=32, shuffle=True, validation_split=.20)

Following the training time and loss values, both should give near-identical results.
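The snippets above use the Keras 1.x API (Convolution2D, border_mode, nb_epoch, and channels-first (1, 64, 64) inputs). For readers on current versions, here is a rough equivalent of the same small autoencoder in the tf.keras 2.x functional API. This is my own translation, not code from the article, and it assumes TensorFlow 2.x with channels-last data of shape (samples, 64, 64, 1).

```python
from tensorflow.keras import layers, Model

# channels-last input: (64, 64, 1) instead of the article's channels-first (1, 64, 64)
inputs = layers.Input(shape=(64, 64, 1))
x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(inputs)
x = layers.MaxPooling2D((2, 2), padding='same')(x)
x = layers.Conv2D(32, (3, 3), padding='same', activation='relu')(x)
x = layers.UpSampling2D((2, 2))(x)
outputs = layers.Conv2D(1, (3, 3), padding='same', activation='sigmoid')(x)

autoencoder = Model(inputs, outputs)
autoencoder.compile(optimizer='adadelta', loss='binary_crossentropy')

# the article's arrays would need reshaping to (n, 64, 64, 1) for this version, e.g.:
# autoencoder.fit(x_train_noisy, x_train, epochs=10, batch_size=32,
#                 shuffle=True, validation_split=0.2)
```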
For the sake of argument, we will plot outputs from both models to show that they produce near-identical results.

    # create a noisy test set and generate predictions from the sequential and functional models
    x_noisy_test = x_pos_test + 0.05 * np.random.normal(loc=0., scale=1., size=x_pos_test.shape)
    f1 = funcmodel.predict(x_noisy_test)
    s1 = seqmodel.predict(x_noisy_test)

    plt.figure(figsize=(12, 12))
    plt.subplot(3, 4, 1).set_title('normal')
    plt.subplot(3, 4, 2).set_title('noisy')
    plt.subplot(3, 4, 3).set_title('denoised-functional')
    plt.subplot(3, 4, 4).set_title('denoised-sequential')

    n = 3
    for i in range(1, 12, 4):
        # randint is inclusive of the upper bound, so subtract 1 to avoid an index error
        img_index = random.randint(0, len(x_noisy_test) - 1)
        # plot the original image
        ax = plt.subplot(3, 4, i)
        plt.imshow(x_pos_test[img_index][0], cmap=plt.cm.gray)
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        # plot the noisy image
        ax = plt.subplot(3, 4, i + 1)
        plt.imshow(x_noisy_test[img_index][0], cmap=plt.cm.gray)
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_visible(False)
        # plot the denoised image from the functional model
        ax = plt.subplot(3, 4, i + 2)
        plt.imshow(f1[img_index][0], cmap=plt.cm.gray)
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_visible(False)
        # plot the denoised image from the sequential model
        ax = plt.subplot(3, 4, i + 3)
        plt.imshow(s1[img_index][0], cmap=plt.cm.gray)
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_visible(False)
    plt.tight_layout()

The result will be something like this: [grid comparing normal, noisy, and denoised images]

Since we only trained the network for 10 epochs and it was very shallow, we can add more layers, train for more epochs, and see whether that yields better results:

    seqmodel = Sequential()
    seqmodel.add(Convolution2D(32, 3, 3, border_mode='same', input_shape=(1, 64, 64)))
    seqmodel.add(Activation('relu'))
    seqmodel.add(MaxPooling2D((2, 2), border_mode='same'))
    seqmodel.add(Convolution2D(32, 3, 3, border_mode='same'))
    seqmodel.add(Activation('relu'))
    seqmodel.add(MaxPooling2D((2, 2), border_mode='same'))
    seqmodel.add(Convolution2D(32, 3, 3, border_mode='same'))
    seqmodel.add(Activation('relu'))
    seqmodel.add(UpSampling2D((2, 2)))
    seqmodel.add(Convolution2D(32, 3, 3, border_mode='same'))
    seqmodel.add(Activation('relu'))
    seqmodel.add(UpSampling2D((2, 2)))
    seqmodel.add(Convolution2D(1, 3, 3, border_mode='same'))
    seqmodel.add(Activation('sigmoid'))

    seqmodel.compile(optimizer='adadelta', loss='binary_crossentropy')
    seqmodel.fit(x_pos_noisy, x_pos_train, nb_epoch=50, batch_size=32, shuffle=True, validation_split=.20, verbose=0)

    s2 = seqmodel.predict(x_noisy_test)

    plt.figure(figsize=(10, 10))
    plt.subplot(3, 3, 1).set_title('normal')
    plt.subplot(3, 3, 2).set_title('noisy')
    plt.subplot(3, 3, 3).set_title('denoised')

    for i in range(1, 9, 3):
        img_index = random.randint(0, len(x_noisy_test) - 1)
        # plot the original image
        ax = plt.subplot(3, 3, i)
        plt.imshow(x_pos_test[img_index][0], cmap=plt.cm.gray)
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
        # plot the noisy image
        ax = plt.subplot(3, 3, i + 1)
        plt.imshow(x_noisy_test[img_index][0], cmap=plt.cm.gray)
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_visible(False)
        # plot the denoised image
        ax = plt.subplot(3, 3, i + 2)
        plt.imshow(s2[img_index][0], cmap=plt.cm.gray)
        ax.get_yaxis().set_visible(False)
        ax.get_xaxis().set_visible(False)
    plt.tight_layout()

While this is a small example, it is easily extended to other scenarios. The ability to denoise an image is by no means new or unique to neural networks, but it is an interesting experiment that shows one of the many potential uses of deep learning.

About the author

Graham Annett is an NLP engineer at Kip (Kipthis.com). He has been interested in deep learning for a bit over a year and has worked with and contributed to Keras. He can be found on GitHub.


Approaching a Penetration Test Using Metasploit

Packt
26 Sep 2016
17 min read

"In God I trust, all others I pen-test" - Binoj Koshy, cyber security expert

In this article by Nipun Jaswal, author of Mastering Metasploit, Second Edition, we will discuss penetration testing: an intentional attack on a computer-based system with the intention of finding vulnerabilities, figuring out security weaknesses, certifying that a system is secure, and gaining access to the system by exploiting these vulnerabilities. A penetration test tells an organization whether it is vulnerable to an attack, whether the implemented security is enough to oppose any attack, which security controls can be bypassed, and so on. Hence, a penetration test focuses on improving the security of an organization.

Achieving success in a penetration test largely depends on using the right set of tools and techniques. A penetration tester must choose the right set of tools and methodologies in order to complete a test. While talking about the best tools for penetration testing, the first one that comes to mind is Metasploit. It is considered one of the most effective auditing tools for carrying out penetration testing today. Metasploit offers a wide variety of exploits, an extensive exploit development environment, information-gathering and web-testing capabilities, and much more.

This article covers not only the frontend perspectives of Metasploit, but also the development and customization of the framework. It assumes that the reader has basic knowledge of the Metasploit framework; however, some sections will help you recall the basics as well. While covering Metasploit from the very basics to the elite level, we will stick to a step-by-step approach.

This article will help you recall the basics of penetration testing and Metasploit, which will help you warm up to the pace of this article. In this article, you will learn about the following topics:

- The phases of a penetration test
- The basics of the Metasploit framework
- The workings of exploits
- Testing a target network with Metasploit
- The benefits of using databases

An important point to note here is that we might not become expert penetration testers in a single day. It takes practice, familiarization with the work environment, the ability to perform in critical situations, and most importantly, an understanding of how we have to cycle through the various stages of a penetration test.

When we think about conducting a penetration test on an organization, we need to make sure that everything is set up correctly and follows a penetration testing standard. Therefore, if you feel you are new to penetration testing standards or uncomfortable with the term Penetration Testing Execution Standard (PTES), please refer to http://www.pentest-standard.org/index.php/PTES_Technical_Guidelines to become more familiar with penetration testing and vulnerability assessments. PTES defines the various phases of a penetration test. Refer to the http://www.pentest-standard.org website to set up the hardware and the systematic phases to be followed in a work environment; these setups are required to perform a professional penetration test.

Organizing a penetration test

Before we start firing sophisticated and complex attack vectors with Metasploit, we must get comfortable with the work environment.
Gathering knowledge about the work environment is a critical factor that comes into play before conducting a penetration test. Let us understand the various phases of a penetration test before jumping into Metasploit exercises, and see how to organize a penetration test on a professional scale.

Preinteractions

The very first phase of a penetration test, preinteractions, involves a discussion with the client of the critical factors regarding the conduct of a penetration test on the client's organization, company, institute, or network. This phase serves as the connecting line between the penetration tester and the client. Preinteractions help a client gain enough knowledge of what is about to be done over his or her network, domain, or server; the tester therefore serves here as an educator to the client. The penetration tester also discusses the scope of the test, all the domains that will be tested, and any special requirements that will be needed while conducting the test on the client's behalf. This includes special privileges, access to critical systems, and so on. The expected positives of the test should also be part of the discussion with the client in this phase. As a process, preinteractions cover the following key points:

Scope: This section discusses the scope of the project and estimates its size. The scope also defines what to include in the test and what to exclude from it. The tester discusses the ranges and domains under the scope and the type of test (black box or white box) to be performed. For white box testing, which access options are required by the tester? Questionnaires for administrators, the time duration of the test, whether to include stress testing or not, and payment terms and conditions are also included in the scope. A general scope document provides answers to the following questions:

- What are the target organization's biggest security concerns?
- What specific hosts, network address ranges, or applications should be tested?
- What specific hosts, network address ranges, or applications should explicitly NOT be tested?
- Are there any third parties that own systems or networks that are in the scope, and which systems do they own (written permission must have been obtained in advance by the target organization)?
- Will the test be performed against a live production environment or a test environment?
- Will the penetration test include the following testing techniques: ping sweep of network ranges, port scan of target hosts, vulnerability scan of targets, penetration of targets, application-level manipulation, client-side Java/ActiveX reverse engineering, physical penetration attempts, social engineering?
- Will the penetration test include internal network testing? If so, how will access be obtained?
- Are client/end-user systems included in the scope? If so, how many clients will be leveraged?
- Is social engineering allowed? If so, how may it be used?
- Are Denial of Service attacks allowed?
- Are dangerous checks/exploits allowed?

Goals: This section discusses the various primary and secondary goals that the penetration test is set to achieve. The common questions related to the goals are as follows:

- What is the business requirement for this penetration test?
  - It is required by a regulatory audit or standard
  - It is a proactive internal decision to determine all weaknesses
- What are the objectives?
  - Map out vulnerabilities
  - Demonstrate that the vulnerabilities exist
  - Test the incident response
  - Actual exploitation of a vulnerability in a network, system, or application
  - All of the above

Testing terms and definitions: This section discusses basic terminology with the client and helps him or her understand the terms well.

Rules of engagement: This section defines the time of testing, the timeline, permissions to attack, and regular meetings to update the status of the ongoing test. The common questions related to the rules of engagement are as follows:

- At what time do you want these tests to be performed?
  - During business hours
  - After business hours
  - Weekend hours
  - During a system maintenance window
- Will this testing be done on a production environment?
- If production environments should not be affected, does a similar environment (development and/or test systems) exist that can be used to conduct the penetration test?
- Who is the technical point of contact?

For more information on preinteractions, refer to http://www.pentest-standard.org/index.php/File:Pre-engagement.png.

Intelligence gathering / reconnaissance phase

In the intelligence-gathering phase, you need to gather as much information as possible about the target network. The target network could be a website, an organization, or a full-fledged Fortune company. The most important aspect is to gather information about the target from social media networks and to use Google Hacking (a way to extract sensitive information from Google using specialized queries) to find sensitive information related to the target. Footprinting the organization using active and passive attacks can also be an approach.

The intelligence phase is one of the most crucial phases in penetration testing. Properly gained knowledge about the target helps the tester simulate appropriate and exact attacks rather than trying all possible attack mechanisms; it also saves a large amount of time. This phase can consume 40 to 60 percent of the total testing time, as gaining access to the target depends largely upon how well the system is footprinted.

It is the duty of a penetration tester to gain adequate knowledge about the target by conducting a variety of scans, looking for open ports, identifying all the services running on those ports, and deciding which services are vulnerable and how to make use of them to enter the desired system. The procedures followed during this phase are required to identify the security policies currently in place at the target, and what we can do to breach them.

Let us discuss this using an example. Consider a black box test against a web server where the client wants a network stress test performed. Here, we will be testing a server to check what level of bandwidth and resource stress it can bear or, in simple terms, how the server responds to a Denial of Service (DoS) attack. A DoS attack or stress test is the name given to the procedure of sending an indefinite number of requests or an indefinite amount of data to a server in order to check whether the server is able to handle and respond to all the requests successfully or crashes, causing a DoS. A DoS can also occur if the target service is vulnerable to specially crafted requests or packets. To achieve this, we start our network stress-testing tool and launch an attack towards the target website. However, after a few seconds of launching the attack, we see that the server is not responding to our browser and the website does not open.
Additionally, a page shows up saying that the website is currently offline. So what does this mean? Did we successfully take out the web server we wanted? No! In reality, it is a sign of a protection mechanism set up by the server administrator, which sensed our malicious intent to take the server down and banned our IP address as a result. Therefore, we must collect the correct information and identify the various security services at the target before launching an attack.

A better approach is to test the web server from a different IP range. Keeping two to three different virtual private servers for testing may be a good approach. In addition, I advise you to test all attack vectors in a virtual environment before launching them against real targets. Proper validation of the attack vectors is mandatory because if we do not validate them prior to the attack, they may crash the service at the target, which is not favorable at all. Network stress tests should generally be performed towards the end of the engagement or in a maintenance window. Additionally, it is always helpful to ask the client to whitelist the IP addresses used for testing.

Now let us look at a second example. Consider a black box test against a Windows 2012 server. While scanning the target server, we find that port 80 and port 8080 are open. On port 80, we find the latest version of Internet Information Services (IIS) running, while on port 8080 we discover a vulnerable version of the Rejetto HFS server, which is prone to a remote code execution flaw. However, when we try to exploit this vulnerable version of HFS, the exploit fails. This is a common scenario where inbound malicious traffic is blocked by a firewall. In this case, we can simply change our approach to connecting back from the server, which establishes a connection from the target back to our system rather than us connecting to the server directly. This may prove more successful, as firewalls are commonly configured to inspect ingress traffic rather than egress traffic.

Coming back to the intelligence-gathering phase, when viewed as a process it involves the following procedures:

- Target selection: This involves selecting the targets to attack, identifying the goals of the attack, and the time of the attack.
- Covert gathering: This involves on-location gathering, the equipment in use, and dumpster diving. In addition, it covers off-site gathering that involves data warehouse identification; this phase is generally considered during a white box penetration test.
- Footprinting: This involves active or passive scans to identify the various technologies used at the target, which includes port scanning, banner grabbing, and so on.
- Identifying protection mechanisms: This involves identifying firewalls, filtering systems, network- and host-based protections, and so on.

For more information on gathering intelligence, refer to http://www.pentest-standard.org/index.php/Intelligence_Gathering.
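As a concrete illustration of the footprinting step above, the sketch below shows what basic port scanning and banner grabbing boil down to, using nothing but Python's standard socket module. It is not part of the article's Metasploit workflow (Metasploit ships auxiliary scanner modules for this), the target address and port list are placeholders, and, as the article stresses, it should only ever be run against systems you are explicitly authorized to test.

```python
import socket

def grab_banner(host, port, timeout=2.0):
    """Attempt a TCP connection and return whatever banner the service sends first."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            try:
                banner = sock.recv(1024).decode(errors='replace').strip()
            except socket.timeout:
                banner = ''  # port is open but the service stays silent until spoken to
            return True, banner
    except (ConnectionRefusedError, OSError):
        return False, ''

if __name__ == '__main__':
    target = '192.0.2.10'  # placeholder address from the TEST-NET-1 documentation range
    for port in (21, 22, 25, 80, 8080):
        is_open, banner = grab_banner(target, port)
        if is_open:
            print(f'{target}:{port} open  {banner!r}')
```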
Predicting the test grounds

A regular occurrence in penetration testers' lives is that when they start testing an environment, they already know what to do next. If they come across a Windows box, they switch their approach towards the exploits that work perfectly for Windows and leave the rest of the options aside. An example of this might be an exploit for the NETAPI vulnerability, which is the most favorable choice for exploiting a Windows XP box.

Suppose a penetration tester needs to visit an organization and, before going there, learns that 90 percent of the machines in the organization are running Windows XP and some of them use Windows 2000 Server. The tester quickly decides to use the NETAPI exploit for the XP-based systems and the DCOM exploit for the Windows 2000 servers from Metasploit to complete the testing phase successfully. We will also see how to use these exploits practically in a later section of this article.

Consider another example: a white box test on a web server where the server is hosting ASP and ASPX pages. In this case, we switch our approach to Windows-based exploits and IIS testing tools, ignoring the exploits and tools for Linux. Hence, predicting the environment under test helps to build the strategy we need to follow at the client's site.

For more information on the NETAPI vulnerability, visit http://technet.microsoft.com/en-us/security/bulletin/ms08-067. For more information on the DCOM vulnerability, visit http://www.rapid7.com/db/modules/exploit/Windows/dcerpc/ms03_026_dcom.

Modeling threats

In order to conduct a comprehensive penetration test, threat modeling is required. This phase focuses on modeling the correct threats, their effect, and their categorization based on the impact they can cause. Based on the analysis made during the intelligence-gathering phase, we can model the best possible attack vectors. Threat modeling applies to business asset analysis, process analysis, threat analysis, and threat capability analysis. This phase answers the following set of questions:

- How can we attack a particular network?
- To which crucial sections do we need to gain access?
- What approach is best suited for the attack?
- What are the highest-rated threats?

Modeling threats will help a penetration tester to perform the following set of operations:

- Gather relevant documentation about high-level threats
- Identify an organization's assets on a categorical basis
- Identify and categorize threats
- Map threats to the assets of an organization

Modeling threats helps to define the highest-priority assets along with the threats that can influence them. Now, let us discuss a third example. Consider a black box test against a company's website. Here, information about the company's clients is the primary asset. It is also possible that transaction records are stored in a different database on the same backend. In this case, an attacker can use the threat of a SQL injection to step over to the transaction records database; hence, transaction records are the secondary asset. Mapping a SQL injection attack to the primary and secondary assets is achievable during this phase.

Vulnerability scanners such as Nexpose and the Pro version of Metasploit can help model threats clearly and quickly using an automated approach. This can prove handy while conducting large tests. For more information on the processes involved during the threat modeling phase, refer to http://www.pentest-standard.org/index.php/Threat_Modeling.
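To make the asset and threat mapping above a little more tangible, here is a small sketch of how a tester might keep a scored list of threat-to-asset mappings and sort it by priority. The assets, threats, and scores are entirely made up for illustration and are not from the article; real engagements would pull ratings from the intelligence-gathering analysis or from a scanner such as Nexpose.

```python
from dataclasses import dataclass

@dataclass
class ThreatMapping:
    asset: str       # e.g. "client records database"
    threat: str      # e.g. "SQL injection via search form"
    impact: int      # 1 (low) to 5 (critical), illustrative scale
    likelihood: int  # 1 (unlikely) to 5 (almost certain)

    @property
    def risk(self) -> int:
        # a simple impact-times-likelihood score, just to order the work
        return self.impact * self.likelihood

mappings = [
    ThreatMapping("client records database", "SQL injection via website", 5, 4),
    ThreatMapping("transaction records database", "pivot from primary database", 5, 3),
    ThreatMapping("corporate website", "defacement via CMS exploit", 2, 3),
]

for m in sorted(mappings, key=lambda m: m.risk, reverse=True):
    print(f"risk={m.risk:2d}  {m.threat} -> {m.asset}")
```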
Vulnerability analysis

Vulnerability analysis is the process of discovering flaws in a system or an application. These flaws can range from server and web application flaws to insecure application design, vulnerable database services, and VOIP- or SCADA-based services. This phase generally contains three different mechanisms: testing, validation, and research. Testing consists of active and passive tests. Validation consists of dropping the false positives and confirming the existence of vulnerabilities through manual validation. Research refers to verifying a vulnerability that has been found and triggering it to confirm its existence. For more information on the processes involved during the vulnerability analysis phase, refer to http://www.pentest-standard.org/index.php/Vulnerability_Analysis.

Exploitation and post-exploitation

The exploitation phase involves taking advantage of the previously discovered vulnerabilities. This phase is considered the actual attack phase. In it, a penetration tester fires exploits at the target vulnerabilities of a system in order to gain access. This phase is covered heavily throughout the article. The post-exploitation phase follows exploitation and covers the various tasks we can perform on an exploited system, such as elevating privileges, uploading and downloading files, pivoting, and so on. For more information on the processes involved during the exploitation phase, refer to http://www.pentest-standard.org/index.php/Exploitation. For more information on post-exploitation, refer to http://www.pentest-standard.org/index.php/Post_Exploitation.

Reporting

Creating a formal report of the entire penetration test is the last phase of a penetration test. Identifying key vulnerabilities, creating charts and graphs, recommendations, and proposed fixes are a vital part of the penetration test report. An entire section dedicated to reporting is covered in the latter half of this article. For more information on the processes involved during the reporting phase, refer to http://www.pentest-standard.org/index.php/Reporting.

Mounting the environment

Before going to war, soldiers must make sure that their artillery is working perfectly. This is exactly what we are going to do. Testing an environment successfully depends on how well your test labs are configured. Moreover, a successful test answers the following set of questions:

- How well is your test lab configured?
- Are all the required tools for testing available?
- How good is your hardware at supporting such tools?

Before we begin to test anything, we must make sure that all the required tools are available and that everything works perfectly.

Summary

Throughout this article, we have introduced the phases involved in penetration testing. We have also seen how we can set up Metasploit and conduct a black box test on a network, and we recalled the basic functionalities of Metasploit. We saw how we could perform a penetration test on two different Linux boxes and Windows Server 2012, and we looked at the benefits of using databases in Metasploit. After completing this article, we are equipped with the following:

- Knowledge of the phases of a penetration test
- The benefits of using databases in Metasploit
- The basics of the Metasploit framework
- Knowledge of the workings of exploits and auxiliary modules
- Knowledge of the approach to penetration testing with Metasploit

The primary goal of this article was to inform you about the penetration test phases and Metasploit. Next, we will dive into the coding side of Metasploit and write custom functionality for the Metasploit framework.


How to add Unit Tests to a Sails Framework Application

Luis Lobo
26 Sep 2016
8 min read
There are different ways to implement unit tests for a Node.js application. Most of them use Mocha as the test framework and Chai as the assertion library, and some of them include Istanbul for code coverage. We will be using these tools, not going into deep detail on how to use them, but rather on how to successfully configure and implement them for a Sails project.

1) Creating a new application from scratch (if you don't have one already)

First of all, let's create a Sails application from scratch. The Sails version used for this article is 0.12.3. If you already have a Sails application, you can skip to step 2. Issuing the following command creates the new application:

    $ sails new sails-test-article

Once we create it, we will have the following file structure:

    ./sails-test-article
    ├── api
    │   ├── controllers
    │   ├── models
    │   ├── policies
    │   ├── responses
    │   └── services
    ├── assets
    │   ├── images
    │   ├── js
    │   │   └── dependencies
    │   ├── styles
    │   └── templates
    ├── config
    │   ├── env
    │   └── locales
    ├── tasks
    │   ├── config
    │   └── register
    └── views

2) Create a basic test structure

We want a folder structure that contains all our tests. For now we will only add unit tests; in this project we want to test only services and controllers.

Add the necessary modules:

    npm install --save-dev mocha chai istanbul supertest

Folder structure

Let's create the test folder structure that supports our tests:

    mkdir -p test/fixtures test/helpers test/unit/controllers test/unit/services

After the creation of the folders, we will have this structure:

    ./sails-test-article
    ├── api [...]
    ├── test
    │   ├── fixtures
    │   ├── helpers
    │   └── unit
    │       ├── controllers
    │       └── services
    └── views

We now create a mocha.opts file inside the test folder. It contains Mocha options, such as a timeout per test run, that are passed to Mocha by default every time it runs, one option per line as described in the Mocha docs:

    --require chai
    --reporter spec
    --recursive
    --ui bdd
    --globals sails
    --timeout 5s
    --slow 2000

Up to this point, we have all our tools set up. We can do a very basic test run:

    mocha test

It prints out this:

    0 passing (2ms)

Normally, Node.js applications define a test script in the package.json file. Edit it so that it now looks like this:

    "scripts": {
      "debug": "node debug app.js",
      "start": "node app.js",
      "test": "mocha test"
    }

We are ready for the next step.

3) Bootstrap file

The bootstrap.js file, which we place inside the test folder so the tests can require it, defines the environment that all tests use. Inside it, we define before and after events; in them, we are starting and stopping (or 'lifting' and 'lowering' in Sails language) our Sails application. Since Sails makes models, controllers, and services globally available at runtime, we need to start the application here.

    var sails = require('sails');
    var _ = require('lodash');

    global.chai = require('chai');
    global.should = chai.should();

    before(function (done) {
      // Increase the Mocha timeout so that Sails has enough time to lift.
      this.timeout(5000);

      sails.lift({
        log: { level: 'silent' },
        hooks: { grunt: false },
        models: {
          connection: 'unitTestConnection',
          migrate: 'drop'
        },
        connections: {
          unitTestConnection: {
            adapter: 'sails-disk'
          }
        }
      }, function (err, server) {
        if (err) return done(err);
        // here you can load fixtures, etc.
        done(err, sails);
      });
    });

    after(function (done) {
      // here you can clear fixtures, etc.
      if (sails && _.isFunction(sails.lower)) {
        sails.lower(done);
      }
    });

This file will be required by each of our tests. That way, each test can be run individually if needed, or as part of the whole suite.
4) Services tests

We now add two models and one service to show how to test services.

Create a Comment model in /api/models/Comment.js:

    /**
     * Comment.js
     */
    module.exports = {
      attributes: {
        comment: {type: 'string'},
        timestamp: {type: 'datetime'}
      }
    };

Create a Post model in /api/models/Post.js:

    /**
     * Post.js
     */
    module.exports = {
      attributes: {
        title: {type: 'string'},
        body: {type: 'string'},
        timestamp: {type: 'datetime'},
        comments: {model: 'Comment'}
      }
    };

Create a Post service in /api/services/PostService.js:

    /**
     * PostService
     *
     * @description :: Service that handles posts
     */
    module.exports = {
      getPostsWithComments: function () {
        return Post
          .find()
          .populate('comments');
      }
    };

To test the Post service, we create a test for it in /test/unit/services/PostService.spec.js. In the case of services, we want to test business logic, so basically we call the service methods and evaluate the results using an assertion library; in this case, Chai's should.

    /* global PostService */

    // Here is where we init our 'sails' environment and application
    require('../../bootstrap');

    // Here we have our tests
    describe('The PostService', function () {

      before(function (done) {
        // create three posts before the assertions run
        Post.create({})
          .then(function () { return Post.create({}); })
          .then(function () { return Post.create({}); })
          .then(function () { done(); })
          .catch(done);
      });

      it('should return all posts with their comments', function (done) {
        PostService
          .getPostsWithComments()
          .then(function (posts) {
            posts.should.be.an('array');
            posts.should.have.length(3);
            done();
          })
          .catch(done);
      });
    });

We can now test our service by running:

    npm test

The result should be similar to this one:

    > sails-test-article@0.0.0 test /home/lobo/dev/luislobo/sails-test-article
    > mocha test

      The PostService
        ✓ should return all posts with their comments

      1 passing (979ms)

5) Controllers tests

In the case of controllers, we want to validate that our requests are working and that they return the correct error codes and the correct data. For this, we make use of the SuperTest module, which provides HTTP assertions.
We now add a Post controller with this content in /api/controllers/PostController.js:

    /**
     * PostController
     */
    module.exports = {
      getPostsWithComments: function (req, res) {
        PostService.getPostsWithComments()
          .then(function (posts) {
            res.ok(posts);
          })
          .catch(res.negotiate);
      }
    };

And now we create a Post controller test in /test/unit/controllers/PostController.spec.js:

    // Here is where we init our 'sails' environment and application
    var supertest = require('supertest');
    require('../../bootstrap');

    describe('The PostController', function () {

      var createdPostId = 0;

      it('should create a post', function (done) {
        var agent = supertest.agent(sails.hooks.http.app);
        agent
          .post('/post')
          .set('Accept', 'application/json')
          .send({"title": "a post", "body": "some body"})
          .expect('Content-Type', /json/)
          .expect(201)
          .end(function (err, result) {
            if (err) {
              done(err);
            } else {
              result.body.should.be.an('object');
              result.body.should.have.property('id');
              result.body.should.have.property('title', 'a post');
              result.body.should.have.property('body', 'some body');
              createdPostId = result.body.id;
              done();
            }
          });
      });

      it('should get posts with comments', function (done) {
        var agent = supertest.agent(sails.hooks.http.app);
        agent
          .get('/post/getPostsWithComments')
          .set('Accept', 'application/json')
          .expect('Content-Type', /json/)
          .expect(200)
          .end(function (err, result) {
            if (err) {
              done(err);
            } else {
              result.body.should.be.an('array');
              result.body.should.have.length(1);
              done();
            }
          });
      });

      it('should delete post created', function (done) {
        var agent = supertest.agent(sails.hooks.http.app);
        agent
          .delete('/post/' + createdPostId)
          .set('Accept', 'application/json')
          .expect('Content-Type', /json/)
          .expect(200)
          .end(function (err, result) {
            if (err) {
              return done(err);
            } else {
              return done(null, result.text);
            }
          });
      });
    });

After running the tests again:

    npm test

We can see that we now have 4 tests:

    > sails-test-article@0.0.0 test /home/lobo/dev/luislobo/sails-test-article
    > mocha test

      The PostController
        ✓ should create a post
        ✓ should get posts with comments
        ✓ should delete post created

      The PostService
        ✓ should return all posts with their comments

      4 passing (1s)

6) Code coverage

Finally, we want to know whether our code is covered by our unit tests, with the help of Istanbul. To generate a report, we just need to run:

    istanbul cover _mocha test

Once we run it, we will have a result similar to this one:

      The PostController
        ✓ should create a post
        ✓ should get posts with comments
        ✓ should delete post created

      The PostService
        ✓ should return all posts with their comments

      4 passing (1s)

    =============================================================================
    Writing coverage object [/home/lobo/dev/luislobo/sails-test-article/coverage/coverage.json]
    Writing coverage reports at [/home/lobo/dev/luislobo/sails-test-article/coverage]
    =============================================================================

    =============================== Coverage summary ===============================
    Statements   : 26.95% ( 45/167 )
    Branches     : 3.28% ( 4/122 )
    Functions    : 35.29% ( 6/17 )
    Lines        : 26.95% ( 45/167 )
    ================================================================================

In this case, we can see that the percentages are not very nice. We don't have to worry much about these, since most of the "not covered" code is in /api/policies and /api/responses. You can check that result in a file that was created after Istanbul ran, in ./coverage/lcov-report/index.html.
If you remove those folders and run it again, you will see the difference:

    rm -rf api/policies api/responses
    istanbul cover _mocha test

Now the result is much better: 100% coverage!

      The PostController
        ✓ should create a post
        ✓ should get posts with comments
        ✓ should delete post created

      The PostService
        ✓ should return all posts with their comments

      4 passing (1s)

    =============================================================================
    Writing coverage object [/home/lobo/dev/luislobo/sails-test-article/coverage/coverage.json]
    Writing coverage reports at [/home/lobo/dev/luislobo/sails-test-article/coverage]
    =============================================================================

    =============================== Coverage summary ===============================
    Statements   : 100% ( 24/24 )
    Branches     : 100% ( 0/0 )
    Functions    : 100% ( 4/4 )
    Lines        : 100% ( 24/24 )
    ================================================================================

Now, if you check the report again, you will see a different picture: [coverage report screenshot]

You can get the source code for each of the steps here. I hope you enjoyed the post!

References

- Sails documentation on testing your code
- This post follows recommendations from the Sails author, Mike McNeil, and adds some extra material based on my own experience developing applications using the Sails framework.

About the author

Luis Lobo Borobia is the CTO at FictionCity.NET, a mentor and advisor, independent software engineer, consultant, and conference speaker. He has a background as a software analyst and designer, creating, designing, and implementing software products, solutions, frameworks, and platforms for several kinds of industries. In the last few years, he has focused on research and development for the Internet of Things, using the latest bleeding-edge software and hardware technologies available.


Using model serializers to eliminate duplicate code

Packt
23 Sep 2016
12 min read
In this article by Gastón C. Hillar, author of Building RESTful Python Web Services, we will cover the use of model serializers to eliminate duplicate code and the use of the default parsing and rendering options.

Using model serializers to eliminate duplicate code

The GameSerializer class declares many attributes with the same names that we used in the Game model and repeats information such as the types and the max_length values. The GameSerializer class is a subclass of rest_framework.serializers.Serializer; it declares attributes that we manually mapped to the appropriate types, and it overrides the create and update methods.

Now, we will create a new version of the GameSerializer class that inherits from the rest_framework.serializers.ModelSerializer class. The ModelSerializer class automatically populates both a set of default fields and a set of default validators. In addition, the class provides default implementations for the create and update methods. If you have any experience with the Django web framework, you will notice that the Serializer and ModelSerializer classes are similar to the Form and ModelForm classes.

Now, go to the gamesapi/games folder and open the serializers.py file. Replace the code in this file with the following code, which declares the new version of the GameSerializer class. The code file for the sample is included in the restful_python_chapter_02_01 folder.

    from rest_framework import serializers
    from games.models import Game


    class GameSerializer(serializers.ModelSerializer):
        class Meta:
            model = Game
            fields = ('id', 'name', 'release_date', 'game_category', 'played')

The new GameSerializer class declares a Meta inner class that declares two attributes: model and fields. The model attribute specifies the model related to the serializer, that is, the Game class. The fields attribute specifies a tuple of strings whose values indicate the field names that we want to include in the serialization from the related model.

There is no need to override either the create or update methods because the generic behavior will be enough in this case; the ModelSerializer superclass provides implementations for both. We have removed the boilerplate code that we didn't require in the GameSerializer class: we just needed to specify the desired set of fields in a tuple. Now, the types related to the game fields are included only in the Game class.

Press Ctrl + C to quit Django's development server and execute the following command to start it again:

    python manage.py runserver
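If you want to confirm that the slimmed-down serializer produces the same output as the old hand-written one, a quick check from the Django shell works well. This snippet is my own addition, not from the book; it assumes at least one Game row already exists in the database and uses the field names declared above.

```python
# Run inside: python manage.py shell
from games.models import Game
from games.serializers import GameSerializer

game = Game.objects.first()  # assumes the table is not empty
serializer = GameSerializer(game)

# Should print a dict limited to the fields listed in Meta.fields, e.g.
# {'id': 1, 'name': '...', 'release_date': '...', 'game_category': '...', 'played': False}
print(serializer.data)
```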
Using the default parsing and rendering options and moving beyond JSON

The APIView class specifies default settings for each view that we can override by specifying appropriate values in the gamesapi/settings.py file or by overriding the class attributes in subclasses. As previously explained, the use of the APIView class under the hood makes the decorator apply these default settings. Thus, whenever we use the decorator, the default parser classes and the default renderer classes will be associated with the function views.

By default, the value for DEFAULT_PARSER_CLASSES is the following tuple of classes:

    (
        'rest_framework.parsers.JSONParser',
        'rest_framework.parsers.FormParser',
        'rest_framework.parsers.MultiPartParser'
    )

When we use the decorator, the API will be able to handle any of the following content types through the appropriate parsers when accessing the request.data attribute:

- application/json
- application/x-www-form-urlencoded
- multipart/form-data

When we access the request.data attribute in the functions, Django REST Framework examines the value of the Content-Type header in the incoming request and determines the appropriate parser to parse the request content. If we use the previously explained default values, Django REST Framework will be able to parse the previously listed content types. However, it is extremely important that the request specifies the appropriate value in the Content-Type header.

We have to remove the usage of the rest_framework.parsers.JSONParser class in the functions to make it possible to work with all the configured parsers and stop working with a parser that only works with JSON. The game_list function executes the following two lines when request.method is equal to 'POST':

    game_data = JSONParser().parse(request)
    game_serializer = GameSerializer(data=game_data)

We will remove the first line that uses the JSONParser and pass request.data as the data argument for the GameSerializer. The following line will replace the previous lines:

    game_serializer = GameSerializer(data=request.data)

The game_detail function executes the following two lines when request.method is equal to 'PUT':

    game_data = JSONParser().parse(request)
    game_serializer = GameSerializer(game, data=game_data)

We will make the same edits as in the game_list function: remove the first line that uses the JSONParser and pass request.data as the data argument for the GameSerializer. The following line will replace the previous lines:

    game_serializer = GameSerializer(game, data=request.data)

By default, the value for DEFAULT_RENDERER_CLASSES is the following tuple of classes:

    (
        'rest_framework.renderers.JSONRenderer',
        'rest_framework.renderers.BrowsableAPIRenderer',
    )

When we use the decorator, the API will be able to render any of the following content types in the response through the appropriate renderers when working with the rest_framework.response.Response object:

- application/json
- text/html

By default, the value for DEFAULT_CONTENT_NEGOTIATION_CLASS is the rest_framework.negotiation.DefaultContentNegotiation class. When we use the decorator, the API will use this content negotiation class to select the appropriate renderer for the response based on the incoming request. This way, when a request specifies that it will accept text/html, the content negotiation class selects the rest_framework.renderers.BrowsableAPIRenderer to render the response and generate text/html instead of application/json.

We have to replace the usages of both the JSONResponse and HttpResponse classes in the functions with the rest_framework.response.Response class. The Response class uses the previously explained content negotiation features, renders the received data into the appropriate content type, and returns it to the client.
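If you prefer these defaults to be explicit (or plan to trim them later), they can be pinned in the REST_FRAMEWORK dictionary in gamesapi/settings.py. This is an optional sketch rather than a step the book asks for; the values simply restate the defaults listed above.

```python
# gamesapi/settings.py -- making the framework defaults explicit (optional)
REST_FRAMEWORK = {
    'DEFAULT_PARSER_CLASSES': (
        'rest_framework.parsers.JSONParser',
        'rest_framework.parsers.FormParser',
        'rest_framework.parsers.MultiPartParser',
    ),
    'DEFAULT_RENDERER_CLASSES': (
        'rest_framework.renderers.JSONRenderer',
        'rest_framework.renderers.BrowsableAPIRenderer',
    ),
    'DEFAULT_CONTENT_NEGOTIATION_CLASS':
        'rest_framework.negotiation.DefaultContentNegotiation',
}
```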
Now, go to the gamesapi/games folder and open the views.py file. Replace the code in this file with the following code, which removes the JSONResponse class and uses the @api_view decorator for the functions together with the rest_framework.response.Response class. The code file for the sample is included in the restful_python_chapter_02_02 folder.

    from rest_framework.parsers import JSONParser
    from rest_framework import status
    from rest_framework.decorators import api_view
    from rest_framework.response import Response
    from games.models import Game
    from games.serializers import GameSerializer


    @api_view(['GET', 'POST'])
    def game_list(request):
        if request.method == 'GET':
            games = Game.objects.all()
            games_serializer = GameSerializer(games, many=True)
            return Response(games_serializer.data)
        elif request.method == 'POST':
            game_serializer = GameSerializer(data=request.data)
            if game_serializer.is_valid():
                game_serializer.save()
                return Response(game_serializer.data, status=status.HTTP_201_CREATED)
            return Response(game_serializer.errors, status=status.HTTP_400_BAD_REQUEST)


    @api_view(['GET', 'PUT', 'POST'])
    def game_detail(request, pk):
        try:
            game = Game.objects.get(pk=pk)
        except Game.DoesNotExist:
            return Response(status=status.HTTP_404_NOT_FOUND)

        if request.method == 'GET':
            game_serializer = GameSerializer(game)
            return Response(game_serializer.data)
        elif request.method == 'PUT':
            game_serializer = GameSerializer(game, data=request.data)
            if game_serializer.is_valid():
                game_serializer.save()
                return Response(game_serializer.data)
            return Response(game_serializer.errors, status=status.HTTP_400_BAD_REQUEST)
        elif request.method == 'DELETE':
            game.delete()
            return Response(status=status.HTTP_204_NO_CONTENT)

After you save the previous changes, run the following command:

    http OPTIONS :8000/games/

The following is the equivalent curl command:

    curl -iX OPTIONS :8000/games/

The previous command composes and sends the HTTP request OPTIONS http://localhost:8000/games/. The request matches and runs the views.game_list function, that is, the game_list function declared within the games/views.py file. We added the @api_view decorator to this function, and therefore it is capable of determining the supported HTTP verbs as well as its parsing and rendering capabilities. The following lines show the output:

    HTTP/1.0 200 OK
    Allow: GET, POST, OPTIONS
    Content-Type: application/json
    Date: Thu, 09 Jun 2016 20:24:31 GMT
    Server: WSGIServer/0.2 CPython/3.5.1
    Vary: Accept, Cookie
    X-Frame-Options: SAMEORIGIN

    {
        "description": "",
        "name": "Game List",
        "parses": [
            "application/json",
            "application/x-www-form-urlencoded",
            "multipart/form-data"
        ],
        "renders": [
            "application/json",
            "text/html"
        ]
    }

The response header includes an Allow key with a comma-separated list of the HTTP verbs supported by the resource collection as its value: GET, POST, OPTIONS. As our request didn't specify the allowed content type, the function rendered the response with the default application/json content type. The response body specifies the content types that the resource collection parses and the content types that it renders.

Run the following command to compose and send an HTTP request with the OPTIONS verb for a game resource. Don't forget to replace 3 with the primary key value of an existing game in your configuration:

    http OPTIONS :8000/games/3/

The following is the equivalent curl command:

    curl -iX OPTIONS :8000/games/3/

The previous command composes and sends the HTTP request OPTIONS http://localhost:8000/games/3/. The request matches and runs the views.game_detail function, that is, the game_detail function declared within the games/views.py file. We also added the @api_view decorator to this function, and therefore it is capable of determining the supported HTTP verbs as well as its parsing and rendering capabilities.
The following lines show the output:

    HTTP/1.0 200 OK
    Allow: GET, POST, OPTIONS, PUT
    Content-Type: application/json
    Date: Thu, 09 Jun 2016 21:35:58 GMT
    Server: WSGIServer/0.2 CPython/3.5.1
    Vary: Accept, Cookie
    X-Frame-Options: SAMEORIGIN

    {
        "description": "",
        "name": "Game Detail",
        "parses": [
            "application/json",
            "application/x-www-form-urlencoded",
            "multipart/form-data"
        ],
        "renders": [
            "application/json",
            "text/html"
        ]
    }

The response header includes an Allow key with a comma-separated list of the HTTP verbs supported by the resource as its value: GET, POST, OPTIONS, PUT. The response body specifies the content types that the resource parses and the content types that it renders, with the same contents received in the previous OPTIONS request applied to a resource collection, that is, to a games collection.
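The same checks can be automated with Django REST Framework's test client instead of HTTPie or curl. The sketch below is my own addition rather than part of the book's walkthrough; it assumes the URL patterns route /games/ to the game_list function as described above and would live in something like games/tests.py.

```python
# games/tests.py (illustrative; not part of the book's code files)
from rest_framework.test import APITestCase


class OptionsMetadataTests(APITestCase):
    def test_collection_options_lists_supported_verbs_and_formats(self):
        response = self.client.options('/games/')
        self.assertEqual(response.status_code, 200)
        # the Allow header should advertise the verbs accepted by game_list
        for verb in ('GET', 'POST', 'OPTIONS'):
            self.assertIn(verb, response['Allow'])
        # the metadata body should list the parsers and renderers seen above
        self.assertIn('application/json', response.data['parses'])
        self.assertIn('application/json', response.data['renders'])
```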
When we composed and sent POST and PUT commands earlier, we had to use the -H "Content-Type: application/json" option to tell curl to send the data specified after the -d option as application/json instead of the default application/x-www-form-urlencoded. Now, in addition to application/json, our API is capable of parsing application/x-www-form-urlencoded and multipart/form-data data specified in POST and PUT requests. Thus, we can compose and send a POST command that sends the data as application/x-www-form-urlencoded.

We will compose and send an HTTP request to create a new game. In this case, we will use the -f option for HTTPie, which serializes data items from the command line as form fields and sets the Content-Type header key to the application/x-www-form-urlencoded value:

    http -f POST :8000/games/ name='Toy Story 4' game_category='3D RPG' played=false release_date='2016-05-18T03:02:00.776594Z'

The following is the equivalent curl command. Notice that we don't use the -H option, so curl sends the data in the default application/x-www-form-urlencoded format:

    curl -iX POST --data-urlencode 'name=Toy Story 4' --data-urlencode 'game_category=3D RPG' --data-urlencode 'played=false' --data-urlencode 'release_date=2016-05-18T03:02:00.776594Z' :8000/games/

The previous commands compose and send the following HTTP request: POST http://localhost:8000/games/ with the Content-Type header key set to the application/x-www-form-urlencoded value and the following data:

    name=Toy+Story+4&game_category=3D+RPG&played=false&release_date=2016-05-18T03%3A02%3A00.776594Z

The request specifies /games/, and therefore it matches '^games/$' and runs the views.game_list function, that is, the updated game_list function declared within the games/views.py file. As the HTTP verb for the request is POST, the request.method property is equal to 'POST', and therefore the function executes the code that creates a GameSerializer instance and passes request.data as the data argument for its creation. The rest_framework.parsers.FormParser class parses the data received in the request, the code creates a new Game, and, if the data is valid, it saves the new Game. If the new Game was successfully persisted in the database, the function returns an HTTP 201 Created status code and the recently persisted Game serialized to JSON in the response body. The following lines show an example response for the HTTP request, with the new Game object in the JSON response:

    HTTP/1.0 201 Created
    Allow: OPTIONS, POST, GET
    Content-Type: application/json
    Date: Fri, 10 Jun 2016 20:38:40 GMT
    Server: WSGIServer/0.2 CPython/3.5.1
    Vary: Accept, Cookie
    X-Frame-Options: SAMEORIGIN

    {
        "game_category": "3D RPG",
        "id": 20,
        "name": "Toy Story 4",
        "played": false,
        "release_date": "2016-05-18T03:02:00.776594Z"
    }

After the changes we made in the code, we can run the following command to see what happens when we compose and send an HTTP request with an HTTP verb that is not supported:

    http PUT :8000/games/

The following is the equivalent curl command:

    curl -iX PUT :8000/games/

The previous command composes and sends the HTTP request PUT http://localhost:8000/games/. The request matches and tries to run the views.game_list function, that is, the game_list function declared within the games/views.py file. The @api_view decorator we added to this function doesn't include 'PUT' in the list of allowed HTTP verbs, and therefore the default behavior returns a 405 Method Not Allowed status code. The following lines show the output with the response for the previous request. The JSON content provides a detail key with a string value indicating that the PUT method is not allowed:

    HTTP/1.0 405 Method Not Allowed
    Allow: GET, OPTIONS, POST
    Content-Type: application/json
    Date: Sat, 11 Jun 2016 00:49:30 GMT
    Server: WSGIServer/0.2 CPython/3.5.1
    Vary: Accept, Cookie
    X-Frame-Options: SAMEORIGIN

    {
        "detail": "Method \"PUT\" not allowed."
    }

Summary

This article covered the use of model serializers and showed how effective they are at removing duplicate code.

Buildbox 2 Game Development: peek-a-boo

Packt
23 Sep 2016
20 min read
In this article, Ty Audronis, author of the book Buildbox 2 Game Development, teaches the reader the Buildbox 2 game development environment by example. The following excerpts from the book should help you gain an understanding of the teaching style and the feel of the book. The largest example we give is making a game called Ramblin' Rover (a motocross-style game that uses everything from the most basic to the most advanced features of Buildbox). Let's take a quick look.

(For more resources related to this topic, see here.)

Making the Rover Jump

As we've mentioned before, we're making a hybrid game. That is, it's a combination of a motocross game, a platformer, and a side-scrolling shooter game. Our initial rover will not be able to shoot at anything (we'll save this feature for the next upgraded rover that anyone can buy with in-game currency). But this rover will need to jump in order to make the game more fun. As we know, NASA has never made a rover for Mars that jumps. But if they did, how would they do it? The surface of Mars is a combination of dust and rocks, so the surface conditions vary greatly in both traction and softness. One viable way is to make the rover move in the same way a spacecraft manoeuvres (using little gas jets). And since the gravity on Mars is lower than that on Earth, this seems legit enough to include it in our game.

While in our Mars Training Ground world, open the character properties for Training Rover. Drag the animated PNG sequence located in our Projects/RamblinRover/Characters/Rover001-Jump folder (a small four-frame animation) into the Jump Animation field. Now we have an animation of a jump-jet firing when we jump. We just need to make our rover actually jump. Your Properties window should look like the following screenshot:

The preceding screenshot shows the relevant sections of the character's properties window.

We're now going to revisit the Character Gameplay Settings section. Scroll the Properties window all the way down to this section. Here's where we actually configure a few settings in order to make the rover jump. The previous screenshot shows the section as we're going to set it up. You can configure your settings similarly.

The first setting we are considering is Jump Force. You may notice that the vertical force is set to 55. Since our gravity is -20 in this world, we need enough force to not only counteract the gravity, but also to give us a decent height (about half the screen). A good rule is to make our Jump Force at least 2x our Gravity.

Next is Jump Counter. We've set it to 1. By default, it's set to 0. This actually means infinity. When Jump Counter is set to 0, there is no limit to how many times a player can use the jump boost… they could effectively ride the top of the screen using the jump boost, like a flappy-bird control. So, we set it to 1 in order to limit the jumps to one at a time.

There is also a strange oddity in Buildbox that we can exploit. The jump counter resets only after the rover hits the ground. But here's the funny thing… the rover itself never actually touches the ground (unless it crashes); only the wheels do. There is one other way to reset the jump counter: by doing a flip. What this means is that once players use their jump up, the only way to reset it is to do a flip-trick off a ramp. Add a level of difficulty and excitement to the game using a quirk of the development software!
We could trick the software into believing that the character is simply close enough to the ground to reset the counter by increasing Ground Threshold to the distance that the body is from the ground when the wheels have landed. But why do this? It's kind of cool that a player has to do a trick to reset the jump jets.

Finally, let's untick the Jump From Ground checkbox. Since we're using jets for our boost, it makes sense that the driver could activate them while in the air. Plus, as we've already said, the body never meets the ground. Again, we could raise the ground threshold, but let's not (for the reasons stated previously).

Awesome! Go ahead and give it a try by previewing the level. Try jumping on the small ramp that we created, which is used to get on top of our cave. Now, instead of barely clearing it, the rover will easily clear it, and the player can then reset the counter by doing a flip off the big ramp on top.

Making a Game Over screen

This exercise will show you how to make some connections and new nodes using Game Mind Map. The first thing we're going to want is an event listener to sense when a character dies. It sounds complex, and if we were coding a game, this would take several lines of code to accomplish. In Buildbox, it's a simple drag-and-drop operation.

If you double-click on the Game Field UI node, you'll be presented with the overlay for the UI and controls during gameplay. Since this is a basic template, you are actually presented with a blank screen. This template is for you to play around with on a computer, so no controls are on the screen. Instead, it is assumed that you would use keyboard controls to play the demo game. This is why the screen looks blank.

There are some significant differences between the UI editor and the World editor. You'll notice that the Character tab from the asset library is missing, and there is a timeline editor at the bottom. We'll get into how to use this timeline later. For now, let's keep things simple and add our Game Over sensor.

If you expand the Logic tab in the asset library, you'll find the Event Observer object. You can drag this object anywhere onto the stage. It doesn't even have to be in the visible window (the dark area in the center of the stage). So long as it's somewhere on the stage, the game can use this logic asset. If you do put it on the visible area of the stage, don't worry; it's an invisible asset and won't show in your game.

While the Event Observer is selected on the stage, you'll notice that its properties pop up in the Properties window (on the right side of the screen). By default, the Game Over type of event is selected. But if you open this drop-down menu, you'll notice a ton of different event types that this logic asset can handle. Let's leave all of the properties at their default values (except the name; change this to Game Over) and go back to Game Mind Map (the top-left button).

Do you notice anything different? The Game Field UI node now has a Game Over output. Now, we just need a place to send this output. Right-click on the blank space of the grid area. Now you can either create a new world or a new UI. Select Add New UI and you'll see a new green node titled New UI1. This new UI will be your Game Over screen when a character dies. Before we can use this new node, it needs to be connected to the Game Over output of Game Field UI. This process is exceedingly simple.
Just hold down your left mouse button on the Game Over output's dark dot, and drag it to the New UI1 node's Load dark dot (on the left side of the New UI1 node). Congratulations, you've just created your first connected node.

We're not done yet, though. We need to make this Game Over screen link back to restart the game. First, select the New UI1 node and change its name to Game Over UI using the parameters window (on the right of the screen). Make sure you hit your Enter key; this will commit the changed name. Now double-click on the Game Over UI node so we can add some elements to the screen. You can't have a Game Over screen without the words Game Over, so let's add some text.

So, we've pretty much completed the game field (except for some minor items that we'll address quite soon). But believe it or not, we're only halfway there! In this article, we're going to finally create our other two rovers, and we'll test and tweak our scenes with them. We'll set up all of our menus, information screens, and even a coin shop where we can use in-game currency to buy the other two rovers, or even use some real-world currency to short-cut and buy more in-game currency. And speaking of monetization, we'll set up two different types of advertising from multiple providers to help us make some extra cash. Or, in the coin shop, players can pay a modest fee to remove all advertising! Ready? Well, here we go!

We got a fever, and the only cure is more rovers!

So now that we've created other worlds, we definitely need to set up some rovers that are capable of traversing them. Let's begin with the optimal rover for Gliese. This one is called the K.R.A.B.B. (no, it doesn't actually stand for anything… but the rover looks like a crab, and acronyms look more military-like).

Go ahead and drag all of the images in the Rover002-Body folder as characters. Don't worry about the error message. This just tells you that only one character can be on the stage at a time. The software still loads this new character into the library, and that's all we really want at this time anyway. Of course, drag the images in the Rover002-Jump folder to the Jump Animation field, and the LaserShot.png file to the Bullet Animation field. Set up your K.R.A.B.B. with the following settings:

For Collision Shape, match this:

In the Asset Library, drag the K.R.A.B.B. above the Mars Training Rover. This will make it the default rover. Now, you can test your Gliese level (by soloing each scene) with this rover to make sure it's challenging, yet attainable. You'll notice some problems with the gun destroying ground objects, but we'll solve that soon enough.

Now, let's do the same with Rover 003. This one uses a single image for the Default Animation, but an image sequence for the jump. We'll get to the bullet for this one in a moment, but set it up as follows:

Collision Shape should look as follows:

You'll notice that a lot of the settings are different on this character, and you may wonder what the advantage of this is (since it doesn't lean as much as the K.R.A.B.B.). Well, it's a tank, so the damage it can take will be higher (which we'll set up shortly), and it can do multiple jumps before recharging (five, to be exact). This way, this rover can fly using flappy-bird-style controls for short distances. It's going to take a lot more skill to pilot this rover, but once mastered, it'll be unstoppable. Let's move on to the bullet for this rover.
Click on the Edit button (the little pencil icon) inside Bullet Animation (once you've dragged the missile.png file into the field), and let's add a flame trail. Set up a particle emitter on the missile, and position it as shown in the following screenshots:

The image on the left shows the placement of the missile and the particle emitter. On the right, you can see the flame setup. You may wonder why it is pointed in the opposite direction. This actually makes the flames look more realistic (as if they're drifting behind the missile).

Preparing graphic assets for use in Buildbox

Okay, so as I said before, the only graphic assets that Buildbox can use are PNG files. If this was just a simple tutorial on how to make Ramblin' Rover, we could leave it there. But it's not just that. Ramblin' Rover is just an example of how a game is made, but we want to give you all of the tools and base knowledge you need to create all of your own games from scratch. Even if you don't create your own graphic assets, you need to be able to tell anybody creating them for you how you want them. And more importantly, you need to know why.

Graphics are absolutely the most important thing in developing a game. After all, you saw how just some eyes and sneakers made a cute character that people would want to see. Graphics create your world. They create characters that people want to succeed. Most importantly, graphics create the feel of your game, and differentiate it from other games on the market.

What exactly is a PNG file?

Anybody remember GIF files? No, not the animated GIFs that you see on most chat rooms and on Facebook (although they are related). Back in the 1990s, a still-frame GIF file was the best way to have a graphics file with a transparent background. GIFs can be used for animation, and can have a number of different purposes. However, GIFs were clunky. How so? Well, getting an image into a GIF was effectively lossy: information was lost along the way, and artifacts and noise could pop up and be present. That's because GIFs used indexed colors. This means that anywhere from 2 to 256 colors could be used, and that's why you see something known as banding in GIF imagery. Banding happens where something in real life goes from dark to light because of lighting and shadows. In real life, it's a smooth transition known as a gradient. With indexed colors, banding can occur when these various shades are outside of the index. In this case, the colors of these pixels are quantized (or snapped) to the nearest color in the index. The images here show a noisy and banded GIF (left) versus the original picture (right):

So, along came PNGs (PNG stands for Portable Network Graphics). The PNG format is also what a program called Macromedia Fireworks used to save its projects. Now, the same software is called Adobe Fireworks and is part of the Creative Cloud. Fireworks would cut up a graphics file into a table or image map and make areas of the image clickable via hyperlink for HTML web files. PNGs were still not widely supported by web browsers, so it would export the final web files as GIFs or JPEGs. But somewhere along the line, someone realized that the PNG image itself was extremely bandwidth-efficient. So, in the 2000s, PNGs started to see some support in browsers. Up until around 2008, though, Microsoft's Internet Explorer still did not support PNGs with transparency, so some strange CSS hacks needed to be done to utilize them.
Today, though, the PNG file is the most widely used network-based image file. It's lossless, has great transparency support, and is extremely efficient. PNGs are very widely used, and this is probably why Buildbox restricts compatibility to this format. Remember, Buildbox can export for multiple mobile and desktop platforms. Alright, so PNGs are great and very compatible. But there are multiple flavours of PNG files. So, what differentiates them?

What do bit-ratings mean?

When dealing with bit-ratings, you have to understand that when you hear 8-bit image and 24-bit image, it may be talking about two different types of rating, or even exactly the same type of image. Confused? Good, because when dealing with a graphics professional to create your assets, you're going to have to be a lot more specific, so let's give you a brief education on this.

Your typical image is 8 bits per channel (8 bpc), or 24 bits total (because there are three channels: red, green, and blue). This is also what they mean by a 16.7 million-color image. The math is pretty simple. A bit is either 0 or 1. An 8-bit value may look something like 01100110. This means that there are 256 possible combinations on that channel. Why? Because to calculate the number of possibilities, you take the number of possible values per slot and raise it to the power of the number of slots. 0 or 1 is 2 possibilities, and 8 bits is 8 slots. 2x2x2x2x2x2x2x2 (2 to the 8th power) is 256. To combine colors on a pixel, you multiply the possibilities of the channels: 256x256x256 (which is 16.7 million). This is how they know that there are 16.7 million possible colors in an 8 bpc or 24-bit image. So saying 8 bit may mean per channel or overall. This is why it's extremely important to add the word "channel" if that's what you mean.

Finally, there is a fourth channel called alpha. The alpha channel is the transparency channel. So when you're talking about a 24-bit PNG with transparency, you're really talking about a 32-bit image. Why is this important to know? This is because some graphics programs (such as Photoshop) have 24-bit PNG as an option with a checkbox for transparency. But some other programs (such as the 3D software we used, called Lightwave) have an option for a 24-bit PNG and a 32-bit PNG. This is essentially the same as the Photoshop options, but with different names. By understanding what these bits per channel are and what they do, you can navigate your image-creating software's options better.

So, what's an 8-bit PNG, and why is it so important to differentiate it from an 8-bit per channel PNG (or 24-bit PNG)? It is because an 8-bit PNG is highly compressed. Much like a GIF, it uses indexed colors. It also uses a great algorithm to "dither" or blend the colors to fill them in and avoid banding. 8-bit PNG files are extremely efficient on resources (that is, they are much smaller files), but they still look good, unless they have transparency. Because they are so highly compressed, the alpha channel is included in the 8 bits. So, if you use 8-bit PNG files for objects that require transparency, they will end up with a white ghosting effect around them and look terrible on screen, much like a weather report where the weather reporter's green screen is bad.

So, the rule is…

What all this means to you is pretty simple. For objects that require transparency channels, always use 24-bit PNG files with transparency (also called 8 bits per channel, or 32-bit images). For objects that have no transparency (such as block-shaped obstacles and objects), use 8-bit PNG files.
By following this rule, you'll keep your game looking great while avoiding bloating your project files. In the end, Buildbox repacks all of the images in your project into atlases (which we'll cover later) that are 32 bit. However, it's always a good practice to stay lean. If you were a Buildbox 1.x user, you may remember that Buildbox had some issues with DPI (dots per inch) between the standard 72 and 144 on retina displays. This issue is a thing of the past with Buildbox 2.

Image sequences

Think of a film strip. It's just a sequence of still images known as frames. Your standard United States film runs at 24 frames per second (well, really 23.976, but let's just round up for our purposes). Also, in the US, television runs at 30 frames per second (again, 29.97, but whatever… let's round up). Remember that each image in our sequence is a full image with all of the resources associated with it. We can quite literally cut our necessary resources in half by cutting this to 15 frames per second (fps).

If you open the content you downloaded and navigate to Projects/RamblinRover/Characters/Rover001-Body, you'll see that the images are named Rover001-body_001.png, Rover001-body_002.png, and so on. The final number indicates the position in the sequence (first 001, then 002, and so on). The animation is really just the satellite dish rotating, and the scanner light in the window rotating as well. But what you'll really notice is that this animation is loopable. All loopable means is that the animation can loop (play over and over again) without you noticing a bump in the footage (the final frame leads seamlessly back to the first). If you're not creating these animations yourself, you'll need to make sure to specify to your graphics professional that these animations should be loopable at 15 fps. They should understand exactly what you mean, and if they don't… you may consider finding a new animator.

Recommended software for graphics assets

For the purposes of context (now that you understand more about graphics and Buildbox), a bit of reinforcement couldn't hurt. A key piece of graphics software is the Adobe Creative Cloud subscription (http://www.adobe.com/CreativeCloud). Given its bang for the buck, it just can't be beaten. With it, you'll get Photoshop (which can be used for all graphics assets from your game's icon to obstacles and other objects), Illustrator (which is great for navigational buttons), After Effects (very useful for animated image sequences), Premiere Pro (a video editing application for marketing videos from screen-captured gameplay), and Audition (for editing all your sound).

You may also want some 3D software, such as Lightwave, 3D Studio Max, or Maya. This can greatly improve your ability to make characters and enemies, and to create still renders for menus and backgrounds. Most of the assets in Ramblin' Rover were created with the 3D software Lightwave.

There are free options for all of these tools. However, there are not nearly as many tutorials and resources available on the web to help you learn and create using them. One key thing to remember when using free software: if it's free… you're the product. In other words, some benefits come with paid software, such as better support and being part of the industry standard. Free software seems to be in a perpetual state of "beta testing". If using free software, read your End User License Agreement (known as a EULA) very carefully.
Some software may require you to credit them in some way for the privilege of using their software for profit. They may even lay claim to part of your profits. Okay, let's get to actually using our graphics in Ramblin' Rover…

Summary

See? It's not that tough to follow. By using plain-English explanations combined with demonstrations of some significant and intricate processes, you'll be taken on a journey meant to stimulate your imagination and educate you on how to use the software. Along with the book come the complete project files and assets to help you follow along the entire way through the build process. You'll be making your own games in no time!

Resources for Article:

Further resources on this subject:
What Makes a Game a Game? [article]
Alice 3: Controlling the Behavior of Animations [article]
Building a Gallery Application [article]

Language Modeling with Deep Learning

Mohammad Pezeshki
23 Sep 2016
5 min read
Language modeling is the task of defining a joint probability distribution over a sequence of tokens (words or characters). Given a sequence of tokens {x_1, ..., x_T}, a language model defines P(x_1, ..., x_T), which can be used in many areas of natural language processing. For example, a language model can significantly improve the accuracy of a speech recognition system. In the case of two words that have the same sound but different meanings, a language model can fix the problem of recognizing the right word. In Figure 1, the speech recognizer (aka acoustic model) has assigned the same high probabilities to the words "meet" and "meat". It is even possible that the speech recognizer assigns a higher probability to "meet" rather than "meat". However, by conditioning the language model on the first three tokens ("I cooked some"), the next word could be "fish", "pasta", or "meat" with a probability reasonably higher than that of "meet". To get the final answer, we can simply multiply the two tables of probabilities and normalize them. Now the word "meat" has a very high relative probability!

One family of deep learning models capable of modeling sequential data (such as language) is Recurrent Neural Networks (RNNs). RNNs have recently achieved impressive results on different problems such as language modeling. In this article, we briefly describe RNNs and demonstrate how to code them using the Blocks library on top of Theano.

Consider a sequence of T input elements x_1, ..., x_T. An RNN models the sequence by applying the same operation in a recursive way. Formally,

h_t = f(h_{t-1}, x_t)    (1)
y_t = g(h_t)             (2)

where h_t is the internal hidden representation of the RNN and y_t is the output at the t-th time-step. For the very first time-step, we also have an initial state h_0. f and g are two functions which are shared across the time axis. In the simplest case, f and g can each be a linear transformation followed by a non-linearity. There are more complicated forms of f and g, such as Long Short-Term Memory (LSTM) and Gated Recurrent Unit (GRU). Here we skip the exact formulations of f and g and use the LSTM as a black box.

Consequently, suppose we have B sequences, each of length T, such that each time-step is represented by a vector of size F. The input can then be seen as a 3D tensor of size T x B x F, the hidden representation as a tensor of size T x B x F', and the output as a tensor of size T x B x F''.

Let's build a character-level language model that models the joint probability P(x_1, ..., x_T) using the chain rule:

P(x_1, ..., x_T) = P(x_1) P(x_2 | x_1) P(x_3 | x_1, x_2) ... P(x_T | x_1, ..., x_{T-1})    (3)

We can model P(x_t | x_1, ..., x_{t-1}) using an RNN by predicting x_t given x_1, ..., x_{t-1}. In other words, given a sequence {x_1, ..., x_T}, the input sequence is {x_1, ..., x_{T-1}} and the target sequence is {x_2, ..., x_T}; we define the input and target by shifting the sequence by one time-step (a minimal sketch appears at the end of this section). To define the model, we need a linear transformation from the input to the LSTM, and from the LSTM to the output. To train the model, we use the cross entropy between the model output and the true target. Assuming that data is provided to us through a data stream, we can start training by initializing the model and tuning its parameters. After the model is trained, we can condition it on an initial sequence and start generating the next token.
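The Blocks/Theano code snippets from the original post are not reproduced in this excerpt. As a stand-in, here is a minimal NumPy sketch of the same ideas: shifting a character sequence to obtain input and target, the recurrence of equations (1) and (2) with a plain tanh cell in place of the LSTM, the cross-entropy loss, and a greedy generation loop. All names, sizes, and the toy data are illustrative assumptions, not the original code.

# Minimal NumPy sketch (not the original Blocks/Theano code): a simple RNN cell
# in place of the LSTM, applied to a character-level language modeling setup.
import numpy as np

V, F, H = 50, 50, 128          # vocabulary size, input size (one-hot), hidden size
rng = np.random.RandomState(0)
W_xh = rng.normal(scale=0.01, size=(F, H))
W_hh = rng.normal(scale=0.01, size=(H, H))
W_hy = rng.normal(scale=0.01, size=(H, V))

def one_hot(ids, size):
    out = np.zeros((len(ids), size))
    out[np.arange(len(ids)), ids] = 1.0
    return out

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

# Given a sequence x_1..x_T, the input is x_1..x_{T-1} and the target is x_2..x_T.
sequence = rng.randint(0, V, size=20)          # a toy sequence of character ids
inputs, targets = sequence[:-1], sequence[1:]

# Recurrence: h_t = f(h_{t-1}, x_t), y_t = g(h_t), as in equations (1) and (2).
h = np.zeros(H)
loss = 0.0
for x_id, t_id in zip(inputs, targets):
    x = one_hot([x_id], F)[0]
    h = np.tanh(x @ W_xh + h @ W_hh)           # f: linear transform + non-linearity
    y = softmax(h @ W_hy)                      # g: linear transform + softmax
    loss += -np.log(y[t_id])                   # cross entropy with the true next char
print("mean cross entropy:", loss / len(inputs))

# Generation: condition on a seed character, then repeatedly feed the
# model's prediction back in as the next input.
h, x_id, generated = np.zeros(H), int(sequence[0]), []
for _ in range(10):
    x = one_hot([x_id], F)[0]
    h = np.tanh(x @ W_xh + h @ W_hh)
    y = softmax(h @ W_hy)
    x_id = int(np.argmax(y))                   # greedy pick; sampling also works
    generated.append(x_id)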
We can repeatedly feed the predicted token back into the model and get the next token. We can even just start from the initial state and ask the model to hallucinate! Here is a sample of generated text from a model trained on 96 MB of Wikipedia text (figure adapted from here):

Here is a visualization of the model's output. The first line is the real data, and the next six lines are the candidates with the highest output probability for each character. The redder a cell is, the higher the probability the model assigns to that character. For example, as soon as the model sees "ttp://ww", it is confident that the next character is also a "w" and the one after that is a ".". But at this point, there is no more clue about the next character, so the model assigns almost the same probability to all the characters (figure adapted from here):

In this post we learned about language modeling and one of its applications in speech recognition. We also learned how to code a recurrent neural network in order to train such a model. You can find the complete code and experiments on a bunch of datasets, such as Wikipedia, on GitHub. The code is written by my close friend Eloi Zablocki and me.

About the author

Mohammad Pezeshki is a master's student in the LISA lab of Universite de Montreal, working under the supervision of Yoshua Bengio and Aaron Courville. He obtained his bachelor's in computer engineering from Amirkabir University of Technology (Tehran Polytechnic) in July 2014 and then started his master's in September 2014. His research interests lie in the fields of artificial intelligence, machine learning, probabilistic models, and specifically deep learning.

OpenStack Networking in a Nutshell

Packt
22 Sep 2016
13 min read
Information technology (IT) applications are rapidly moving from dedicated infrastructure to cloud-based infrastructure. This move to cloud started with server virtualization, where a hardware server runs as a virtual machine on a hypervisor. The adoption of cloud-based applications has accelerated due to factors such as globalization and outsourcing, where diverse teams need to collaborate in real time. Server hardware connects to network switches using Ethernet and IP to establish network connectivity. However, as servers move from physical to virtual, the network boundary also moves from the physical network to the virtual network.

Traditionally, applications, servers, and networking were tightly integrated. But modern enterprises and IT infrastructure demand flexibility in order to support complex applications. The flexibility of cloud infrastructure requires networking to be dynamic and scalable. Software Defined Networking (SDN) and Network Functions Virtualization (NFV) play a critical role in data centers in order to deliver the flexibility and agility demanded by cloud-based applications. By providing practical management tools and abstractions that hide the underlying physical network's complexity, SDN allows operators to build complex networking capabilities on demand.

OpenStack is an open source cloud platform that helps build public and private clouds at scale. Within OpenStack, the name of the OpenStack Networking project is Neutron. The functionality of Neutron can be classified as core and service. This article by Sriram Subramanian and Sreenivas Voruganti, authors of the book Software Defined Networking (SDN) with OpenStack, aims to provide a short introduction to OpenStack Networking. We will cover the following topics in this article:

Understanding traffic flows between virtual and physical networks
Neutron entities that support Layer 2 (L2) networking
Layer 3 (L3) or routing between OpenStack Networks
Securing OpenStack network traffic
Advanced networking services in OpenStack
OpenStack and SDN

The terms Neutron and OpenStack Networking are used interchangeably throughout this article.

(For more resources related to this topic, see here.)

Virtual and physical networking

Server virtualization led to the adoption of virtualized applications and workloads running inside physical servers. While physical servers are connected to the physical network equipment, modern networking has pushed the boundary of networking into the virtual domain as well. Virtual switches, firewalls, and routers play a critical role in the flexibility provided by cloud infrastructure.

Figure 1: Networking components for server virtualization

The preceding figure describes a typical virtualized server and the various networking components. The virtual machines are connected to a virtual switch inside the compute node (or server). The traffic is secured using virtual routers and firewalls. The compute node is connected to a physical switch, which is the entry point into the physical network.

Let us now walk through different traffic flow scenarios using the picture above as the background. In Figure 2, traffic from one VM to another on the same compute node is forwarded by the virtual switch itself. It does not reach the physical network. You can even apply firewall rules to traffic between the two virtual machines.

Figure 2: Traffic flow between two virtual machines on the same server

Next, let us have a look at how traffic flows between virtual machines across two compute nodes.
In Figure 3, the traffic comes out of the first compute node and reaches the physical switch. The physical switch forwards the traffic to the second compute node, and the virtual switch within the second compute node steers the traffic to the appropriate VM.

Figure 3: Traffic flow between two virtual machines on different servers

Finally, here is the depiction of traffic flow when a virtual machine sends or receives traffic from the Internet. The physical switch forwards the traffic to the physical router and firewall, which is presumed to be connected to the Internet.

Figure 4: Traffic flow from a virtual machine to an external network

As seen from the above diagrams, the physical and the virtual network components work together to provide connectivity to virtual machines and applications.

Tenant isolation

As a cloud platform, OpenStack supports multiple users grouped into tenants. One of the key requirements of a multi-tenant cloud is to provide isolation of data traffic belonging to one tenant from the rest of the tenants that use the same infrastructure. OpenStack supports different ways of achieving isolation, and it is the responsibility of the virtual switch to implement it.

Layer 2 (L2) capabilities in OpenStack

The connectivity to a physical or virtual switch is also known as Layer 2 (L2) connectivity in networking terminology. Layer 2 connectivity is the most fundamental form of network connectivity needed for virtual machines. As mentioned earlier, OpenStack supports core and service functionality. The L2 connectivity for virtual machines falls under the core capability of OpenStack Networking, whereas Router, Firewall, and so on fall under the service category. The L2 connectivity in OpenStack is realized using two constructs called Network and Subnet. Operators can use the OpenStack CLI or the web interface to create Networks and Subnets. As virtual machines are instantiated, operators can associate them with the appropriate Networks.

Creating a Network using the OpenStack CLI

A Network defines the Layer 2 (L2) boundary for all the instances that are associated with it. All the virtual machines within a Network are part of the same L2 broadcast domain. The Liberty release introduced a new OpenStack CLI (Command Line Interface) for different services. We will use the new CLI and see how to create a Network.

Creating a Subnet using the OpenStack CLI

A Subnet is a range of IP addresses that are assigned to virtual machines on the associated network. OpenStack Neutron configures a DHCP server with this IP address range, and it starts one DHCP server instance per Network by default. We will now show you how to create a Subnet using the OpenStack CLI. Note: unlike Network, for Subnet we need to use the regular neutron CLI command in the Liberty release.

Associating a Network and Subnet to a virtual machine

To give a complete perspective, we will create a virtual machine using the OpenStack web interface and show you how to associate a Network and Subnet to a virtual machine. In your OpenStack web interface, navigate to Project | Compute | Instances. Click on the Launch Instance action on the right-hand side, as highlighted above. In the resulting window, enter the name for your instance and how you want to boot your instance. To associate a network and a subnet with the instance, click on the Networking tab. If you have more than one tenant network, you will be able to choose the network you want to associate with the instance. If you have exactly one network, the web interface will automatically select it.
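The network and subnet creation commands referred to above appear as screenshots in the original article and are not reproduced here. As a rough, hedged approximation (the names Network1 and Subnet1 and the CIDR are illustrative, and the exact client syntax varies between OpenStack releases), they look something like this:

# Create a Network with the newer openstack client
openstack network create Network1

# Create a Subnet on that Network with the classic neutron client
neutron subnet-create Network1 192.168.10.0/24 --name Subnet1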
As mentioned earlier, providing isolation for tenant network traffic is a key requirement for any cloud. OpenStack Neutron uses Network and Subnet to define the boundaries and isolate data traffic between different tenants. Depending on the Neutron configuration, the actual isolation of traffic is accomplished by the virtual switches. VLAN and VXLAN are common networking technologies used to isolate traffic.

Layer 3 (L3) capabilities in OpenStack

Once L2 connectivity is established, the virtual machines within one Network can send or receive traffic between themselves. However, two virtual machines belonging to two different Networks will not be able to communicate with each other automatically. This is done to provide privacy and isolation for tenant networks. In order to allow traffic from one Network to reach another Network, OpenStack Networking supports an entity called Router. The default implementation of OpenStack uses namespaces to support L3 routing capabilities.

Creating a Router using the OpenStack CLI

Operators can create Routers using the OpenStack CLI or the web interface. They can then add more than one Subnet as an interface to the Router. This allows the Networks associated with the Router to exchange traffic with one another. The command to create a Router (see the sketch at the end of this section) creates a Router with the specified name.

Associating a Subnet to a Router

Once a Router is created, the next step is to associate one or more Subnets to the Router. After running the association command (also shown in the sketch below), the Subnet represented by subnet1 is associated to the Router router1.

Securing network traffic in OpenStack

The security of network traffic is critical, and OpenStack supports two mechanisms to secure it. Security groups allow traffic within a tenant's network to be secured. Linux iptables on the compute nodes are used to implement OpenStack security groups. The traffic that goes outside of a tenant's network, to another Network or the Internet, is secured using the OpenStack Firewall Service functionality. Like routing, Firewall is a service within Neutron. The Firewall Service also uses iptables, but the scope of the iptables rules is limited to the OpenStack Router used as part of the Firewall Service.

Using security groups to secure traffic within a network

In order to secure traffic going from one VM to another within a given Network, we must create a security group. The next step is to create one or more rules within the security group. As an example, let us create a rule that allows only incoming UDP traffic on port 8080 from any source IP address. The final step is to associate this security group and its rules with a virtual machine instance; we use the nova boot command for this (all three commands are included in the sketch below). Once the virtual machine instance has a security group associated with it, the incoming traffic will be monitored and, depending upon the rules inside the security group, data traffic may be blocked or permitted to reach the virtual machine. Note: it is possible to block ingress or egress traffic using security groups.

Using the firewall service to secure traffic

We have seen that security groups provide fine-grained control over what traffic is allowed to and from a virtual machine instance. Another layer of security supported by OpenStack is Firewall as a Service (FWaaS). FWaaS enforces security at the Router level, whereas security groups enforce security at the virtual machine interface level.
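The router and security-group commands described above are also shown as screenshots in the original article. A hedged sketch using the classic neutron and nova clients follows; the names (router1, subnet1, sg-web, my-vm) and the image and flavor values are illustrative, and option names can vary between releases:

# Create a router and attach a subnet to it
neutron router-create router1
neutron router-interface-add router1 subnet1

# Create a security group and a rule allowing incoming UDP traffic on port 8080
neutron security-group-create sg-web
neutron security-group-rule-create --direction ingress --protocol udp --port-range-min 8080 --port-range-max 8080 --remote-ip-prefix 0.0.0.0/0 sg-web

# Boot an instance with the security group attached
nova boot --image cirros --flavor m1.tiny --security-groups sg-web my-vm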
The main use case of FWaaS is to protect all virtual machine instances within a Network from threats and attacks from outside the Network. These could be virtual machines that are part of another Network in the same OpenStack cloud, or some entity on the Internet trying to make an unauthorized access. Let us now see how FWaaS is used in OpenStack.

In FWaaS, a set of firewall rules is grouped into a firewall policy, and then a firewall is created that implements one policy at a time. This firewall is then associated with a Router. A firewall rule can be created using the neutron firewall-rule-create command; as an example, a rule that blocks the ICMP protocol means that applications like ping will be blocked by the firewall. The next step is to create a firewall policy. In real-world scenarios, security administrators will define several rules and consolidate them under a single policy. For example, all rules that block various types of traffic can be combined into a single policy. The final step is to create a firewall and associate it with a router. If we do not specify any Routers when creating the firewall, the OpenStack behavior is to associate the firewall (and in turn the policy and rules) with all the Routers available for that tenant. The neutron firewall-create command supports an option to pick a specific Router as well. (A hedged sketch of these three commands appears at the end of this section.)

Advanced networking services

Besides routing and firewall, there are a few other commonly used networking technologies supported by OpenStack. Let's take a quick look at these without delving deep into the respective commands.

Load Balancing as a Service (LBaaS)

Virtual machine instances created in OpenStack are used to run applications. Most applications are required to support redundancy and concurrent access. For example, a web server may be accessed by a large number of users at the same time. One of the common strategies to handle scale and redundancy is to implement load balancing for incoming requests. In this approach, a load balancer distributes incoming service requests onto a pool of servers that process them, thus providing higher throughput. If one of the servers in the pool fails, the load balancer removes it from the pool and the subsequent service requests are distributed among the remaining servers. Users of the application use the IP address of the load balancer to access the application and are unaware of the pool of servers. OpenStack implements the Load Balancer using HAProxy software and a Linux namespace.

Virtual Private Network as a Service (VPNaaS)

As mentioned earlier, tenant isolation requires data traffic to be segregated and secured within an OpenStack cloud. However, there are times when external entities need to be part of the same Network without removing the firewall-based security. This can be accomplished using a Virtual Private Network (VPN). A VPN connects two endpoints on different networks over a public Internet connection, such that the endpoints appear to be directly connected to each other. VPNs also provide confidentiality and integrity of transmitted data. Neutron provides a service plugin that enables OpenStack users to connect two networks using a VPN. The reference implementation of the VPN plugin in Neutron uses Openswan to create an IPSec-based VPN. IPSec is a suite of protocols that provides a secure connection between two endpoints by encrypting each IP packet transferred between them.
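The FWaaS commands described in the firewall section above might look roughly like the following; this is a hedged sketch (the rule, policy, and firewall names are illustrative, and FWaaS option syntax has changed across OpenStack releases):

# A rule that blocks ICMP, so applications like ping are dropped by the firewall
neutron firewall-rule-create --protocol icmp --action deny --name deny-icmp

# Group one or more rules into a policy
neutron firewall-policy-create --firewall-rules deny-icmp block-policy

# Create the firewall from the policy; with no router specified, it applies to all
# of the tenant's routers
neutron firewall-create block-policy --name fw1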
OpenStack and SDN context

So far in this article we have seen the different networking capabilities provided by OpenStack. Let us now look at two capabilities in OpenStack that enable SDN to be leveraged effectively.

Choice of technology

OpenStack, being an open source platform, bundles open source networking solutions as the default implementation for these networking capabilities. For example, routing is supported using namespaces, security using iptables, and load balancing using HAProxy. Historically, these networking capabilities were implemented using customized hardware and software, most of them being proprietary solutions. These custom solutions are capable of much higher performance and are well supported by their vendors, and hence they have a place in the OpenStack and SDN ecosystem. From its initial releases, OpenStack has been designed for extensibility. Vendors can write their own extensions and then easily configure OpenStack to use their extension instead of the default solutions. This allows operators to deploy the networking technology of their choice.

OpenStack API for networking

One of the most powerful capabilities of OpenStack is the extensive support for APIs. All services of OpenStack interact with one another using well-defined RESTful APIs. This allows custom implementations and pluggable components to provide powerful enhancements for practical cloud implementation. For example, when a Network is created using the OpenStack web interface, a RESTful request is sent to the Horizon service. This in turn invokes a RESTful API to validate the user using the Keystone service. Once validated, Horizon sends another RESTful API request to Neutron to actually create the Network.

Summary

As seen in this article, OpenStack supports a wide variety of networking functionality right out of the box. The importance of isolating tenant traffic and the need to allow customized solutions require OpenStack to support flexible configuration. We also highlighted some key aspects of OpenStack that will play a key role in deploying Software Defined Networking in datacenters, thereby supporting powerful cloud architectures and solutions.

Resources for Article:

Further resources on this subject:
Setting Up a Network Backup Server with Bacula [article]
Jenkins 2.0: The impetus for DevOps Movement [article]
Integrating Accumulo into Various Cloud Platforms [article]

JIRA 101

Packt
22 Sep 2016
15 min read
In this article by Ravi Sagar, author of the book Mastering JIRA 7 - Second Edition, you will learn the basics of JIRA. We will look into the components available in the product, their applications, and how to use them. We will try our hands at a few JQL queries, after which we will create project reports in JIRA for issue tracking and derive information from them using the various built-in reports that JIRA comes with. We will also take a look at the gadgets that JIRA provides, which are helpful for reporting purposes. Finally, we will take a look at the migration options that JIRA provides to fully restore a JIRA instance or a specific project.

(For more resources related to this topic, see here.)

Product introduction

Atlassian JIRA is a proprietary issue tracking system. It is used for tracking bugs, issues, and project management. There are many such tools available, but the best thing about JIRA is that it can be configured very easily and it offers a wide range of customizations. Out of the box, JIRA offers defect/bug tracking functionality, but it can be customized to act like a helpdesk system, a simple test management suite, or a project management system with end-to-end traceability. The much-awaited JIRA 7 was released in October 2015 and it is now offered in the following three application variants:

JIRA Core
JIRA Software
JIRA Service Desk

Let us discuss each one of them separately.

JIRA Core

This comprises the base application of JIRA that you may be familiar with, of course with some new features. JIRA Core is a simplified version of the JIRA features that we have used up to the 6.x versions.

JIRA Software

This comprises all the features of JIRA Core + JIRA Agile. From JIRA 7 onwards, JIRA Agile will no longer be offered as an add-on. You will not be able to install JIRA Agile from the marketplace.

JIRA Service Desk

This comprises all the features of JIRA Core + JIRA Service Desk. Just like JIRA Software, JIRA Service Desk will no longer be offered as an add-on and you cannot install it from the marketplace.

Applications, uses, and examples

The ability to customize JIRA is what makes it popular among the various companies that use it.
The following are the various applications of JIRA:

Defect/bug tracking
Change requests
Helpdesk/support tickets
Project management
Test-case management
Requirements management
Process management

Let's take a look at an example implementation of test-case management.

The issue types:
Test campaign: this will be a standard issue type
Test case: this will be a subtask

The workflow for a test campaign:
New states: Published, Under Execution
Conditions: a test campaign will only pass when all of its test cases are passed; only the reporter can move the test campaign to Closed
Post function: when the test campaign is closed, send an email to everyone in a particular group

The workflow for a test case:
New states: Blocked, Passed, Failed, In Review
Condition: only the assigned user can move the test case to the Passed state
Post function: when the test case is moved to the Failed state, change the issue priority to major

Custom fields (name, type, values, and field configuration):
Category: select list
Customer name: select list
Steps to reproduce: text area (mandatory)
Expected input: text area (mandatory)
Expected output: text area (mandatory)
Precondition: text area
Postcondition: text area
Campaign type: select list with the values Unit, Functional, Endurance, Benchmark, Robustness, Security, Backward compatibility, and Certification with baseline
Automation status: select list with the values Automatic, Manual, and Partially automatic

JIRA core concepts

Let's take a look at the architecture of JIRA; it will help you understand the core concepts:

Project Categories: when there are too many projects in JIRA, it becomes important to segregate them into various categories. JIRA will let you create several categories that can represent the business units, clients, or teams in your company.
Projects: a JIRA project is a collection of issues. Your team can use a JIRA project to coordinate the development of a product, track a project, manage a help desk, and so on, depending on your requirements.
Components: components are subsections of a project. They are used to group issues within a project into smaller parts.
Versions: versions are points in time for a project. They help you schedule and organize your releases.
Issue Types: JIRA will let you create more than one issue type; issue types differ from each other in terms of what kind of information they store. JIRA comes with default issue types, such as bug, task, and subtask, but you can create more issue types that can follow their own workflow as well as have a different set of fields.
Subtasks: issue types are of two kinds, namely standard and subtasks, which are children of a standard task. For instance, you can have test campaigns as a standard issue type and test cases as subtasks.

Introduction to JQL

JIRA Query Language (JQL) is one of the best features in JIRA; it lets you search issues efficiently and offers lots of handy features. The best part about JQL is that it is very easy to learn, thanks to the autocomplete functionality in the Advanced search, which helps the user with suggestions based on the keywords typed. JQL queries, whether single or multiple, can be combined together to form complex queries.

Basic JQL syntax

A basic JQL query has a field followed by an operator.
For instance, to retrieve all the issues of the CSTA project, you can use a simple query like this:

project = CSTA

Now, within this project, if you want to find the issues assigned to a specific user, use the following query:

project = CSTA and assignee = ravisagar

There may be several hundred issues assigned to a user. If we want to focus only on issues whose priority is either Critical or Blocker, we can use the following query:

project = CSTA and assignee = ravisagar and priority in (Blocker, Critical)

What if, instead of issues assigned to a specific user, we want to find the issues assigned to all other users except one? It can be achieved in the following way:

project = CSTA and assignee != ravisagar and priority in (Blocker, Critical)

So, you can see that JQL searches consist of one or more such queries. (A couple of further example queries appear after the report descriptions below.)

Project reports

Once you start using JIRA for issue tracking of any type, it becomes imperative to derive useful information out of it. JIRA comes with built-in reports that show real-time statistics for projects, users, and other fields. Let's take a look at each of these reports. Open any project in JIRA that contains a lot of issues and has around 5 to 10 users who are either assignees or reporters. When you open any project page, the default view is the Summary view, which contains a 30-day summary report and an Activity Stream that shows whatever is happening in the project, such as the creation of new issues, updates of status, comments, and basically any change in the project. On the left-hand side of the project summary page, there are links for Issues and Reports.

Average Age Report: this report displays the average number of days for which issues are in an unresolved state on a given date.

Created vs. Resolved Issues Report: this report displays the number of issues that were created over a period of time versus the number of issues that were resolved in that period.

Pie Chart Report: this chart shows the breakup of data. For instance, if you are interested in finding out the issue count for all the issue types in your project, then this report can be used to fetch this information.

Recently Created Issues Report: this report displays statistical information on the number of issues created for the selected period and days. The report also displays the status of the issues.

Resolution Time Report: there are cases when you are interested in understanding the speed of your team every month. How soon can your team resolve the issues? This report displays the average resolution time of the issues in a given month.

Single Level Group By Report: a simple report that just lists the issues grouped by a particular field, such as Assignee, Issue Type, Resolution, Status, Priority, and so on.

Time Since Issues Report: this report is useful in finding out how many issues were created in a specific quarter over the past year; various date-based fields are supported by this report.

Time Tracking Report: this comprehensive report displays the estimated effort and remaining effort of all the issues. Not only that, the report will also give you an indication of the overall progress of the project.

User Workload Report: this report can tell us about the occupancy of the resources in all the projects. It really helps in distributing tasks among users.

Version Workload Report: if your project has various versions that are related to actual releases or fixes, then it becomes important to understand the status of all such issues.
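As mentioned at the end of the JQL section above, here are a couple of additional illustrative queries; they are not taken from the book, but they use only standard JQL fields and functions:

project = CSTA AND resolution = Unresolved ORDER BY priority DESC, updated DESC

assignee = currentUser() AND status != Closed AND created >= -30d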
Gadgets for reporting purposes

JIRA comes with a lot of useful gadgets that you can add to the dashboard and use for reporting purposes. Additional gadgets can be added to JIRA by installing add-ons. Let's take a look at some of these gadgets.

Activity Stream: this gadget displays all the latest updates in your JIRA instance. It's also possible to limit this stream to a particular filter. This gadget is quite useful because it displays up-to-date information on the dashboard.

Created vs. Resolved Chart: the project summary page has a chart to display all the issues that were created and resolved in the past 30 days. There is a similar gadget to display this information. You can also change the duration from 30 days to whatever you like. This gadget can be created for a specific project.

Pie Chart: just like the pie chart in the project reports, there is a similar gadget that you can add to the dashboard. For instance, for a particular project, you can generate a Pie Chart based on Priority.

Issue Statistics: this gadget is quite useful for generating simple statistics for various fields. Here, we are interested in finding out the breakup of the project in terms of issue statistics.

Two Dimensional Filter Statistics: the Issue Statistics gadget can display the breakup of project issues for every Status. What if you want to segregate this information further? For instance, how many issues are open, and which Issue Type do they belong to? In such scenarios, Two Dimensional Filter Statistics can be used. You just need to select the two fields that will be used to generate this report, one for the x axis and another for the y axis.

These are some common gadgets that can be used in the dashboard; however, there are many more. Click on the Add Gadget option in the top-right corner to see all such gadgets in your JIRA instance. Some gadgets come out of the box with JIRA and others are part of add-ons that you can install. After you select all these gadgets in your dashboard, this is how it looks:

This is the new dashboard that we have just created and configured for a specific project, but it's also possible to create more than one dashboard. To add another dashboard, just click on the Create Dashboard option under Tools in the top-right corner. If you have more than one dashboard, you can switch between them using the links in the top-left corner of the screen, as shown in the following screenshot:

The simple CSV import

Let's understand how to perform a simple import of CSV data. The first thing to do is prepare the CSV file that can be imported into JIRA. For this exercise, we will import issues into a particular project; these issues will have data such as issue Summary, Status, Dates, and a few other fields.

Preparing the CSV file

We'll use MS Excel to prepare the CSV file with the following data. If your existing tool has the option to export directly into a CSV file, then you can skip this step, but we recommend reviewing your data before importing it into JIRA. Usually, the CSV import will not work if the format of the CSV file and the data is not correct. It's very easy to generate a CSV file from an Excel file. Perform the following steps: go to File | Save As | File name: and select Save as type: as CSV (comma delimited).
If you don't have Microsoft Excel installed, you can use LibreOffice Calc, which is an open source alternative to Microsoft Office Excel. You can open the CSV file to verify its format too. Our CSV file has the following fields:

Project: JIRA's project key needs to be specified in this field
Summary: this field is mandatory and needs to be specified in the CSV file
Issue Type: this specifies the issue type
Status: the status of the issue; these workflow states need to exist in JIRA, and the project workflow should contain the states being imported from the CSV file
Priority: the priorities mentioned here should exist in JIRA before the import
Resolution: the resolutions mentioned here should exist in JIRA before the import
Assignee: the assignee of the issue
Reporter: the reporter of the issue
Created: the issue creation date
Resolved: the issue resolution date

Performing the CSV import

Once your CSV file is prepared, you are ready to perform the import in JIRA:

Navigate to JIRA Administration | System | External System Import | CSV (under IMPORT & EXPORT).
On the File import screen, in the CSV Source File field, click on the Browse… button to select the CSV file that you just prepared on your machine. Once you select the CSV file, the Next button will be enabled.
On the Setup screen, select Import to Project as DOPT, which is the name of our project. Verify Date format; it should match the format of the date values in the CSV file. Click on the Next button to continue.
On the Map fields screen, we need to map the fields in the CSV file to JIRA fields. This step is crucial because the field names in your old system can be different from the JIRA fields, so in this step, map these fields to the respective JIRA fields. Click on the Next button to continue.
On the Map values screen, map the values of Status; in fact, this mapping of field values can be done for any field. In our case, the values in the status field are the same as in JIRA, so click on the Begin Import button.

You will finally get a confirmation that the issues were imported successfully. If you encounter any errors during the CSV import, it's usually due to some problem with the CSV format. Read the error messages carefully and correct these issues. As mentioned earlier, the CSV import should be performed on a test environment first.

Migrate JIRA configurations using the Configuration Manager add-on

JIRA has provisions to fully restore a JIRA instance from a backup file, restore a specific project, and use the CSV import functionality to import data into it. These utilities are quite important, as they really make the life of JIRA administrators a lot easier. They can perform these activities right from the JIRA user interface. The project-import utility and CSV import are used to migrate one or more projects from one instance of JIRA to another, but the target instance should have the required configuration in place, otherwise these utilities will not work. For instance, if there is a project in the source instance with custom workflow states along with a few custom fields, then the same configuration of workflow and custom fields should already exist in the target instance. Recreating these configurations and schemes can be a time-consuming and error-prone process.
Additionally, in various organizations, there is a test environment or a staging server for JIRA where all the new configurations are first tested before they are rolled out to the production instance. Currently, there is no such way to selectively migrate the configurations from one instance to another. It has to be done manually on the target instance. Configuration Manager is an add-on that does this job. Using this add-on, the project-specific configuration can be migrated from one instance to another. Summary In this article we looked at the different products offered by JIRA. Then we learned about concepts of JIRA. We then tried our hands on JQL and a few examples of it. We saw the different types of reports provided by JIRA and the various gadgets available for reporting purposes. Finally we saw how to migrate JIRA configurations using the configuration manager add-on. Resources for Article: Further resources on this subject: Working with JIRA [article] JIRA Workflows [article] JIRA – an Overview [article]

String management in Swift

Jorge Izquierdo
21 Sep 2016
7 min read
One of the most common tasks when building a production app is translating the user interface into multiple languages. I won't go into much detail explaining this or how to set it up, because there are lots of good articles and tutorials on the topic. As a summary, the default system is pretty straightforward. You have a file named Localizable.strings with a set of keys and then different values depending on the file's language. To use these strings from within your app, there is a simple macro in Foundation, NSLocalizedString(key, comment: comment), that will take care of looking up that key in your localizable strings and return the value for the user's device language. Magic numbers, magic strings The problem with this handy macro is that as you can add a new string inline, you will presumably end up with dozens of NSLocalizedStrings in the middle of the code of your app, resulting in something like this: mainLabel.text = NSLocalizedString("Hello world", comment: "") Or maybe, you will write a simple String extension for not having to write it every time. That extension would be something like: extension String { var localized: String { return NSLocalizedString(self, comment: "") } } mainLabel.text = "Hello world".localized This is an improvement, but you still have the problem that the strings are all over the place in the app, and it is difficult to maintain a scalable format for the strings as there is not a central repository of strings that follows the same structure. The other problem with this approach is that you have plain strings inside your code, where you could change a character and not notice it until seeing a weird string in the user interface. For that not to happen, you can take advantage of Swift's awesome strongly typed nature and make the compiler catch these errors with your strings, so that nothing unexpected happens at runtime. Writing a Swift strings file So that is what we are going to do. The goal is to be able to have a data structure that will hold all the strings in your app. The idea is to have something like this: enum Strings { case Title enum Menu { case Feed case Profile case Settings } } And then whenever you want to display a string from the app, you just do: Strings.Title.Feed // "Feed" Strings.Title.Feed.localized // "Feed" or the value for "Feed" in Localizable.strings This system is not likely to scale when you have dozens of strings in your app, so you need to add some sort of organization for the keys. The basic approach would be to just set the value of the enum to the key: enum Strings: String { case Title = "app.title" enum Menu: String { case Feed = "app.menu.feed" case Profile = "app.menu.profile" case Settings = "app.menu.settings" } } But you can see that this is very repetitive and verbose. Also, whenever you add a new string, you need to write its key in the file and then add it to the Localizable.strings file. We can do better than this. Autogenerating the keys Let’s look into how you can automate this process so that you will have something similar to the first example, where you didn't write the key, but you want an outcome like the second example, where you get a reasonable key organization that will be scalable as you add more and more strings during development. We will take advantage of protocol extensions to do this. 
For starters, you will define a Localizable protocol to make the string enums conform to:

protocol Localizable {
    var rawValue: String { get }
}

enum Strings: String, Localizable {
    case Title
    case Description
}

And now with the help of a protocol extension, you can get a better key organization:

extension Localizable {
    var localizableKey: String {
        return self.dynamicType.entityName + "." + rawValue
    }

    static var entityName: String {
        return String(self)
    }
}

With that key, you can fetch the localized string in a similar way as we did with the String extension:

extension Localizable {
    var localized: String {
        return NSLocalizedString(localizableKey, comment: "")
    }
}

What you have done so far allows you to do Strings.Title.localized, which will look in the localizable strings file for the key Strings.Title and return the value for that language.

Polishing the solution

This works great when you only have one level of strings, but if you want to group a bit more, say Strings.Menu.Home.Title, you need to make some changes. The first one is that each child needs to know who its parent is in order to generate a full key. That is impossible to do in Swift in an elegant way today, so what I propose is to explicitly have a variable that holds the type of the parent. This way you can recurse back up the strings tree until the parent is nil, where we assume it is the root node. For this to happen, you need to change your Localizable protocol a bit:

public protocol Localizable {
    static var parent: LocalizeParent { get }
    var rawValue: String { get }
}

public typealias LocalizeParent = Localizable.Type?

Now that you have the parent idea in place, the key generation needs to recurse up the tree in order to find the full path for the key.

private let stringSeparator: String = "."

private extension Localizable {
    static func concatComponent(parent parent: String?, child: String) -> String {
        guard let p = parent else { return child.snakeCaseString }
        return p + stringSeparator + child.snakeCaseString
    }

    static var entityName: String {
        return String(self)
    }

    static var entityPath: String {
        return concatComponent(parent: parent?.entityName, child: entityName)
    }

    var localizableKey: String {
        return self.dynamicType.concatComponent(parent: self.dynamicType.entityPath, child: rawValue)
    }
}

And to finish, you have to make the enums conform to the updated protocol:

enum Strings: String, Localizable {
    case Title

    enum Menu: String, Localizable {
        case Feed
        case Profile
        case Settings
        static let parent: LocalizeParent = Strings.self
    }

    static let parent: LocalizeParent = nil
}

With all this in place, you can do the following in your app:

label.text = Strings.Menu.Settings.localized

And the label will have the value for the "strings.menu.settings" key in Localizable.strings.

Source code

The final code for this article is available on GitHub. You can find the instructions for using it within your project there, but you can also just add Localize.swift and modify it according to your project's needs. You can also check out a simple example project to see the whole solution together.

Next time

The next step we would need to take in order to have a full solution is a way for the Localizable.strings file to autogenerate.
The solution for this at the current state of Swift wouldn't be very elegant, because it would require either inspecting the objects using the Objective-C runtime (which would be difficult to do, since we are dealing with pure Swift types here) or defining all the children of a given object explicitly, in the same way open source XCTest does, where each test case defines all of its tests in a static property.

About the author

Jorge Izquierdo has been developing iOS apps for 5 years. The day Swift was released, he started hacking around with the language and built the first Swift HTTP server, Taylor. He has worked on several projects and right now works as an iOS development contractor.

How to Build and Deploy a Node App with Docker

John Oerter
20 Sep 2016
7 min read
How many times have you deployed your app that was working perfectly in your local environment to production, only to see it break? Whether it was directly related to the bug or feature you were working on, or another random issue entirely, this happens all too often for most developers. Errors like this not only slow you down, but they're also embarrassing. Why does this happen? Usually, it's because your development environment on your local machine is different from the production environment you're deploying to. The tenth factor of the Twelve-Factor App is Dev/prod parity. This means that your development, staging, and production environments should be as similar as possible. The authors of the Twelve-Factor App spell out three "gaps" that can be present. They are: The time gap: A developer may work on code that takes days, weeks, or even months to go into production. The personnel gap: Developers write code, ops engineers deploy it. The tools gap: Developers may be using a stack like Nginx, SQLite, and OS X, while the production deployment uses Apache, MySQL, and Linux. (Source) In this post, we will mostly focus on the tools gap, and how to bridge that gap in a Node application with Docker. The Tools Gap In the Node ecosystem, the tools gap usually manifests itself either in differences in Node and npm versions, or differences in package dependency versions. If a package author publishes a breaking change in one of your dependencies or your dependencies' dependencies, it is entirely possible that your app will break on the next deployment (assuming you reinstall dependencies with npm install on every deployment), while it runs perfectly on your local machine. Although you can work around this issue using tools like npm shrinkwrap, adding Docker to the mix will streamline your deployment life cycle and minimize broken deployments to production. Why Docker? Docker is unique because it can be used the same way in development and production. When you enable the architecture of your app to run inside containers, you can easily scale out and create small containers that can be composed together to make one awesome system. Then, you can mimic this architecture in development so you never have to guess how your app will behave in production. In regards to the time gap and the personnel gap, Docker makes it easier for developers to automate deployments, thereby decreasing time to production and making it easier for full-stack teams to own deployments. Tools and Concepts When developing inside Docker containers, the two most important concepts are docker-compose and volumes. docker-compose helps define mulit-container environments and the ability to run them with one command. Here are some of the more often used docker-compose commands: docker-compose build: Builds images for services defined in docker-compose.yml docker-compose up: Creates and starts services. This is the same as running docker-compose create && docker-compose start docker-compose run: Runs a one-off command inside a container Volumes allow you to mount files from the host machine into the container. When the files on your host machine change, they change inside the container as well. This is important so that we don't have to constantly rebuild containers during development every time we make a change. You can also use a tool like node-mon to automatically restart the node app on changes. Let's walk through some tips and tricks with developing Node apps inside Docker containers. 
Set up Dockerfile and docker-compose.yml When you start a new project with Docker, you'll first want to define a barebones Dockerfile and docker-compose.yml to get you started. Here's an example Dockerfile: FROM node:6.2.1 RUN useradd --user-group --create-home --shell /bin/false app-user ENV HOME=/home/app-user USER app-user WORKDIR $HOME/app This Dockerfile displays two best practices: Favor exact version tags over floating tags such as latest. Node releases often these days, and you don't want to implicitly upgrade when building your container on another machine. By specifying a version such as 6.2.1, you ensure that anyone who builds the image will always be working from the same node version. Create a new user to run the app inside the container. Without this step, everything would run under root in the container. You certainly wouldn't do that on a physical machine, so don't do in Docker containers either. Here's an example starter docker-compose.yml: web: build: . volumes: - .:/home/app-user/app Pretty simple right? Here we are telling Docker to build the web service based on our Dockerfile and create a volume from our current host directory to /home/app-user/app inside the container. This simple setup lets you build the container with docker-compose build and then run bash inside it with docker-compose run --rm web /bin/bash. Now, it's essentially the same as if you were SSH'd into a remote server or working off a VM, except that any file you create inside the container will be on your host machine and vice versa. With that in mind, you can bootstrap your Node app from inside your container using npm init -y and npm shrinkwrap. Then, you can install any modules you need such as Express. Install node modules on build With that done, we need to update our Dockerfile to install dependencies from npm when the image is built. Here is the updated Dockerfile: FROM node:6.2.1 RUN useradd --user-group --create-home --shell /bin/false app-user ENV HOME=/home/app-user COPY package.json npm-shrinkwrap.json $HOME/app/ RUN chown -R app-user:app-user $HOME/* USER app-user WORKDIR $HOME/app RUN npm install Notice that we had to change the ownership of the copied files to app-user. This is because files copied into a container are automatically owned by root. Add a volume for the node_modules directory We also need to make an update to our docker-compose.yml to make sure that our modules are installed inside the container properly. web: build: . volumes: - .:/home/app-user/app - /home/app-user/app/node_modules Without adding a data volume to /home/app-user/app/node_modules, the node_modules wouldn't exist at runtime in the container because our host directory, which won't contain the node_modules directory, would be mounted and hide the node_modules directory that was created when the container was built. For more information, see this Stack Overflow post. Running your app Once you've got an entry point to your app ready to go, simply add it as a CMD in your Dockerfile: CMD ["node", "index.js"] This will automatically start your app on docker-compose up. Running tests inside your container is easy as well. docker-compose --rm run web npm test You could easily hook this into CI. Production Now going to production with your Docker-powered Node app is a breeze! Just use docker-compose again. You will probably want to define another docker-compose.yml that is especially written for production use. This means removing volumes, binding to different ports, setting NODE_ENV=production, and so on. 
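To make that concrete, here is a minimal sketch of what such a production override file might contain; the exact keys are an assumption on my part (the article does not show this file), and it follows the same legacy compose format used above:

# docker-compose.production.yml (illustrative sketch, not the article's file)
web:
  ports:
    - '80:3000'            # publish on the standard HTTP port instead of a dev port
  environment:
    - NODE_ENV=production  # run node in production mode
  restart: always          # restart the container if the node process exits

One design note: when files are merged with multiple -f flags, later files can only add or override settings; they cannot remove the volumes declared in the base file. If you truly want no volumes in production, a separate standalone compose file for production is the safer choice.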
Once you have a production config file, you can tell docker-compose to use it, like so: docker-compose -f docker-compose.yml -f docker-compose.production.yml up The -f lets you specify a list of files that are merged in the order specified. Here is a complete Dockerfile and docker-compose.yml for reference: # Dockerfile FROM node:6.2.1 RUN useradd --user-group --create-home --shell /bin/false app-user ENV HOME=/home/app-user COPY package.json npm-shrinkwrap.json $HOME/app/ RUN chown -R app-user:app-user $HOME/* USER app-user WORKDIR $HOME/app RUN npm install CMD ["node", "index.js"] # docker-compose.yml web: build: . ports: - '3000:3000' volumes: - .:/home/app-user/app - /home/app-user/app/node_modules About the author John Oerter is a software engineer from Omaha, Nebraska, USA. He has a passion for continuous improvement and learning in all areas of software development, including Docker, JavaScript, and C#. He blogs here.

How to construct your own network models on Chainer

Masayuki Takagi
19 Sep 2016
6 min read
In this post, I will introduce you to how to construct your own network model on Chainer. First, I will begin by introducing Chainer's basic concepts as its building blocks. Then, you will see how to construct network models on Chainer. Finally, I define a multi-class classifier bound to the network model.

Concepts

Let's start with the basic concepts Chainer has as its building blocks:

Procedural abstractions: chains, links, and functions
Data abstraction: variables

Chainer has three procedural abstractions: chains, links, and functions, from the highest level down. Chains are at the highest level and represent entire network models. They consist of links and/or other chains. Links are like layers in a network model, and they have learnable parameters to be optimized through training. Functions are the most fundamental of Chainer's procedural abstractions. They take inputs and return outputs. Links use functions, applying them to their inputs and parameters to produce their outputs.

At the data abstraction level, Chainer has variables. They represent the inputs and outputs of chains, links, and functions. As their actual data representation, they wrap numpy and cupy n-dimensional arrays.

In the following sections, we will construct our own network models with these concepts.

Construct network models on Chainer

In this section, I describe how to construct a multi-layer perceptron (MLP) with three layers on top of the basic concepts shown in the previous section. It is very simple.

from chainer import Chain
import chainer.functions as F
import chainer.links as L

class MLP(Chain):
    def __init__(self):
        super(MLP, self).__init__(
            l1=L.Linear(784, 100),
            l2=L.Linear(100, 100),
            l3=L.Linear(100, 10),
        )

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

We define the MLP class derived from the Chain class. Then we implement two methods, __init__ and __call__. The __init__ method initializes the links that the chain has. It has three fully connected layers: chainer.links.Linear, or L.Linear above. The first layer, named l1, takes 784-dimension input vectors, which corresponds to a 28x28-pixel grayscale handwritten digit image. The last layer, named l3, returns 10-dimension output vectors, which correspond to the ten digits from 0 to 9. Between them, there is a middle layer named l2 with 100 input/output units.

The __call__ method is called on forward propagation. It takes an input x as a Chainer variable, applies the three Linear layers, and applies ReLU activation functions to the outputs of the l1 and l2 layers. It then returns a Chainer variable y that is the output of the l3 layer. Because the two ReLU activation functions are Chainer functions, they do not have learnable parameters internally, so you do not have to initialize them in the __init__ method.

This code does forward propagation as well as network construction behind the stage. It is Chainer's magic: backward propagation is automatically computed, based on the network constructed here, when we optimize it. Of course, you may write the __call__ method as follows; the local variables h1 and h2 are just for clear code.

    def __call__(self, x):
        return self.l3(F.relu(self.l2(F.relu(self.l1(x)))))

Define a Multi-class classifier

Another chain is Classifier, which will be bound to the MLP chain defined in the previous section. Classifier is for general multi-class classification, and it computes the loss value and accuracy of a network model for a given input vector and its ground truth:
import chainer.functions as F

class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__(predictor=predictor)

    def __call__(self, x, t):
        y = self.predictor(x)
        self.loss = F.softmax_cross_entropy(y, t)
        self.accuracy = F.accuracy(y, t)
        return self.loss

We define the Classifier class derived from the Chain class as well as the MLP class, because it takes a chain as a predictor. We similarly implement the __init__ and __call__ methods. In the __init__ method, it takes a predictor parameter as a chain, or a link is also acceptable, to initialize this class. In the __call__ method, it takes two inputs x and t. These are input vectors and its ground truth respectively. Chains that are passed to Chainer optimizers should be compliant with this protocol. First, we give the input vector x to the predictor, which would be MLP model in this post, to get the result y of forward propagation. Then, the loss value and accuracy are computed with the softmax_cross_entropy and accuracy functions for a given ground truth t. Finally, it returns the loss value. The accuracy can be accessed as an attribute any time. Actually, we initialize this classifier bound to the MLP model as follows. model is passed to Chainer optimizer.

model = Classifier(MLP())

Conclusion

Now we have our own network model. In this post I introduced basic concepts of Chainer. Then we implemented the MLP class that models multi-layer perceptron and Classifier class that computes loss value and accuracy to be optimized for a given ground truth.

from chainer import Chain
import chainer.functions as F
import chainer.links as L

class MLP(Chain):
    def __init__(self):
        super(MLP, self).__init__(
            l1=L.Linear(784, 100),
            l2=L.Linear(100, 100),
            l3=L.Linear(100, 10),
        )

    def __call__(self, x):
        h1 = F.relu(self.l1(x))
        h2 = F.relu(self.l2(h1))
        y = self.l3(h2)
        return y

class Classifier(Chain):
    def __init__(self, predictor):
        super(Classifier, self).__init__(predictor=predictor)

    def __call__(self, x, t):
        y = self.predictor(x)
        self.loss = F.softmax_cross_entropy(y, t)
        self.accuracy = F.accuracy(y, t)
        return self.loss

model = Classifier(MLP())

As some next steps, you may want to learn:

How to optimize the model.
How to train and test the model for an MNIST dataset.
How to accelerate training the model using GPU.

Chainer provides an MNIST example program in the chainer/examples/mnist directory, which will help you.

About the author

Masayuki Takagi is an entrepreneur and software engineer from Japan. His professional experience domains are advertising and deep learning, serving big Japanese corporations. His personal interests are fluid simulation, GPU computing, FPGA, compiler, and programming language design. Common Lisp is his most beloved programming language and he is the author of the cl-cuda library. Masayuki is a graduate of the University of Tokyo and lives in Tokyo with his buddy Plum, a long coat Chihuahua, and aco.

Jenkins 2.0: The impetus for DevOps Movement

Packt
19 Sep 2016
15 min read
In this article by Mitesh Soni, the author of the book DevOps for Web Development, provides some insight into DevOps movement, benefits of DevOps culture, Lifecycle of DevOps, how Jenkins 2.0 is bridging the gaps between Continuous Integration and Continuous Delivery using new features and UI improvements, installation and configuration of Jenkins 2.0. (For more resources related to this topic, see here.) Understanding the DevOps movement Let's try to understand what DevOps is. Is it a real, technical word? No, because DevOps is not just about technical stuff. It is also neither simply a technology nor an innovation. In simple terms, DevOps is a blend of complex terminologies. It can be considered as a concept, culture, development and operational philosophy, or a movement. To understand DevOps, let's revisit the old days of any IT organization. Consider there are multiple environments where an application is deployed. The following sequence of events takes place when any new feature is implemented or bug fixed:   The development team writes code to implement a new feature or fix a bug. This new code is deployed to the development environment and generally tested by the development team. The new code is deployed to the QA environment, where it is verified by the testing team. The code is then provided to the operations team for deploying it to the production environment. The operations team is responsible for managing and maintaining the code.   Let's list the possible issues in this approach: The transition of the current application build from the development environment to the production environment takes weeks or months. The priorities of the development team, QA team, and IT operations team are different in an organization and effective, and efficient co-ordination becomes a necessity for smooth operations. The development team is focused on the latest development release, while the operations team cares about the stability of the production environment. The development and operations teams are not aware of each other's work and work culture. Both teams work in different types of environments; there is a possibility that the development team has resource constraints and they therefore use a different kind of configuration. It may work on the localhost or in the dev environment. The operations team works on production resources and there will therefore be a huge gap in the configuration and deployment environments. It may not work where it needs to run—the production environment. Assumptions are key in such a scenario, and it is improbable that both teams will work under the same set of assumptions. There is manual work involved in setting up the runtime environment and configuration and deployment activities. The biggest issue with the manual application-deployment process is its nonrepeatability and error-prone nature. The development team has the executable files, configuration files, database scripts, and deployment documentation. They provide it to the operations team. All these artifacts are verified on the development environment and not in production or staging. Each team may take a different approach for setting up the runtime environment and the configuration and deployment activities, considering resource constraints and resource availability. In addition, the deployment process needs to be documented for future usage. Now, maintaining the documentation is a time-consuming task that requires collaboration between different stakeholders. 
Both teams work separately and hence there can be a situation where both use different automation techniques. Both teams are unaware of the challenges faced by each other and hence may not be able to visualize or understand an ideal scenario in which the application works. While the operations team is busy in deployment activities, the development team may get another request for a feature implementation or bug fix; in such a case, if the operations team faces any issues in deployment, they may try to consult the development team, who are already occupied with the new implementation request. This results in communication gaps, and the required collaboration may not happen. There is hardly any collaboration between the development team and the operations team. Poor collaboration causes many issues in the application's deployment to different environments, resulting in back-and-forth communication through e-mail, chat, calls, meetings, and so on, and it often ends in quick fixes. Challenges for the development team: The competitive market creates pressure of on-time delivery. They have to take care of production-ready code management and new feature implementation. The release cycle is often long and hence the development team has to make assumptions before the application deployment finally takes place. In such a scenario, it takes more time to fix the issues that occurred during deployment in the staging or production environment. Challenges for the operations team: Resource contention: It's difficult to handle increasing resource demands Redesigning or tweaking: This is needed to run the application in the production environment Diagnosing and rectifying: They are supposed to diagnose and rectify issues after application deployment in isolation The benefits of DevOps This diagram covers all the benefits of DevOps: Collaboration among different stakeholders brings many business and technical benefits that help organizations achieve their business goals. The DevOps lifecycle – it's all about "continuous" Continuous Integration(CI),Continuous Testing(CT), and Continuous Delivery(CD) are significant part of DevOps culture. CI includes automating builds, unit tests, and packaging processes while CD is concerned with the application delivery pipeline across different environments. CI and CD accelerate the application development process through automation across different phases, such as build, test, and code analysis, and enable users achieve end-to-end automation in the application delivery lifecycle: Continuous integration and continuous delivery or deployment are well supported by cloud provisioning and configuration management. Continuous monitoring helps identify issues or bottlenecks in the end-to-end pipeline and helps make the pipeline effective. Continuous feedback is an integral part of this pipeline, which directs the stakeholders whether are close to the required outcome or going in the different direction. "Continuous effort – not strength or intelligence – is the key to unlocking our potential"                                                                                            -Winston Churchill Continuous integration What is continuous integration? 
In simple words, CI is a software engineering practice where each check-in made by a developer is verified by either of the following: Pull mechanism: Executing an automated build at a scheduled time Push mechanism: Executing an automated build when changes are saved in the repository This step is followed by executing a unit test against the latest changes available in the source code repository. The main benefit of continuous integration is quick feedback based on the result of build execution. If it is successful, all is well; else, assign responsibility to the developer whose commit has broken the build, notify all stakeholders, and fix the issue. Read more about CI at http://martinfowler.com/articles/continuousIntegration.html. So why is CI needed? Because it makes things simple and helps us identify bugs or errors in the code at a very early stage of development, when it is relatively easy to fix them. Just imagine if the same scenario takes place after a long duration and there are too many dependencies and complexities we need to manage. In the early stages, it is far easier to cure and fix issues; consider health issues as an analogy, and things will be clearer in this context. Continuous integration is a development practice that requires developers to integrate code into a shared repository several times a day. Each check-in is then verified by an automated build, allowing teams to detect problems early. CI is a significant part and in fact a base for the release-management strategy of any organization that wants to develop a DevOps culture. Following are immediate benefits of CI: Automated integration with pull or push mechanism Repeatable process without any manual intervention Automated test case execution Coding standard verification Execution of scripts based on requirement Quick feedback: build status notification to stakeholders via e-mail Teams focused on their work and not in the managing processes Jenkins, Apache Continuum, Buildbot, GitLabCI, and so on are some examples of open source CI tools. AnthillPro, Atlassian Bamboo, TeamCity, Team Foundation Server, and so on are some examples of commercial CI tools. Continuous integration tools – Jenkins Jenkins was originally an open source continuous integration software written in Java under the MIT License. However, Jenkins 2 an open source automation server that focuses on any automation, including continuous integration and continuous delivery. Jenkins can be used across different platforms, such as Windows, Ubuntu/Debian, Red Hat/Fedora, Mac OS X, openSUSE, and FreeBSD. Jenkins enables user to utilize continuous integration services for software development in an agile environment. It can be used to build freestyle software projects based on Apache Ant and Maven 2/Maven 3. It can also execute Windows batch commands and shell scripts. It can be easily customized with the use of plugins. There are different kinds of plugins available for customizing Jenkins based on specific needs for setting up continuous integration. Categories of plugins include source code management (the Git, CVS, and Bazaar plugins), build triggers (the Accelerated Build Now and Build Flow plugins), build reports (the Code Scanner and Disk Usage plugins), authentication and user management (the Active Directory and GitHub OAuth plugins), and cluster management and distributed build (Amazon EC2 and Azure Slave plugins). To know more about all plugins, visit https://wiki.jenkins-ci.org/display/JENKINS/Plugins. 
To explore how to create a new plugin, visit https://wiki.jenkins-ci.org/display/JENKINS/Plugin+tutorial. To download different versions of plugins, visit https://updates.jenkins-ci.org/download/plugins/. Visit the Jenkins website at http://jenkins.io/. Jenkins accelerates the software development process through automation: Key features and benefits Here are some striking benefits of Jenkins: Easy install, upgrade, and configuration. Supported platforms: Windows, Ubuntu/Debian, Red Hat/Fedora/CentOS, Mac OS X, openSUSE, FreeBSD, OpenBSD, Solaris, and Gentoo. Manages and controls development lifecycle processes. Non-Java projects supported by Jenkins: Such as .NET, Ruby, PHP, Drupal, Perl, C++, Node.js, Python, Android, and Scala. A development methodology of daily integrations verified by automated builds. Every commit can trigger a build. Jenkins is a fully featured technology platform that enables users to implement CI and CD. The use of Jenkins is not limited to CI and CD. It is possible to include a model and orchestrate the entire pipeline with the use of Jenkins as it supports shell and Windows batch command execution. Jenkins 2.0 supports a delivery pipeline that uses a Domain-Specific Language (DSL) for modeling entire deployments or delivery pipelines. Pipeline as code provides a common language—DSL—to help the development and operations teams to collaborate in an effective manner. Jenkins 2 brings a new GUI with stage view to observe the progress across the delivery pipeline. Jenkins 2.0 is fully backward compatible with the Jenkins 1.x series. Jenkins 2 now requires Servlet 3.1 to run. You can use embedded Winstone-Jetty or a container that supports Servlet 3.1 (such as Tomcat 8). GitHub, Collabnet, SVN, TFS code repositories, and so on are supported by Jenkins for collaborative development. Continuous integration: Automate build and test—automated testing (continuous testing), package, and static code analysis. Supports common test frameworks such as HP ALM Tools, Junit, Selenium, and MSTest. For continuous testing, Jenkins has plugins for both; Jenkins slaves can execute test suites on different platforms. Jenkins supports static code analysis tools such as code verification by CheckStyle and FindBug. It also integrates with Sonar. Continuous delivery and continuous deployment: It automates the application deployment pipeline, integrates with popular configuration management tools, and automates environment provisioning. To achieve continuous delivery and deployment, Jenkins supports automatic deployment; it provides a plugin for direct integration with IBM uDeploy. Highly configurable: Plugins-based architecture that provides support to many technologies, repositories, build tools, and test tools; it has an open source CI server and provides over 400 plugins to achieve extensibility. Supports distributed builds: Jenkins supports "master/slave" mode, where the workload of building projects is delegated to multiple slave nodes. It has a machine-consumable remote access API to retrieve information from Jenkins for programmatic consumption, to trigger a new build, and so on. It delivers a better application faster by automating the application development lifecycle, allowing faster delivery. The Jenkins build pipeline (quality gate system) provides a build pipeline view of upstream and downstream connected jobs, as a chain of jobs, each one subjecting the build to quality-assurance steps. 
It has the ability to define manual triggers for jobs that require intervention prior to execution, such as an approval process outside of Jenkins. In the following diagram, Quality Gates and Orchestration of Build Pipeline is illustrated.

Jenkins can be used with the following tools in different categories, as shown here:

Language: Java, .Net
Code repositories: Subversion, Git, CVS, StarTeam
Build tools: Ant, Maven (Java); NAnt, MS Build (.Net)
Code analysis tools: Sonar, CheckStyle, FindBugs, NCover, Visual Studio Code Metrics, PowerTool
Continuous integration: Jenkins
Continuous testing: Jenkins plugins (HP Quality Center 10.00 with the QuickTest Professional add-in, HP Unified Functional Testing 11.5x and 12.0x, HP Service Test 11.20 and 11.50, HP LoadRunner 11.52 and 12.0x, HP Performance Center 12.xx, HP QuickTest Professional 11.00, HP Application Lifecycle Management 11.00, 11.52, and 12.xx, HP ALM Lab Management 11.50, 11.52, and 12.xx, JUnit, MSTest, and VsTest)
Infrastructure provisioning: Configuration management tool (Chef)
Virtualization/cloud service provider: VMware, AWS, Microsoft Azure (IaaS), traditional environment
Continuous delivery/deployment: Chef, deployment plugin, shell scripting, PowerShell scripts, Windows batch commands

Installing Jenkins

Jenkins provides us with multiple ways to install it for all types of users. We can install it on at least the following operating systems: Ubuntu/Debian, Windows, Mac OS X, OpenBSD, FreeBSD, openSUSE, Gentoo, and CentOS/Fedora/Red Hat.

One of the easiest options I recommend is to use a WAR file. A WAR file can be used with or without a container or web application server. Having Java is a must before we try to use a WAR file for Jenkins, which can be done as follows:

1. Download the jenkins.war file from https://jenkins.io/.
2. Open command prompt in Windows or a terminal in Linux, go to the directory where the jenkins.war file is stored, and execute the following command:

   java -jar jenkins.war

3. Once Jenkins is fully up and running, as shown in the following screenshot, explore it in the web browser by visiting http://localhost:8080. By default, Jenkins works on port 8080. To use a different port, execute the following command from the command line:

   java -jar jenkins.war --httpPort=9999

   For HTTPS, use the following command:

   java -jar jenkins.war --httpsPort=8888

Once Jenkins is running, visit the Jenkins home directory. In our case, we have installed Jenkins 2 on a CentOS 6.7 virtual machine. Go to /home/<username>/.jenkins, as shown in the following screenshot. If you can't see the .jenkins directory, make sure hidden files are visible. In CentOS, press Ctrl+H to make hidden files visible.

Setting up Jenkins

Now that we have installed Jenkins, let's verify whether Jenkins is running. Open a browser and navigate to http://localhost:8080 or http://<IP_ADDRESS>:8080. If you've used Jenkins earlier and recently downloaded the Jenkins 2 WAR file, it will ask for a security setup. To unlock Jenkins, follow these steps:

1. Go to the .jenkins directory and open the initialAdminPassword file from the secrets subdirectory.
2. Copy the password in that file, paste it in the Administrator password box, and click on Continue, as shown here.
3. Clicking on Continue will redirect you to the Customize Jenkins page. Click on Install suggested plugins.
4. The installation of the plugins will start. Make sure that you have a working Internet connection.
5. Once all the required plugins have been installed, you will see the Create First Admin User page. Provide the required details, and click on Save and Finish: Jenkins is ready!
Our Jenkins setup is complete. Click on Start using Jenkins. You can get Jenkins plugins from https://wiki.jenkins-ci.org/display/JENKINS/Plugins.

Summary

We have covered some brief details on DevOps culture and on Jenkins 2.0 and its new features. DevOps for Web Development provides more details on extending Continuous Integration to Continuous Delivery and Continuous Deployment using configuration management tools such as Chef and cloud computing platforms such as Microsoft Azure (App Services) and AWS (Amazon EC2 and AWS Elastic Beanstalk); refer to https://www.packtpub.com/networking-and-servers/devops-web-development. To get more details on Jenkins, refer to Jenkins Essentials, https://www.packtpub.com/application-development/jenkins-essentials.

Resources for Article:

Further resources on this subject:
Setting Up and Cleaning Up [article]
Maven and Jenkins Plugin [article]
Exploring Jenkins [article]

Setting Up a Network Backup Server with Bacula

Packt
19 Sep 2016
12 min read
In this article by Timothy Boronczyk,the author of the book CentOS 7 Server Management Cookbook,we'll discuss how to set up a network backup server with Bacula. The fact of the matter is that we are living in a world that is becoming increasingly dependent on data. Also, from accidental deletion to a catastrophic hard drive failure, there are many threats to the safety of your data. The more important your data is and the more difficult it is to recreate if it were lost, the more important it is to have backups. So, this article shows you how you can set up a backup server using Bacula and how to configure other systems on your network to back up their data to it. (For more resources related to this topic, see here.) Getting ready This article requires at least two CentOS systems with working network connections. The first system is the local system which we'll assume has the hostname benito and the IP address 192.168.56.41. The second system is the backup server. You'll need administrative access on both systems, either by logging in with the root account or through the use of sudo. How to do it… Perform the following steps on your local system to install and configure the Bacula file daemon: Install the bacula-client package. yum install bacula-client Open the file daemon's configuration file with your text editor. vi /etc/bacula/bacula-fd.conf In the FileDaemon resource, update the value of the Name directive to reflect the system's hostname with the suffix -fd. FileDaemon {   Name = benito-fd ... } Save the changes and close the file. Start the file daemon and enable it to start when the system reboots. systemctl start bacula-fd.service systemctl enable bacula-fd.service Open the firewall to allow TCP traffic through to port 9102. firewall-cmd --zone=public --permanent --add-port=9102/tcp firewall-cmd --reload Repeat steps 1-6 on each system that will be backed up. Install the bacula-console, bacula-director, bacula-storage, and bacula-client packages. yum install bacula-console bacula-director bacula-storage bacula-client Re-link the catalog library to use SQLite database storage. alternatives --config libbaccats.so Type 2 when asked to provide the selection number. Create the SQLite database file and import the table schema. /usr/libexec/bacula/create_sqlite3_database /usr/libexec/bacula/make_sqlite3_tables Open the director's configuration file with your text editor. vi /etc/bacula/bacula-dir.conf In the Job resource where Name has the value BackupClient1, change the value of the Name directive to reflect one of the local systems. Then add a Client directive with a value that matches that system's FileDaemonName. Job {   Name = "BackupBenito"   Client = benito-fd   JobDefs = "DefaultJob" } Duplicate the Job resource and update its directive values as necessary so that there is a Job resource defined for each system to be backed up. For each system that will be backed up, duplicate the Client resource where the Name directive is set to bacula-fd. In the copied resource, update the Name and Address directives to identify that system. Client {   Name = bacula-fd   Address = localhost   ... } Client {   Name = benito-fd   Address = 192.168.56.41   ... } Client {   Name = javier-fd   Address = 192.168.56.42   ... } Save your changes and close the file. Open the storage daemon's configuration file. vi /etc/bacula/bacula-sd.conf In the Device resource where Name has the value FileStorage, change the value of the Archive Device directive to /bacula. 
Device {   Name = FileStorage   Media Type = File   Archive Device = /bacula ... Save the update and close the file. Create the /bacula directory and assign it the proper ownership. mkdir /bacula chown bacula:bacula /bacula If you have SELinux enabled, reset the security context on the new directory. restorecon -Rv /bacula Start the director and storage daemons and enable them to start when the system reboots. systemctl start bacula-dir.service bacula-sd.service bacula-fd.service systemctl enable bacula-dir.service bacula-sd.service bacula-fd.service Open the firewall to allow TCP traffic through to ports 9101-9103. firewall-cmd --zone=public --permanent --add-port=9101-9103/tcp firewall-cmd –reload Launch Bacula's console interface. bconsole Enter label to create a destination for the backup. When prompted for the volume name, use Volume0001 or a similar value. When prompted for the pool, select the File pool. label Enter quit to leave the console interface. How it works… The suite's distributed architecture and the amount of flexibility it offers us can make configuring Bacula a daunting task.However, once you have everything up and running, you'll be able to rest easy knowing that your data is safe from disasters and accidents. Bacula is broken up into several components. In this article, our efforts centered on the following three daemons: the director, the file daemon, and the storage daemon. The file daemon is installed on each local system to be backed up and listens for connections from the director. The director connects to each file daemon as scheduled and tells it whichfiles to back up and where to copy them to (the storage daemon). This allows us to perform all scheduling at a central location. The storage daemon then receives the data and writes it to the backup medium, for example, disk or tape drive. On the local system, we installed the file daemon with the bacula-client package andedited the file daemon's configuration file at /etc/bacula/bacula-fd.conf to specify the name of the process. The convention is to add the suffix -fd to the system's hostname. FileDaemon {   Name = benito-fd   FDPort = 9102   WorkingDirectory = /var/spool/bacula   Pid Directory = /var/run   Maximum Concurrent Jobs = 20 } On the backup server, we installed thebacula-console, bacula-director, bacula-storage, and bacula-client packages. This gives us the director and storage daemon and another file daemon. This file daemon's purpose is to back up Bacula's catalog. Bacula maintains a database of metadata about previous backup jobs called the catalog, which can be managed by MySQL, PostgreSQL, or SQLite. To support multiple databases, Bacula is written so that all of its database access routines are contained in shared libraries with a different library for each database. When Bacula wants to interact with a database, it does so through libbaccats.so, a fake library that is nothing more than a symbolic link pointing to one of the specific database libraries. This let's Bacula support different databases without requiring us to recompile its source code. To create the symbolic link, we usedalternatives and selected the real library that we want to use. I chose SQLite since it's an embedded database library and doesn't require additional services. Next, we needed to initialize the database schema using scripts that come with Bacula. If you want to use MySQL, you'll need to create a dedicated MySQL user for Bacula to use and then initialize the schema with the following scripts instead. 
You'll also need to review Bacula's configuration files to provide Bacula with the required MySQL credentials. /usr/libexec/bacula/grant_mysql_privileges /usr/libexec/bacula/create_mysql_database /usr/libexec/bacula/make_mysql_tables Different resources are defined in the director's configuration file at /etc/bacula/bacula-dir.conf, many of which consist not only of their own values but also reference other resources. For example, the FileSet resource specifies which files are included or excluded in backups and restores, while a Schedule resource specifies when backups should be made. A JobDef resource can contain various configuration directives that are common to multiple backup jobs and also reference particular FileSet and Schedule resources. Client resources identify the names and addresses of systems running file daemons, and a Job resource will pull together a JobDef and Client resource to define the backup or restore task for a particular system. Some resources define things at a more granular level and are used as building blocks to define other resources. Thisallows us to create complex definitions in a flexible manner. The default resource definitions outline basic backup and restore jobs that are sufficient for this article (you'll want to study the configuration and see how the different resources fit together so that you can tweak them to better suit your needs). We customized the existing backup Jobresource by changing its name and client. Then, we customized the Client resource by changing its name and address to point to a specific system running a file daemon. A pair of Job and Client resources can be duplicated for each additional system youwant to back up. However, notice that I left the default Client resource that defines bacula-fd for the localhost. This is for the file daemon that's local to the backup server and will be the target for things such as restore jobs and catalog backups. Job {   Name = "BackupBenito"   Client = benito-fd   JobDefs = "DefaultJob" }   Job {   Name = "BackupJavier"   Client = javier-fd   JobDefs = "DefaultJob" }   Client {   Name = bacula-fd   Address = localhost   ... }   Client {   Name = benito-fd   Address = 192.168.56.41   ... }   Client {   Name = javier-fd   Address = 192.168.56.42   ... } To complete the setup, we labeled a backup volume. This task, as with most others, is performed through bconsole, a console interface to the Bacula director. We used thelabel command to specify a label for the backup volume and when prompted for the pool, we assigned the labeled volume to the File pool. In a way very similar to how LVM works, an individual device or storage unit is allocated as a volume and the volumes are grouped into storage pools. If a pool contains two volumes backed by tape drives for example and one of the drives is full, the storage daemon will write the data to the tape that has space available. Even though in our configuration we're storing the backup to disk, we still need to create a volume as the destination for data to be written to. There's more... At this point, you should consider which backup strategy works best for you. A full backup is a complete copy of your data, a differential backup captures only the files that have changed since the last full backup, and an incremental backup copies the files that have changed since the last backup (regardless of the type of backup). 
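In Bacula, these levels come together in a Schedule resource in bacula-dir.conf, which a Job then references. The following is only a rough sketch modeled on the sample schedule that ships in the stock bacula-dir.conf; the names and times are placeholders, so check the Run directive syntax of your Bacula version before relying on it:

# Illustrative schedule: monthly full, weekly differential, nightly incremental
Schedule {
  Name = "WeeklyCycle"
  Run = Full 1st sun at 23:05
  Run = Differential 2nd-5th sun at 23:05
  Run = Incremental mon-sat at 23:05
}

# Attach the schedule to a job (BackupBenito is the job defined earlier)
Job {
  Name = "BackupBenito"
  Client = benito-fd
  JobDefs = "DefaultJob"
  Schedule = "WeeklyCycle"
}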
Commonly, administrators employ a previous combination, perhaps making a full backup at the start of the week and then differential or incremental backups each day thereafter. This saves storage space because the differential and incremental backups are not only smaller but also convenient when the need to restore a file arises because a limited number of backups need to be searched for the file. Another consideration is the expected size of each backup and how long it will take for the backup to run to completion. Full backups obviously take longer to run, and in an office with 9-5 working hours, Monday through Friday and it may not be possible to run a full backup during the evenings. Performing a full backup on Fridays gives the backup time over the weekend to run. Smaller, incremental backups can be performed on the other days when time is lesser. Yet another point that is important in your backup strategy is, how long the backups will be kept and where they will be kept. A year's worth of backups is of no use if your office burns down and they were sitting in the office's IT closet. At one employer, we kept the last full back up and last day's incremental on site;they were then duplicated to tape and stored off site. Regardless of the strategy you choose to implement, your backups are only as good as your ability to restore data from them. You should periodically test your backups to make sure you can restore your files. To run a backup job on demand, enter run in bconsole. You'll be prompted with a menu to select one of the current configured jobs. You'll then be presented with the job's options, such as what level of backup will be performed (full, incremental, or differential), it's priority, and when it will run. You can type yes or no to accept or cancel it or mod to modify a parameter. Once accepted, the job will be queued and assigned a job ID. To restore files from a backup, use the restore command. You'll be presented with a list of options allowing you to specify which backup the desired files will be retrieved from. Depending on your selection, the prompts will be different. Bacula's prompts are rather clear, so read them carefully and they will guide you through the process. Apart from the run and restore commands, another useful command is status. It allows you to see the current status of the Bacula components, if there are any jobs currently running, and which jobs have completed. A full list of commands can be retrieved by typing help in bconsole. See also For more information on working with Bacula, refer to the following resources: Bacula documentation (http://blog.bacula.org/documentation/) How to use Bacula on CentOS 7 (http://www.digitalocean.com/community/tutorial_series/how-to-use-bacula-on-centos-7) Bacula Web (a web-based reporting and monitoring tool for Bacula) (http://www.bacula-web.org/) Summary In this article, we discussed how we can set up a backup server using Bacula and how to configure other systems on our network to back up our data to it. Resources for Article: Further resources on this subject: Jenkins 2.0: The impetus for DevOps Movement [article] Gearing Up for Bootstrap 4 [article] Introducing Penetration Testing [article]

Making History with Event Sourcing

Packt
15 Sep 2016
18 min read
In this article by Christian Baxter, author of the book Mastering Akka, we will see, the most common, tried, and true approach is to model the data in a relational database when it comes to the persistence needs for an application. Following this approach has been the de facto way to store data until recently, when NoSQL (and to a lesser extent NewSQL) started to chip away at the footholds of relational database dominance. There's nothing wrong with storing your application's data this way—it's how we initially chose to do so for the bookstore application using PostgreSQL as the storage engine. This article deals with event sourcing and how to implement that approach using Akka Persistence. These are the main things you can expect to learn from this article: (For more resources related to this topic, see here.) Akka persistence for event sourcing Akka persistence is a relatively newer module within the Akka toolkit. It became available as experimental in the 2.3.x series. Throughout that series, it went through quite a few changes as the team worked on getting the API and functionality right. When Akka 2.4.2 was released, the experimental label was removed, signifying that persistence was stable and ready to be leveraged in production code. Akka persistence allows stateful actors to persist their internal state. It does this not to persisting the state itself, but instead as changes to that state. It uses an append-only model to persist these state changes, allowing you to later reconstitute the state by replaying the changes to that state. It also allows you to take periodic snapshots and use those to reestablish an actor's state as a performance optimization for long-lived entities with lots of state changes. Akka persistence's approach should certainly sound familiar as it's almost a direct overlay to the features of event sourcing. In fact, it was inspired by the eventsourced Scala library, so that overlay is no coincidence. Because of this alignment with event sourcing, Akka persistence will be the perfect tool for us to switch over to an event sourced model. Before getting into the details of the refactor, I want to describe some of the high-level concepts in the framework. The PersistentActor trait The PersistentActor trait is the core building block to create event sourced entities. This actor is able to persist its events to a pluggable journal. When a persistent actor is restarted (reloaded), it will replay its journaled events to reestablish its current internal state. These two behaviors perfectly fit what we need to do for our event sourced entities, so this will be our core building block. The PersistentActor trait has a log of features, more that I will cover in the next few sections. I'll cover the things that we will use in the bookstore refactoring, which I consider to be the most useful features in PersistentActor. If you want to learn more, then I suggest you take a look at the Akka documents as they pretty much cover everything else that you can do with PersistentActor. Persistent actor state handling A PersistentActor implementation has two basic states that it can be in—Recovering and Receiving Commands. When Recovering, it's in the process of reloading its event stream from the journal to rebuild its internal state. Any external messages that come in during this time will be stashed until the recovery process is complete. Once the recovery process completes, the persistent actor transitions into the Receiving Commands state where it can start to handle commands. 
These commands can then generate new events that can further modify the state of this entity. This two-state flow can be visualized in the following diagram:

These two states are both represented by custom actor receive handling partial functions. You must provide implementations for both of the following vals in order to properly implement these two states for your persistent actor:

val receiveRecover: Receive = { . . . }
val receiveCommand: Receive = { . . . }

While in the recovering state, there are two possible messages that you need to be able to handle. The first is one of the event types that you previously persisted for this entity type. When you get that type of message, you have to reapply the change implied by that event to the internal state of the actor. For example, if we had a SalesOrderFO fields object as the internal state, and we received a replayed event indicating that the order was approved, the handling might look something like this:

var state: SalesOrderFO = ...

val receiveRecover: Receive = {
  case OrderApproved(id) =>
    state = state.copy(status = SalesOrderStatus.Approved)
}

We'd, of course, need to handle a lot more than that one event. This code sample was just to show you how you can modify the internal state of a persistent actor while it's being recovered. Once the actor has completed the recovery process, it can transition into the state where it starts to handle incoming command requests.

Event sourcing is all about Action (command) and Reaction (events). When the persistent actor receives a command, it has the option to generate zero to many events as a result of that command. These events represent a happening on that entity that will affect its current state. Events you receive while in the Recovering state will have been previously generated while in the Receiving Commands state. So, the OrderApproved event we receive in the preceding example must have previously come from some command that we handled earlier. The handling of that command could have looked something like this:

val receiveCommand: Receive = {
  case ApproveOrder(id) =>
    persist(OrderApproved(id)){ event =>
      state = state.copy(status = SalesOrderStatus.Approved)
      sender() ! FullResult(state)
    }
}

After receiving the command request to change the order status to approved, the code makes a call to persist, which will asynchronously write an event into the journal. The full signature for persist is:

persist[A](event: A)(handler: (A) ⇒ Unit): Unit

The first argument represents the event that you want to write to the journal. The second argument is a callback function that will be executed after the event has been successfully persisted (and won't be called at all if the persistence fails). For our example, we use that callback function to mutate the internal state, updating the status field to match the requested action.

One thing to note is that the write to the journal is asynchronous. One may then think that it's possible to be closing over that internal state in an unsafe way when the callback function is executed. If you persisted two events in rapid succession, couldn't it be possible for both callbacks to access that internal state at the same time in separate threads, kind of like when using Futures in an actor? Thankfully, this is not the case. The completion of a persistence action is sent back as a new message to the actor. The hidden receive handling for this message will then invoke the callback associated with that persistence action.
By using the mailbox again, we know these post-persistence actions will be executed one at a time, in a safe manner. As an added bonus, the sender associated with those post-persistence messages will be the original sender of the command, so you can safely use sender() in a persistence callback to reply to the original requestor, as shown in my example.

Another guarantee that the persistence framework makes when persisting events is that no other commands will be processed in between the persistence action and the associated callback. Any commands that come in during that time will be stashed until all of the post-persistence actions have been completed. This makes the persist/callback sequence atomic and isolated, in that nothing else can interfere with it while it's happening. Allowing additional commands to be executed during this process could lead to an inconsistent state and response to the caller who sent the commands.

If, for some reason, persisting to the journal fails, there is an onPersistFailure callback that will be invoked. If you want to implement custom handling for this, you can override this method. No matter what, when persistence fails, the actor will be stopped after making this callback. At this point, it's possible that the actor is in an inconsistent state, so it's safer to stop it than to allow it to continue on in this state. Persistence failures probably mean something is failing with the journal anyway, so restarting, as opposed to stopping, will more than likely lead to even more failures.

There's one more callback that you can implement in your persistent actors, and that's onPersistRejected. This will happen if the serialization framework rejects the serialization of the event to store. When this happens, the persist callback does not get invoked, so no internal state update will happen. In this case, the actor does not stop or restart, because it's not in an inconsistent state and the journal itself is not failing.

The PersistenceId
Another important concept that you need to understand with PersistentActor is the persistenceId method. This abstract method must be defined for every type of PersistentActor you define, returning a String that is to be unique across different entity types and also between actor instances within the same type. Let's say I create the Book entity as a PersistentActor and define the persistenceId method as follows:

override def persistenceId = "book"

If I do that, then I will have a problem with this entity, in that every instance will share the same event stream as every other Book instance. If I want each instance of the Book entity to have its own separate event stream (and trust me, you will), then I will do something like this when defining the Book PersistentActor:

class Book(id: Int) extends PersistentActor {
  override def persistenceId = s"book-$id"
}

If I follow an approach like this, then I can be assured that each of my entity instances will have its own separate event stream, as the persistenceId will be unique for every Int keyed book we have.

In the current model, when creating a new instance of an entity, we pass in the special ID of 0 to indicate that this entity does not yet exist and needs to be persisted. We defer ID creation to the database, and once we have an ID (after persistence), we stop that actor instance as it is not properly associated with that newly generated ID.
With the persistenceId model of associating the event stream to an entity, we will need the ID as soon as we create the actor instance. This means we will need a way to have a unique identifier even before persisting the initial entity state. This is something to think about before we get to the upcoming refactor.

Taking snapshots for faster recovery
I've mentioned the concept of taking a snapshot of the current state of an entity to speed up the process of recovering its state. If you have a long-lived entity that has generated a large number of events, it will take progressively more and more time to recover its state. Akka's PersistentActor supports the snapshot concept, putting it in your hands as to when to take the snapshot. Once you have taken snapshots, the latest one will be offered to the entity during the recovery phase instead of all of the events that led up to it. This will reduce the total number of events to process to recover state, thus speeding up that process.

This is a two-part process, with the first part being taking snapshots periodically and the second being handling them during the recovery phase. Let's take a look at the snapshot taking process first. Let's say that you coded a particular entity to save a new snapshot for every one hundred events received. To make this happen, your command handling block may look something like this:

var eventTotal = ...

val receiveCommand: Receive = {
  case UpdateStatus(status) =>
    persist(StatusUpdated(status)){ event =>
      state = state.copy(status = event.status)
      eventTotal += 1
      if (eventTotal % 100 == 0)
        saveSnapshot(state)
    }
  case SaveSnapshotSuccess(metadata) => . . .
  case SaveSnapshotFailure(metadata, reason) => . . .
}

You can see in the post-persist logic that we are making a specific call to saveSnapshot, passing the latest version of the actor's internal state. You're not limited to doing this just in the post-persist logic in reaction to a new event; you can also set up the actor to publish a snapshot on regular intervals. You can leverage Akka's scheduler to send a special message to the entity to instruct it to save a snapshot periodically.

If you start saving snapshots, then you will have to start handling the two new messages that will be sent to the entity indicating the status of the saved snapshot. These two new message types are SaveSnapshotSuccess and SaveSnapshotFailure. The metadata that appears on both messages will tell you things such as the persistence ID, the sequence number of the snapshot, and the timestamp of when it was taken. You can see these two new messages in the command handling block shown in the preceding code.

Once you have saved a snapshot, you will need to start handling it in the recovery phase. The logic to handle a snapshot during recovery will look like the following code block:

val receiveRecover: Receive = {
  case SnapshotOffer(metadata, offeredSnapshot) =>
    state = offeredSnapshot
  case event => . . .
}

Here, you can see that if we get a snapshot during recovery, instead of just making an incremental change, as we do with real replayed events, we set the entire state to whatever the offered snapshot is. There may be hundreds of events that led up to that snapshot, but all we need to handle here is one message in order to wind the state forward to when we took that snapshot. This process will certainly pay dividends if we have lots of events for this entity and we continue to take periodic snapshots.
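As mentioned above, you can also drive snapshots off Akka's scheduler rather than off an event count. The following is a minimal sketch of that idea, not code from the bookstore application; the StatusEntity name, the TakeSnapshot message, and the 10-minute interval are all assumptions of mine:

import akka.persistence.{PersistentActor, SnapshotOffer, SaveSnapshotSuccess, SaveSnapshotFailure}
import scala.concurrent.duration._

case class UpdateStatus(status: String)
case class StatusUpdated(status: String)
case object TakeSnapshot

class StatusEntity extends PersistentActor {
  import context.dispatcher

  override def persistenceId = "status-entity"

  private var state: String = ""

  // Have the scheduler remind this actor to snapshot its state every 10 minutes
  context.system.scheduler.schedule(10.minutes, 10.minutes, self, TakeSnapshot)

  val receiveRecover: Receive = {
    case SnapshotOffer(_, snapshot: String) => state = snapshot
    case StatusUpdated(status)              => state = status
  }

  val receiveCommand: Receive = {
    case UpdateStatus(status) =>
      persist(StatusUpdated(status))(event => state = event.status)
    case TakeSnapshot =>
      saveSnapshot(state)
    case SaveSnapshotSuccess(metadata)         => // snapshot written; nothing more to do here
    case SaveSnapshotFailure(metadata, reason) => // log the failure; recovery will just replay more events
  }
}

Whether you snapshot by event count or by time is a trade-off between recovery speed and snapshot-store traffic; either way the event history itself is untouched.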
One thing to note about snapshots is that you will only ever be offered the latest snapshot (per persistence id) during the recovery process. Even though I'm taking a new snapshot every 100 events, I will only ever be offered one, the latest one, during the recovery phase. Another thing to note is that there is no real harm in losing a snapshot. If your snapshot storage was wiped out for some reason, the only negative side effect is that you'll be stuck processing all of the events for an entity when recovering it. When you take snapshots, you don't lose any of the event history. Snapshots are completely supplemental and only benefit the performance of the recovery phase. You don't need to take them, and you can live without them if something happens to the ones you had taken.

Serialization of events and snapshots
Within both the persistence and snapshot examples, you can see I was passing objects into the persist and saveSnapshot calls. So, how are these objects marshaled to and from a format that can actually be written to those stores? The answer is via Akka serialization. Akka persistence depends on Akka serialization to convert event and snapshot objects to and from a binary format that can be saved into a data store. If you don't make any changes to the default serialization configuration, then your objects will be converted into binary via Java serialization. Java serialization is both slow and inefficient in terms of the size of the serialized object. It's also not flexible when the object definition has changed after the binary was produced and you are trying to read it back in. It's not a good choice for our needs with our event sourced app.

Luckily, Akka serialization allows you to provide your own custom serializers. If you, perhaps, wanted to use JSON as your serialized object representation, then you can pretty easily build a custom serializer to do that. There is also a built-in Google Protobuf serializer that can convert your Protobuf binding classes into their binary format. We'll explore both custom serializers and the Protobuf serializer when we get into the sections dealing with the refactors.

The AsyncWriteJournal
Another important component in Akka persistence, which I've mentioned a few times already, is the AsyncWriteJournal. This component is an append-only data store that stores the sequence of events (per persistence id) a PersistentActor generates via calls to persist. The journal also stores the highestSequenceNr per persistence id, which tracks the total number of persisted events for that persistence id.

The journal is a pluggable component. You have the ability to configure the default journal and also override it on a per-entity basis. The default configuration for Akka does not provide a value for the journal to use, so you must either configure this setting or add a per-entity override (more on that in a moment) in order to start using persistence. If you want to set the default journal, then it can be set in your config with the following property:

akka.persistence.journal.plugin="akka.persistence.journal.leveldb"

The value in the preceding code must be the fully qualified path to another configuration section of the same name where the journal plugin's config lives. For this example, I set it to the already provided leveldb config section (from Akka's reference.conf).
If you want to override the journal plugin for a particular entity instance only, then you can do so by overriding the journalPluginId method on that entity actor, as follows:

class MyEntity extends PersistentActor {
  override def journalPluginId = "my-other-journal"
  . . .
}

The same rules apply here, in that my-other-journal must be the fully qualified name of a config section where the config for that plugin lives.

My example config showed the use of the leveldb plugin, which writes to the local file system. If you actually want to play around with this simple plugin, then you will also need to add the following dependencies to your sbt file:

"org.iq80.leveldb" % "leveldb" % "0.7"
"org.fusesource.leveldbjni" % "leveldbjni-all" % "1.8"

If you want to use something different, then you can check the community plugins page on the Akka site to find one that suits your needs. For our app, we will use the Cassandra journal plugin. I'll show you how to set up the config for that in the section dealing with the installation of Cassandra.

The SnapshotStore
The last thing I want to cover before we start the refactoring process is the SnapshotStore. Like the AsyncWriteJournal, the SnapshotStore is a pluggable and configurable storage system, but this one stores just snapshots as opposed to the entire event stream for a persistence id. As I mentioned earlier, you don't need snapshots, and you can survive if the storage system you used for them gets wiped out for some reason. Because of this, you may consider using a separate storage plugin for them.

When selecting the storage system for your events, you need something that is robust, distributed, highly available, fault tolerant, and backup capable. If you lose these events, you lose the entire data set for your application. The same is not true for snapshots, so take that information into consideration when selecting the storage. You may decide to use the same system for both, but you certainly don't have to. Also, not every journal plugin can act as a snapshot plugin; so, if you decide to use the same for both, make sure that the journal plugin you select can handle snapshots.

If you want to configure the snapshot store, then the config setting to do that is as follows:

akka.persistence.snapshot-store.plugin="my-snapshot-plugin"

The setting here follows the same rules as the write journal; the value must be the fully qualified name of a config section where the plugin's config lives. If you want to override the default setting on a per-entity basis, then you can do so by overriding the snapshotPluginId method on your actor like this:

class MyEntity extends PersistentActor {
  override def snapshotPluginId = "my-other-snap-plugin"
  . . .
}

The same rules apply here as well: the value must be a fully qualified path to a config section where the plugin's config lives. Also, there are no out-of-the-box default settings for the snapshot store, so if you want to use snapshots, you must either set the appropriate setting in your config or provide the earlier mentioned override on a per-entity basis.

For our needs, we will use the same storage mechanism, Cassandra, for both the write journal and the snapshot storage. We have a multi-node system currently, so using something that writes to the local file system, or a simple in-memory plugin, won't work for us.
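To see the journal and snapshot store settings side by side, here is a minimal application.conf sketch for local, single-node experimentation with the leveldb journal and the local snapshot store that ship with Akka. It is only an illustration (the directory values are assumptions of mine) and, as noted above, it would not suit our multi-node bookstore system:

akka {
  persistence {
    # append-only event journal, written to the local file system
    journal.plugin = "akka.persistence.journal.leveldb"
    journal.leveldb.dir = "target/journal"

    # local file system snapshot store
    snapshot-store.plugin = "akka.persistence.snapshot-store.local"
    snapshot-store.local.dir = "target/snapshots"
  }
}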
Summary
In this article, you learned about Akka persistence for event sourcing, and about taking snapshots of the current state of an entity to speed up the process of recovering its state.

Resources for Article:

Further resources on this subject:
Introduction to Akka [article]
Using NoSQL Databases [article]
PostgreSQL – New Features [article]

article-image-decoding-why-good-php-developerisnt-oxymoron
Packt
14 Sep 2016
20 min read
Save for later

Decoding Why "Good PHP Developer"Isn't an Oxymoron

In this article by Junade Ali, author of the book Mastering PHP Design Patterns, we will be revisiting object-oriented programming. Back in 2010, MailChimp published a blog post entitled Ewww, You Use PHP? In it, they described the horror they encountered when they explained their choice of PHP to developers who consider the phrase good PHP programmer an oxymoron. In their rebuttal, they argued that their PHP wasn't your grandfather's PHP and that they use a sophisticated framework. I tend to judge the quality of PHP on the basis of not only how it functions, but how secure it is and how it is architected. This book focuses on ideas of how you should architect your code. Good software design allows developers to extend code beyond its original purpose in a bug-free and elegant fashion.

(For more resources related to this topic, see here.)

As Martin Fowler put it:

Any fool can write code that a computer can understand. Good programmers write code that humans can understand.

This isn't just limited to code style, but extends to how developers architect and structure their code. I've encountered many developers with their noses constantly stuck in documentation, copying and pasting bits of code and hacking snippets together until they work. Moreover, I far too often see the software development process rapidly deteriorate as developers ever more tightly couple their classes with functions of ever increasing length.

Software engineers mustn't just code software; they must know how to design it. Indeed, a good software engineer, when interviewing other software engineers, will often ask questions surrounding the design of the code itself. It is trivial to get a piece of code to execute, and it tells you very little to ask a developer whether strtolower or str2lower is the correct name of a function (for the record, it's strtolower). Knowing the difference between a class and an object doesn't make you a competent developer; a better interview question would, for example, be how one could apply subtype polymorphism to a real software development challenge. Failure to assess software design skills dumbs down an interview and results in there being no way to differentiate between those who are good at it and those who aren't. These advanced topics will be discussed throughout this book; by learning these tactics, you will better understand what the right questions to ask are when discussing software architecture.

Moxie Marlinspike once tweeted:

As a software developer, I envy writers, musicians, and filmmakers. Unlike software, when they create something it is really done, forever.

When developing software, we mustn't forget that we are authors, not just of instructions for a machine, but of something that we later expect others to extend. Therefore, our code mustn't just be targeted at machines, but at humans also. Code isn't just poetry for a machine; it should be poetry for humans too. This is, of course, easier said than done. In PHP, this may be especially difficult given the freedom PHP offers developers in how they may architect and structure their code. By the very nature of freedom, it may be both used and abused, and so it is with the freedom offered in PHP.
Therefore, it is increasingly important that developers understand proper software design practices to ensure their code remains maintainable in the long term. Indeed, another key skill lies in refactoring code: improving the design of existing code to make it easier to extend in the longer term.

Technical debt, the eventual consequence of poor system design, is something that I've found comes with the career of a PHP developer. This has been true for me whether I have been dealing with systems that provide advanced functionality or with simple websites. It usually arises because a developer elects to implement a bad design for a variety of reasons, whether when adding functionality to an existing codebase or when taking poor design decisions during the initial construction of the software. Refactoring can help us address these issues.

SensioLabs (the creators of the Symfony framework) have a tool called Insight that allows developers to calculate the technical debt in their own code. In 2011, they did an evaluation of technical debt in various projects using this tool; rather unsurprisingly, they found that WordPress 4.1 topped the chart of all platforms they evaluated, claiming it would take 20.1 years to resolve the technical debt that the project contains. Those familiar with the WordPress core may not be surprised by this, but the issue is of course not unique to WordPress. In my career of working with PHP, from security-critical cryptography systems to mission-critical embedded systems, dealing with technical debt has come with the job. Dealing with technical debt is not something to be ashamed of for a PHP developer; indeed, some may consider it courageous. Dealing with technical debt is no easy task, especially in the face of an ever more demanding user base, client, or project manager constantly demanding more functionality without being familiar with the technical debt the project carries.

I recently emailed the PHP Internals group as to whether they should consider deprecating the error suppression operator @. When any PHP function call is prepended by an @ symbol, errors raised by that function are suppressed. This can be brutal, especially where that function raises a fatal error that stops the execution of the script, making debugging a tough task: if the error is suppressed, the script may fail to execute without giving developers any reason as to why. While no one objected that there are better ways of handling errors (try/catch, proper validation) than abusing the error suppression operator, and that deprecation should be an eventual aim of PHP, some functions return needless warnings even though they already have a success/failure return value. This means that, due to technical debt in the PHP core itself, the operator cannot be deprecated until a lot of other prerequisite work is done. Until that inherent problem of unnecessary error reporting is addressed, it is down to developers to be educated in the proper methodologies for handling errors and not to constantly resort to using the @ symbol (a brief sketch of this follows below).

Fundamentally, technical debt slows down development of a project and often leads to broken code being deployed as developers try to work on a fragile project.
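To make the point about the @ operator concrete, here is a minimal sketch of my own (not from the book) contrasting error suppression with handling the failure explicitly; config.json is just a placeholder filename:

<?php
// Suppressing the warning hides the reason for any failure:
$contents = @file_get_contents('config.json');

// Checking and handling the failure explicitly keeps the cause visible:
if (!is_readable('config.json')) {
    throw new RuntimeException('config.json is missing or unreadable');
}
$contents = file_get_contents('config.json');

The second form costs a couple of extra lines, but when something goes wrong you are told exactly what and where, instead of silently receiving false.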
When starting a new project, never be afraid to discuss architecture, as architecture meetings are vital to developer collaboration. As one scrum master I've worked with said in the face of the criticism that "meetings are a great alternative to work": "meetings are work…how much work would you be doing without meetings?"

Coding style - the PSR standards
When it comes to coding style, I would like to introduce you to the PSR standards created by the PHP Framework Interop Group. Namely, the two standards that apply to coding style are PSR-1 (Basic Coding Standard) and PSR-2 (Coding Style Guide). In addition to these, there are PSR standards that cover additional areas; for example, as of today, the PSR-4 standard is the most up-to-date autoloading standard published by the group. You can find out more about the standards at http://www.php-fig.org/.

Using coding style to enforce consistency throughout a codebase is something I strongly believe in; it does make a difference to your code's readability throughout a project. It is especially important when you are starting a project (chances are you may be reading this book to find out how to do that right), as your coding style determines the style that the developers who follow you on the project will adopt. Using a global standard such as PSR-1 or PSR-2 means that developers can easily switch between projects without having to reconfigure their code style in their IDE. Good code style can also make formatting errors easier to spot. Needless to say, coding styles will develop as time progresses; to date, I elect to work with the PSR standards.

I am a strong believer in the phrase:

Always code as if the guy who ends up maintaining your code will be a violent psychopath who knows where you live.

It isn't known who wrote this phrase originally, but it's widely thought that it could have been John Woods or potentially Martin Golding. I would strongly recommend familiarizing yourself with these standards before proceeding in this book.

Revising object-oriented programming
Object-oriented programming is more than just classes and objects; it's a whole programming paradigm based around objects (data structures) that contain data fields and methods. It is essential to understand this: using classes to organize a bunch of unrelated methods together is not object orientation. Assuming you're aware of classes (and how to instantiate them), allow me to remind you of a few different bits and pieces.

Polymorphism
Polymorphism is a fairly long word for a fairly simple concept. Essentially, polymorphism means the same interface is used with different underlying code. So multiple classes could have a draw function, each accepting the same arguments, but at an underlying level the code is implemented differently. In this article, I would also like to talk about Subtype Polymorphism in particular (also known as Subtyping or Inclusion Polymorphism). Let's say we have animals as our supertype; our subtypes may well be cats, dogs, and sheep. In PHP, interfaces allow you to define a set of functionality that a class implementing them must contain; as of PHP 7, you can also use scalar type hints to define the return types we expect.
So, for example, suppose we defined the following interface:

interface Animal
{
    public function eat(string $food): bool;

    public function talk(bool $shout): string;
}

We could then implement this interface in our own class, as follows:

class Cat implements Animal
{
}

If we were to run this code without defining the methods, we would get an error message as follows:

Class Cat contains 2 abstract methods and must therefore be declared abstract or implement the remaining methods (Animal::eat, Animal::talk)

Essentially, we are required to implement the methods we defined in our interface, so now let's go ahead and create a class that implements these methods:

class Cat implements Animal
{
    public function eat(string $food): bool
    {
        if ($food === "tuna") {
            return true;
        } else {
            return false;
        }
    }

    public function talk(bool $shout): string
    {
        if ($shout === true) {
            return "MEOW!";
        } else {
            return "Meow.";
        }
    }
}

Now that we've implemented these methods, we can instantiate the class we are after and use the functions contained in it:

$felix = new Cat();
echo $felix->talk(false);

So where does polymorphism come into this? Suppose we had another class for a dog:

class Dog implements Animal
{
    public function eat(string $food): bool
    {
        if (($food === "dog food") || ($food === "meat")) {
            return true;
        } else {
            return false;
        }
    }

    public function talk(bool $shout): string
    {
        if ($shout === true) {
            return "WOOF!";
        } else {
            return "Woof woof.";
        }
    }
}

Now let's suppose we have multiple different types of animals in a pets array:

$pets = array(
    'felix' => new Cat(),
    'oscar' => new Dog(),
    'snowflake' => new Cat()
);

We can now go ahead and loop through all these pets individually in order to run the talk function. We don't care about the type of pet, because the talk method is implemented in every class we get by virtue of it implementing the Animal interface. So, if we wanted to have all our animals run the talk method, we could just use the following code:

foreach ($pets as $pet) {
    echo $pet->talk(false);
}

No need for unnecessary switch/case blocks wrapped around our classes; we just use software design to make things easier for us in the long term.

Abstract classes work in a similar way, except that abstract classes can contain functionality where interfaces cannot. It is important to note that any class that defines one or more abstract methods must itself be declared abstract. You cannot have a normal class defining abstract methods, but you can have normal methods in abstract classes. Let's start off by refactoring our interface to be an abstract class:

abstract class Animal
{
    abstract public function eat(string $food): bool;

    abstract public function talk(bool $shout): string;

    public function walk(int $speed): bool
    {
        if ($speed > 0) {
            return true;
        } else {
            return false;
        }
    }
}

You might have noticed that I have also added a walk method as an ordinary, non-abstract method; this is a standard method that can be used or extended by any class that inherits the parent abstract class, as it already has an implementation. Note that it is impossible to instantiate an abstract class (much as it's not possible to instantiate an interface). Instead, we must extend it. So, in our Cat class, let's substitute:

class Cat implements Animal

with the following code:

class Cat extends Animal

That's all we need to refactor in order to get classes to extend the Animal abstract class.
We must implement the abstract functions in the classes as we outlined for the interfaces, and we can use the ordinary functions without needing to implement them:

$whiskers = new Cat();
$whiskers->walk(1);

As of PHP 5.4, it has also become possible to instantiate a class and access a property of it in one statement. PHP.net advertised it as:

Class member access on instantiation has been added, e.g. (new Foo)->bar().

You can also do it with individual properties, for example, (new Cat)->legs. In our example, we can use it as follows:

(new IcyAprilChapterOneCat())->walk(1);

Just to recap a few other points about how PHP implements OOP: the final keyword before a class declaration, or indeed a function declaration, means that you cannot override such classes or functions after they've been defined. So, if we were to try extending a class we have marked as final:

final class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
}

This results in the following output:

Fatal error: Class Cat may not inherit from final class (Animal)

Similarly, if we were to do the same except at a function level:

class Animal
{
    final public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    public function walk()
    {
        return "walking with tail wagging...";
    }
}

This results in the following output:

Fatal error: Cannot override final method Animal::walk()

Traits (multiple inheritance)
Traits were introduced into PHP as a mechanism for introducing Horizontal Reuse. PHP conventionally acts as a single inheritance language, namely because you can't inherit from more than one class. Traditional multiple inheritance is a controversial process that is often looked down upon by software engineers.

Let me give you an example of using Traits first hand; let's define an Animal class which we want to extend in another class:

class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    public function walk()
    {
        return "walking with tail wagging...";
    }
}

So now let's suppose we have some functions to name our objects, but we don't want them to apply to all the classes that extend the Animal class; we want them to apply to certain classes irrespective of whether they inherit the properties of the Animal class or not. So we've defined our functions like so:

function setFirstName(string $name): bool
{
    $this->firstName = $name;
    return true;
}

function setLastName(string $name): bool
{
    $this->lastName = $name;
    return true;
}

The problem now is that there is no place we can put them without using Horizontal Reuse, apart from copying and pasting different bits of code or resorting to conditional inheritance. This is where Traits come to the rescue; let's start off by wrapping these methods in a Trait called Name:

trait Name
{
    function setFirstName(string $name): bool
    {
        $this->firstName = $name;
        return true;
    }

    function setLastName(string $name): bool
    {
        $this->lastName = $name;
        return true;
    }
}

So now that we've defined our Trait, we can just tell PHP to use it in our Cat class:

class Cat extends Animal
{
    use Name;

    public function walk()
    {
        return "walking with tail wagging...";
    }
}

Notice the use Name statement? That's where the magic happens.
Now you can call the functions in that Trait without any problems:

$whiskers = new Cat();
$whiskers->setFirstName('Paul');
echo $whiskers->firstName;

All put together, the new code block looks as follows:

trait Name
{
    function setFirstName(string $name): bool
    {
        $this->firstName = $name;
        return true;
    }

    function setLastName(string $name): bool
    {
        $this->lastName = $name;
        return true;
    }
}

class Animal
{
    public function walk()
    {
        return "walking...";
    }
}

class Cat extends Animal
{
    use Name;

    public function walk()
    {
        return "walking with tail wagging...";
    }
}

$whiskers = new Cat();
$whiskers->setFirstName('Paul');
echo $whiskers->firstName;

Scalar type hints
Let me take this opportunity to introduce you to a PHP 7 concept known as scalar type hinting; it allows you to define the types of function arguments and return values (yes, I know this isn't strictly under the scope of OOP; deal with it). Let's define a function, as follows:

function addNumbers(int $a, int $b): int
{
    return $a + $b;
}

Let's take a look at this function. Firstly, you will notice that before each of the arguments we define the type of variable we want to receive, in this case, int or integer. Next up, you'll notice there's a bit of code after the argument list, : int, which defines our return type, so our function can only return an integer. If you don't provide the right type of variable as a function argument, or don't return the right type of variable from the function, you will get a TypeError exception. PHP will also throw a TypeError exception in the event that strict mode is enabled and you provide the incorrect number of arguments.

It is also possible in PHP to define strict_types; let me explain why you might want to do this. Without strict_types, PHP will attempt to automatically convert a variable to the defined type in very limited circumstances. For example, if you pass a string containing solely numbers, it will be converted to an integer; a string that's non-numeric, however, will result in a TypeError exception. Once you enable strict_types, this all changes: you can no longer have this automatic casting behavior. Taking our previous example, without strict_types, you could do the following:

echo addNumbers(5, "5.0");

Trying it again after enabling strict_types, you will find that PHP throws a TypeError exception. This configuration only applies on an individual file basis; putting it before you include other files will not result in this configuration being inherited by those files. There are multiple reasons why PHP chose to go down this route; they are listed very clearly in Version 0.5.3 of the RFC that implemented scalar type hints, called PHP RFC: Scalar Type Declarations. You can read about it by going to http://www.wiki.php.net (the wiki, not the main PHP website) and searching for scalar_type_hints_v5.

In order to enable it, make sure you put this as the very first statement in your PHP script:

declare(strict_types=1);

This will not work unless you define strict_types as the very first statement in a PHP script; no other usages of this definition are permitted. Indeed, if you try to define it later on, PHP will throw a fatal error.

Of course, in the interests of the rage-induced PHP core fanatic reading this book in its coffee-stained form, I should mention that there are other valid types that can be used in type hinting. For example, PHP 5.1.0 introduced this with arrays, and PHP 5.0.0 introduced the ability for a developer to do this with their own classes.
Let me give you a quick example of how this would work in practice. Suppose we had an Address class:

class Address
{
    public $firstLine;
    public $postcode;
    public $country;

    public function __construct(string $firstLine, string $postcode, string $country)
    {
        $this->firstLine = $firstLine;
        $this->postcode = $postcode;
        $this->country = $country;
    }
}

We can then type hint the Address class that we inject into a Customer class:

class Customer
{
    public $name;
    public $address;

    public function __construct($name, Address $address)
    {
        $this->name = $name;
        $this->address = $address;
    }
}

And just to show how it all comes together:

$address = new Address('10 Downing Street', 'SW1A2AA', 'UK');
$customer = new Customer('Davey Cameron', $address);
var_dump($customer);

Limiting debug access to private/protected properties
If you define a class which contains private or protected variables, you will notice an odd behavior if you were to var_dump the object of that class. You will notice that when you wrap the object in a var_dump it reveals all variables, be they protected, private, or public. PHP treats var_dump as an internal debugging function, meaning all data becomes visible.

Fortunately, there is a workaround for this. PHP 5.6 introduced the __debugInfo magic method. Functions in classes preceded by a double underscore represent magic methods and have special functionality associated with them. Every time you try to var_dump an object that has the __debugInfo magic method set, the var_dump will be overridden with the result of that function call instead. Let me show you how this works in practice. Let's start by defining a class:

class Bear
{
    private $hasPaws = true;
}

Let's instantiate this class:

$richard = new Bear();

Now, if we were to try and access the private variable that is hasPaws, we would get a fatal error; so this call:

echo $richard->hasPaws;

would result in the following fatal error being thrown:

Fatal error: Cannot access private property Bear::$hasPaws

That is the expected output; we don't want a private property visible outside its object. That being said, if we wrap the object with a var_dump as follows:

var_dump($richard);

we would then get the following output:

object(Bear)#1 (1) {
  ["hasPaws":"Bear":private]=>
  bool(true)
}

As you can see, our private property is marked as private, but nevertheless it is visible. So how would we go about preventing this? Let's redefine our class as follows:

class Bear
{
    private $hasPaws = true;

    public function __debugInfo()
    {
        return call_user_func('get_object_vars', $this);
    }
}

Now, after we instantiate our class and var_dump the resulting object, we get the following output:

object(Bear)#1 (0) {
}

The script all put together looks like this now; you will notice I've added an extra public property called growls, which I have set to true:

<?php
class Bear
{
    private $hasPaws = true;
    public $growls = true;

    public function __debugInfo()
    {
        return call_user_func('get_object_vars', $this);
    }
}

$richard = new Bear();
var_dump($richard);

If we were to run this script (with both a public and a private property to play with), we would get the following output:

object(Bear)#1 (1) {
  ["growls"]=>
  bool(true)
}

As you can see, only the public property is visible. So what is the moral of the story from this little experiment? Firstly, that var_dump exposes private and protected properties inside objects, and secondly, that this behavior can be overridden.

Summary
In this article, we revised some PHP principles, including OOP principles.
We also revised some PHP syntax basics.

Resources for Article:

Further resources on this subject:
Running Simpletest and PHPUnit [article]
Data Tables and DataTables Plugin in jQuery 1.3 with PHP [article]
Understanding PHP basics [article]