How-To Tutorials


01 Nov 2016

5 min read

Configuring Endpoint Protection in Configuration Manager


Packt

21 Oct 2016

22 min read

Hosting on Google App Engine


Packt

21 Oct 2016

15 min read

The Data Science Venn Diagram


It is a common misconception that only geniuses or people with a PhD can understand the math and programming behind data science. This is absolutely false. In this article by Sinan Ozdemir, author of the book Principles of Data Science, we will discuss how data science begins with three basic areas:

- Math/statistics: This is the use of equations and formulas to perform analysis
- Computer programming: This is the ability to use code to create outcomes on the computer
- Domain knowledge: This refers to understanding the problem domain (medicine, finance, social science, and so on)

The following Venn diagram provides a visual representation of how the three areas of data science intersect:

[Figure: The Venn diagram of data science]

Those with hacking skills can conceptualize and program complicated algorithms using computer languages. Having a math and statistics knowledge base allows you to theorize and evaluate algorithms and tweak the existing procedures to fit specific situations. Having substantive (domain) expertise allows you to apply concepts and results in a meaningful and effective way.

Having only two of these three qualities can still make you capable, but it leaves a gap. Suppose you are very skilled in coding and have formal training in day trading. You might create an automated system to trade in your place but lack the math skills to evaluate your algorithms and, therefore, end up losing money in the long run. Only when you have skills in coding, math, and domain knowledge can you truly perform data science.

The one that was probably a surprise for you was domain knowledge. It is really just knowledge of the area you are working in. If a financial analyst started analyzing data about heart attacks, they might need the help of a cardiologist to make sense of a lot of the numbers.

Data science is the intersection of the three key areas mentioned earlier.
In order to gain knowledge from data, we must be able to use computer programming to access the data, understand the mathematics behind the models we derive, and above all, understand our analyses' place in the domain we are in. This includes the presentation of data. If we are creating a model to predict heart attacks in patients, is it better to create a PDF of information or an app where you can type in numbers and get a quick prediction? All these decisions must be made by the data scientist.

Also, note that the intersection of math and coding is machine learning, but without the explicit ability to generalize models or results to a domain, machine learning algorithms remain just algorithms sitting on your computer. You might have the best algorithm to predict cancer. You might be able to predict cancer with over 99% accuracy based on past cancer patient data, but if you don't understand how to apply this model in a practical sense, such that doctors and nurses can easily use it, your model might be useless. Domain knowledge comes both from practicing data science and from reading examples of other people's analyses.

The math

Most people stop listening once someone says the word "math". They'll nod along in an attempt to hide their utter disdain for the topic. We will use various subdomains of mathematics to create what are called models. A data model refers to an organized and formal relationship between elements of data, usually meant to simulate a real-world phenomenon. Essentially, we will use math in order to formalize relationships between variables.

As a former pure mathematician and current math teacher, I know how difficult this can be. I will do my best to explain everything as clearly as I can. Of the three areas of data science, math is what allows us to move from domain to domain. Understanding theory allows us to take a model that we built for the fashion industry and apply it to a financial setting.
I introduce every mathematical concept with care, examples, and purpose. The math in this article is essential for data scientists.

Example – Spawner-Recruit Models

In biology, we use, among many others, a model known as the Spawner-Recruit model to judge the biological health of a species. It is a basic relationship between the number of healthy parental units of a species and the number of new units in the group of animals. In a public dataset of the number of salmon spawners and recruits, the following graph was formed to visualize the relationship between the two.

[Figure: Spawner-Recruit model visualized]

We can see that there definitely is some sort of positive relationship (as one goes up, so does the other). But how can we formalize this relationship? For example, if we knew the number of spawners in a population, could we predict the number of recruits that group would obtain, and vice versa? Essentially, models allow us to plug in one variable to get the other. Consider the following example:

In this example, let's say we knew that a group of salmon had 1.15 thousand spawners. Then, we would have the following:

This result can be very useful for estimating how the health of a population is changing. If we can create these models, we can visually observe how the relationship between the two variables can change.

There are many types of data models, including probabilistic and statistical models. Both of these are subsets of a larger paradigm, called machine learning. The essential idea behind these three topics is that we use data in order to come up with the "best" model possible. We no longer rely on human instincts; rather, we rely on data.

The purpose of this example is to show how we can define relationships between data elements using mathematical equations. The fact that I used salmon health data was irrelevant! I chose it mainly because I would like you (the reader) to be exposed to as many domains as possible.
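To make the idea of plugging in one variable to get the other concrete, here is a minimal sketch that fits a straight line by ordinary least squares. It is not the model from the article: the straight-line form is an assumption made purely for illustration, the numbers are made up rather than taken from the salmon dataset, and the names fit_line and predict are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums (unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(x, slope, intercept):
    """Plug one variable into the fitted line to estimate the other."""
    return slope * x + intercept


# Made-up values in thousands, for illustration only (NOT the salmon data)
spawners = [0.5, 1.0, 1.5, 2.0, 2.5]
recruits = [0.9, 1.6, 2.1, 2.9, 3.4]

slope, intercept = fit_line(spawners, recruits)
print("recruits ~ {:.2f} * spawners + {:.2f}".format(slope, intercept))
print("estimate for 1.15 thousand spawners: {:.2f}".format(
    predict(1.15, slope, intercept)))
```

Once the two coefficients are known, estimating recruits for any spawner count is a single multiplication and addition, which is what makes a formal model more useful than a scatter plot alone.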
Math and coding are vehicles that allow data scientists to step back and apply their skills virtually anywhere.

Computer programming

Let's be honest. You probably think computer science is way cooler than math. That's OK, I don't blame you. The news isn't filled with math news like it is with news on the technological front. You don't turn on the TV to see a new theory on primes; rather, you will see investigative reports on how the latest smartphone can take better photos of cats or something. Computer languages are how we communicate with the machine and tell it to do our bidding. Just as a book can be written in many languages, data science can be done in many programming languages. Python, Julia, and R are some of the many languages available to us. This article will focus exclusively on using Python.

Why Python?

We will use Python for a variety of reasons:

- Python is an extremely simple language to read and write even if you've never coded before, which will make future examples easy to ingest and read later.
- It is one of the most common languages in production and in the academic setting (one of the fastest growing, as a matter of fact).
- The online community of the language is vast and friendly. This means that a quick Google search should yield multiple results of people who have faced and solved similar (if not exact) situations.
- Python has prebuilt data science modules that both the novice and the veteran data scientist can utilize.

The last is probably the biggest reason we will focus on Python. These prebuilt modules are not only powerful but also easy to pick up. Some of these modules are as follows:

- pandas
- scikit-learn
- seaborn
- numpy/scipy
- requests (to mine data from the web)
- BeautifulSoup (for Web HTML parsing)

Python practices

Before we move on, it is important to formalize many of the requisite coding skills in Python. In Python, we have variables that are placeholders for objects.
We will focus on only a few types of basic objects at first:

- int (an integer). Examples: 3, 6, 99, -34, 34, 11111111
- float (a decimal). Examples: 3.14159, 2.71, -0.34567
- boolean (either true or false):
  - The statement, Sunday is a weekend, is true
  - The statement, Friday is a weekend, is false
  - The statement, pi is exactly the ratio of a circle's circumference to its diameter, is true (crazy, right?)
- string (text or words made up of characters):
  - I love hamburgers (by the way who doesn't?)
  - Matt is awesome
  - A Tweet is a string
- list (a collection of objects). Example: [1, 5.4, True, "apple"]

We will also have to understand some basic logical operators. For these operators, keep the boolean type in mind. Every operator will evaluate to either true or false.

- == evaluates to true if both sides are equal, otherwise it evaluates to false
  - 3 + 4 == 7 (will evaluate to true)
  - 3 - 2 == 7 (will evaluate to false)
- < (less than)
  - 3 < 5 (true)
  - 5 < 3 (false)
- <= (less than or equal to)
  - 3 <= 3 (true)
  - 5 <= 3 (false)
- > (greater than)
  - 3 > 5 (false)
  - 5 > 3 (true)
- >= (greater than or equal to)
  - 3 >= 3 (true)
  - 3 >= 5 (false)

When coding in Python, I will use a pound sign (#) to create a comment, which will not be processed as code but is merely there to communicate with the reader. Anything to the right of a # is a comment on the code being executed.

Example of basic Python

In Python, we use spaces/tabs to denote operations that belong to other lines of code. Note the use of the if statement. It means exactly what you think it means. When the statement after the if is true, the tabbed part under it will be executed, as shown in the following code:

    x = 5.8
    y = 9.5

    x + y == 15.3  # This is True!
    x - y == 15.3  # This is False!

    if x + y == 15.3:  # If the statement is true:
        print("True!")  # print something!

The print("True!") line belongs to the if x + y == 15.3: line preceding it because it is tabbed right under it.
This means that the print statement will be executed if and only if x + y equals 15.3.

Note that the following list variable, my_list, can hold multiple types of objects. This one has an int, a float, a boolean, and a string (in that order):

    my_list = [1, 5.7, True, "apples"]

    len(my_list) == 4  # 4 objects in the list

    my_list[0] == 1    # the first object
    my_list[1] == 5.7  # the second object

In the preceding code, I used the len command to get the length of the list (which was four). Note the zero-indexing of Python. Most computer languages start counting at zero instead of one. So if I want the first element, I call the index zero, and if I want the 95th element, I call the index 94.

Example – parsing a single Tweet

Here is some more Python code. In this example, I will be parsing some tweets about stock prices:

    tweet = "RT @robdv: $TWTR now top holding for Andor, unseating $AAPL"

    words_in_tweet = tweet.split(' ')  # list of words in tweet

    for word in words_in_tweet:  # for each word in list
        if "$" in word:  # if word has a "cashtag"
            print("THIS TWEET IS ABOUT " + word)  # alert the user

I will point out a few things about this code snippet, line by line, as follows:

- We set a variable to hold some text (known as a string in Python). In this example, the tweet in question is "RT @robdv: $TWTR now top holding for Andor, unseating $AAPL".
- The words_in_tweet variable "tokenizes" the tweet (separates it by word). If you were to print this variable, you would see the following: ['RT', '@robdv:', '$TWTR', 'now', 'top', 'holding', 'for', 'Andor,', 'unseating', '$AAPL']
- We iterate through this list of words. This is called a for loop. It just means that we go through a list one by one.
- Here, we have another if statement. For each word in this tweet, we check whether the word contains the $ character (this is how people reference stock tickers on Twitter).
- If the preceding if statement is true (that is, if the word contains a cashtag), print it and show it to the user.
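The walkthrough above can also be packaged as a small reusable function. The following is a minimal Python 3 sketch rather than the author's code: the function name find_cashtags is hypothetical, and stripping trailing punctuation from a ticker is an added assumption that the original snippet does not make.

```python
def find_cashtags(tweet):
    """Return the words in `tweet` that contain a "$" (cashtags)."""
    cashtags = []
    for word in tweet.split(" "):      # tokenize on single spaces
        if "$" in word:                # the word has a cashtag
            # Assumption: drop punctuation trailing the ticker, e.g. "$AAPL,"
            cashtags.append(word.rstrip(".,;:!?"))
    return cashtags


tweet = "RT @robdv: $TWTR now top holding for Andor, unseating $AAPL"

for tag in find_cashtags(tweet):
    print("THIS TWEET IS ABOUT " + tag)
```

Returning a list instead of printing inside the loop keeps the parsing step separate from the reporting step, so the same function could be applied to many tweets.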
The output of this code will be as follows:

    THIS TWEET IS ABOUT $TWTR
    THIS TWEET IS ABOUT $AAPL

We get this output as these are the only words in the tweet that use the cashtag. Whenever I use Python in this article, I will ensure that I am as explicit as possible about what I am doing in each line of code.

Domain knowledge

As I mentioned earlier, this category focuses mainly on having knowledge about the particular topic you are working on. For example, if you are a financial analyst working on stock market data, you have a lot of domain knowledge. If you are a journalist looking at worldwide adoption rates, you might benefit from consulting an expert in the field.

Does that mean that if you're not a doctor, you can't work with medical data? Of course not! Great data scientists can apply their skills to any area, even if they aren't fluent in it. Data scientists can adapt to the field and contribute meaningfully when their analysis is complete.

A big part of domain knowledge is presentation. Depending on your audience, it can greatly matter how you present your findings. Your results are only as good as your vehicle of communication. You can predict the movement of the market with 99.99% accuracy, but if your program is impossible to execute, your results will go unused. Likewise, if your vehicle is inappropriate for the field, your results will go equally unused.

Some more terminology

This is a good time to define some more vocabulary. By this point, you're probably excitedly looking up a lot of data science material and seeing words and phrases I haven't used yet. Here are some common terms you are likely to come across:

- Machine learning: This refers to giving computers the ability to learn from data without explicit "rules" being given by a programmer. Machine learning combines the power of computers with intelligent learning algorithms in order to automate the discovery of relationships in data and the creation of powerful data models.
Speaking of data models, we will concern ourselves with the following two basic types of data models:

- Probabilistic model: This refers to using probability to find a relationship between elements that includes a degree of randomness
- Statistical model: This refers to taking advantage of statistical theorems to formalize relationships between data elements in a (usually) simple mathematical formula

While both the statistical and probabilistic models can be run on computers and might be considered machine learning in that regard, we will keep these definitions separate, as machine learning algorithms generally attempt to learn relationships in different ways.

- Exploratory data analysis: This refers to preparing data in order to standardize results and gain quick insights. Exploratory data analysis (EDA) is concerned with data visualization and preparation. This is where we turn unorganized data into organized data and also clean up missing/incorrect data points. During EDA, we will create many types of plots and use these plots to identify key features and relationships to exploit in our data models.
- Data mining: This is the process of finding relationships between elements of data. Data mining is the part of data science where we try to find relationships between variables (think of the Spawner-Recruit model).

I tried pretty hard not to use the term big data up until now. That is because I think this term is misused, a lot. While the definition of this term varies from person to person, big data is data that is too large to be processed by a single machine (if your laptop crashed, it might be suffering from a case of big data).

[Figure: The state of data science so far (this diagram is incomplete and is meant for visualization purposes only)]

Summary

More and more people are jumping headfirst into the field of data science, most with no prior experience in math or CS, which on the surface is great.
The average data scientist has access to millions of dating profiles, tweets, online reviews, and much more in order to jumpstart their education. However, if you jump into data science without the proper exposure to theory or coding practices and without respect for the domain you are working in, you risk oversimplifying the very phenomenon you are trying to model.


Packt

21 Oct 2016

9 min read

Jupyter and Python Scripting


In this article by Dan Toomey, author of the book Learning Jupyter, we will see data access in Jupyter with Python and the effect of pandas on Jupyter. We will also see Python graphics and, lastly, Python random numbers.

Python data access in Jupyter

I started a view for pandas using Python Data Access as the name. We will read in a large dataset and compute some standard statistics on the data. We are interested in seeing how we use pandas in Jupyter, how well the script performs, and what information is stored in the metadata (especially if it is a larger dataset).

Our script accesses the iris dataset built into one of the Python packages. All we are looking to do is read in a slightly large number of items and calculate some basic statistics on the dataset. We are really interested in seeing how much of the data is cached in the IPYNB file.

The Python code is:

    # import the datasets package
    from sklearn import datasets

    # pull in the iris data
    iris_dataset = datasets.load_iris()
    # grab the first two columns of data
    X = iris_dataset.data[:, :2]

    # calculate some basic statistics
    x_count = len(X.flat)
    x_min = X[:, 0].min() - .5
    x_max = X[:, 0].max() + .5
    x_mean = X[:, 0].mean()

    # display our results
    x_count, x_min, x_max, x_mean

I broke these steps into a couple of cells in Jupyter. Now, run the cells (using Cell | Run All). The only difference in the display is the last Out line, where our values are shown. It seemed to take longer to load the library (the first time I ran the script) than to read the data and calculate the statistics.

If we look in the IPYNB file for this notebook, we see that none of the data is cached in it.
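One way to check what a notebook file stores is to load it as JSON and list the outputs attached to each code cell. The sketch below is a minimal illustration, assuming the nbformat 4 layout (a top-level "cells" list in which each code cell carries an "outputs" list); the helper name summarize_outputs is hypothetical, and the in-memory sample stands in for a real notebook file.

```python
import json


def summarize_outputs(notebook):
    """List (cell index, output type, MIME types) for every stored output."""
    summary = []
    for index, cell in enumerate(notebook.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue  # markdown and raw cells store no outputs
        for output in cell.get("outputs", []):
            # "data" maps MIME types (text/plain, image/png, ...) to content;
            # stream outputs have no "data" key, hence the default.
            mime_types = sorted(output.get("data", {}))
            summary.append((index, output.get("output_type"), mime_types))
    return summary


# A tiny sample standing in for the text of a real notebook file.
sample_text = """
{
  "cells": [
    {"cell_type": "markdown", "source": ["# notes"]},
    {"cell_type": "code",
     "source": ["x_count, x_min, x_max, x_mean"],
     "outputs": [
       {"output_type": "execute_result",
        "data": {"text/plain": [
          "(300, 3.7999999999999998, 8.4000000000000004, 5.8433333333333337)"
        ]}}
     ]}
  ]
}
"""

notebook = json.loads(sample_text)
for row in summarize_outputs(notebook):
    print(row)
```

For a notebook on disk, the same function could be applied to the result of json.load() on the opened file. Only what appears under "outputs" is persisted, which is why the dataset itself does not show up.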
The notebook file simply holds code references to the library, our code, and the output from when we last ran the script:

    {
      "cell_type": "code",
      "execution_count": 4,
      "metadata": {
        "collapsed": false
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "(300, 3.7999999999999998, 8.4000000000000004, 5.8433333333333337)"
            ]
          },
          "execution_count": 4,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],
      "source": [
        "# calculate some basic statistics\n",
        "x_count = len(X.flat)\n",
        "x_min = X[:, 0].min() - .5\n",
        "x_max = X[:, 0].max() + .5\n",
        "x_mean = X[:, 0].mean()\n",
        "\n",
        "# display our results\n",
        "x_count, x_min, x_max, x_mean"
      ]
    }

Python pandas in Jupyter

One of the most widely used Python libraries is pandas, a freely available third-party library for data analysis. In this example, we will develop a Python script that uses pandas to see if there is any effect to using it in Jupyter.

I am using the Titanic dataset from http://www.kaggle.com/c/titanic-gettingStarted/download/train.csv. I am sure the same data is available from a variety of sources.

Here is our Python script that we want to run in Jupyter:

    from pandas import *
    training_set = read_csv('train.csv')
    training_set.head()
    male = training_set[training_set.sex == 'male']
    female = training_set[training_set.sex == 'female']
    womens_survival_rate = float(sum(female.survived))/len(female)
    mens_survival_rate = float(sum(male.survived))/len(male)

As a result, we calculate the survival rates of the passengers based on sex.

We create a new notebook, enter the script into appropriate cells, add displays of calculated data at each point, and produce our results.

When I ran this script, I had two problems:

- On Windows, it is common to use a backslash ("\") to separate parts of a filename. However, Python string literals treat the backslash as a special (escape) character.
  So, I had to change over to forward slashes ("/") in my CSV file path. I originally had a full path to the CSV in the above code example.
- The dataset column names are taken directly from the file and are case sensitive. In this case, I was originally using the sex field in my script, but in the CSV file the column is named Sex. Similarly, I had to change survived to Survived.

In the final script, I have used the head() function to display the first few lines of the dataset. It is interesting to see the amount of detail that is available for all of the passengers.

If you scroll down to the results, you see that 74% of the women survived versus just 19% of the men. I would like to think chivalry is not dead! Note that the two rates need not total 100%, since each is computed within its own group. Also, like every other dataset I have seen, this one has missing and/or inaccurate data.

Python graphics in Jupyter

How do Python graphics work in Jupyter? I started another view for this named Python Graphics so as to distinguish the work.

If we were to build a sample dataset of baby names and the number of births in a year of that name, we could then plot the data. The Python coding is simple:

    import pandas
    import matplotlib

    %matplotlib inline

    baby_name = ['Alice','Charles','Diane','Edward']
    number_births = [96, 155, 66, 272]
    dataset = list(zip(baby_name,number_births))
    df = pandas.DataFrame(data = dataset, columns=['Name', 'Number'])
    df['Number'].plot()

The steps of the script are as follows:

- We import the graphics library (and data library) that we need
- Define our data
- Convert the data into a format that allows for easy graphical display
- Plot the data

We would expect a resultant graph of the number of births by baby name.
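The conversion in the third step can be tried on its own, without pandas. The sketch below only illustrates what list(zip(...)) produces; the variable names mirror the script, and the sorting at the end is an added illustration rather than part of the original.

```python
baby_name = ['Alice', 'Charles', 'Diane', 'Edward']
number_births = [96, 155, 66, 272]

# zip pairs the i-th name with the i-th count; list() materializes the pairs
dataset = list(zip(baby_name, number_births))
print(dataset)

# Added illustration: order the rows by number of births, largest first
by_births = sorted(dataset, key=lambda row: row[1], reverse=True)
print(by_births[0])
```

Each tuple becomes one row of the DataFrame, with the columns argument supplying the Name and Number labels.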
Taking the above script and placing it into cells of our Jupyter notebook, I have broken the script into different cells for easier readability. Having different cells also allows you to develop the script easily step by step, where you can display the values computed so far to validate your results. I have done this in most of the cells by displaying the dataset and DataFrame at the bottom of those cells.

When we run this script (Cell | Run All), we see the results at each step displayed as the script progresses, and finally we see our plot of the births.

I was curious what metadata was stored for this script. Looking into the IPYNB file, you can see the expected values for the code cells. The tabular data display of the DataFrame is stored as HTML, which is convenient:

    {
      "cell_type": "code",
      "execution_count": 43,
      "metadata": {
        "collapsed": false
      },
      "outputs": [
        {
          "data": {
            "text/html": [
              "<div>\n",
              "<table border=\"1\" class=\"dataframe\">\n",
              "<thead>\n",
              "<tr style=\"text-align: right;\">\n",
              "<th></th>\n",
              "<th>Name</th>\n",
              "<th>Number</th>\n",
              "</tr>\n",
              "</thead>\n",
              "<tbody>\n",
              "<tr>\n",
              "<th>0</th>\n",
              "<td>Alice</td>\n",
              "<td>96</td>\n",
              "</tr>\n",
              "<tr>\n",
              "<th>1</th>\n",
              "<td>Charles</td>\n",
              "<td>155</td>\n",
              "</tr>\n",
              "<tr>\n",
              "<th>2</th>\n",
              "<td>Diane</td>\n",
              "<td>66</td>\n",
              "</tr>\n",
              "<tr>\n",
              "<th>3</th>\n",
              "<td>Edward</td>\n",
              "<td>272</td>\n",
              "</tr>\n",
              "</tbody>\n",
              "</table>\n",
              "</div>"
            ],
            "text/plain": [
              " Name Number\n",
              "0 Alice 96\n",
              "1 Charles 155\n",
              "2 Diane 66\n",
              "3 Edward 272"
            ]
          },
          "execution_count": 43,
          "metadata": {},
          "output_type": "execute_result"
        }
      ],

The graphic output cell is stored like this:

    {
      "cell_type": "code",
      "execution_count": 27,
      "metadata": {
        "collapsed": false
      },
      "outputs": [
        {
          "data": {
            "text/plain": [
              "<matplotlib.axes._subplots.AxesSubplot at 0x47cf8f0>"
            ]
          },
          "execution_count": 27,
          "metadata": {},
          "output_type": "execute_result"
        },
        {
          "data": {
            "image/png": "<a few hundred lines of hexcodes> …/wc/B0RRYEH0EQAAAABJRU5ErkJggg==\n",
            "text/plain": [
              "<matplotlib.figure.Figure at 0x47d8e30>"
            ]
          },
          "metadata": {},
          "output_type": "display_data"
        }
      ],
      "source": [
        "# plot the data\n",
        "df['Number'].plot()\n"
      ]
    }
    ],

Here the image/png entry contains a large base64-encoded string representation of the graphical image displayed on screen (I abbreviated the display in the coding shown). So, the actual generated image is stored in the notebook file for the page.

Python random numbers in Jupyter

For many analyses we are interested in calculating repeatable results. However, much of the analysis relies on random numbers. In Python, you can set the seed for the random number generator to achieve repeatable results with the random.seed() function.

In this example, we simulate rolling a pair of dice and look at the outcome. We would expect the average total of the two dice to be 7, the midpoint of the possible totals (2 through 12).

The script we are using is this:

    import pylab
    import random

    random.seed(113)
    samples = 1000
    dice = []
    for i in range(samples):
        total = random.randint(1,6) + random.randint(1,6)
        dice.append(total)
    pylab.hist(dice, bins= pylab.arange(1.5,12.6,1.0))
    pylab.show()

Once we have the script in Jupyter and execute it, we get the histogram. I had added some more statistics, and I would not have counted on such a high standard deviation. Increasing the number of samples makes the estimate of the mean more precise, but the spread itself settles near the theoretical value for the total of two fair dice, which is about 2.4.

The resulting graph was opened in a new window, much as it would be if you ran this script in another Python development environment. The toolbar at the top of the graphic is extensive, allowing you to manipulate the graphic in many ways.

Summary

In this article, we walked through simple data access in Jupyter through Python. Then we saw an example of using pandas. We looked at a graphics example. Finally, we looked at an example using random numbers in a Python script.


Packt

21 Oct 2016

2 min read

Prepare for our 2017 Awards with Mapt


At Packt, we're committed to helping developers learn the skills they need to remain relevant in their field. But what exactly does relevant mean? To us, relevance is about the impact you have. And we believe that software should always have an impact, whether it's for a business or for customers. Whoever it is for, it's ultimately about making a difference.

We want to reward developers who make an impact. Whether you're a web developer who's creating awesome applications and websites that are engaging users every single day, or a data analyst who has used machine learning to uncover revealing insights about healthcare or the environment, we're going to want to hear from you.

We don't want to give too much away right now, but we're confident that you're going to be interested in our award. So, to prepare yourself for our awards, get started on Mapt and find your route through some of the most important skills in software today.

What are you waiting for? We're sponsoring seats on Mapt for limited prices this week. That means you'll be able to get a subscription for a special discounted price, but be quick: each discount is time limited!


Ted Yu

20 Oct 2016

4 min read

Resolving Deadlock in HBase


Packt

19 Oct 2016

20 min read

An Introduction to Moodle 3 and MoodleCloud


Packt

18 Oct 2016

16 min read

Heart Diseases Prediction using Spark 2.0.0


Packt

18 Oct 2016

14 min read

Managing Application Configuration


In this article by Sean McCord, author of the book CoreOS Cookbook, we will explore some of the options available to help bridge the configuration divide with the following topics:

- Configuring by URL
- Translating etcd to configuration files
- Building EnvironmentFiles
- Building an active configuration manager
- Using fleet globals

Configuring by URL

One of the most direct ways to obtain application configurations is by URL. You can generate a configuration and store it as a file somewhere, or construct a configuration from a web request, returning the formatted file. In this section, we will construct a dynamic redis configuration by web request and then run redis using it.

Getting ready

First, we need a configuration server. This can be S3, an object store, etcd, a NodeJS application, a Rails web server, or just about anything. The details don't matter, as long as it speaks HTTP. We will construct a simple one here using Go, just in case you don't have one ready. Make sure your GOPATH is set and create a new directory named configserver. Then, create a new file in that directory called main.go with the following contents:

    package main

    import (
        "html/template"
        "log"
        "net/http"
    )

    func init() {
        redisTmpl = template.Must(template.New("rcfg").Parse(redisString))
    }

    func main() {
        http.HandleFunc("/config/redis", redisConfig)
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

    func redisConfig(w http.ResponseWriter, req *http.Request) {
        // TODO: pull configuration from database
        redisTmpl.Execute(w, redisConfigOpts{
            Save:       true,
            MasterIP:   "192.168.25.100",
            MasterPort: "6379",
        })
    }

    type redisConfigOpts struct {
        Save       bool   // Should redis save db to file?
        MasterIP   string // IP address of the redis master
        MasterPort string // Port of the redis master
    }

    var redisTmpl *template.Template

    const redisString = `
    {{if .Save}}
    save 900 1
    save 300 10
    save 60 10000
    {{end}}
    slaveof {{.MasterIP}} {{.MasterPort}}
    `

For our example, we simply statically configure the values, but it is easy to see how we could query etcd or another database to fill in the appropriate values on demand. Now, just build and run the config server with go, and we are ready to implement our URL-based configuration.

How to do it...

By design, CoreOS is a very stripped down OS. However, one of the tools it does come with is curl, which we can use to download our configuration. All we have to do is add it to our systemd/fleet unit file. For redis-slave.service, input the following:

    [Unit]
    Description=Redis slave server
    After=docker.service

    [Service]
    ExecStartPre=/usr/bin/mkdir -p /tmp/config/redis-slave
    ExecStartPre=/usr/bin/curl -s -o /tmp/config/redis-slave/redis.conf http://configserver-address:8080/config/redis
    ExecStartPre=-/usr/bin/docker kill %p
    ExecStartPre=-/usr/bin/docker rm %p
    ExecStart=/usr/bin/docker run --rm --name %p -v /tmp/config/redis-slave/redis.conf:/tmp/redis.conf redis:alpine /tmp/redis.conf

We have used the placeholder configserver-address for the config server's address in the preceding code, so make certain you fill in the appropriate IP for the system running the config server.

How it works...

We outsource the work of generating the configuration to the web server or beyond. This is a common idiom in modern cluster-oriented systems: many small pieces work together to make the whole.

The idea of using a configuration URL is very flexible. In this case, it allows us to use a pre-packaged, official Docker image for an application that has no knowledge of the cluster, in its standard, default setup. While redis is fairly simple, the same concept can be used to generate and supply configurations for almost any legacy application.
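Since the details of the configuration server don't matter as long as it speaks HTTP, the same idea can be sketched with the Python standard library. This is an illustrative alternative, not part of the recipe: the names render_redis_config, ConfigHandler, and serve are hypothetical, and the values are statically configured just as in the Go example.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_redis_config(save, master_ip, master_port):
    """Build the redis configuration text from the given options."""
    lines = []
    if save:
        # Same snapshot rules as the template in the Go example
        lines += ["save 900 1", "save 300 10", "save 60 10000"]
    lines.append("slaveof {} {}".format(master_ip, master_port))
    return "\n".join(lines) + "\n"


class ConfigHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path != "/config/redis":
            self.send_error(404)
            return
        # Static values here; a real server could query etcd or a database
        body = render_redis_config(True, "192.168.25.100", "6379").encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8080):
    """Start the config server; blocks until interrupted."""
    HTTPServer(("", port), ConfigHandler).serve_forever()


# Show what a client such as curl would receive from /config/redis
print(render_redis_config(True, "192.168.25.100", "6379"))
```

Calling serve() starts listening on port 8080, after which the curl line in the unit file could fetch the rendered file from /config/redis. Keeping the rendering in a plain function makes it easy to check the output without starting a server.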
Translating etcd to configuration files

In CoreOS, we have a database whose very name suggests that it is well suited to configuration: etcd. (While etc is an abbreviation for the Latin et cetera, in common UNIX usage, /etc is where the system configuration is stored.) It presents a standard HTTP server, which is easy to access from nearly anything. This makes storing application configuration in etcd a natural choice. The only problem is devising methods of storing the configuration in ways that are sufficiently expressive, flexible, and usable.

Getting ready

A naive but simple way of using etcd is to simply use it as a key-oriented file store as follows:

    etcdctl set myconfig "$(cat mylocalconfig.conf | base64)"
    etcdctl get myconfig | base64 -d > mylocalconfig.conf

However, this method stores the configuration file in the database as a static, opaque blob that can only be stored and retrieved whole. Decoupling the generation from the consumption yields much more flexibility, both in adapting configuration content to multiple consumers and producers and in scaling out to multiple access uses.

How to do it...

We can store and retrieve an entire configuration blob very simply as follows:

    etcdctl set /redis/config "$(cat redis.conf | base64)"
    etcdctl get /redis/config | base64 -d > redis.conf

Or we can store more generally-structured data as follows:

    etcdctl set /redis/config/master 192.168.9.23
    etcdctl set /redis/config/loglevel notice
    etcdctl set /redis/config/dbfile dump.rdb

And use it in different ways:

    # jq -r prints the raw value without JSON quotes
    REDISMASTER=$(curl -s http://localhost:2379/v2/keys/redis/config/master | jq -r .node.value)
    # 6379 is the default redis port; adjust it if your master listens elsewhere
    cat <<ENDHERE >/etc/redis.conf
    slaveof ${REDISMASTER} 6379
    loglevel $(etcdctl get /redis/config/loglevel)
    dbfilename $(etcdctl get /redis/config/dbfile)
    ENDHERE

Building EnvironmentFiles

Environment variables are a popular choice for configuring container executions because nearly anything can read or write them, especially shell scripts.
Moreover, they are always ephemeral, and by widely-accepted convention they override configuration file settings.

Getting ready

Systemd provides an EnvironmentFile directive that can be issued multiple times in a service file. This directive takes a filename as its argument; the file should contain key=value pairs to be loaded into the execution environment of the ExecStart program. CoreOS provides (in most non-bare metal installations) the file /etc/environment, which is formatted to be included with an EnvironmentFile statement. It typically contains variables describing the public and private IPs of the host.

Environment file

A common misunderstanding when starting out with Docker is about environment variables. Docker does not inherit the environment variables of the environment that calls docker run. Environment variables that are to be passed to the container must be explicitly stated using the -e option. This can be particularly confounding since systemd units do much the same thing. Therefore, to pass environments into Docker from a systemd unit, you need to define them both in the unit and in the docker run invocation. So this will work as expected:

    [Service]
    Environment=TESTVAR=testVal
    ExecStart=/usr/bin/docker run -e TESTVAR=$TESTVAR nginx

Whereas this will not:

    [Service]
    Environment=TESTVAR=unknowableVal
    ExecStart=/usr/bin/docker run nginx

How to do it...

We will start by constructing an environment file generator unit. For testapp-env.service use the following:

    [Unit]
    Description=EnvironmentFile generator for testapp
    Before=testapp.service
    BindsTo=testapp.service

    [Install]
    RequiredBy=testapp.service

    [Service]
    ExecStart=/bin/sh -c "echo NOW=$(date +'%%'s) >/run/now.env"
    Type=oneshot
    RemainAfterExit=yes

You may note the odd syntax for the date format. Systemd treats % as the start of one of its own specifiers (%s included), so the percent sign is doubled (%%) to reach the shell as a literal %s.
For testapp.service use the following:

    [Unit]
    Description=My Amazing test app, configured by EnvironmentFile

    [Service]
    EnvironmentFile=/run/now.env
    ExecStart=/usr/bin/docker run --rm -p 8080:8080 -e NOW=${NOW} ulexus/environmentfile-demo

If you are using fleet, you can submit these service files. If you are using raw systemd, you will need to install them into /etc/systemd/system. Then issue the following:

    systemctl daemon-reload
    systemctl enable testapp-env.service
    systemctl start testapp.service

(Figure: testapp output)

How it works...

The first unit writes the current UNIX timestamp to the file /run/now.env, and the second unit reads that file, parsing its contents into environment variables. We then pass the desired environment variables into the docker execution. Taking apart the first unit, there are a number of important components. They are as follows:

The Before statement tells systemd that the unit should be started before the main testapp. This is important so that the environment file exists before the service is started. Otherwise, the main unit will either fail because the file does not exist or read the wrong data because the file is stale.

The BindsTo setting tells systemd that the unit should be stopped and started with testapp.service. This makes sure that it is restarted when testapp is restarted, refreshing the environment file.

The RequiredBy setting tells systemd that this unit is required by the other unit. Stating the relationship in this manner allows the first unit to be separately enabled or disabled without any modification of the second unit. While that wouldn't matter in this case, in cases where the target service is a standard unit file which knows nothing about the helper unit, it allows us to use the add-on without having to change the official, standard service unit.

The Type and RemainAfterExit combination of settings tells systemd to expect that the unit will exit, but to treat the unit as up even after it has exited. This allows the prerequisite to operate even though the unit has exited.

In the second unit, the main service, the main thing to note is the EnvironmentFile line. It simply takes a file as an argument. We reference the file that was created (or updated) by the first unit. Systemd reads it into the environment for any Exec* statements. Because Docker separates its containers' environments, we do still have to manually pass that variable into the container with the -e flag to docker run.

There's more...

You might be trying to figure out why we don't combine the units and try to set the environment variable with an ExecStartPre statement. Modifications to the environment made in one Exec* statement are isolated from every other Exec* statement. You can make changes to the environment within an Exec* statement, but those changes will not be carried over to any other Exec* statement. Also, you cannot execute any commands in an Environment or EnvironmentFile statement, nor can they expand any variables themselves.

Building an active configuration manager

Dynamic systems are, well, dynamic. They will often change while a dependent service is running. In such a case, simple runtime configuration systems such as those we have discussed thus far are insufficient. We need the ability to tell our dependent services to use the new, changed configuration. For cases such as this, we can implement active configuration management. In an active configuration, some processes monitor the state of dynamic components and notify or restart dependent services with the updated data.

Getting ready

Much like the active service announcer, we will be building our active configuration manager in Go, so a functional Go development environment is required. To increase readability, we have broken each subroutine into a separate file.

How to do it...
First, we construct the main routine, as follows:

main.go:

    package main

    import (
        "log"
        "os"

        "github.com/coreos/etcd/clientv3"
        "golang.org/x/net/context"
    )

    var etcdKey = "web:backends"

    func main() {
        ctx := context.Background()

        log.Println("Creating etcd client")
        c, err := clientv3.NewFromURL(os.Getenv("ETCD_ENDPOINTS"))
        if err != nil {
            // log.Fatal logs the message and then exits with status 1
            log.Fatal("Failed to create etcd client:", err)
        }
        defer c.Close()

        // Watch every key under the prefix and reconfigure on each change
        w := c.Watch(ctx, etcdKey, clientv3.WithPrefix())
        for resp := range w {
            if resp.Canceled {
                log.Fatal("etcd watcher died")
            }
            go func() {
                if err := reconfigure(ctx, c); err != nil {
                    log.Println("reconfigure failed:", err)
                }
            }()
        }
    }

Next, our reconfigure routine, which pulls the current state from etcd, writes the configuration to file, and restarts our service, as follows:

reconfigure.go:

    package main

    import (
        "github.com/coreos/etcd/clientv3"
        "golang.org/x/net/context"
    )

    // reconfigure haproxy
    func reconfigure(ctx context.Context, c *clientv3.Client) error {
        backends, err := get(ctx, c)
        if err != nil {
            return err
        }
        if err = write(backends); err != nil {
            return err
        }
        return restart()
    }

The reconfigure routine just calls get, write and restart, in sequence.
Let's create each of those as follows:

get.go:

    package main

    import (
        "github.com/coreos/etcd/clientv3"
        "golang.org/x/net/context"
    )

    // get the present list of backends
    func get(ctx context.Context, c *clientv3.Client) ([]string, error) {
        // WithPrefix matches the watcher in main.go, so every key under etcdKey is returned
        resp, err := clientv3.NewKV(c).Get(ctx, etcdKey, clientv3.WithPrefix())
        if err != nil {
            return nil, err
        }

        backends := []string{}
        for _, node := range resp.Kvs {
            if node.Value != nil {
                backends = append(backends, string(node.Value))
            }
        }
        return backends, nil
    }

write.go:

    package main

    import (
        "os"
        "text/template" // haproxy.conf is plain text, so text/template rather than html/template
    )

    var configTemplate *template.Template

    func init() {
        configTemplate = template.Must(template.New("config").Parse(configTemplateString))
    }

    // Write the updated config file
    func write(backends []string) error {
        cf, err := os.Create("/config/haproxy.conf")
        if err != nil {
            return err
        }
        defer cf.Close()

        return configTemplate.Execute(cf, backends)
    }

    // Template variables are only expanded inside {{ }} actions
    var configTemplateString = `
    frontend public
        bind 0.0.0.0:80
        default_backend servers

    backend servers
    {{range $index, $ip := .}}
        server srv-{{$index}} {{$ip}}
    {{end}}
    `

restart.go:

    package main

    import "github.com/coreos/go-systemd/dbus"

    // restart haproxy
    func restart() error {
        conn, err := dbus.NewSystemdConnection()
        if err != nil {
            return err
        }
        defer conn.Close()

        _, err = conn.RestartUnit("haproxy.service", "ignore-dependencies", nil)
        return err
    }

With our active configuration manager available, we can now create a service unit to run it, as follows:

haproxy-config-manager.service:

    [Unit]
    Description=Active configuration manager

    [Service]
    ExecStart=/usr/bin/docker run --rm --name %p -v /data/config:/data -v /var/run/dbus:/var/run/dbus -v /run/systemd:/run/systemd -e ETCD_ENDPOINTS=http://${COREOS_PUBLIC_IPV4}:2379 quay.io/ulexus/demo-active-configuration-manager
    Restart=always
    RestartSec=10

    [X-Fleet]
    MachineOf=haproxy.service

How it works...

First, we monitor the pertinent keys in etcd. It helps to have all of the keys under one prefix, but if that isn't the case, we can simply add more watchers.
When a change occurs, we pull the present values for all the pertinent keys from etcd and then rebuild our configuration file. Next, we tell systemd to restart the dependent service. If the target service has a valid ExecReload, we could tell systemd to reload, instead. In order to talk to systemd, we have passed in the dbus and systemd directories, to enable access to their respective sockets.

Using fleet globals

When you have a set of services that should be run on each of a set of machines, it can be tedious to run discrete and separate unit instances for each node. Fleet provides a reasonably flexible way to run these kinds of services, and when nodes are added, it will automatically start any declared globals on these machines.

Getting ready

In order to use fleet globals, you will need fleet running on each machine on which the globals will be executed. This is usually a simple matter of enabling fleet within the cloud-config as follows:

    #cloud-config
    coreos:
      fleet:
        metadata: service=nginx,cpu=i7,disk=ssd
        public-ip: "$public_ipv4"
      units:
        - name: fleet.service
          command: start

How to do it...

To make a fleet unit a global, simply declare the Global=true parameter in the [X-Fleet] section of the unit as follows:

    [Unit]
    Description=My global service

    [Service]
    ExecStart=/usr/bin/docker run --rm -p 8080:80 nginx

    [X-Fleet]
    Global=true

Globals can also be filtered with other keys. For instance, a common filter is to run globals on all nodes that have certain metadata:

    [Unit]
    Description=My partial global service

    [Service]
    ExecStart=/usr/bin/docker run --rm -p 8080:80 nginx

    [X-Fleet]
    Global=true
    MachineMetadata=service=nginx

Note that the metadata that is being referred to here is the fleet metadata, which is distinct from the instance metadata of your cloud provider or even the node tags of Kubernetes.

How it works...

Unlike most fleet units, there is not a one-to-one correspondence between the fleet unit instance and the actual running services.
This has the side effect that modifications to a fleet global have immediate global effect. In other words, there is no rolling update with a fleet global. There is an immediate, universal replacement only. Hence, do not use globals for services that cannot be wholly down during upgrades.

Summary

In this article, we addressed the challenges faced by administrators who come from traditional static deployment environments. We learned that we can't just build a configuration and deploy it. Configuration needs to be managed actively in a running environment, and any changes need to be reloaded.

Resources for Article

Further resources on this subject:

  How to Set Up CoreOS Environment [article]
  CoreOS Networking and Flannel Internals [article]
  Let's start with Extending Docker [article]

Amit Kothari

17 Oct 2016

9 min read

How to build a desktop app using Electron

Desktop apps are making a comeback. Even companies with cloud-based applications and awesome web apps are investing in desktop apps to offer a better user experience. One example is the team collaboration tool Slack, which has a really good desktop app built with web technologies using Electron.

Electron is an open source framework used to build cross-platform desktop apps using web technologies. It uses Node.js and Chromium and allows us to develop desktop GUI apps using HTML, CSS and JavaScript. Electron was developed by GitHub, initially for the Atom editor, but is now used by many companies, including Slack, WordPress, Microsoft and Docker, to name a few. Electron apps are web apps running in an embedded Chromium web browser, with access to the full suite of Node.js modules and the underlying operating system. In this post we will build a simple desktop app using Electron.

Hello Electron

Let's start by creating a simple app. Before we start, we need Node.js and npm installed. Follow the instructions on the Node.js website if you do not have these installed already. Create a new directory for your application and, inside the app directory, create a package.json file by using the npm init command. Follow the prompts and remember to set main.js as the entry point. Once the file is generated, install electron-prebuilt, which is the precompiled version of Electron, and add it as a dev dependency in the package.json using the command npm install --save-dev electron-prebuilt. Also add "start": "electron ." under scripts, which we will use later to start our app. The package.json file will look something like this:

    {
      "name": "electron-tutorial",
      "version": "1.0.0",
      "description": "Electron Tutorial",
      "main": "main.js",
      "scripts": {
        "start": "electron ."
      },
      "devDependencies": {
        "electron-prebuilt": "^1.3.3"
      }
    }

Create a file main.js with the following content:

    const {app, BrowserWindow} = require('electron');

    // Global reference of the window object.
    let mainWindow;

    // When Electron finishes initialization, create window and load app index.html
    app.on('ready', () => {
      mainWindow = new BrowserWindow({ width: 800, height: 600 });
      mainWindow.loadURL(`file://${__dirname}/index.html`);
    });

We defined main.js as the entry point to our app in package.json. In main.js, the Electron app module controls the application lifecycle and BrowserWindow is used to create a native browser window. When Electron finishes initializing and our app is ready, we create a browser window to load our web page, index.html. As mentioned in the Electron documentation, remember to keep a global reference to the window object to prevent it from closing automatically when the JavaScript garbage collector kicks in.

Finally, create the index.html file:

    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="UTF-8">
        <title>Hello Electron</title>
      </head>
      <body>
        <h1>Hello Electron</h1>
      </body>
    </html>

We can now start our app by running the npm start command.

Testing the Electron app

Let's write some integration tests for our app using Spectron. Spectron allows us to test Electron apps using ChromeDriver and WebdriverIO. It is test-framework agnostic, but for this example, we will use mocha to write the tests. Let's start by adding spectron and mocha as dev dependencies using the npm install --save-dev spectron and npm install --save-dev mocha commands. Then add "test": "./node_modules/mocha/bin/mocha" under scripts in the package.json file. This will be used to run our tests later. The package.json should look something like this:

    {
      "name": "electron-tutorial",
      "version": "1.0.0",
      "description": "Electron Tutorial",
      "main": "main.js",
      "scripts": {
        "start": "electron .",
        "test": "./node_modules/mocha/bin/mocha"
      },
      "devDependencies": {
        "electron-prebuilt": "^1.3.3",
        "mocha": "^3.0.2",
        "spectron": "^3.3.0"
      }
    }

Now that we have all the dependencies installed, let's write some tests. Create a directory called test and a file called test.js inside it.
Copy the following content to test.js:

    var Application = require('spectron').Application;
    var electron = require('electron-prebuilt');
    var assert = require('assert');

    describe('Sample app', function () {
      var app;

      // Start a fresh instance of the app before each test
      beforeEach(function () {
        app = new Application({ path: electron, args: ['.'] });
        return app.start();
      });

      // Stop the app after each test if it is still running
      afterEach(function () {
        if (app && app.isRunning()) {
          return app.stop();
        }
      });

      it('should show initial window', function () {
        return app.browserWindow.isVisible()
          .then(function (isVisible) {
            assert.equal(isVisible, true);
          });
      });

      it('should have correct app title', function () {
        return app.client.getTitle()
          .then(function (title) {
            assert.equal(title, 'Hello Electron');
          });
      });
    });

Here we have a couple of simple tests. We start the app before each test and stop it after each test. The first test verifies that the app's browserWindow is visible, and the second test verifies the app's title. We can run these tests using the npm run test command. Spectron not only allows us to easily set up and tear down our app, but also gives access to various APIs, allowing us to write sophisticated tests covering various business requirements. Please have a look at its documentation for more details.

Packaging our app

Now that we have a basic app, we are ready to package and build it for distribution. We will use electron-builder for this, which offers a complete solution to distribute apps on different platforms with the option to auto-update. It is recommended to use two separate package.json files when using electron-builder, one for the development environment and build scripts and another one with app dependencies. But for our simple app, we can just use one package.json file. Let's start by adding electron-builder as a dev dependency using the command npm install --save-dev electron-builder. Make sure you have the name, description, version and author defined in package.json.
You also need to add electron-builder-specific options as the build property in package.json:

    "build": {
      "appId": "com.amitkothari.electronsample",
      "category": "public.app-category.productivity"
    }

For Mac OS, we need to specify appId and category. Look at the documentation for options for other platforms. Finally, add a script in package.json to package and build the app:

    "dist": "build"

The updated package.json will look like this:

    {
      "name": "electron-tutorial",
      "version": "1.0.0",
      "description": "Electron Tutorial",
      "author": "Amit Kothari",
      "main": "main.js",
      "scripts": {
        "start": "electron .",
        "test": "./node_modules/mocha/bin/mocha",
        "dist": "build"
      },
      "devDependencies": {
        "electron-prebuilt": "^1.3.3",
        "mocha": "^3.0.2",
        "spectron": "^3.3.0",
        "electron-builder": "^5.25.1"
      },
      "build": {
        "appId": "com.amitkothari.electronsample",
        "category": "public.app-category.productivity"
      }
    }

Next we need to create a build directory under our project root directory. In this, put a file background.png for the Mac OS DMG background and icon.icns for the app icon. We can now package our app by running the npm run dist command.

Todo App

We've built a very simple app, but Electron apps can do more than just show static text. Let's add some dynamic behavior to our app and convert it into a Todo list manager. We can use any JavaScript framework of choice, from AngularJS to React, with Electron, but for this example, we will use plain JavaScript. To start with, let's update our index.html to display a todo list:

    <!DOCTYPE html>
    <html>
      <head>
        <meta charset="UTF-8">
        <title>Hello Electron</title>
        <link rel="stylesheet" type="text/css" href="./style.css">
      </head>
      <body>
        <div class="container">
          <ul id="todoList"></ul>
          <textarea id="todoInput" placeholder="What needs to be done ?"></textarea>
          <button id="addTodoButton">Add to list</button>
        </div>
        <script>require('./app.js')</script>
      </body>
    </html>

We also included style.css and app.js in index.html.
All our CSS will be in style.css and our app logic will be in app.js. Create the style.css file with the following content:

    body {
      margin: 0;
    }

    ul {
      list-style-type: none;
      margin: 0;
      padding: 0;
    }

    li {
      padding: 10px;
      border-bottom: 1px solid #ddd;
    }

    button {
      background-color: black;
      color: #fff;
      margin: 5px;
      padding: 5px;
      cursor: pointer;
      border: none;
      font-size: 12px;
    }

    .container {
      width: 100%;
    }

    #todoInput {
      float: left;
      display: block;
      overflow: auto;
      margin: 15px;
      padding: 10px;
      font-size: 12px;
      width: 250px;
    }

    #addTodoButton {
      float: left;
      margin: 25px 10px;
    }

And finally create the app.js file:

    (function () {
      const addTodoButton = document.getElementById('addTodoButton');
      const todoList = document.getElementById('todoList');

      // Create delete button for todo item
      const createTodoDeleteButton = () => {
        const deleteButton = document.createElement("button");
        deleteButton.innerHTML = "X";
        deleteButton.onclick = function () {
          this.parentNode.outerHTML = "";
        };
        return deleteButton;
      }

      // Create element to show todo text
      const createTodoText = (todo) => {
        const todoText = document.createElement("span");
        // textContent keeps user input from being interpreted as HTML
        todoText.textContent = todo;
        return todoText;
      }

      // Create a todo item with delete button and text
      const createTodoItem = (todo) => {
        const todoItem = document.createElement("li");
        todoItem.appendChild(createTodoDeleteButton());
        todoItem.appendChild(createTodoText(todo));
        return todoItem;
      }

      // Clear input field
      const clearTodoInputField = () => {
        document.getElementById("todoInput").value = "";
      }

      // Add new todo item and clear input field
      const addTodoItem = () => {
        const todo = document.getElementById('todoInput').value;
        if (todo) {
          todoList.appendChild(createTodoItem(todo));
          clearTodoInputField();
        }
      }

      addTodoButton.addEventListener("click", addTodoItem, false);
    }());

Our app.js has a self-invoking function which registers a listener (addTodoItem) on the addTodoButton click event.
When the add button is clicked, the addTodoItem function adds a new todo item and clears the text area. Run the app again using the npm start command.

Conclusion

We built a very simple app, but it shows the potential of Electron. As stated on the Electron website, if you can build a website, you can build a desktop app. I hope you found this post interesting. If you have built an application with Electron, please share it with us.

About the author

Amit Kothari is a full-stack software developer based in Melbourne, Australia. He has 10+ years of experience in designing and implementing software, mainly in Java/JEE. His recent experience is in building web applications using JavaScript frameworks such as React and AngularJS and backend microservices/REST APIs in Java. He is passionate about lean software development and continuous delivery.

Packt

17 Oct 2016

11 min read

Diving into Data – Search and Report

In this article by Josh Diakun, Paul R Johnson, and Derek Mock, authors of the book Splunk Operational Intelligence Cookbook - Second Edition, we will cover the basic ways to search the data in Splunk. We will also cover how to make raw event data readable.

The ability to search machine data is one of Splunk's core functions, and it should come as no surprise that many other features and functions of Splunk are heavily driven by searches. Everything from basic reports and dashboards to data models and fully featured Splunk applications is powered by Splunk searches behind the scenes. Splunk has its own search language known as the Search Processing Language (SPL). SPL contains hundreds of search commands, most of which also have several functions, arguments, and clauses. While a basic understanding of SPL is required in order to effectively search your data in Splunk, you are not expected to know all the commands! Even the most seasoned ninjas do not know all the commands and regularly refer to the Splunk manuals, website, or Splunk Answers (http://answers.splunk.com). To get you on your way with SPL, be sure to check out the search command cheat sheet and download the handy quick reference guide available at http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/SplunkEnterpriseQuickReferenceGuide.

Searching

Searches in Splunk usually start with a base search, followed by a number of commands that are delimited by one or more pipe (|) characters. The result of a command or search to the left of the pipe is used as the input for the next command to the right of the pipe. Multiple pipes are often found in a Splunk search to continually refine data results as needed. As we go through this article, this concept will become very familiar to you. Splunk allows you to search for anything that might be found in your log data.
For example, the most basic search in Splunk might be a search for a keyword such as error or an IP address such as 10.10.12.150. However, searching for a single word or IP over the terabytes of data that might potentially be in Splunk is not very efficient. Therefore, we can use the SPL and a number of Splunk commands to really refine our searches. The more refined and granular the search, the faster it runs and the quicker you get to the data you are looking for!

When searching in Splunk, try to filter as much as possible before the first pipe (|) character, as this will save CPU and disk I/O. Also, pick your time range wisely. Often, it helps to run the search over a small time range when testing it and then extend the range once the search provides what you need.

Boolean operators

There are three different types of Boolean operators available in Splunk. These are AND, OR, and NOT. Case sensitivity is important here, and these operators must be in uppercase to be recognized by Splunk. The AND operator is implied by default and is not needed, but does no harm if used. For example, searching for error OR success would return all the events that contain either the word error or the word success. Searching for error success would return all the events that contain both the words error and success. Another way to write this is error AND success. Searching web access logs for error OR success NOT mozilla would return all the events that contain either the word error or success, but not those events that also contain the word mozilla.

Common commands

There are many commands in Splunk that you will likely use on a daily basis when searching data within Splunk. These common commands are outlined in the following table:

Command            Description
chart/timechart    This command outputs results in a tabular and/or time-based output for use by Splunk charts.
dedup              This command de-duplicates results based upon specified fields, keeping the most recent match.
eval               This command evaluates new or existing fields and values. There are many different functions available for eval.
fields             This command specifies the fields to keep or remove in search results.
head               This command keeps the first X (as specified) rows of results.
lookup             This command looks up fields against an external source or list, to return additional field values.
rare               This command identifies the least common values of a field.
rename             This command renames the fields.
replace            This command replaces the values of fields with another value.
search             This command permits subsequent searching and filtering of results.
sort               This command sorts results in either ascending or descending order.
stats              This command performs statistical operations on the results. There are many different functions available for stats.
table              This command formats the results into a tabular output.
tail               This command keeps only the last X (as specified) rows of results.
top                This command identifies the most common values of a field.
transaction        This command merges events into a single event based upon a common transaction identifier.

Time modifiers

The drop-down time range picker in the Graphical User Interface (GUI) to the right of the Splunk search bar allows users to select from a number of different preset and custom time ranges. However, in addition to using the GUI, you can also specify time ranges directly in your search string using the earliest and latest time modifiers. When a time modifier is used in this way, it automatically overrides any time range that might be set in the GUI time range picker. The earliest and latest time modifiers can accept a number of different time units: seconds (s), minutes (m), hours (h), days (d), weeks (w), months (mon), quarters (q), and years (y). Time modifiers can also make use of the @ symbol to round down and snap to a specified time.
For example, searching for sourcetype=access_combined earliest=-1d@d latest=-1h will search all the access_combined events from midnight one day ago until one hour before now. Note that the snap (@) will round down such that if it were 12 p.m. now, we would be searching from midnight a day and a half ago until 11 a.m. today.

Working with fields

Fields in Splunk can be thought of as keywords that have one or more values. These fields are fully searchable by Splunk. At a minimum, every data source that comes into Splunk will have the source, host, index, and sourcetype fields, but some sources might have hundreds of additional fields. If the raw log data contains key-value pairs or is in a structured format such as JSON or XML, then Splunk will automatically extract the fields and make them searchable. Splunk can also be told how to extract fields from the raw log data in the backend props.conf and transforms.conf configuration files.

Searching for specific field values is simple. For example, sourcetype=access_combined status!=200 will search for events with a sourcetype field value of access_combined that have a status field with a value other than 200.

Splunk has a number of built-in pre-trained sourcetypes that ship with Splunk Enterprise and work out of the box with common data sources. These are available at http://docs.splunk.com/Documentation/Splunk/latest/Data/Listofpretrainedsourcetypes. In addition, Technical Add-Ons (TAs), which contain event types and field extractions for many other common data sources such as Windows events, are available from the Splunk app store at https://splunkbase.splunk.com.

Saving searches

Once you have written a nice search in Splunk, you may wish to save the search so that you can use it again at a later date or use it for a dashboard. Saved searches in Splunk are known as Reports. To save a search in Splunk, you simply click on the Save As button on the top right-hand side of the main search bar and select Report.
Making raw event data readable

When a basic search is executed in Splunk from the search bar, the search results are displayed in a raw event format by default. To many users, this raw event information is not particularly readable, and valuable information is often clouded by other less valuable data within the event. Additionally, if the events span several lines, only a few events can be seen on the screen at any one time.

In this recipe, we will write a Splunk search to demonstrate how we can leverage Splunk commands to make raw event data readable, tabulating events and displaying only the fields we are interested in.

Getting ready

You should be familiar with the Splunk search bar and search results area.

How to do it…

Follow the given steps to search and tabulate the selected event data:

1. Log in to your Splunk server.
2. Select the Search & Reporting application from the drop-down menu located in the top left-hand side of the screen.
3. Set the time range picker to Last 24 hours and type the following search into the Splunk search bar:

    index=main sourcetype=access_combined

   Then, click on Search or hit Enter.
4. Splunk will return the results of the search and display the raw search events under the search bar.
5. Let's rerun the search, but this time we will add the table command as follows:

    index=main sourcetype=access_combined | table _time, referer_domain, method, uri_path, status, JSESSIONID, useragent

6. Splunk will now return the same number of events, but instead of presenting the raw events to you, the data will be in a nicely formatted table, displaying only the fields we specified. This is much easier to read!
7. Save this search by clicking on Save As and then on Report. Give the report the name cp02_tabulated_webaccess_logs and click on Save. On the next screen, click on Continue Editing to return to the search.

How it works…

Let's break down the search piece by piece:

Search fragment: index=main
Description: All the data in Splunk is held in one or more indexes. While not strictly necessary, it is a good practice to specify the index(es) to search, as this will ensure a more precise search.

Search fragment: sourcetype=access_combined
Description: This tells Splunk to search only the data associated with the access_combined sourcetype, which, in our case, is the web access logs.

Search fragment: | table _time, referer_domain, method, uri_path, status, JSESSIONID, useragent
Description: Using the table command, we take the result of our search to the left of the pipe and tell Splunk to return the data in a tabular format. Splunk will only display the fields specified after the table command in the table of results.

In this recipe, you used the table command. The table command can have a noticeable performance impact on large searches. It should be used towards the end of a search, once all the other processing on the data by the other Splunk commands has been performed. The stats command is more efficient than the table command and should be used in place of table where possible. However, be aware that stats and table are two very different commands.

There's more…

The table command is very useful in situations where we wish to present data in a readable format. Additionally, tabulated data in Splunk can be downloaded as a CSV file, which many users find useful for offline processing in spreadsheet software or for sending to others. There are some other ways we can leverage the table command to make our raw event data readable.

Tabulating every field

Often, there are situations where we want to present every event within the data in a tabular format, without having to specify each field one by one. To do this, we simply use a wildcard (*) character as follows:

    index=main sourcetype=access_combined | table *

Removing fields, then tabulating everything else

While tabulating every field using the wildcard (*) character is useful, you will notice that there are a number of Splunk internal fields, such as _raw, that appear in the table.
We can use the fields command before the table command to remove the fields as follows:

    index=main sourcetype=access_combined | fields - sourcetype, index, _raw, source date* linecount punct host time* eventtype | table *

If we do not include the minus (-) character after the fields command, Splunk will keep the specified fields and remove all the other fields.

Summary

In this article, along with an introduction to Splunk, we covered how to make raw event data readable.

Packt

17 Oct 2016

9 min read

First Projects with the ESP8266

In this article by Marco Schwartz, author of Internet of Things with ESP8266, we assume that the ESP8266 chip is ready to be used and that you can connect it to your Wi-Fi network, so we can now build some basic projects with it. This will help you understand the basics of the ESP8266.

We are going to see three projects in this article: how to control an LED, how to read data from a GPIO pin, and how to grab the contents from a web page. We will also see how to read data from a digital sensor.

Controlling an LED

First, we are going to see how to control a simple LED. The GPIO pins of the ESP8266 can be configured to perform many functions: inputs, outputs, PWM outputs, and also SPI or I2C communications. This first project will teach you how to use the GPIO pins of the chip as outputs.

The first step is to add an LED to our project. These are the extra components you will need for this project:

5mm LED (https://www.sparkfun.com/products/9590)
330 Ohm resistor (to limit the current in the LED) (https://www.sparkfun.com/products/8377)

The next step is to connect the LED with the resistor to the ESP8266 board. To do so, first place the resistor on the breadboard. Then, place the LED on the breadboard as well, connecting the longest pin of the LED (the anode) to one pin of the resistor. Finally, connect the other end of the resistor to GPIO pin 5 of the ESP8266, and the other end of the LED to the ground.

We are now going to light up the LED by programming the ESP8266 chip. This is the complete code for this section:

    // Import required libraries
    #include <ESP8266WiFi.h>

    void setup() {
      // Set GPIO 5 as output
      pinMode(5, OUTPUT);

      // Set GPIO 5 on a HIGH state
      digitalWrite(5, HIGH);
    }

    void loop() {
    }

This code simply sets the GPIO pin as an output, and then applies a HIGH state on it.
The HIGH state means that the pin is active, and that positive voltage (3.3V) is applied on the pin. A LOW state would mean that the output is at 0V.

You can now copy this code and paste it in the Arduino IDE. Then, upload the code to the board using the instructions from the previous article. You should immediately see the LED light up. You can switch it off again by using digitalWrite(5, LOW) in the code. You could also, for example, modify the code so the ESP8266 switches the LED on and off every second.

Reading data from a GPIO pin

As a second project in this article, we are going to read the state of a GPIO pin. For this, we will use the same pin as in the previous project. You can therefore remove the LED and the resistor that we used in the previous project. Now, simply connect this pin (GPIO 5) of the board to the positive power supply on your breadboard with a wire, therefore applying a 3.3V signal on this pin.

Reading data from a pin is really simple. This is the complete code for this part:

    // Import required libraries
    #include <ESP8266WiFi.h>

    void setup(void) {
      // Start Serial (to display results on the Serial monitor)
      Serial.begin(115200);

      // Set GPIO 5 as input
      pinMode(5, INPUT);
    }

    void loop() {
      // Read GPIO 5 and print it on Serial port
      Serial.print("State of GPIO 5: ");
      Serial.println(digitalRead(5));

      // Wait 1 second
      delay(1000);
    }

We simply set the pin as an input, and then read the value of this pin, and print it out every second. Copy and paste this code into the Arduino IDE, then upload it to the board using the instructions from the previous article. This is the result you should get in the Serial monitor:

    State of GPIO 5: 1

We can see that the returned value is 1 (digital state HIGH), which is what we expected, because we connected the pin to the positive power supply. As a test, you can also connect the pin to the ground, and the state should go to 0.
Grabbing the content from a web page

As a last project in this article, we are finally going to use the Wi-Fi connection of the chip to grab the content of a page. We will simply use the www.example.com page, as it's a basic page widely used for test purposes. This is the complete code for this project:

    // Import required libraries
    #include <ESP8266WiFi.h>

    // WiFi parameters
    const char* ssid = "your_wifi_network";
    const char* password = "your_wifi_password";

    // Host
    const char* host = "www.example.com";

    void setup() {
      // Start Serial
      Serial.begin(115200);

      // We start by connecting to a WiFi network
      Serial.println();
      Serial.println();
      Serial.print("Connecting to ");
      Serial.println(ssid);

      WiFi.begin(ssid, password);
      while (WiFi.status() != WL_CONNECTED) {
        delay(500);
        Serial.print(".");
      }

      Serial.println("");
      Serial.println("WiFi connected");
      Serial.println("IP address: ");
      Serial.println(WiFi.localIP());
    }

    int value = 0;

    void loop() {
      Serial.print("Connecting to ");
      Serial.println(host);

      // Use WiFiClient class to create TCP connections
      WiFiClient client;
      const int httpPort = 80;
      if (!client.connect(host, httpPort)) {
        Serial.println("connection failed");
        return;
      }

      // This will send the request to the server
      client.print(String("GET /") + " HTTP/1.1\r\n" +
                   "Host: " + host + "\r\n" +
                   "Connection: close\r\n\r\n");
      delay(10);

      // Read all the lines of the reply from server and print them to Serial
      while (client.available()) {
        String line = client.readStringUntil('\r');
        Serial.print(line);
      }

      Serial.println();
      Serial.println("closing connection");
      delay(5000);
    }

The code is really basic: we first open a connection to the example.com website, and then send a GET request to grab the content of the page. Using the while(client.available()) code, we also listen for incoming data, and print it all inside the Serial monitor.

You can now copy this code and paste it into the Arduino IDE.
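The GET request sent by the sketch is plain text in which every line must end with a carriage return and line feed (\r\n), and an empty line marks the end of the headers. As a rough illustration, the request can be assembled and inspected on a desktop compiler with standard C++. The helper name buildGetRequest is hypothetical and is not part of the ESP8266 library:

```cpp
#include <cassert>
#include <string>

// Build a minimal HTTP/1.1 GET request (hypothetical helper, standard C++ only).
// Each line ends with CRLF ("\r\n"); the extra CRLF at the end is the empty
// line that terminates the header block.
std::string buildGetRequest(const std::string& host, const std::string& path) {
    assert(!path.empty() && path[0] == '/');  // the request target must start with '/'
    return "GET " + path + " HTTP/1.1\r\n" +
           "Host: " + host + "\r\n" +
           "Connection: close\r\n\r\n";
}
```

On the board itself, the same text is what gets passed to client.print(); for www.example.com and the path / it matches the request built in the loop() function.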
Once the code is running, the Serial monitor should display the content of the page, in pure HTML code.

Reading data from a digital sensor

In this last section of this article, we are going to connect a digital sensor to our ESP8266 chip, and read data from it. As an example, we will use a DHT11 sensor that can be used to get ambient temperature and humidity. You will need to get this component for this section: the DHT11 sensor (https://www.adafruit.com/products/386).

Let's now connect this sensor to your ESP8266:

First, place the sensor on the breadboard.
Then, connect the first pin of the sensor to VCC, the second pin to pin #5 of the ESP8266, and the fourth pin of the sensor to GND.

Note that here I've used another ESP8266 board, the Adafruit ESP8266 breakout board.

We will also use the aREST framework in this example, so it's easy for you to access the measurements remotely. aREST is a complete framework to control your ESP8266 boards remotely (including from the cloud), and we are going to use it several times in the article. You can find more information about it at the following URL: http://arest.io/.

Let's now configure the board. The code is too long to be inserted here, but I will detail the most important part of it now. It starts by including the required libraries:

    #include "ESP8266WiFi.h"
    #include <aREST.h>
    #include "DHT.h"

To install those libraries, simply look for them inside the Arduino IDE library manager.
Next, we need to set the pin to which the DHT sensor is connected:

    #define DHTPIN 5
    #define DHTTYPE DHT11

After that we declare an instance of the DHT sensor:

    DHT dht(DHTPIN, DHTTYPE, 15);

As earlier, you will need to insert your own Wi-Fi name and password inside the code:

    const char* ssid = "wifi-name";
    const char* password = "wifi-pass";

We also define two variables that will hold the measurements of the sensor:

    float temperature;
    float humidity;

In the setup() function of the sketch, we initialize the sensor:

    dht.begin();

Still in the setup() function, we expose the variables to the aREST API, so we can access them remotely via Wi-Fi:

    rest.variable("temperature",&temperature);
    rest.variable("humidity",&humidity);

Finally, in the loop() function, we make the measurements from the sensor:

    humidity = dht.readHumidity();
    temperature = dht.readTemperature();

It's now time to test the project! Simply grab all the code and put it inside the Arduino IDE. Also make sure to install the aREST Arduino library using the Arduino library manager. Now, put the ESP8266 board in bootloader mode, and upload the code to the board. After that, reset the board, and open the Serial monitor. You should see the IP address of the board being displayed.

Now, we can access the measurements from the sensor remotely. Simply go to your favorite web browser, and type the IP address of your board followed by /temperature, for example:

    192.168.115.105/temperature

You should immediately get the answer from the board, with the temperature being displayed:

    {
      "temperature": 25.00,
      "id": "1",
      "name": "esp8266",
      "connected": true
    }

You can of course do the same with humidity. Note that we used the aREST API here. You can learn more about it at: http://arest.io/.

Congratulations, you just completed your very first projects using the ESP8266 chip! Feel free to experiment with what you learned in this article, and start learning more about how to configure your ESP8266 chip.

Summary

In this article, we realized our first basic projects using the ESP8266 Wi-Fi chip.
We first learned how to control a simple output, by controlling the state of an LED. Then, we saw how to read the state of a digital pin on the chip and how to grab the content of a web page over Wi-Fi. Finally, we learned how to read data from a digital sensor, and actually grab this data using the aREST framework. Next, we are going to go right into the main topic, and build our first Internet of Things project using the ESP8266.

How-To Tutorials

Packt

14 Oct 2016

37 min read

Bringing DevOps to Network Operations

Packt

14 Oct 2016

16 min read

Deployment and DevOps

In this article by Makoto Hashimoto and Nicolas Modrzyk, the authors of the book Clojure Programming Cookbook, we will cover the recipe Clojure on Amazon Web Services.

Clojure on Amazon Web Services

This recipe is a standalone dish where you can learn how to combine the elegance of Clojure with Amazon Web Services (AWS). AWS started in 2006 and is used by many businesses as a set of easy-to-use web services. This style of on-demand service is becoming more and more popular. You can use computer resources and software services on demand, without the need to prepare hardware or install software yourself.

You will mostly make use of the amazonica library, which is a comprehensive Clojure client for the entire Amazon AWS set of APIs. This library wraps the Amazon AWS APIs and supports most AWS services, including EC2, S3, Lambda, Kinesis, Elastic Beanstalk, Elastic MapReduce, and RedShift.

This recipe has received a lot of its content and love from Robin Birtle, a leading member of the Clojure Community in Japan.

Getting ready

You need an AWS account and credentials to use AWS, so this recipe starts by showing you how to do the setup and acquire the necessary keys to get started.

Signing up on AWS

You need to sign up for AWS if you don't have an AWS account yet. In this case, go to https://aws.amazon.com, click on Sign In to the Console, and follow the instructions for creating your account. To complete the sign-up, enter the number of a valid credit card and a phone number.

Getting access key and secret access key

To call the API, you now need your AWS access key and secret access key.
Go to the AWS console, click on your name, which is located in the top right corner of the screen, and select Security Credential. Select Access Keys (Access Key ID and Secret Access Key), and then click on New Access Key. Your access key and secret access key are now displayed. Copy and save these strings for later use.

Setting up dependencies in your project.clj

Let's add the amazonica library to your project.clj and restart your REPL:

    :dependencies [[org.clojure/clojure "1.8.0"]
                   [amazonica "0.3.67"]]

How to do it…

From there on, we will go through some sample usage of the core Amazon services, accessed with Clojure and the amazonica library. The three main ones we will review are as follows:

EC2, Amazon's Elastic Compute Cloud, which allows you to run virtual machines on Amazon's Cloud
S3, Simple Storage Service, which gives you Cloud-based storage
SQS, Simple Queue Service, which gives you Cloud-based message queuing

Let's go through each of these one by one.

Using EC2

Let's assume you have an EC2 micro instance in the Tokyo region.

First of all, we will declare the core and ec2 namespaces of amazonica to use:

    (ns aws-examples.ec2-example
      (:require [amazonica.aws.ec2 :as ec2]
                [amazonica.core :as core]))

We will set the access key and secret access key so that the AWS client API can access AWS. core/defcredential does this as follows:

    (core/defcredential "Your Access Key" "Your Secret Access Key" "your region")
    ;;=> {:access-key "Your Access Key", :secret-key "Your Secret Access Key", :endpoint "your region"}

The region you need to specify is a region name such as ap-northeast-1, ap-south-1, or us-west-2. To get the full list of regions, use ec2/describe-regions:

    (ec2/describe-regions)
    ;;=> {:regions [{:region-name "ap-south-1", :endpoint "ec2.ap-south-1.amazonaws.com"}
    ;;=> .....
    ;;=> {:region-name "ap-northeast-2", :endpoint "ec2.ap-northeast-2.amazonaws.com"}
    ;;=> {:region-name "ap-northeast-1", :endpoint "ec2.ap-northeast-1.amazonaws.com"}
    ;;=> .....
    ;;=> {:region-name "us-west-2", :endpoint "ec2.us-west-2.amazonaws.com"}]}

ec2/describe-instances returns very long information, as follows:

    (ec2/describe-instances)
    ;;=> {:reservations [{:reservation-id "r-8efe3c2b", :requester-id "226008221399",
    ;;=> :owner-id "182672843130", :group-names [], :groups [], ....

To get only the necessary information about each instance, we define the following get-instances-info function:

    (defn get-instances-info []
      (let [inst (ec2/describe-instances)]
        (->> (mapcat :instances (inst :reservations))
             (map #(vector
                    ;; the node name is the value of the tag whose key is "Name"
                    [:node-name (->> (filter (fn [x] (= (:key x) "Name")) (:tags %))
                                     first
                                     :value)]
                    [:status (get-in % [:state :name])]
                    [:instance-id (:instance-id %)]
                    [:private-dns-name (:private-dns-name %)]
                    [:global-ip (-> % :network-interfaces first :private-ip-addresses first :association :public-ip)]
                    [:private-ip (-> % :network-interfaces first :private-ip-addresses first :private-ip-address)]))
             (map #(into {} %))
             (sort-by :node-name))))
    ;;=> #'aws-examples.ec2-example/get-instances-info

Let's try to use this function:

    (get-instances-info)
    ;;=> ({:node-name "ECS Instance - amazon-ecs-cli-setup-my-cluster",
    ;;=> :status "running",
    ;;=> :instance-id "i-a1257a3e",
    ;;=> :private-dns-name "ip-10-0-0-212.ap-northeast-1.compute.internal",
    ;;=> :global-ip "54.199.234.18",
    ;;=> :private-ip "10.0.0.212"}
    ;;=> {:node-name "EcsInstanceAsg",
    ;;=> :status "terminated",
    ;;=> :instance-id "i-c5bbef5a",
    ;;=> :private-dns-name "",
    ;;=> :global-ip nil,
    ;;=> :private-ip nil})

As in the preceding example, we can obtain the list of instance IDs.
So, we can start and stop instances using ec2/start-instances and ec2/stop-instances accordingly:

    (ec2/start-instances :instance-ids '("i-c5bbef5a"))
    ;;=> {:starting-instances
    ;;=> [{:previous-state {:code 80, :name "stopped"},
    ;;=> :current-state {:code 0, :name "pending"},
    ;;=> :instance-id "i-c5bbef5a"}]}

    (ec2/stop-instances :instance-ids '("i-c5bbef5a"))
    ;;=> {:stopping-instances
    ;;=> [{:previous-state {:code 16, :name "running"},
    ;;=> :current-state {:code 64, :name "stopping"},
    ;;=> :instance-id "i-c5bbef5a"}]}

Using S3

Amazon S3 is secure, durable, and scalable storage in the AWS cloud. It's easy to use for developers and other users. S3 also provides high durability and availability at a low cost. The durability is 99.999999999 % and the availability is 99.99 %.

The examples below assume that the amazonica S3 namespace has been required under the alias s3; the namespace name used here is our own choice, following the pattern of the other examples:

    (ns aws-examples.s3-example
      (:require [amazonica.aws.s3 :as s3]))

Let's create S3 buckets named makoto-bucket-1, makoto-bucket-2, and makoto-bucket-3 as follows:

    (s3/create-bucket "makoto-bucket-1")
    ;;=> {:name "makoto-bucket-1"}
    (s3/create-bucket "makoto-bucket-2")
    ;;=> {:name "makoto-bucket-2"}
    (s3/create-bucket "makoto-bucket-3")
    ;;=> {:name "makoto-bucket-3"}

s3/list-buckets returns information about the buckets:

    (s3/list-buckets)
    ;;=> [{:creation-date #object[org.joda.time.DateTime 0x6a09e119 "2016-08-01T07:01:05.000+09:00"],
    ;;=> :owner
    ;;=> {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27",
    ;;=> :display-name "tokoma1"},
    ;;=> :name "makoto-bucket-1"}
    ;;=> {:creation-date #object[org.joda.time.DateTime 0x7392252c "2016-08-01T17:35:30.000+09:00"],
    ;;=> :owner
    ;;=> {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27",
    ;;=> :display-name "tokoma1"},
    ;;=> :name "makoto-bucket-2"}
    ;;=> {:creation-date #object[org.joda.time.DateTime 0x4d59b4cb "2016-08-01T17:38:59.000+09:00"],
    ;;=> :owner
    ;;=> {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27",
    ;;=> :display-name "tokoma1"},
    ;;=> :name "makoto-bucket-3"}]

We can see that there are three buckets in the AWS console as well.

Let's delete two of the three buckets (here presumably with s3/delete-bucket) and list the buckets again:

    (s3/delete-bucket "makoto-bucket-2")
    (s3/delete-bucket "makoto-bucket-3")

    (s3/list-buckets)
    ;;=> [{:creation-date #object[org.joda.time.DateTime 0x56387509 "2016-08-01T07:01:05.000+09:00"],
    ;;=> :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27", :display-name "tokoma1"}, :name "makoto-bucket-1"}]

We can see only one bucket now.

Now we will demonstrate how to send your local data to S3. s3/put-object uploads the content of a file to the specified bucket and key. The following code uploads /etc/hosts to makoto-bucket-1:

    (s3/put-object :bucket-name "makoto-bucket-1" :key "test/hosts" :file (java.io.File. "/etc/hosts"))
    ;;=> {:requester-charged? false, :content-md5 "HkBljfktNTl06yScnMRsjA==",
    ;;=> :etag "1e40658df92d353974eb249c9cc46c8c", :metadata {:content-disposition nil,
    ;;=> :expiration-time-rule-id nil, :user-metadata nil, :instance-length 0, :version-id nil,
    ;;=> :server-side-encryption nil, :etag "1e40658df92d353974eb249c9cc46c8c", :last-modified nil,
    ;;=> :cache-control nil, :http-expires-date nil, :content-length 0, :content-type nil,
    ;;=> :restore-expiration-time nil, :content-encoding nil, :expiration-time nil, :content-md5 nil,
    ;;=> :ongoing-restore nil}}

s3/list-objects lists the objects in a bucket as follows:

    (s3/list-objects :bucket-name "makoto-bucket-1")
    ;;=> {:truncated? false, :bucket-name "makoto-bucket-1", :max-keys 1000, :common-prefixes [],
    ;;=> :object-summaries [{:storage-class "STANDARD", :bucket-name "makoto-bucket-1",
    ;;=> :etag "1e40658df92d353974eb249c9cc46c8c",
    ;;=> :last-modified #object[org.joda.time.DateTime 0x1b76029c "2016-08-01T07:01:16.000+09:00"],
    ;;=> :owner {:id "3d6e87f691897059c23bcfb88b17da55f0c9aa02cc2a44e461f1594337059d27",
    ;;=> :display-name "tokoma1"}, :key "test/hosts", :size 380}]}

To obtain the contents of objects in buckets, use s3/get-object:

    (s3/get-object :bucket-name "makoto-bucket-1" :key "test/hosts")
    ;;=> {:bucket-name "makoto-bucket-1", :key "test/hosts",
    ;;=> :input-stream #object[com.amazonaws.services.s3.model.S3ObjectInputStream 0x24f810e9
    ;;=> ......
    ;;=> :last-modified #object[org.joda.time.DateTime 0x79ad1ca9 "2016-08-01T07:01:16.000+09:00"],
    ;;=> :cache-control nil, :http-expires-date nil, :content-length 380, :content-type "application/octet-stream",
    ;;=> :restore-expiration-time nil, :content-encoding nil, :expiration-time nil, :content-md5 nil,
    ;;=> :ongoing-restore nil}}

The result is a map; the content is stream data, held as the value of :object-content. To get the result as a string, we will use slurp as follows:

    (slurp (:object-content (s3/get-object :bucket-name "makoto-bucket-1" :key "test/hosts")))
    ;;=> "127.0.0.1\tlocalhost\n127.0.1.1\tphenix\n\n# The following lines are desirable for IPv6 capable hosts\n::1 ip6-localhost ip6-loopback\nfe00::0 ip6-localnet\nff00::0 ip6-mcastprefix\nff02::1 ip6-allnodes\nff02::2 ip6-allrouters\n\n52.8.30.189 my-cluster01-proxy1 \n52.8.169.10 my-cluster01-master1 \n52.8.198.115 my-cluster01-slave01 \n52.9.12.12 my-cluster01-slave02\n\n52.8.197.100 my-node01\n"

Using Amazon SQS

Amazon SQS is a high-performance, high-availability, and scalable Queue Service.
We will demonstrate how easy it is to handle messages on queues in SQS using Clojure:

    (ns aws-examples.sqs-example
      (:require [amazonica.core :as core]
                [amazonica.aws.sqs :as sqs]))

To create a queue, you can use sqs/create-queue as follows:

    (sqs/create-queue :queue-name "makoto-queue"
                      :attributes {:VisibilityTimeout 3000
                                   :MaximumMessageSize 65536
                                   :MessageRetentionPeriod 1209600
                                   :ReceiveMessageWaitTimeSeconds 15})
    ;;=> {:queue-url "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"}

To get information about a queue, use sqs/get-queue-attributes as follows:

    (sqs/get-queue-attributes "makoto-queue")
    ;;=> {:QueueArn "arn:aws:sqs:ap-northeast-1:864062283993:makoto-queue", ...

You can configure a dead letter queue using sqs/assign-dead-letter-queue as follows:

    (sqs/create-queue "DLQ")
    ;;=> {:queue-url "https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"}
    (sqs/assign-dead-letter-queue (sqs/find-queue "makoto-queue") (sqs/find-queue "DLQ") 10)
    ;;=> nil

Let's list the queues we have defined:

    (sqs/list-queues)
    ;;=> {:queue-urls
    ;;=> ["https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"
    ;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"]}

The same queues are also visible in the SQS console.

Let's examine the URLs of the queues:

    (sqs/find-queue "makoto-queue")
    ;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/makoto-queue"
    (sqs/find-queue "DLQ")
    ;;=> "https://sqs.ap-northeast-1.amazonaws.com/864062283993/DLQ"

To send messages, we use sqs/send-message:

    (sqs/send-message (sqs/find-queue "makoto-queue") "hello sqs from Clojure")
    ;;=> {:md5of-message-body "00129c8cc3c7081893765352a2f71f97", :message-id "690ddd68-a2f6-45de-b6f1-164eb3c9370d"}

To receive messages, we use sqs/receive-message:

    (sqs/receive-message "makoto-queue")
    ;;=> {:messages [
    ;;=> {:md5of-body "00129c8cc3c7081893765352a2f71f97",
    ;;=> :receipt-handle "AQEB.....", :message-id "bd56fea8-4c9f-4946-9521-1d97057f1a06",
    ;;=> :body "hello sqs from Clojure"}]}

To remove all messages in your queues, we use sqs/purge-queue:

    (sqs/purge-queue :queue-url (sqs/find-queue "makoto-queue"))
    ;;=> nil

To delete queues, we use sqs/delete-queue:

    (sqs/delete-queue "makoto-queue")
    ;;=> nil
    (sqs/delete-queue "DLQ")
    ;;=> nil

Serverless Clojure with AWS Lambda

Lambda is an AWS product that allows you to run Clojure code without the hassle and expense of setting up and maintaining a server environment. Behind the scenes, there are still servers involved, but as far as you are concerned, it is a serverless environment. Upload a JAR and you are good to go. Code running on Lambda is invoked in response to an event, such as a file being uploaded to S3, or according to a specified schedule. In production environments, Lambda is normally used in a wider AWS deployment that includes standard server environments, to handle discrete computational tasks, particularly those that benefit from Lambda's horizontal scaling, which just happens with minimal configuration. For Clojurians working on a personal project, Lambda is a wonderful combination of power and limitation. Just how far can you hack Lambda given the constraints imposed by AWS?

Clojure namespace helloworld

Start off with a clean, empty project generated using lein new. From there, in your IDE of choice, configure a package and a new Clojure source file. In the following example, the package is com.sakkam.lambda and the source file uses the Clojure namespace helloworld. The entry point to your Lambda code is a Clojure function that is exposed as a method of a Java class using Clojure's gen-class. Similar to use and require, the gen-class function can be included in the Clojure ns definition, as in the following, or specified separately.
You can use any name you want for the handler function, as long as it matches the method name declared in :methods, but the prefix must be a hyphen unless an alternate prefix is specified with the :prefix option of gen-class:

    (ns com.sakkam.lambda.helloworld
      (:gen-class
       :methods [^:static [handler [String] String]]))

    ;; The function name is the declared method name (handler) with the "-" prefix
    (defn -handler [s]
      (println (str "Hello," s)))

From the command line, use lein uberjar to create a JAR that can be uploaded to AWS Lambda.

Hello World: the AWS part

Getting your Hello World to work is now a matter of creating a new Lambda within AWS, uploading your JAR, and configuring your handler.

Hello Stream

The handler method we used in our Hello World Lambda function was coded directly and could be extended to accept custom Java classes as part of the method signature. However, for more complex Java integrations, implementing one of AWS's standard interfaces for Lambda is both straightforward and feels more like idiomatic Clojure. The following example replaces our own definition of a handler method with an implementation of a standard interface that is provided as part of the aws-lambda-java-core library.

First of all, add the dependency [com.amazonaws/aws-lambda-java-core "1.0.0"] into your project.clj. While you are modifying your project.clj, also add in the dependency for [org.clojure/data.json "0.2.6"], since we will be manipulating JSON formatted objects as part of this exercise.
Then, either create a new Clojure namespace or modify your existing one so that it looks like the following (the handler function must be named -handleRequest since handleRequest is specified as part of the interface):

    (ns aws-examples.lambda-example
      (:gen-class
       :implements [com.amazonaws.services.lambda.runtime.RequestStreamHandler])
      (:require [clojure.java.io :as io]
                [clojure.data.json :as json]
                [clojure.string :as str]))

    (defn -handleRequest [this is os context]
      (let [w (io/writer os)
            parameters (json/read (io/reader is) :key-fn keyword)]
        (println "Lambda Hello Stream Output ")
        (println "this class: " (class this))
        (println "is class:" (class is))
        (println "os class:" (class os))
        (println "context class:" (class context))
        (println "Parameters are " parameters)
        ;; flush inside the let, where the writer w is in scope
        (.flush w)))

Use lein uberjar again to create a JAR file. Since we have an existing Lambda function in AWS, we can overwrite the JAR used in the Hello World example. Since the handler function name has changed, we must modify our Lambda configuration to match. This time, the default test that provides parameters in JSON format should work as is, and the result will show the output printed by the handler.

We can very easily get a more interesting test of Hello Stream by configuring this Lambda to run whenever a file is uploaded to S3. At the Lambda management page, choose the Event Sources tab, click on Add Event, and choose an S3 bucket to which you can easily add a file. Now, upload a file to the specified S3 bucket and then navigate to the logs of the Hello World Lambda function. You will find that Hello World has been automatically invoked, and a fairly complicated object that represents the uploaded file is supplied as a parameter to our Lambda function.

Real-world Lambdas

To graduate from a Hello World Lambda to real-world Lambdas, the chances are you are going to need richer integration with other AWS facilities.
As a minimum, you will probably want to write a file to an S3 bucket or publish a notification to an SNS topic. Amazon provides an SDK that makes this integration straightforward for developers using standard Java. For Clojurians, using Amazonica, the Clojure wrapper for the Amazon SDK, is a very fast and easy way to achieve the same.

How it works…

Here, we will explain how AWS works.

What Is Amazon EC2?

With EC2, we do not need to buy hardware or install an operating system. Amazon provides various types of instances to suit customers' use cases. Each instance type offers a different combination of CPU, memory, storage, and networking capacity. Some instance types are given in the following table. You can select the appropriate instances according to the characteristics of your application.

    Instance type   Description
    M4              Designed for general-purpose computing. This family provides a balance of CPU, memory, and network bandwidth.
    C4              Designed for applications that consume CPU resources. C4 offers the highest CPU performance for the price.
    R3              Designed for memory-intensive applications.
    G2              Equipped with NVIDIA GPUs and used for graphics applications and GPU computing applications such as deep learning.

The following table shows the models available for the M4 instance type. You can choose the one that best fits your needs.

    Model         vCPU   RAM (GiB)   EBS bandwidth (Mbps)
    m4.large         2           8                    450
    m4.xlarge        4          16                    750
    m4.2xlarge       8          32                  1,000
    m4.4xlarge      16          64                  2,000
    m4.10xlarge     40         160                  4,000

Amazon S3

Amazon S3 is storage for the cloud. It provides a simple web interface that allows you to store and retrieve data. The S3 API is easy to use while still ensuring security. S3 provides cloud storage services and is scalable, reliable, fast, and inexpensive.

Buckets and Keys

Buckets are containers for the objects stored in Amazon S3. A bucket name must be unique across all regions in the world.
So, bucket names are the top-level identifiers in S3, and buckets are the units for charges and access control.

Keys are the unique identifiers for objects within a bucket. Every object in a bucket has exactly one key. Keys are the second-level identifiers and must be unique within a bucket. To identify an object, you use the combination of bucket name and key name.

Objects

Objects are accessed by bucket name and key. Objects consist of data and metadata. Metadata is a set of name-value pairs that describe the characteristics of the object. Examples of metadata are the date last modified and the content type. Objects can have multiple versions of their data.

There's more…

It is clearly impossible to review all the different APIs for all the different services exposed via the Amazonica library, but you probably have the feeling of holding tremendous power in your hands right now. (Don't forget to give that credit card back to your boss now …)

Some other examples of Amazon services are as follows:

Amazon IoT: This offers a way for connected devices to interact easily and securely with cloud applications and other devices.

Amazon Kinesis: This gives you ways of easily loading massive volumes of streaming data into AWS and easily analyzing them through streaming techniques.

Summary

We hope you enjoyed this appetizer to the book Clojure Programming Cookbook, which presents a set of progressive readings to improve your Clojure skills, so that Clojure becomes your de facto everyday language for professional and efficient work. The book covers different topics of generic programming, always to the point and with some fun, so that each recipe feels less like a classroom and more like an enjoyable read, with challenging exercises left to the reader to gradually build up skills. See you in the book!


Packt

14 Oct 2016

28 min read

Fast Data Manipulation with R

