03 Mar 2015

17 min read

Basics of Programming in Julia

In this article by Ivo Balbaert, author of the book Getting Started with Julia Programming, we will explore how Julia interacts with the outside world: reading from standard input and writing to standard output, files, networks, and databases. Julia provides asynchronous networking I/O using the libuv library. We will see how to handle data in Julia. We will also discover the parallel processing model of Julia.

In this article, the following topics are covered:

- Working with files (including CSV files)
- Using DataFrames

Working with files

To work with files, we need the IOStream type. IOStream is a type with the supertype IO and has the following characteristics:

- The fields are given by names(IOStream):

      4-element Array{Symbol,1}:
       :handle
       :ios
       :name
       :mark

- The types are given by IOStream.types:

      (Ptr{None}, Array{Uint8,1}, String, Int64)

The file handle is a pointer of the type Ptr, which is a reference to the file object.

Opening and reading a line-oriented file with the name example.dat is very easy:

    # code in Chapter 8\io.jl
    fname = "example.dat"
    f1 = open(fname)

fname is a string that contains the path to the file, escaping special characters with \ when necessary; for example, in Windows, when the file is in the test folder on the D: drive, this would become "d:\\test\\example.dat". The f1 variable is now an IOStream(<file example.dat>) object.

To read all lines one after the other into an array, use data = readlines(f1), which returns:

    3-element Array{Union(ASCIIString,UTF8String),1}:
     "this is line 1.\r\n"
     "this is line 2.\r\n"
     "this is line 3."

For processing line by line, only a simple loop is now needed:

    for line in data
        println(line) # or process line
    end
    close(f1)

Always close the IOStream object to clean up and release resources. If you want to read the whole file into one string, use readall.
Use this only for relatively small files because of the memory consumption; this can also be a potential problem when using readlines.

There is a convenient shorthand with the do syntax for opening a file, applying a function process, and closing it automatically. This goes as follows (file is the IOStream object in this code):

    open(fname) do file
        process(file)
    end

The do block creates an anonymous function and passes it to open as its first argument. Thus, the previous code example is equivalent to open(process, fname). Use the same syntax for processing a file fname line by line without the memory overhead of the previous methods, for example:

    open(fname) do file
        for line in eachline(file)
            print(line) # or process line
        end
    end

Writing a file requires first opening it with a "w" flag, then writing strings to it with write, print, or println, and then closing the file handle, which flushes the IOStream object to the disk:

    fname = "example2.dat"
    f2 = open(fname, "w")
    write(f2, "I write myself to a file\n") # returns 25 (bytes written, including the newline)
    println(f2, "even with println!")
    close(f2)

Opening a file with the "w" option will clear the file if it exists. To append to an existing file, use "a".

To process all the files in the current folder (or a given folder as an argument to readdir()), use this for loop:

    for file in readdir()
        # process file
    end

Reading and writing CSV files

A CSV file is a comma-separated file. The data fields in each line are separated by commas "," or another delimiter such as semicolons ";". These files are the de facto standard for exchanging small and medium amounts of tabular data. Such files are structured so that one line contains data about one data object, so we need a way to read and process the file line by line. As an example, we will use the data file Chapter 8\winequality.csv, which contains 1,599 sample measurements with 12 data columns per sample, such as pH and alcohol, separated by semicolons.
In the following screenshot, you can see the top 20 rows:

In general, the readdlm function is used to read in the data from the CSV files:

    # code in Chapter 8\csv_files.jl:
    fname = "winequality.csv"
    data = readdlm(fname, ';')

The second argument is the delimiter character (here, it is ;). The resulting data is a 1600x12 Array{Any,2} array of the type Any because no common type could be found:

    "fixed acidity"   "volatile acidity"   "alcohol"   "quality"
    7.4               0.7                  9.4         5.0
    7.8               0.88                 9.8         5.0
    7.8               0.76                 9.8         5.0
    …

If the data file is comma separated, reading it is even simpler with the following command:

    data2 = readcsv(fname)

The problem with what we have done until now is that the headers (the column titles) were read as part of the data. Fortunately, we can pass the argument header=true to let Julia put the first line in a separate array. The data array then naturally gets the correct datatype, Float64. We can also specify the type explicitly, like this:

    data3 = readdlm(fname, ';', Float64, '\n', header=true)

The third argument here is the type of data, which is a numeric type, String, or Any. The next argument is the line separator character, and the fifth indicates whether or not there is a header line with the field (column) names. If so, then data3 is a tuple with the data as the first element and the header as the second, in our case, (1599x12 Array{Float64,2}, 1x12 Array{String,2}). (readdlm accepts other optional arguments; see its help entry.) In this case, the actual data is given by data3[1] and the header by data3[2].

Let's continue working with the variable data. The data forms a matrix, and we can get its rows and columns using the normal array-matrix syntax. For example, the third row is given by row3 = data[3, :], with data: 7.8 0.88 0.0 2.6 0.098 25.0 67.0 0.9968 3.2 0.68 9.8 5.0, representing the measurements for all the characteristics of a certain wine.
The measurements of a certain characteristic for all wines are given by a data column; for example, col3 = data[:, 3] represents the measurements of citric acid and returns a column vector:

    1600-element Array{Any,1}:
     "citric acid"
     0.0
     0.0
     0.04
     0.56
     0.0
     0.0
     …
     0.08
     0.08
     0.1
     0.13
     0.12
     0.47

If we need columns 2-4 (volatile acidity to residual sugar) for all wines, extract the data with x = data[:, 2:4]. If we need these measurements only for the wines on rows 70-75, get these with y = data[70:75, 2:4], returning a 6x3 Array{Any,2} output as follows:

    0.32    0.57   2.0
    0.705   0.05   1.9
    …
    0.675   0.26   2.1

To get a matrix with the data from columns 3, 6, and 11, execute the following command:

    z = [data[:,3] data[:,6] data[:,11]]

It would be useful to create a type Wine in the code. For example, if the data is to be passed around functions, it will improve the code quality to encapsulate all the data in a single data type, like this:

    type Wine
        fixed_acidity::Array{Float64}
        volatile_acidity::Array{Float64}
        citric_acid::Array{Float64}
        # other fields
        quality::Array{Float64}
    end

Then, we can create objects of this type to work with them, like in any other object-oriented language, for example, wine1 = Wine(data[1, :]...), where the elements of the row are splatted with the ... operator into the Wine constructor.

To write to a CSV file, the simplest way is to use the writecsv function for a comma separator, or the writedlm function if you want to specify another separator. For example, to write an array data to a file partial.dat, you need to execute the following command:

    writedlm("partial.dat", data, ';')

If more control is necessary, you can easily combine the more basic functions from the previous section.
For example, the following code snippet writes 10 tuples of three numbers each to a file:

    # code in Chapter 8\tuple_csv.jl
    fname = "savetuple.csv"
    csvfile = open(fname,"w")
    # writing headers:
    write(csvfile, "ColName A, ColName B, ColName C\n")
    for i = 1:10
        tup(i) = tuple(rand(Float64,3)...)
        write(csvfile, join(tup(i),","), "\n")
    end
    close(csvfile)

Using DataFrames

If you measure n variables (each of a different type) of a single object of observation, then you get a table with n columns for each object row. If there are m observations, then we have m rows of data. For example, given the student grades as data, you might want to compute the average grade for each socioeconomic group, where grade and socioeconomic group are both columns in the table, and there is one row per student.

The DataFrame is the most natural representation to work with such an (m x n) table of data. DataFrames are similar to pandas DataFrames in Python or data.frame in R. A DataFrame is a more specialized tool than a normal array for working with tabular and statistical data, and it is defined in the DataFrames package, a popular Julia library for statistical work. Install it in your environment by typing Pkg.add("DataFrames") in the REPL. Then, import it into your current workspace with using DataFrames. Do the same for the packages DataArrays and RDatasets (which contains a collection of example datasets mostly used in the R literature).

A common case in statistical data is that data values can be missing (the information is not known). The DataArrays package provides us with the unique value NA, which represents a missing value and has the type NAtype. The result of a computation that contains NA values mostly cannot be determined; for example, 42 + NA returns NA. (Julia v0.4 also has a new Nullable{T} type, which allows you to specify the type of a missing value.)
A DataArray{T} array is a data structure that can be n-dimensional, behaves like a standard Julia array, and can contain values of the type T, but it can also contain the missing (Not Available) values NA and can work efficiently with them. To construct one, use the @data macro:

    # code in Chapter 8\dataarrays.jl
    using DataArrays
    using DataFrames
    dv = @data([7, 3, NA, 5, 42])

This returns 5-element DataArray{Int64,1}: 7 3 NA 5 42. The sum of these numbers is given by sum(dv) and returns NA. One can also assign the NA values to the array with dv[5] = NA; then, dv becomes [7, 3, NA, 5, NA]. Converting this data structure to a normal array fails: convert(Array, dv) returns ERROR: NAException.

How do we get rid of these NA values, supposing we can do so safely? We can use the dropna function; for example, sum(dropna(dv)) returns 15. If you know that you can replace them with a value v, use the array function:

    repl = -1
    sum(array(dv, repl)) # returns 13

A DataFrame is a kind of in-memory database, versatile in the ways you can work with the data. It consists of columns with names such as Col1, Col2, Col3, and so on. Each of these columns is a DataArray that has its own type, and the data they contain can be referred to by the column names as well, so we have substantially more forms of indexing. Unlike two-dimensional arrays, columns in a DataFrame can be of different types. One column might, for instance, contain the names of students and should therefore be a string. Another column could contain their age and should be an integer.

We construct a DataFrame from the program data as follows:

    # code in Chapter 8\dataframes.jl
    using DataFrames
    # constructing a DataFrame:
    df = DataFrame()
    df[:Col1] = 1:4
    df[:Col2] = [e, pi, sqrt(2), 42]
    df[:Col3] = [true, false, true, false]
    show(df)

Notice that the column headers are used as symbols.
This returns the following 4 x 3 DataFrame object:

We could also have used the full constructor as follows:

    df = DataFrame(Col1 = 1:4, Col2 = [e, pi, sqrt(2), 42], Col3 = [true, false, true, false])

You can refer to the columns either by an index (the column number) or by a name; both of the following expressions return the same output:

    show(df[2])
    show(df[:Col2])

This gives the following output:

    [2.718281828459045, 3.141592653589793, 1.4142135623730951, 42.0]

To show the rows or subsets of rows and columns, use the familiar slice (:) syntax, for example:

- To get the first row, execute df[1, :]. This returns:

      1x3 DataFrame
      | Row | Col1 | Col2    | Col3 |
      |-----|------|---------|------|
      | 1   | 1    | 2.71828 | true |

- To get the second and third row, execute df[2:3, :].
- To get only the second column from the previous result, execute df[2:3, :Col2]. This returns [3.141592653589793, 1.4142135623730951].
- To get the second and third column from the second and third row, execute df[2:3, [:Col2, :Col3]], which returns the following output:

      2x2 DataFrame
      | Row | Col2    | Col3  |
      |-----|---------|-------|
      | 1   | 3.14159 | false |
      | 2   | 1.41421 | true  |

The following functions are very useful when working with DataFrames:

- The head(df) and tail(df) functions show you the first six and the last six lines of data, respectively.
- The names function gives the names of the columns: names(df) returns 3-element Array{Symbol,1}: :Col1 :Col2 :Col3.
- The eltypes function gives the data types of the columns: eltypes(df) returns 3-element Array{Type{T<:Top},1}: Int64 Float64 Bool.
- The describe function tries to give some useful summary information about the data in the columns, depending on the type; for example, describe(df) gives for column 2 (which is numeric) the min, max, median, mean, number, and percentage of NAs:

      Col2
      Min      1.4142135623730951
      1st Qu.  2.392264761937558
      Median   2.929937241024419
      Mean     12.318522011105483
      3rd Qu.  12.856194490192344
      Max      42.0
      NAs      0
      NA%      0.0%

To load in data from a local CSV file, use the method readtable. The returned object is of type DataFrame:

    # code in Chapter 8\dataframes.jl
    using DataFrames
    fname = "winequality.csv"
    data = readtable(fname, separator = ';')
    typeof(data) # DataFrame
    size(data) # (1599,12)

Here is a fraction of the output:

The readtable method also supports reading in gzipped CSV files.

Writing a DataFrame to a file can be done with the writetable function, which takes the filename and the DataFrame as arguments, for example, writetable("dataframe1.csv", df). By default, writetable will use the delimiter specified by the filename extension and write the column names as headers.

Both readtable and writetable support numerous options for special cases. Refer to the docs for more information (http://dataframesjl.readthedocs.org/en/latest/).

To demonstrate some of the power of DataFrames, here are some queries you can do:

- Make a vector with only the quality information: data[:quality].
- Give the wines with alcohol percentage equal to 9.5, for example, data[data[:alcohol] .== 9.5, :]. Here, we use the .== operator, which does element-wise comparison. data[:alcohol] .== 9.5 returns an array of Boolean values (true for data points where :alcohol is 9.5, and false otherwise). data[boolean_array, :] selects those rows where boolean_array is true.
- Count the number of wines grouped by quality with by(data, :quality, data -> size(data, 1)), which returns the following:

      6x2 DataFrame
      | Row | quality | x1  |
      |-----|---------|-----|
      | 1   | 3       | 10  |
      | 2   | 4       | 53  |
      | 3   | 5       | 681 |
      | 4   | 6       | 638 |
      | 5   | 7       | 199 |
      | 6   | 8       | 18  |

The DataFrames package contains the by function, which takes in three arguments:

- A DataFrame; here, data
- A column to split the DataFrame on; here, quality
- A function or an expression to apply to each subset of the DataFrame; here, data -> size(data, 1), which gives us the number of wines for each quality value

Another easy way to get the distribution among quality is to execute the histogram function hist: hist(data[:quality]) gives the counts over the range of quality, (2.0:1.0:8.0,[10,53,681,638,199,18]). More precisely, this is a tuple with the first element corresponding to the edges of the histogram bins, and the second denoting the number of items in each bin. So there are, for example, 10 wines with quality between 2 and 3, and so on.

To extract the counts as a variable count of type Vector, we can execute _, count = hist(data[:quality]); the _ means that we neglect the first element of the tuple. To obtain the quality classes as a DataArray class, we will execute the following:

    class = sort(unique(data[:quality]))

We can now construct a df_quality DataFrame with the class and count columns as df_quality = DataFrame(qual=class, no=count). This gives the following output:

    6x2 DataFrame
    | Row | qual | no  |
    |-----|------|-----|
    | 1   | 3    | 10  |
    | 2   | 4    | 53  |
    | 3   | 5    | 681 |
    | 4   | 6    | 638 |
    | 5   | 7    | 199 |
    | 6   | 8    | 18  |

To deepen your understanding and learn about the other features of Julia DataFrames (such as joining, reshaping, and sorting), refer to the documentation available at http://dataframesjl.readthedocs.org/en/latest/.

Other file formats

Julia can work with other file formats through specialized packages:

- For JSON, use the JSON package. The parse method converts JSON strings into Dictionaries, and the json method turns any Julia object into a JSON string.
- For XML, use the LightXML package.
- For YAML, use the YAML package.
- For HDF5 (a common format for scientific data), use the HDF5 package.
- For working with Windows INI files, use the IniFile package.

Summary

In this article, we discussed working with files, CSV data, and DataFrames in Julia.

Resources for Article:

Further resources on this subject:

- Getting Started with Electronic Projects? [article]
- Getting Started with Selenium Webdriver and Python [article]
- Handling The Dom In Dart [article]

Packt

03 Mar 2015

11 min read

Getting Started with PostgreSQL

Packt

03 Mar 2015

13 min read

Performance Considerations

Packt

03 Mar 2015

28 min read

Elasticsearch Administration

Packt

03 Mar 2015

11 min read

MapReduce functions

In this article by John Zablocki, author of the book Couchbase Essentials, you will be introduced to MapReduce and how you'll use it to create secondary indexes for your documents.

At its simplest, MapReduce is a programming pattern used to process large amounts of data that is typically distributed across several nodes in parallel. In the NoSQL world, MapReduce implementations may be found on many platforms from MongoDB to Hadoop, and of course, Couchbase.

Even if you're new to the NoSQL landscape, it's quite possible that you've already worked with a form of MapReduce. The inspiration for MapReduce in distributed NoSQL systems was drawn from the functional programming concepts of map and reduce. While purely functional programming languages haven't quite reached mainstream status, languages such as Python, C#, and JavaScript all support map and reduce operations.

Map functions

Consider the following Python snippet:

    numbers = [1, 2, 3, 4, 5]
    doubled = list(map(lambda n: n * 2, numbers))
    # doubled == [2, 4, 6, 8, 10]

These two lines of code demonstrate a very simple use of a map() function. In the first line, the numbers variable is created as a list of integers. The second line applies a function to the list to create a new mapped list (in Python 3, map() returns a lazy iterator, so it is wrapped in list() here). In this case, the function passed to map() is supplied as a Python lambda, which is just an inline, unnamed function. The body of the lambda multiplies each number by two.

This map() function can be made slightly more complex by doubling only odd numbers, as shown in this code:

    numbers = [1, 2, 3, 4, 5]

    def double_odd(num):
        if num % 2 == 0:
            return num
        else:
            return num * 2

    doubled = list(map(double_odd, numbers))
    # doubled == [2, 2, 6, 4, 10]

Map functions are implemented differently in each language or platform that supports them, but all follow the same pattern. An iterable collection of objects is passed to a map function.
Each item of the collection is then iterated over, with the supplied function being applied to each item in turn. The final result is a new collection where each of the original items is transformed by the map.

Reduce functions

Like maps, reduce functions also work by applying a provided function to an iterable data structure. The key difference between the two is that the reduce function works to produce a single value from the input iterable. Using Python's reduce() function (a built-in in Python 2; in Python 3 it is imported from functools), we can see how to produce a sum of integers, as follows:

    from functools import reduce

    numbers = [1, 2, 3, 4, 5]
    total = reduce(lambda x, y: x + y, numbers)
    # total == 15

You probably noticed that unlike our map operation, the reduce lambda has two parameters (x and y in this case). The argument passed to x will be the accumulated value of all applications of the function so far, and y will receive the next value to be added to the accumulation. Written with parentheses, the order of operations can be seen as ((((1 + 2) + 3) + 4) + 5). Alternatively, the steps are shown in the following list:

- x = 1, y = 2
- x = 3, y = 3
- x = 6, y = 4
- x = 10, y = 5
- x = 15

As this list demonstrates, the value of x is the cumulative sum of previous x and y values. As such, reduce functions are sometimes termed accumulate or fold functions. Regardless of their name, reduce functions serve the common purpose of combining pieces of a recursive data structure to produce a single value.

Couchbase MapReduce

Creating an index (or view) in Couchbase requires creating a map function written in JavaScript. When the view is created for the first time, the map function is applied to each document in the bucket containing the view. When you update a view, only new or modified documents are indexed. This behavior is known as incremental MapReduce.

You can think of a basic map function in Couchbase as being similar to a SQL CREATE INDEX statement. Effectively, you are defining a column, or a set of columns, to be indexed by the server.
Of course, these are not columns, but rather properties of the documents to be indexed.

Basic mapping

To illustrate the process of creating a view, first imagine that we have a set of JSON documents as shown here:

    var books = [
        { "id": 1, "title": "The Bourne Identity", "author": "Robert Ludlow" },
        { "id": 2, "title": "The Godfather", "author": "Mario Puzzo" },
        { "id": 3, "title": "Wiseguy", "author": "Nicholas Pileggi" }
    ];

Each document contains title and author properties. In Couchbase, to query these documents by either title or author, we'd first need to write a map function. Without considering how map functions are written in Couchbase, we're able to understand the process with vanilla JavaScript:

    books.map(function(book) {
        return book.author;
    });

In the preceding snippet, we're making use of the built-in JavaScript array's map() function. Similar to the Python snippets we saw earlier, JavaScript's map() function takes a function as a parameter and returns a new array with mapped objects. In this case, we'll have an array with each book's author, as follows:

    ["Robert Ludlow", "Mario Puzzo", "Nicholas Pileggi"]

At this point, we have a mapped collection that will be the basis for our author index. However, we haven't provided a means for the index to be able to refer back to its original document. If we were using a relational database, we'd have effectively created an index on the Author column with no way to get back to the row that contained it. With a slight modification to our map function, we are able to provide the key (the id property) of the document as well in our index:

    books.map(function(book) {
        return [book.author, book.id];
    });

In this slightly modified version, we're including the ID with the output of each author. In this way, the index has its document's key stored with its author:

    [["Robert Ludlow", 1], ["Mario Puzzo", 2], ["Nicholas Pileggi", 3]]

We'll soon see how this structure more closely resembles the values stored in a Couchbase index.
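The idea of storing the document key next to the indexed value can also be sketched in Python, the language of the earlier map examples. This is only an illustration under stated assumptions, not how Couchbase stores its views: build_author_index and lookup are hypothetical helper names, and a sorted list of (author, id) pairs stands in for the index.

```python
# Illustrative sketch only: a sorted list of (author, id) pairs stands in
# for an index; build_author_index and lookup are hypothetical helpers.
books = [
    {"id": 1, "title": "The Bourne Identity", "author": "Robert Ludlow"},
    {"id": 2, "title": "The Godfather", "author": "Mario Puzzo"},
    {"id": 3, "title": "Wiseguy", "author": "Nicholas Pileggi"},
]


def build_author_index(docs):
    """Map each document to an (author, id) pair and sort the pairs by author."""
    return sorted((doc["author"], doc["id"]) for doc in docs)


def lookup(index, docs, author):
    """Follow the ids stored in the index back to the original documents."""
    by_id = {doc["id"]: doc for doc in docs}
    return [by_id[doc_id] for key, doc_id in index if key == author]


author_index = build_author_index(books)
matches = lookup(author_index, books, "Mario Puzzo")
print(author_index)
print([doc["title"] for doc in matches])
```

Because every index entry carries the id, a lookup by author can return the full document rather than just the author string, which is the back-reference the plain author-only mapping lacked.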
Basic reducing

Not every Couchbase index requires a reduce component. In fact, we'll see that Couchbase already comes with built-in reduce functions that will provide you with most of the reduce behavior you need. However, before relying on only those functions, it's important to understand why you'd use a reduce function in the first place.

Returning to the preceding example of the map, let's imagine we have a few more documents in our set, as follows:

    var books = [
        { "id": 1, "title": "The Bourne Identity", "author": "Robert Ludlow" },
        { "id": 2, "title": "The Bourne Ultimatum", "author": "Robert Ludlow" },
        { "id": 3, "title": "The Godfather", "author": "Mario Puzzo" },
        { "id": 4, "title": "The Bourne Supremacy", "author": "Robert Ludlow" },
        { "id": 5, "title": "The Family", "author": "Mario Puzzo" },
        { "id": 6, "title": "Wiseguy", "author": "Nicholas Pileggi" }
    ];

We'll still create our index using the same map function because it provides a way of accessing a book by its author. Now imagine that we want to know how many books an author has written, or (assuming we had more data) the average number of pages written by an author. These questions are not possible to answer with a map function alone. Each application of the map function knows nothing about the previous application. In other words, there is no way for you to compare or accumulate information about one author's book to another book by the same author.

Fortunately, there is a solution to this problem. As you've probably guessed, it's the use of a reduce function. As a somewhat contrived example, consider this JavaScript:

    var mapped = books.map(function (book) {
        return [book.id, book.author];
    });

    var counts = {};
    var reduced = mapped.reduce(function(prev, cur, idx, arr) {
        var key = cur[1]; // the author
        if (!counts[key]) counts[key] = 0;
        ++counts[key];
    }, null);
    // The callback returns nothing, so reduced is undefined;
    // the per-author tallies accumulate in counts.

This code doesn't quite accurately reflect the way you would count books with Couchbase, but it illustrates the basic idea.
You look for each occurrence of a key (author) and increment a counter when it is found. With Couchbase MapReduce, the mapped structure is supplied to the reduce() function in a better format. You won't need to keep track of items in a dictionary.

Couchbase views

At this point, you should have a general sense of what MapReduce is, where it came from, and how it will affect the creation of a Couchbase Server view. So without further ado, let's see how to write our first Couchbase view. When installing Couchbase Server, you are offered sample buckets; in fact, there were two to choose from. The bucket we'll use is beer-sample. If you didn't install it, don't worry. You can add it by opening the Couchbase Console and navigating to the Settings tab. Here, you'll find the option to install the bucket, as shown next:

First, you need to understand the document structures with which you're working. The following JSON object is a beer document (abbreviated for brevity):

    {
        "name": "Sundog",
        "type": "beer",
        "brewery_id": "new_holland_brewing_company",
        "description": "Sundog is an amber ale...",
        "style": "American-Style Amber/Red Ale",
        "category": "North American Ale"
    }

As you can see, the beer documents have several properties. We're going to create an index to let us query these documents by name. In SQL, the query would look like this:

    SELECT Id FROM Beers WHERE Name = ?

You might be wondering why the SQL example includes only the Id column in its projection. For now, just know that to query a document using a view with Couchbase, the property by which you're querying must be included in an index. To create that index, we'll write a map function. The simplest example of a map function to query beer documents by name is as follows:

    function(doc) {
        emit(doc.name);
    }

The body of this map function has only one line. It calls the built-in Couchbase emit() function. This function is used to signal that a value should be indexed. The output of this map function will be an array of names.

The beer-sample bucket includes brewery data as well.
These documents look like the following code (abbreviated for brevity):

    {
        "name": "Thomas Hooker Brewing",
        "city": "Bloomfield",
        "state": "Connecticut",
        "website": "http://www.hookerbeer.com/",
        "type": "brewery"
    }

If we reexamine our map function, we'll see an obvious problem: both the brewery and beer documents have a name property. When this map function is applied to the documents in the bucket, it will create an index that mixes names from both brewery and beer documents. The problem is that Couchbase documents exist in a single container, the bucket. There is no namespace for a set of related documents. The solution has typically involved including a type or docType property on each document. The value of this property is used to distinguish one kind of document from another. In the case of the beer-sample database, beer documents have type = "beer" and brewery documents have type = "brewery". Therefore, we are easily able to modify our map function to create an index only on beer documents:

    function(doc) {
        if (doc.type == "beer") {
            emit(doc.name);
        }
    }

The emit() function actually takes two arguments. The first, as we've seen, is the value to be indexed. The second argument is an optional value that is used by the reduce function.

Imagine that we want to count the number of beers in a particular category. In SQL, we would write the following query:

    SELECT Category, COUNT(*) FROM Beers GROUP BY Category

To achieve the same functionality with Couchbase Server, we'll need to use both map and reduce functions. First, let's write the map. It will create an index on the category property:

    function(doc) {
        if (doc.type == "beer") {
            emit(doc.category, 1);
        }
    }

The only real difference between our category index and our name index is that we're including an argument for the value parameter of the emit() function. What we'll do with these values is simply count them.
This counting will be done in our reduce function:

    function(keys, values) {
        return values.length;
    }

In this example, the values parameter will be given to the reduce function as a list of all values associated with a particular key. In our case, for each beer category, there will be a list of ones (that is, [1, 1, 1, 1, 1, 1]). Couchbase also provides a built-in _count function. It can be used in place of the entire reduce function in the preceding example.

Now that we've seen the basic requirements when creating an actual Couchbase view, it's time to add a view to our bucket. The easiest way to do so is to use the Couchbase Console.

Summary

In this article, you learned the purpose of secondary indexes in a key/value store. We dug deep into MapReduce, both in terms of its history in functional languages and as a tool for NoSQL and big data systems.

Resources for Article:

Further resources on this subject:

- Map Reduce? [article]
- Introduction to Mapreduce [article]
- Working with Apps Splunk [article]
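To tie the category map and reduce steps from this article together, here is a minimal Python sketch that simulates the count locally. It is an illustration under stated assumptions, not Couchbase's API: every sample document except Sundog and Thomas Hooker Brewing is invented for the example, and map_doc, reduce_count, and run_view are hypothetical names. The grouping step inside run_view stands in, in simplified form, for what the server does between map and reduce.

```python
from collections import defaultdict

# Sample documents: "Sundog" and "Thomas Hooker Brewing" come from the
# article; the two "Example ..." beers are invented for illustration.
docs = [
    {"name": "Sundog", "type": "beer", "category": "North American Ale"},
    {"name": "Example Pale Ale", "type": "beer", "category": "North American Ale"},
    {"name": "Example Lager", "type": "beer", "category": "German Lager"},
    {"name": "Thomas Hooker Brewing", "type": "brewery"},
]


def map_doc(doc, emit):
    """Mirror of the article's map function: emit (category, 1) for beers only."""
    if doc.get("type") == "beer":
        emit(doc["category"], 1)


def reduce_count(keys, values):
    """Mirror of the article's reduce function: count the emitted values."""
    return len(values)


def run_view(documents, map_fn, reduce_fn):
    """Apply map_fn to every document, group emitted values by key, then reduce.

    Simplification: keys is passed as the group key repeated once per value.
    """
    grouped = defaultdict(list)

    def emit(key, value=None):
        grouped[key].append(value)

    for doc in documents:
        map_fn(doc, emit)
    return {key: reduce_fn([key] * len(values), values)
            for key, values in sorted(grouped.items())}


category_counts = run_view(docs, map_doc, reduce_count)
print(category_counts)
```

The brewery document is skipped by the type check, so only beers contribute to the per-category totals, which matches the intent of the SQL GROUP BY query shown earlier.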

Packt

03 Mar 2015

14 min read

Introducing Splunk

In this article by Betsy Page Sigman, author of the book Splunk Essentials, we introduce Splunk, whose name was inspired by the process of exploring caves, or spelunking. Splunk helps analysts, operators, programmers, and many others explore data from their organizations by obtaining, analyzing, and reporting on it. This multinational company, cofounded by Michael Baum, Rob Das, and Erik Swan, has a core product called Splunk Enterprise. It manages searches, inserts, deletes, and filters, and analyzes big data that is generated by machines, as well as other types of data. The company also has a free version that has most of the capabilities of Splunk Enterprise and is an excellent learning tool.

Understanding events, event types, and fields in Splunk

An understanding of events and event types is important before going further.

Events

In Splunk, an event is not just one of the many local user meetings that are set up between developers to help each other out (although those can be very useful); it also refers to a record of one activity that is recorded in a log file. Each event usually has:

- A timestamp indicating the date and exact time the event was created
- Information about what happened on the system that is being tracked

Event types

An event type is a way to allow users to categorize similar events. It is a field defined by the user. You can define an event type in several ways, and the easiest way is by using the SplunkWeb interface. One common reason for setting up an event type is to examine why a system has failed. Logins are often problematic for systems, and a search for failed logins can help pinpoint problems. For an interesting example of how to save a search on failed logins as an event type, visit http://docs.splunk.com/Documentation/Splunk/6.1.3/Knowledge/ClassifyAndGroupSimilarEvents#Save_a_search_as_a_new_event_type.

Why are events and event types so important in Splunk?
Because without events, there would be nothing to search, of course. And event types allow us to make meaningful searches easily and quickly according to our needs, as we'll see later.

Sourcetypes

Sourcetypes are also important to understand, as they help define the rules for an event. A sourcetype is one of the default fields that Splunk assigns to data as it comes into the system. It determines what type of data it is so that Splunk can format it appropriately as it indexes it. This also allows the user who wants to search the data to easily categorize it. Some of the common sourcetypes are listed as follows:

- access_combined, for NCSA combined format HTTP web server logs
- apache_error, for standard Apache web server error logs
- cisco_syslog, for the standard syslog produced by Cisco network devices (including PIX firewalls, routers, and ACS), usually via remote syslog to a central log host
- websphere_core, a core file export from WebSphere

(Source: http://docs.splunk.com/Documentation/Splunk/latest/Data/Whysourcetypesmatter)

Fields

Each event in Splunk is associated with a number of fields. The core fields of host, source, sourcetype, and timestamp are key to Splunk. These fields are extracted from events at multiple points in the data processing pipeline that Splunk uses, and each of these fields includes a name and a value. The name describes the field (such as the userid) and the value says what that field's value is (susansmith, for example). Some of these fields are default fields that are given because of where the event came from or what it is. Splunk uses these fields when it processes data, both at index time and at search time. For indexing, the default fields added include those of host, source, and sourcetype. When searching, Splunk is able to select from a bevy of fields that can either be defined by the user or are very basic, such as an action that results in a purchase (for a website event).
Fields are essential for doing the basic work of Splunk, that is, indexing and searching.

Getting data into Splunk

It's time to spring into action now and input some data into Splunk. Adding data is simple, easy, and quick. In this section, we will use some data and tutorials created by Splunk to learn how to add data.

Firstly, to obtain your data, visit the tutorial data page at http://docs.splunk.com/Documentation/Splunk/6.1.5/SearchTutorial/GetthetutorialdataintoSplunk, which is readily available from Splunk. Here, download the file tutorialdata.zip. Note that this will be a fresh dataset that has been collected over the last 7 days. Download it but don't extract the data from it just yet.

You then need to log in to Splunk, using admin as the username along with your password.

Once logged in, you will notice that toward the upper-right corner of your screen is the button Add Data, as shown in the following screenshot. Click on this button:

(Screenshot: Button to Add Data)

Once you have clicked on this button, you'll see a screen similar to the following screenshot:

(Screenshot: Add Data to Splunk by Choosing a Data Type or Data Source)

Notice here the different types of data that you can select, as well as the different data sources. Since the data we're going to use is a file, under Or Choose a Data Source, click on From files and directories.

Once you have clicked on this, you can then click on the radio button next to Skip preview, as indicated in the following screenshot, since you don't need to preview the data now. You then need to click on Continue:

(Screenshot: Preview data)

You can download the tutorial files at: http://docs.splunk.com/Documentation/Splunk/6.1.5/SearchTutorial/GetthetutorialdataintoSplunk

As shown in the next screenshot, click on Upload and index a file, find the tutorialdata.zip file you just downloaded (it is probably in your Downloads folder), and then click on More settings, filling it in as shown in the following screenshot.
(Note that you will need to select Segment in path under Host and type 1 under Segment Number.) Click on Save when you are done:

(Screenshot: Can specify source, additional settings, and source type)

Following this, you should see a screen similar to the following screenshot. Click on Start Searching to look at the data now:

(Screenshot: You should see this if your data has been successfully indexed into Splunk.)

You will now see a screen similar to the following screenshot. Notice that the number of events you have will be different, as will the time of the earliest event. At this point, click on Data Summary:

(Screenshot: The Search screen)

You should see the Data Summary screen like in the following screenshot. However, note that the Hosts shown here will not be the same as the ones you get. Take a quick look at what is on the Sources tab and the Sourcetypes tab. Then find the most recent data (in this case 127.0.0.1) and click on it.

(Screenshot: Data Summary, where you can see Hosts, Sources, and Sourcetypes)

After clicking on the most recent data, which in this case is bps-T341s, look at the events contained there. Later, when we use streaming data, we can see how the events at the top of this list change rapidly. Here, you will see a listing of events, similar to those shown in the following screenshot:

(Screenshot: Events lists for the host value)

You can click on the Splunk logo in the upper-left corner of the web page to return to the home page. Under Administrator at the top-right of the page, click on Logout.

Searching Twitter data

We will start here by doing a simple search of our Twitter index, which is automatically created by the app once you have enabled Twitter input (as explained previously). In our earlier searches, we used the default index (which the tutorial data was downloaded to), so we didn't have to specify the index we wanted to use. Here, we will use just the Twitter index, so we need to specify that in the search.
A simple search

Imagine that we wanted to search for tweets containing the word coffee. We could use the code presented here and place it in the search bar:

    index=twitter text=*coffee*

The preceding code searches only your Twitter index and finds all the places where the word coffee is mentioned. (Note that the text field is not case sensitive, so tweets with either "coffee" or "Coffee" will be included in the search results.) The asterisks are included before and after the text "coffee" because otherwise we would only get events where just "coffee" was tweeted, a rather rare occurrence, we expect. In fact, when we searched our indexed Twitter data without the asterisks around coffee, we got no results.

Examining the Twitter event

Before going further, it is useful to stop and closely examine the events that are collected as part of the search. The sample tweet shown in the following screenshot shows the large number of fields that are part of each tweet. The > was clicked to expand the event:

(Screenshot: A Twitter event)

There are several items to look closely at here:

- _time: Splunk assigns a timestamp for every event. This is done in UTC (Coordinated Universal Time) time format.
- contributors: The value for this field is null, as are the values of many Twitter fields.
- retweeted_status: Notice the {+} here; in the following event list, you will see there are a number of fields associated with this, which can be seen when the + is selected and the list is expanded. This is the case wherever you see a {+} in a list of fields:

(Screenshot: Various retweet fields)

In addition to those shown previously, there are many other fields associated with a tweet. The 140 character (maximum) text field that most people consider to be the tweet is actually a small part of the actual data collected.

The implied AND

If you want to search on more than one term, there is no need to add AND as it is already implied.
If, for example, you want to search for all tweets that include both the text "coffee" and the text "morning", then use:

    index=twitter text=*coffee* text=*morning*

If you don't specify text= for the second term and just put *morning*, Splunk assumes that you want to search for *morning* in any field. Therefore, you could get that word in another field in an event. This isn't very likely in this case, although coffee could conceivably be part of a user's name, such as "coffeelover". But if you were searching for other text strings, such as a computer term like log or error, such terms could be found in a number of fields. So specifying the field you are interested in would be very important.

The need to specify OR

Unlike AND, you must always specify the word OR. For example, to obtain all events that mention either coffee or morning, enter:

    index=twitter text=*coffee* OR text=*morning*

Finding other words used

Sometimes you might want to find out what other words are used in tweets about coffee. You can do that with the following search:

    index=twitter text=*coffee* | makemv text | mvexpand text | top 30 text

This search first searches for the word "coffee" in a text field, then creates a multivalued field from the tweet, and then expands it so that each word is treated as a separate piece of text. Then it takes the top 30 words that it finds.

You might be asking yourself how you would use this kind of information. This type of analysis would be of interest to a marketer, who might want to use words that appear to be associated with coffee in composing the script for an advertisement. The following screenshot shows the results that appear (1 of 2 pages). From this search, we can see that the words love, good, and cold might be words worth considering:

(Screenshot: Search of top 30 text fields found with *coffee*)

When you do a search like this, you will notice that there are a lot of filler words (a, to, for, and so on) that appear. You can do two things to remedy this.
You can increase the limit for top words so that you can see more of the words that come up, or you can rerun the search with the filler words excluded.

Note that "Coffee" (with a capital C) is listed (on the unshown second page) separately from "coffee". The reason for this is that while the search is not case sensitive (thus both "coffee" and "Coffee" are picked up when you search on "coffee"), the process of putting the text fields through the makemv and the mvexpand processes ends up distinguishing on the basis of case.

We could rerun the search, excluding some of the filler words, using the code shown here:

    index=twitter text=*coffee* | makemv text | mvexpand text | search NOT text="RT" AND NOT text="a" AND NOT text="to" AND NOT text="the" | top 30 text

Using a lookup table

Sometimes it is useful to use a lookup file to avoid having to use repetitive code. It would help us to have a list of all the small words that appear often in tweets simply because they are frequent in the language, so that we can eliminate them from our quest to find words that would be relevant for use in the creation of advertising. If we had a file of such small words, we could use a command indicating not to use any of these more common, irrelevant words when listing the top 30 words associated with our search topic of interest. Thus, for our search for words associated with the text "coffee", we would be interested in words like "dark", "flavorful", and "strong", but not words like "a", "the", and "then".

We can do this using a lookup command. There are three types of lookup commands, each listed here with its description:

lookup: Matches a value of one field with a value of another, based on a .csv file with the two fields. Consider a lookup table named lutable that contains fields for machine_name and owner. When the following code snippet is used after a preceding search (indicated by . . . |), Splunk will use the lookup table to match the owner's name with its machine_name and add the machine_name to each event:

    . . . | lookup lutable owner

inputlookup: All fields in the .csv file are returned as results. If the following code snippet is used, both machine_name and owner would be searched:

    . . . | inputlookup lutable

outputlookup: Outputs search results to a lookup table. The following code outputs results from the preceding search directly into a table it creates:

    . . . | outputlookup newtable.csv

The command we will use here is inputlookup, because we want to reference a .csv file we can create that will include words that we want to filter out as we seek to find possible advertising words associated with coffee. Let's call the .csv file filtered_words.csv, and give it just a single text field, containing words like "is", "the", and "then". Let's rewrite the search to look like the following code:

    index=twitter text=*coffee* | makemv text | mvexpand text | search NOT [inputlookup filtered_words | fields text ] | top 30 text

Using the preceding code, Splunk will search our Twitter index for *coffee*, and then expand the text field so that individual words are separated out. Then it will look for words that do NOT match any of the words in our filtered_words.csv file, and finally output the top 30 most frequently found words among those.

As you can see, the lookup table can be very useful. To learn more about Splunk lookup tables, go to http://docs.splunk.com/Documentation/Splunk/6.1.5/SearchReference/Lookup.

Summary

In this article, we have learned more about how to use Splunk to create reports and dashboards. Splunk Enterprise Software, or Splunk, is an extremely powerful tool for searching, exploring, and visualizing data of all types. Splunk is becoming increasingly popular, as more and more businesses, both large and small, discover its ease and usefulness.
Analysts, managers, students, and others can quickly learn how to use the data from their systems, networks, web traffic, and social media to make attractive and informative reports. This is a straightforward, practical, and quick introduction to Splunk that should have you making reports and gaining insights from your data in no time. Resources for Article: Further resources on this subject: Lookups [article] Working with Apps in Splunk [article] Loading data, creating an app, and adding dashboards and reports in Splunk [article]
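The word-frequency pipeline used in this article (split each tweet into words, drop filler words, keep the most frequent ones) can also be sketched outside Splunk. The following Python snippet is only an analogy to the makemv, mvexpand, search NOT, and top steps, not a Splunk API; the function name top_words, the sample tweets, and the filler-word set are made up for illustration.

```python
from collections import Counter

# Hypothetical sample data standing in for the text field of indexed tweets.
tweets = [
    "RT I love coffee in the morning",
    "Coffee to go",
    "cold coffee is good coffee",
]

# Hypothetical filler words, playing the role of filtered_words.csv.
FILLER = {"RT", "a", "to", "the"}

def top_words(texts, stop_words, limit=30):
    """Count words across all texts, skipping stop words, and return the most common."""
    counts = Counter(
        word
        for text in texts            # one event per tweet
        for word in text.split()     # like makemv + mvexpand: one value per word
        if word not in stop_words    # like search NOT [...]
    )
    return counts.most_common(limit)  # like top <limit> text

for word, count in top_words(tweets, FILLER, limit=5):
    print(word, count)
```

As in the Splunk search, the counting here is case sensitive, so "Coffee" and "coffee" are tallied separately.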

Packt

03 Mar 2015

14 min read

SciPy for Signal Processing

In this article by Sergio J. Rojas G. and Erik A Christensen, authors of the book Learning SciPy for Numerical and Scientific Computing - Second Edition, we will focus on the usage of some of the most commonly used routines that are included in three SciPy modules: scipy.signal, scipy.ndimage, and scipy.fftpack, which are used for signal processing, multidimensional image processing, and computing Fourier transforms, respectively.

We define a signal as data that measures either a time-varying or spatially varying phenomenon. Sound or electrocardiograms are excellent examples of time-varying quantities, while images embody the quintessential spatially varying cases. Moving images are treated with the techniques of both types of signals.

The field of signal processing treats four aspects of this kind of data: its acquisition, quality improvement, compression, and feature extraction. SciPy has many routines to handle tasks in any of these four areas effectively. All these are included in two low-level modules (scipy.signal being the main module, with an emphasis on time-varying data, and scipy.ndimage, for images). Many of the routines in these two modules are based on the Discrete Fourier Transform of the data. SciPy has an extensive package of applications and definitions of these background algorithms, scipy.fftpack, which we will cover first.

Discrete Fourier Transforms

The Discrete Fourier Transform (DFT from now on) transforms any signal from its time/space domain into a related signal in the frequency domain. This allows us not only to analyze the different frequencies of the data, but also to perform faster filtering operations, when used properly. It is possible to turn a signal in the frequency domain back to its time/spatial domain, thanks to the Inverse Fourier Transform. We will not go into detail of the mathematics behind these operators, since we assume familiarity at some level with this theory.
We will focus on syntax and applications instead. The basic routines in the scipy.fftpack module compute the DFT and its inverse, for discrete signals in any dimension, which are fft and ifft (one dimension), fft2 and ifft2 (two dimensions), and fftn and ifftn (any number of dimensions). All of these routines assume that the data is complex valued. If we know beforehand that a particular dataset is actually real valued, and should offer real-valued frequencies, we use rfft and irfft instead, for a faster algorithm. All these routines are designed so that composition with their inverses always yields the identity. The syntax is the same in all cases, as follows: fft(x[, n, axis, overwrite_x]) The first parameter, x, is always the signal in any array-like form. Note that fft performs one-dimensional transforms. This means in particular, that if x happens to be two-dimensional, for example, fft will output another two-dimensional array where each row is the transform of each row of the original. We can change it to columns instead, with the optional parameter, axis. The rest of parameters are also optional; n indicates the length of the transform, and overwrite_x gets rid of the original data to save memory and resources. We usually play with the integer n when we need to pad the signal with zeros, or truncate it. For higher dimension, n is substituted by shape (a tuple), and axis by axes (another tuple). To better understand the output, it is often useful to shift the zero frequencies to the center of the output arrays with fftshift. The inverse of this operation, ifftshift, is also included in the module. 
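To make the roles of n and fftshift concrete, here is a minimal pure-Python sketch of what these routines compute for a one-dimensional signal. It is an illustration only and not how scipy.fftpack is implemented (SciPy uses fast FFT algorithms); the helper names dft, idft, and shift are hypothetical and chosen for this example.

```python
import cmath

def dft(x, n=None):
    """Naive O(n^2) DFT; n zero-pads or truncates the input, like the n argument of fft."""
    x = list(x)
    if n is None:
        n = len(x)
    x = (x + [0] * n)[:n]  # pad with zeros, then cut to length n
    return [sum(x[k] * cmath.exp(-2j * cmath.pi * j * k / n) for k in range(n))
            for j in range(n)]

def idft(X):
    """Inverse of dft: composing the two returns the original signal."""
    n = len(X)
    return [sum(X[j] * cmath.exp(2j * cmath.pi * j * k / n) for j in range(n)) / n
            for k in range(n)]

def shift(X):
    """Move the zero-frequency term to the centre, as fftshift does in one dimension."""
    half = len(X) // 2
    return X[len(X) - half:] + X[:len(X) - half]

row = [1, 1, 1, 1, 0, 0, 0, 0]   # one row of a two-tile checkerboard
spectrum = dft(row)              # same length as the input
padded = dft(row, n=16)          # zero-padded to 16 samples
recovered = idft(spectrum)       # round trip back to the time domain

print(round(abs(spectrum[0]), 6))             # zero-frequency term: the sum of the samples
print(len(padded))
print([round(v.real, 6) for v in recovered])
print(shift(list(range(8))))
```

With SciPy installed, the corresponding calls would be fft(row), fft(row, n=16), ifft(spectrum), and fftshift applied to the result.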
The following code shows some of these routines in action, when applied to a checkerboard image:

>>> import numpy
>>> from scipy.fftpack import fft, fft2, fftshift
>>> import matplotlib.pyplot as plt
>>> B=numpy.ones((4,4)); W=numpy.zeros((4,4))
>>> signal = numpy.bmat("B,W;W,B")
>>> onedimfft = fft(signal,n=16)
>>> twodimfft = fft2(signal,shape=(16,16))
>>> plt.figure()
>>> plt.gray()
>>> plt.subplot(121,aspect='equal')
>>> plt.pcolormesh(onedimfft.real)
>>> plt.colorbar(orientation='horizontal')
>>> plt.subplot(122,aspect='equal')
>>> plt.pcolormesh(fftshift(twodimfft.real))
>>> plt.colorbar(orientation='horizontal')
>>> plt.show()

Note how the first four rows of the one-dimensional transform are equal (and so are the last four), while the two-dimensional transform (once shifted) presents a peak at the origin, and nice symmetries in the frequency domain.

In the following screenshot (obtained from the preceding code), the left-hand side image is fft and the right-hand side image is fft2 of a 2 x 2 checkerboard signal:

The scipy.fftpack module also offers the Discrete Cosine Transform with its inverse (dct, idct) as well as many differential and pseudo-differential operators defined in terms of all these transforms: diff (for derivative/integral), hilbert and ihilbert (for the Hilbert transform), tilbert and itilbert (for the h-Tilbert transform of periodic sequences), and so on.

Signal construction

To aid in the construction of signals with predetermined properties, the scipy.signal module has a nice collection of the most frequent one-dimensional waveforms in the literature: chirp and sweep_poly (for the frequency-swept cosine generator), gausspulse (a Gaussian modulated sinusoid) and sawtooth and square (for the waveforms with those names). They all take as their main parameter a one-dimensional ndarray representing the times at which the signal is to be evaluated. Other parameters control the design of the signal, according to frequency or time constraints.
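As a quick numerical check of two of these waveforms, the following pure-Python sketch reproduces the conventions that scipy.signal documents for square and sawtooth with their default parameters: both have period 2π and take values in [-1, 1]. The functions square_wave and sawtooth_wave are hypothetical stand-ins written for this example, not SciPy's implementation.

```python
import math

TWO_PI = 2 * math.pi

def square_wave(t):
    """+1 on the first half of each 2*pi period, -1 on the second half."""
    return 1.0 if (t % TWO_PI) < math.pi else -1.0

def sawtooth_wave(t):
    """Rises linearly from -1 to 1 over each 2*pi period, then drops back to -1."""
    return (t % TWO_PI) / math.pi - 1.0

# Evaluate both waveforms at a few times spread over more than one period.
times = [-4.0, -0.5, 0.0, 0.5, 2.0, 4.0, 0.5 + TWO_PI]
for t in times:
    print(round(t, 3), square_wave(t), round(sawtooth_wave(t), 3))
```

Because both functions only depend on t modulo 2π, scaling the time axis (as the plotting snippet does before drawing the sawtooth and square waves) simply changes how many periods fit in the plotted range.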
Let's take a look at the following code snippet, which illustrates the use of the one-dimensional waveforms that we just discussed:

>>> import numpy
>>> from scipy.signal import chirp, sawtooth, square, gausspulse
>>> import matplotlib.pyplot as plt
>>> t=numpy.linspace(-1,1,1000)
>>> plt.subplot(221); plt.ylim([-2,2])
>>> plt.plot(t,chirp(t,f0=100,t1=0.5,f1=200)) # plot a chirp
>>> plt.subplot(222); plt.ylim([-2,2])
>>> plt.plot(t,gausspulse(t,fc=10,bw=0.5)) # Gauss pulse
>>> plt.subplot(223); plt.ylim([-2,2])
>>> t*=3*numpy.pi
>>> plt.plot(t,sawtooth(t)) # sawtooth
>>> plt.subplot(224); plt.ylim([-2,2])
>>> plt.plot(t,square(t)) # Square wave
>>> plt.show()

Generated by this code, the following diagram shows waveforms for chirp (upper-left), gausspulse (upper-right), sawtooth (lower-left), and square (lower-right):

The usual method of creating signals is to import them from a file. This is possible by using purely NumPy routines, for example fromfile:

fromfile(file, dtype=float, count=-1, sep='')

The file argument may point to either a file or a string, the count argument is used to determine the number of items to read, and sep indicates what constitutes a separator in the original file/string. For images, we have the versatile routine, imread in either the scipy.ndimage or scipy.misc module:

imread(fname, flatten=False)

The fname argument is a string containing the location of an image. The routine infers the type of file, and reads the data into an array, accordingly. If the flatten argument is set to True, the image is converted to gray scale. Note that, in order to work, the Python Imaging Library (PIL) needs to be installed.

It is also possible to load .wav files for analysis, with the read and write routines from the wavfile submodule in the scipy.io module.
For instance, given any audio file with this format, say audio.wav, the command, rate,data = scipy.io.wavfile.read("audio.wav"), assigns an integer value to the rate variable, indicating the sample rate of the file (in samples per second), and a NumPy ndarray to the data variable, containing the audio samples. If we wish to write some one-dimensional ndarray data into an audio file of this kind, with the sample rate given by the rate variable, we may do so by issuing the following command:

>>> scipy.io.wavfile.write("filename.wav",rate,data)

Filters

A filter is an operation on signals that either removes features or extracts some component. SciPy has a very complete set of known filters, as well as the tools to allow construction of new ones. The complete list of filters in SciPy is long, and we encourage the reader to explore the help documents of the scipy.signal and scipy.ndimage modules for the complete picture. Here we introduce some of the filters most used in audio and image processing.

We start by creating a signal worth filtering:

>>> from numpy import sin, cos, pi, linspace
>>> f=lambda t: cos(pi*t) + 0.2*sin(5*pi*t+0.1) + 0.2*sin(30*pi*t) + 0.1*sin(32*pi*t+0.1) + 0.1*sin(47*pi*t+0.8)
>>> t=linspace(0,4,400); signal=f(t)

We first test the classical smoothing filter of Wiener and Kolmogorov, wiener. We present in a plot, the original signal (in black) and the corresponding filtered data, with a choice of a Wiener window of the size 55 samples (in blue).
Next, we compare the result of applying the median filter, medfilt, with a kernel of the same size as before (in red):

>>> from scipy.signal import wiener, medfilt
>>> import matplotlib.pylab as plt
>>> plt.plot(t,signal,'k')
>>> plt.plot(t,wiener(signal,mysize=55),'b',linewidth=3)
>>> plt.plot(t,medfilt(signal,kernel_size=55),'r',linewidth=3)
>>> plt.show()

This gives us the following graph showing the comparison of smoothing filters (wiener is the one that has its starting point just below 0.5 and medfilt has its starting point just above 0.5):

Most of the filters in the scipy.signal module can be adapted to work on arrays of any dimension. But in the particular case of images, we prefer to use the implementations in the scipy.ndimage module, since they are coded with these objects in mind. For instance, to perform a median filter on an image for smoothing, we use scipy.ndimage.median_filter. Let's see an example. We will start by loading Lena into an array and corrupting the image with Gaussian noise (zero mean and standard deviation of 16):

>>> from scipy.stats import norm # Gaussian distribution
>>> import matplotlib.pyplot as plt
>>> import scipy.misc
>>> import scipy.ndimage
>>> plt.gray()
>>> lena=scipy.misc.lena().astype(float)
>>> plt.subplot(221);
>>> plt.imshow(lena)
>>> lena+=norm(loc=0,scale=16).rvs(lena.shape)
>>> plt.subplot(222);
>>> plt.imshow(lena)
>>> denoised_lena = scipy.ndimage.median_filter(lena,3)
>>> plt.subplot(224);
>>> plt.imshow(denoised_lena)

The set of filters for images comes in two flavors: statistical and morphological. For example, among the filters of statistical nature, we have the Sobel algorithm oriented to detection of edges (singularities along curves). Its syntax is as follows:

sobel(image, axis=-1, output=None, mode='reflect', cval=0.0)

The optional parameter, axis, indicates the dimension in which the computations are performed. By default, this is always the last axis (-1).
The mode parameter, which is one of the strings 'reflect', 'constant', 'nearest', 'mirror', or 'wrap', indicates how to handle the border of the image, in case there is insufficient data to perform the computations there. In case the mode is 'constant', we may indicate the value to use in the border, with the cval parameter. Let's look at the following code snippet, which illustrates the use of the sobel filter:

>>> from scipy.ndimage.filters import sobel
>>> import numpy
>>> lena=scipy.misc.lena()
>>> sblX=sobel(lena,axis=0); sblY=sobel(lena,axis=1)
>>> sbl=numpy.hypot(sblX,sblY)
>>> plt.subplot(223);
>>> plt.imshow(sbl)
>>> plt.show()

The following screenshot illustrates Lena (upper-left) and noisy Lena (upper-right) with the preceding two filters in action: edge map with sobel (lower-left) and median filter (lower-right):

Morphology

We also have the possibility of creating and applying filters to images based on mathematical morphology, both to binary and gray-scale images. The four basic morphological operations are opening (binary_opening), closing (binary_closing), dilation (binary_dilation), and erosion (binary_erosion). Note that the syntax for each of these filters is very simple, since we only need two ingredients: the signal to filter and the structuring element to perform the morphological operation. Let's take a look at the general syntax for these morphological operations:

binary_operation(signal, structuring_element)

We may use combinations of these four basic morphological operations to create more complex filters for removal of holes, hit-or-miss transforms (to find the location of specific patterns in binary images), denoising, edge detection, and many more. The SciPy module also allows for creating some common filters using the preceding syntax.
For instance, to locate the letter e in a text, we could use the following command:

>>> binary_hit_or_miss(text, letterE)

For comparative purposes, let's use this command in the following code snippet:

>>> import numpy
>>> import scipy.ndimage
>>> import matplotlib.pylab as plt
>>> from scipy.ndimage.morphology import binary_hit_or_miss
>>> text = scipy.ndimage.imread('CHAP_05_input_textImage.png')
>>> letterE = text[37:53,275:291]
>>> HitorMiss = binary_hit_or_miss(text, structure1=letterE, origin1=1)
>>> eLocation = numpy.where(HitorMiss==True)
>>> x=eLocation[1]; y=eLocation[0]
>>> plt.imshow(text, cmap=plt.cm.gray, interpolation='nearest')
>>> plt.autoscale(False)
>>> plt.plot(x,y,'wo',markersize=10)
>>> plt.axis('off')
>>> plt.show()

The output for the preceding lines of code is generated as follows:

For gray-scale images, we may use a structuring element (structuring_element) or a footprint. The syntax is, therefore, a little different:

grey_operation(signal, [structuring_element, footprint, size, ...])

If we desire to use a completely flat and rectangular structuring element (all ones), then it is enough to indicate the size as a tuple. For instance, to perform gray-scale dilation with a flat element of size (15,15) on our classical image of Lena, we issue the following command:

>>> grey_dilation(lena, size=(15,15))

The last kind of morphological operations coded in the scipy.ndimage module perform distance and feature transforms. Distance transforms create a map that assigns to each pixel the distance to the nearest object. Feature transforms provide the index of the closest background element instead. These operations are used to decompose images into different labels. We may even choose different metrics such as Euclidean distance, chessboard distance, and taxicab distance.
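To make the three metrics concrete, here is a minimal brute-force sketch in pure Python. It follows the convention used by scipy.ndimage, where every nonzero pixel is replaced by its distance to the nearest zero-valued (background) pixel. The function name distance_transform_sketch and the tiny test image are hypothetical, and the code assumes the image contains at least one zero.

```python
import math

# Distance between two pixels given their row and column offsets.
METRICS = {
    "euclidean": lambda dr, dc: math.hypot(dr, dc),
    "taxicab": lambda dr, dc: abs(dr) + abs(dc),
    "chessboard": lambda dr, dc: max(abs(dr), abs(dc)),
}

def distance_transform_sketch(image, metric="euclidean"):
    """Brute force: compare every nonzero pixel against every background pixel."""
    dist = METRICS[metric]
    background = [(r, c)
                  for r, row in enumerate(image)
                  for c, value in enumerate(row) if not value]
    return [[min(dist(r - br, c - bc) for br, bc in background) if value else 0
             for c, value in enumerate(row)]
            for r, row in enumerate(image)]

# Hypothetical 3 x 3 image whose only background pixel is the top-left corner.
image = [[0, 1, 1],
         [1, 1, 1],
         [1, 1, 1]]

for name in ("euclidean", "taxicab", "chessboard"):
    print(name, distance_transform_sketch(image, name)[2][2])
```

The bottom-right pixel is two rows and two columns away from the background, so the three metrics disagree there, which is exactly the difference the metric argument controls.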
The syntax for the distance transform (distance_transform) using a brute force algorithm is as follows: distance_transform_bf(signal, metric='euclidean', sampling=None, return_distances=True, return_indices=False, distances=None, indices=None) We indicate the metric with the strings such as 'euclidean', 'taxicab', or 'chessboard'. If we desire to provide the feature transform instead, we switch return_distances to False and return_indices to True. Similar routines are available with more sophisticated algorithms—distance_transform_cdt (using chamfering for taxicab and chessboard distances). For Euclidean distance, we also have distance_transform_edt. All these use the same syntax. Summary In this article, we explored signal processing (any dimensional) including the treatment of signals in frequency space, by means of their Discrete Fourier Transforms. These correspond to the fftpack, signal, and ndimage modules. Resources for Article: Further resources on this subject: Signal Processing Techniques [article] SciPy for Computational Geometry [article] Move Further with NumPy Modules [article]

Packt

03 Mar 2015

18 min read

Time Travelling with Spring

Packt

03 Mar 2015

24 min read

Packaged Elegance

In this article by John Farrar, author of the book KnockoutJS Web development, we will see how templates drove us to a more dynamic, creative platform. The next advancement in web development was custom HTML components. KnockoutJS allows us to jump right in with some game-changing elegance for designers and developers. In this article, we will focus on:

- An introduction to components
- Bring Your Own Tags (BYOT)
- Enhancing attribute handling
- Making your own libraries
- Asynchronous module definition (AMD): on demand resource loading

This entire article is about packaging your code for reuse. Using these techniques, you can make your code more approachable and elegant.

Introduction to components

The best explanation of a component is a packaged template with an isolated ViewModel. Here is the syntax we would use to declare a like component on the page:

<div data-bind="component: 'like'"></div>

If you are passing no parameters through to the component, this is the correct syntax. If you wish to pass parameters through, you would use a JSON style structure as follows:

<div data-bind="component:{name: 'like-widget',params:{ approve: like} }"></div>

This would allow us to pass named parameters through to our custom component. In this case, we are passing a parameter named approve. This would mean we had a bound viewModel variable by the name of like. Look at how this would be coded. Create a page called components.html using the _base.html file to speed things up as we have done in all our other articles. In your script section, create the following ViewModel:

<script>
var ViewModel = function(){
  var self = this;
  self.like = ko.observable(true);
};
// insert custom component here
var vm = new ViewModel();
ko.applyBindings(vm);
</script>

Now, we will create our custom component. Here is the basic component we will use for this first component.
Place the code where the comment is, as we want to make sure it is added before our applyBindings method is executed:

    ko.components.register('like-widget', {
      viewModel: function(params) {
        this.approve = params.approve;
        // Behaviors:
        this.toggle = function(){
          this.approve(!this.approve());
        }.bind(this);
      },
      template:
        '<div class="approve">' +
          '<button data-bind="click: toggle">' +
            '<span data-bind="visible: approve" class="glyphicon glyphicon-thumbs-up"></span>' +
            '<span data-bind="visible: !approve()" class="glyphicon glyphicon-thumbs-down"></span>' +
          '</button>' +
        '</div>'
    });

There are two sections to our component: the viewModel and template sections. In this article, we will be using Knockout template details inside the component.

The standard Knockout component passes variables to the component using the params structure. We can either use this structure directly or, if desired, use the self = this approach. In addition to setting the variable structure, it is also possible to create behaviors for our components. If we look in the template code, we can see that we have data-bound the click event to toggle the approve setting in our component. Then, inside the button, by binding to the visible trait of each span element, either the thumbs up or the thumbs down image is shown to the user. Yes, we are using a Bootstrap icon element rather than a graphic here.

[Screenshot: initial state of the like widget]

When we click on the thumb image, it toggles between the thumbs up and the thumbs down version. Since we also passed in the external variable that is bound to the page ViewModel, the value in the matched span text also toggles.
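The toggle behavior described above can be tried outside the browser. The following is only a minimal sketch: the observable function is a hypothetical stand-in for ko.observable (read with no arguments, write with one argument, no subscriptions), and LikeWidgetViewModel simply mirrors the viewModel function registered for like-widget.

```javascript
// Hypothetical stand-in for ko.observable so the sketch runs without Knockout.
// Calling it with no arguments reads the value; calling it with one writes it.
function observable(initialValue) {
  var current = initialValue;
  return function () {
    if (arguments.length > 0) {
      current = arguments[0];
      return undefined;
    }
    return current;
  };
}

// Same shape as the viewModel function of the like-widget component.
function LikeWidgetViewModel(params) {
  this.approve = params.approve;
  this.toggle = function () {
    this.approve(!this.approve());
  }.bind(this);
}

// "like" plays the role of self.like on the page ViewModel.
var like = observable(true);
var widget = new LikeWidgetViewModel({ approve: like });

widget.toggle();
var afterFirstToggle = like();   // the page-level value changed too
widget.toggle();
var afterSecondToggle = like();

console.log(afterFirstToggle, afterSecondToggle); // false true
```

Because the component stores the very same function object that the page ViewModel owns, a toggle inside the component is visible to anything else reading like, which is why the span text on the page follows the button.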
Here is the markup we would add to the page to produce these results in the View section of our code:

    <div data-bind="component: {name: 'like-widget', params: { approve: like } }"></div>
    <span data-bind="text: like"></span>

You could build this type of functionality with a jQuery plugin as well, but it is likely to take a bit more code to do two-way binding and match the tight functionality we have achieved here. This doesn't mean jQuery plugins are bad, as this is also a jQuery-related technology. What it does mean is that we have ways to do things even better. It is this author's opinion that features like this would still make great additions to the core jQuery library. Yet, I am not holding my breath waiting for them to add a Knockout-type project to the wonderful collection of projects they have at this point, and I do not feel we should hold that against them. Keeping focused on what they do best is one of the reasons libraries like Knockout can provide a wider array of options. It seems the decisions are working on our behalf even if they are taking a different approach than I expected.

Dynamic component selection

You should have noticed that when we selected the component, we did so using a quoted declaration. While at first this may seem more constricting, it is actually a power feature. By using a variable instead of a hardcoded value, you can dynamically select the component you would like to be inserted. Here is the markup code:

    <div data-bind="component: { name: widgetName, params: widgetParams }"></div>
    <span data-bind="text: widgetParams.approve"></span>

Notice that we are passing in both widgetName and widgetParams. Because we are binding the structure differently, we also need to show the bound value differently in our span.
Here is the script part of our code that needs to be added to our viewModel code:

    self.widgetName = ko.observable("like-widget");
    self.widgetParams = { approve: ko.observable(true) };

We will get the same visible results, but notice that each of the like buttons acts independently of the other. What would happen if we put more than one of the same element on the page? If we do that, Knockout components act independently of one another. Well, most of the time they do: if we bound them to the same variable, they would not be independent. In your viewModel declaration code, add another variable called like2 as follows:

    self.like2 = ko.observable(false);

Now, we will add another like button to the page by copying our first like View code. This time, change the value from like to like2 as follows:

    <like-widget params="approve: like2"></like-widget>
    <span data-bind="text: like2"></span>

This time when the page loads, the other likes display with a thumbs up, but this like displays with a thumbs down. The text also shows false, the value stored in the bound variable. Each of the like buttons acts independently because each is bound to a unique value.

[Screenshot: the third like button]

Bring Your Own Tags (BYOT)

What is an element? Basically, an element is a component that you reach using the tag syntax. This is the way it is expressed in the official documentation at this point, and it is likely to stay that way. It is still a component under the hood. Depending on the crowd you are in, this distinction will be more or less important. Mostly, just be aware of the distinction in case someone feels it is important, as that will let you be on the same page in discussions. Custom tags are a part of the forthcoming HTML feature called Web Components. Knockout allows you to start using them today.
Here is the View code:

    <like-widget params="approve: like3"></like-widget>
    <span data-bind="text: like3"></span>

You may want to code some tags as a single tag rather than as an opening and closing tag pair. At this time, there are challenges getting each browser to see custom element tags when they are declared as a single tag. This means custom tags, or elements, need to be declared with opening and closing tags for now. We will also need to create our like3 bound variable for viewModel with the following code:

    self.like3 = ko.observable(true);

Running the code gives us the same wonderful functionality as our data-bind approach, but now we are creating our own HTML tags. Has there ever been a time you wanted a special HTML tag that just didn't exist? There is a chance you could create that now using Knockout component element-style coding.

Enhancing attribute handling

Now, while custom tags are awesome, there is just something different about passing everything in with a single params attribute. The reason for this is that this process matches how our tags work when we are using the data-bind approach to coding. In the following example, we will look at passing things in via individual attributes. This is not meant to work as a data-bind approach; it is focused completely on the custom tag element component.

The first thing you want to do is make sure this enhancement doesn't cause any issues with the normal elements. We did this by checking the custom elements for a standard prefix. You do not need to work through this code, as it is a bit more advanced. The easiest thing to do is to include our Knockout components library with the following script tag:

    <script src="/share/js/knockout.komponents.js"></script>

This script contains a code segment that converts tags starting with kom- into tags that use individual attributes rather than a JSON translation of the attributes. Feel free to borrow the code to create libraries of your own.
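To make the idea of the prefix check concrete, here is a rough sketch of how individual attributes could be folded into a params-style string. This is not the code from knockout.komponents.js; attributesToParams and its arguments are hypothetical names chosen for illustration, and the attribute map is a plain object rather than a real DOM element.

```javascript
// Hypothetical helper, NOT the KOmponents source: builds a Knockout-style
// params string from a plain map of attribute names to binding expressions.
function attributesToParams(tagName, attributes, prefix) {
  prefix = prefix || "kom-";
  // Only touch tags that carry the library prefix, so regular component
  // tags that already use a params attribute are left alone.
  if (tagName.toLowerCase().indexOf(prefix) !== 0) {
    return null;
  }
  return Object.keys(attributes)
    .map(function (name) { return name + ": " + attributes[name]; })
    .join(", ");
}

var komParams = attributesToParams("kom-like", { approve: "tagLike" });
var plainParams = attributesToParams("like-widget", { params: "approve: like3" });

console.log(komParams);   // approve: tagLike
console.log(plainParams); // null
```

The prefix test is the important design choice: it keeps the attribute-style tags and the standard params-style tags from interfering with each other on the same page.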
We are going to be creating a standard set of libraries on GitHub for these component tags. Since the HTML tags are Knockout components, we are calling these libraries "KOmponents". The resource can be found at https://github.com/sosensible/komponents.

Now, with that library included, we will use our View code to connect to the new tag. Here is the code to use in the View:

    <kom-like approve="tagLike"></kom-like>
    <span data-bind="text: tagLike"></span>

Notice that in our HTML markup, the tag starts with the library prefix. This also requires viewModel to have a binding to pass into this tag, as follows:

    self.tagLike = ko.observable(true);

The following is the code for the actual "attribute-aware version" of Knockout components. Do not place this in the code, as it is already included in the library in the shared directory:

    // <kom-like /> tag
    ko.components.register('kom-like', {
      viewModel: function(params) {
        // Data: value must be true to approve
        this.approve = params.approve;
        // Behaviors:
        this.toggle = function(){
          this.approve(!this.approve());
        }.bind(this);
      },
      template:
        '<div class="approve">' +
          '<button data-bind="click: toggle">' +
            '<span data-bind="visible: approve" class="glyphicon glyphicon-thumbs-up"></span>' +
            '<span data-bind="visible: !approve()" class="glyphicon glyphicon-thumbs-down"></span>' +
          '</button>' +
        '</div>'
    });

The tag in the View changed because we passed the information in via named attributes and not as a JSON structure inside a params attribute. We also made sure to manage these tags by using a prefix. The reason for this is that we did not want our fancy tags to break the standard method of passing params commonly practiced with regular Knockout components. As we see, we again have a functional component, with the added advantage of being able to pass the values in a style more familiar to those used to coding with HTML tags.

Building your own libraries

Again, we are calling our custom components KOmponents.
We will be creating a number of library solutions over time and welcome others to join in. Tags will not do everything for us, as there are some limitations yet to be conquered. That doesn't mean we should wait for all the features before building the ones we can build now. In this article, we will also be showing some tags from our Bootstrap KOmponents library. First, we need to include the Bootstrap KOmponents library:

    <script src="/share/js/knockout.komponents.bs.js"></script>

Above viewModel in our script, we need to add a function to make this section of code simpler. At times, when passing items into observables, we can pass in richer bound data using a function like this. Again, create this function above the viewModel declaration of the script, shown as follows:

    var listItem = function(display, students){
      this.display = ko.observable(display);
      this.students = ko.observable(students);
      this.type = ko.computed(function(){
        switch(Math.ceil(this.students()/5)){
          case 1:
          case 2:
            return 'danger';
          case 3:
            return 'warning';
          case 4:
            return 'info';
          default:
            return 'success';
        }
      }, this);
    };

Now, inside viewModel, we will declare a set of data to pass to a Bootstrap-style listGroup as follows:

    self.listData = ko.observableArray([
      new listItem("HTML5", 12),
      new listItem("CSS", 8),
      new listItem("JavaScript", 19),
      new listItem("jQuery", 48),
      new listItem("Knockout", 33)
    ]);

Each item in our array will have display, students, and type variables. We are using a number of features in Bootstrap here but packaging them all up inside our Bootstrap smart tag. This tag starts to go beyond the bare basics. It is still very implementable, but we don't want to throw too much at you to absorb at one time, so we will not go into the detailed code for this tag. What we do want to show is how much power can be wrapped into custom Knockout tags.
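The thresholds inside the type computed can be checked without Knockout. The sketch below pulls the switch out into a plain function (typeForStudents is a name assumed here for illustration) and runs it over the student counts used in listData, plus the value 9.

```javascript
// Plain-function version of the switch inside listItem's type computed.
// typeForStudents is a hypothetical name used only for this sketch.
function typeForStudents(students) {
  switch (Math.ceil(students / 5)) {
    case 1:
    case 2:
      return "danger";   // 1 to 10 students
    case 3:
      return "warning";  // 11 to 15 students
    case 4:
      return "info";     // 16 to 20 students
    default:
      return "success";  // everything else
  }
}

var counts = [12, 8, 19, 48, 33, 9];
var types = counts.map(typeForStudents);

console.log(types.join(", "));
// warning, danger, info, success, success, danger
```

One detail worth noticing: as written, a count of 0 gives Math.ceil(0) = 0, which falls through to the default branch and yields 'success'. If an empty class should show as 'danger', the switch would need an extra case.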
Here is the markup we will use to call this tag and bind the correct part of viewModel for display:

    <kom-listgroup data="listData" badgeField="'students'" typeField="'type'"></kom-listgroup>

That is it. You should take note of a couple of special details. The data is passed in as a bound Knockout ViewModel. The badge field is passed in as a string name to declare the field on the data collection from which the badge count will be pulled. The same string approach has been used for the type field. The type sets the colors as per the standard Bootstrap types. The theme here is that if there are not enough students to hold a class, the list group custom tag shows the danger color.

[Screenshot: the list group rendered in the browser]

While this is neat, let's jump into our browser tools console and change the value of one of the items. Let's say there was a class on some cool web technology called jQuery. What if people had not heard of it and didn't know what it was, and you really wanted to take the class? Well, it would be nice to encourage a few others to check it out. How would you know whether the class was at a danger level or not? We could simply use the badge and the numbers, but how awesome is it to also use the color coding hint? Type the following code into the console and see what it returns:

    vm.listData()[3].display()

Because JavaScript starts counting with zero for the first item, index 3 is the fourth item in the list.

[Screenshot: console output showing the display value of the fourth item, "jQuery"]

Now we know we have the right item, so let's set the student count to nine using the following code in the browser console:

    vm.listData()[3].students(9)

Notice the change in the jQuery class. Both the badge and the type value have updated.

[Screenshot: the list group after the update, with the jQuery item in the danger color]

This shows how much power we can wield with very little manual coding. We should also take a moment to see how the type was managed. Using the functional assignment, we were able to use a Knockout computed observable for that value.
Here is the code for that part again:

    this.type = ko.computed(function(){
      switch(Math.ceil(this.students()/5)){
        case 1:
        case 2:
          return 'danger';
        case 3:
          return 'warning';
        case 4:
          return 'info';
        default:
          return 'success';
      }
    }, this);

While the code is outside the viewModel declaration, it is still able to bind properly and run even inside a custom tag created with Knockout's component binding.

Bootstrap component example

Here is another example of binding with Bootstrap. The general best practice for using modal display boxes is to place them higher in the code, perhaps right under the body tag, to make sure there are no conflicts with the rest of the code. Place this tag right below the body tag as shown in the following code:

    <kom-modal id="'komModal'" title="komModal.title()" body="komModal.body()"></kom-modal>

Again, we will need to make some declarations inside viewModel for this to work right. Enter this code into the declarations of viewModel:

    self.komModal = {
      title: ko.observable('Modal KOMponent'),
      body: ko.observable('This is the body of the <strong>modal KOMponent</strong>.')
    };

We will also create a button on the page to open our modal. The button uses the binding that is part of Bootstrap. The data-toggle and data-target attributes are not Knockout binding features, but Knockout works wonderfully side by side with them. Another point of interest is the standard ID attribute, which tells Bootstrap items, like this button, which modal box to interact with. This is another reason it may be beneficial to use KOmponents or a library like it. Here is the markup code:

    <button type="button" data-toggle="modal" data-target="#komModal">Open Modal KOmponent</button>

When we click on the button, the modal box opens.

[Screenshot: the modal KOmponent]

Now, to understand the power of Knockout working with our modal, head back over to your browser tools console.
Enter the following command into the prompt:

    vm.komModal.body("Wow, live data binding!")

[Screenshot: the modal body showing the updated text]

Who knows what type of creative modal boxes we can build using this type of technology? This brings us closer to creating what we can imagine. Perhaps it may bring us closer to building some of the wild things our customers imagine. While that may not be your main motivation for using Knockout, it would be nice to have a few fewer roadblocks when we want to be creative. It would also be nice to package and reuse these solutions across a site without copying and pasting, and without searching back through the code to make updates when the client requests a change.

Again, feel free to look at the file to see how we made these components work. They are not extremely complicated once you get the basics of using Knockout and its components. If you are looking to build components of your own, they will give you some insight into how things work inside as you move your skills to the next level.

Understanding the AMD approach

We are going to look into the concept of what makes an AMD-style website. The point of this approach to sites is to pull content on demand. The content, or modules as they are defined here, does not need to be loaded in a particular order. If there are pieces that depend on other pieces, that is, of course, managed. We will be using the RequireJS library to manage this part of our code. We will create four files in this example, as follows:

- amd.html
- amd.config.js
- pick.js
- pick.html

In our AMD page, we are going to create a configuration file for our RequireJS functionality. That will be the amd.config.js file mentioned in the preceding list.
We will start by creating this file with the following code:

    // require.js settings
    var require = {
      baseUrl: ".",
      paths: {
        "bootstrap": "/share/js/bootstrap.min",
        "jquery": "/share/js/jquery.min",
        "knockout": "/share/js/knockout",
        "text": "/share/js/text"
      },
      shim: {
        "bootstrap": { deps: ["jquery"] },
        "knockout": { deps: ["jquery"] }
      }
    };

We see here that we are creating some alias names and setting the paths these names point to for this page. The file could, of course, serve more than one page, but in this case, it has been created specifically for a single page. As you may have noted, the RequireJS configuration does not need the .js extension on the file names.

Now, we will look at our amd.html page, where we pull things together. We are again using the standard page we have used for this article, which you will notice if you preview the finished example file. There are a couple of differences though, because the JavaScript files do not all need to be called at the start. RequireJS handles this well for us. We are not saying this is a standard practice of AMD, but it is an introduction to the concepts. We will need to include the following three script files in this example:

    <script src="/share/js/knockout.js"></script>
    <script src="amd.config.js"></script>
    <script src="/share/js/require.js"></script>

Notice that the configuration settings need to be set before calling the require.js library. With that set, we can create the code to wire Knockout binding on the page. This goes in our amd.html script at the bottom of the page:

    <script>
    ko.components.register('pick', {
      viewModel: { require: 'pick' },
      template: { require: 'text!pick.html' }
    });
    viewModel = function(){
      this.choice = ko.observable();
    };
    vm = new viewModel();
    ko.applyBindings(vm);
    </script>

Most of this code should look very familiar. The difference is that external files are being used to set the content for viewModel and template in the pick component.
The require setting smartly knows to include the pick.js file for the pick setting. It does need to be passed as a string, of course. When we include the template, you will see that we use text! in front of the file we are including. We also declare the extension on the file name in this case. The text plugin needs to be loadable itself, and you will see in our amd.config.js file that we created an alias pointing to it.

Now, we will create the pick.js file and place it in the same directory as the amd.html file. It could have been in another directory, in which case you would just set that in the component declaration along with the filename. Here is the code for this part of our AMD component:

    define(['knockout'], function(ko) {
      function LikeWidgetViewModel(params) {
        this.chosenValue = params.value;
        this.land = Math.round(Math.random()) ? 'heads' : 'tails';
      }
      LikeWidgetViewModel.prototype.heads = function() {
        this.chosenValue('heads');
      };
      LikeWidgetViewModel.prototype.tails = function() {
        this.chosenValue('tails');
      };
      return LikeWidgetViewModel;
    });

Notice that our code starts with the define method. This is our AMD functionality in place. It says that before we try to execute this section of code, we need to make sure the Knockout library is loaded. This allows us to do on-demand loading of code as needed. The code inside the viewModel section is the same as the other examples we have looked at, with one exception: we return the viewModel constructor, as you see at the end of the preceding code. We used the shorthand code to set the value for heads and tails in this example.

Now, we will look at our template file, pick.html.
This is the code we will have in this file:

    <div class="like-or-dislike" data-bind="visible: !chosenValue()">
      <button data-bind="click: heads">Heads</button>
      <button data-bind="click: tails">Tails</button>
    </div>
    <div class="result" data-bind="visible: chosenValue">
      You picked <strong data-bind="text: chosenValue"></strong>
      The correct value was <strong data-bind="text: land"></strong>
    </div>

There is nothing special here other than the code needed to make this example work. The goal is to allow a custom tag to offer up heads or tails options on the page. We also pass in a bound variable from viewModel. We will be passing it into three identical tags. The tags are actually going to load the content instantly in this example. The goal is to get familiar with how the code works. We will take it to full practice at the end of the article. Right now, we will put this code in the View segment of our amd.html page:

    <h2>One Choice</h2>
    <pick params="value: choice"></pick><br>
    <pick params="value: choice"></pick><br>
    <pick params="value: choice"></pick>

Notice that we have included the pick tag three times. While we are passing in the same bound choice item from viewModel, each tag randomly chooses heads or tails on its own.

[Screenshot: three pick tags, each showing Heads and Tails buttons]

Since we passed the same bound item into each of the three tags, when we click on any heads or tails button, it immediately passes that value out to viewModel, which in turn immediately passes the value back into the other two tags. They are all wired together because their viewModel binding is the same variable.

[Screenshot: the result after clicking Tails]

Well, that is the result we got that time. Actually, the results change pretty much every time we refresh the page. Now, we are ready to do something extra special by combining our AMD approach with Knockout modules.

Summary

This article has shown the awesome power of templates working together with ViewModels within Knockout components.
You should now have an awesome foundation to do more with less than ever before. You should know how to mingle your jQuery code with the Knockout code side by side.

To review, in this article, we learned what Knockout components are. We learned how to use the components to create custom HTML elements that are interactive and powerful. We learned how to enhance custom elements to allow variables to be managed using the more common attributes approach. We learned how to use an AMD-style approach to coding with Knockout. We also learned how to AJAX everything and integrate jQuery to enhance Knockout-based solutions.

What's next? That is up to you. One thing is for sure: the possibilities are broader using Knockout than they were before. Happy coding and congratulations on completing your study of KnockoutJS!

How-To Tutorials

Packt

02 Mar 2015

27 min read

Starting Small and Growing in a Modular Way

This article, written by Carlo Russo, author of the book KnockoutJS Blueprints, describes how RequireJS gives us a simplified format for requiring many dependencies and avoiding parameter mismatch: the CommonJS require format. For example, another way to write the previous code (use either this form or the other one) is:

    define(function(require) {
      var $ = require("jquery"),
          ko = require("knockout"),
          viewModel = {};
      $(function() {
        ko.applyBindings(viewModel);
      });
    });

In this way, we skip the dependencies definition; RequireJS scans the source of the function for require('xxx') calls and adds each one to the dependency list.

The second way is better because it is cleaner and you cannot mismatch dependency names with named function arguments. For example, imagine you have a long list of dependencies; you add one or remove one, and you forget to update the corresponding function parameter. You now have a hard-to-find bug.

And, in case you think that the r.js optimizer behaves differently, I just want to assure you that it does not; you can use both ways without any concern regarding optimization.

Just to remind you, you cannot use this form if you want to load scripts dynamically or depending on a variable value; for example, this code will not work:

    var mod = require(someCondition ? "a" : "b");

    if (someCondition) {
      var a = require('a');
    } else {
      var a = require('a1');
    }

You can learn more about this compatibility problem at this URL: http://www.requirejs.org/docs/whyamd.html#commonjscompat.

You can see more about this sugar syntax at this URL: http://www.requirejs.org/docs/whyamd.html#sugar.

Now that you know the basic way to use RequireJS, let's look at the next concept.

Component binding handler

The component binding handler is one of the new features introduced in Version 3.2 of KnockoutJS.
Inside the documentation of KnockoutJS, we find the following explanation:

Components are a powerful, clean way of organizing your UI code into self-contained, reusable chunks. They can represent individual controls/widgets, or entire sections of your application.

The main idea behind their inclusion was to create full-featured, reusable components, with one or more points of extensibility. A component is a combination of HTML and JavaScript. There are cases where you can use just one of them, but normally you'll use both.

You can get a first simple example about this here: http://knockoutjs.com/documentation/component-binding.html.

The best way to create self-contained components is with the use of an AMD module loader, such as RequireJS; put the View Model and the template of the component inside two different files, and then you can use it from your code really easily.

Creating the bare bones of a custom module

Writing a custom module of KnockoutJS with RequireJS is a 4-step process:

- Creating the JavaScript file for the View Model.
- Creating the HTML file for the template of the View.
- Registering the component with KnockoutJS.
- Using it inside another View.

We are going to build the base for the Search Form component, just to move forward with our project; in any case, this is the starting code we should use for each component that we write from scratch. Let's cover all of these steps.

Creating the JavaScript file for the View Model

We start with the View Model of this component. Create a new empty file with the name BookingOnline/app/components/search.js and put this code inside it:

    define(function(require) {
      var ko = require("knockout"),
          template = require("text!./search.html");

      function Search() {}

      return {
        viewModel: Search,
        template: template
      };
    });

Here, we are creating a constructor called Search that we will fill later.
We are also using the text plugin for RequireJS to load the template search.html from the current folder into the variable template. Then, we return an object with the constructor and the template, using the format KnockoutJS requires for a component.

Creating the HTML file for the template of the View

In the View Model, we required a View called search.html in the same folder. At the moment, we don't have any code to put inside the template of the View, because there is no boilerplate code needed; but we must create the file, otherwise RequireJS will break with an error. Create a new file called BookingOnline/app/components/search.html with the following content:

    <div>Hello Search</div>

Registering the component with KnockoutJS

When you use components, there are two different ways to give KnockoutJS a way to find your component:

- Using the function ko.components.register
- Implementing a custom component loader

The first way is the easiest one: using the default component loader of KnockoutJS. To use it with our component, you should just put the following line inside the BookingOnline/app/index.js file, just before the line $(function () {:

    ko.components.register("search", {require: "components/search"});

Here, we are registering a module called search, and we are telling KnockoutJS that it will have to find all the information it needs using an AMD require for the path components/search (so it will load the file BookingOnline/app/components/search.js).

You can find more information and a really good example about a custom component loader at: http://knockoutjs.com/documentation/component-loaders.html#example-1-a-component-loader-that-sets-up-naming-conventions.
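For comparison with ko.components.register, the second option could look roughly like the sketch below. It is a minimal illustration, not the example from the KnockoutJS documentation: conventionLoader is a name made up here, and the components/<name> path convention is an assumption borrowed from the folder layout used in this project. The getConfig(name, callback) method is the hook KnockoutJS calls on a loader to ask for a component's configuration.

```javascript
// Sketch of a custom component loader (assumed naming convention).
var conventionLoader = {
  getConfig: function (name, callback) {
    // Assumption: every component lives in components/<name>.js and returns
    // an object with viewModel and template, like search.js does.
    callback({ require: "components/" + name });
  }
};

// Register the loader only when KnockoutJS is actually present on the page,
// so the sketch can also run on its own.
if (typeof ko !== "undefined") {
  ko.components.loaders.unshift(conventionLoader);
}

// Exercise the loader directly to see what configuration it would supply.
var resolved;
conventionLoader.getConfig("search", function (config) {
  resolved = config;
});

console.log(resolved.require); // components/search
```

A loader written this way answers for every component name, so a real project would probably restrict it (for example, by checking a name prefix) and call callback(null) for names it does not want to handle, which lets the next loader in the list try.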
Using it inside another View

Now, we can simply use the new component inside our View; put the following code inside our Index View (BookingOnline/index.html), before the script tag:

    <div data-bind="component: 'search'"></div>

Here, we are using the component binding handler to use the component; another commonly used way is with custom elements. We can replace the previous line with the following one:

    <search></search>

KnockoutJS will use our search component, but with Web Component-like markup.

If you want to support IE6-8, you should register the custom elements you are going to use before the HTML parser can find them. Normally, this job is done inside the ko.components.register function call, but if you are putting your script tag at the end of body, as we have done until now, your custom element will be discarded. Follow the guidelines mentioned here when you want to support IE6-8: http://knockoutjs.com/documentation/component-custom-elements.html#note-custom-elements-and-internet-explorer-6-to-8

Now, you can open your web application and you should see the text, Hello Search. We put that markup there only to check whether everything was working, so you can remove it now.

Writing the Search Form component

Now that we know how to create a component, and we have put the base of our Search Form component in place, we can look at the requirements for this component. A designer will review the View later, so we need to keep it simple to avoid the need for multiple changes later.
From our analysis, we find that our competitors use these components:

- Autocomplete field for the city
- Calendar fields for check-in and check-out
- Selection field for the number of rooms, number of adults and number of children, and age of children

[Wireframe: the Search Form we should build, inspired by Trivago]

We could do everything by ourselves, but the easiest way to realize this component is with the help of a few external plugins; we are already using jQuery, so the most obvious idea is to use jQuery UI to get the Autocomplete Widget, the Date Picker Widget, and maybe even the Button Widget.

Adding the AMD version of jQuery UI to the project

Let's start by downloading the current version of jQuery UI (1.11.1); the best thing about this version is that it is one of the first versions that supports AMD natively. After reading the documentation of jQuery UI for AMD (URL: http://learn.jquery.com/jquery-ui/environments/amd/), you may think that you can get the AMD version using the download link from the home page. However, if you try that, you will get a package with only the concatenated source; for this reason, if you want the AMD source files, you will have to go directly to GitHub or use Bower.

Download the package from https://github.com/jquery/jquery-ui/archive/1.11.1.zip and extract it.

Every time you use an external library, remember to check the compatibility support. In jQuery UI 1.11.1, as you can see in the release notes, they removed the support for IE7; so we must decide whether we want to support IE6 and 7 by adding specific workarounds inside our code, or we want to remove the support for those two browsers.

For our project, we need to put the following folders into these destinations:

- jquery-ui-1.11.1/ui -> BookingOnline/app/ui
- jquery-ui-1.11.1/themes/base -> BookingOnline/css/ui

We are going to apply the widgets with JavaScript, so the only remaining step to integrate jQuery UI is the insertion of the style sheets inside our application.
We do this by adding the following rows to the top of our custom style sheet file (BookingOnline/css/styles.css): @import url("ui/core.css");@import url("ui/menu.css");@import url("ui/autocomplete.css");@import url("ui/button.css");@import url("ui/datepicker.css");@import url("ui/theme.css") Now, we are ready to add the widgets to our web application. You can find more information about jQuery UI and AMD at: http://learn.jquery.com/jquery-ui/environments/amd/ Making the skeleton from the wireframe We want to give to the user a really nice user experience, but as the first step we can use the wireframe we put before to create a skeleton of the Search Form. Replace the entire content with a form inside the file BookingOnline/components/search.html: <form data-bind="submit: execute"></form> Then, we add the blocks inside the form, step by step, to realize the entire wireframe: <div> <input type="text" placeholder="Enter a destination" /> <label> Check In: <input type="text" /> </label> <label> Check Out: <input type="text" /> </label> <input type="submit" data-bind="enable: isValid" /></div> Here, we built the first row of the wireframe; we will bind data to each field later. We bound the execute function to the submit event (submit: execute), and a validity check to the button (enable: isValid); for now we will create them empty. Update the View Model (search.js) by adding this code inside the constructor: this.isValid = ko.computed(function() {return true;}, this); And add this function to the Search prototype: Search.prototype.execute = function() { }; This is because the validity of the form will depend on the status of the destination field and of the check-in date and check-out date; we will update later, in the next paragraphs. Now, we can continue with the wireframe, with the second block. Here, we should have a field to select the number of rooms, and a block for each room. 
Add the following markup inside the form, after the previous one, for the second row of the View (search.html): <div> <fieldset> <legend>Rooms</legend> <label> Number of Room <select data-bind="options: rangeOfRooms, value: numberOfRooms"> </select> </label> <!-- ko foreach: rooms --> <fieldset> <legend> Room <span data-bind="text: roomNumber"></span> </legend> </fieldset> <!-- /ko --> </fieldset></div> In this markup we ask the user to choose among the values found inside the array rangeOfRooms, save the selection inside a property called numberOfRooms, and show a frame for each room of the array rooms with the room number, roomNumber. During development, the easiest way to check the status of the system is with a simple item inside a View bound to the JSON of a View Model. Put the following code inside the View (search.html): <pre data-bind="text: ko.toJSON($data, null, 2)"></pre> With this code, you can see every change to the status of the system directly in the printed JSON. You can find more information about ko.toJSON at http://knockoutjs.com/documentation/json-data.html Update the View Model (search.js) by adding this code inside the constructor: this.rooms = ko.observableArray([]); this.numberOfRooms = ko.computed({ read: function() { return this.rooms().length; }, write: function(value) { var previousValue = this.rooms().length; if (value > previousValue) { for (var i = previousValue; i < value; i++) { this.rooms.push(new Room(i + 1)); } } else { this.rooms().splice(value); this.rooms.valueHasMutated(); } }, owner: this }); Here, we are creating the array of rooms, and a writable computed property that keeps the array in sync with the selected count. If the new value is bigger than the previous value, it adds the missing items to the array using the constructor Room; otherwise, it removes the excess items from the array.
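The grow-or-shrink logic of the write function can be tried outside KnockoutJS. The following is only a minimal sketch: a plain array stands in for ko.observableArray, and the names resizeRooms and makeRoom are hypothetical helpers that are not part of the project.

```javascript
// Plain-JavaScript sketch of the resize logic used by numberOfRooms.
// Assumption: a plain array replaces ko.observableArray, and makeRoom
// is a hypothetical stand-in for the Room constructor.
function makeRoom(roomNumber) {
  return { roomNumber: roomNumber };
}

function resizeRooms(rooms, count) {
  const previous = rooms.length;
  if (count > previous) {
    // Add the missing rooms, numbering them from 1.
    for (let i = previous; i < count; i++) {
      rooms.push(makeRoom(i + 1));
    }
  } else {
    // Drop every room from index `count` onwards.
    rooms.splice(count);
  }
  return rooms;
}

const rooms = [];
resizeRooms(rooms, 3);
console.log(rooms.map(function (r) { return r.roomNumber; })); // [ 1, 2, 3 ]
resizeRooms(rooms, 1);
console.log(rooms.length); // 1
```

In the real View Model the shrink branch works on the underlying array, which is why the original code calls valueHasMutated afterwards to notify the subscribers.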
To get this code working we have to create a module, Room, and we have to require it here; update the require block in this way: var ko = require("knockout"), template = require("text!./search.html"), Room = require("room"); Also, add this property to the Search prototype: Search.prototype.rangeOfRooms = ko.utils.range(1, 10); Here, we are asking KnockoutJS for an array with the values from the given range. ko.utils.range is a useful method to get an array of integers. Internally, it simply makes an array from the first parameter to the second one; but if you use it inside a computed field and the parameters are observable, it re-evaluates and updates the returning array. Now, we have to create the View Model of the Room module. Create a new file BookingOnline/app/room.js with the following starting code: define(function(require) {var ko = require("knockout");function Room(roomNumber) { this.roomNumber = roomNumber;}return Room;}); Now, our web application should appear like so: As you can see, we now have a fieldset for each room, so we can work on the template of the single room. Here, you can also see in action the previous tip about the pre field with the JSON data. With KnockoutJS 3.2 it is harder to decide when it's better to use a normal template or a component. The rule of thumb is to identify the degree of encapsulation you want to manage: Use the component when you want a self-enclosed black box, or the template if you want to manage the View Model directly. 
What we want to show for each room is: Room number Number of adults Number of children Age of each child We can update the Room View Model (room.js) by adding this code into the constructor: this.numberOfAdults = ko.observable(2);this.ageOfChildren = ko.observableArray([]);this.numberOfChildren = ko.computed({read: function() { return this.ageOfChildren().length;},write: function(value) { var previousValue = this.ageOfChildren().length; if (value > previousValue) { for (var i = previousValue; i < value; i++) { this.ageOfChildren.push(ko.observable(0)); } } else { this.ageOfChildren().splice(value); this.ageOfChildren.valueHasMutated(); }},owner: this});this.hasChildren = ko.computed(function() {return this.numberOfChildren() > 0;}, this); We used the same logic we have used before for the mapping between the count of the room and the count property, to have an array of age of children. We also created a hasChildren property to know whether we have to show the box for the age of children inside the View. We have to add—as we have done before for the Search View Model—a few properties to the Room prototype: Room.prototype.rangeOfAdults = ko.utils.range(1, 10);Room.prototype.rangeOfChildren = ko.utils.range(0, 10);Room.prototype.rangeOfAge = ko.utils.range(0, 17); These are the ranges we show inside the relative select. 
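As a rough illustration of what these ranges contain, here is a plain-JavaScript sketch of an inclusive integer range. The function name range is a hypothetical helper; unlike ko.utils.range, it does not unwrap observable parameters.

```javascript
// Sketch of an inclusive integer range, similar in spirit to ko.utils.range.
// Assumption: both parameters are plain numbers (no observables).
function range(min, max) {
  const result = [];
  for (let i = min; i <= max; i++) {
    result.push(i);
  }
  return result;
}

console.log(range(1, 10));        // [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
console.log(range(0, 17).length); // 18
```

Because both ends are included, the age select built from a 0 to 17 range offers 18 options.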
Now, as the last step, we have to put the template for the room in search.html; add this code inside the fieldset tag, after the legend tag (as you can see here, with the external markup): <!-- ko foreach: rooms --> <fieldset> <legend> Room <span data-bind="text: roomNumber"></span> </legend> <label> Number of adults <select data-bind="options: rangeOfAdults, value: numberOfAdults"></select> </label> <label> Number of children <select data-bind="options: rangeOfChildren, value: numberOfChildren"></select> </label> <fieldset data-bind="visible: hasChildren"> <legend>Age of children</legend> <!-- ko foreach: ageOfChildren --> <select data-bind="options: $parent.rangeOfAge, value: $rawData"></select> <!-- /ko --> </fieldset> </fieldset> <!-- /ko --> Here, we are using the properties we have just defined. We are using rangeOfAge from $parent because inside foreach we changed context, and the property, rangeOfAge, is inside the Room context. Why did I use $rawData to bind the value of the age of the children instead of $data? The reason is that ageOfChildren is an array of observables without any container. If you use $data, KnockoutJS will unwrap the observable, making it one-way bound; but if you use $rawData, you will skip the unwrapping and get the two-way data binding we need here. In fact, if we use the one-way data binding, our model won't get updated at all. If you really don't like the fieldset for children moving to the next row when it appears, you can change the fieldset by adding a class, like this: <fieldset class="inline" data-bind="visible: hasChildren"> Now, your application should appear as follows: Now that we have a really nice starting form, we can update the three main fields to use the jQuery UI Widgets. Realizing an Autocomplete field for the destination As soon as we start to write the code for this field we face the first problem: how can we get the data from the backend? Our team told us that we don't have to care about the backend, so we speak to the backend team to find out how to get the data.
After ten minutes we get three files with the code for all the calls to the backend; all we have to do is to download these files (we already got them with the Starting Package, to avoid another download), and use the function getDestinationByTerm inside the module, services/rest. Before writing the code for the field let's think about which behavior we want for it: When you put three or more letters, it will ask the server for the list of items Each recurrence of the text inside the field into each item should be bold When you select an item, a new button should appear to clear the selection If the current selected item and the text inside the field are different when the focus exits from the field, it should be cleared The data should be taken using the function, getDestinationByTerm, inside the module, services/rest The documentation of KnockoutJS also explains how to create custom binding handlers in the context of RequireJS. The what and why about binding handlers All the bindings we use inside our View are based on the KnockoutJS default binding handler. The idea behind a binding handler is that you should put all the code to manage the DOM inside a component different from the View Model. Other than this, the binding handler should be realized with reusability in mind, so it's always better not to hard-code application logic inside. The KnockoutJS documentation about standard binding is already really good, and you can find many explanations about its inner working in the Appendix, Binding Handler. When you make a custom binding handler it is important to remember that: it is your job to clean after; you should register event handling inside the init function; and you should use the update function to update the DOM depending on the change of the observables. 
This is the standard boilerplate code when you use RequireJS: define(function(require) {var ko = require("knockout"), $ = require("jquery");ko.bindingHandlers.customBindingHandler = { init: function(element, valueAccessor, allBindingsAccessor, data, context) { /* Code for the initialization… */ ko.utils.domNodeDisposal.addDisposeCallback(element, function () { /* Cleaning code … */ }); }, update: function (element, valueAccessor) { /* Code for the update of the DOM… */ }};}); And inside the View Model module you should require this module, as follows: require('binding-handlers/customBindingHandler'); ko.utils.domNodeDisposal is a list of callbacks to be executed when the element is removed from the DOM; it's necessary because it's where you have to put the code to destroy the widgets, or remove the event handlers. Binding handler for the jQuery Autocomplete widget So, now we can write our binding handler. We will define a binding handler named autocomplete, which takes the observable to put the found value. We will also define two custom bindings, without any logic, to work as placeholders for the parameters we will send to the main binding handler. Our binding handler should: Get the value for the autoCompleteOptions and autoCompleteEvents optional data bindings. Apply the Autocomplete Widget to the item using the option of the previous step. Register all the event listeners. Register the disposal of the Widget. We also should ensure that if the observable gets cleared, the input field gets cleared too. 
So, this is the code of the binding handler to put inside BookingOnline/app/binding-handlers/autocomplete.js (I put comments between the code to make it easier to understand): define(function(require) { var ko = require("knockout"), $ = require("jquery"), autocomplete = require("ui/autocomplete"); ko.bindingHandlers.autoComplete = { init: function(element, valueAccessor, allBindingsAccessor, data, context) { Here, we are giving the name autoComplete to the new binding handler, and we are also loading the Autocomplete Widget of jQuery UI: var value = ko.utils.unwrapObservable(valueAccessor()), allBindings = ko.utils.unwrapObservable(allBindingsAccessor()), options = allBindings.autoCompleteOptions || {}, events = allBindings.autoCompleteEvents || {}, $element = $(element); Then, we take the data from the binding for the main parameter, and for the optional binding handlers; we also wrap the current element in a jQuery object: autocomplete(options, $element); if (options._renderItem) { var widget = $element.autocomplete("instance"); widget._renderItem = options._renderItem; } for (var event in events) { ko.utils.registerEventHandler(element, event, events[event]); } Now we can apply the Autocomplete Widget to the field. If you are wondering why we used ko.utils.registerEventHandler here, the answer is: to show you this function. If you look at the source, you can see that under the hood it uses $.bind if jQuery is registered; so in our case we could simply use $.bind or $.on without any problem. But I wanted to show you this function because sometimes you use KnockoutJS without jQuery, and you can use it to support event handling in every supported browser. The source code of the function _renderItem is (looking at the file ui/autocomplete.js): _renderItem: function( ul, item ) { return $( "<li>" ).text( item.label ).appendTo( ul ); }, As you can see, for security reasons, it uses the function text to avoid any possible code injection.
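To make the injection concern concrete, here is a hedged sketch of one way to produce bold matches while still treating the label as plain text. It is not the approach taken in this project, and the helper names escapeHtml, escapeRegExp, and highlightTerm are hypothetical.

```javascript
// Sketch: wrap each occurrence of the typed term in <b> tags while
// escaping everything else. All three helpers are hypothetical names.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

function escapeRegExp(text) {
  // Neutralize characters that have a special meaning in a RegExp.
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function highlightTerm(label, term) {
  if (!term) {
    return escapeHtml(label);
  }
  const pattern = new RegExp(escapeRegExp(term), "gi");
  let result = "";
  let lastIndex = 0;
  let match;
  while ((match = pattern.exec(label)) !== null) {
    // Escape the text before the match, then the match itself.
    result += escapeHtml(label.slice(lastIndex, match.index));
    result += "<b>" + escapeHtml(match[0]) + "</b>";
    lastIndex = match.index + match[0].length;
  }
  return result + escapeHtml(label.slice(lastIndex));
}

console.log(highlightTerm("Rome <Italy>", "rom"));
// <b>Rom</b>e &lt;Italy&gt;
```

A string built this way could be appended as HTML, because the only markup it contains is the b tag added by the helper.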
It is important to know that you should validate data each time you get it from an external source and put it in the page. In this case, the source of data is already secured (because we manage it), so we override the normal behavior, to also show the HTML tag for the bold part of the text. In the last three rows of that code we loop over the events and register them. The standard way to register for events is with the event binding handler. The only reason you should use a custom helper is to give the developer of the View a way to register events more than once. Then, we add to the init function the disposal code: /* handle disposal */ ko.utils.domNodeDisposal.addDisposeCallback(element, function() { $element.autocomplete("destroy"); }); Here, we use the destroy function of the widget. It's really important to clean up after the use of any jQuery UI Widget or you'll create a really bad memory leak; it's not a big problem with simple applications, but it will be a really big problem if you build an SPA. Now, we can add the update function: }, update: function(element, valueAccessor) { var value = valueAccessor(), $element = $(element), data = value(); if (!data) $element.val(""); } }; }); Here, we read the value of the observable, and clear the field if the observable is empty. The update function is executed as a computed observable, so we must be sure that we read, and therefore subscribe to, the observables it depends on. So, pay attention if you put conditional code before the subscription, because your update function might never be called again.
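The subscription pitfall is easier to see with a toy dependency tracker. The sketch below is not KnockoutJS code: observable and computed are simplified, hypothetical stand-ins that only mimic the idea that a dependency is recorded when an observable is read during evaluation.

```javascript
// Toy dependency tracking (an assumption-laden stand-in, NOT Knockout itself).
let currentTracker = null;

function observable(value) {
  const subscribers = new Set();
  function obs(next) {
    if (arguments.length === 0) {
      // A read during an evaluation records a dependency.
      if (currentTracker) currentTracker.add(obs);
      return value;
    }
    value = next;
    Array.from(subscribers).forEach(function (fn) { fn(); });
  }
  obs.subscribers = subscribers;
  return obs;
}

function computed(fn) {
  let deps = new Set();
  function run() {
    deps.forEach(function (d) { d.subscribers.delete(run); });
    deps = new Set();
    const previous = currentTracker;
    currentTracker = deps;
    try { fn(); } finally { currentTracker = previous; }
    deps.forEach(function (d) { d.subscribers.add(run); });
  }
  run();
}

let widgetReady = false;                 // plain variable, not observable
const destination = observable("Rome");
let fragileRuns = 0;
let safeRuns = 0;

computed(function fragileUpdate() {
  fragileRuns++;
  if (!widgetReady) return;              // exits before reading destination
  destination();
});

computed(function safeUpdate() {
  safeRuns++;
  const data = destination();            // read first, so we subscribe
  if (!widgetReady) return;
  if (!data) { /* here the real handler would clear the field */ }
});

widgetReady = true;
destination(undefined);
console.log(fragileRuns, safeRuns); // 1 2
```

The first function never ran again because its only evaluation returned before touching the observable; the second one reads the observable up front, as the update function of the binding handler does.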
Now that the binding is ready, we should require it inside our form; update the View search.html by modifying the following row: <input type="text" placeholder="Enter a destination" /> Into this: <input type="text" placeholder="Enter a destination" data-bind="autoComplete: destination, autoCompleteEvents: destination.events, autoCompleteOptions: destination.options" /> If you try the application you will not see any error; the reason is that KnockoutJS ignores any data binding not registered inside the ko.bindingHandlers object, and we didn't require the binding handler autocomplete module. So, the last step to get everything working is the update of the View Model of the component; add these rows at the top of the search.js, with the other require(…) rows: Room = require("room"), rest = require("services/rest");require("binding-handlers/autocomplete"); We need a reference to our new binding handler, and a reference to the rest object to use it as source of data. Now, we must declare the properties we used inside our data binding; add all these properties to the constructor as shown in the following code: this.destination = ko.observable();this.destination.options = { minLength: 3,source: rest.getDestinationByTerm,select: function(event, data) { this.destination(data.item);}.bind(this),_renderItem: function(ul, item) { return $("<li>").append(item.label).appendTo(ul);}};this.destination.events = {blur: function(event) { if (this.destination() && (event.currentTarget.value !== this.destination().value)) { this.destination(undefined); }}.bind(this)}; Here, we are defining the container (destination) for the data selected inside the field, an object (destination.options) with any property we want to pass to the Autocomplete Widget (you can check all the documentation at: http://api.jqueryui.com/autocomplete/), and an object (destination.events) with any event we want to apply to the field. 
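Both handlers in this code end with .bind(this). The following minimal sketch shows the effect in isolation; the Search constructor here is a stripped-down, hypothetical stand-in, and fakeInput plays the role of the DOM input that the widget would pass as this.

```javascript
// Sketch of why the handlers are bound: without bind, `this` is whatever
// the caller supplies (here a fake input element), not the View Model.
function Search() {
  this.selected = "Rome";

  this.unboundHandler = function () {
    return this.selected;
  };

  this.boundHandler = function () {
    return this.selected;
  }.bind(this);
}

const search = new Search();
const fakeInput = { value: "typed text" }; // stands in for the input element

console.log(search.unboundHandler.call(fakeInput)); // undefined
console.log(search.boundHandler.call(fakeInput));   // Rome
```

Once a function is bound, a later call or apply with a different object no longer changes its this, which is what lets the handlers reach the destination property.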
Here, we are clearing the field if the text inside the field and the content of the saved data (inside destination) are different. Have you noticed .bind(this) in the previous code? Without it, you can check for yourself that the value of this inside these functions is the input field. As you can see, in our code we put references to the destination property of this, so we have to update the context to be the object itself; the easiest way to do this is with a simple call to the bind function. Summary In this article, we have seen some of the functionality of KnockoutJS (core). The application we realized was simple enough, but we used it to learn better how to use components and custom binding handlers. If you think we wrote too much code for such a small project, think about the differences you have seen between the first and the second component: the more component and binding handler code you write, the less you will have to write in the future. The most important point about components and custom binding handlers is that you have to write them with future reuse in mind; the more good code you write, the better it will be for you later. The core point of this article was AMD and RequireJS: how to use them inside a KnockoutJS project, and why you should do it.


How-To Tutorials

Packt

02 Mar 2015

24 min read

Model-View-ViewModel


In this article, by Einar Ingebrigtsen, author of the book, SignalR Blueprints, we will focus on a different programming model for client development: Model-View-ViewModel (MVVM). It will reiterate what you have already learned about SignalR, but you will also start to see a recurring theme in how you should architect decoupled software that adheres to the SOLID principles. It will also show the benefit of thinking in terms of a Single Page Application (SPA), and how well SignalR fits with this idea. (For more resources related to this topic, see here.) The goal – an imagined dashboard A common companion to any application is a way of monitoring its health. Is it running? Are there any failures? Getting this information in real time when a failure occurs is important, and getting some statistics from it is interesting as well. From a SignalR perspective, we will still use the hub abstraction to do pretty much what we have been doing, but the goal is to give ideas of how and what we can use SignalR for. Another goal is to dive into the architectural patterns, making it ready for larger applications. MVVM allows better separation and is very applicable for client development in general. A question that you might ask yourself is why KnockoutJS instead of something like AngularJS? It boils down to personal preference to a certain degree. AngularJS is described as MVW, where W stands for Whatever. I find AngularJS less focused on the same things I focus on and I also find it very verbose to get it up and running. I'm not in any way an expert in AngularJS, but I have used it on a project and I found myself writing a lot to make it work the way I wanted it to in terms of MVVM. However, I don't think it's fair to compare the two. KnockoutJS is very focused on what it's trying to solve, which is just a little piece of the puzzle, while AngularJS is a full client end-to-end framework.
On this note, let's just jump straight to it. Decoupling it all MVVM is a pattern for client development that became very popular in the XAML stack, enabled by Microsoft and based on Martin Fowler's Presentation Model. Its principle is that you have a ViewModel that holds the state and exposes behavior that can be utilized from a view. The view observes any changes of the state the ViewModel exposes, making the ViewModel totally unaware that there is a view. The ViewModel is decoupled, can be tested in isolation, and is perfect for automated testing. Part of the state that the ViewModel typically holds is the model, which is something it usually gets from the server, and a SignalR hub is the perfect transport to get this. It boils down to recognizing the different concerns that make up the frontend and separating it all. This gives us the following diagram: Back to basics This time we will go back in time, going down what might be considered a more purist path; use the browser elements (HTML, JavaScript, and CSS) and don't rely on any server-side rendering. Clients today are powerful and very capable, and offloading the composition of what the user sees onto the client frees up server resources. You can also rely on the infrastructure of the Web for caching, with static HTML files not rendered by the server. In fact, you could actually put these resources on a content delivery network, making the files available as close as possible to the end user. This would result in better load times for the user. You might have other reasons to perform server-side rendering and not just plain HTML. Leveraging existing infrastructure or third-party tools could be those reasons. It boils down to what's right for you. But this particular sample will focus on things that the client can do. Anyway, let's get started. Open Visual Studio and create a new project by navigating to FILE | New | Project.
The following dialog box will show up: From the left-hand side menu, select Web and then ASP.NET Web Application. Enter Chapter4 in the Name textbox and select your location. Select the Empty template from the template selector and make sure you deselect the Host in the cloud option. Then, click on OK, as shown in the following screenshot: Setting up the packages First, we want Twitter Bootstrap. To get this, follow these steps: Add a NuGet package reference. Right-click on References in Solution Explorer, select Manage NuGet Packages, and type Bootstrap in the search dialog box. Select it and then click on Install. We want a slightly different look, so we'll download one of the many Bootstrap themes out there. Add a NuGet package reference called metro-bootstrap. As jQuery is still a part of this, let's add a NuGet package reference to it as well. For the MVVM part, we will use something called KnockoutJS; add it through NuGet as well. Add a NuGet package reference, as in the previous steps, but this time, type SignalR in the search dialog box. Find the package called Microsoft ASP.NET SignalR. Making any SignalR hubs available for the client Add a file called Startup.cs to the root of the project. Add a Configuration method that will expose any SignalR hubs, as follows: public void Configuration(IAppBuilder app) { app.MapSignalR(); } At the top of the Startup.cs file, above the namespace declaration, but right below the using statements, add the following code: [assembly: OwinStartupAttribute(typeof(Chapter4.Startup))] Knocking it out of the park KnockoutJS is a framework that implements a lot of the principles found in MVVM and makes it easier to apply.
We're going to use the following two features of KnockoutJS, and it's therefore important to understand what they are and what significance they have: Observables: In order for a view to be able to know when a state change in a ViewModel occurs, KnockoutJS has something called an observable for single objects or values and an observable array for arrays. BindingHandlers: In the view, the counterparts that are able to recognize the observables and know how to deal with their content are known as BindingHandlers. We create binding expressions in the view that instruct the view to get its content from the properties found in the binding context. The default binding context will be the ViewModel, but there are more advanced scenarios where this changes. In fact, there is a BindingHandler called with that enables you to specify the context at any given time. Our single page Whether one should strive towards having an SPA is widely discussed on the Web these days. My opinion on the subject, in the interest of the user, is that we should really try to push things in this direction. Not having to post back, cause a full reload of the page and all its resources, and get back into the correct state gives the user a better experience. Some of the arguments for performing post-backs every now and then go in the direction of fixing potential memory leaks happening in the browser. Although the technique is sound and the result is right, it really just camouflages a problem one has in the system. However, as with everything, it really depends on the situation. At the core of an SPA is a single page (pun intended), which is usually the index.html file sitting at the root of the project. Add the new index.html file and edit it as follows: Add a new HTML file (index.html) at the root of the project by right-clicking on the Chapter4 project in Solution Explorer. Navigate to Add | New Item | Web from the left-hand side menu, and then select HTML Page and name it index.html.
Finally, click on Add. Let's put in the things we've added dependencies to, starting with the style sheets. In the index.html file, you'll find the <head> tag; add the following code snippet under the <title></title> tag: <link href="Content/bootstrap.min.css" rel="stylesheet" /> <link href="Content/metro-bootstrap.min.css" rel="stylesheet" /> Next, add the following code snippet right beneath the preceding code: <script type="text/javascript" src="Scripts/jquery-1.9.0.min.js"></script> <script type="text/javascript" src="Scripts/jquery.signalR-2.1.1.js"></script> <script type="text/javascript" src="signalr/hubs"></script> <script type="text/javascript" src="Scripts/knockout-3.2.0.js"></script> Another thing we will need is something that helps us visualize things; Google has a free charting library that we will use. We will take a dependency on the JavaScript APIs from Google. To do this, add the following script tag after the others: <script type="text/javascript" src="https://www.google.com/jsapi"></script> Now, we can start filling in the view part. Inside the <body> tag, we start by putting in a header, as shown here: <div class="navbar navbar-default navbar-static-top bsnavbar"> <div class="container"> <div class="navbar-header"> <h1>My Dashboard</h1> </div> </div> </div> The server side of things In this little dashboard, we will look at web requests, both successful and failed. We will do this in a very naive way, without having to flesh out a full mechanism to deal with error situations. Let's start by enabling all requests, even those for static resources such as HTML files, to run through all HTTP modules. A word of warning: there are performance implications of putting all requests through the managed pipeline, so normally, you wouldn't want to do this on a production system, but for this sample, it will be fine to show the concepts.
Open Web.config in the project and add the following code snippet within the <configuration> tag: <system.webServer> <modules runAllManagedModulesForAllRequests="true" /> </system.webServer> The hub In this sample, we will only have one hub, the one that will be responsible for dealing with reporting requests and failed requests. Let's add a new class called RequestStatisticsHub. Right-click on the project in Solution Explorer, select Class from Add, name it RequestStatisticsHub.cs, and then click on Add. The new class should inherit from Hub. Add the following using statement at the top: using Microsoft.AspNet.SignalR; We're going to keep track, in memory on the server, of the count of requests and failed requests over time, with a resolution of 30 seconds. Obviously, if one wants to scale across multiple servers, this is way too naive and one should choose an out-of-process shared key-value store that goes across servers. However, for our purpose, this will be fine. Let's add a using statement at the top, as shown here: using System.Collections.Generic; At the top of the class, add the two dictionaries that we will use to hold this information: static Dictionary<string, int> _requestsLog = new Dictionary<string, int>(); static Dictionary<string, int> _failedRequestsLog = new Dictionary<string, int>(); In our client, we want to access these logs at startup. So let's add two methods to do so: public Dictionary<string, int> GetRequests() { return _requestsLog; } public Dictionary<string, int> GetFailedRequests() { return _failedRequestsLog; } Remember that we only keep track of the number of requests per 30 seconds at a time. There is no default mechanism in the .NET Framework to do this, so we need to add a few helper methods to deal with rounding of time. Let's add a class called DateTimeRounding at the root of the project.
Mark the class as a public static class and put the following extension methods in the class: public static DateTime RoundUp(this DateTime dt, TimeSpan d) { var delta = (d.Ticks - (dt.Ticks % d.Ticks)) % d.Ticks; return new DateTime(dt.Ticks + delta); } public static DateTime RoundDown(this DateTime dt, TimeSpan d) { var delta = dt.Ticks % d.Ticks; return new DateTime(dt.Ticks - delta); } public static DateTime RoundToNearest(this DateTime dt, TimeSpan d) { var delta = dt.Ticks % d.Ticks; bool roundUp = delta > d.Ticks / 2; return roundUp ? dt.RoundUp(d) : dt.RoundDown(d); } Let's go back to the RequestStatisticsHub class and add some more functionality now so that we can deal with rounding of time: static void Register(Dictionary<string, int> log, Action<dynamic, string, int> hubCallback) { var now = DateTime.Now.RoundToNearest(TimeSpan.FromSeconds(30)); var key = now.ToString("HH:mm"); if (log.ContainsKey(key)) log[key] = log[key] + 1; else log[key] = 1; var hub = GlobalHost.ConnectionManager.GetHubContext<RequestStatisticsHub>(); hubCallback(hub.Clients.All, key, log[key]); } public static void Request() { Register(_requestsLog, (hub, key, value) => hub.requestCountChanged(key, value)); } public static void FailedRequest() { Register(_failedRequestsLog, (hub, key, value) => hub.failedRequestCountChanged(key, value)); } This gives us a single place to call in order to report requests, and the updated counts get published back to any clients connected to this particular hub. Note the usage of GlobalHost and its ConnectionManager property. When we want to get a hub instance and we are not in the hub context of a method being called from a client, we use ConnectionManager to get it. It gives us a proxy for the hub and enables us to call methods on any connected client. Naively dealing with requests With all this in place, we will be able to easily and naively deal with what we consider correct and failed requests.
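The rounding and counting done by the DateTimeRounding helpers and the Register method earlier in this section can be checked quickly with the same arithmetic in JavaScript. This is only a sketch under stated assumptions: it works on millisecond timestamps in UTC rather than .NET ticks in local time, and roundToNearest, bucketKey, and register are hypothetical names mirroring the C# code.

```javascript
// Sketch of the 30-second rounding and per-key counting.
// Assumptions: millisecond timestamps, UTC clock, hypothetical helper names.
const STEP_MS = 30 * 1000;

function roundToNearest(timeMs, stepMs) {
  const delta = timeMs % stepMs;
  const roundUp = delta > stepMs / 2;
  return roundUp ? timeMs + (stepMs - delta) : timeMs - delta;
}

function bucketKey(timeMs) {
  // Same "HH:mm" shape as the key used on the server.
  const d = new Date(roundToNearest(timeMs, STEP_MS));
  const hh = String(d.getUTCHours()).padStart(2, "0");
  const mm = String(d.getUTCMinutes()).padStart(2, "0");
  return hh + ":" + mm;
}

function register(log, timeMs) {
  const key = bucketKey(timeMs);
  log[key] = (log[key] || 0) + 1;
  return key;
}

const requestsLog = {};
register(requestsLog, Date.UTC(2015, 2, 2, 12, 0, 10)); // rounds to 12:00:00
register(requestsLog, Date.UTC(2015, 2, 2, 12, 0, 40)); // rounds to 12:00:30
register(requestsLog, Date.UTC(2015, 2, 2, 12, 0, 50)); // rounds to 12:01:00
console.log(requestsLog); // { '12:00': 2, '12:01': 1 }
```

One thing the sketch makes visible is that, with an HH:mm key, the :00 and :30 slots of the same minute share one dictionary entry; a key that includes seconds would keep them apart.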
Let's add a Global.asax file by right-clicking on the project in Solution Explorer and selecting New Item from Add. Navigate to Web and find Global Application Class, then click on Add. In the new file, we want to replace the generated Application_AuthenticateRequest method with the following code snippet (File lives in the System.IO namespace, so add a using statement for it if it is missing):

    protected void Application_AuthenticateRequest(object sender, EventArgs e)
    {
        var path = HttpContext.Current.Request.Path;
        if (path == "/") path = "index.html";
        if (path.ToLowerInvariant().IndexOf(".html") < 0) return;
        var physicalPath = HttpContext.Current.Request.MapPath(path);
        if (File.Exists(physicalPath))
        {
            RequestStatisticsHub.Request();
        }
        else
        {
            RequestStatisticsHub.FailedRequest();
        }
    }

Basically, with this, we are only measuring requests with .html in their path, and if the path is only "/", we assume it's "index.html". Any request for a file that does not exist is considered an error, typically a 404 error, and we register it as a failed request.

Bringing it all back to the client

With the server taken care of, we can start consuming all this in the client. We will now be heading down the path of creating a ViewModel and hooking everything up.

ViewModel

Let's start by adding a JavaScript file sitting next to our index.html file at the root level of the project, and call it index.js. This file will represent our ViewModel. This script will also be responsible for setting up KnockoutJS, so that the ViewModel is in fact activated and applied to the page. As we only have this one page for this sample, this will be fine. Let's start by hooking up the jQuery document ready handler:

    $(function() { });

Inside the function created here, we will enter our viewModel definition, which will start off being an empty one:

    var viewModel = function() { };

KnockoutJS has a function to apply a viewModel to the document, meaning that the document or body will be associated with the viewModel instance given.
Right under the definition of viewModel, add the following line:

    ko.applyBindings(new viewModel());

Compiling this and running it should at the very least not give you any errors, but it will show nothing more than a header saying My Dashboard. So, we need to liven this up a bit. Inside the viewModel function definition, add the following code snippet:

    var self = this;
    this.requests = ko.observableArray();
    this.failedRequests = ko.observableArray();

We store a reference to this in a variable called self. This will help us with scoping issues later on. The arrays we added are KnockoutJS observable arrays, which allow the view or any BindingHandler to observe the changes that are coming in.

Both ko.observableArray() and ko.observable() return a function. So, if you want to access the value held inside, you must unwrap it by calling the function, which might seem counterintuitive at first, because you might think of your variable as just another property. However, for observableArray(), KnockoutJS adds most of the functions found on the array type in JavaScript, and they can be used directly on the function without unwrapping. If you look at a variable that is an observableArray in the console of the browser, you'll see that it looks as if it actually is just an array. This is not really true though; to get to the values, you have to unwrap it by adding () after the variable. However, all the functions you're used to having on an array are there.

Let's add a function to the viewModel function that knows how to handle an incoming entry.
An entry coming in is either an existing one or a new one; the key of the entry is what decides it:

    function handleEntry(log, key, value) {
        // some() stops at the first match and reports whether one was found;
        // forEach() always returns undefined, so it cannot be used for this check
        var found = log().some(function (entry) {
            if (entry[0] == key) {
                entry[1](value);
                return true;
            }
            return false;
        });
        if (!found) {
            log.push([key, ko.observable(value)]);
        }
    }

Let's set up the hub and add the following code to the viewModel function:

    var hub = $.connection.requestStatisticsHub;
    var initializedCount = 0;
    hub.client.requestCountChanged = function (key, value) {
        if (initializedCount < 2) return;
        handleEntry(self.requests, key, value);
    };
    hub.client.failedRequestCountChanged = function (key, value) {
        if (initializedCount < 2) return;
        handleEntry(self.failedRequests, key, value);
    };

You might notice the initializedCount variable. Its purpose is to ignore incoming updates until both logs have been loaded, which comes next. Add the following code snippet to the viewModel function:

    $.connection.hub.start().done(function () {
        hub.server.getRequests().done(function (requests) {
            for (var property in requests) {
                handleEntry(self.requests, property, requests[property]);
            }
            initializedCount++;
        });
        hub.server.getFailedRequests().done(function (requests) {
            for (var property in requests) {
                handleEntry(self.failedRequests, property, requests[property]);
            }
            initializedCount++;
        });
    });

We should now have enough logic in our viewModel function to get any requests already sitting there and also respond to new ones coming in.

BindingHandler

The key element of KnockoutJS is its BindingHandler mechanism. In KnockoutJS, everything starts with a data-bind="" attribute on an element in the HTML view. Inside the attribute, one puts binding expressions, and the BindingHandlers are key to this. Every expression starts with the name of the handler. For instance, if you have an <input> tag and you want to get the value from the input into a property on the ViewModel, you would use the BindingHandler value.
There are a few BindingHandlers out of the box to deal with the common scenarios (text, value, foreach, and more). All of the BindingHandlers are very well documented on the KnockoutJS site. For this sample, we will actually create our own BindingHandler. KnockoutJS is highly extensible and allows you to do just this, amongst other extensibility points. Let's add a JavaScript file called googleCharts.js at the root of the project. Inside it, add the following code:

    google.load('visualization', '1.0', { 'packages': ['corechart'] });

This will tell the Google API to enable the charting package. The next thing we want to do is to define the BindingHandler. Any handler has the option of setting up an init function and an update function. The init function is called once, when the binding is first initialized. More precisely, it is called when the binding context is set; if the parent binding context of the element changes, it will be called again. The update function will be called whenever there is a change in one or more of the observables that the binding expression refers to. For our sample, we will use the init function only and respond to changes manually, because we have a more involved scenario than what the default mechanism would provide us with. The update function that you can add to a BindingHandler has exactly the same signature as the init function. Let's add the following code underneath the load call:

    ko.bindingHandlers.lineChart = {
        init: function (element, valueAccessor, allValueAccessors, viewModel, bindingContext) {
        }
    };

This is the core structure of a BindingHandler. As you can see, we've named the BindingHandler lineChart. This is the name we will use in our view later on. The signatures of init and update are the same.
The first parameter represents the element that holds the binding expression, whereas the second parameter, valueAccessor, holds a function that enables us to access the value that results from the expression. KnockoutJS parses the expression internally and figures out how to expand any values, and so on. Add the following code into the init function:

    var optionsInput = valueAccessor();
    var options = {
        title: optionsInput.title,
        width: optionsInput.width || 300,
        height: optionsInput.height || 300,
        backgroundColor: 'transparent',
        animation: { duration: 1000, easing: 'out' }
    };
    var dataHash = {};
    var chart = new google.visualization.LineChart(element);
    var data = new google.visualization.DataTable();
    data.addColumn('string', 'x');
    data.addColumn('number', 'y');

    function addRow(row, rowIndex) {
        var value = row[1];
        if (ko.isObservable(value)) {
            // Redraw whenever the value of this row changes
            value.subscribe(function (newValue) {
                data.setValue(rowIndex, 1, newValue);
                chart.draw(data, options);
            });
        }
        var actualValue = ko.unwrap(value);
        data.addRow([row[0], actualValue]);
        dataHash[row[0]] = actualValue;
    }

    optionsInput.data().forEach(addRow);
    optionsInput.data.subscribe(function (newValue) {
        newValue.forEach(function (row, rowIndex) {
            if (!dataHash.hasOwnProperty(row[0])) {
                addRow(row, rowIndex);
            }
        });
        chart.draw(data, options);
    });
    chart.draw(data, options);

As you can see, observables have a function called subscribe(), which works the same way for both an observable array and a regular observable. The code adds a subscription to the array itself; if there is any change to the array, we find the change and add any new row to the chart. In addition, when we create a new row, we subscribe to any change in its value so that we can update the chart. In the ViewModel, the values were converted into observable values to accommodate this.

View

Go back to the index.html file; we need the UI for the two charts we're going to have.
Plus, we need to get both the new BindingHandler and the ViewModel loaded. Add the following script references after the last script reference already present, as shown here:

    <script type="text/javascript" src="googleCharts.js"></script>
    <script type="text/javascript" src="index.js"></script>

Inside the <body> tag, below the header, we want to add a bootstrap container and a row to hold two metro-styled tiles that use our new BindingHandler. Also, we want a footer sitting at the bottom, as shown in the following code:

    <div class="container">
      <div class="row">
        <div class="col-sm-6 col-md-4">
          <div class="thumbnail tile tile-green-sea tile-large">
            <div data-bind="lineChart: { title: 'Web Requests', width: 300, height: 300, data: requests }"></div>
          </div>
        </div>
        <div class="col-sm-6 col-md-4">
          <div class="thumbnail tile tile-pomegranate tile-large">
            <div data-bind="lineChart: { title: 'Failed Web Requests', width: 300, height: 300, data: failedRequests }"></div>
          </div>
        </div>
      </div>
      <hr />
      <footer class="bs-footer" role="contentinfo">
        <div class="container"> The Dashboard </div>
      </footer>
    </div>

Note that data: requests and data: failedRequests are part of the binding expressions. These will be handled and resolved by KnockoutJS internally and pointed to the observable arrays on the ViewModel. The other properties are options that go into the BindingHandler, which forwards them to the Google Charting APIs.

Trying it all out

Running the preceding code (Ctrl + F5) should yield the following result: If you open a second browser and go to the same URL, you will see the chart change in real time. Waiting for approximately 30 seconds and refreshing the browser should add a second point automatically and also animate the chart accordingly. Typing a URL with a file that does not exist should have the same effect on the failed requests chart.
Summary

In this article, we had a brief encounter with MVVM as a pattern, with the sole purpose of establishing good practices for your client code. We applied it in a single-page application setting, with SignalR sprinkled on top to communicate from the server to any connected client.
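As a recap of the update-or-append logic that the ViewModel applies to incoming entries, here is a minimal sketch using plain arrays and plain values instead of KnockoutJS observables (that substitution is an assumption made to keep the example self-contained). It also shows why the lookup needs find() or some(): Array.prototype.forEach ignores what its callback returns and always evaluates to undefined.

```javascript
// Entries are [key, value] pairs, mirroring the shape used in the ViewModel,
// but the value is a plain number here rather than a ko.observable.
function handleEntry(log, key, value) {
  const existing = log.find(function (entry) {
    return entry[0] === key;
  });
  if (existing) {
    existing[1] = value;     // known key: update the count in place
  } else {
    log.push([key, value]);  // new key: append a new data point
  }
  return log;
}

const requests = [];
handleEntry(requests, "12:00", 1);
handleEntry(requests, "12:00", 2); // same key, so the entry is updated
handleEntry(requests, "12:01", 1); // new key, so an entry is appended
console.log(requests);

// forEach cannot report a match back to the caller:
const forEachResult = requests.forEach(function () { return true; });
console.log(forEachResult); // undefined
```

With observables in place of the plain numbers, the update branch would call the observable with the new value, which is what lets the chart redraw a single point.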


How-To Tutorials


Packt

02 Mar 2015

18 min read

Building a Color Picker with Hex RGB Conversion

In this article by Vijay Joshi, author of the book Mastering jQuery UI, we are going to create a color selector, or color picker, that will allow users to change the text and background color of a page using the slider widget. We will also use the spinner widget to represent individual colors. Any change in colors using the slider will update the spinner and vice versa. The hex value of both text and background colors will also be displayed dynamically on the page. (For more resources related to this topic, see here.) This is how our page will look after we have finished building it:

Setting up the folder structure

To set up the folder structure, follow this simple procedure:

Create a folder named Article inside the MasteringjQueryUI folder.
Directly inside this folder, create an HTML file and name it index.html.
Copy the js and css folders inside the Article folder as well.
Now go inside the js folder and create a JavaScript file named colorpicker.js.

With the folder setup complete, let's start to build the project.

Writing markup for the page

The index.html page will consist of two sections. The first section will be a text block with some text written inside it, and the second section will have our color picker controls. We will create separate controls for text color and background color. Inside the index.html file, write the following HTML code to build the page skeleton:

    <html>
    <head>
      <link rel="stylesheet" href="css/ui-lightness/jquery-ui-1.10.4.custom.min.css">
    </head>
    <body>
      <div class="container">
        <div class="ui-state-highlight" id="textBlock">
          <p> Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. </p>
          <p> Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. </p>
          <p> Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum. </p>
        </div>
        <div class="clear"> </div>
        <ul class="controlsContainer">
          <li class="left">
            <div id="txtRed" class="red slider" data-spinner="sptxtRed" data-type="text"></div><input type="text" value="0" id="sptxtRed" data-slider="txtRed" readonly="readonly" />
            <div id="txtGreen" class="green slider" data-spinner="sptxtGreen" data-type="text"></div><input type="text" value="0" id="sptxtGreen" data-slider="txtGreen" readonly="readonly" />
            <div id="txtBlue" class="blue slider" data-spinner="sptxtBlue" data-type="text"></div><input type="text" value="0" id="sptxtBlue" data-slider="txtBlue" readonly="readonly" />
            <div class="clear"> </div>
            Text Color : <span>#000000</span>
          </li>
          <li class="right">
            <div id="bgRed" class="red slider" data-spinner="spBgRed" data-type="bg" ></div><input type="text" value="255" id="spBgRed" data-slider="bgRed" readonly="readonly" />
            <div id="bgGreen" class="green slider" data-spinner="spBgGreen" data-type="bg" ></div><input type="text" value="255" id="spBgGreen" data-slider="bgGreen" readonly="readonly" />
            <div id="bgBlue" class="blue slider" data-spinner="spBgBlue" data-type="bg" ></div><input type="text" value="255" id="spBgBlue" data-slider="bgBlue" readonly="readonly" />
            <div class="clear"> </div>
            Background Color : <span>#ffffff</span>
          </li>
        </ul>
      </div>
      <script src="js/jquery-1.10.2.js"></script>
      <script src="js/jquery-ui-1.10.4.custom.min.js"></script>
      <script src="js/colorpicker.js"></script>
    </body>
    </html>

We started by including the jQuery UI CSS file inside the head section. Proceeding to the body section, we created a div with the container class, which will act as the parent div for all the page elements. Inside this div, we created another div with the id value textBlock and a ui-state-highlight class. We then put some text content inside this div. For this example, we have made three paragraph elements, each having some filler text inside it. After div#textBlock, there is an unordered list with the controlsContainer class.
This ul element has two list items inside it. The first list item has the CSS class left applied to it and the second has the CSS class right applied to it. Inside li.left, we created three div elements. Each of these three div elements will be converted to a jQuery UI slider and will represent the red (R), green (G), and blue (B) color code, respectively. Next to each of these divs is an input element where the current color code will be displayed. This input will be converted to a spinner as well.

Let's look at the first slider div and the input element next to it. The div has the id txtRed and two CSS classes, red and slider, applied to it. The red class will be used to style the slider, and the slider class will be used in our colorpicker.js file. Note that this div also has two data attributes attached to it. The first is data-spinner, whose value is the id of the input element next to the slider div, which we have provided as sptxtRed. The second attribute is data-type, whose value is text. The purpose of the data-type attribute is to let us know whether this slider will be used for changing the text color or the background color.

Moving on to the input element next to the slider, we have set its id as sptxtRed, which must match the value of the data-spinner attribute on the slider div. It has another attribute named data-slider, which contains the id of the slider it is related to. Hence, its value is txtRed. Similarly, all the slider elements have been created inside li.left, and each slider has an input next to it. The data-type attribute will have the text value for all sliders inside li.left. All input elements have also been assigned a value of 0, as the initial text color will be black. The same pattern that has been followed for elements inside li.left is also followed for elements inside li.right. The only difference is that the data-type value will be bg for slider divs.
For all input elements, a value of 255 is set as the background color is white in the beginning. In this manner, all the six sliders and the six input elements have been defined. Note that each element has a unique ID. Finally, there is a span element inside both div.left and div.right. The hex color code will be displayed inside it. We have placed #000000 as the default value for the text color inside the span for the text color and #ffffff as the default value for the background color inside the span for background color. Lastly, we have included the jQuery source file, the jQuery UI source file, and the colorpicker.js file. With the markup ready, we can now write the properties for the CSS classes that we used here. Styling the content To make the page presentable and structured, we need to add CSS properties for different elements. We will do this inside the head section. Go to the head section in the index.html file and write these CSS properties for different elements: <style type="text/css"> body{ color:#025c7f; font-family:Georgia,arial,verdana; width:700px; margin:0 auto; } .container{ margin:0 auto; font-size:14px; position:relative; width:700px; text-align:justify; } #textBlock{ color:#000000; background-color: #ffffff; } .ui-state-highlight{ padding: 10px; background: none; } .controlsContainer{ border: 1px solid; margin: 0; padding: 0; width: 100%; float: left; } .controlsContainer li{ display: inline-block; float: left; padding: 0 0 0 50px; width: 299px; } .controlsContainer div.ui-slider{ margin: 15px 0 0; width: 200px; float:left; } .left{ border-right: 1px solid; } .clear{ clear: both; } .red .ui-slider-range{ background: #ff0000; } .green .ui-slider-range{ background: #00ff00; } .blue .ui-slider-range{ background: #0000ff; } .ui-spinner{ height: 20px; line-height: 1px; margin: 11px 0 0 15px; } input[type=text]{ margin-top: 0; width: 30px; } </style> First, we defined some general rules for page body and div .container. 
Then, we defined the initial text color and background color for the div with id textBlock. Next, we defined the CSS properties for the unordered list ul .controlsContainer and its list items. We have provided some padding and width to each list item. We have also specified the width and other properties for the slider as well. Since the class ui-slider is added by jQuery UI to a slider element after it is initialized, we have added our properties in the .controlsContainer div .ui-slider rule. To make the sliders attractive, we then defined the background colors for each of the slider bars by defining color codes for red, green, and blue classes. Lastly, CSS rules have been defined for the spinner and the input box. We can now check our progress by opening the index.html page in our browser. Loading it will display a page that resembles the following screenshot: It is obvious that sliders and spinners will not be displayed here. This is because we have not written the JavaScript code required to initialize those widgets. Our next section will take care of them. Implementing the color picker In order to implement the required functionality, we first need to initialize the sliders and spinners. Whenever a slider is changed, we need to update its corresponding spinner as well, and conversely if someone changes the value of the spinner, we need to update the slider to the correct value. In case any of the value changes, we will then recalculate the current color and update the text or background color depending on the context. Defining the object structure We will organize our code using the object literal. We will define an init method, which will be the entry point. All event handlers will also be applied inside this method. To begin with, go to the js folder and open the colorpicker.js file for editing. 
In this file, write the code that will define the object structure and a call to it:

    var colorPicker = {
        init : function () { },
        setColor : function(slider, value) { },
        getHexColor : function(sliderType) { },
        convertToHex : function (val) { }
    };

    $(function() {
        colorPicker.init();
    });

An object named colorPicker has been defined with four methods. Let's see what all these methods will do:

init: This method will be the entry point where we will initialize all components and add any event handlers that are required.
setColor: This method will be the main method that will take care of updating the text and background colors. It will also update the value of the spinner whenever the slider moves. This method has two parameters: the slider that was moved and its current value.
getHexColor: This method will be called from within setColor, and it will return the hex code based on the RGB values in the spinners. It takes a sliderType parameter, based on which we will decide which color has to be changed; that is, text color or background color. The actual hex code will be calculated by the next method.
convertToHex: This method will convert the value of a single R, G, or B component into its corresponding hex value and return it to the getHexColor method.

This was an overview of the methods we are going to use. Now we will implement these methods one by one, and you will understand them in detail. After the object definition, there is jQuery's $(document).ready() event handler, which will call the init method of our object.

The init method

In the init method, we will initialize the sliders and the spinners and set the default values for them as well.
Write the following code for the init method in the colorpicker.js file: init : function () { var t = this; $( ".slider" ).slider( { range: "min", max: 255, slide : function (event, ui) { t.setColor($(this), ui.value); }, change : function (event, ui) { t.setColor($(this), ui.value); } }); $('input').spinner( { min :0, max : 255, spin : function (event, ui) { var sliderRef = $(this).data('slider'); $('#' + sliderRef).slider("value", ui.value); } }); $( "#txtRed, #txtGreen, #txtBlue" ).slider('value', 0); $( "#bgRed, #bgGreen, #bgBlue" ).slider('value', 255); } In the first line, we stored the current scope value, this, in a local variable named t. Next, we will initialize the sliders. Since we have used the CSS class slider on each slider, we can simply use the .slider selector to select all of them. During initialization, we provide four options for sliders: range, max, slide, and change. Note the value for max, which has been set to 255. Since the value for R, G, or B can be only between 0 and 255, we have set max as 255. We do not need to specify min as it is 0 by default. The slide method has also been defined, which is invoked every time the slider handle moves. The call back for slide is calling the setColor method with an instance of the current slider and the value of the current slider. The setColor method will be explained in the next section. Besides slide, the change method is also defined, which also calls the setColor method with an instance of the current slider and its value. We use both the slide and change methods. This is because a change is called once the user has stopped sliding the slider handle and the slider value has changed. Contrary to this, the slide method is called each time the user drags the slider handle. Since we want to change colors while sliding as well, we have defined the slide as well as change methods. It is time to initialize the spinners now. The spinner widget is initialized with three properties. 
These are min, max, and spin. The min and max options have been set to 0 and 255, respectively. Every time the up/down button on the spinner is clicked or the up/down arrow key is used, the spin callback will be called. Inside this callback, $(this) refers to the current spinner. We find the slider related to this spinner by reading the data-slider attribute of the spinner. Once we have the right slider, we set its value using the value method of the slider widget. Note that calling the value method will trigger the change event of the slider as well. This is the primary reason we have defined a callback for the change event while initializing the sliders.

Lastly, we set the default values for the sliders. For sliders inside li.left, we have set the value to 0, and for sliders inside li.right, the value is set to 255. You can now check the page in your browser. You will find that the slider and the spinner elements are initialized now, with the values we specified: You can also see that changing the spinner value using either the mouse or the keyboard will update the value of the slider as well. However, changing the slider value will not update the spinner. We will handle this in the next section, where we will change colors as well.

Changing colors and updating the spinner

The setColor method is called each time the slider or the spinner value changes. We will now define this method to change the color based on whether the slider's or spinner's value was changed.
Go to the setColor method declaration and write the following code:

    setColor : function(slider, value) {
        var t = this;
        var spinnerRef = slider.data('spinner');
        $('#' + spinnerRef).spinner("value", value);
        var sliderType = slider.data('type');
        var hexColor = t.getHexColor(sliderType);
        if (sliderType == 'text') {
            $('#textBlock').css({'color' : hexColor});
            $('.left span:last').text(hexColor);
        } else {
            $('#textBlock').css({'background-color' : hexColor});
            $('.right span:last').text(hexColor);
        }
    }

In the preceding code, we receive the current slider and its value as parameters. First, we get the spinner related to this slider using the data attribute spinner. Then we set the value of the spinner to the current value of the slider. Now we find out the type of slider for which setColor is being called and store it in the sliderType variable. The value of sliderType will either be text, in the case of sliders inside li.left, or bg, in the case of sliders inside li.right. In the next line, we call the getHexColor method and pass the sliderType variable as its argument. The getHexColor method will return the hex color code for the selected color. Next, based on the sliderType value, we set the color of div#textBlock. If sliderType is text, we set the color CSS property of div#textBlock and display the selected hex code in the span inside li.left. If the sliderType value is bg, we set the background color for div#textBlock and display the hex code for the background color in the span inside li.right.

The getHexColor method

In the preceding section, we called the getHexColor method with the sliderType argument. Let's define it first, and then we will go through it in detail.
Write the following code to define the getHexColor method:

    getHexColor : function(sliderType) {
        var t = this;
        var allInputs;
        var hexCode = '#';
        if (sliderType == 'text') {
            // text color
            allInputs = $('.left').find('input[type=text]');
        } else {
            // background color
            allInputs = $('.right').find('input[type=text]');
        }
        allInputs.each(function (index, element) {
            hexCode += t.convertToHex($(element).val());
        });
        return hexCode;
    }

The local variable t stores this so that we can refer to the current object later. Another variable, allInputs, is declared, and lastly a variable to store the hex code is declared, whose value is set to # initially. Next comes the if condition, which checks the value of the sliderType parameter. If the value of sliderType is text, it means we need to get all the spinner values to change the text color. Hence, we use jQuery's find method to retrieve all input boxes inside li.left. If the value of sliderType is bg, it means we need to change the background color. Therefore, the else block will be executed and all input boxes inside li.right will be retrieved.

To convert the color to hex, the individual values for red, green, and blue have to be converted to hex and then concatenated to get the full color code. Therefore, we iterate over the inputs using the .each method. Another method, convertToHex, is called, which converts the value of a single input to hex. Inside the each callback, we keep concatenating the hex value of the R, G, and B components to the hexCode variable. Once all iterations are done, we return hexCode to the calling function, where it is used.

Converting to hex

convertToHex is a small method that accepts a value and converts it to its hex equivalent. Here is the definition of the convertToHex method:

    convertToHex : function (val) {
        var x = parseInt(val, 10).toString(16);
        return x.length == 1 ? "0" + x : x;
    }

Inside the method, we first convert the received value to an integer using the parseInt function, and then we use JavaScript's toString method with a radix of 16 to convert it to hex. In the next line, we check the length of the converted hex value. Since we want the 6-character hash notation for the color (such as #ff00ff), we need two characters each for red, green, and blue. If the value is only one character long, we prepend a 0 to make it two characters. The hex value is then returned to the calling function.

With this, our implementation is complete and we can check it in a browser. Load the page in your browser and play with the sliders and spinners. You will see the text or background color changing, based on their value: You will also see the hex code displayed below the sliders. Also note that changing the sliders will change the value of the corresponding spinner and vice versa.

Improving the Colorpicker

This was a very basic tool that we built. You can add many more features to it and enhance its functionality. Here are some ideas to get you started:

Convert it into a widget where all the required DOM for sliders and spinners is created dynamically
Instead of two sliders, incorporate the text and background changing ability into a single slider with two handles, but keep two spinners as usual

Summary

In this article, we created a basic color picker/changer using sliders and spinners. You can use it to view and change the colors of your pages dynamically.

How-To Tutorials

Packt

02 Mar 2015

19 min read

Entity Framework DB First – Inheritance Relationships between Entities

Packt

02 Mar 2015

19 min read

Dealing with Interrupts

This article is written by Francis Perea, the author of the book Arduino Essentials.

In all our previous projects, we have been constantly looking for events to occur. We have been polling, but polling takes a relatively big effort and wastes CPU cycles only to find out, most of the time, that nothing has happened.

In this article, we will learn about interrupts as a totally new way to deal with events: being notified about them instead of constantly looking for them.

Interrupts are really helpful in projects in which fast or unpredictable events may occur, and we will see this in a very interesting project in which we develop a digital tachograph for a computer-controlled motor.

Are you ready? Here we go!

(For more resources related to this topic, see here.)

The concept of an interrupt

As you may have guessed, an interrupt is a special mechanism built into the CPU that gives it a direct channel through which it is notified when some event occurs. Most Arduino boards have two of these:

Interrupt 0 on digital pin 2
Interrupt 1 on digital pin 3

But some models, such as the Mega2560, come with up to six interrupt pins.

Once an interrupt is raised, the CPU stops what it was doing and attends to it by running a dedicated function in our code called the Interrupt Service Routine (ISR).

When I say that the CPU completely stops, I mean that even timekeeping is affected: while the ISR is being executed, the value returned by millis() is not updated and delay() does not work.
Interrupts can be programmed to respond to different changes of the signal connected to the corresponding pin, and the Arduino language has four predefined constants, one for each of these modes:

LOW: The interrupt is triggered whenever the pin is LOW
CHANGE: The interrupt is triggered when the pin changes its value from HIGH to LOW or vice versa
RISING: The interrupt is triggered when the signal goes from LOW to HIGH
FALLING: Just the opposite of RISING; the interrupt is triggered when the signal goes from HIGH to LOW

The ISR

The function that the CPU calls whenever an interrupt occurs is so important to the microcontroller that it has to follow a few rules:

It can't take any parameters
It can't return anything
Only one ISR can be executed at a time

The first two points mean that we can neither pass data to the ISR nor receive data from it directly, but we have other means to communicate with the function: we will use global variables. We can set and read a global variable inside an ISR, but these variables have to be declared in a special way. We have to declare them as volatile, as we will see later on in the code.

The third point, that only one ISR can be attended to at a time, is what prevents millis() from being updated. The millis() function relies on an interrupt to be updated, and this doesn't happen while another interrupt is being served.

As you may understand, the ISR is critical to correct code execution in a microcontroller. As a rule of thumb, we will try to keep our ISRs as simple as possible and leave all the heavyweight processing outside of them, in the main loop of our code.
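The rules above can be sketched in plain C++ that runs on a desktop rather than on a board. This is only a model under stated assumptions: onPulse() and takeSample() are hypothetical names, the "interrupt" is an ordinary function call, and the comment about noInterrupts() and interrupts() only marks where an Arduino sketch would protect the copy.

```cpp
#include <cstdint>

// Shared between the "ISR" and the main code, hence volatile:
// the compiler must re-read it from memory on every access.
volatile std::uint32_t pulseCount = 0;

// Stand-in for an ISR (hypothetical name): no parameters,
// no return value, and as little work as possible.
void onPulse() {
    pulseCount = pulseCount + 1;
}

// Called from the main loop (hypothetical name): copy the counter,
// reset it, and return the copy so the heavy work happens outside the ISR.
// On a real 8-bit board these two accesses would sit between
// noInterrupts() and interrupts(), so an ISR cannot fire halfway through.
std::uint32_t takeSample() {
    std::uint32_t snapshot = pulseCount;
    pulseCount = 0;
    return snapshot;
}
```

Calling onPulse() three times and then takeSample() yields 3 and leaves the counter at 0 for the next sample. The only channel between the two functions is the global variable, which is exactly the restriction the first two rules impose.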
The tachograph project To understand and manage interrupts in our projects, I would like to offer you a very particular one, a tachograph, a device that is present in all our cars and whose mission is to account for revolutions, normally the engine revolutions, but also in brake systems such as Anti-lock Brake System (ABS) and others. Mechanical considerations Well, calling it mechanical perhaps is too much, but let's make some considerations regarding how we are going to make our project account for revolutions. For this example project, I have used a small DC motor driven through a small transistor and, like in lots of industrial applications, an encoded wheel is a perfect mechanism to read the number of revolutions. By simply attaching a small disc of cardboard perpendicularly to your motor shaft, it is very easy to achieve it. By using our old friend, the optocoupler, we can sense something between its two parts, even with just a piece of cardboard with a small slot in just one side of its surface. Here, you can see the template I elaborated for such a disc, the cross in the middle will help you position the disc as perfectly as possible, that is, the cross may be as close as possible to the motor shaft. The slot has to be cut off of the black rectangle as shown in the following image: The template for the motor encoder Once I printed it, I glued it to another piece of cardboard to make it more resistant and glued it all to the crown already attached to my motor shaft. If yours doesn't have a surface big enough to glue the encoder disc to its shaft, then perhaps you can find a solution by using just a small piece of dough or similar to it. Once the encoder disc is fixed to the motor and spins attached to the motor shaft, we have to find a way to place the optocoupler in a way that makes it able to read through the encoder disc slot. 
In my case, just a pair of drops of glue did the trick, but if your optocoupler or motor doesn't allow you to apply this solution, I'm sure that a pair of zip ties or a small piece of dough can give you another way to fix it to the motor too. In the following image, you can see my final assembled motor with its encoder disc and optocoupler ready to be connected to the breadboard through alligator clips: The complete assembly for the motor encoder Once we have prepared our motor encoder, let's perform some tests to see it working and begin to write code to deal with interruptions. A simple interrupt tester Before going deep inside the whole code project, let's perform some tests to confirm that our encoder assembly is working fine and that we can correctly trigger an interrupt whenever the motor spins and the cardboard slot passes just through the optocoupler. The only thing you have to connect to your Arduino at the moment is the optocoupler; we will now operate our motor by hand and in a later section, we will control its speed from the computer. The test's circuit schematic is as follows: A simple circuit to test the encoder Nothing new in this circuit, it is almost the same as the one used in the optical coin detector, with the only important and necessary difference of connecting the wire coming from the detector side of the optocoupler to pin 2 of our Arduino board, because, as said in the preceding text, the interrupt 0 is available only through that pin. For this first test, we will make the encoder disc spin by hand, which allows us to clearly perceive when the interrupt triggers. For the rest of this example, we will use the LED included with the Arduino board connected to pin 13 as a way to visually indicate that the interrupts have been triggered. Our first interrupt and its ISR Once we have connected the optocoupler to the Arduino and prepared things to trigger some interrupts, let's see the code that we will use to test our assembly. 
The objective of this simple sketch is to toggle the status of an LED every time an interrupt occurs. In the proposed tester circuit, the LED status variable will be changed every time the slot passes through the optocoupler:

    /*
     Chapter 09 - Dealing with interrupts
     A simple tester
     By Francis Perea for Packt Publishing
    */

    // A LED will be used to notify the change
    #define ledPin 13

    // Global variables we will use
    // A variable to be used inside ISR
    volatile int status = LOW;

    // A function to be called when the interrupt occurs
    void revolution(){
      // Invert LED status
      status=!status;
    }

    // Configuration of the board: just one output
    void setup() {
      pinMode(ledPin, OUTPUT);
      // Assign the revolution() function as an ISR of interrupt 0
      // Interrupt will be triggered when the signal goes from
      // LOW to HIGH
      attachInterrupt(0, revolution, RISING);
    }

    // Sketch execution loop
    void loop(){
      // Set LED status
      digitalWrite(ledPin, status);
    }

Let's take a look at its most important aspects.

Apart from the LED pin, we declare a variable to keep track of the changes. It will be updated in the ISR of our interrupt; so, as I told you earlier, we declare it as follows:

    volatile int status = LOW;

After that, we declare the ISR function, revolution(), which, as we already know, neither receives any parameter nor returns any value. And as we said earlier, it must be as simple as possible. In our test case, the ISR simply inverts the value of the global volatile variable, that is, from LOW to HIGH and from HIGH to LOW.
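To get a feel for how the four trigger modes differ, the following desktop C++ sketch counts how often each mode would fire for a sequence of sampled pin levels. It is a simplified model, not Arduino code: the Mode enum and countTriggers() are hypothetical names, and LOW mode is counted once per LOW sample, whereas real hardware keeps re-triggering for as long as the pin stays LOW.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the LOW, CHANGE, RISING and FALLING constants.
enum class Mode { Low, Change, Rising, Falling };

// Count how many times an interrupt in the given mode would fire
// for a sequence of sampled pin levels (false = LOW, true = HIGH).
int countTriggers(const std::vector<bool>& levels, Mode mode) {
    int triggers = 0;
    for (std::size_t i = 0; i < levels.size(); ++i) {
        const bool now = levels[i];
        if (mode == Mode::Low) {
            if (!now) ++triggers;  // simplified: one trigger per LOW sample
            continue;
        }
        if (i == 0) continue;      // edge modes need a previous sample
        const bool before = levels[i - 1];
        const bool rising = !before && now;
        const bool falling = before && !now;
        if ((mode == Mode::Rising && rising) ||
            (mode == Mode::Falling && falling) ||
            (mode == Mode::Change && (rising || falling))) {
            ++triggers;
        }
    }
    return triggers;
}
```

For a signal in which the slot passes the optocoupler twice, the model reports two RISING triggers, two FALLING triggers, and four CHANGE triggers. One pass of the slot is therefore one RISING event but two CHANGE events, which is why the tester attaches its ISR with RISING.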
To allow our ISR to be called whenever interrupt 0 occurs, we call the attachInterrupt() function in the setup() function, passing three parameters to it:

Interrupt: The interrupt number to assign the ISR to
ISR: The name, without the parentheses, of the function that will act as the ISR for this interrupt
Mode: One of the modes explained earlier, which defines exactly when the interrupt will be triggered

In our case, the concrete statement is as follows:

    attachInterrupt(0, revolution, RISING);

This makes the revolution() function the ISR of interrupt 0, which will be triggered when the signal goes from LOW to HIGH.

Finally, in our main loop there is little to do: simply update the LED based on the current value of the status variable, which is updated inside the ISR.

If everything went right, you should see the LED toggle every time the slot passes through the optocoupler, as a consequence of the interrupt being triggered and the revolution() function inverting the value of the status variable that the main loop uses to set the LED.

A dial tachograph

For a more complete example in this section, we will build a tachograph, a device that will present the current revolutions per minute of the motor in a visual manner by using a dial. The motor speed will be commanded serially from our computer by reusing some of the code from our previous projects.

It won't complicate things much if we also include some way to warn about an excessive number of revolutions and even cut the engine in an extreme case to protect it, will it?

The complete schematic of such a big circuit is shown in the following image.
Don't get scared about the number of components as we have already seen them all in action before: The tachograph circuit As you may see, we will use a total of five pins of our Arduino board to sense and command such a set of peripherals: Pin 2: This is the interrupt 0 pin and thus it will be used to connect the output of the optocoupler. Pin 3: It will be used to deal with the servo to move the dial. Pin 4: We will use this pin to activate sound alarm once the engine current has been cut off to prevent overcharge. Pin 6: This pin will be used to deal with the motor transistor that allows us to vary the motor speed based on the commands we receive serially. Remember to use a PWM pin if you choose to use another one. Pin 13: Used to indicate with an LED an excessive number of revolutions per minute prior to cutting the engine off. There are also two more pins which, although not physically connected, will be used, pins 0 and 1, given that we are going to talk to the device serially from the computer. Breadboard connections diagram There are some wires crossed in the previous schematic, and perhaps you can see the connections better in the following breadboard connection image: Breadboard connection diagram for the tachograph The complete tachograph code This is going to be a project full of features and that is why it has such a number of devices to interact with. 
Let's summarize the features of the dial tachograph:

The motor speed is commanded from the computer via serial communication with up to five commands:
Increase motor speed (+)
Decrease motor speed (-)
Totally stop the motor (0)
Put the motor at full throttle (*)
Reset the motor after a stall (R)
Motor revolutions will be detected and counted by using an encoder and an optocoupler
Current revolutions per minute will be visually presented with a dial operated by a servomotor
A high number of revolutions is indicated visually via an LED
In case a maximum number of revolutions is reached, the motor current will be cut off and an acoustic alarm will sound

With such a number of features, it is normal that the code for this project is a bit longer than our previous sketches. Here is the code:

    /*
     Chapter 09 - Dealing with interrupt
     Complete tachograph system
     By Francis Perea for Packt Publishing
    */

    #include <Servo.h>

    // The pins that will be used
    #define ledPin 13
    #define motorPin 6
    #define buzzerPin 4
    #define servoPin 3

    #define NOTE_A4 440
    // Milliseconds between every sample
    #define sampleTime 500
    // Motor speed increment
    #define motorIncrement 10
    // Range of valid RPMs, alarm and stop
    #define minRPM 0
    #define maxRPM 10000
    #define alarmRPM 8000
    #define stopRPM 9000

    // Global variables we will use
    // A variable to be used inside ISR
    volatile unsigned long revolutions = 0;
    // Total number of revolutions in every sample
    long lastSampleRevolutions = 0;
    // A variable to convert revolutions per sample to RPM
    int rpm = 0;
    // LED Status
    int ledStatus = LOW;
    // An instance of the Servo class
    Servo myServo;
    // A flag to know if the motor has been stalled
    boolean motorStalled = false;
    // The current dial angle
    int dialAngle = 0;
    // A variable to store serial data
    int dataReceived;
    // The current motor speed
    int speed = 0;
    // A time variable to compare in every sample
    unsigned long lastCheckTime;

    // A function to be called when the interrupt occurs
    void revolution(){
      // Increment the total number of
      // revolutions in the current sample
      revolutions++;
    }

    // Configuration of the board
    void setup() {
      // Set output pins
      pinMode(motorPin, OUTPUT);
      pinMode(ledPin, OUTPUT);
      pinMode(buzzerPin, OUTPUT);
      // Set revolution() as ISR of interrupt 0
      // RISING counts each pass of the slot once;
      // CHANGE would count both of its edges
      attachInterrupt(0, revolution, RISING);
      // Init serial communication
      Serial.begin(9600);
      // Initialize the servo
      myServo.attach(servoPin);
      // Set the dial
      myServo.write(dialAngle);
      // Initialize the counter for sample time
      lastCheckTime = millis();
    }

    // Sketch execution loop
    void loop(){
      // If we have received serial data
      if (Serial.available()) {
        // read the next char
        dataReceived = Serial.read();
        // Act depending on it
        switch (dataReceived){
          // Increment speed
          case '+':
            if (speed<250) {
              speed += motorIncrement;
            }
            break;
          // Decrement speed
          case '-':
            if (speed>5) {
              speed -= motorIncrement;
            }
            break;
          // Stop motor
          case '0':
            speed = 0;
            break;
          // Full throttle
          case '*':
            speed = 255;
            break;
          // Reactivate motor after stall
          case 'R':
            speed = 0;
            motorStalled = false;
            break;
        }
        // Only if motor is active set new motor speed
        if (motorStalled == false){
          // Set the motor speed
          analogWrite(motorPin, speed);
        }
      }

      // If a sample time has passed
      // We have to take another sample
      if (millis() - lastCheckTime > sampleTime){
        // Copy and reset the counter with interrupts disabled,
        // so the ISR cannot change it halfway through
        noInterrupts();
        // Store current revolutions
        lastSampleRevolutions = revolutions;
        // Reset the global variable
        // So the ISR can begin to count again
        revolutions = 0;
        interrupts();
        // Calculate revolutions per minute
        rpm = lastSampleRevolutions * (1000 / sampleTime) * 60;
        // Update last sample time
        lastCheckTime = millis();
        // Set the dial according to the new reading
        dialAngle = map(rpm,minRPM,maxRPM,180,0);
        myServo.write(dialAngle);
      }

      // If the motor is running in the red zone
      if (rpm > alarmRPM){
        // Turn on LED
        digitalWrite(ledPin, HIGH);
      }
      else{
        // Otherwise turn it off
        digitalWrite(ledPin, LOW);
      }

      // If the motor has exceeded maximum RPM
      if (rpm > stopRPM){
        // Stop the motor
        speed = 0;
        analogWrite(motorPin, speed);
        // Disable it until a 'R' command is received
        motorStalled = true;
        // Make alarm sound
        tone(buzzerPin, NOTE_A4, 1000);
      }

      // Send data back to the computer
      Serial.print("RPM: ");
      Serial.print(rpm);
      Serial.print(" SPEED: ");
      Serial.print(speed);
      Serial.print(" STALL: ");
      Serial.println(motorStalled);
    }

For the first time in this article, I think there is nothing in the code that hasn't already been explained before. I have commented everything so that the code can be easily read and understood.

In general terms, the code declares the constants and global variables that will be used, and the ISR for the interrupt.

In the setup section, all the subsystems that need to be set up before use are initialized: pins, interrupts, serial communication, and the servo.

The main loop begins by looking for serial commands; it updates the speed value and, if the R command is received, clears the stall flag. The motor speed is only applied if the stall flag is not set; the flag is set when the motor exceeds the stopRPM value.

Next in the main loop, the code checks whether a sample time has passed, in which case the revolutions are stored to compute the real revolutions per minute (rpm), and the global revolutions counter that is incremented inside the ISR is set to 0 to begin again. Both steps happen with interrupts briefly disabled, so that the ISR cannot modify the counter in between.

The current rpm value is mapped to an angle to be presented by the dial, and the servo is set accordingly.

Next, two checks are made:

One to see if the motor is getting into the red zone by exceeding the alarmRPM value, in which case the alarm LED is turned on
And another to check if the stopRPM value has been exceeded, in which case the motor is automatically cut off, the motorStalled flag is set to true, and the acoustic alarm is triggered

When the motor has been stalled, it won't accept changes in its speed until it has been reset by issuing an R command via serial communication.
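The two pieces of arithmetic in the sampling step, converting revolutions per sample to RPM and mapping RPM to a dial angle, can be tried out on a desktop with the following C++ sketch. The function names are hypothetical; mapRange() is assumed to follow the formula documented for Arduino's map(), and rpmMultiplyFirst() is an alternative ordering of the same calculation rather than something the sketch itself uses.

```cpp
// Revolutions per minute the way the sketch computes it.
// 1000 / sampleTimeMs is an integer division, so the result is exact
// only when sampleTimeMs divides 1000 evenly (500 does).
long rpmAsInSketch(long revolutions, long sampleTimeMs) {
    return revolutions * (1000 / sampleTimeMs) * 60;
}

// Alternative ordering (an assumption, not in the original sketch):
// multiply first and divide last, so less precision is lost.
long rpmMultiplyFirst(long revolutions, long sampleTimeMs) {
    return revolutions * 60000L / sampleTimeMs;
}

// Linear rescaling in integer math, following the formula
// documented for Arduino's map() function.
long mapRange(long x, long inMin, long inMax, long outMin, long outMax) {
    return (x - inMin) * (outMax - outMin) / (inMax - inMin) + outMin;
}
```

With the sketch's 500 ms sample time, 10 revolutions give 1200 RPM in both variants. With a 300 ms sample time, the sketch's ordering would report 1800 RPM while multiplying first gives 2000 RPM, so the original formula is safest with sample times that divide 1000 evenly. Mapping 0, 5000, and 10000 RPM onto the reversed range 180 to 0 yields dial angles of 180, 90, and 0 degrees.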
As its last action, the code sends some information back to the Serial Monitor as another form of feedback for the operator at the computer.

[Screenshot: Serial Monitor showing the tachograph in action]

Modular development

It has been quite a complex project in that it incorporates up to six different subsystems: optocoupler, motor, LED, buzzer, servo, and serial, but it has also helped us to understand that projects need to be developed by using a modular approach.

We have worked with and tested every one of these subsystems before, and that is the way it should usually be done. By developing your projects in such a modular way, it will be easy to assemble and program the whole system.

Only by working in such a modular way will you be able to connect and understand such a mess of wires.

[Photo: A working desktop may get a bit messy]

Summary

I'm sure you have got the point regarding interrupts with all the things we have seen in this article.

We have learned what an interrupt is and how the CPU attends to it by running an ISR, and we have also learned about the special characteristics and restrictions of ISRs and that we should keep them as short as possible.

On the programming side, the only thing necessary to work with interrupts is to correctly attach the ISR with a call to the attachInterrupt() function.

On the hardware side, we have assembled an encoder and attached it to a spinning motor to count its revolutions.

Finally, we have the code. We have seen a relatively long sketch, which is a sign that we are beginning to master the platform and are able to deal with a larger number of peripherals, and that our projects require more complex software as we deal with these peripherals and accomplish all the other tasks needed to meet the project specifications.
Resources for Article: Further resources on this subject: The Arduino Mobile Robot? [article] Using the Leap Motion Controller with Arduino [article] Android and Udoo Home Automation [article]

How-To Tutorials

Packt

02 Mar 2015

15 min read

A Quick Start Guide to Flume

In this article by Steve Hoffman, the author of the book, Apache Flume: Distributed Log Collection for Hadoop - Second Edition, we will learn about the basics that are required to be known before we start working with Apache Flume. This article will help you get started with Flume. So, let's start with the first step: downloading and configuring Flume. (For more resources related to this topic, see here.) Downloading Flume Let's download Flume from http://flume.apache.org/. Look for the download link in the side navigation. You'll see two compressed .tar archives available along with the checksum and GPG signature files used to verify the archives. Instructions to verify the download are on the website, so I won't cover them here. Checking the checksum file contents against the actual checksum verifies that the download was not corrupted. Checking the signature file validates that all the files you are downloading (including the checksum and signature) came from Apache and not some nefarious location. Do you really need to verify your downloads? In general, it is a good idea and it is recommended by Apache that you do so. If you choose not to, I won't tell. The binary distribution archive has bin in the name, and the source archive is marked with src. The source archive contains just the Flume source code. The binary distribution is much larger because it contains not only the Flume source and the compiled Flume components (jars, javadocs, and so on), but also all the dependent Java libraries. The binary package contains the same Maven POM file as the source archive, so you can always recompile the code even if you start with the binary distribution. Go ahead, download and verify the binary distribution to save us some time in getting started. Flume in Hadoop distributions Flume is available with some Hadoop distributions. 
The distributions supposedly provide bundles of Hadoop's core components and satellite projects (such as Flume) in a way that ensures things such as version compatibility and additional bug fixes are taken into account. These distributions aren't better or worse; they're just different. There are benefits to using a distribution. Someone else has already done the work of pulling together all the version-compatible components. Today, this is less of an issue since the Apache BigTop project started (http://bigtop.apache.org/). Nevertheless, having prebuilt standard OS packages, such as RPMs and DEBs, ease installation as well as provide startup/shutdown scripts. Each distribution has different levels of free and paid options, including paid professional services if you really get into a situation you just can't handle. There are downsides, of course. The version of Flume bundled in a distribution will often lag quite a bit behind the Apache releases. If there is a new or bleeding-edge feature you are interested in using, you'll either be waiting for your distribution's provider to backport it for you, or you'll be stuck patching it yourself. Furthermore, while the distribution providers do a fair amount of testing, such as any general-purpose platform, you will most likely encounter something that their testing didn't cover, in which case, you are still on the hook to come up with a workaround or dive into the code, fix it, and hopefully, submit that patch back to the open source community (where, at a future point, it'll make it into an update of your distribution or the next version). So, things move slower in a Hadoop distribution world. You can see that as good or bad. Usually, large companies don't like the instability of bleeding-edge technology or making changes often, as change can be the most common cause of unplanned outages. 
You'd be hard pressed to find such a company using the bleeding-edge Linux kernel rather than something like Red Hat Enterprise Linux (RHEL), CentOS, Ubuntu LTS, or any of the other distributions whose target is stability and compatibility. If you are a startup building the next Internet fad, you might need that bleeding-edge feature to get a leg up on the established competition. If you are considering a distribution, do the research and see what you are getting (or not getting) with each. Remember that each of these offerings is hoping that you'll eventually want and/or need their Enterprise offering, which usually doesn't come cheap. Do your homework. Here's a short, nondefinitive list of some of the more established players. For more information, refer to the following links: Cloudera: http://cloudera.com/ Hortonworks: http://hortonworks.com/ MapR: http://mapr.com/ An overview of the Flume configuration file Now that we've downloaded Flume, let's spend some time going over how to configure an agent. A Flume agent's default configuration provider uses a simple Java property file of key/value pairs that you pass as an argument to the agent upon startup. As you can configure more than one agent in a single file, you will need to additionally pass an agent identifier (called a name) so that it knows which configurations to use. In my examples where I'm only specifying one agent, I'm going to use the name agent. By default, the configuration property file is monitored for changes every 30 seconds. If a change is detected, Flume will attempt to reconfigure itself. In practice, many of the configuration settings cannot be changed after the agent has started. Save yourself some trouble and pass the undocumented --no-reload-conf argument when starting the agent (except in development situations perhaps). If you use the Cloudera distribution, the passing of this flag is currently not possible. 
I've opened a ticket to fix that at https://issues.cloudera.org/browse/DISTRO-648. If this is important to you, please vote it up.

Each agent is configured, starting with three parameters:

    agent.sources=<list of sources>
    agent.channels=<list of channels>
    agent.sinks=<list of sinks>

Each source, channel, and sink also has a unique name within the context of that agent. For example, if I'm going to transport my Apache access logs, I might define a channel named access. The configurations for this channel would all start with the agent.channels.access prefix. Each configuration item has a type property that tells Flume what kind of source, channel, or sink it is. In this case, we are going to use an in-memory channel whose type is memory. The complete configuration for the channel named access in the agent named agent would be:

    agent.channels.access.type=memory

Any arguments to a source, channel, or sink are added as additional properties using the same prefix. The memory channel has a capacity parameter to indicate the maximum number of Flume events it can hold. Let's say we didn't want to use the default value of 100; our configuration would now look like this:

    agent.channels.access.type=memory
    agent.channels.access.capacity=200

Finally, we need to add the access channel name to the agent.channels property so that the agent knows to load it:

    agent.channels=access

Let's look at a complete example using the canonical "Hello, World!" example.

Starting up with "Hello, World!"

No technical article would be complete without a "Hello, World!" example. Here is the configuration file we'll be using:

    agent.sources=s1
    agent.channels=c1
    agent.sinks=k1

    agent.sources.s1.type=netcat
    agent.sources.s1.channels=c1
    agent.sources.s1.bind=0.0.0.0
    agent.sources.s1.port=12345

    agent.channels.c1.type=memory

    agent.sinks.k1.type=logger
    agent.sinks.k1.channel=c1

Here, I've defined one agent (called agent) that has a source named s1, a channel named c1, and a sink named k1.
The s1 source's type is netcat, which simply opens a socket listening for events (one line of text per event). It requires two parameters: a bind IP and a port number. In this example, we are using 0.0.0.0 for a bind address (the Java convention to specify listen on any address) and port 12345. The source configuration also has a parameter called channels (plural), which is the name of the channel(s) the source will append events to, in this case, c1. It is plural, because you can configure a source to write to more than one channel; we just aren't doing that in this simple example.

The channel named c1 is a memory channel with a default configuration.

The sink named k1 is of the logger type. This is a sink that is mostly used for debugging and testing. It logs all the events it receives from the configured channel, in this case, c1, at the INFO level using Log4j. Here, the channel keyword is singular because a sink can only be fed data from one channel.

Using this configuration, let's run the agent and connect to it using the Linux netcat utility to send an event. First, explode the .tar archive of the binary distribution we downloaded earlier:

    $ tar -zxf apache-flume-1.5.2-bin.tar.gz
    $ cd apache-flume-1.5.2-bin

Next, let's briefly look at the help. Run the flume-ng command with the help command:

    $ ./bin/flume-ng help
    Usage: ./bin/flume-ng <command> [options]...

    commands:
      help                   display this help text
      agent                  run a Flume agent
      avro-client            run an avro Flume client
      version                show Flume version info

    global options:
      --conf,-c <conf>       use configs in <conf> directory
      --classpath,-C <cp>    append to the classpath
      --dryrun,-d            do not actually start Flume, just print the command
      --plugins-path <dirs>  colon-separated list of plugins.d directories. See the
                             plugins.d section in the user guide for more details.
                             Default: $FLUME_HOME/plugins.d
      -Dproperty=value       sets a Java system property value
      -Xproperty=value       sets a Java -X option

    agent options:
      --conf-file,-f <file>  specify a config file (required)
      --name,-n <name>       the name of this agent (required)
      --help,-h              display help text

    avro-client options:
      --rpcProps,-P <file>   RPC client properties file with server connection params
      --host,-H <host>       hostname to which events will be sent
      --port,-p <port>       port of the avro source
      --dirname <dir>        directory to stream to avro source
      --filename,-F <file>   text file to stream to avro source (default: std input)
      --headerFile,-R <file> File containing event headers as key/value pairs on each new line
      --help,-h              display help text

      Either --rpcProps or both --host and --port must be specified.

    Note that if <conf> directory is specified, then it is always included first
    in the classpath.

As you can see, there are two ways in which you can invoke the command (other than the simple help and version commands). We will be using the agent command. The use of avro-client will be covered later.

The agent command has two required parameters: a configuration file to use and the agent name (in case your configuration contains multiple agents).

Let's take our sample configuration and open an editor (vi in my case, but use whatever you like):

    $ vi conf/hw.conf

Next, place the contents of the preceding configuration into the editor, save, and exit back to the shell.

Now you can start the agent:

    $ ./bin/flume-ng agent -n agent -c conf -f conf/hw.conf -Dflume.root.logger=INFO,console

The -Dflume.root.logger property overrides the root logger in conf/log4j.properties to use the console appender. If we didn't override the root logger, everything would still work, but the output would go to the log/flume.log file instead, as specified in the default configuration file. Of course, you can edit the conf/log4j.properties file and change the flume.root.logger property (or anything else you like).
To change just the path or filename, you can set the flume.log.dir and flume.log.file properties in the configuration file or pass additional flags on the command line as follows:

    $ ./bin/flume-ng agent -n agent -c conf -f conf/hw.conf -Dflume.root.logger=INFO,console -Dflume.log.dir=/tmp -Dflume.log.file=flume-agent.log

You might ask why you need to specify the -c parameter, as the -f parameter contains the complete relative path to the configuration. The reason for this is that the Log4j configuration file should be included on the class path. If you left the -c parameter off the command, you'd see this error:

    Warning: No configuration directory set! Use --conf <dir> to override.
    log4j:WARN No appenders could be found for logger (org.apache.flume.lifecycle.LifecycleSupervisor).
    log4j:WARN Please initialize the log4j system properly.
    log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info

But you didn't do that, so you should see these key log lines:

    2014-10-05 15:39:06,109 (conf-file-poller-0) [INFO - org.apache.flume.conf.FlumeConfiguration.validateConfiguration(FlumeConfiguration.java:140)] Post-validation flume configuration contains configuration for agents: [agent]

This line tells you that your agent starts with the name agent. Usually you'd look for this line only to be sure you started the right configuration when you have multiple configurations defined in your configuration file.

    2014-10-05 15:39:06,076 (conf-file-poller-0) [INFO - org.apache.flume.node.PollingPropertiesFileConfigurationProvider$FileWatcherRunnable.run(PollingPropertiesFileConfigurationProvider.java:133)] Reloading configuration file:conf/hw.conf

This is another sanity check to make sure you are loading the correct file, in this case our hw.conf file.
  2014-10-05 15:39:06,221 (conf-file-poller-0) [INFO - org.apache.flume.node.Application.startAllComponents(Application.java:138)] Starting new configuration:{ sourceRunners:{s1=EventDrivenSourceRunner: { source:org.apache.flume.source.NetcatSource{name:s1,state:IDLE} }} sinkRunners:{k1=SinkRunner: { policy:org.apache.flume.sink.DefaultSinkProcessor@442fbe47 counterGroup:{ name:null counters:{} } }} channels:{c1=org.apache.flume.channel.MemoryChannel{name: c1}} }

Once all the configurations have been parsed, you will see this message, which shows you everything that was configured. You can see s1, c1, and k1, and which Java classes are actually doing the work. As you probably guessed, netcat is a convenience for org.apache.flume.source.NetcatSource. We could have used the class name if we wanted. In fact, if I had my own custom source written, I would use its class name for the source's type parameter. You cannot define your own short names without patching the Flume distribution.

  2014-10-05 15:39:06,427 (lifecycleSupervisor-1-0) [INFO - org.apache.flume.source.NetcatSource.start(NetcatSource.java:164)] Created serverSocket:sun.nio.ch.ServerSocketChannelImpl[/0.0.0.0:12345]

Here, we see that our source is now listening on port 12345 for the input. So, let's send some data to it.

Finally, open a second terminal. We'll use the nc command (you can use Telnet or anything else similar) to send the Hello World string and press the Return (Enter) key to mark the end of the event:

  % nc localhost 12345
  Hello World
  OK

The OK message came from the agent after we pressed the Return key, signifying that it accepted the line of text as a single Flume event.
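The nc exchange above can also be scripted, which is handy when you want to send test events repeatedly. The following is only a minimal sketch in Python, not part of Flume: the send_event function name and its parameters are hypothetical, and the default host and port simply mirror the localhost and 12345 values used in this article. It assumes the server replies with one newline-terminated line per event, as the OK above suggests.

```python
import socket


def send_event(line: str, host: str = "localhost", port: int = 12345,
               timeout: float = 5.0) -> str:
    """Send one newline-terminated line and return the server's one-line reply.

    Hypothetical helper: host and port default to the values used in the
    article's example; adjust them to match your own configuration.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The trailing newline plays the role of the Return key:
        # it marks the end of the event.
        sock.sendall(line.encode("utf-8") + b"\n")

        # Read until the reply's newline arrives or the server closes
        # the connection.
        reply = b""
        while not reply.endswith(b"\n"):
            chunk = sock.recv(1024)
            if not chunk:
                break
            reply += chunk
    return reply.decode("utf-8").strip()
```

With the agent from this article running, calling send_event("Hello World") from a Python session should return the same OK acknowledgment that nc displayed.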
If you look at the agent log, you will see the following:

  2014-10-05 15:44:11,215 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 48 65 6C 6C 6F 20 57 6F 72 6C 64 Hello World }

This log message shows you that the Flume event contains no headers (NetcatSource doesn't add any itself). The body is shown in hexadecimal along with a string representation (for us humans to read, in this case, our Hello World message).

If I send the following line and then press the Enter key, you'll get an OK message:

  The quick brown fox jumped over the lazy dog.

You'll see this in the agent's log:

  2014-10-05 15:44:57,232 (SinkRunner-PollingRunner-DefaultSinkProcessor) [INFO - org.apache.flume.sink.LoggerSink.process(LoggerSink.java:70)] Event: { headers:{} body: 54 68 65 20 71 75 69 63 6B 20 62 72 6F 77 6E 20 The quick brown }

The event appears to have been truncated. The logger sink, by design, limits the body content to 16 bytes to keep your screen from being filled with more than what you'd need in a debugging context. If you need to see the full contents for debugging, you should use a different sink, perhaps the file_roll sink, which would write to the local filesystem.

Summary

In this article, we covered how to download the Flume binary distribution. We created a simple configuration file that included one source writing to one channel, feeding one sink. The source listened on a socket for network clients to connect to and to send it event data. These events were written to an in-memory channel and then fed to a Log4j sink to become the output. We then connected to our listening agent using the Linux netcat utility and sent some string events to our Flume agent's source. Finally, we verified that our Log4j-based sink wrote the events out.

Resources for Article:

Further resources on this subject:
About Cassandra [article]
Introducing Kafka [article]
Transformation [article]
