Understanding the ELF specimen

Packt
07 Jan 2016
21 min read
In this article by Ryan O'Neill, author of the book Learning Linux Binary Analysis, we will discuss ELF. In order to reverse-engineer Linux binaries, we must understand the binary format itself. ELF has become the standard binary format for UNIX and UNIX-flavored operating systems. Binary formats such as ELF are not generally a quick study; learning ELF requires applying the different components as you go. Programming tasks such as binary parsing require learning some ELF, and in the act of programming such things you will, in turn, learn ELF more proficiently. ELF is often thought to be a dry and complicated topic, and if one were to simply read through the ELF specs without applying them creatively, then indeed it would be. ELF is really an incredible composition of computer science at work, with program layout, program loading, dynamic linking, symbol table lookups, and many other tightly orchestrated components.

ELF section headers

Now that we've looked at what program headers are, it is time to look at section headers. I want to point out the distinction between the two; I often hear people calling sections "segments" and vice versa. A section is not a segment. Segments are necessary for program execution, and within segments are contained different types of code and data, which are separated into sections. Sections always exist, and they are usually addressable through section headers. Section headers are what make sections accessible, but if the section headers are stripped (missing from the binary), it doesn't mean that the sections are not there. Sections are just data or code, organized across the binary, and they exist within the boundaries of the text and data segments. Each section contains either code or data of some type. The data could range from program data, such as global variables, to dynamic linking information that is necessary for the linker.

As mentioned earlier, every ELF object has sections, but not all ELF objects have section headers. Usually this is because the executable has been tampered with (such as the section headers having been stripped to make debugging harder). All of GNU's binutils, such as objcopy and objdump, and other tools such as gdb, rely on the section headers to locate the symbol information stored in the sections that contain symbol data. Without section headers, tools such as gdb and objdump are nearly useless. Section headers are convenient to have for granular inspection of the parts, or sections, of an ELF object we are viewing. In fact, section headers make reverse engineering a lot easier, since they provide us with the ability to use certain tools that require them. If, for instance, the section header table is stripped, then we can't access a section such as .dynsym, which contains imported and exported symbols describing function names and offsets/addresses.

Even if a section header table has been stripped from an executable, a moderate reverse engineer can actually reconstruct a section header table (and even part of a symbol table) from information in certain program headers, since these will always exist in a program or shared library.
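As a quick illustration of why program headers are always available to lean on, the following minimal sketch (not from the book; the `map` pointer and function name are our own, and it assumes a 32-bit ELF image already read or mmap'd into memory) walks the program header table and reports the PT_LOAD and PT_DYNAMIC segments, which is exactly the information such a reconstruction would start from:

```c
#include <stdio.h>
#include <stdint.h>
#include <elf.h>

/* Sketch: walk the program header table of a 32-bit ELF image that has
 * already been mapped into memory at 'map' (hypothetical pointer). */
void list_segments(uint8_t *map)
{
    Elf32_Ehdr *ehdr = (Elf32_Ehdr *)map;
    Elf32_Phdr *phdr = (Elf32_Phdr *)(map + ehdr->e_phoff);

    for (int i = 0; i < ehdr->e_phnum; i++) {
        if (phdr[i].p_type == PT_LOAD)
            printf("PT_LOAD    segment at vaddr 0x%x, memsz 0x%x\n",
                   phdr[i].p_vaddr, phdr[i].p_memsz);
        else if (phdr[i].p_type == PT_DYNAMIC)
            printf("PT_DYNAMIC segment at vaddr 0x%x, memsz 0x%x\n",
                   phdr[i].p_vaddr, phdr[i].p_memsz);
    }
}
```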
We discussed the dynamic segment earlier and the different DT_TAG entries that contain information about the symbol table and relocation entries. This is what a 32-bit ELF section header looks like:

```c
typedef struct {
    uint32_t   sh_name;      // offset into the shdr string table for the shdr name
    uint32_t   sh_type;      // shdr type, e.g. SHT_PROGBITS
    uint32_t   sh_flags;     // shdr flags, e.g. SHF_WRITE|SHF_ALLOC
    Elf32_Addr sh_addr;      // address where the section begins
    Elf32_Off  sh_offset;    // offset of the section from the beginning of the file
    uint32_t   sh_size;      // size that the section takes up on disk
    uint32_t   sh_link;      // points to another section
    uint32_t   sh_info;      // interpretation depends on section type
    uint32_t   sh_addralign; // alignment for the address of the section
    uint32_t   sh_entsize;   // size of each entry that may be in the section
} Elf32_Shdr;
```

Let's take a look at some of the most important section types, once again leaving room to study the ELF(5) man pages and the official ELF specification for more detailed information about the sections.

.text

The .text section is a code section that contains program code instructions. In an executable program that also has phdrs, this section falls within the range of the text segment. Because it contains program code, it is of the section type SHT_PROGBITS.

.rodata

The .rodata section contains read-only data, such as strings from a line of C code like:

```c
printf("Hello World!\n");
```

Such strings are stored in this section. This section is read-only and therefore must exist in a read-only segment of an executable, so you will find .rodata within the range of the text segment (not the data segment). Because this section is read-only, it is of the type SHT_PROGBITS.

.plt

The procedure linkage table (PLT) contains code that is necessary for the dynamic linker to call functions imported from shared libraries. It resides in the text segment and contains code, so it is marked as type SHT_PROGBITS.

.data

The .data section, not to be confused with the data segment, exists within the data segment and contains data such as initialized global variables. It contains program variable data, so it is marked as SHT_PROGBITS.

.bss

The .bss section contains uninitialized global data as part of the data segment and therefore takes up no space on disk other than 4 bytes, which represents the section itself. The data is initialized to zero at program load time and can be assigned values during program execution. The .bss section is marked as SHT_NOBITS, since it contains no actual data.

.got

The global offset table (GOT) section contains the global offset table. This works together with the PLT to provide access to imported shared library functions, and it is modified by the dynamic linker at runtime. This section has to do with program execution and is therefore marked as SHT_PROGBITS.

.dynsym

The .dynsym section contains dynamic symbol information imported from shared libraries. It is contained within the text segment and is marked as type SHT_DYNSYM.

.dynstr

The .dynstr section contains the string table for dynamic symbols; this holds the name of each symbol in a series of null-terminated strings.

.rel.*

Relocation sections contain information about how the parts of an ELF object or process image need to be fixed up or modified at link time or runtime.

.hash

The .hash section, sometimes called .gnu.hash, contains a hash table for symbol lookup.
The following hash algorithm is used for symbol name lookups in Linux ELF:

```c
uint32_t dl_new_hash(const char *s)
{
    uint32_t h = 5381;

    for (unsigned char c = *s; c != '\0'; c = *++s)
        h = h * 33 + c;

    return h;
}
```

.symtab

The .symtab section contains symbol information of the type ElfN_Sym. The .symtab section is marked as type SHT_SYMTAB, as it contains symbol information.

.strtab

This section contains the symbol string table that is referenced by the st_name entries within the ElfN_Sym structs of .symtab, and it is marked as type SHT_STRTAB since it contains a string table.

.shstrtab

The .shstrtab section contains the section header string table, which is a set of null-terminated strings containing the names of each section, such as .text, .data, and so on. This section is pointed to by the ELF file header entry called e_shstrndx, which holds the index of the .shstrtab section header. This section is marked as SHT_STRTAB since it contains a string table.

.ctors and .dtors

The .ctors (constructors) and .dtors (destructors) sections contain code for initialization and finalization, which is executed before and after the actual main() body of program code. The __constructor__ function attribute is often used by hackers and virus writers to implement a function that performs an anti-debugging trick, such as calling PTRACE_TRACEME, so that the process traces itself and no debuggers can attach themselves to it. This way, the anti-debugging mechanism gets executed before the program enters main().

There are many other section names and types, but we have covered most of the primary ones found in a dynamically linked executable. One can now visualize how an executable is laid out with both phdrs and shdrs.

ELF Relocations

From the ELF(5) man pages:

Relocation is the process of connecting symbolic references with symbolic definitions. Relocatable files must have information that describes how to modify their section contents, thus allowing executable and shared object files to hold the right information for a process's program image. Relocation entries are these data.

The process of relocation relies on symbols, which is why we covered symbols first. An example of relocation might be a couple of relocatable objects (ET_REL) being linked together to create an executable. obj1.o wants to call a function, foo(), located in obj2.o. Both obj1.o and obj2.o are being linked to create a fully working executable; they are currently position-independent code (PIC), but once relocated to form an executable, they will no longer be position independent, since symbolic references will be resolved into symbolic definitions. The term "relocated" means exactly that: a piece of code or data is being relocated from a simple offset in an object file to some memory address location in an executable, and anything that references that relocated code or data must also be adjusted.
Let's take a quick look at a 32-bit relocation entry:

```c
typedef struct {
    Elf32_Addr r_offset;
    uint32_t   r_info;
} Elf32_Rel;
```

And some relocation entries require an addend:

```c
typedef struct {
    Elf32_Addr r_offset;
    uint32_t   r_info;
    int32_t    r_addend;
} Elf32_Rela;
```

Following is the description of the preceding snippet:

r_offset: This points to the location (offset or address) that requires the relocation action (which is going to be some type of modification)
r_info: This gives both the symbol table index with respect to which the relocation must be made, and the type of relocation to apply
r_addend: This specifies a constant addend used to compute the value stored in the relocatable field

Let's take a look at the source code for obj1.o:

```c
_start()
{
  foo();
}
```

We see that it calls the function foo(); however, foo() is not located within the source code or the compiled object file, so a relocation entry will be necessary for the symbolic reference:

```
ryan@alchemy:~$ objdump -d obj1.o
obj1.o:     file format elf32-i386
Disassembly of section .text:
00000000 <func>:
   0:  55                     push   %ebp
   1:  89 e5                  mov    %esp,%ebp
   3:  83 ec 08               sub    $0x8,%esp
   6:  e8 fc ff ff ff         call   7 <func+0x7>
   b:  c9                     leave
   c:  c3                     ret
```

As we can see, the call to foo() simply calls to nowhere; 7 is the offset of the call operand itself. So, when obj1.o, which calls foo() (located in obj2.o), is linked with obj2.o to make an executable, a relocation entry points at offset 7, which is the data that needs to be modified, changing it to the offset of the actual function, foo(), once the linker knows its location in the executable during link time:

```
ryan@alchemy:~$ readelf -r obj1.o
Relocation section '.rel.text' at offset 0x394 contains 1 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000007  00000902 R_386_PC32        00000000   foo
```

As we can see, a relocation field at offset 7 is specified by the relocation entry's r_offset field. R_386_PC32 is the relocation type; to understand all of these types, read the ELF specs, as we will only be covering some. Each relocation type requires a different computation on the relocation target being modified. R_386_PC32 says to modify the target with S + A - P.
The following list explains these terms:

S is the value of the symbol whose index resides in the relocation entry
A is the addend found in the relocation entry
P is the place (section offset or address) of the storage unit being relocated (computed using r_offset)

Let's look at the final output of our executable after compiling obj1.o and obj2.o, as shown in the following code snippet:

```
ryan@alchemy:~$ gcc -nostdlib obj1.o obj2.o -o relocated
ryan@alchemy:~$ objdump -d relocated
test:     file format elf32-i386
Disassembly of section .text:
080480d8 <func>:
 80480d8:  55                     push   %ebp
 80480d9:  89 e5                  mov    %esp,%ebp
 80480db:  83 ec 08               sub    $0x8,%esp
 80480de:  e8 05 00 00 00         call   80480e8 <foo>
 80480e3:  c9                     leave
 80480e4:  c3                     ret
 80480e5:  90                     nop
 80480e6:  90                     nop
 80480e7:  90                     nop
080480e8 <foo>:
 80480e8:  55                     push   %ebp
 80480e9:  89 e5                  mov    %esp,%ebp
 80480eb:  5d                     pop    %ebp
 80480ec:  c3                     ret
```

We can see that the call instruction (the relocation target) at 0x80480de has been modified with the 32-bit offset value of 5, which points to foo(). The value 5 is the result of the R_386_PC32 relocation action, S + A - P:

0x80480e8 + 0xfffffffc - 0x80480df = 5

0xfffffffc is the same as -4 as a signed integer, so the calculation can also be seen as:

0x80480e8 - (0x80480df + sizeof(uint32_t)) = 5

To calculate an offset into a virtual address, use the following computation:

address_of_call + offset + 5 (where 5 is the length of the call instruction)

which in this case is 0x80480de + 5 + 5 = 0x80480e8. An address may also be computed into an offset with the following computation:

address - address_of_call - 4 (where 4 is the length of the call instruction minus 1)

Relocatable code injection based binary patching

Relocatable code injection is a technique that hackers, virus writers, or anyone who wants to modify the code in a binary may utilize as a way to, in effect, re-link a binary after it has already been compiled. That is, you can inject an object file into an executable, update the executable's symbol table, and perform the necessary relocations on the injected object code so that it becomes a part of the executable. A complicated virus might use this rather than just appending code at the end of an executable or finding existing padding. This technique requires extending the text segment to create enough padding room to load the object file. The real trick, though, is handling the relocations and applying them properly.

I designed a custom reverse-engineering tool for ELF named Quenya. Quenya has many features and capabilities, and one of them is the ability to inject object code into an executable. Why do this? Well, one reason would be to inject a malicious function into an executable and then hijack a legitimate function, replacing it with the malicious one. From a security point of view, one could also do hot-patching and apply a legitimate patch to a binary rather than doing something malicious. Let's pretend we are an attacker and we want to infect a program that calls puts() to print "Hello World", and our goal is to hijack puts() so that it calls evil_puts().
First, we would need to write a quick PIC object that can write a string to standard output:

```c
#include <sys/syscall.h>

int _write(int fd, void *buf, int count)
{
  long ret;

  __asm__ __volatile__ ("pushl %%ebx\n\t"
                        "movl %%esi,%%ebx\n\t"
                        "int $0x80\n\t"
                        "popl %%ebx":"=a" (ret)
                        :"0" (SYS_write), "S" ((long) fd),
                         "c" ((long) buf), "d" ((long) count));
  if (ret >= 0) {
    return (int) ret;
  }
  return -1;
}

int evil_puts(void)
{
  _write(1, "HAHA puts() has been hijacked!\n", 31);
}
```

Now, we compile evil_puts.c into evil_puts.o and inject it into our program, hello_world:

```
ryan@alchemy:~/quenya$ ./hello_world
Hello World
```

This program calls the following:

```c
puts("Hello World\n");
```

We now use Quenya to inject and relocate our evil_puts.o file into hello_world:

```
[Quenya v0.1@alchemy] reloc evil_puts.o hello_world
0x08048624  addr: 0x8048612
0x080485c4 _write addr: 0x804861e
0x080485c4  addr: 0x804868f
0x080485c4  addr: 0x80486b7
Injection/Relocation succeeded
```

As we can see, the function _write() from our evil_puts.o has been relocated and assigned the address 0x804861e in the executable, hello_world. The next command, hijack, overwrites the global offset table entry for puts() with the address of evil_puts():

```
[Quenya v0.1@alchemy] hijack binary hello_world evil_puts puts
Attempting to hijack function: puts
Modifying GOT entry for puts
Succesfully hijacked function: puts
Commiting changes into executable file
[Quenya v0.1@alchemy] quit
```

And Whammi!

```
ryan@alchemy:~/quenya$ ./hello_world
HAHA puts() has been hijacked!
```

We have successfully relocated an object file into an executable and modified the executable's control flow so that it executes the code we injected. If we use readelf -s on hello_world, we can now actually see a symbol called evil_puts(). For the reader's interest, I have included a small snippet of code that contains the ELF relocation mechanics in Quenya; it may be a little obscure without knowledge of the rest of the code base, but it is also somewhat straightforward if you've paid attention to what we learned about relocations.
It is just a snippet and does not show other important aspects, such as modifying the executable's symbol table:

```c
case SHT_RELA:
  /* pointer to the first relocation entry in this section */
  rela = (Elf32_Rela *)(obj.mem + obj.shdr[i].sh_offset);
  for (j = 0; j < obj.shdr[i].sh_size / sizeof(Elf32_Rela); j++, rela++) {

      /* symbol table */
      symtab = (Elf32_Sym *)obj.section[obj.shdr[i].sh_link];

      /* symbol we are applying relocation to */
      symbol = &symtab[ELF32_R_SYM(rela->r_info)];

      /* section to modify */
      TargetSection = &obj.shdr[obj.shdr[i].sh_info];
      TargetIndex = obj.shdr[i].sh_info;

      /* target location */
      TargetAddr = TargetSection->sh_addr + rela->r_offset;

      /* pointer to relocation target */
      RelocPtr = (Elf32_Addr *)(obj.section[TargetIndex] + rela->r_offset);

      /* relocation value */
      RelVal = symbol->st_value;
      RelVal += obj.shdr[symbol->st_shndx].sh_addr;

      switch (ELF32_R_TYPE(rela->r_info))
      {
        /* R_386_PC32      2    word32  S + A - P */
        case R_386_PC32:
              *RelocPtr += RelVal;
              *RelocPtr += rela->r_addend;
              *RelocPtr -= TargetAddr;
              break;

        /* R_386_32        1    word32  S + A */
        case R_386_32:
              *RelocPtr += RelVal;
              *RelocPtr += rela->r_addend;
              break;
      }
  }
```

As shown in the preceding code, the relocation target that RelocPtr points to is modified according to the relocation action requested by the relocation type (such as R_386_32). Although relocatable code binary injection is a good example of the idea behind relocations, it is not a perfect example of how a linker actually performs it with multiple object files. Nevertheless, it still retains the general idea and application of a relocation action. Later on, we will talk about shared library (ET_DYN) injection, which brings us to the topic of dynamic linking.

Summary

In this article we discussed the different types of ELF section headers and ELF relocations.
Parallelization using Reducers

Packt
06 Jan 2016
18 min read
In this article by Akhil Wali, the author of the book Mastering Clojure, we will study this particular abstraction of collections and how it is quite orthogonal to viewing collections as sequences. Sequences and laziness are a great way of handling collections. The Clojure standard library provides several functions to handle and manipulate sequences. However, abstracting a collection as a sequence has an unfortunate consequence: any computation that is performed over all the elements of a sequence is inherently sequential. All standard sequence functions create a new collection that is similar to the collection they are passed.

Interestingly, performing a computation over a collection without creating a similar collection, even as an intermediary result, is quite useful. For example, it is often required to reduce a given collection to a single value through a series of transformations in an iterative manner. This sort of computation does not necessarily require the intermediary results of each transformation to be saved.

A consequence of iteratively computing values from a collection is that we cannot parallelize it in a straightforward way. Modern map-reduce frameworks handle this kind of computation by pipelining the elements of a collection through several transformations in parallel and finally reducing the results into a single result. Of course, the result could be a new collection as well. A drawback is that this methodology produces concrete collections as intermediate results of each transformation, which is rather wasteful. For example, if we want to filter values from a collection, a map-reduce strategy would require creating empty collections to represent the values that are left out of the reduction step that produces the final result. This incurs unnecessary memory allocation and also creates additional work for that reduction step. Hence, there is scope for optimizing these kinds of computations.

This brings us to the notion of treating computations over collections as reducers to attain better performance. Of course, this doesn't mean that reducers are a replacement for sequences. Sequences and laziness are great for abstracting computations that create and manipulate collections, while reducers are specialized high-performance abstractions of collections in which a collection needs to be piped through several transformations and combined to produce a final result. Reducers achieve a performance gain in the following ways:

Reducing the amount of memory allocated to produce the desired result
Parallelizing the process of reducing a collection into a single result, which could be an entirely new collection

The clojure.core.reducers namespace provides several functions for processing collections using reducers. Let's now examine how reducers are implemented and also study a few examples that demonstrate how reducers can be used.

Using reduce to transform collections

Sequences and functions that operate on sequences preserve the sequential ordering between the constituent elements of a collection. Lazy sequences avoid unnecessary realization of elements in a collection until they are required for a computation, but the realization of these values is still performed in a sequential manner. However, this characteristic of sequential ordering may not be desirable for all computations performed over a collection.
For example, it's not possible to map a function over a vector and then lazily realize values in the resulting collection by random access, since the map function converts the collection it is supplied into a sequence. Also, functions such as map and filter are lazy, but still sequential by nature.

Consider a unary function, shown in Example 3.1, that we intend to map over a given vector. The function must compute a value from the one it is supplied, and also perform a side effect so that we can observe its application over the elements in a collection:

Example 3.1. A simple unary function

```clojure
(defn square-with-side-effect [x]
  (do
    (println (str "Side-effect: " x))
    (* x x)))
```

The square-with-side-effect function defined here simply returns the square of a number x using the * function. This function also prints the value of x using a println form whenever it is called. Suppose this function is mapped over a given vector. The resulting collection will have to be realized completely if a computation has to be performed over it, even if not all the elements from the resulting vector are required. This can be demonstrated as follows:

```clojure
user> (def mapped (map square-with-side-effect [0 1 2 3 4 5]))
#'user/mapped
user> (reduce + (take 3 mapped))
Side-effect: 0
Side-effect: 1
Side-effect: 2
Side-effect: 3
Side-effect: 4
Side-effect: 5
5
```

As shown previously, the mapped variable contains the result of mapping the square-with-side-effect function over a vector. If we try to sum the first three values in the resulting collection using the reduce, take, and + functions, all the values in the [0 1 2 3 4 5] vector are printed as a side effect. This means that the square-with-side-effect function was applied to all the elements in the initial vector, despite the fact that only the first three elements were actually required by the reduce form. Of course, this could be solved by using the seq function to convert the vector to a sequence before we map the square-with-side-effect function over it, but then we lose the ability to efficiently access elements in a random order in the resulting collection.

To dive deeper into why this actually happens, you first need to understand how the standard map function is actually implemented. A simplified definition of the map function is shown here:

Example 3.2. A simplified definition of the map function

```clojure
(defn map [f coll]
  (cons (f (first coll))
        (lazy-seq (map f (rest coll)))))
```

The definition of map in Example 3.2 is a simplified and rather incomplete one, as it doesn't check for an empty collection and cannot be used over multiple collections. That aside, this definition of map does indeed apply a function f to all the elements in a coll collection. This is implemented using a composition of the cons, first, rest, and lazy-seq forms. The implementation can be interpreted as: "apply the f function to the first element in the coll collection, and then map f over the rest of the collection in a lazy manner." An interesting consequence of this implementation is that the map function has the following characteristics:

The ordering among the elements in the coll collection is preserved.
This computation is performed recursively.
The lazy-seq form is used to perform the computation in a lazy manner.
The use of the first and rest forms indicates that coll must be a sequence, and the cons form will also produce a result that is a sequence.

Hence, the map function accepts a sequence and builds a new one.
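For comparison, here is a sketch of our own (not from the book) of the same simplified map with the missing empty-collection check added; it is still restricted to a single collection:

```clojure
;; A sketch of Example 3.2 with an empty-collection check added.
;; Still single-collection only, unlike the real clojure.core/map.
(defn map [f coll]
  (lazy-seq
    (when-let [s (seq coll)]
      (cons (f (first s))
            (map f (rest s))))))
```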
Another interesting characteristic of lazy sequences is that they are realized in chunks. This means that a lazy sequence is realized in chunks of 32 elements each, as an optimization, when the values in the sequence are actually required. Sequences that behave this way are termed chunked sequences. Of course, not all sequences are chunked, and we can check whether a given sequence is chunked using the chunked-seq? predicate. The range function returns a chunked sequence, shown as follows:

```clojure
user> (first (map #(do (print \!) %) (range 70)))
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
0
user> (nth (map #(do (print \!) %) (range 70)) 32)
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
32
```

Both statements in the output shown previously select a single element from a sequence returned by the map function. The function passed to map in both statements prints the ! character and returns the value supplied to it. In the first statement, the first 32 elements of the resulting sequence are realized, even though only the first element is required. Similarly, the second statement is observed to realize the first 64 elements of the resulting sequence when the element at index 32 is obtained using the nth function. Chunked sequences have been an integral part of Clojure since version 1.1.

However, none of the properties of sequences described above are needed to transform a given collection into a result that is not a sequence. If we want to handle such computations efficiently, we cannot build on functions that return sequences, such as map and filter. Incidentally, the reduce function does not necessarily produce a sequence. It also has a couple of other interesting properties:

The reduce function actually lets the collection it is passed define how it is computed over, or reduced. Thus, reduce is collection independent.
Also, the reduce function is versatile enough to build a single value or an entirely new collection. For example, using reduce with the * or + function will create a single-valued result, while using it with the cons or concat function can create a new collection as the result. Thus, reduce can build anything.

A collection is said to be reducible if it defines how it can be reduced to a single result. The binary function that is used by the reduce function along with a collection is also termed a reducing function. A reducing function requires two arguments: one to represent the result of the computation so far, and another to represent an input value that has to be combined into the result. Several reducing functions can be composed into one, which effectively changes how the reduce function processes a given collection. This composition is done using reducers, which can be thought of as a shortened version of the term "reducing function transformers".

The use of sequences and laziness can be compared to the use of reducers by way of Rich Hickey's well-known pie-maker analogy. Suppose a pie-maker has been supplied a bag of apples with the intent of reducing the apples to a pie. There are a couple of transformations needed to perform this task. First, the stickers on all the apples have to be removed; that is, we map a function to "take the sticker off" over the apples in the collection. Also, all the rotten apples have to be removed, which is analogous to using the filter function to remove elements from a collection. Instead of performing this work herself, the pie-maker delegates it to her assistant.
The assistant could first take the stickers off all the apples, thus producing a new collection, and then take out the rotten apples to produce another new collection, which illustrates the use of lazy sequences. But then the assistant would be doing unnecessary work by removing the stickers from the rotten apples, which will have to be discarded later anyway. On the other hand, the assistant could delay this work until the actual reduction of the processed apples into a pie is performed. Once the work actually needs to be performed, the assistant will compose the two tasks of mapping and filtering the collection of apples, thus avoiding any unnecessary work. This case depicts the use of reducers for composing and transforming the tasks needed to effectively reduce the collection of apples to a pie. By using reducers, we create a recipe of tasks to reduce a collection of apples to a pie and delay all processing until the final reduction, instead of dealing with collections of apples as the intermediary results of each task.

The following namespace must be included in your namespace declaration for the upcoming examples:

```clojure
(ns my-namespace
  (:require [clojure.core.reducers :as r]))
```

The clojure.core.reducers namespace requires Java 6 with the jsr166y.jar, or Java 7+, for fork/join support.

Let's now briefly explore how reducers are actually implemented. Functions that operate on sequences use the clojure.lang.ISeq interface to abstract the behavior of a collection. In the case of reducers, the common interface that we must build upon is that of a reducing function. As we mentioned earlier, a reducing function is a two-arity function in which the first argument is the result produced so far and the second argument is the current input, which has to be combined with the first argument. The process of performing a computation over a collection and producing a result can be generalized into three distinct cases. They can be described as follows:

A new collection with the same number of elements as the collection it is supplied needs to be produced. This one-to-one case is analogous to using the map function.
The computation shrinks the supplied collection by removing elements from it. This can be done using the filter function.
The computation could also be expansive, in which case it produces a new collection that contains an increased number of elements. This is like what the mapcat function does.

These cases depict the different ways by which a collection can be transformed into the desired result. Any computation, or reduction, over a collection can be thought of as an arbitrary sequence of such transformations. These transformations are represented by transformers, which are functions that transform a reducing function. They can be implemented as shown in Example 3.3:

Example 3.3. Transformers

```clojure
(defn mapping [f]
  (fn [rf]
    (fn [result input]
      (rf result (f input)))))

(defn filtering [p?]
  (fn [rf]
    (fn [result input]
      (if (p? input)
        (rf result input)
        result))))

(defn mapcatting [f]
  (fn [rf]
    (fn [result input]
      (reduce rf result (f input)))))
```

The mapping, filtering, and mapcatting functions in Example 3.3 represent the core logic of the map, filter, and mapcat functions respectively. All of these functions are transformers that take a single argument and return a new function.
The returned function transforms a supplied reducing function, represented by rf, and returns a new reducing function, created using this expression: (fn [result input] ...). Functions returned by the mapping, filtering, and mapcatting functions are termed reducing function transformers.

The mapping function applies the f function to the current input, represented by the input variable. The value returned by the f function is then combined with the accumulated result, represented by result, using the reducing function rf. This transformer is a frighteningly pure abstraction of the standard map function that applies an f function over a collection. The mapping function makes no assumptions about the structure of the collection it is supplied, or about how the values returned by the f function are combined to produce the final result.

Similarly, the filtering function uses a predicate, p?, to check whether the current input of the rf reducing function must be combined into the final result, represented by result. If the predicate is not true, then the reducing function will simply return the result value without any modification. The mapcatting function uses the reduce function to combine the value result with the result of the (f input) expression. In this transformer, we can assume that the f function will return a new collection and the rf reducing function will somehow combine two collections into a new one.

One of the foundations of the reducers library is the CollReduce protocol defined in the clojure.core.protocols namespace. This protocol defines the behavior of a collection when it is passed as an argument to the reduce function, and it is declared as shown in Example 3.4:

Example 3.4. The CollReduce protocol

```clojure
(defprotocol CollReduce
  (coll-reduce [coll rf init]))
```

The clojure.core.reducers namespace defines a reducer function that creates a reducible collection by dynamically extending the CollReduce protocol, as shown in Example 3.5:

Example 3.5. The reducer function

```clojure
(defn reducer
  ([coll xf]
   (reify
     CollReduce
     (coll-reduce [_ rf init]
       (coll-reduce coll (xf rf) init)))))
```

The reducer function combines a collection (coll) and a reducing function transformer (xf), which is returned by the mapping, filtering, and mapcatting functions, to produce a new reducible collection. When reduce is invoked on a reducible collection, it will ultimately ask the collection to reduce itself using the reducing function returned by the (xf rf) expression. Using this mechanism, several reducing functions can be composed into a computation that has to be performed over a given collection. Also, the reducer function needs to be defined only once, and the actual implementation of coll-reduce is provided by the collection supplied to the reducer function. Now, we can redefine the reduce function to simply invoke the coll-reduce function implemented by a given collection, as shown in Example 3.6:

Example 3.6. Redefining the reduce function

```clojure
(defn reduce
  ([rf coll]
   (reduce rf (rf) coll))
  ([rf init coll]
   (coll-reduce coll rf init)))
```

As shown in Example 3.6, the reduce function delegates the job of reducing a collection to the collection itself using the coll-reduce function. Also, the reduce function will use the rf reducing function to supply the init argument when it is not specified. An interesting consequence of this definition of reduce is that the rf function must produce an identity value when it is supplied no arguments.
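As a quick illustration of this identity property (our own REPL sketch, not from the book), the + and * functions already behave this way when called with no arguments, which is part of what makes them usable as reducing functions here:

```clojure
user> (+)            ;; identity value of +
0
user> (*)            ;; identity value of *
1
user> (reduce + [])  ;; with no init value supplied, (+) provides the identity 0
0
```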
The standard reduce function also uses the CollReduce protocol to delegate the job of reducing a collection to the collection itself, but it will fall back on the default definition of reduce if the supplied collection does not implement the CollReduce protocol.

Since Clojure 1.4, the reduce function allows a collection to define how it is reduced using the clojure.core.CollReduce protocol. Clojure 1.5 introduced the clojure.core.reducers namespace, which extends the use of this protocol. All the standard Clojure collections, namely lists, vectors, sets, and maps, implement the CollReduce protocol.

The reducer function can be used to build a sequence of transformations to be applied to a collection when it is passed as an argument to the reduce function. This can be demonstrated as follows:

```clojure
user> (r/reduce + 0 (r/reducer [1 2 3 4] (mapping inc)))
14
user> (reduce + 0 (r/reducer [1 2 3 4] (mapping inc)))
14
```

In this output, the mapping function is used with the inc function to create a reducing function transformer that increments all the elements in a given collection. This transformer is then combined with a vector using the reducer function to produce a reducible collection. The call to reduce in both of the above statements is transformed into the (reduce + [2 3 4 5]) expression, thus producing the result 14. We can now redefine the map, filter, and mapcat functions using the reducer function, as shown below in Example 3.7:

Example 3.7. Redefining the map, filter and mapcat functions using the reducer form

```clojure
(defn map [f coll]
  (reducer coll (mapping f)))

(defn filter [p? coll]
  (reducer coll (filtering p?)))

(defn mapcat [f coll]
  (reducer coll (mapcatting f)))
```

As shown in Example 3.7, the map, filter, and mapcat functions are now simply compositions of the reducer form with the mapping, filtering, and mapcatting transformers respectively.

The definitions of CollReduce, reducer, reduce, map, filter, and mapcat are simplified versions of their actual definitions in the clojure.core.reducers namespace.

The definitions of the map, filter, and mapcat functions shown in Example 3.7 have the same shape as the standard versions of these functions, shown as follows:

```clojure
user> (r/reduce + (r/map inc [1 2 3 4]))
14
user> (r/reduce + (r/filter even? [1 2 3 4]))
6
user> (r/reduce + (r/mapcat range [1 2 3 4]))
10
```

Hence, the map, filter, and mapcat functions from the clojure.core.reducers namespace can be used in the same way as the standard versions of these functions. The reducers library also provides a take function that can be used as a replacement for the standard take function. We can use this function to reduce the number of calls to the square-with-side-effect function (from Example 3.1) when it is mapped over a given vector, as shown below:

```clojure
user> (def mapped (r/map square-with-side-effect [0 1 2 3 4 5]))
#'user/mapped
user> (reduce + (r/take 3 mapped))
Side-effect: 0
Side-effect: 1
Side-effect: 2
Side-effect: 3
5
```

Thus, using the map and take functions from the clojure.core.reducers namespace as shown above avoids the application of the square-with-side-effect function over all six elements in the [0 1 2 3 4 5] vector, as only the first three are required. The reducers library also provides reducer-based variants of the standard take-while, drop, flatten, and remove functions. Effectively, functions based on reducers require fewer allocations than sequence-based functions, thus leading to an improvement in performance.
For example, consider the process and process-with-reducer functions shown here:

Example 3.8. Functions to process a collection of numbers using sequences and reducers

```clojure
(defn process [nums]
  (reduce + (map inc (map inc (map inc nums)))))

(defn process-with-reducer [nums]
  (reduce + (r/map inc (r/map inc (r/map inc nums)))))
```

The process function in Example 3.8 applies the inc function over a collection of numbers represented by nums using the map function. The process-with-reducer function performs the same action, but uses the reducer variant of the map function. The process-with-reducer function will take less time to produce its result from a large vector than the process function, as shown here:

```clojure
user> (def nums (vec (range 1000000)))
#'user/nums
user> (time (process nums))
"Elapsed time: 471.217086 msecs"
500002500000
user> (time (process-with-reducer nums))
"Elapsed time: 356.767024 msecs"
500002500000
```

The process-with-reducer function gets a slight performance boost, as it requires fewer memory allocations than the process function. The performance of this computation could be improved by a greater scale if we could somehow parallelize it.

Summary

In this article, we explored the clojure.core.reducers library in detail. We took a look at how reducers are implemented and also how we can use reducers to handle large collections of data in an efficient manner.
Building a Command-line Tool

Packt
06 Jan 2016
15 min read
In this article by Eduardo Díaz, author of the book Clojure for Java Developers, we will learn how to build a command-line tool in Clojure. We'll also look at some new features in the Clojure world: we'll be discussing core.async and transducers, which are new ways to write asynchronous programs. core.async is a very exciting way to create thousands of light threads, along with the capability to manage them. Transducers are a way to separate a computation from the source of the data; you can use transducers with the data flowing between the light threads, or use them with a vector of data.

The requirements

First, let's take the time to understand the requirements fully. Let's try to summarize our requirement in a single statement:

We need to know all the places in a set of pages that use the same CSS selector.

This seems to be a well-defined problem, but let's not forget to specify some things in order to make the best possible decisions:

We want to be able to use more than one browser.
We want to look for the same CSS selector in almost all the pages.
It would be better to have an image of where the elements are used, instead of a text report.
We want it to be a command-line app.
We need to count the number of elements that a CSS selector can match in a single page and in all of them; this might help us get a sense of the classes we can change freely and the ones we should never touch.
It should be written in Clojure!

How can we solve these requirements? Java and Clojure already have a wide variety of libraries that we can use. Let's have a look at a couple of them that we can use in this example.

Automating the browser

The biggest issue seems to be finding a simple way to drive several browsers. Even automating a single browser sounds like a complex task; how could we automate different browsers? You have probably heard about Selenium, a library that enables us to automate a browser. It is normally used for testing, but it also lets us take screenshots, look up certain elements, and run custom JavaScript in the browser, and its architecture allows it to run on different browsers. It does seem like a great fit.

In the modern world, you can use Selenium from almost any language you want; however, it is written in Java, and you can expect first-class support if you are running in the JVM. We are using Clojure, and we can expect even better integration with the Clojure language; for this particular project we will rely on clj-webdriver (https://github.com/semperos/clj-webdriver). It is an open source project that features an idiomatic Clojure API for Selenium, called Taxi. You can find the documentation for Taxi at https://github.com/semperos/clj-webdriver/wiki/Introduction%3A-Taxi

Parsing the command-line parameters

We want to build a command-line app, and if we want to do it in the best possible way, it is important to think of our users. Command-line users are comfortable with using their apps in a standard way. One of the most used and best-known standards for passing arguments to command-line apps is the GNU standard. There are libraries in most languages that help you parse command-line arguments, and Clojure is no exception. Let's use the tools.cli library (https://github.com/clojure/tools.cli).

The tools.cli library is a parser for command-line arguments; it adheres to the GNU standard and makes it very easy to create a command-line interface that is safe and familiar to use.
Some very interesting features that tools.cli gives you are:

Instant creation of a help message that shows each option available
Custom parsing of options (you can give your own function to the parser, so you'll get the exact value you want)
Custom validation of options (you can give a function to the parser so it validates what you are passing to the command line)
It works with ClojureScript, so you can write your own Node.js tools with ClojureScript and use tools.cli to parse the arguments!

The README file on GitHub is very helpful and is kept up to date with the latest version, so I recommend that you have a look to understand all the possibilities of this awesome library. We now have everything we need to build our app, so let's write it.

Getting to know the check-css project

This project has the following three main namespaces:

core
browser
page

Let's check the responsibilities of each of these namespaces.

The check-css.browser namespace

```clojure
(ns check-css.browser
  (:require [clj-webdriver.taxi :as taxi]
            [clojure.string :as s]
            [clojure.java.io :as io]))

(defn exec-site-fn [urls f & {:keys [driver]
                              :or {driver {:browser :chrome}}}]
  (taxi/with-driver driver
    (doseq [url urls]
      (taxi/to url)
      (f url))))
```

This is very simple code; it includes the exec-site-fn function, which receives a list of urls and an optional driver configuration. If you don't specify a driver, it will be Chrome by default. Taxi includes a with-driver macro, which allows you to execute a procedure with a single browser in a sequential manner. We get the following benefits from this:

We only need the resources (memory, CPU) for one browser
We don't need to coordinate parallel execution
We don't need to think about closing the resources (in this case, the browser) correctly

So this function just executes something for some urls using a single browser; we can think of it as just a helper function.
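As a hypothetical usage sketch (not part of the project's code; the URLs and the printing function are placeholders), a call to this helper might look like the following, visiting two pages in a single Firefox session:

```clojure
;; Hypothetical usage of exec-site-fn: visit two pages with one browser
;; and simply print each url as it is processed.
(require '[check-css.browser :as b])

(b/exec-site-fn ["http://www.google.com" "http://www.facebook.com"]
                (fn [url] (println "Visited" url))
                :driver {:browser :firefox})
```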
The check-css.core namespace (ns check-css.core …) (def cli-options   [["-s" "--selector SELECTOR" "CSS Selector"]    ["-p" "--path PATH" "The base folder for images"]    ["-b" "--browser BROWSER" "Browser"     :default :chrome     :parse-fn keyword]    ["-h" "--help"]]) (defn find-css-usages [browser selector output-path urls]   (let [js-src (-> (io/resource "script.js") slurp)         apply-script-fn (partial p/execute-script-fn                                  output-path                                  js-src                                  selector)]     (doseq [url urls]       (b/exec-site-fn urls apply-script-fn                       :driver {:browser browser})))) (defn -main [& args]   (let [{:keys [options arguments summary]} (parse-opts args cli-options)         {:keys [browser selector path help]} options         urls arguments]     (if-not help       (find-css-usages browser selector path urls)       (exit 0 summary)))) This code looks very simple; here we can see the usage of tools.cli and the function that takes everything together, find-css-usages. This function: It reads the JavaScript file from the classpath It creates a function that only receives the url, so it is compatible with the function f in exec-site-fn. This is all that is needed to execute our program. Now we can do the following from the command line: # lein uberjar # java -jar target/uberjar/check-css-0.1.0-SNAPSHOT-standalone.jar -p . -s "input" -b chrome http://www.google.com http://www.facebook.com It creates a couple of screenshots of Google and Facebook, pointing out the elements that are inputs. Granted, we can do something more interesting with our app, but for now, let's focus on the code. There are a couple of things we want to do to this code. The first thing is that we want to have some sort of statistical record of how many elements were found, not just the screenshots. The second important thing has to do with an opportunity to learn about core.async and what's coming up next in the Clojure world. Core.async Core.async is yet another way of programming concurrently, it uses the idea of lightweight threads and channels it to communicate between them. Why lightweight threads? The lightweight threads are used in languages like go and erlang. They pride in being able to run thousands of threads in a single process. What is the difference between the lightweight threads and traditional threads? The traditional threads need to reserve memory and this also takes some time. If you want to create a couple thousand threads, you will be using a noticeable amount of memory for each thread and asking the kernel to do that also takes time. What difference do lightweight threads make? To have a couple hundred lightweight threads you only need to create a couple of threads, there is no need to reserve memory. The lightweight threads are merely a software idea. This can be achieved with most languages and Clojure adds first class support (without changing the language, this is part of the lisp power) using core.async! Let's have a look of how it works. There are two concepts that you need to keep in mind: Goblocks: They are the lightweight threads. Channels: The channels are a way to communicate between various goblocks, you can think of them as queues. Goblocks can publish a message to the channel and other goblocks can take a message from them. 
Just as there are integration patterns for queues, there are integration patterns for channels, where you will find concepts similar to broadcasting, filtering, and mapping. Now, let's play a little with each of them so you can understand how to use them in our program.

Goblocks

You will find goblocks in the clojure.core.async namespace. Goblocks are extremely easy to use; you need the go macro, and you will do something similar to this:

```clojure
(ns test
  (:require [clojure.core.async :refer [go]]))

(go
  (println "Running in a goblock!"))
```

They are similar to threads; you just need to remember that you can create goblocks freely. There can be thousands of running goblocks in a single JVM.

Channels

You can actually use anything you like to communicate between goblocks, but it is recommended that you use channels. Channels have two main operations, namely putting and getting. Let's see how to do it:

```clojure
(ns test
  (:require [clojure.core.async :refer [go chan >! <!]]))

(let [c (chan)]
  (go (println (str "The data in the channel is " (<! c))))
  (go (>! c 6)))
```

That's it! It looks pretty simple. As you can see, there are three main functions that we are using with channels:

chan: This function creates a channel; channels can store messages in a buffer, and if you want that functionality you just pass the size of the buffer to the chan function. If no size is specified, the channel can store only one message.
>!: The put function must be used within a goblock; it receives a channel and the value you want to publish to it. This function is blocking; if a channel's buffer is already full, it will park until something is consumed from the channel.
<!: The take function must be used within a goblock; it receives the channel you are taking from. It is blocking; if nothing has been published to the channel, it will park until there's data available.

There are lots of other functions that you can use with channels; for now, let's add two related functions that you will probably use soon:

>!!: The blocking put. It works exactly the same as the put function, except it can be used from anywhere. Remember, if a channel cannot take more data, this function will block the entire thread from which it runs.
<!!: The blocking take. It works exactly like the take function, except you can use it from anywhere, not just inside goblocks. Just keep in mind that this blocks the thread where it runs until there's data available.

If you look into the core.async API docs (http://clojure.github.io/core.async/), you will find a fair amount of functions. Some of them give you functionality similar to queues; let's look at the broadcast function:

```clojure
(ns test
  (:require [clojure.core.async.lab :refer [broadcast]]
            [clojure.core.async :refer [chan <! >!! go-loop]]))

(let [c1 (chan 5)
      c2 (chan 5)
      bc (broadcast c1 c2)]
  (go-loop []
    (println "Getting from the first channel" (<! c1))
    (recur))
  (go-loop []
    (println "Getting from the second channel" (<! c2))
    (recur))
  (>!! bc 5)
  (>!! bc 9))
```

With this, you can now publish to several channels at the same time; this is helpful for subscribing multiple processes to a single source of events, with a great amount of separation of concerns. If you take a good look, you will also find familiar functions there: map, filter, and reduce. Depending on the version of core.async, some of these functions may no longer be there. Why are these functions there?
Those functions are for modifying collections of data, right? The reason is that there has been a good amount of effort towards using channels as higher-level abstractions. The idea is to see channels as collections of events; if you think of them that way, it's easy to see that you can create a new channel by mapping every element of an old channel, or you can create a new channel by filtering away some elements. In recent versions of Clojure, the abstraction has become even more noticeable with transducers. Transducers Transducers are a way to separate computations from their input source; simply put, they are a way to apply a sequence of steps to a sequence or a channel. Let's look at an example for a sequence:
(let [odd-counts (comp (map count)
                       (filter odd?))
      vs [[1 2 3 4 5 6]
          [:a :c :d :e]
          [:test]]]
  (sequence odd-counts vs))
comp feels similar to the threading macros; it composes functions and stores the steps of the computation. The interesting part is that we can use this same odd-counts transformation with a channel, as shown:
(let [odd-counts (comp (map count)
                       (filter odd?))
      ;; the transducer is attached to the channel itself
      output (chan 5 odd-counts)]
  (go-loop []
    (let [x (<! output)]
      (println x))
    (recur))
  ;; we put the raw vectors into the channel and take the transformed values out
  (>!! output [1 2 3 4 5 6])
  (>!! output [:a :c :d :e])
  (>!! output [:test]))
This is quite interesting, and now you can use this to understand how to improve the code of the check-css program. The main thing we'll gain (besides learning core.async and how to use transducers) is visibility and separation of concerns; if we have channels publishing events, then it becomes extremely simple to add new subscribers without changing anything. Summary In this article we have learned how to build a command line tool, what requirements we needed to build the tool, and how we can use this in different projects. Resources for Article: Further resources on this subject: Big Data [article] Implementing a Reusable Mini-firewall Using Transducers [article] Developing a JavaFX Application for iOS [article]

Raster Calculations

Packt
06 Jan 2016
8 min read
In this article by Alexander Bruy, author of the book, QGIS 2 Cookbook, we will see some of the most common operations related to Digital Elevation Models (DEM). (For more resources related to this topic, see here.) Calculating a hillshade layer A hillshade layer is commonly used to enhance the appearance of a map, and it can be computed from a DEM. This recipe shows how to compute it. Getting ready Open the dem_to_prepare.tif layer. This layer contains a DEM in EPSG:4326 CRS and elevation data in feet. These characteristics are unsuitable to runmost terrain analysis algorithms, so we will modify this layer to get a suitable algorithm. How to do it... In the Processing toolbox, find the Hillshade algorithm and double-click on it to open it, as shown in the following screenshot: Select the DEM in the Input layer field. Leave the rest of the parameters with their default values. Click on Run to run the algorithm. The hillshade layer will be added to the QGIS project, as shown in the following screenshot: How it works... As in the case of the slope, the algorithm is part of the GDAL library. You will see that the parameters are quite similar to the slope case. This is because the slope is used to compute the hillshade layer. Based on the slope and the aspect of the terrain in each cell, and using the position of the sun defined by the Azimuth and Altitude fields, the algorithm computes the illumination that the cell will receive. You can try changing the values of these parameters to alter the appearance of the layer. There's more... As in the case of slope, there are alternative options to compute the hillshade. The SAGA one in the Processing toolbox has a feature that is worth mentioning. The SAGA hillshade algorithm contains a field named method. This field is used to select the method used to compute the hillshade value, and the last method available. Raytracing, differs from the other ones. In that it models the real behavior of light, making an analysis that is not local but that uses the full information of the DEM instead. This renders more precise hillshade layers, but the processing time can be notably larger. Enhancing your map view with a hillshade layer You can combine the hillshade layer with your other layers to enhance their appearance. Since you have used a DEM to compute the hillshade layer, it should be already in your QGIS project along with the hillshade itself. However, it will be covered by it since the new layers are produced by the processing. Move it to the top of the layer list, so you can see the DEM (and not the hillshade layer) and style it to something like the following screenshot: In the Properties dialog of the layer, move to the Transparency section and set the Global transparency value to 50 %, as shown in the following screenshot: Now you should see the hillshade layer through the DEM, and the combination of both of them will look like the following screenshot: Analyzing hydrology A common analysis from a DEM is to compute hydrological elements, such as the channel network or the set of watersheds. This recipe shows the steps to follow to do it. Getting ready Open the DEM that we prepared in the previous recipe. How to do it... In the Processing toolbox, find the Fill sinks algorithm and double-click on it to open that: Select the DEM in the DEM field and run the algorithm. It will generate a new filtered DEM layer. From now on, we will just use this DEM in the recipe, but not the original one. 
Open the Catchment Area in and select the filtered DEM in the Elevation field: Run the algorithm. It will generate a Catchment Area layer. Open the Channel network algorithm and fill it, as shown in the following screenshot: Run the algorithm. It will extract the channel network from the DEM based on Catchment Area and generate it as both raster and vector layer. Open the Watershed basins algorithm and fill it, as shown in the following screenshot: Run the algorithm. It will generate a raster layer with the watersheds calculated from the DEM and the channel network. Each watershed is a hydrological unit that represents the area that flows into a junction defined by the channel network: How it works... Starting from the DEM, the preceding described steps follow a typical workflow for hydrological analysis: First, the sinks are removed. This is a required preparation whenever you plan to do hydrological analysis. The DEM might contain sinks where a flow direction cannot be computed, which represents a problem in order to model the movement of water across those cells. Removing them solves this problem. The catchment area is computed from the DEM. The values in the catchment area layer represent the area upstream of each cell. That is, the total area in which, if water is dropped, it will eventually reach the cell. Cells with high values of catchment area will likely contain a river, whereas cell with lower values will have the overland flow. By setting a threshold on the catchment area values, we can separate the river cells (those above the threshold) from the remaining ones and extract the channel network. Finally, we compute the watersheds associated with each junction in the channel network extracted in the last step. There's more... The key parameter in the preceding workflow is the catchment area threshold. If a larger threshold is used, fewer cells will be considered as river cells, and the resulting channel network will be sparser. Because the watersheds are computed based on the channel network, it will result in a lower number of watersheds. You can try yourself with different values of the catchment area threshold. Here, you can see the result for threshold equal to 10,00,000 and 5,00,00,000. The following screenshot shows the result of threshold equal to 10,00,000: The following screenshot shows the result of threshold equal to 5,00,00,000: Note that in the previous case, with a higher threshold value, there is only one single watershed in the resulting layer. The threshold values are expressed in the units of the catchment area, which, because the cell size is assumed to be in meters, are in square meters. Calculating a topographic index Because the topography defines and influences most of the processes that take place in a given terrain, the DEM can be used to extract many different parameters that give us information about those processes. This recipe shows to calculate a popular one named the Topographic Wetness Index, which estimates the soil wetness based on the topography. Getting ready Open the DEM that we prepared in the Calculating a hillshade layer recipe. How to do it... Calculate a slope layer using the Slope, Aspect, and Curvature algorithm from the Processing toolbox. Calculate a catchment area layer using the Catchment Area algorithm from the Processing toolbox. Note that you must use a sinkless DEM, as the DEM that we generated in the previous recipe with the Fill sinks algorithm. 
Open the Topographic Wetness Index algorithm from the Processing toolbox and fill it, as shown in the following screenshot: Run the algorithm. It will create a layer with the Topographic Wetness Index field indicating the soil wetness in each cell: How it works... The index combines slope and catchment area, two parameters that influence the soil wetness. If the catchment area value is high, it means more water will flow into the cell thus increasing its soil wetness. A low value of slope will have a similar effect because the water that flows into the cell will not flow out of it quickly. The algorithm expects the slope to be expressed in radians. That's the reason why the Slope, Aspect, and Curvature algorithm has to be used because it produces its slope output in radians. The Slope algorithm that you will also find, which is based on the GDAL library, creates a slope layer with values expressed in degrees. You can use that layer if you convert its units by using the raster calculator. There's more... Other indices based on the same input layers can be found in different algorithm in the Processing toolbox. The Stream Power Index and the LS factor fields use the slope and catchment area as inputs as well and can be related to potential erosion. Summary In this article, we saw the working of the hillshade layer and a topographic index along with their calculation technique. We also saw how to analyze hydrology. Resources for Article: Further resources on this subject: Identifying the Best Places [article] Style Management in QGIS [article] Geolocating photos on the map [article]

Forensics Recovery

Packt
05 Jan 2016
6 min read
In this article by Bhanu Birani and Mayank Birani, the authors of the book, IOS Forensics Cookbook, we have discussed Forensics recovery; also, how it is important, when in some investigation cases there is a need of decrypting the information from the iOS devices. These devices are in an encrypted form usually. In this article, we will focus on various tools and scripts, which can be used to read the data from the devices under investigation. We are going to cover the following topics: DFU and Recovery mode Extracting iTunes backup (For more resources related to this topic, see here.) DFU and Recovery Mode In this section we'll cover both the DFU mode and the Recovery mode separately. DFU mode In this section, we will see how to launch the DFU mode, but before that we see what DFU means. DFU stands for Device Firmware Upgrade, which means this mode is used specifically while iOS upgrades. This is a mode where device can be connected with iTunes and still do not load iBoot boot loader. Your device screen will be completely black in DFU mode because neither the boot loader nor the operating system is loaded. DFU bypasses the iBoot so that you can downgrade your device. How to do it... We need to follow these steps in order to launch a device in DFU mode: Turn off your device. Connect your device to the computer. Press your Home button and the Power button, together, for 10 seconds. Now, release the Power button and keep holding the Home button till your computer detects the device that is connected. After sometime, iTunes should detect your device. Make sure that your phone does not show any Restore logo on the device, if it does, then you are in Recovery mode, not in DFU. Once your DFU operations are done, you can hold the Power and Home buttons till you see the Apple logo in order to return to the normal functioning device. This is the easiest way to recover a device from a faulty backup file. Recovery mode In this section, you will learn about the Recovery mode of our iOS devices. To dive deep into the Recovery mode, we fist need to understand a few basics such as which boot loader is been used by iOS devices, how the boot takes place, and so on. We will explore all such concepts in order to simplify the understanding of the Recovery mode. All iOS devices use the iBoot boot loader in order to load the operating systems. The iBoot's state, which is used for recovery and restore purposes, is called Recovery mode. iOS cannot be downgraded in this state as the iBoot is loaded. iBoot also prevents any other custom firmware to flash into device unless it is a jailbreak, that is, "pwned". How to do it... The following are the detailed steps to launch the Recovery mode on any iOS device: You need to turn off your iOS device in order to launch the Recovery mode. Disconnect all the cables from the device and remove it from the dock if it is connected. Now, while holding the Home button, connect your iOS device to the computer using the cable. Hold the Home button till you see the Connect to iTunes screen. Once you see the screen, you have entered the Recovery mode. Now you will receive a popup in your Mac saying "iTunes has detected your iDevice in recovery mode". Now you can use iTunes to restore the device in the Recovery mode. Make sure your data is backed up because the recovery will restore the device to Factory Settings. You can later restore from the backup as well. Once your Recovery mode operations are complete, you will need to escape from the Recovery mode. 
To escape, just press the power button and the home button concurrently for 10-12 seconds. Extracting iTunes backup Extracting the logical information from the iTunes backup is crucial for forensics investigation. There is a full stack of tools available for extracting data from the iTunes backup. They come in a wide variety, distributed from open source to paid tools. Some of these forensic tools are Oxygen Forensics Suite, Access Data MPE+, EnCase, iBackup Bot, DiskAid, and so on. The famous open source tools are iPhone backup analyzer and iPhone analyzer. In this section, we are going to learn how to use the iPhone backup extractor tools. How to do it... The iPhone backup extractor is an open source forensic tool, which can extract information from device backups. However, there is one constraint that the backup should be created from iTunes 10 onwards. Follow these steps to extract data from iTunes backup: Download the iPhone backup extractor from http://supercrazyawesome.com/. Make sure that all your iTunes backup is located at this directory: ~/Library/ApplicationSupports/MobileSync/Backup. In case you don't have the required backup at this location, you can also copy paste it. The application will prompt after it is launched. The prompt should look similar to the following screenshot: Now tap on the Read Backups button to read the backup available at ~/Library/ApplicationSupports/MobileSync/Backup. Now, you can choose any option as shown here: This tool also allows you to extract data for an individual application and enables you to read the iOS file system backup. Now, you can select the file you want to extract. Once the file is selected, click on Extract. You will be get a popup asking for the destination directory. This complete process should look similar to the following screenshot: There are various other tools similar to this; iPhone Backup Browser is one of them, where you can view your decrypted data stored in your backup files. This tool supports only Windows operating system as of now. You can download this software from http://code.google.com/p/iphonebackupbrowser/. Summary In this article, we covered how to launch the DFU and the DFU and the Recovery modes. We also learned to extract the logical information from the iTunes backup using the iPhone backup extractor tool. Resources for Article: Further resources on this subject: Signing up to be an iOS developer [article] Exploring Swift [article] Introduction to GameMaker: Studio [article]

NSX Core Components

Packt
05 Jan 2016
16 min read
In this article by Ranjit Singh Thakurratan, the author of the book, Learning VMware NSX, we have discussed some of the core components of NSX. The article begins with a brief introduction of the NSX core components followed by a detailed discussion of these core components. We will go over three different control planes and see how each of the NSX core components fit in this architecture. Next, we will cover the VXLAN architecture and the transport zones that allow us to create and extend overlay networks across multiple clusters. We will also look at NSX Edge and the distributed firewall in greater detail and take a look at the newest NSX feature of multi-vCenter or cross-vCenterNSX deployment. By the end of this article, you will have a thorough understanding of the NSX core components and also their functional inter-dependencies. In this article, we will cover the following topics: An introduction to the NSX core components NSX Manager NSX Controller clusters VXLAN architecture overview Transport zones NSX Edge Distributed firewall Cross-vCenterNSX (For more resources related to this topic, see here.) An introduction to the NSX core components The foundational core components of NSX are divided across three different planes. The core components of a NSX deployment consist of a NSX Manager, Controller clusters, and hypervisor kernel modules. Each of these are crucial for your NSX deployment; however, they are decoupled to a certain extent to allow resiliency during the failure of multiple components. For example if your controller clusters fail, your virtual machines will still be able to communicate with each other without any network disruption. You have to ensure that the NSX components are always deployed in a clustered environment so that they are protected by vSphere HA. The high-level architecture of NSX primarily describes three different planes wherein each of the core components fit in. They are the Management plane, the Control plane, and the Data plane. The following figure represents how the three planes are interlinked with each other. The management plane is how an end user interacts with NSX as a centralized access point, while the data plane consists of north-south or east-west traffic. Let's look at some of the important components in the preceding figure: Management plane: The management plane primarily consists of NSX Manager. NSX Manager is a centralized network management component and primarily allows a single management point. It also provides the REST API that a user can use to perform all the NSX functions and actions. During the deployment phase, the management plane is established when the NSX appliance is deployed and configured. This management plane directly interacts with the control plane and also with the data plane. The NSX Manager is then managed via the vSphere web client and CLI. The NSX Manager is configured to interact with vSphere and ESXi, and once configured, all of the NSX components are then configured and managed via the vSphere web GUI. Control plane: The control plane consists of the NSX Controller that manages the state of virtual networks. NSX Controllers also enable overlay networks (VXLAN) that are multicast-free and make it easier to create new VXLAN networks without having to enable multicast functionality on physical switches. The controllers also keep track of all the information about the virtual machines, hosts, and VXLAN networks and can perform ARP suppression as well. 
No data passes through the control plane, and a loss of controllers does not affect network functionality between virtual machines. Overlay networks and VXLANs can be used interchangeably. They both represent L2 over L3 virtual networks. Data plane: The NSX data plane primarily consists of NSX logical switch. The NSX logical switch is a part of the vSphere distributed switch and is created when a VXLAN network is created. The logical switch and other NSX services such as logical routing and logical firewall are enabled at the hypervisor kernel level after the installation of hypervisor kernel modules (VIBs). This logical switch is the key to enabling overlay networks that are able to encapsulate and send traffic over existing physical networks. It also allows gateway devices that allow L2 bridging between virtual and physical workloads.The data plane receives its updates from the control plane as hypervisors maintain local virtual machines and VXLAN (Logical switch) mapping tables as well. A loss of data plane will cause a loss of the overlay (VXLAN) network, as virtual machines that are part of a NSX logical switch will not be able to send and receive data. NSX Manager NSX Manager, once deployed and configured, can deploy Controller cluster appliances and prepare the ESXi host that involves installing various vSphere installation bundles (VIB) that allow network virtualization features such as VXLAN, logical switching, logical firewall, and logical routing. NSX Manager can also deploy and configure Edge gateway appliances and its services. The NSX version as of this writing is 6.2 that only supports 1:1 vCenter connectivity. NSX Manager is deployed as a single virtual machine and relies on VMware's HA functionality to ensure its availability. There is no NSX Manager clustering available as of this writing. It is important to note that a loss of NSX Manager will lead to a loss of management and API access, but does not disrupt virtual machine connectivity. Finally, the NSX Manager's configuration UI allows an administrator to collect log bundles and also to back up the NSX configuration. NSX Controller clusters NSX Controller provides a control plane functionality to distribute Logical Routing, VXLAN network information to the underlying hypervisor. Controllers are deployed as Virtual Appliances, and they should be deployed in the same vCenter to which NSX Manager is connected. In a production environment, it is recommended to deploy minimum three controllers. For better availability and scalability, we need to ensure that DRS ant-affinity rules are configured to deploy Controllers on a separate ESXI host. The control plane to management and data plane traffic is secured by a certificate-based authentication. It is important to note that controller nodes employ a scale-out mechanism, where each controller node uses a slicing mechanism that divides the workload equally across all the nodes. This renders all the controller nodes as Active at all times. If one controller node fails, then the other nodes are reassigned the tasks that were owned by the failed node to ensure operational status. The VMware NSX Controller uses a Paxos-based algorithm within the NSX Controller cluster. The Controller removes dependency on multicast routing/PIM in the physical network. It also suppresses broadcast traffic in VXLAN networks. The NSX version 6.2 only supports three controller nodes. VXLAN architecture overview One of the most important functions of NSX is enabling virtual networks. 
These virtual networks or overlay networks have become very popular due to the fact that they can leverage existing network infrastructure without the need to modify it in any way. The decoupling of logical networks from the physical infrastructure allows users to scale rapidly. Overlay networks or VXLAN was developed by a host of vendors that include Arista, Cisco, Citrix, Red Hat, and Broadcom. Due to this joint effort in developing its architecture, it allows the VXLAN standard to be implemented by multiple vendors. VXLAN is a layer 2 over layer 3 tunneling protocol that allows logical network segments to extend on routable networks. This is achieved by encapsulating the Ethernet frame with additional UPD, IP, and VXLAN headers. Consequently, this increases the size of the packet by 50 bytes. Hence, VMware recommends increasing the MTU size to a minimum of 1600 bytes for all the interfaces in the physical infrastructure and any associated vSwitches. When a virtual machine generates traffic meant for another virtual machine on the same virtual network, the hosts on which these source and destination virtual machines run are called VXLAN Tunnel End Point (VTEP). VTEPs are configured as separate VM Kernel interfaces on the hosts. The outer IP header block in the VXLAN frame contains the source and the destination IP addresses that contain the source hypervisor and the destination hypervisor. When a packet leaves the source virtual machine, it is encapsulated at the source hypervisor and sent to the target hypervisor. The target hypervisor, upon receiving this packet, decapsulates the Ethernet frame and forwards it to the destination virtual machine. Once the ESXI host is prepared from NSX Manager, we need to configure VTEP. NSX supports multiple VXLAN vmknics per host for uplink load balancing features. In addition to this, Guest VLAN tagging is also supported. A sample packet flow We face a challenging situation when a virtual machine generates traffic—Broadcast, Unknown Unicast, or Multicast (BUM)—meant for another virtual machine on the same virtual network (VNI) on a different host. Control plane modes play a crucial factor in optimizing the VXLAN traffic depending on the modes selected for the Logical Switch/Transport Scope: Unicast Hybrid Multicast By default, a Logical Switch inherits its replication mode from the transport zone. However, we can set this on a per-Logical-Switch basis. Segment ID is needed for Multicast and Hybrid Modes. The following is a representation of the VXLAN-encapsulated packet showing the VXLAN headers: As indicated in the preceding figure, the outer IP header identifies the source and the destination VTEPs. The VXLAN header also has the Virtual Network Identifier (VNI) that is a 24-bit unique network identifier. This allows the scaling of virtual networks beyond the 4094 VLAN limitation placed by the physical switches. Two virtual machines that are a part of the same virtual network will have the same virtual network identifier, similar to how two machines on the same VLAN share the same VLAN ID. Transport zones A group of ESXi hosts that are able to communicate with one another over the physical network by means of VTEPs are said to be in the same transport zone. A transport zone defines the extension of a logical switch across multiple ESXi clusters that span across multiple virtual distributed switches. A typical environment has more than one virtual distributed switch that spans across multiple hosts. 
A transport zone enables a logical switch to extend across multiple virtual distributed switches, and any ESXi host that is a part of this transport zone can have virtual machines as a part of that logical network. A logical switch is always created as part of a transport zone and ESXi hosts can participate in them. The following is a figure that shows a transport zone that defines the extension of a logical switch across multiple virtual distributed switches: NSX Edge Services Gateway The NSX Edge Services Gateway (ESG) offers a feature rich set of services that include NAT, routing, firewall, load balancing, L2/L3 VPN, and DHCP/DNS relay. NSX API allows each of these services to be deployed, configured, and consumed on-demand. The ESG is deployed as a virtual machine from NSX Manager that is accessed using the vSphere web client. Four different form factors are offered for differently-sized environments. It is important that you factor in enough resources for the appropriate ESG when building your environment. The ESG can be deployed in different sizes. The following are the available size options for an ESG appliance: X-Large: The X-large form factor is suitable for high performance firewall, load balancer, and routing or a combination of multiple services. When an X-large form factor is selected, the ESG will be deployed with six vCPUs and 8GB of RAM. Quad-Large: The Quad-large form factor is ideal for a high performance firewall. It will be deployed with four vCPUs and 1GB of RAM. Large: The large form factor is suitable for medium performance routing and firewall. It is recommended that, in production, you start with the large form factor. The large ESG is deployed with two vCPUs and 1GB of RAM. Compact: The compact form factor is suitable for DHCP and DNS replay functions. It is deployed with one vCPU and 512MB of RAM. Once deployed, a form factor can be upgraded by using the API or the UI. The upgrade action will incur an outage. Edge gateway services can also be deployed in an Active/Standby mode to ensure high availability and resiliency. A heartbeat network between the Edge appliances ensures state replication and uptime. If the active gateway goes down and the "declared dead time" passes, the standby Edge appliance takes over. The default declared dead time is 15 seconds and can be reduced to 6 seconds. Let's look at some of the Edge services as follows: Network Address Translation: The NSX Edge supports both source and destination NAT and NAT is allowed for all traffic flowing through the Edge appliance. If the Edge appliance supports more than 100 virtual machines, it is recommended that a Quad instance be deployed to allow high performance translation. Routing: The NSX Edge allows centralized routing that allows the logical networks deployed in the NSX domain to be routed to the external physical network. The Edge supports multiple routing protocols including OSPF, iBGP, and eBGP. The Edge also supports static routing. Load balancing: The NSX Edge also offers a load balancing functionality that allows the load balancing of traffic between the virtual machines. The load balancer supports different balancing mechanisms including IP Hash, least connections, URI-based, and round robin. Firewall: NSX Edge provides a stateful firewall functionality that is ideal for north-south traffic flowing between the physical and the virtual workloads behind the Edge gateway. 
The Edge firewall can be deployed alongside the hypervisor kernel-based distributed firewall that is primarily used to enforce security policies between workloads in the same logical network. L2/L3VPN: The Edge also provides L2 and L3 VPNs that makes it possible to extend L2 domains between two sites. An IPSEC site-to-site connectivity between two NSX Edges or other VPN termination devices can also be set up. DHCP/DNS relay: NSX Edge also offers DHCP and DNS relay functions that allows you to offload these services to the Edge gateway. Edge only supports DNS relay functionality and can forward any DNS requests to the DNS server. The Edge gateway can be configured as a DHCP server to provide and manage IP addresses, default gateway, DNS servers and, search domain information for workloads connected to the logical networks. Distributed firewall NSX provides L2-L4stateful firewall services by means of a distributed firewall that runs in the ESXi hypervisor kernel. Because the firewall is a function of the ESXi kernel, it provides massive throughput and performs at a near line rate. When the ESXi host is initially prepared by NSX, the distributed firewall service is installed in the kernel by deploying the kernel VIB – VMware Internetworking Service insertion platform or VSIP. VSIP is responsible for monitoring and enforcing security policies on all the traffic flowing through the data plane. The distributed firewall (DFW) throughput and performance scales horizontally as more ESXi hosts are added. DFW instances are associated to each vNIC, and every vNIC requires one DFW instance. A virtual machine with 2 vNICs has two DFW instances associated with it, each monitoring its own vNIC and applying security policies to it. DFW is ideally deployed to protect virtual-to-virtual or virtual-to-physical traffic. This makes DFW very effective in protecting east-west traffic between workloads that are a part of the same logical network. DFW policies can also be used to restrict traffic between virtual machines and external networks because it is applied at the vNIC of the virtual machine. Any virtual machine that does not require firewall protection can be added to the exclusion list. A diagrammatic representation is shown as follows: DFW fully supports vMotion and the rules applied to a virtual machine always follow the virtual machine. This means any manual or automated vMotion triggered by DRS does not cause any disruption in its protection status. The VSIP kernel module also adds spoofguard and traffic redirection functionalities as well. The spoofguard function maintains a VM name and IP address mapping table and prevents against IP spoofing. Spoofguard is disabled by default and needs to be manually enabled per logical switch or virtual distributed switch port group. Traffic redirection allows traffic to be redirected to a third-party appliance that can do enhanced monitoring, if needed. This allows third-party vendors to be interfaced with DFW directly and offer custom services as needed. Cross-vCenterNSX With NSX 6.2, VMware introduced an interesting feature that allows you to manage multiple vCenterNSX environments using a primary NSX Manager. This allows to have easy management and also enables lots of new functionalities including extending networks and other features such as distributed logical routing. Cross-vCenterNSX deployment also allows centralized management and eases disaster recovery architectures. 
In a cross-vCenter deployment, multiple vCenters are all paired with their own NSX Manager per vCenter. One NSX Manager is assigned as the primary while other NSX Managers become secondary. This primary NSX Manager can now deploy a universal controller cluster that provides the control plane. Unlike a standalone vCenter-NSX deployment, secondary NSX Managers do not deploy their own controller clusters. The primary NSX Manager also creates objects whose scope is universal. This means that these objects extend to all the secondary NSX Managers. These universal objects are synchronized across all the secondary NSX Managers and can be edited and changed by the primary NSX Manager only. This does not prevent you from creating local objects on each of the NSX Managers. Similar to local NSX objects, a primary NSX Manager can create global objects such as universal transport zones, universal logical switches, universal distributed routers, universal firewall rules, and universal security objects. There can be only one universal transport zone in a cross-vCenterNSX environment. After it is created, it is synchronized across all the secondary NSX Managers. When a logical switch is created inside a universal transport zone, it becomes a universal logical switch that spans layer 2 network across all the vCenters. All traffic is routed using the universal logical router, and any traffic that needs to be routed between a universal logical switch and a logical switch (local scope) requires an ESG. Summary We began the article with a brief introduction of the NSX core components and looked at the management, control, and the data plane. We then discussed NSX Manager and the NSX Controller clusters. This was followed by a VXLAN architecture overview discussion, where we looked at the VXLAN packet. We then discussed transport zones and NSX Edge gateway services. We ended the article with NSX Distributed firewall services and also an overview of Cross-vCenterNSX deployment. Resources for Article: Further resources on this subject: vRealize Automation and the Deconstruction of Components [article] Monitoring and Troubleshooting Networking [article] Managing Pools for Desktops [article]

Assessment Planning

Packt
04 Jan 2016
12 min read
In this article by Kevin Cardwell the author of the book Advanced Penetration Testing for Highly-Secured Environments - Second Edition, discusses the test environment and how we have selected the chosen platform. We will discuss the following: Introduction to advanced penetration testing How to successfully scope your testing (For more resources related to this topic, see here.) Introduction to advanced penetration testing Penetration testing is necessary to determine the true attack footprint of your environment. It may often be confused with vulnerability assessment and thus it is important that the differences should be fully explained to your clients. Vulnerability assessments Vulnerability assessments are necessary for discovering potential vulnerabilities throughout the environment. There are many tools available that automate this process so that even an inexperienced security professional or administrator can effectively determine the security posture of their environment. Depending on scope, additional manual testing may also be required. Full exploitation of systems and services is not generally in scope for a normal vulnerability assessment engagement. Systems are typically enumerated and evaluated for vulnerabilities, and testing can often be done with or without authentication. Most vulnerability management and scanning solutions provide actionable reports that detail mitigation strategies such as applying missing patches, or correcting insecure system configurations. Penetration testing Penetration testing expands upon vulnerability assessment efforts by introducing exploitation into the mix The risk of accidentally causing an unintentional denial of service or other outage is moderately higher when conducting a penetration test than it is when conducting vulnerability assessments. To an extent, this can be mitigated by proper planning, and a solid understanding of the technologies involved during the testing process. Thus, it is important that the penetration tester continually updates and refines the necessary skills. Penetration testing allows the business to understand if the mitigation strategies employed are actually working as expected; it essentially takes the guesswork out of the equation. The penetration tester will be expected to emulate the actions that an attacker would attempt and will be challenged with proving that they were able to compromise the critical systems targeted. The most successful penetration tests result in the penetration tester being able to prove without a doubt that the vulnerabilities that are found will lead to a significant loss of revenue unless properly addressed. Think of the impact that you would have if you could prove to the client that practically anyone in the world has easy access to their most confidential information! Penetration testing requires a higher skill level than is needed for vulnerability analysis. This generally means that the price of a penetration test will be much higher than that of a vulnerability analysis. If you are unable to penetrate the network you will be ensuring your clientele that their systems are secure to the best of your knowledge. If you want to be able to sleep soundly at night, I recommend that you go above and beyond in verifying the security of your clients. Advanced penetration testing Some environments will be more secured than others. 
You will be faced with environments that use: Effective patch management procedures Managed system configuration hardening policies Multi-layered DMZ's Centralized security log management Host-based security controls Network intrusion detection or prevention systems Wireless intrusion detection or prevention systems Web application intrusion detection or prevention systems Effective use of these controls increases the difficulty level of a penetration test significantly. Clients need to have complete confidence that these security mechanisms and procedures are able to protect the integrity, confidentiality, and availability of their systems. They also need to understand that at times the reason an attacker is able to compromise a system is due to configuration errors, or poorly designed IT architecture. Note that there is no such thing as a panacea in security. As penetration testers, it is our duty to look at all angles of the problem and make the client aware of anything that allows an attacker to adversely affect their business. Advanced penetration testing goes above and beyond standard penetration testing by taking advantage of the latest security research and exploitation methods available. The goal should be to prove that sensitive data and systems are protected even from a targeted attack, and if that is not the case, to ensure that the client is provided with the proper instruction on what needs to be changed to make it so. A penetration test is a snapshot of the current security posture. Penetration testing should be performed on a continual basis. Many exploitation methods are poorly documented, frequently hard to use, and require hands-on experience to effectively and efficiently execute. At DefCon 19 Bruce "Grymoire" Barnett provided an excellent presentation on "Deceptive Hacking". In this presentation, he discussed how hackers use many of the very same techniques used by magicians. This is exactly the tenacity that penetration testers must assume as well. Only through dedication, effort, practice, and the willingness to explore unknown areas will penetration testers be able to mimic the targeted attack types that a malicious hacker would attempt in the wild. Often times you will be required to work on these penetration tests as part of a team and will need to know how to use the tools that are available to make this process more endurable and efficient. This is yet another challenge presented to today's pentesters. Working in a silo is just not an option when your scope restricts you to a very limited testing period. In some situations, companies may use non-standard methods of securing their data, which makes your job even more difficult. The complexity of their security systems working in tandem with each other may actually be the weakest link in their security strategy. The likelihood of finding exploitable vulnerabilities is directly proportional to the complexity of the environment being tested. Before testing begins Before we commence with testing, there are requirements that must be taken into consideration. You will need to determine the proper scoping of the test, timeframes and restrictions, the type of testing (Whitebox, Blackbox), and how to deal with third-party equipment and IP space. Determining scope Before you can accurately determine the scope of the test, you will need to gather as much information as possible. It is critical that the following is fully understood prior to starting testing procedures: Who has the authority to authorize testing? 
What is the purpose of the test? What is the proposed timeframe for the testing? Are there any restrictions as to when the testing can be performed? Does your customer understand the difference between a vulnerability assessment and a penetration test? Will you be conducting this test with, or without cooperation of the IT Security Operations Team? Are you testing their effectiveness? Is social engineering permitted? How about denial-of-service attacks? Are you able to test physical security measures used to secure servers, critical data storage, or anything else that requires physical access? For example, lock picking, impersonating an employee to gain entry into a building, or just generally walking into areas that the average unaffiliated person should not have access to. Are you allowed to see the network documentation or to be informed of the network architecture prior to testing to speed things along? (Not necessarily recommended as this may instill doubt for the value of your findings. Most businesses do not expect this to be easy information to determine on your own.) What are the IP ranges that you are allowed to test against? There are laws against scanning and testing systems without proper permissions. Be extremely diligent when ensuring that these devices and ranges actually belong to your client or you may be in danger of facing legal ramifications. What are the physical locations of the company? This is more valuable to you as a tester if social engineering is permitted because it ensures that you are at the sanctioned buildings when testing. If time permits, you should let your clients know if you were able to access any of this information publicly in case they were under the impression that their locations were secret or difficult to find. What to do if there is a problem or if the initial goal of the test has been reached. Will you continue to test to find more entries or is the testing over? This part is critical and ties into the question of why the customer wants a penetration test in the first place. Are there legal implications that you need to be aware of such as systems that are in different countries, and so on? Not all countries have the same laws when it comes to penetration testing. Will additional permission be required once a vulnerability has been exploited? This is important when performing tests on segmented networks. The client may not be aware that you can use internal systems as pivot points to delve deeper within their network. How are databases to be handled? Are you allowed to add records, users, and so on? This listing is not all-inclusive and you may need to add items to the list depending on the requirements of your clients. Much of this data can be gathered directly from the client, but some will have to be handled by your team. If there are legal concerns, it is recommended that you seek legal counsel to ensure you fully understand the implications of your testing. It is better to have too much information than not enough, once the time comes to begin testing. In any case, you should always verify for yourself that the information you have been given is accurate. You do not want to find out that the systems you have been accessing do not actually fall under the authority of the client! It is of utmost importance to gain proper authorization in writing before accessing any of your clients systems. Failure to do so may result in legal action and possibly jail. Use proper judgment! 
You should also consider that errors and omissions insurance is a necessity when performing penetration testing. Setting limits–nothing lasts forever Setting proper limitations is essential if you want to be successful at performing penetration testing. Your clients need to understand the full ramifications involved, and should be made aware of any residual costs incurred, if additional services beyond those listed within the contract are needed. Be sure to set defined start and end dates for your services. Clearly define the rules of engagement and include IP ranges, buildings, hours, and so on that may need to be tested. If it is not in your rules of engagement documentation, it should not be tested. Meetings should be predefined prior to the start of testing, and the customer should know exactly what your deliverables will be. Rules of engagement documentation Every penetration test will need to start with a rules of engagement document that all involved parties must have. This document should at a minimum cover several items: Proper permissions by appropriate personnel. Begin and end dates for your testing. The type of testing that will be performed. Limitations of testing. What type of testing is permitted? DDOS? Full penetration? Social engineering? These questions need to be addressed in detail. Can intrusive tests as well as unobtrusive testing be performed? Does your client expect cleanup to be performed afterwards or is this a stage environment that will be completely rebuilt after testing has been completed? IP ranges and physical locations to be tested. How the report will be transmitted at the end of the test? (Use secure means of transmission!) Which tools will be used during the test? Do not limit yourself to only one specific tool; it may be beneficial to provide a list of the primary toolset to avoid confusion in the future. For example, we will use the tools found in the most recent edition of the Kali Suite. Let your client know how any illegal data that is found during testing would be handled: law enforcement should be contacted prior to the client. Please be sure to understand fully the laws in this regard before conducting your test. How sensitive information will be handled: you should not be downloading sensitive customer information; there are other methods of proving that the clients' data is not secured. This is especially important when regulated data is a concern. Important contact information for both your team and for the key employees of the company you are testing. An agreement of what you will do to ensure the customer's system information does not remain on unsecured laptops and desktops used during testing. Will you need to properly scrub your machine after this testing? What do you plan to do with the information you gathered? Is it to be kept somewhere for future testing? Make sure this has been addressed before you start testing, not after. The rules of engagement should contain all the details that are needed to determine the scope of the assessment. Any questions should have been answered prior to drafting your rules of engagement to ensure there are no misunderstandings once the time comes to test. Your team members need to keep a copy of this signed document on their person at all times when performing the test. Imagine you have been hired to assert the security posture of a client's wireless network and you are stealthily creeping along the parking lot on private property with your gigantic directional Wi-Fi antenna and a laptop. 
If someone witnesses you in this act, they will probably be concerned and call the authorities. You will need to have something on you that documents you have a legitimate reason to be there. This is one time where having the contact information of the business leaders that hired you will come in extremely handy! Summary In this article, we focused on all that is necessary to prepare and plan for a successful penetration test. We discussed the differences between penetration testing and vulnerability assessments. The steps involved with proper scoping were detailed, as were the necessary steps to ensure all information has been gathered prior to testing. One thing to remember is that proper scoping and planning is just as important as ensuring you test against the latest and greatest vulnerabilities. Resources for Article: Further resources on this subject: Penetration Testing[article] Penetration Testing and Setup[article] BackTrack 4: Security with Penetration Testing Methodology[article]

Different IR Algorithms

Packt
04 Jan 2016
23 min read
In this article written by Sudipta Mukherjee, author of the book F# for Machine Learning, we learn about information retrieval. Information overload is almost a passé term; however, it is still valid. Information retrieval is a big arena and most of it is far from being solved. That being said, we have come a long way, and the results produced by some of the state-of-the-art information retrieval algorithms are really impressive. You may not know that you are using information retrieval, but whenever you search for some documents on your PC or on the internet, you are actually using the product of some information retrieval algorithm in the background. So, as the metaphor goes, finding the needle (read: information/insight) in a haystack (read: your data archive on your PC or on the web) is the key to a successful business. Broadly, two documents can be matched in the following ways:
Distance based: Two documents are matched based on their proximity, calculated by several distance metrics on the vector representation of the documents
Set based: Two documents are matched based on their proximity, calculated by several set-based/fuzzy-set-based metrics on the bag of words (BoW) model of the documents
(For more resources related to this topic, see here.) What are some interesting things that you can do? You will learn how the same algorithm can find similar biscuits and identify the author of digital documents from the words authors use. You will also learn how IR distance metrics can be used to group color images. Information retrieval using tf-idf Whenever you type some search term in your "Windows" search box, some documents appear matching your search term. There is a common, well-known, easy-to-implement algorithm that makes it possible to rank the documents based on the search term. Basically, the algorithm allows the developers to assign some kind of score to each document in the result set. That score can be seen as a score of confidence that the system has on how much the user would like that result. The score that this algorithm attaches to each document is a product of two different scores. The first one is called term frequency (tf) and the other one is called inverse document frequency (idf). Their product is referred to as tf-idf or "term frequency inverse document frequency". Tf is the number of times some term occurs in a given document. Idf is the ratio between the total number of documents scanned and the number of documents in which a given search term is found. However, this ratio is not used as is; the log of this ratio is used as idf, as shown next. The following is a term frequency and inverse document frequency example for the word "example":
tf-idf("example", d) = tf("example", d) * idf("example", D)
This is the same as:
tf-idf("example", d) = tf("example", d) * log10(N / n)
Idf is normally calculated with the following formula:
idf(t, D) = log10(N / n), where N is the total number of documents in the collection D and n is the number of documents that contain the term t
The following is the code that demonstrates how to find the tf-idf score for a given search term; in this case, "example". The sentences are fabricated to produce the desired count of the word "example" in document 2; in this case, "sentence2". Here, D denotes the set of all the documents and d denotes a single document.
let sentence1 = "this is a a sample"
let sentence2 = "this is another another example example example"
let word = "example"
let numberOfDocs = 2.
let tf1 = sentence1.Split ' ' |> Array.filter (fun t -> t = word) |> Array.length
let tf2 = sentence2.Split ' ' |> Array.filter (fun t -> t = word) |> Array.length
let docs = [|sentence1; sentence2|]
let foundIn =
    docs
    |> Array.map (fun t -> t.Split ' ' |> Array.filter (fun z -> z = word))
    |> Array.filter (fun m -> m |> Array.length <> 0)
    |> Array.length
let idf = Operators.log10 (numberOfDocs / float foundIn)
let pr1 = float tf1 * idf
let pr2 = float tf2 * idf
printfn "%f %f" pr1 pr2
This produces the following output:
0.000000 0.903090
This means that the second document is more closely related to the word "example" than the first one. In this case, this is one of the extreme cases where the word doesn't appear at all in one document and appears three times in the other. However, with the same word occurring multiple times in both the documents, you will get different scores for each document. You can think of these scores as confidence scores for the association between the word and the document. The higher the score, the higher the confidence that the document has something related to that word. Measures of similarity In the following section, you will create a framework for finding several distance measures. A distance between two probability distribution functions (pdf) is an important way to know how close two entities are. One way to generate the pdfs from a histogram is to normalize the histogram. Generating a PDF from a histogram A histogram holds the number of times a value occurred. For example, a text can be represented as a histogram where the histogram values represent the number of times a word appears in the text. For gray images, it can be the number of times each gray scale appears in the image. In the following section, you will build a few modules that hold several distance metrics. A distance metric is a measure of how similar two objects are. It can also be called a measure of proximity. The following metrics use the notation pi and qi to denote the ith value of either PDF. Let's say we have a histogram denoted by h, and there are n elements in the histogram. Then a rough estimate of pi is hi / n, where hi is the count in the ith bin. The following function does this transformation from histogram to pdf:
let toPdf (histogram:int list) =
    histogram |> List.map (fun t -> float t / float histogram.Length)
There are a couple of important assumptions made here. First, we assume that the number of bins is equal to the number of elements, so the histogram and the pdf will have the same number of elements. That's not exactly correct in the mathematical sense: all the elements of a pdf should add up to 1, which toPdf does not guarantee. The following implementation, histToPdf, divides by the sum instead and does guarantee that; however, the rest of this article sticks with the simpler toPdf shown above:
let histToPdf (histogram:int list) =
    let sum = histogram |> List.sum
    histogram |> List.map (fun t -> float t / float sum)
Generating a histogram from a list of elements is simple. Following is the function that takes a list and returns the histogram. F# has the function already built in; it is called countBy:
let listToHist (l : int list) =
    l |> List.countBy (fun t -> t)
Next is an example of how, using these two functions, a list of integers can be transformed into a pdf.
The following method takes a list of integers and returns the associated probability distribution:

let listToPdf (aList : int list) =
    aList |> List.countBy (fun t -> t)
          |> List.map snd
          |> toPdf

Here is how you can use it:

let list = [1;1;1;2;3;4;5;1;2]
let pdf = list |> listToPdf

Running this in F# Interactive, I got the following histogram from the preceding list:

val it : (int * int) list = [(1, 4); (2, 2); (3, 1); (4, 1); (5, 1)]

If you just project the second element of each pair in this histogram and store it in an int list, then you can represent the histogram as a list. So, for this example, the histogram can be represented as:

let hist = [4;2;1;1;1]

Distance metrics are classified into several families based on their structural similarity. The following sections show how to implement these metrics in F#, using histograms represented as int list values as input. P and Q denote the vectors that represent the entities being compared. For example, for a document retrieval system, these numbers might indicate the number of times a given word occurred in each of the documents that are being compared. Pi denotes the ith element of the vector P and Qi denotes the ith element of the vector Q. Some literature calls these vectors the profile vectors.

Minkowski family

When two pdfs are almost the same, this family of distance metrics tends towards zero; the further apart they are, the larger the values become. So, if the distance metric between two pdfs is close to zero, we can conclude that they are similar, and if the distance is large, we can conclude otherwise. All of these metrics are special cases of what's known as the Minkowski distance.

Euclidean distance

The following code implements Euclidean distance (the square root of the sum of squared differences):

// Euclidean distance
let euclidean (p:int list) (q:int list) =
    List.zip p q |> List.sumBy (fun t -> float (fst t - snd t) ** 2.)
                 |> sqrt

City block distance

The following code implements City block distance (the sum of absolute differences):

let cityBlock (p:int list) (q:int list) =
    List.zip p q |> List.sumBy (fun t -> float (abs (fst t - snd t)))

Chebyshev distance

The following code implements Chebyshev distance:

let chebyshev (p:int list) (q:int list) =
    List.zip p q |> List.map (fun t -> abs (fst t - snd t)) |> List.max

L1 family

This family of distances relies on normalization to keep the values within limits. All these metrics are of the form A/B, where A is primarily a measure of proximity between the two pdfs P and Q. Most of the time, A is calculated based on the absolute distance. For example, the numerator of the Sørensen distance is the City block distance, while the denominator is a normalization component obtained by adding each element of the two participating pdfs.

Sørensen

The following code implements Sørensen distance:

let sorensen (p:int list) (q:int list) =
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy (fun t -> abs (fst t - snd t))
    let denominator = zipped |> List.sumBy (fun t -> fst t + snd t)
    numerator / denominator

Gower distance

The following code implements Gower distance. Note that there would be a division by zero if the collection p were empty.

let gower (p:int list) (q:int list) =
    // a free-flowing pipeline conversion,
    // rather than cramping the abs of fst t - snd t into a single line
    let numerator = List.zip (p |> toPdf) (q |> toPdf)
                    |> List.map (fun t -> fst t - snd t)
                    |> List.map abs
                    |> List.sum
    let denominator = float p.Length
    numerator / denominator

Soergel

The following code implements Soergel distance:

let soergel (p:int list) (q:int list) =
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy (fun t -> abs (fst t - snd t))
    let denominator = zipped |> List.sumBy (fun t -> max (fst t) (snd t))
    numerator / denominator

Kulczynski d

The following code implements Kulczynski d distance:

let kulczynski_d (p:int list) (q:int list) =
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy (fun t -> abs (fst t - snd t))
    let denominator = zipped |> List.sumBy (fun t -> min (fst t) (snd t))
    numerator / denominator

Kulczynski s

The following code implements Kulczynski s distance, which is the reciprocal of Kulczynski d:

let kulczynski_s (p:int list) (q:int list) =
    1. / kulczynski_d p q

Canberra distance

The following code implements Canberra distance:

let canberra (p:int list) (q:int list) =
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy (fun t -> abs (fst t - snd t))
    let denominator = zipped |> List.sumBy (fun t -> fst t + snd t)
    numerator / denominator

Intersection family

This family of distances tries to find the overlap between the two participating pdfs.

Intersection

The following code implements Intersection distance:

let intersection (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.map (fun t -> min (fst t) (snd t)) |> List.sum

Wave Hedges

The following code implements Wave Hedges distance:

let waveHedges (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> 1.0 - (min (fst t) (snd t)) / (max (fst t) (snd t)))

Czekanowski distance

The following code implements Czekanowski distance:

let czekanowski (p:int list) (q:int list) =
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = 2. * (zipped |> List.sumBy (fun t -> min (fst t) (snd t)))
    let denominator = zipped |> List.sumBy (fun t -> fst t + snd t)
    numerator / denominator

Motyka

The following code implements Motyka distance:

let motyka (p:int list) (q:int list) =
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy (fun t -> min (fst t) (snd t))
    let denominator = zipped |> List.sumBy (fun t -> fst t + snd t)
    numerator / denominator

Ruzicka

The following code implements Ruzicka distance:

let ruzicka (p:int list) (q:int list) =
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy (fun t -> min (fst t) (snd t))
    let denominator = zipped |> List.sumBy (fun t -> max (fst t) (snd t))
    numerator / denominator

Inner Product family

Distances belonging to this family are calculated from some product of pairwise elements of both the participating pdfs. This product is then normalized with a value that is also calculated from the pairwise elements.
Inner product

The following code implements Inner product distance:

let innerProduct (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> fst t * snd t)

Harmonic mean

The following code implements Harmonic mean distance:

let harmonicMean (p:int list) (q:int list) =
    2. * (List.zip (p |> toPdf) (q |> toPdf)
          |> List.sumBy (fun t -> (fst t * snd t) / (fst t + snd t)))

Cosine similarity

The following code implements the Cosine similarity measure. It works on the raw count vectors and uses two small helpers, sqr and prod:

let cosineSimilarity (p:int list) (q:int list) =
    let sqr x = x * x
    let prod (x, y) = float x * float y
    let zipped = List.zip p q
    let numerator = zipped |> List.sumBy prod
    let denominator = sqrt (p |> List.map sqr |> List.sum |> float) *
                      sqrt (q |> List.map sqr |> List.sum |> float)
    numerator / denominator

Kumar Hassebrook

The following code implements the Kumar Hassebrook distance measure:

let kumarHassebrook (p:int list) (q:int list) =
    let sqr x = x * x
    let prod (x, y) = x * y
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy prod
    let denominator = float (p |> List.sumBy sqr) +
                      float (q |> List.sumBy sqr) - numerator
    numerator / denominator

Dice coefficient

The following code implements the Dice coefficient:

let dicePoint (p:int list) (q:int list) =
    let sqr x = x * x
    let zipped = List.zip (p |> toPdf) (q |> toPdf)
    let numerator = zipped |> List.sumBy (fun t -> fst t * snd t)
    let denominator = (p |> List.sumBy sqr) +
                      (q |> List.sumBy sqr)
    numerator / float denominator

Fidelity family or squared-chord family

This family of distances uses the square root as an instrument to keep the distance within limits. Sometimes other functions, such as log, are also used.

Fidelity

The following code implements the Fidelity distance measure:

let fidelity (p:int list) (q:int list) =
    let prod (x, y) = x * y
    List.zip (p |> toPdf) (q |> toPdf) |> List.map prod
                                       |> List.map sqrt
                                       |> List.sum

Bhattacharya

The following code implements the Bhattacharya distance measure:

let bhattacharya (p:int list) (q:int list) =
    -log (fidelity p q)

Hellinger

The following code implements the Hellinger distance measure:

let hellinger (p:int list) (q:int list) =
    let prod (x, y) = x * y
    let product = List.zip (p |> toPdf) (q |> toPdf)
                  |> List.map prod |> List.sumBy sqrt
    let right = 1. - product
    2. * right |> abs // taking the abs off will result in NaN
               |> sqrt

Matusita

The following code implements the Matusita distance measure:

let matusita (p:int list) (q:int list) =
    let prod (x, y) = x * y
    let value = 2. - 2. * (List.zip (p |> toPdf) (q |> toPdf)
                           |> List.map prod
                           |> List.sumBy sqrt)
    value |> abs |> sqrt

Squared Chord

The following code implements the Squared Chord distance measure:

let squaredChord (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> (sqrt (fst t) - sqrt (snd t)) ** 2.)

Squared L2 family

This is almost the same as the L1 family, except that it gets rid of the expensive square root operation and relies on the squares instead. However, that should not be an issue. Sometimes the squares can be quite large, so a normalization scheme is provided by dividing the squared sum by another squared sum, as done in "Divergence".
Squared Euclidean

The following code implements the Squared Euclidean distance measure. For most purposes, this can be used instead of Euclidean distance, as it is computationally cheaper and performs just as well.

let squaredEuclidean (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> (fst t - snd t) ** 2.0)

Squared Chi

The following code implements the Squared Chi distance measure:

let squaredChi (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> (fst t - snd t) ** 2.0 / (fst t + snd t))

Pearson's Chi

The following code implements Pearson's Chi distance measure:

let pearsonsChi (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> (fst t - snd t) ** 2.0 / snd t)

Neyman's Chi

The following code implements Neyman's Chi distance measure:

let neymanChi (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> (fst t - snd t) ** 2.0 / fst t)

Probabilistic Symmetric Chi

The following code implements the Probabilistic Symmetric Chi distance measure:

let probabilisticSymmetricChi (p:int list) (q:int list) =
    2.0 * squaredChi p q

Divergence

The following code implements the Divergence measure. This metric is useful when the elements of the collections are of different orders of magnitude; the normalization makes the distance properly adjusted for several kinds of usage.

let divergence (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> (fst t - snd t) ** 2. / (fst t + snd t) ** 2.)

Clark

The following code implements Clark's distance measure:

let clark (p:int list) (q:int list) =
    sqrt (List.zip (p |> toPdf) (q |> toPdf)
          |> List.map (fun t -> abs (fst t - snd t) / (fst t + snd t))
          |> List.sumBy (fun t -> t * t))

Additive Symmetric Chi

The following code implements the Additive Symmetric Chi distance measure:

let additiveSymmetricChi (p:int list) (q:int list) =
    List.zip (p |> toPdf) (q |> toPdf)
        |> List.sumBy (fun t -> (fst t - snd t) ** 2. * (fst t + snd t) / (fst t * snd t))

Summary

Congratulations! You learnt how different similarity measures work and when to use which one to find the closest match. Edmund Burke said, "It's the nature of every greatness not to be exact", and I can't agree more. Most of the time the users aren't really sure what they are looking for, so providing a binary answer of yes or no, or found or not found, is not that useful. Striking the middle ground by attaching a confidence score to each result is the key. The techniques that you learnt will prove to be useful when we deal with recommender systems and anomaly detection, because both of these fields rely heavily on IR techniques.

Resources for Article:

Further resources on this subject: Learning Option Pricing [article] Working with Windows Phone Controls [article] Simplifying Parallelism Complexity in C# [article]
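As a quick illustrative sketch tying the pieces together (the sample sentences and the profile helper below are made up purely for demonstration and are not part of the original listing), two small documents can be turned into word-count profile vectors over a shared vocabulary and then compared with any of the metrics defined above:

// Build word-count profile vectors over a shared vocabulary, then compare them
let docA = "the cat sat on the mat"
let docB = "the cat ate the rat"

// Shared vocabulary so both profile vectors line up element by element
let vocab = (docA + " " + docB).Split ' ' |> Array.distinct

let profile (doc : string) =
    let words = doc.Split ' '
    vocab |> Array.map (fun w -> words |> Array.filter ((=) w) |> Array.length)
          |> Array.toList

let p = profile docA
let q = profile docB

printfn "Euclidean distance: %f" (euclidean p q)
printfn "Cosine similarity:  %f" (cosineSimilarity p q)
printfn "Intersection:       %f" (intersection p q)

A smaller Euclidean distance and larger cosine or intersection values indicate more similar documents; swapping in any of the other metrics only changes the line that performs the comparison.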

Flexible Layouts with Swift and UIStackView

Milton Moura
04 Jan 2016
12 min read
In this post we will build a Sign In and Password Recovery form with a single flexible layout, using Swift and the UIStackView class, which has been available since the release of the iOS 9 SDK. By taking advantage of UIStackView's properties, we will dynamically adapt to the device's orientation and show / hide different form components with animations. The source code for this post can be found in this GitHub repository.

Auto Layout

Auto Layout has become a requirement for any application that wants to adhere to modern best practices of iOS development. When introduced in iOS 6, it was optional and full visual support in Interface Builder just wasn't there. With the release of iOS 8 and the introduction of Size Classes, the tools and the API improved, but you could still dodge and avoid Auto Layout. But now, we are at a point where, in order to fully support all device sizes and split-screen multitasking on the iPad, you must embrace it and design your applications with a flexible UI in mind.

The problem with Auto Layout

Auto Layout basically works as a linear equation solver, taking all of the constraints defined in your views and subviews and calculating the correct sizes and positioning for them. One disadvantage of this approach is that you are obligated to define, typically, between 2 and 6 constraints for each control you add to your view. With different constraint sets for different size classes, the total number of constraints increases considerably and the complexity of managing them increases as well.

Enter the Stack View

In order to reduce this complexity, the iOS 9 SDK introduced UIStackView, an interface control that serves the single purpose of laying out collections of views. A UIStackView will dynamically adapt its contained views' layout to the device's current orientation, screen sizes and other changes in its views. You should keep the following stack view properties in mind:

The views contained in a stack view can be arranged either Vertically or Horizontally, in the order they were added to the arrangedSubviews array.

You can embed stack views within each other, recursively.

The contained views are laid out according to the stack view's distribution and alignment types. These attributes specify how the view collection is laid out across the span of the stack view (distribution) and how to align all subviews within the stack view's container (alignment).

Most properties are animatable, and inserting / deleting / hiding / showing views within an animation block will also be animated.

Even though you can use a stack view within a UIScrollView, don't try to replicate the behaviour of a UITableView or UICollectionView, as you'll soon regret it.

Apple recommends that you use UIStackView for all cases, as it will seriously reduce constraint overhead. Just be sure to judiciously use compression and content hugging priorities to solve possible layout ambiguities.

A Flexible Sign In / Recover Form

The sample application we'll build features a simple Sign In form, with the option of recovering a forgotten password, all in a single screen. When tapping on the "Forgot your password?" button, the form will change, hiding the password text field and showing the new call-to-action buttons and message labels. By canceling the password recovery action, these new controls will be hidden once again and the form will return to its initial state.

1. Creating the form

This is what the form will look like when we're done.
Let's start by creating a new iOS > Single View Application template. Then, we add a new UIStackView to the ViewController and add some constraints for positioning it within its parent view. Since we want a full screen width vertical form, we set its axis to .Vertical, the alignment to .Fill and the distribution to .FillProportionally, so that individual views within the stack view can grow bigger or smaller, according to their content.    class ViewController : UIViewController    {        let formStackView = UIStackView()        ...        override func viewDidLoad() {            super.viewDidLoad()                       // Initialize the top-level form stack view            formStackView.axis = .Vertical            formStackView.alignment = .Fill            formStackView.distribution = .FillProportionally            formStackView.spacing = 8            formStackView.translatesAutoresizingMaskIntoConstraints = false                       view.addSubview(formStackView)                       // Anchor it to the parent view            view.addConstraints(                NSLayoutConstraint.constraintsWithVisualFormat("H:|-20-[formStackView]-20-|", options: [.AlignAllRight,.AlignAllLeft], metrics: nil, views: ["formStackView": formStackView])            )            view.addConstraints(                NSLayoutConstraint.constraintsWithVisualFormat("V:|-20-[formStackView]-8-|", options: [.AlignAllTop,.AlignAllBottom], metrics: nil, views: ["formStackView": formStackView])            )            ...        }        ...    } Next, we'll add all the fields and buttons that make up our form. We'll only present a couple of them here as the rest of the code is boilerplate. In order to refrain UIStackView from growing the height of our inputs and buttons as needed to fill vertical space, we add height constraints to set the maximum value for their vertical size.    class ViewController : UIViewController    {        ...        var passwordField: UITextField!        var signInButton: UIButton!        var signInLabel: UILabel!        var forgotButton: UIButton!        var backToSignIn: UIButton!        var recoverLabel: UILabel!        var recoverButton: UIButton!        ...               override func viewDidLoad() {            ...                       
// Add the email field            let emailField = UITextField()            emailField.translatesAutoresizingMaskIntoConstraints = false            emailField.borderStyle = .RoundedRect            emailField.placeholder = "Email Address"            formStackView.addArrangedSubview(emailField)                       // Make sure we have a height constraint, so it doesn't change according to the stackview auto-layout            emailField.addConstraints(                NSLayoutConstraint.constraintsWithVisualFormat("V:[emailField(<=30)]", options: [.AlignAllTop, .AlignAllBottom], metrics: nil, views: ["emailField": emailField])             )                       // Add the password field            passwordField = UITextField()            passwordField.translatesAutoresizingMaskIntoConstraints = false            passwordField.borderStyle = .RoundedRect            passwordField.placeholder = "Password"            formStackView.addArrangedSubview(passwordField)                       // Make sure we have a height constraint, so it doesn't change according to the stackview auto-layout            passwordField.addConstraints(                 NSLayoutConstraint.constraintsWithVisualFormat("V:[passwordField(<=30)]", options: .AlignAllCenterY, metrics: nil, views: ["passwordField": passwordField])            )            ...        }        ...    } 2. Animating by showing / hiding specific views By taking advantage of the previously mentioned properties of UIStackView, we can transition from the Sign In form to the Password Recovery form by showing and hiding specific field and buttons. We do this by setting the hidden property within a UIView.animateWithDuration block.    class ViewController : UIViewController    {        ...        // Callback target for the Forgot my password button, animates old and new controls in / out        func forgotTapped(sender: AnyObject) {            UIView.animateWithDuration(0.2) { [weak self] () -> Void in                self?.signInButton.hidden = true                self?.signInLabel.hidden = true                self?.forgotButton.hidden = true                self?.passwordField.hidden = true                self?.recoverButton.hidden = false                self?.recoverLabel.hidden = false                self?.backToSignIn.hidden = false            }        }               // Callback target for the Back to Sign In button, animates old and new controls in / out        func backToSignInTapped(sender: AnyObject) {            UIView.animateWithDuration(0.2) { [weak self] () -> Void in                self?.signInButton.hidden = false                self?.signInLabel.hidden = false                self?.forgotButton.hidden = false                self?.passwordField.hidden = false                self?.recoverButton.hidden = true                self?.recoverLabel.hidden = true                self?.backToSignIn.hidden = true            }        }        ...    } 3. Handling different Size Classes Because we have many vertical input fields and buttons, space can become an issue when presenting in a compact vertical size, like the iPhone in landscape. To overcome this, we add a stack view to the header section of the form and change its axis orientation between Vertical and Horizontal, according to the current active size class.    override func viewDidLoad() {        ...        
// Initialize the header stack view, that will change orientation type according to the current size class        headerStackView.axis = .Vertical        headerStackView.alignment = .Fill        headerStackView.distribution = .Fill        headerStackView.spacing = 8        headerStackView.translatesAutoresizingMaskIntoConstraints = false        ...    }       // If we are presenting in a Compact Vertical Size Class, let's change the header stack view axis orientation    override func willTransitionToTraitCollection(newCollection: UITraitCollection, withTransitionCoordinator coordinator: UIViewControllerTransitionCoordinator) {        if newCollection.verticalSizeClass == .Compact {            headerStackView.axis = .Horizontal        } else {            headerStackView.axis = .Vertical        }    } 4. The flexible form layout So, with a couple of UIStackViews, we've built a flexible form only by defining a few height constraints for our input fields and buttons, with all the remaining constraints magically managed by the stack views. Here is the end result: Conclusion We have included in the sample source code a view controller with this same example but designed with Interface Builder. There, you can clearly see that we have less than 10 constraints, on a layout that could easily have up to 40-50 constraints if we had not used UIStackView. Stack Views are here to stay and you should use them now if you are targeting iOS 9 and above. About the author Milton Moura (@mgcm) is a freelance iOS developer based in Portugal. He has worked professionally in several industries, from aviation to telecommunications and energy and is now fully dedicated to creating amazing applications using Apple technologies. With a passion for design and user interaction, he is also very interested in new approaches to software development. You can find out more at http://defaultbreak.com

Remote Sensing and Histogram

Packt
04 Jan 2016
10 min read
In this article by Joel Lawhead, the author of Learning GeoSpatial Analysis with Python - Second Edition, we will discuss remote sensing. This field grows more exciting every day as more satellites are launched and the distribution of data becomes easier. The high availability of satellite and aerial images as well as the interesting new types of sensors that are being launched each year is changing the role that remote sensing plays in understanding our world. In this field, Python is quite capable. In remote sensing, we step through each pixel in an image and perform a type of query or mathematical process. An image can be thought of as a large numerical array and in remote sensing, these arrays can be as large as tens of megabytes to several gigabytes. While Python is fast, only C-based libraries can provide the speed that is needed to loop through the arrays at a tolerable speed. (For more resources related to this topic, see here.) In this article, whenever possible, we’ll use Python Imaging Library (PIL) for image processing and NumPy, which provides multidimensional array mathematics. While written in C for speed, these libraries are designed for Python and provide Python’s API. In this article, we’ll start with basic image manipulation and build on each exercise all the way to automatic change detection. Here are the topics that we’ll cover: Swapping image bands Creating image histograms Swapping image bands Our eyes can only see colors in the visible spectrum as combinations of red, green, and blue (RGB). Airborne and spaceborne sensors can collect wavelengths of the energy that is outside the visible spectrum. In order to view this data, we move images representing different wavelengths of light reflectance in and out of the RGB channels in order to make color images. These images often end up as bizarre and alien color combinations that can make visual analysis difficult. An example of a typical satellite image is seen in the following Landsat 7 satellite scene near the NASA's Stennis Space Center in Mississippi along the Gulf of Mexico, which is a leading space center for remote sensing and geospatial analysis in general: Most of the vegetation appears red and water appears almost black. This image is one type of false color image, meaning that the color of the image is not based on the RGB light. However, we can change the order of the bands or swap certain bands in order to create another type of false color image that looks more like the world we are used to seeing. In order to do so, you first need to download this image as a ZIP file from the following: http://git.io/vqs41 We need to install the GDAL library with Python bindings. The Geospatial Data Abstraction Library(GDAL)includes a module called gdalnumeric that loads and saves remotely sensed images to and from NumPy arrays for easy manipulation. GDAL itself is a data access library and does not provide much in the name of processing. Therefore, in this article, we will rely heavily on NumPy to actually change images. In this example, we’ll load the image in a NumPy array using gdalnumeric and then, we’ll immediately save it in a new .tiff file. However, upon saving, we’ll use NumPy’s advanced array slicing feature to change the order of the bands. Images in NumPy are multidimensional arrays in the order of band, height, and width. Therefore, an image with three bands will be an array of length three, containing an array for each band, the height, and width of the image. 
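As a quick aside (this small check is not part of the original text, just an illustrative sketch using the FalseColor.tif sample image downloaded above), you can confirm the band/height/width layout by loading the image and printing the array's shape:

from osgeo import gdalnumeric

# Load the sample image into a NumPy array and inspect its dimensions
arr = gdalnumeric.LoadFile("FalseColor.tif")
print(arr.shape)      # (3, rows, columns) - the band axis comes first
print(arr[0].shape)   # a single band is a (rows, columns) array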
It's important to note that NumPy references the array locations as y,x (row, column) instead of the usual column and row format that we work with in spreadsheets and other software:

from osgeo import gdalnumeric

src = "FalseColor.tif"
arr = gdalnumeric.LoadFile(src)
gdalnumeric.SaveArray(arr[[1, 0, 2], :], "swap.tif", format="GTiff", prototype=src)

Also, in the SaveArray() method, the last argument is called prototype. This argument lets you specify another image for GDAL from which we can copy spatial reference information and some other image parameters. Without this argument, we'd end up with an image without georeferencing information, which cannot be used in a geographical information system (GIS). In this case, we specified our input image file name, as the images are identical except for the band order. The result of this example produces the swap.tif image, which is a much more visually appealing image with green vegetation and blue water:

There's only one problem with this image. It's a little dark and difficult to see. Let's see if we can figure out why.

Creating histograms

A histogram shows the statistical frequency of data distribution in a dataset. In the case of remote sensing, the dataset is an image, and the data distribution is the frequency of the pixels in the range of 0 to 255, which is the range of the 8-bit numbers used to store image information on computers. In an RGB image, color is represented as a three-element tuple with (0,0,0) being black and (255,255,255) being white. We can graph the histogram of an image with the frequency of each value along the y axis and the 256 possible pixel values along the x axis. We can use the Turtle graphics engine that is included with Python to create a simple GIS. We can also use it to easily graph histograms. Histograms are usually a one-off product, which makes a quick script, like this example, a good fit. Also, histograms are typically displayed as a bar graph with the width of the bars representing the size of the grouped data bins. However, in an image, each bin is only one value, so we'll create a line graph. We'll use the histogram function in this example and create a red, green, and blue line for each band. The graphing portion of this example also defaults to scaling the y axis values to the maximum RGB frequency that is found in the image. Technically, the y axis represents the maximum frequency, that is, the number of pixels in the image, which would be the case if the image was of one color. We'll use the turtle module again; however, this example could be easily converted to any graphical output module. However, this format makes the distribution harder to see.
Let’s take a look at our swap.tif image: from osgeo import gdalnumeric import turtle as t def histogram(a, bins=list(range(0, 256))): fa = a.flat n = gdalnumeric.numpy.searchsorted(gdalnumeric.numpy.sort(fa), bins) n = gdalnumeric.numpy.concatenate([n, [len(fa)]]) hist = n[1:]-n[:-1] return hist defdraw_histogram(hist, scale=True): t.color("black") axes = ((-355, -200), (355, -200), (-355, -200), (-355, 250)) t.up() for p in axes: t.goto(p) t.down() t.up() t.goto(0, -250) t.write("VALUE", font=("Arial, ", 12, "bold")) t.up() t.goto(-400, 280) t.write("FREQUENCY", font=("Arial, ", 12, "bold")) x = -355 y = -200 t.up() for i in range(1, 11): x = x+65 t.goto(x, y) t.down() t.goto(x, y-10) t.up() t.goto(x, y-25) t.write("{}".format((i*25)), align="center") x = -355 y = -200 t.up() pixels = sum(hist[0]) if scale: max = 0 for h in hist: hmax = h.max() if hmax> max: max = hmax pixels = max label = pixels/10 for i in range(1, 11): y = y+45 t.goto(x, y) t.down() t.goto(x-10, y) t.up() t.goto(x-15, y-6) t.write("{}" .format((i*label)), align="right") x_ratio = 709.0 / 256 y_ratio = 450.0 / pixels colors = ["red", "green", "blue"] for j in range(len(hist)): h = hist[j] x = -354 y = -199 t.up() t.goto(x, y) t.down() t.color(colors[j]) for i in range(256): x = i * x_ratio y = h[i] * y_ratio x = x - (709/2) y = y + -199 t.goto((x, y)) im = "swap.tif" histograms = [] arr = gdalnumeric.LoadFile(im) for b in arr: histograms.append(histogram(b)) draw_histogram(histograms) t.pen(shown=False) t.done() Here's what the histogram for swap.tif looks similar to after running the example: As you can see, all the three bands are grouped closely towards the left-hand side of the graph and all have values that are less than 125. As these values approach zero, the image becomes darker, which is not surprising. Just for fun, let’s run the script again and when we call the draw_histogram() function, we’ll add the scale=False option to get an idea of the size of the image and provide an absolute scale. Therefore, we take the following line: draw_histogram(histograms) Change it to the following: draw_histogram(histograms, scale=False) This change will produce the following histogram graph: As you can see, it’s harder to see the details of the value distribution. However, this absolute scale approach is useful if you are comparing multiple histograms from different products that are produced from the same source image. Now that we understand the basics of looking at an image statistically using histograms, how do we make our image brighter? Performing a histogram stretch A histogram stretch operation does exactly what the name suggests. It distributes the pixel values across the whole scale. By doing so, we have more values at the higher-intensity level and the image becomes brighter. Therefore, in this example, we’ll use our histogram function; however, we’ll add another function called stretch() that takes an image array, creates the histogram, and then spreads out the range of values for each band. 
We’ll run these functions on swap.tif and save the result in an image called stretched.tif: import gdalnumeric import operator from functools import reduce def histogram(a, bins=list(range(0, 256))): fa = a.flat n = gdalnumeric.numpy.searchsorted(gdalnumeric.numpy.sort(fa), bins) n = gdalnumeric.numpy.concatenate([n, [len(fa)]]) hist = n[1:]-n[:-1] return hist def stretch(a): hist = histogram(a) lut = [] for b in range(0, len(hist), 256): step = reduce(operator.add, hist[b:b+256]) / 255 n = 0 for i in range(256): lut.append(n / step) n = n + hist[i+b] gdalnumeric.numpy.take(lut, a, out=a) return a src = "swap.tif" arr = gdalnumeric.LoadFile(src) stretched = stretch(arr) gdalnumeric.SaveArray(arr, "stretched.tif", format="GTiff", prototype=src) The stretch algorithm will produce the following image. Look how much brighter and visually appealing it is: We can run our turtle graphics histogram script on stretched.tif by changing the filename in the im variable to stretched.tif: im = "stretched.tif" This run will give us the following histogram: As you can see, all the three bands are distributed evenly now. Their relative distribution to each other is the same; however, in the image, they are now spread across the spectrum. Summary In this article, we covered the foundations of remote sensing including band swapping and histograms. The authors of GDAL have a set of Python examples, covering some advanced topics that may be of interest, available at https://svn.osgeo.org/gdal/trunk/gdal/swig/python/samples/. Resources for Article: Further resources on this subject: Python Libraries for Geospatial Development[article] Python Libraries[article] Learning R for Geospatial Analysis [article]

Types and providers

Packt
31 Dec 2015
12 min read
In this article by Thomas Uphill, the author of Mastering Puppet Second Edition, we will look at custom types. Puppet separates the implementation of a type into the type definition and any one of the many providers for that type. For instance, the package type in Puppet has multiple providers depending on the platform in use (apt, yum, rpm, gem, and others). Early on in Puppet development there were only a few core types defined. Since then, the core types have expanded to the point where anything that I feel should be a type is already defined by core Puppet. The LVM module creates a type for defining logical volumes, and the concat module creates types for defining file fragments. The firewall module creates a type for defining firewall rules. Each of these types represents something on the system with the following properties: Unique Searchable Atomic Destroyable Creatable (For more resources related to this topic, see here.) When creating a new type, you have to make sure your new type has these properties. The resource defined by the type has to be unique, which is why the file type uses the path to a file as the naming variable (namevar). A system may have files with the same name (not unique), but it cannot have more than one file with an identical path. As an example, the ldap configuration file for openldap is /etc/openldap/ldap.conf, the ldap configuration file for the name services library is /etc/ldap.conf. If you used filename, then they would both be the same resource. Resources must be unique. By atomic, I mean it is indivisible; it cannot be made of smaller components. For instance, the firewall module creates a type for single iptables rules. Creating a type for the tables (INPUT, OUTPUT, FORWARD) within iptables wouldn't be atomic—each table is made up of multiple smaller parts, the rules. Your type has to be searchable so that Puppet can determine the state of the thing you are modifying. A mechanism has to exist to know what the current state is of the thing in question. The last two properties are equally important. Puppet must be able to remove the thing, destroy it, and likewise, Puppet must be able to create the thing anew. Given these criteria, there are several modules that define new types, with some examples including types that manage: Git repositories Apache virtual hosts LDAP entries Network routes Gem modules Perl CPAN modules Databases Drupal multisites Creating a new type As an example, we will create a gem type for managing Ruby gems installed for a user. Ruby gems are packages for Ruby that are installed on the system and can be queried like packages. Installing gems with Puppet can already be done using the gem, pe_gem, or pe_puppetserver_gem providers for the package type. To create a custom type requires some knowledge of Ruby. In this example, we assume the reader is fairly literate in Ruby. We start by defining our type in the lib/puppet/type directory of our module. We'll do this in our example module, modules/example/lib/puppet/type/gem.rb. 
The file will contain the newtype method and a single property for our type, version as shown in the following code: Puppet::Type.newtype(:gem) do   ensurable   newparam(:name, :namevar => true) do     desc 'The name of the gem'   end   newproperty(:version) do     desc 'version of the gem'     validate do |value|       fail("Invalid gem version #{value}") unless value =~ /^[0-9]         +[0-9A-Za-z.-]+$/     end   end end The ensurable keyword creates the ensure property for our new type, allowing the type to be either present or absent. The only thing we require of the version is that it starts with a number and only contain numbers, letters, periods, or dashes. A more thorough regular expression here could save you time later, such as checking that the version ends with a number or letter. Now we need to start making our provider. The name of the provider is the name of the command used to manipulate the type. For packages, the providers are named things like yum, apt, and dpkg. In our case we'll be using the gem command to manage gems, which makes our path seem a little redundant. Our provider will live at modules/example/lib/puppet/provider/gem/gem.rb. We'll start our provider with a description of the provider and the commands it will use as shown in the following code: Puppet::Type.type(:gem).provide :gem do   desc "Manages gems using gem" Then we'll define a method to list all the gems installed on the system as shown in the following code, which defines the self.instances method: def self.instances   gems = []   command = 'gem list -l'     begin       stdin, stdout, stderr = Open3.popen3(command)       for line in stdout.readlines         (name,version) = line.split(' ')         gem = {}         gem[:provider] = self.name         gem[:name] = name         gem[:ensure] = :present         gem[:version] = version.tr('()','')         gems << new(gem)       end     rescue       raise Puppet::Error, "Failed to list gems using '#         {command}'"     end     gems   end This method runs gem list -l and then parses the output looking for lines such as gemname (version). The output from the gem command is written to the variable stdout. We then use readlines on stdout to create an array which we iterate over with a for loop. Within the for loop we split the lines of output based on a space character into the gem name and version. The version will be wrapped in parenthesis at this point, we use the tr (translate) method to remove the parentheses. We create a local hash of these values and then append the hash to the gems hash. The gems hash is returned and then Puppet knows all about the gems installed on the system. Puppet needs two more methods at this point, a method to determine if a gem exists (is installed), and if it does exist, which version is installed. We already populated the ensure parameter, so as to use that to define our exists method as follows: def exists?   
@property_hash[:ensure] == :present end To determine the version of an installed gem, we can use the property_hash variable as follows: def version   @property_hash[:version] || :absent end To test this, add the module to a node and pluginsync the module over to the node as follows: [root@client ~]# puppet plugin download Notice: /File[/opt/puppetlabs/puppet/cache/lib/puppet/provider/gem]/   ensure: created Notice: /File[/opt/puppetlabs/puppet/cache/lib/puppet/provider/gem/   gem.rb]/ensure: defined content as   '{md5}4379c3d0bd6c696fc9f9593a984926d3' Notice: /File[/opt/puppetlabs/puppet/cache/lib/puppet/provider/gem/   gem.rb.orig]/ensure: defined content as   '{md5}c6024c240262f4097c0361ca53c7bab0' Notice: /File[/opt/puppetlabs/puppet/cache/lib/puppet/type/gem.rb]/   ensure: defined content as '{md5}48749efcd33ce06b401d5c008d10166c' Downloaded these plugins: /opt/puppetlabs/puppet/cache/lib/puppet/provider/gem, /opt/puppetlabs/puppet/cache/lib/puppet/provider/gem/gem.rb, /opt/puppetlabs/puppet/cache/lib/puppet/provider/gem/gem.rb.orig, /opt/puppetlabs/puppet/cache/lib/puppet/type/gem.rb This will install our type/gem.rb and provider/gem/gem.rb files into /opt/puppetlabs/puppet/cache/lib/puppet on the node. After that, we are free to run puppet resource on our new type to list the available gems as shown in the following code: [root@client ~]# puppet resource gem gem { 'bigdecimal':   ensure  => 'present',   version => '1.2.0', } gem { 'bropages':   ensure  => 'present',   version => '0.1.0', } gem{ 'commander':   ensure  => 'present',   version => '4.1.5', } gem { 'highline':   ensure  => 'present',   version => '1.6.20', } … Now, if we want to manage gems, we'll need to create and destroy them, and we'll need to provide methods for those operations. If we try at this point, Puppet will fail, as we can see from the following output: [root@client ~]# puppet resource gem bropages pugem { 'bropages':   ensure => 'present',   version => '0.1.0', } [root@client ~]# puppet resource gem bropages ensure=absent gem { 'bropages':   ensure => 'absent', } [root@client ~]# puppet resource gem bropages ensure=absent gem { 'bropages':   ensure => 'absent', } When we run puppet resource, there is no destroy method, so puppet returns that it the gem was removed but doesn't actually do anything. To get Puppet to actually remove the gem, we'll need a method to destroy (remove) gems, gem uninstall should do the trick, as shown in the following code: def destroy   g = @resource[:version] ? [@resource[:name], '--version',     @resource[:version]] : @resource[:name]   command = "gem uninstall #{g} -q -x"     begin       system command     rescue       raise Puppet::Error, "Failed to remove #{@resource[:name]}         '#{command}'"     end     @property_hash.clear   end Using the ternary operator, we either run gem uninstall name -q -x if no version is defined, or gem uninstall name --version version -q -x if a version is defined. We finish by calling @property_hash.clear to remove the gem from the property_hash since the gem is now removed. Now we need to let Puppet know about the state of the bropages gem using our instances method we defined earlier, we'll need to write a new method to prefetch all the available gems. 
This is done with self.prefetch, as shown in the following code:  def self.prefetch(resources)     gems = instances     resources.keys.each do |name|       if provider = gems.find{ |gem| gem.name == name }         resources[name].provider = provider       end     end   end We can see this in action using puppet resource as shown in the following output: [root@client ~]# puppet resource gem bropages ensure=absent Removing bro Successfully uninstalled bropages-0.1.0 Notice: /Gem[bropages]/ensure: removed gem { 'bropages':   ensure => 'absent', } Almost there, now we want to add bropages back, we'll need a create method, as shown in the following code:  def create     g = @resource[:version] ? [@resource[:name], '--version',       @resource[:version]] : @resource[:name]     command = "gem install #{g} -q"     begin       system command       @property_hash[:ensure] = :present     rescue       raise Puppet::Error, "Failed to install #{@resource[:name]}         '#{command}'"     end   end Now when we run puppet resource to create the gem, we see the installation, as shown in the following output: [root@client ~]# puppet resource gem bropages ensure=present Successfully installed bropages-0.1.0 Parsing documentation for bropages-0.1.0 Installing ri documentation for bropages-0.1.0 1 gem installed Notice: /Gem[bropages]/ensure: created gem { 'bropages':   ensure => 'present', } Nearly done now, we need to handle versions. If we want to install a specific version of the gem, we'll need to define methods to deal with versions.  def version=(value)     command = "gem install #{@resource[:name]} --version       #{@resource[:version]}"     begin       system command       @property_hash[:version] = value     rescue       raise Puppet::Error, "Failed to install gem         #{resource[:name]} using #{command}"     end   end Now, we can tell Puppet to install a specific version of the gem and have the correct results as shown in the following output: [root@client ~]# puppet resource gem bropages version='0.0.9' Fetching: highline-1.7.8.gem (100%) Successfully installed highline-1.7.8 Fetching: bropages-0.0.9.gem (100%) Successfully installed bropages-0.0.9 Parsing documentation for highline-1.7.8 Installing ri documentation for highline-1.7.8 Parsing documentation for bropages-0.0.9 Installing ri documentation for bropages-0.0.9 2 gems installed Notice: /Gem[bropages]/version: version changed '0.1.0' to '0.0.9' gem { 'bropages':   ensure  => 'present',   version => '0.0.9', } This is where our choice of gem as an example breaks down as gem provides for multiple versions of a gem to be installed. Our gem provider, however, works well enough for use at this point. We can specify the gem type in our manifests and have gems installed or removed from the node. This type and provider is only an example; the gem provider for the package type provides the same features in a standard way. When considering creating a new type and provider, search the puppet forge for existing modules first. Summary When the defined types are not enough, you can extend Puppet with custom types and providers written in Ruby. The details of writing providers are best learned by reading the already written providers and referring to the documentation on the Puppet Labs website. Resources for Article: Further resources on this subject: Modules and Templates [article] My First Puppet Module [article] Installing Software and Updates [article]
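To round this off, once the example module containing the type and provider is on your modulepath and has been synced to agents, the new gem type can also be declared in an ordinary manifest like any built-in resource. A minimal sketch (the node name here is hypothetical) might look as follows:

# Hypothetical node definition using the custom gem type from the example module
node 'client.example.com' {
  gem { 'bropages':
    ensure  => present,
    version => '0.1.0',
  }
}

On the next agent run, Puppet uses the provider's create, destroy, and version= methods to bring the installed gem in line with this declaration.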

Spacecraft – Adding Details

Packt
31 Dec 2015
6 min read
In this article by Christopher Kuhn, the author of the book Blender 3D Incredible Machines, we'll model our Spacecraft. As we do so, we'll cover a few new tools and techniques and apply things in different ways to create a final, complex model: Do it yourself—completing the body Building the landing gear (For more resources related to this topic, see here.) We'll work though the spacecraft one section at a time by adding the details. Do it yourself – completing the body Next, let's take a look at the key areas that we have left to model: The bottom of the ship and the sensor suite (on the nose) are good opportunities to practice on your own. They use identical techniques to the areas of the ship that we've already done. Go ahead and see what you can do! For the record, here's what I ended up doing with the sensor suite: Here's what I did with the bottom. You can see that I copied the circular piece that was at the top of the engine area: One of the nice things about a project as this is that you can start to copy parts from one area to another. It's unlikely that both the top and bottom of the ship would be shown in the same render (or shot), so you can probably get away with borrowing quite a bit. Even if you did see them simultaneously, it's not unreasonable to think that a ship would have more than one of certain components. Of course, this is just a way to make things quicker (and easier). If you'd like everything to be 100% original, you're certainly free to do so. Building the landing gear We'll do the landing struts together, but you can feel free to finish off the actual skids yourself: I kept mine pretty simple compared to the other parts of the ship: Once you've got the skid plate done, make sure to make it a separate object (if it's not already). We're going to use a neat trick to finish this up. Make a copy of the landing gear part and move it to the rear section (or front if you have modeled the rear). Then, under your mesh tab, you can assign both of these objects the same mesh data: Now, whenever you make a change to one of them, the change will carry over to the other as well. Of course, you could just model one and then duplicate it, but sometimes, it's nice to see how the part will look in multiple locations. For instance, the cutouts are slightly different between the front and back of the ship. As you model it, you'll want to make sure that it will fit both areas. The first detail that we'll add is a mounting bracket for our struts to go on: Then, we'll add a small cylinder (at this point, the large one is just a placeholder): We'll rotate it just a bit: From this, it's pretty easy to create a rear mounting piece. Once you've done this, go ahead and add a shock absorber for the front (leave room for the springs, which we'll add next): To create the spring, we'll start with a small (12-sided) circle. We'll make it so small because just like the cable reel on the grabbling gun there will naturally be a lot of geometry, and we want to keep the polygon count as low as possible. Then, in edit mode, move the whole circle away from its original center point: Having done this, you can now add a screw modifier. Right away, you'll see the effect: There are a couple of settings you'll want to make note of here. The Screw value controls the vertical gap or distance of your spring: The Angle and Steps values control the number of turns and smoothness respectively: Go ahead and play with these until you're happy. Then, move and scale your spring into a position. 
Once it's the way you like it, go ahead and apply the screw modifier (but don't join it to the shock absorber just yet): None of my existing materials seemed right for the spring. So, I went ahead and added one that I called Blue Plastic. At this point, we have a bit of a problem. We want to join the spring to the landing gear but we can't. The landing gear has an edge split modifier with a split angle value of 30, and the spring has a value of 46. If we join them right now, the smooth edges on the spring will become sharp. We don't want this. Instead, we'll go to our shock absorber. Using the Select menu, we'll pick the Sharp Edges option: By default, it will select all edges with an angle of 30 degrees or higher. Once you do this, go ahead and mark these edges as sharp: Because all the thirty degree angles are marked sharp, we no longer need the Edge Angle option on our edge split modifier. You can disable it by unchecking it, and the landing gear remains exactly the same: Now, you can join the spring to it without a problem: Of course, this does mean that when you create new edges in your landing gear, you'll now have to mark them as sharp. Alternatively, you can keep the Edge Angle option selected and just turn it up to 46 degrees—your choice. Next, we'll just pull in the ends of our spring a little, so they don't stick out: Maybe we'll duplicate it. After all, this is a big, heavy vehicle, so maybe, it needs multiple shock absorbers: This is a good place to leave our landing gear for now. Summary In this article, we finished modeling our Spaceship's landing gear. We used a few new tools within Blender, but mostly, we focused on workflow and technique. Resources for Article: Further resources on this subject: Blender 3D 2.49: Quick Start[article] Blender 3D 2.49: Working with Textures[article] Make Spacecraft Fly and Shoot with Special Effects using Blender 3D 2.49 [article]

Courses, Users, and Roles

Packt
30 Dec 2015
9 min read
In this article, Alex Büchner, the author of Moodle 3 Administration, Third Edition, gives an overview of Moodle courses, users, and roles. The three concepts are inherently intertwined and any one of these cannot be used without the other two. We will deal with the basics of the three core elements and show how they work together. Let's see what they are:

Moodle courses: Courses are central to Moodle as this is where learning takes place. Teachers upload their learning resources, create activities, assist in learning and grade work, monitor progress, and so on. Students, on the other hand, read, listen to or watch learning resources, participate in activities, submit work, collaborate with others, and so on.

Moodle users: These are individuals accessing our Moodle system. Typical users are students and teachers/trainers, but there are also others such as teaching assistants, managers, parents, assessors, examiners, or guests. Oh, and the administrator, of course!

Moodle roles: Roles are effectively permissions that specify which features users are allowed to access and, also, where and when (in Moodle) they can access them.

Bear in mind that this article only covers the basic concepts of these three core elements.

(For more resources related to this topic, see here.)

A high-level overview

To give you an overview of courses, users, and roles, let's have a look at the following diagram. It shows nicely how central the three concepts are and also how other features are related to them. Again, all of their intricacies will be dealt with in due course, so for now, just start getting familiar with some Moodle terminology. Let's start at the bottom-left and cycle through the pyramid clockwise. Users have to go through an Authentication process to get access to Moodle. They then have to go through the Enrolments step to be able to participate in Courses, which themselves are organized into Categories. Groups & Cohorts are different ways to group users at course level or site-wide. Users are granted Roles in particular Contexts. Which role is allowed to do what and which isn't depends entirely on the Permissions set within that role. The diagram also demonstrates a catch-22 situation. If we start with users, we have no courses to enroll them into (except the front page); if we start with courses, we have no users who can participate in them. Not to worry though. Moodle lets us go back and forth between any administrative areas and, often, perform multiple tasks at once.

Moodle courses

Moodle manages activities and stores resources in courses, and this is where learning and collaboration takes place. Courses themselves belong to categories, which are organized hierarchically, similar to folders on our local hard drive. Moodle comes with a default category called Miscellaneous, which is sufficient to show the basics of courses. Moodle is a course-centric system. To begin with, let's create the first course. To do so, go to Courses | Manage courses and categories. Here, select the Miscellaneous category. Then, select the Create new course link, and you will be directed to the screen where course details have to be entered. For now, let's focus on the two compulsory fields, namely Course full name and Course short name. The former is displayed at various places in Moodle, whereas the latter is, by default, used to identify the course and is also shown in the breadcrumb trail.
For now, we leave all other fields empty or at their default values and save the course by clicking on the Save changes button at the bottom. The screen displayed after clicking on Save changes shows enrolled users, if any. Since we just created the course, there are no users present in it yet. In fact, except for the administrator account we are currently using, there are no users at all on our Moodle system. So, we leave the course without users for now and add some users to our LMS before we come back to this screen (select the Home link in the breadcrumb).

Moodle users

Moodle users, or rather their user accounts, are dealt with in Users | Accounts. Before we start, it is important to understand the difference between authentication and enrolment. Moodle users have to be authenticated in order to log in to the system. Authentication grants users access to the system through login, where a username and password have to be given (this also applies to guest accounts, where a username is allotted internally). Moodle supports a significant number of authentication mechanisms, which are discussed later in detail.

Enrolment happens at course level. However, a user has to be authenticated to the system before enrolment to a course can take place. So, a typical workflow is as follows (there are exceptions as always, but we will deal with them when we get there):

Create your users
Create your courses (and categories)
Associate users to courses and assign roles

Again, this sequence demonstrates nicely how intertwined courses, users, and roles are in Moodle. Another way of looking at the difference between authentication and enrolment is how a user gets access to a course. Please bear in mind that this is a very simplistic view and that it ignores supported features such as external authentication, guest access, and self-enrolment. During the authentication phase, a user enters his or her credentials (username and password), or they are entered automatically via single sign-on. If the account exists locally, that is, within Moodle, and the password is valid, he/she is granted access. The next phase is enrolment. If the user is enrolled and the enrolment hasn't expired, he/she is granted access to the course. You will come across a more detailed version of these graphics later on, but for now, it hopefully demonstrates the difference between authentication and enrolment.

To add a user account manually, go to Users | Accounts | Add a new user. As with courses, we will only focus on the mandatory fields, which should be self-explanatory:

Username (has to be unique)
New password (if a password policy has been set, certain rules might apply)
First name
Surname
Email address

Make sure you save the account information by selecting Create user at the bottom of the page. If any entered information is invalid, Moodle will display error messages right above the respective field. I have created a few more accounts; to see who has access to your Moodle system, go to Users | Accounts | Browse list of users, where you will see all users. Actually, I did this via batch upload.

Now that we have a few users on our system, let's go back to the course we created a minute ago and manually enrol new participants in it. To achieve this, go back to Courses | Manage courses and categories, select the Miscellaneous category again, and select the created demo course. Underneath the listed demo course, course details will be displayed alongside a number of options (on large screens, details are shown to the right). Here, select Enrolled users.
As expected, the list of enrolled users is still empty. Click on the Enrol users button to change this. To grant users access to the course, select the Enrol button beside them and close the window. In the following screenshot, three users, participant01 to participant03, have already been enrolled in the course. Two more users, participant04 and participant05, have been selected for enrolment. You have probably spotted the Assign roles dropdown at the top of the pop-up window. This is where you select which role the selected user has once he/she is enrolled in the course. For example, to give Tommy Teacher appropriate access to the course, we have to select the Teacher role first, before enrolling him in the course. This leads nicely to the third part of the pyramid, namely, roles.

Moodle roles

Roles define what users can or cannot see and do in your Moodle system. Moodle comes with a number of predefined roles, and we already saw Student and Teacher, but it also allows us to create our own roles, for instance, for parents or external assessors. Each role has a certain scope (called context), which is defined by a set of permissions (expressed as capabilities). For example, a teacher is allowed to grade an assignment, whereas a student isn't. Or, a student is allowed to submit an assignment, whereas a teacher isn't.

A role is assigned to a user in a context. Okay, so what is a context? A context is a ring-fenced area in Moodle where roles can be assigned to users. A user can be assigned different roles in different contexts, where the context can be a course, a category, an activity module, a user, a block, the front page, or Moodle itself. For instance, you are assigned the Administrator role for the entire system, but additionally, you might be assigned the Teacher role in any courses you are responsible for; or, a learner will be given the Student role in a course, but might have been granted the Teacher role in a forum to act as a moderator.

To give you a feel of how a role is defined, let's go to Users | Permissions, where roles are managed, and select Define roles. Click on the Teacher role and, after some general settings, you will see a (very) long list of capabilities.

For now, we only want to stick with the example we used throughout the article. Now that we know what roles are, we can slightly rephrase what we have done. Instead of saying, "We have enrolled the user participant01 in the demo course as a student", we would say, "We have assigned the Student role to the user participant01 in the context of the demo course." In fact, the term enrolment is a little bit of a legacy and goes back to the times when Moodle didn't have the customizable, finely grained architecture of roles and permissions that it does now. One can speculate whether there are linguistic connotations between the terms role and enrolment.

Summary

In this article, we very briefly introduced the concepts of Moodle courses, users, and roles. We also saw how central they are to Moodle and how they are linked together. Any one of these concepts simply cannot exist without the other two, and this is something you should bear in mind throughout. Well, theoretically they can, but it would be rather impractical when you try to model your learning environment. If you haven't fully understood any of the three areas, don't worry. The intention was only to provide you with a high-level overview of the three core components and to touch upon the basics.
Resources for Article:

Further resources on this subject:

Moodle for Online Communities [article]
Gamification with Moodle LMS [article]
Moodle Plugins [article]
Advanced User Management

Packt
30 Dec 2015
20 min read
In this article, written by Bhaskarjyoti Roy, author of the book Mastering CentOS 7 Linux Server, we will introduce some advanced user and group management scenarios, along with examples of how to handle advanced options such as password aging, managing sudoers, and so on, on a day-to-day basis. Here, we are assuming that we have already successfully installed CentOS 7, along with root and user credentials, as we do in the traditional format. Also, the command examples in this article assume that you are logged in or have switched to the root user.

(For more resources related to this topic, see here.)

The following topics will be covered:

User and group management from the GUI and the command line
Quotas
Password aging
Sudoers

Managing users and groups from GUI and command line

We can add a user to the system using useradd from the command line with a simple command, as follows:

useradd testuser

This creates a user entry in the /etc/passwd file and automatically creates the home directory for the user in /home. The /etc/passwd entry looks like this:

testuser:x:1001:1001::/home/testuser:/bin/bash

But, as we all know, the user is in a locked state and cannot log in to the system unless we add a password for the user using the command:

passwd testuser

This will, in turn, modify the /etc/shadow file, unlock the user at the same time, and the user will be able to log in to the system. By default, the preceding set of commands will create both a user and a group for testuser on the system. What if we want a certain set of users to be part of a common group? We will use the -g option along with the useradd command to define the group for the user, but we have to make sure that the group already exists. So, to create users such as testuser1, testuser2, and testuser3 and make them part of a common group called testgroup, we will first create the group and then create the users using the -g or -G switch. So, we will do this:

# To create the group:
groupadd testgroup

# To create a user with the above group, provide a password, and unlock the user at the same time:
useradd testuser1 -G testgroup
passwd testuser1

useradd testuser2 -g 1002
passwd testuser2

Here, we have used both -g and -G. The difference between them is: with -G, we create the user with its default group and assign the user to the common testgroup as well, but with -g, we create the user as part of testgroup only. In both cases, we can use either the gid or the group name obtained from the /etc/group file.

There are a couple more options that we can use for advanced user creation; for example, for system users with a UID less than 500, we have to use the -r option, which will create a user on the system with a UID less than 500. We can also use -u to define a specific UID, which must be unique and greater than 499. Common options that we can use with the useradd command are:

-c: This option is used for comments, generally to define the user's real name, such as -c "John Doe".
-d: This option is used to define the home directory; by default, the home directory is created in /home, such as -d /var/<username>.
-g: This option is used for the group name or the group number of the user's default group. The group must already have been created earlier.
-G: This option is used for additional group names or group numbers, separated by commas, of which the user is a member. Again, these groups must also have been created earlier.
-r: This option is used to create a system account with a UID less than 500 and without a home directory.
-u: This option is the user ID for the user. It must be unique and greater than 499.

There are a few quick options that we use with the passwd command as well. These are:

-l: This option is used to lock the password for the user's account
-u: This option is used to unlock the password for the user's account
-e: This option is used to expire the password for the user
-x: This option is used to define the maximum number of days of the password lifetime
-n: This option is used to define the minimum number of days of the password lifetime

Quotas

In order to control the disk space used on the Linux filesystem, we must use quota, which enables us to control disk space usage and thus helps us resolve low disk space issues to a great extent. For this, we have to enable user and group quotas on the Linux system. In CentOS 7, user and group quotas are not enabled by default, so we have to enable them first. To check whether quota is enabled or not, we issue the following command:

mount | grep ' / '

The output shows that the root filesystem is mounted without quota, as indicated by noquota in the output. Now, we have to enable quota on the root (/) filesystem, and to do that, we first edit the file /etc/default/grub and add the following to GRUB_CMDLINE_LINUX:

rootflags=usrquota,grpquota

The GRUB_CMDLINE_LINUX line should read as follows:

GRUB_CMDLINE_LINUX="rd.lvm.lv=centos/swap vconsole.font=latarcyrheb-sun16 rd.lvm.lv=centos/root crashkernel=auto vconsole.keymap=us rhgb quiet rootflags=usrquota,grpquota"

The /etc/default/grub file should look like the following screenshot:

Since we have to reflect the changes we just made, we should back up the grub configuration using the following command:

cp /boot/grub2/grub.cfg /boot/grub2/grub.cfg.original

Now, we have to rebuild grub with the changes we just made, using the command:

grub2-mkconfig -o /boot/grub2/grub.cfg

Next, reboot the system. Once it's up, log in and verify that quota is enabled using the command we used before:

mount | grep ' / '

It should now show us that quota is enabled and will give us an output as follows:

/dev/mapper/centos-root on / type xfs (rw,relatime,attr2,inode64,usrquota,grpquota)

Now, since quota is enabled, we will install the quota tools in order to manage quotas for different users, groups, and so on:

yum -y install quota

Once quota is installed, we check the current quota for users using the following command:

repquota -as

The preceding command reports user quotas in a human-readable format. As the preceding output shows, there are two ways we can limit quota for users and groups: one is to set soft and hard limits on the amount of disk space used, and the other is to limit the number of files a user or group can create. In both cases, soft and hard limits are used. A soft limit merely warns the user when it is reached, whereas the hard limit is the limit that they cannot bypass.
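These limits can be adjusted either interactively with edquota, as shown next, or non-interactively. Purely as a hedged sketch, assuming the setquota tool shipped with the quota package and illustrative block counts (expressed in 1 KB blocks), the same limits could be set like this:

# Soft limit of roughly 500 MB and hard limit of roughly 600 MB for testuser1
# on the root filesystem, with no limit on the number of files (inodes)
setquota -u testuser1 512000 614400 0 0 /

# The same idea applied to the whole testgroup
setquota -g testgroup 1024000 1228800 0 0 /

The interactive route, which opens the current limits in an editor, is the one described below.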
We will use the following command to modify a user's quota:

edquota -u username

Now, we will use the following command to modify a group's quota:

edquota -g groupname

If you have other partitions mounted separately, you have to modify the /etc/fstab file to enable quota on the filesystem by adding usrquota and grpquota after the defaults for that specific partition, as in the following screenshot, where we have enabled quota for the /var partition:

Once you have finished enabling quota, remount the filesystem and run the following commands:

# To remount /var:
mount -o remount /var

# To enable quota:
quotacheck -avugm
quotaon -avug

Quota is something all system admins use to handle the disk space consumed on a server by users or groups and to limit overuse of that space. It thus helps them manage disk space usage on the system. In this regard, it should be noted that you should plan before your installation and create partitions accordingly, so that disk space is used properly. Multiple separate partitions, such as /var and /home, are always suggested, as these are generally the partitions that consume the most space on a Linux system. So, if we keep them on separate partitions, they will not eat up the root (/) filesystem space, and this is more failsafe than using an entire filesystem mounted only as root.

Password aging

It is a good policy to have password aging so that users are forced to change their password at a certain interval. This, in turn, helps to keep the system secure as well. We can use chage to configure a password to expire the first time the user logs in to the system.

Note: This process will not work if the user logs in to the system using SSH.

This method of using chage will ensure that the user is forced to change the password right away. If we use only chage <username>, it will display the current password aging values for the specified user and will allow them to be changed interactively. The following steps need to be performed to accomplish password aging:

Lock the user. If the user doesn't exist, we will use the useradd command to create the user. However, we will not assign any password to the user, so that it remains locked. But, if the user already exists on the system, we will use the usermod command to lock the user:

usermod -L <username>

Force an immediate password change using the following command:

chage -d 0 <username>

Unlock the account. This can be achieved in two ways. One is to assign an initial password and the other is to assign a null password. We will take the first approach, as the second one, though possible, is not good practice in terms of security. Therefore, here is what we do to assign an initial password:

Use the python command to start the command-line Python interpreter:

import crypt; print crypt.crypt("Q!W@E#R$","Bing0000/")

Here, we have used the Q!W@E#R$ password with a salt combination of the alphanumeric characters Bing0000 followed by a / character. The output is the encrypted password, similar to 'BiagqBsi6gl1o'.

Press Ctrl + D to exit the Python interpreter. At the shell, enter the following command with the encrypted output of the Python interpreter:

usermod -p "<encrypted-password>" <username>

So, here, in our case, if the username is testuser, we will use the following command:

usermod -p "BiagqBsi6gl1o" testuser

Now, upon initial login using the Q!W@E#R$ password, the user will be prompted for a new password.
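Beyond forcing a change at first login, chage can also enforce an ongoing aging policy for an existing account. The following is only a sketch with illustrative values (a 90-day maximum, a 7-day minimum, and a 14-day warning) applied to the testuser account used above:

# Require a password change at most every 90 days and at least 7 days apart,
# and start warning the user 14 days before the password expires
chage -M 90 -m 7 -W 14 testuser

# Review the resulting aging settings for the account
chage -l testuser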
Setting the password policy

This is a set of rules, defined in some files, which have to be followed when a user password is set up. It's an important security factor, because many security breaches have started with the cracking of user passwords. This is the reason why most organizations set a password policy for their users, and all usernames and passwords must comply with it. A password policy is usually defined by the following:

Password aging
Password length
Password complexity
Limit login failures
Limit prior password reuse

Configuring password aging and password length

Password aging and password length are defined in /etc/login.defs. Aging basically means the maximum number of days a password may be used, the minimum number of days allowed between password changes, and the number of warnings given before the password expires. Length refers to the number of characters required for creating the password. To configure password aging and length, we should edit the /etc/login.defs file and set the different PASS values according to the policy set by the organization.

Note: The password aging controls defined here do not affect existing users; they only affect newly created users. So, we must set these policies when setting up the system or the server at the beginning.

The values we modify are:

PASS_MAX_DAYS: The maximum number of days a password can be used
PASS_MIN_DAYS: The minimum number of days allowed between password changes
PASS_MIN_LEN: The minimum acceptable password length
PASS_WARN_AGE: The number of days of warning given before a password expires

Let's take a look at a sample configuration of the login.defs file:

Configuring password complexity and limiting reused password usage

By editing the /etc/pam.d/system-auth file, we can configure the password complexity and the number of reused passwords to be denied. Password complexity refers to the complexity of the characters used in the password, and reused password denial refers to rejecting the desired number of passwords the user has used in the past. By setting the complexity, we force the use of a desired number of capital characters, lowercase characters, digits, and symbols in a password. The password will be rejected by the system unless the complexity set by the rules is met. We do this using the following terms:

Force capital characters in passwords: ucredit=-X, where X is the number of capital characters required in the password
Force lowercase characters in passwords: lcredit=-X, where X is the number of lowercase characters required in the password
Force digits in passwords: dcredit=-X, where X is the number of digits required in the password
Force the use of symbols in passwords: ocredit=-X, where X is the number of symbols required in the password

For example:

password requisite pam_cracklib.so try_first_pass retry=3 type= ucredit=-2 lcredit=-2 dcredit=-2 ocredit=-2

Deny reused passwords: remember=X, where X is the number of past passwords to be denied

For example:

password sufficient pam_unix.so sha512 shadow nullok try_first_pass use_authtok remember=5

Let's now take a look at a sample configuration of /etc/pam.d/system-auth:

Configuring login failures

We set the number of login failures allowed for a user in the /etc/pam.d/password-auth, /etc/pam.d/system-auth, and /etc/pam.d/login files. When a user's failed login attempts exceed the number defined here, the account is locked and only a system administrator can unlock it.
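For reference, the PASS_* directives discussed above might be set along the following lines; these values are purely illustrative and should be adapted to your organization's policy:

# /etc/login.defs (excerpt, illustrative values)
PASS_MAX_DAYS   90
PASS_MIN_DAYS   7
PASS_MIN_LEN    8
PASS_WARN_AGE   14

With aging, length, and complexity covered, the failure limit configured below completes the policy.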
To configure this, make the following additions to the files. The deny=X parameter configures the limit, where X is the number of failed login attempts allowed.

Add these two lines to the /etc/pam.d/password-auth and /etc/pam.d/system-auth files, and only the first line to the /etc/pam.d/login file:

auth        required    pam_tally2.so file=/var/log/tallylog deny=3 no_magic_root unlock_time=300
account     required    pam_tally2.so

The following screenshot is a sample /etc/pam.d/system-auth file:

The following is a sample /etc/pam.d/login file:

To see the failures, use the following command:

pam_tally2 --user=<User Name>

To reset the failure attempts and enable the user to log in again, use the following command:

pam_tally2 --user=<User Name> --reset

Sudoers

Separation of user privileges is one of the main features of Linux operating systems. Normal users operate in limited-privilege sessions to limit the scope of their influence on the entire system. One special user that we already know about is root, which has super-user privileges, and this account doesn't have any of the restrictions that apply to normal users. Users can execute commands with super-user or root privileges in a number of different ways. There are mainly three ways to obtain root privileges on a system:

Log in to the system as root.
Log in to the system as any user and then use the su - command. This will ask you for the root password and, once authenticated, will give you a root shell session. We can leave this root shell using Ctrl + D or the exit command. Once exited, we come back to our normal user shell.
Run commands with root privileges using sudo, without spawning a root shell or logging in as root. The sudo command works as follows:

sudo <command to execute>

Unlike su, sudo will request the password of the user calling the command, not the root password. sudo is not functional by default and needs to be set up before it works correctly. In the following section, we will see how to configure sudo and modify the /etc/sudoers file so that it works the way we want it to.

visudo

sudo is configured through the /etc/sudoers file, and visudo is the command that enables us to edit that file.

Note: This file should not be edited using a normal text editor, to avoid potential race conditions in updating the file with other processes. Instead, the visudo command should be used.

The visudo command opens a text editor as normal, but then validates the syntax of the file upon saving. This prevents configuration errors from blocking sudo operations. By default, visudo opens the /etc/sudoers file in the vi editor, but we can configure it to use the nano text editor instead. For that, we have to make sure nano is already installed, or we can install it using:

yum install nano -y

Now, we can change the editor to nano by editing the ~/.bashrc file:

export EDITOR=/usr/bin/nano

Then, source the file using:

. ~/.bashrc

Now, we can use visudo with nano to edit the /etc/sudoers file. So, let's open the /etc/sudoers file using visudo and learn a few things. We can define different kinds of aliases for different sets of commands, software, services, users, groups, and so on. For example:

Cmnd_Alias NETWORKING = /sbin/route, /sbin/ifconfig, /bin/ping, /sbin/dhclient, /usr/bin/net, /sbin/iptables, /usr/bin/rfcomm, /usr/bin/wvdial, /sbin/iwconfig, /sbin/mii-tool
Cmnd_Alias SOFTWARE = /bin/rpm, /usr/bin/up2date, /usr/bin/yum
Cmnd_Alias SERVICES = /sbin/service, /sbin/chkconfig

and many more...
We can use these aliases to assign a set of command execution rights to a user or a group. For example, if we want to assign the NETWORKING set of commands to the group netadmin, we will define:

%netadmin ALL = NETWORKING

Otherwise, if we want to allow the wheel group users to run all the commands, we use the following:

%wheel  ALL=(ALL)  ALL

If we want a specific user, john, to get access to all commands, we use the following:

john  ALL=(ALL)  ALL

We can create different groups of users, with overlapping membership:

User_Alias      GROUPONE = abby, brent, carl
User_Alias      GROUPTWO = brent, doris, eric
User_Alias      GROUPTHREE = doris, felicia, grant

Group names must start with a capital letter. We can then allow members of GROUPTWO to update the yum database and run all the commands assigned to the SOFTWARE alias defined earlier by creating a rule like this:

GROUPTWO    ALL = SOFTWARE

If we do not specify a user/group to run as, sudo defaults to the root user. We can allow members of GROUPTHREE to shut down and reboot the machine by creating a command alias and using it in a rule for GROUPTHREE:

Cmnd_Alias      POWER = /sbin/shutdown, /sbin/halt, /sbin/reboot, /sbin/restart
GROUPTHREE  ALL = POWER

We create a command alias called POWER that contains commands to power off and reboot the machine. We then allow the members of GROUPTHREE to execute these commands. We can also create Runas aliases, which can replace the portion of the rule that specifies the user to execute the command as:

Runas_Alias     WEB = www-data, apache
GROUPONE    ALL = (WEB) ALL

This will allow anyone who is a member of GROUPONE to execute commands as the www-data user or the apache user. Just keep in mind that later rules will override earlier rules when there is a conflict between the two. There are a number of ways in which you can achieve more control over how sudo handles a command. Here are some examples:

The updatedb command associated with the mlocate package is relatively harmless. If we want to allow users to execute it with root privileges without having to type a password, we can make a rule like this:

GROUPONE    ALL = NOPASSWD: /usr/bin/updatedb

NOPASSWD is a tag that means no password will be requested. It has a companion tag called PASSWD, which is the default behavior. A tag is relevant for the rest of the rule unless overruled by its twin tag later down the line. For instance, we can have a line like this:

GROUPTWO    ALL = NOPASSWD: /usr/bin/updatedb, PASSWD: /bin/kill

In this case, a user can run the updatedb command without a password as the root user, but the user's password will be required for running the kill command. Another helpful tag is NOEXEC, which can be used to prevent some dangerous behavior in certain programs. For example, some programs, such as less, can spawn other commands by typing this from within their interface:

!command_to_run

This basically executes any command the user gives it with the same permissions that less is running under, which can be quite dangerous. To restrict this, we could use a line like this:

username    ALL = NOEXEC: /usr/bin/less

We should now have a clear understanding of what sudo is and how we modify and provide access rights using visudo. There is much more to explore here. You can check the default /etc/sudoers file, which has a good number of examples, using the visudo command, or you can read the sudoers manual as well. One point to remember is that root privileges are not given to regular users often.
It is important for us to understand what these commands do when you execute them with root privileges. Do not take the responsibility lightly. Learn the best way to use these tools for your use case, and lock down any functionality that is not needed.

Reference

Now, let's take a look at the major reference used throughout this article: https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/System_Administrators_Guide/index.html

Summary

In this article, we learned about some advanced user management and how to manage users through the command line, along with password aging, quota, exposure to /etc/sudoers, and how to modify it using visudo. User and password management is a regular task that a system administrator performs on servers, and it has a very important role in the overall security of the system.

Resources for Article:

Further resources on this subject:

SELinux - Highly Secured Web Hosting for Python-based Web Applications [article]
A Peek Under the Hood – Facts, Types, and Providers [article]
Puppet Language and Style [article]

Video Surveillance, Background Modeling

Packt
30 Dec 2015
7 min read
In this article by David Millán Escrivá, Prateek Joshi, and Vinícius Godoy, the authors of the book OpenCV By Example, we look at how to detect moving objects in a video. In order to detect moving objects, we first need to build a model of the background. This is not the same as direct frame differencing, because we are actually modeling the background and using this model to detect moving objects. When we say that we are modeling the background, we are basically building a mathematical formulation that can be used to represent the background. So, this performs much better than the simple frame differencing technique. This technique tries to detect the static parts of the scene and then keeps building and updating the background model. This background model is then used to detect background pixels. So, it's an adaptive technique that can adjust according to the scene.

(For more resources related to this topic, see here.)

Naive background subtraction

Let's start the discussion from the beginning. What does a background subtraction process look like? Consider the following image:

The preceding image represents the background scene. Now, let's introduce a new object into this scene:

As shown in the preceding image, there is a new object in the scene. So, if we compute the difference between this image and our background model, we should be able to identify the location of the TV remote:

The overall process looks like this:

Does it work well? There's a reason why we call it the naive approach. It works under ideal conditions, and as we know, nothing is ideal in the real world. It does a reasonably good job of computing the shape of the given object, but it does so under some constraints. One of the main requirements of this approach is that the color and intensity of the object should be sufficiently different from that of the background. Some of the factors that affect these kinds of algorithms are image noise, lighting conditions, autofocus in cameras, and so on.

Once a new object enters our scene and stays there, it will be difficult to detect new objects that are in front of it. This is because we don't update our background model, and the new object is now part of our background. Consider the following image:

Now, let's say a new object enters our scene:

We identify this to be a new object, which is fine. Let's say another object comes into the scene:

It will be difficult to identify the location of these two different objects because their locations overlap. Here's what we get after subtracting the background and applying the threshold:

In this approach, we assume that the background is static. If some parts of our background start moving, those parts will start getting detected as new objects. So, even if the movements are minor, say a waving flag, they will cause problems in our detection algorithm. This approach is also sensitive to changes in illumination, and it cannot handle any camera movement. Needless to say, it's a delicate approach! We need something that can handle all these things in the real world.

Frame differencing

We know that we cannot keep a static background image that can be used to detect objects. So, one of the ways to fix this would be to use frame differencing. It is one of the simplest techniques that we can use to see which parts of the video are moving. When we consider a live video stream, the difference between successive frames gives a lot of information. The concept is fairly straightforward: we just take the difference between successive frames and display the difference.
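Before moving on to the frame-differencing illustrations and code, here is a minimal sketch of what the naive background subtraction described above might look like in OpenCV. This is not the book's implementation: the function name, the threshold value, and the assumption that both images are BGR frames of the same size are ours.

#include <opencv2/opencv.hpp>

using namespace cv;

// Naive approach: compare the current frame against a fixed background image
// and keep only the pixels that differ strongly from it.
Mat naiveBackgroundSubtraction(const Mat& background, const Mat& frame,
                               double threshValue = 30.0)
{
    Mat diff, gray, mask;

    // Per-pixel absolute difference between the frame and the background model
    absdiff(frame, background, diff);

    // Collapse to a single channel so that we can threshold it
    cvtColor(diff, gray, CV_BGR2GRAY);

    // Pixels that differ "enough" from the background are marked as foreground
    threshold(gray, mask, threshValue, 255, THRESH_BINARY);

    return mask;
}

As discussed, this only works as long as the fixed background image stays valid. With that baseline in mind, let's get back to frame differencing.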
If I move my laptop rapidly, we can see something like this:

Instead of the laptop, let's move the object and see what happens. If I rapidly shake my head, it will look something like this:

As you can see in the preceding images, only the moving parts of the video get highlighted. This gives us a good starting point to see which areas are moving in the video. Let's take a look at the function to compute the frame difference:

Mat frameDiff(Mat prevFrame, Mat curFrame, Mat nextFrame)
{
    Mat diffFrames1, diffFrames2, output;

    // Compute absolute difference between current frame and the next frame
    absdiff(nextFrame, curFrame, diffFrames1);

    // Compute absolute difference between current frame and the previous frame
    absdiff(curFrame, prevFrame, diffFrames2);

    // Bitwise "AND" operation between the above two diff images
    bitwise_and(diffFrames1, diffFrames2, output);

    return output;
}

Frame differencing is fairly straightforward. We compute the absolute difference between the current frame and the previous frame, and between the current frame and the next frame. We then take these frame differences and apply the bitwise AND operator. This will highlight the moving parts in the image. If you just compute the difference between the current frame and the previous frame, it tends to be noisy. Hence, we need to use the bitwise AND operator between successive frame differences to get some stability when we look at the moving objects.

Let's take a look at the function that can extract and return a frame from the webcam:

Mat getFrame(VideoCapture cap, float scalingFactor)
{
    //float scalingFactor = 0.5;
    Mat frame, output;

    // Capture the current frame
    cap >> frame;

    // Resize the frame
    resize(frame, frame, Size(), scalingFactor, scalingFactor, INTER_AREA);

    // Convert to grayscale
    cvtColor(frame, output, CV_BGR2GRAY);

    return output;
}

As we can see, it's pretty straightforward. We just need to resize the frame and convert it to grayscale. Now that we have the helper functions ready, let's take a look at the main function and see how it all comes together:

int main(int argc, char* argv[])
{
    Mat frame, prevFrame, curFrame, nextFrame;
    char ch;

    // Create the capture object
    // 0 -> input arg that specifies it should take the input from the webcam
    VideoCapture cap(0);

    // If you cannot open the webcam, stop the execution!
    if( !cap.isOpened() )
        return -1;

    //create GUI windows
    namedWindow("Frame");

    // Scaling factor to resize the input frames from the webcam
    float scalingFactor = 0.75;

    prevFrame = getFrame(cap, scalingFactor);
    curFrame = getFrame(cap, scalingFactor);
    nextFrame = getFrame(cap, scalingFactor);

    // Iterate until the user presses the Esc key
    while(true)
    {
        // Show the object movement
        imshow("Object Movement", frameDiff(prevFrame, curFrame, nextFrame));

        // Update the variables and grab the next frame
        prevFrame = curFrame;
        curFrame = nextFrame;
        nextFrame = getFrame(cap, scalingFactor);

        // Get the keyboard input and check if it's 'Esc'
        // 27 -> ASCII value of 'Esc' key
        ch = waitKey( 30 );
        if (ch == 27) {
            break;
        }
    }

    // Release the video capture object
    cap.release();

    // Close all windows
    destroyAllWindows();

    return 1;
}

How well does it work? As we can see, frame differencing addresses a couple of important problems we faced earlier. It can quickly adapt to lighting changes or camera movements. If an object comes into the frame and stays there, it will not be detected in future frames. One of the main concerns of this approach is detecting uniformly colored objects.
It can only detect the edges of a uniformly colored object. This is because a large portion of this object will result in very low pixel differences, as shown in the following image:

Let's say this object moved slightly. If we compare this with the previous frame, it will look like this:

Hence, we have very few pixels labeled on that object. Another concern is that it is difficult to detect whether an object is moving toward the camera or away from it.

Resources for Article:

Further resources on this subject:

Tracking Objects in Videos [article]
Detecting Shapes Employing Hough Transform [article]
Hand Gesture Recognition Using a Kinect Depth Sensor [article]