Unit Testing

Packt
18 Feb 2015
18 min read
In this article by Mikael Lundin, author of the book Testing with F#, we will see how unit testing is the art of designing our program in such a way that we can easily test each function as an isolated unit and so verify its correctness. Unit testing is not only a tool for verifying functionality, but, even more, a tool for designing that functionality in a testable way. What you gain is a means of finding problems early, facilitating change, documentation, and design. In this article, we will dive into how to write good unit tests using F#:

- Testing in isolation
- Finding the abstraction level

FsUnit

The current state of unit testing in F# is good. You can get all the major test frameworks running with little effort, but there is still something that feels a bit off with the way tests and asserts are expressed:

    open NUnit.Framework

    Assert.That(result, Is.EqualTo(42))

Using FsUnit, you can achieve much higher expressiveness in writing unit tests by simply reversing the way the assert is written:

    open FsUnit

    result |> should equal 42

The FsUnit framework is not a test runner in itself, but uses an underlying test framework to execute. The underlying framework can be MSTest, NUnit, or xUnit. FsUnit can best be described as providing a different structure and syntax for writing tests. While this syntax is denser, the need for structure still exists, and the Arrange-Act-Assert (AAA) pattern is needed more than ever. Consider the following test example:

    [<Measure>] type EUR
    [<Measure>] type SEK

    type Country =
        | Sweden
        | Germany
        | France

    let calculateVat country (amount : float<'u>) =
        match country with
        | Sweden -> amount * 0.25
        | Germany -> amount * 0.19
        | France -> amount * 0.2

    open NUnit.Framework
    open FsUnit

    [<Test>]
    let ``Sweden should have 25% VAT`` () =
        let amount = 200.<SEK>
        calculateVat Sweden amount |> should equal 50<SEK>

This code calculates the Swedish VAT on an amount in Swedish currency. What is interesting is that when we break the test code down, we see that it actually follows the AAA structure, even though it doesn't explicitly say so:

    [<Test>]
    let ``Germany should have 19% VAT`` () =
        // arrange
        let amount = 200.<EUR>
        // act
        calculateVat Germany amount
        //assert
        |> should equal 38<EUR>

The only thing I did here was add the annotations for AAA. It gives us a perspective on what we're doing, what frame we're working inside, and the rules for writing good unit tests.

Assertions

We have already seen the equal assertion, which verifies that the test result is equal to the expected value:

    result |> should equal 42

You can negate this assertion by using the not' operator, as follows:

    result |> should not' (equal 43)

With strings, it's quite common to assert that a string starts or ends with some value, as follows:

    "$12" |> should startWith "$"
    "$12" |> should endWith "12"

And you can also negate that, as follows:

    "$12" |> should not' (startWith "€")
    "$12" |> should not' (endWith "14")

You can verify that a result is within a boundary.
This will, in turn, verify that the result is somewhere between the values 35 and 45:

    result |> should (equalWithin 5) 40

And you can also negate that, as follows:

    result |> should not' ((equalWithin 1) 40)

With the collection types list, array, and sequence, you can check that a collection contains a specific value:

    [1..10] |> should contain 5

And you can also negate it to verify that a value is missing, as follows:

    [1; 1; 2; 3; 5; 8; 13] |> should not' (contain 7)

It is common to test the boundaries of a function and then its exception handling. This means you need to be able to assert exceptions, as follows:

    let getPersonById id = failwith "id cannot be less than 0"
    (fun () -> getPersonById -1 |> ignore) |> should throw typeof<System.Exception>

There is a be function that can be used in a lot of interesting ways. Even in situations where the equal assertion could replace some of these be constructs, we can opt for a more semantic way of expressing our assertions, providing better error messages. Let us see some examples, as follows:

    // true or false
    1 = 1 |> should be True
    1 = 2 |> should be False

    // strings as result
    "" |> should be EmptyString
    null |> should be NullOrEmptyString

    // null is nasty in functional programming
    [] |> should not' (be Null)

    // same reference
    let person1 = new System.Object()
    let person2 = person1
    person1 |> should be (sameAs person2)

    // not same reference, because of copy by value
    let a = System.DateTime.Now
    let b = a
    a |> should not' (be (sameAs b))

    // greater and lesser
    result |> should be (greaterThan 0)
    result |> should not' (be lessThan 0)

    // of type
    result |> should be ofExactType<int>

    // list assertions
    [] |> should be Empty
    [1; 2; 3] |> should not' (be Empty)

With this, you should be able to assert most of the things you're looking for, but there still might be a few edge cases out there that the default FsUnit asserts won't catch.

Custom assertions

FsUnit is extensible, which makes it easy to add your own assertions on top of the chosen test runner. This has the potential to make your tests extremely readable. The first example is a custom assert that verifies that a given string matches a regular expression. This will be implemented using NUnit as a framework, as follows:

    open FsUnit
    open NUnit.Framework.Constraints
    open System.Text.RegularExpressions

    // NUnit: implement a new assert
    type MatchConstraint(n) =
        inherit Constraint() with
            override this.WriteDescriptionTo(writer : MessageWriter) : unit =
                writer.WritePredicate("matches")
                writer.WriteExpectedValue(sprintf "%s" n)
            override this.Matches(actual : obj) =
                match actual with
                | :? string as input -> Regex.IsMatch(input, n)
                | _ -> failwith "input must be of string type"

    let match' n = MatchConstraint(n)

    open NUnit.Framework

    [<Test>]
    let ``NUnit custom assert`` () =
        "2014-10-11" |> should match' "\d{4}-\d{2}-\d{2}"
        "11/10 2014" |> should not' (match' "\d{4}-\d{2}-\d{2}")

In order to create your own assert, you need to create a type that implements the NUnit.Framework.Constraints.IConstraint interface, which is easily done by inheriting from the Constraint base class. You need to override both the WriteDescriptionTo() and Matches() methods, where the first controls the message that will be output from the test, and the second is the actual test. In this implementation, I verify that the input is a string, or the test will fail. Then, I use the Regex.IsMatch() static function to verify the match.
Next, we create an alias for the MatchConstraint() function, match', with the extra apostrophe to avoid a conflict with the built-in F# match expression, and then we can use it like any other assert function in FsUnit. Doing the same for xUnit requires a completely different implementation. First, we need to add a reference to the NHamcrest API, which we can find by searching for the package in the NuGet Package Manager. Instead of inheriting from an NUnit constraint, we write an implementation that uses the NHamcrest API, a .NET port of the Java Hamcrest library for building matchers for test expressions, as follows:

    open System.Text.RegularExpressions
    open NHamcrest
    open NHamcrest.Core

    // test assertion for regular expression matching
    let match' pattern =
        CustomMatcher<obj>(sprintf "Matches %s" pattern, fun c ->
            match c with
            | :? string as input -> Regex.IsMatch(input, pattern)
            | _ -> false)

    open Xunit
    open FsUnit.Xunit

    [<Fact>]
    let ``Xunit custom assert`` () =
        "2014-10-11" |> should match' "\d{4}-\d{2}-\d{2}"
        "11/10 2014" |> should not' (match' "\d{4}-\d{2}-\d{2}")

The functionality in this implementation is the same as in the NUnit version, but the implementation itself is much easier. We create a function that receives an argument and returns a CustomMatcher<obj> object. This only takes the output message for the test and the function that tests the match. Writing an assertion for FsUnit driven by MSTest works exactly the same way as it does with xUnit: by creating a CustomMatcher<obj> object with NHamcrest.

Unquote

There is another F# assertion library that is completely different from FsUnit but, with a different design philosophy, accomplishes the same thing: making F# unit tests more functional. Just like FsUnit, this library provides the means of writing assertions, but relies on NUnit as a testing framework. Instead of working with a DSL like FsUnit, or an API such as the NUnit framework's, the Unquote library bases its assertions on F# code quotations. Code quotations are a fairly little-known feature of F# with which you can turn any code into an abstract syntax tree. Namely, when the F# compiler finds a code quotation in your source file, it will not compile it, but rather expand it into a syntax tree that represents an F# expression. The following is an example of a code quotation:

    <@ 1 + 1 @>

If we execute this in F# Interactive, we'll get the following output:

    val it : Quotations.Expr = Call (None, op_Addition, [Value (1), Value (1)])

This is truly code as data, and we can use it to write code that operates on code as if it were data, which in this case it is. It brings us closer to what a compiler does, and gives us a lot of power in the metaprogramming space. We can use this to write assertions with Unquote. Start by adding the Unquote NuGet package to your test project. Now we can implement our first test using Unquote, as follows:

    open NUnit.Framework
    open Swensen.Unquote

    [<Test>]
    let ``Fibonacci sequence should start with 1, 1, 2, 3, 5`` () =
        test <@ fibonacci |> Seq.take 5 |> List.ofSeq = [1; 1; 2; 3; 5] @>

This works by Unquote first finding the equals operation, and then reducing each side of the equals sign until both sides are equal or can no longer be reduced. This is most easily explained by writing a test that fails and watching the output.
The following test should fail because 9 is not a prime number:

    [<Test>]
    let ``prime numbers under 10 are 2, 3, 5, 7, 9`` () =
        test <@ primes 10 = [2; 3; 5; 7; 9] @> // fail

The test will fail with the following message:

    Test Name: prime numbers under 10 are 2, 3, 5, 7, 9
    Test FullName: chapter04.prime numbers under 10 are 2, 3, 5, 7, 9
    Test Outcome: Failed
    Test Duration: 0:00:00.077

    Result Message:
    primes 10 = [2; 3; 5; 7; 9]
    [2; 3; 5; 7] = [2; 3; 5; 7; 9]
    false

    Result StackTrace:
    at Microsoft.FSharp.Core.Operators.Raise[T](Exception exn)
    at chapter04.prime numbers under 10 are 2, 3, 5, 7, 9()

In the resulting message, we can see both sides of the equals sign reduced until only false remains. It's a very elegant way of breaking down a complex assertion.

Assertions

The assertions in Unquote are not as specific or extensive as the ones in FsUnit. The point of having lots of specific assertions for different situations is to get very descriptive error messages when tests fail. Since Unquote outputs the whole reduction of the statements when a test fails, the need for explicit assertions is not that high; you'll get a descriptive error message anyway. The most common assertion by far is the check for equality, as shown before. You can also verify that two expressions are not equal:

    test <@ 1 + 2 = 4 - 1 @>
    test <@ 1 + 2 <> 4 @>

We can check whether a value is greater or smaller than the expected value:

    test <@ 42 < 1337 @>
    test <@ 1337 > 42 @>

You can check for a specific exception, or just any exception:

    raises<System.NullReferenceException> <@ (null : string).Length @>
    raises<exn> <@ System.String.Format(null, null) @>

Here, the Unquote syntax excels compared to FsUnit, which uses a unit lambda expression to do the same thing in a rather quirky way. The Unquote library also exposes its reduce functionality in the public API, making it possible for you to reduce and analyze an expression. Using the reduceFully syntax, we can get the reduction as a list, as shown in the following:

    > <@ (1+2)/3 @> |> reduceFully |> List.map decompile;;
    val it : string list = ["(1 + 2) / 3"; "3 / 3"; "1"]

If we just want the output printed to the console, we can run the unquote command directly:

    > unquote <@ [for i in 1..5 -> i * i] = ([1..5] |> List.map (fun i -> i * i)) @>;;
    Seq.toList (seq (Seq.delay (fun () -> Seq.map (fun i -> i * i) {1..5}))) = ([1..5] |> List.map (fun i -> i * i))
    Seq.toList (seq seq [1; 4; 9; 16; ...]) = ([1; 2; 3; 4; 5] |> List.map (fun i -> i * i))
    Seq.toList seq [1; 4; 9; 16; ...] = [1; 4; 9; 16; 25]
    [1; 4; 9; 16; 25] = [1; 4; 9; 16; 25]
    true

It is important to know what tools are out there, and Unquote is one of those tools that is fantastic to know about when you run into a testing problem in which you want to reduce both sides of an equals sign. Most often, this applies to difference computations or algorithms such as price calculations. We have also seen that Unquote provides a great way of expressing tests for exceptions, unmatched by FsUnit.

Testing in isolation

One of the most important aspects of unit testing is to test in isolation. This does not only mean faking any external dependency; it also means that the test code itself should not be tied up with other test code. If you're not testing in isolation, there is a risk that your test fails not because of the system under test, but because of state that has lingered from a previous test run, or because of external dependencies. Writing pure functions without any state is one way of making sure your tests run in isolation.
Another way is by making sure that each test creates all the state it needs itself. Shared state between tests, such as connections, is a bad idea. Using TestFixtureSetUp/TearDown attributes to set up state for a set of tests is a bad idea. Keeping slow resources around because they're expensive to set up is a bad idea. The most common kinds of shared state are the following:

- The ASP.NET Model View Controller (MVC) session state
- Dependency injection setup
- A database connection (even though using one means the test is no longer strictly a unit test)

Here's how one should think about unit testing in isolation: each test is responsible for setting up the SUT and its database/web service stubs in order to perform the test and assert on the result. It is equally important that the test cleans up after itself, which in the case of unit tests can most often be handed over to the garbage collector and doesn't need to be done explicitly. It is common to think that one should only isolate a test fixture from other test fixtures, but this notion of a test fixture is unhelpful. Instead, one should strive to have each test stand on its own to as large an extent as possible, and not depend on outside setup. This does not mean you will end up with unnecessarily long unit tests, provided you write the SUT and the tests well within that context. The problem we often run into is that the SUT itself maintains some kind of state that is present between tests. The state can simply be a value that is set in the application domain and persists between different test runs, as follows:

    let getCustomerFullNameByID id =
        if cache.ContainsKey(id) then
            (cache.[id] :?> Customer).FullName
        else
            // get from database
            // NOTE: stub code
            let customer = db.getCustomerByID id
            cache.[id] <- customer
            customer.FullName

The problem we see here is that the cache will persist from one test to another, so when the second test is running, it needs to make sure it is running with a clean cache, or the result might not be as expected. One way to test this properly would be to separate the core logic from the cache and test each independently. Another would be to treat it as a black box and ignore the cache completely: if the cache makes the test fail, then the functionality fails as a whole. Which to choose depends on whether we see the cache as an implementation detail of the function or as functionality in its own right. Testing implementation details, or private functions, is dirty, because our tests might break even when the functionality hasn't changed. And yet, there might be benefits to taking the implementation detail into account. In this case, we could use the cache functionality to easily stub out the database without the need for any mocking framework.

Vertical slice testing

Most often, we treat dependencies as something we need to mock away, whereas the better option would be to implement a test harness directly into the product. We know what kind of data and what kind of calls we need to make to the database, so right there we have a public API for the database. This is often called a data access layer in a three-tier architecture (but no one ever does those anymore, right?).
As we have a public data access layer, we can easily provide an in-memory implementation that can be used not only by our tests, but also while developing the product. When you're running the application in development mode, you configure it to use the in-memory version of the dependency. This provides you with the following benefits:

- You get a faster development environment
- Your tests become simpler
- You have complete control of your dependency

As your development environment does everything in memory, it becomes blazingly fast. And as you develop your application, you will appreciate adjusting that public API and coming to understand exactly what you expect from the dependency. It leads to a cleaner API, where very few side effects are allowed to seep through. Your tests become much simpler because, instead of mocking away the dependency, you can call the in-memory dependency and set whatever state you want. Here's an example of what a public data access API might look like:

    type IDataAccess =
        abstract member GetCustomerByID : int -> Customer
        abstract member FindCustomerByName : string -> Customer option
        abstract member UpdateCustomerName : int -> string -> Customer
        abstract member DeleteCustomerByID : int -> bool

This is surely a very simple API, but it demonstrates the point: there is a database with customers in it, and we want to perform some operations on them. In this case, our in-memory implementation would look like this:

    type InMemoryDataAccess() =
        let data = new System.Collections.Generic.Dictionary<int, Customer>()

        // expose the add method
        member this.Add customer = data.Add(customer.ID, customer)

        interface IDataAccess with
            // throw exception if not found
            member this.GetCustomerByID id =
                data.[id]

            member this.FindCustomerByName fullName =
                data.Values |> Seq.tryFind (fun customer -> customer.FullName = fullName)

            member this.UpdateCustomerName id fullName =
                data.[id] <- { data.[id] with FullName = fullName }
                data.[id]

            member this.DeleteCustomerByID id =
                data.Remove(id)

This simple implementation provides the same functionality as the database would, but in memory. This makes it possible to run the tests completely in isolation without worrying about mocking away the dependencies. The dependencies are already substituted with in-memory replacements, and as this example shows, the in-memory replacement doesn't have to be very extensive. The only function beyond the interface implementation is Add(), which lets us set the state prior to the test, as this is something the interface itself doesn't provide for us.
Now, in order to tie it together with the real implementation, we need a configuration setting that selects which version to use, as shown in the following code:

    open System.Configuration
    open System.Collections.Specialized

    // TryGetValue extension method to NameValueCollection
    type NameValueCollection with
        member this.TryGetValue (key : string) =
            if this.Get(key) = null then
                None
            else
                Some (this.Get key)

    let dataAccess : IDataAccess =
        match ConfigurationManager.AppSettings.TryGetValue("DataAccess") with
        | Some "InMemory" -> new InMemoryDataAccess() :> IDataAccess
        | Some _ | None -> new DefaultDataAccess() :> IDataAccess

    // usage
    let fullName = (dataAccess.GetCustomerByID 1).FullName

Again, with only a few lines of code, we manage to select the appropriate IDataAccess instance and execute against it, without using dependency injection or taking a penalty in code readability, as we would in C#. The code is straightforward and easy to read, and we can execute any tests we want without touching the external dependency, or in this case, the database.

Finding the abstraction level

In order to start unit testing, you have to start writing tests; this is what they'll tell you. If you want to get good at it, just start writing tests, any of them and a lot of them; the rest will solve itself. I've watched experienced developers sit staring dumbfounded at an empty screen because they couldn't work out how to get started or what to test. The question is not unfounded. In fact, it is still debated in the Test Driven Development (TDD) community what should be tested. The ground rule is that a test should bring at least as much value as the cost of writing it, but that is a bad rule for someone new to testing, because for them all tests are expensive to write.

Summary

In this article, we learned how to write unit tests using the appropriate tools at our disposal: NUnit, FsUnit, and Unquote. We also learned about different techniques for handling external dependencies, using interfaces and functional signatures, and performing dependency injection through constructors, properties, and methods.


Financial Management with Microsoft Dynamics AX 2012 R3

Packt
11 Feb 2015
4 min read
In this article by Mohamed Aamer, author of Microsoft Dynamics AX 2012 R3 Financial Management, we will see that the core foundation of Enterprise Resource Planning (ERP) is financial management; it is vital to understand the financial characteristics of Microsoft Dynamics AX 2012 R3 from a practical perspective, grounded in how the application works. It is important to cover the following topics:

- Understanding financial management aspects in Microsoft Dynamics AX 2012 R3
- Covering the business rationale, basic setups, and configuration
- Real-life business requirements and their solutions
- Implementation tips and tricks, in addition to the key points to consider during analysis, design, deployment, and operation

The Microsoft Dynamics AX 2012 R3 Financial Management book covers the main characteristics of the general ledger and its integration with the other subledgers (accounts payable, accounts receivable, fixed assets, cash and bank management, and inventory). It also covers the core features of main accounts, account categorization and its controls, along with the opening balance process and concept, and the closing procedure. It then discusses subledger functionality (accounts payable, accounts receivable, fixed assets, cash and bank management, cash flow management, and inventory) in more detail by walking through the master data, controls, and transactions and their effects on the general ledger. It explores financial reporting, which is one of the basic implementation cornerstones. The main principles for reporting are the reliability of business information and the ability to get the right information at the right time to the right person. Reports that analyze ERP data in an expressive way represent the output of the ERP implementation; they are considered the cream of the implementation, the next level of value that solution stakeholders should target. This ultimate outcome results from building all reports on a single point of information.

Planning reporting needs for ERP

The Microsoft Dynamics AX implementation team should challenge the management's reporting needs in the analysis phase of the implementation, with a particular focus on exploring the data required to build the reports. These data requirements should then be cross-checked against the real data entry activities that end users will execute, to ensure that business users will get vital information from the reports. The reporting levels are as follows:

- Operational management
- Middle management
- Top management

Understanding the information technology value chain

The model of a management information system is most applicable to the Information Technology (IT) manager or Chief Information Officer (CIO) of a business. Business owners likely don't care as much about the specifics, as long as these aspects of the solution deliver the required results. The following are the basic layers of the value chain:

- Database management
- Business processes
- Business intelligence
- Frontend

Understanding Microsoft Dynamics AX information source blocks

This section explores the information sources that ultimately determine the strategic value of Business Intelligence (BI) reporting and analytics. These are divided into three blocks:
- Detailed transactions block
- Business intelligence block
- Executive decisions block

Discovering Microsoft Dynamics AX reporting

The reporting options offered by Microsoft Dynamics AX are:

- Inquiry forms
- SQL Server Reporting Services (SSRS) reports
- The original transaction
- The Original document function
- Audit trail

Reporting currency

Companies report their transactions in a specific currency, known as the accounting currency or local currency. It is normal to post transactions in a different currency; such an amount is translated to the home currency using the current exchange rate.

Autoreports

The Autoreport wizard is a user-friendly tool. The end user can easily generate a report starting from any form in Microsoft Dynamics AX. This wizard helps the user create a report based on the information in the form and save the report.

Summary

In this article, we covered financial reporting, from planning to the consideration of reporting levels. We covered important points that affect reporting quality by considering the reporting value chain, which consists of infrastructure, database management, business processes, business intelligence, and the frontend. We also discussed the information source blocks, which consist of the detailed transactions block, the business intelligence block, and the executive decisions block. Then we learned about the reporting possibilities in Microsoft Dynamics AX, such as inquiry forms and SSRS reports, and the autoreport capabilities in Microsoft Dynamics AX 2012 R3.


How to Build a Koa Web Application - Part 2

Christoffer Hallas
08 Feb 2015
5 min read
In Part 1 of this series, we got everything in place for our Koa app using Jade and Mongel. In this post, we will cover Jade templates and how to build the listing and viewing pages. Please note that this series requires Node.js version 0.11+.

Jade templates

Rendering HTML is always an important part of any web application. Luckily, when using Node.js there are many great choices, and for this article we've chosen Jade. Keep in mind, though, that we will only touch on a tiny fraction of Jade's functionality. Let's create our first Jade template. Create a file called create.jade and put in the following:

    doctype html
    html(lang='en')
      head
        title Create Page
      body
        h1 Create Page
        form(method='POST', action='/create')
          input(type='text', name='title', placeholder='Title')
          input(type='text', name='contents', placeholder='Contents')
          input(type='submit')

For all the Jade questions you have that we won't answer in this series, I refer you to the excellent official Jade website at http://jade-lang.com. If you add the statement app.listen(3000); to the end of index.js, you should be able to run the program from your terminal using the following command and visit http://localhost:3000 in your browser:

    $ node --harmony index.js

The --harmony flag just tells the node program that we need support for generators in our program.

Listing and viewing pages

Now that we can create a page in our MongoDB database, it is time to actually list and view these pages. For this purpose, we need to add another middleware to our index.js file, after the first middleware:

    app.use(function* () {
      if (this.method != 'GET') {
        this.status = 405;
        this.body = 'Method Not Allowed';
        return
      }
      …
    });

As you can probably already tell, this new middleware is very similar to the first one we added, which handled the creation of pages. First we make sure that the method of the request is GET, and if not, we respond appropriately and return:

    var params = this.path.split('/').slice(1);
    var id = params[0];

    if (id.length == 0) {
      var pages = yield Page.find();
      var html = jade.renderFile('list.jade', { pages: pages });
      this.body = html;
      return
    }

Then we proceed to inspect the path attribute of the Koa context, looking for an ID that represents the page in the database. Remember how we redirected using the ID in the previous middleware. We inspect the path by splitting it into an array of strings separated by the forward slashes of the URL; this way, the path /1234 becomes an array of '' and '1234'. Because the path starts with a forward slash, the first item in the array will always be the empty string, so we discard it by default. Then we check the length of the id parameter, and if it's zero we know that there is in fact no ID in the path; we should just look up the pages in the database and render our list.jade template with those pages made available to the template as the variable pages. Making data available in templates is also known as providing locals to the template.
Here's what list.jade looks like:

    doctype html
    html(lang="en")
      head
        title Your Web Application
      body
        h1 Your Web Application
        ul
          - each page in pages
            li
              a(href='/#{page._id}')= page.title

But if the length of id was not zero, we assume that it's an ID, we try to load that specific page from the database instead of all the pages, and we render our view.jade template with the page:

    var page = yield Page.findById(id);
    var html = jade.renderFile('view.jade', page);
    this.body = html;

And here is view.jade:

    doctype html
    html(lang="en")
      head
        title= title
      body
        h1= title
        p= contents

That's it

You should now be able to run the app as previously described, create a page, list all of your pages, and view them. If you want to, you can continue and build a simple CMS system. Koa is very simple to use and doesn't enforce a lot of functionality on you, allowing you to pick and choose the libraries that you need and want to use. There are many possibilities, and that is one of Koa's biggest strengths.

About the author

Christoffer Hallas is a software developer and entrepreneur from Copenhagen, Denmark. He is a computer polyglot and contributes to and maintains a number of open source projects. When not contemplating his next grand idea (which remains an idea), he enjoys music, sports, and design of all kinds. Christoffer can be found on GitHub as hallas and on Twitter as @hamderhallas.


The Five Kinds of Python Functions Python 3.4 Edition

Packt
06 Feb 2015
33 min read
This article is written by Steven Lott, author of the book Functional Python Programming. You can find more about him at http://slott-softwarearchitect.blogspot.com.

What's This About?

We're going to look at the various ways that Python 3 lets us define things which behave like functions. The proper term here is callable: we're looking at objects that can be called like a function. We'll look at the following Python constructs:

- Function definitions
- Higher-order functions
- Function wrappers (around methods)
- Lambdas
- Callable objects
- Generator functions and the yield statement

And yes, we're aware that the list above has six items on it. That's because higher-order functions in Python aren't really all that complex or different. In some languages, functions that take functions as arguments involve special syntax. In Python, it's simple and common, and barely worth mentioning as a separate topic. We'll look at when it's appropriate, and inappropriate, to use one or the other of these various functional forms.

Some background

Let's take a quick peek at a basic bit of mathematical formalism. We'll look at a function as an abstract formalism, often annotated like this: y = f(x). This shows us that f() is a function; it has one argument, x, and maps it to a single value, y. Some mathematical functions are written in front of the argument, for example y = sin x. Some are written in other places around the argument, for example y = |x|. In Python, the syntax is more consistent; we use a function like this:

    >>> abs(-5)
    5

We've applied the abs() function to an argument value of -5. The argument value was mapped to a value of 5.

Terminology

Consider a function of a pair of values that produces a pair of values, f(a, b) = (q, r). In this definition, the argument is the pair (a, b). The set of argument values for which the function is defined is called the domain; outside this domain, the function is not defined. In Python, we get a TypeError exception if we provide one value or three values as the argument. The function maps the domain pair to a pair of values, (q, r). The set of values that can be returned by the function is called the range of the function.

Mathematical function features

As we look at the abstract mathematical definition of functions, we note that functions are generally assumed to have no hysteresis; they have no history or memory of prior use. This is sometimes called the property of being idempotent: the results are always the same for a given argument value. We see this in Python as a common feature, but it's not universally true. We'll look at a number of exceptions to the rule of idempotence. Here's an example of the usual situation:

    >>> int("10f1", 16)
    4337

The value returned from the evaluation of int("10f1", 16) never changes. There are, however, some common examples of non-idempotent functions in Python.

Examples of hysteresis

Here are some common situations where a function has hysteresis. In some cases, results vary based on history; in other cases, results vary based on events in some external environment:

- Random number generators. We don't want them to produce the same value over and over again. The Python random.randrange() function is not idempotent.
- OS functions, which depend on the state of the machine as a whole.
The os.listdir() function, for example, returns values that depend on the use of functions such as os.unlink(), os.rename(), and open() (among several others). While the rules are generally simple, the results depend on stateful objects outside the narrow world of the code itself. These are examples of Python functions that don't completely fit the formal mathematical definition; they lack idempotence, and their values depend on history, other functions, or both.

Function Definitions

Python has two statements that are essential features of function definition. The def statement specifies the domain, and the return statement(s) specify the range. A simplified gloss of the syntax is as follows:

    def name(params):
        body
        return expression

In effect, the function's domain is defined by the parameters provided in the def statement. This list of parameter names is not all the information on the domain, however. Even if we use one of the Python extensions to add type annotations, that's still not all the information. There may be if statements in the body of the function that impose additional explicit restrictions. There may be other functions that impose their own kind of implicit restrictions; if, for example, the body included math.sqrt(), then there would be an implicit restriction that some values must be non-negative. The return statements provide the function's range. An empty return statement means a range of simply None values. When there are multiple return statements, the range is the union of the ranges of all the return statements. This mapping between Python syntax and mathematical concepts isn't very complete; we need more information about a function.

Example definition

Here's an example of a function definition:

    def odd(n):
        """odd(n) -> boolean, true if n is odd."""
        return n % 2 == 1

What does this definition tell us? Several things, such as:

- Domain: We know that this function accepts n, a single object.
- Range: A Boolean value, True if n is an odd number. This is the most likely interpretation. It's also remotely possible that the class of n has repurposed the __mod__() or __rmod__() methods, in which case the semantics can be pretty obscure.

Because of the inherent ambiguity in Python, this function provides a triple-quoted """Docstring""" with a summary of the function. This is a best practice, and should be followed universally, except in articles like this where it gets too long-winded to include a docstring everywhere. In this case, the docstring doesn't state unambiguously that n is intended to be a number. There are two ways to handle this gap:

- Actually include words like "n is a number" in the docstring
- Include in the docstring test cases that show the required behavior

Either is acceptable. Both are preferable.

Using a function

To complete this example, here's how we'd use this odd little function named odd():

    >>> odd(3)
    True
    >>> odd(4)
    False

This kind of example text can be included in the docstring to create two test cases that offer insight into what the function really means.

The lack of declarations

More verbose type declarations, as used in many popular programming languages, aren't actually enough information to fully specify a function's domain and range. To be rigorously complete, we need type definitions that include optional predicates. Take a look at the following expression:

    isinstance(n, int) and n >= 0

The assert statement is a good place for this kind of additional argument domain checking.
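As a minimal sketch, not taken from the article, here is how the odd() function might combine both suggestions, a doctest example plus an assert-based domain check; the exact docstring wording and the non-negative restriction are my own assumptions:

    def odd(n):
        """odd(n) -> boolean, true if n is odd.

        n is a non-negative integer:

        >>> odd(3)
        True
        >>> odd(4)
        False
        """
        # runtime domain check; skipped when Python runs with -O
        assert isinstance(n, int) and n >= 0, "n must be a non-negative integer"
        return n % 2 == 1

Running python -m doctest against the module would then exercise the examples embedded in the docstring.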
This isn't the perfect solution, because assert statements can be disabled very easily. It can, however, help during design and testing, and it can help people read your code. The fussy formal type declarations used in other languages are not really needed in Python. Python replaces an up-front claim about required types with a runtime search for appropriate class methods. This works because each Python object has all of its type information bound to it. Static compile-time type information is redundant, since the runtime type information is complete. A Python function definition is pretty spare. It includes the minimal amount of information about the function; there are no formal declarations of parameter types or the return type. This odd little function will work with any object that implements the % operator. Generally, this means any object that implements __mod__() or __rmod__(), which covers most subclasses of numbers.Number. It also means instances of any class that happens to provide these methods. That could become very weird, but it is still possible. We hesitate to think about non-numeric objects that work with the number-like % operator.

Some Python features

In Python, the functions we declare are proper first-class objects. This means that they are objects with attributes, which can be assigned to variables and placed into collections. Quite a few clever things can be done with function objects. One of the most elegant is to use a function as an argument or a return value of another function. The ability to do this means that we can easily create and use higher-order functions in Python. For folks who know languages such as C (and C++), functions there aren't proper first-class objects. A pointer to a function is a first-class object in C, but the function itself is a block of code that can't easily be manipulated. We'll look at a number of simple ways in which we can write and use higher-order functions in Python.

Functions are objects

Consider the following example:

    >>> not_even = odd
    >>> not_even(3)
    True

We've assigned the odd little function object to a new variable, not_even. This creates an alias for the function. While this isn't always the best idea, there are times when we might want to provide an alternate name for a function, for example as part of maintaining backward compatibility with a previous release of a library.

Using functions

Consider the following function definition:

    def some_test(function, value):
        print(function, value)
        return function(value)

This function's domain includes arguments named function and value. We can see that it prints the arguments, then applies the function argument to the given value. When we use the preceding function, it looks like this:

    >>> some_test(odd, 3)
    <function odd at 0x613978> 3
    True

The some_test() function accepted a function as an argument. When we printed the function, we got a summary, <function odd at 0x613978>, that shows us some information about the object. We also see a summary of the argument value, 3. When we applied the function to the value, we got the expected result. We can, of course, extend this concept. In particular, we can apply a single function to many values.

Higher-order Functions

Higher-order functions become particularly useful when we apply them to collections of objects. The built-in map() function applies a simple function to each value in an argument sequence. Here's an example:

    >>> list(map(odd, [1,2,3,4]))
    [True, False, True, False]

We've used the map() function to apply the odd() function to each value in the sequence.
This is a lot like evaluating:

    >>> [odd(x) for x in [1,2,3,4]]

Here we've created a list comprehension instead of applying a higher-order map() function. Both are equivalent to the following snippet:

    [odd(1), odd(2), odd(3), odd(4)]

Here, we've manually applied the odd() function to each value in a sequence. We'll use a diesel engine alternator (and some hoses) as the subject for some concrete examples of higher-order functions.

Diesel engine background

Some basic diesel engine mechanics:

- The engine turns the alternator.
- The alternator generates pulses that drive the tachometer, amongst other things, like charging the batteries.
- The alternator therefore provides an indirect measurement of engine RPMs. Direct measurement would involve connecting to a small geared shaft; it's difficult and expensive.
- We already have a tachometer; it's just incorrect. The new alternator has new wheels, so the ratio between engine and alternator has changed.

We're not interested in installing a new tachometer. Instead, we'll create a conversion from a number on the tachometer, which is calibrated to the old alternator, to a proper number of engine RPMs. This has to allow for the change in ratio between the original alternator and the new one. Let's collect some data and see what we can figure out about engine RPMs.

New alternator

First approximation: all we did was fit new wheels, and we can presume that the old tachometer was correct. Since the new wheel is smaller, we'll have higher alternator RPMs, which means higher readings on the old tachometer. Here's the key question: how far wrong are the RPMs? The ratio of the old wheel to the new wheel is approximately 3.5 to 2.5. We can compute the potential ratio between what the tach says and what the engine is really doing:

    >>> 3.5/2.5
    1.4
    >>> 1/_
    0.7142857142857143

That's nice. Is it right? Can we really just multiply the displayed RPMs by 0.7 to get actual engine RPMs? Let's create the conversion card first, then collect some more data.

Use case

Given the RPM on the tachometer, what's the real RPM of the engine? Use the following function to find out:

    def eng(r):
        return r/1.4

Use it like the following:

    >>> eng(2100)
    1500.0

This seems useful. Tach says 2100, engine (theoretically) spinning at 1500, more or less. Let's confirm our hypothesis with some real data.

Data collection

Over a period of time, we recorded tachometer readings and actual RPMs using a visual RPM measuring device. The visual device requires a strip of reflective tape on one of the engine wheels; it uses a laser and counts returns per minute. Simple. Elegant. Accurate. And really inconvenient. But it got us some data we could digest. Skipping some boring statistics, we wind up with the following function that maps displayed RPMs to actual RPMs:

    def eng2(r):
        return 0.7724*r**1.0134

Here's a sample result:

    >>> eng2(2100)
    1797.1291903589386

When the tach says 2100, the engine is measured as spinning at about 1800 RPM. That's not quite the same as the theoretical model, but it's so close that it gives us a lot of confidence in this version. Of course, the number displayed is hideous; all that floating-point cruft is crazy. What can we do? Rounding is only part of the solution. We need to think through the use case. After all, we use this standing at the helm of the boat; how much detail is appropriate?

Limits and ranges

The engine has governors and only runs between 800 and 2500 RPM. That's a very tight limit.
Realistically, we're talking about this small range of values:

    >>> list(range(800, 2500, 200))
    [800, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400]

There's no sensible reason for providing any more detailed engine RPMs. It's a sailboat; top speed is 7.5 knots (nautical miles per hour). Wind and current have far more impact on boat speed than the difference between 1600 and 1700 RPM. The tach can't be read to closer than 100-200 RPM; it's not digital, it's a red pointer near little tick lines. There's no reason to preserve more than a few bits of precision.

Example of Tach translation

Given the engine RPMs and the conversion function, we can deduce that the tachometer display will be between 1000 and 3200. This will map to engine RPMs in the range of about 800 to 2500. We can confirm this with a mapping like this:

    >>> list(map(eng2, range(1000,3200,200)))
    [847.3098694826986, 1019.258964596305, 1191.5942982618956, 1364.2609728487703, 1537.2178605443924, 1710.4329833319157, 1883.8807562755746, 2057.5402392829747, 2231.3939741669838, 2405.4271806626366, 2579.627182659544]

We've applied the eng2() mapping from tach to engine RPM. For tach readings between 1000 and 3200 in steps of 200, we've computed the actual engine RPMs. For those who use spreadsheets a lot, the range() function is like filling a column with values, and the map(eng2, ...) function is like filling an adjacent column with a calculation. We've created the result of applying a function to each value of a given range. As shown, this is a little difficult to use. We need to do a little more cleanup. What other function do we need to apply to the results?

Round to 100

Here's a function that will round to the nearest 100:

    def next100(n):
        return int(round(n, -2))

We could call this a kind of composite function built from a partial application of the round() and int() functions. If we map this function to the previous results, we get something a little easier to work with. How does this look?

    >>> tach = range(1000,3200,200)
    >>> list(map(next100, map(eng2, tach)))
    [800, 1000, 1200, 1400, 1500, 1700, 1900, 2100, 2200, 2400, 2600]

This expression is a bit complex; let's break it down into three discrete steps:

- First, map the eng2() function to tach numbers between 1000 and 3200. The result is effectively a sequence of values (it's not actually a list, it's a generator, a potential list).
- Second, map the next100() function to the results of the previous mapping.
- Finally, collect a single list object from the results.

We've applied two functions, eng2() and next100(), to a list of values. In principle, we've created a kind of composite function, next100(eng2(rpm)). Python doesn't support function composition directly, hence the complex-looking map-of-map syntax.

Interleave sequences of values

The final step is to create a table that shows both the tachometer reading and the computed engine RPMs. We need to interleave the input and output values into a single list of pairs. Here are the tach readings we're working with:

    >>> tach = range(1000,3200,200)

Here are the engine RPMs:

    >>> engine = list(map(next100, map(eng2, tach)))

Here's how we can interleave the two to create something that shows our tachometer reading and engine RPMs:

    >>> list(zip(tach, engine))
    [(1000, 800), (1200, 1000), (1400, 1200), (1600, 1400), (1800, 1500), (2000, 1700), (2200, 1900), (2400, 2100), (2600, 2200), (2800, 2400), (3000, 2600)]

The rest is pretty-printing.
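As a small illustration that isn't part of the original text, the pretty-printing could be as simple as a loop over the zipped pairs; the column headings here are my own assumption:

    # a minimal sketch: print the conversion card from the zipped pairs
    print("Tach  Engine")
    for t, e in zip(tach, engine):
        print("{0:>4d}  {1:>6d}".format(t, e))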
What's important is that we could take functions like eng() or eng2() and apply them to columns of numbers, creating columns of results. The map() function means that we don't have to write explicit for loops simply to apply a function to a sequence of values.

Map is lazy

We have a few other observations about the Python higher-order functions. First, these functions are lazy: they don't compute any results until required to by other statements or expressions. Because they don't create intermediate list objects, they may be quite fast. The laziness feature is true for the built-in higher-order functions map() and filter(). It's also true for many of the functions in the itertools library. Many of these functions don't simply create a list object; they yield values as requested. For debugging purposes, we use list() to see what's being produced. If we don't apply list() to the result of a lazy function, we simply see that it's a lazy function. Here's an example:

    >>> map(lambda x: x*1.4, range(1000,3200,200))
    <map object at 0x102130610>

We don't see a proper result here, because the lazy map() function didn't do anything. The list(), tuple(), or set() functions will force a lazy map() function to actually get up off the couch and compute something.

Function Wrappers

There are a number of Python functions which are syntactic sugar for method functions. One example is the len() function. This function behaves as if it had the following definition:

    def len(obj):
        return obj.__len__()

The function acts as if it simply invokes the object's built-in __len__() method. There are several Python functions that exist only to make the syntax a little more readable. Postfix syntax purists would prefer to see syntax such as some_list.len(). Those who like their code to look a little more mathematical prefer len(some_list). Some people will go so far as to claim that the presence of prefix functions means that Python isn't strictly object-oriented. This is false; Python is very strictly object-oriented. It simply doesn't use only postfix method notation. We can write function wrappers to make some method functions a little more palatable. Another good example is the divmod() function. This relies on two method functions, as follows:

    a.__divmod__(b)
    b.__rdivmod__(a)

The usual operator rules apply here. If the class of object a implements __divmod__(), then that's used to compute the result. If not, the same test is made for the class of object b; if there's an implementation, it will be used to compute the result. Otherwise, the operation is undefined and we'll get an exception.

Why wrap a method?

Function wrappers for methods are syntactic sugar. They exist to make object methods look like simple functions. In some cases, the functional view is more succinct and expressive. Sometimes the object involved is obvious. For example, the os module functions provide access to OS-level libraries; the OS object is concealed inside the module. Sometimes the object is implied. For example, the random module makes a Random instance for us. We can simply call random.randint() without worrying about the object that was required for this to work properly.

Lambdas

A lambda is an anonymous function with a degenerate body. It's like a function in some respects, and it's unlike a function because of the following two things:

- A lambda has no name
- A lambda has no statements

A lambda's body is a single expression, nothing more.
This expression can have parameters, however, which is why a lambda is a handy form of a callable function. The syntax is essentially as follows:

    lambda params: expression

Here's a concrete example:

    lambda r: 0.7724*r**1.0134

You may recognize this as the eng2() function defined previously. We don't always need a complete, formal function; sometimes we just need an expression that has parameters. Speaking theoretically, a lambda is a one-argument function. When we have multi-argument functions, we can transform them into a series of one-argument lambda forms. This transformation can be helpful for optimization. None of that applies to Python, so we'll move on.

Using a Lambda with map

Here are two equivalent expressions:

    map(eng2, tach)
    map(lambda r: 0.7724*r**1.0134, tach)

Here's a previous example, using the lambda instead of the function:

    >>> tach = range(1000,3200,200)
    >>> list(map(lambda r: 0.7724*r**1.0134, tach))
    [847.3098694826986, 1019.258964596305, 1191.5942982618956, 1364.2609728487703, 1537.2178605443924, 1710.4329833319157, 1883.8807562755746, 2057.5402392829747, 2231.3939741669838, 2405.4271806626366, 2579.627182659544]

You can scroll back to see that the results are the same. If we're doing a small thing once only, a lambda object might be clearer than a complete function definition. The emphasis here is on small and once only. If we start trying to reuse a lambda object, or feel the need to assign a lambda object to a variable, we should really consider a function definition and the associated docstring and doctest features.

Another use of Lambdas

A common use of lambdas is with three other higher-order functions: the sort() method and the built-in min() and max() functions. We might use one of these with a list object:

    some_list.sort(key=lambda x: expr)
    min(some_list, key=lambda x: expr)
    max(some_list, key=lambda x: expr)

In each case, we're using a lambda object to embed an expression into the argument values of a function. In some cases, the expression might be very sophisticated; in other cases, it might be something as trivial as lambda x: x[1]. When the expression is trivial, a lambda object is a good idea. If the expression is going to get reused, however, a lambda object might be a bad idea.

You can do this… But…

The following kind of statement makes sense:

    some_name = lambda x: 3*x+1

We've created a callable object that takes a single argument value and returns a numeric value, much like the following function definition:

    def some_name(x): return 3*x+1

There are some differences, most notably the following:

- A lambda object is all on one line of code. A possible advantage.
- There's no docstring. A disadvantage for lambdas of any complexity.
- Nor is there any doctest in the missing docstring. A significant problem for a lambda object that requires testing. There are ways to test lambdas with doctest outside a docstring, but it seems simpler to switch to a full function definition.
- We can't easily apply decorators to it; we lose the @decorator syntax.
- We can't use any Python statements in it. In particular, no try-except block is possible.

For these reasons, we suggest limiting the use of lambdas to truly trivial situations.

Callable Objects

A callable object fits the model of a function. The unifying feature of all of the things we've looked at is that they're callable. Functions are the primary example of being callable, but objects can also be callable. Callable objects can be subclasses of collections.abc.Callable. Because of Python's flexibility, this isn't a requirement; it's merely a good idea.
To be callable, a class only needs to provide a __call__() method. Here's a complete callable class definition:

from collections.abc import Callable

class Engine(Callable):
    def __call__(self, tach):
        return 0.7724*tach**1.0134

We've imported the collections.abc.Callable class. This will provide some assurance that any class that extends this abstract superclass will provide a definition for the __call__() method. This is a handy error-checking feature. Our class extends Callable by providing the needed __call__() method. In this case, the __call__() method performs a calculation on the single parameter value, returning a single result. Here's a callable object built from this class:

eng = Engine()

This creates a function that we can then use. We can evaluate eng(1000) to get the engine RPMs when the tach reads 1000.

Callable objects step-by-step

There are two parts to making a function a callable object. We'll emphasize these for folks who are new to object-oriented programming:

1. Define a class. Generally, we make this a subclass of collections.abc.Callable. Technically, we only need to implement a __call__() method. It helps to use the proper superclass because it might help catch a few common mistakes.
2. Create an instance of the class. This instance will be a callable object.

The object that's created will be very similar to a defined function. And very similar to a lambda object that's been assigned to a variable. While it will be similar to a def statement, it will have one important additional feature: hysteresis. This can be the source of endless bugs. It can also be a way to improve performance.

Callables can have hysteresis

Here's an example of a callable object that uses hysteresis as a kind of optimization:

class Factorial(Callable):
    def __init__(self):
        self.previous = {}
    def __call__(self, n):
        if n not in self.previous:
            self.previous[n] = self.compute(n)
        return self.previous[n]
    def compute(self, n):
        if n == 0: return 1
        return n*self.__call__(n-1)

Here's how we can use this:

>>> fact= Factorial()
>>> fact(5)
120

We create an instance of the class, and then call the instance to compute a value for us.

The initializer

The initialization method looks like this:

    def __init__(self):
        self.previous = {}

This function creates a cache of previously computed values. This is a technique called memoization. If we've already computed a result once, it's in the self.previous cache; we don't need to compute it again, we already know the answer.

The Callable interface

The required __call__() method looks like this:

    def __call__(self, n):
        if n not in self.previous:
            self.previous[n] = self.compute(n)
        return self.previous[n]

We've checked the memoization cache first. If the value is not there, we're forced to compute the answer, and insert it into the cache. The final answer is always a value in the cache. A common what if question is what if we have a function of multiple arguments? There are two minuscule changes to support more complex arguments. Use def __call__(self, *n): and self.compute(*n). Since we're only computing factorial, there's no need to over-generalize.

The Compute method

The essential computation has been allocated to a method called compute. It looks like this:

    def compute(self, n):
        if n == 0: return 1
        return n*self.__call__(n-1)

This does the real work of the callable object: it computes n!. In this case, we've used a pretty standard recursive factorial definition.
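Before looking at that recursion more closely, here is a brief informal exploration of our own that makes the hysteresis visible; it assumes the Factorial class defined above is already in scope.

fact = Factorial()

print(fact(5))                 # 120; this call fills the cache for 0 through 5
print(sorted(fact.previous))   # [0, 1, 2, 3, 4, 5]

print(fact(7))                 # 5040; only 6 and 7 are newly computed
print(sorted(fact.previous))   # [0, 1, 2, 3, 4, 5, 6, 7]

# Idempotence is preserved: repeating a call returns the same cached answer.
assert fact(5) == 120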
This recursion relies on the __call__() method to check the cache for previous values. If we don't expect to compute values larger than 1000! (a 2,568 digit number, by the way), the recursion works nicely. If we think we need to compute really large factorials, we'll need to use a different approach. Execute the following code to compute very large factorials:

functools.reduce(operator.mul, range(1,n+1))

Either way, we can depend on the internal memoization to leverage previous results.

Note the potential issue

Hysteresis—memory of what came before—is available to callable objects. We call functions and lambdas stateless, whereas callable objects can be stateful. This may be desirable to optimize performance. We can memoize the previous results, or we can design an object that's simply confusing. Consider a function like divmod() that returns two values. We could try to define a callable object that first returns the quotient and, on the second call with the same arguments, returns the remainder:

>>> crazy_divmod(355,113)
3
>>> crazy_divmod(355,113)
16

This is technically possible. But it's crazy. Warning: Stay away. We generally expect idempotence: functions do the same thing each time. Implementing memoization didn't alter the basic idempotence of our factorial function.

Generator Functions

Here's a fun example: the Collatz function. The function creates a sequence using a simple pair of rules. We could call this rule Half-Or-Three-Plus-One (HOTPO). We'll call it collatz():

def collatz(n):
    if n % 2 == 0:
        return n//2
    else:
        return 3*n+1

Each integer argument yields a next integer. These can form a chain. For example, if we start with collatz(13), we get 40. The value of collatz(40) is 20. Here's the sequence of values:

13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1

At 1, it loops: 1 → 4 → 2 → 1 …

Interestingly, all chains seem to lead—eventually—to 1. To explore this, we need a simple function that will build a chain from a given starting value.

Successive values

Here's a function that will build a list object. It iterates through values in the sequence until it reaches 1, when it terminates:

def col_list(n):
    seq = [n]
    while n != 1:
        n = collatz(n)
        seq.append(n)
    return seq

This is not wrong. But it's not really the most useful implementation. This always creates a sequence object. In many cases, we don't really want an object, we only want information about the sequence. We might only want the length, or the largest numbers, or the sum. This is where a generator function might be more useful. A generator function yields elements from the sequence instead of building the sequence as a single object.

Generator functions

To create a generator function, we write a function that has a loop; inside the loop, there's a yield statement. A function with a yield statement is effectively an Iterable object; it can be used in a for statement to produce values. It doesn't create a big list object, it creates the items that can be accumulated into a list or tuple object. A generator function is lazy: it doesn't compute anything unless forced to by another function needing results. We can iterate through as many (or as few) of the results as we need. For example, list(some_generator()) forces all values to be returned. For another example of a lazy sequence, look at how range() objects work. If we evaluate range(10), we only get a lazy range object. If we evaluate list(range(10)), we get a list object.
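Here is a tiny, self-contained sketch of our own showing the same lazy behavior with a handwritten generator function; nothing runs until something iterates over it.

def countdown(n):
    # A generator function: it yields values one at a time instead of building a list.
    while n > 0:
        yield n
        n -= 1

g = countdown(3)
print(g)        # prints something like <generator object countdown at 0x...>
print(list(g))  # [3, 2, 1]; list() forces the generator to run to completion
print(list(g))  # []; the generator is now exhausted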
The Collatz generator

Here's a generator function that will produce sequences of values using the collatz() rule shown previously:

def col_iter(n):
    yield n
    while n != 1:
        n = collatz(n)
        yield n

When we use this in a for loop or with the list() function, it will first yield the argument number. While the number is not 1, it will apply the collatz() function and yield successive values in the chain. When it has yielded 1, it will terminate. One common pattern for generator functions is to replace all list-accumulation statements with yield statements. Instead of building a list one item at a time, we yield each item. The col_iter() generator is lazy. We don't get an answer unless we use list() or tuple() or some variation of a for statement context.

Using a generator function

Here's how this function looks in practice:

>>> for i in col_iter(3):
...     print(i)
3
10
5
16
8
4
2
1

We've used the generator function in a for loop so that it will yield all of the values until it terminates.

Collatz function sequences

Now we can do some exploration of this Collatz sequence. Here are a few evaluations of the col_iter() function:

>>> list(col_iter(3))
[3, 10, 5, 16, 8, 4, 2, 1]
>>> list(col_iter(5))
[5, 16, 8, 4, 2, 1]
>>> list(col_iter(6))
[6, 3, 10, 5, 16, 8, 4, 2, 1]
>>> list(col_iter(13))
[13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

There's an interesting pattern here. It seems that from 16, we know the rest. Generalizing this: from any number we've already seen, we know the rest. Wait. This means that memoization might be a big help in exploring the values created by this sequence. When we start combining function design patterns like this, we're doing functional programming. We're stepping outside the box of purely object-oriented Python.

Alternate styles

Here is an alternative version of the collatz() function:

def collatz2(n):
    return n//2 if n%2 == 0 else 3*n+1

This simply collapses the if statements into a single if expression and may not help much. We also have this:

collatz3 = lambda n: n//2 if n%2 == 0 else 3*n+1

We've collapsed the expression into a lambda object. Helpful? Perhaps not. On the other hand, the function doesn't really need all of the overhead of a full function definition and multiple statements. The lambda object seems to capture everything relevant.

Functions as objects

There's a higher-level function that will produce values until some ending condition is met. We can plug in one of the versions of the collatz() function and a termination test into this general-purpose function:

def recurse_until(ending, the_function, n):
    yield n
    while not ending(n):
        n = the_function(n)
        yield n

This requires two plug-in functions, which are as follows:

ending() is a function to test whether we're done, for example, lambda n: n==1
the_function() is a form of the Collatz function

We've completely uncoupled the general idea of recursively applying a function from a specific function and a specific terminating condition.

Using the recurse_until() function

We can apply this higher-order recurse_until() function like this:

>>> recurse_until(lambda n: n==1, collatz2, 9)
<generator object recurse_until at 0x1021278c0>

What's that? That's how a lazy generator looks: it didn't return any values because we didn't demand any values. We need to use it in a loop or some kind of expression that iterates through all available values. The list() function, for example, will collect all of the values.
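Because the result is a lazy generator, we can also feed it straight into a reduction without ever materializing a list. The following is a small sketch of our own; it assumes the recurse_until() and collatz2() definitions shown above are in scope.

chain = recurse_until(lambda n: n == 1, collatz2, 27)

# Count the values without building a list; each one is consumed as it is produced.
print(sum(1 for value in chain))                             # 112 values, starting from 27

# That generator is now exhausted, so create a fresh one for the next reduction.
print(max(recurse_until(lambda n: n == 1, collatz2, 27)))    # the chain peaks at 9232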
Getting the list of values

Here's how we make the lazy generator do some work:

>>> list(_)
[9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

The _ variable is the previously computed value. It relieves us from the burden of having to write an assignment statement. We can write an expression, see the results, and know the results were automatically saved in the _ variable.

Project Euler #14

Which starting number, under one million, produces the longest chain? Try it without memoization, on a small scale first. Here's a version that checks only the starting values from 1 to 10:

>>> collatz_len = [len(list(recurse_until(lambda n: n==1, collatz2, i)))
...     for i in range(1,11)]
>>> results = zip(collatz_len, range(1,11))
>>> max(results)
(20, 9)
>>> list(col_iter(9))
[9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1]

We defined collatz_len as a list. We're writing a list comprehension; the expression inside it evaluates len(something) for i in range(1,11). This means we'll be collecting ten values into the list, each of which is the length of something. The something is a list object built from the recurse_until(lambda n: n==1, collatz2, i) function that we discussed. This will compute the sequence of Collatz values starting from i and proceeding until the value in the sequence is 1. We've zipped the lengths with the original values of i. This will create pairs of lengths and starting numbers. The maximum length will now have a starting value associated with it so that we can confirm that the results match our expectations. And yes, this Project Euler problem could—in principle—be solved in a single line of code. Will this scale to 100? 1,000? 1,000,000? How much will memoization help?

Summary

In this article, we've looked at five (or six) kinds of Python callables. They all fit the y = f(x) model of a function to varying degrees. When is it appropriate to use each of these different ways to express the same essential concept?

Functions are created with def and return. It shouldn't come as a surprise that this should cover most cases. This allows a docstring comment and doctest test cases. We could call these def functions, since they're built with the def statement.

Higher-order functions—map(), filter(), and the itertools library—are generally written as plain-old def functions. They're higher-order because they accept functions as arguments or return functions as results. Otherwise, they're just functions.

Function wrappers—len(), divmod(), pow(), str(), and repr()—are function syntax wrapped around object methods. These are def'd functions with very tiny bodies. We use them because a.pow(2) doesn't seem as clear as pow(a, 2).

Lambdas are appropriate for one-time use of something so simple that it doesn't deserve being wrapped in a def statement body. In some cases, we have a small nugget of code that seems clearer when written as a lambda expression rather than a more complete function definition. Simple filter rules and simple computations are often more clearly shown as a lambda object.

The Callable objects have a special property that other functions lack: hysteresis. They can retain the results of previous calculations. We've used this hysteresis property to implement memoizing. This can be a huge performance boost. Callable objects can be used badly, however, to create objects that have simply bizarre behavior. Most functions should strive for idempotence—the same arguments should yield the same results.
Generator functions are created with a def statement and at least one yield statement. These functions are iterable. They can be used in a for statement to examine each resulting value. They can also be used with functions like list(), tuple(), and set() to create an actual object from the iterable sequence of values. We might combine them with higher-order functions to do complex processing, one item at a time. It's important to work with each of these kinds of callables. If you only have one tool—a hammer—then every problem has to be reshaped into a nail before you can solve it. Once you have multiple tools available, you can pick the tools that provides the most succinct and expressive solution to the problem. Resources for Article: Further resources on this subject: Expert Python Programming [article] Python Network Programming Cookbook [article] Learning Python Design Patterns [article]
Working with WebStart and the Browser Plugin

Packt
06 Feb 2015
12 min read
 In this article by Alex Kasko, Stanislav Kobyl yanskiy, and Alexey Mironchenko, authors of the book OpenJDK Cookbook, we will cover the following topics: Building the IcedTea browser plugin on Linux Using the IcedTea Java WebStart implementation on Linux Preparing the IcedTea Java WebStart implementation for Mac OS X Preparing the IcedTea Java WebStart implementation for Windows Introduction For a long time, for end users, the Java applets technology was the face of the whole Java world. For a lot of non-developers, the word Java itself is a synonym for the Java browser plugin that allows running Java applets inside web browsers. The Java WebStart technology is similar to the Java browser plugin but runs remotely on loaded Java applications as separate applications outside of web browsers. The OpenJDK open source project does not contain the implementations for the browser plugin nor for the WebStart technologies. The Oracle Java distribution, otherwise matching closely to OpenJDK codebases, provided its own closed source implementation for these technologies. The IcedTea-Web project contains free and open source implementations of the browser plugin and WebStart technologies. The IcedTea-Web browser plugin supports only GNU/Linux operating systems and the WebStart implementation is cross-platform. While the IcedTea implementation of WebStart is well-tested and production-ready, it has numerous incompatibilities with the Oracle WebStart implementation. These differences can be seen as corner cases; some of them are: Different behavior when parsing not well-formed JNLP descriptor files: The Oracle implementation is generally more lenient for malformed descriptors. Differences in JAR (re)downloading and caching behavior: The Oracle implementation uses caching more aggressively. Differences in sound support: This is due to differences in sound support between Oracle Java and IcedTea on Linux. Linux historically has multiple different sound providers (ALSA, PulseAudio, and so on) and IcedTea has more wide support for different providers, which can lead to sound misconfiguration. The IcedTea-Web browser plugin (as it is built on WebStart) has these incompatibilities too. On top of them, it can have more incompatibilities in relation to browser integration. User interface forms and general browser-related operations such as access from/to JavaScript code should work fine with both implementations. But historically, the browser plugin was widely used for security-critical applications like online bank clients. Such applications usually require security facilities from browsers, such as access to certificate stores or hardware crypto-devices that can differ from browser to browser, depending on the OS (for example, supports only Windows), browser version, Java version, and so on. Because of that, many real-world applications can have problems running the IcedTea-Web browser plugin on Linux. Both WebStart and the browser plugin are built on the idea of downloading (possibly untrusted) code from remote locations, and proper privilege checking and sandboxed execution of that code is a notoriously complex task. Usually reported security issues in the Oracle browser plugin (most widely known are issues during the year 2012) are also fixed separately in IcedTea-Web. Building the IcedTea browser plugin on Linux The IcedTea-Web project is not inherently cross-platform; it is developed on Linux and for Linux, and so it can be built quite easily on popular Linux distributions. 
The two main parts of it (stored in corresponding directories in the source code repository) are netx and plugin. NetX is a pure Java implementation of the WebStart technology. We will look at it more thoroughly in the following recipes of this article. Plugin is an implementation of the browser plugin using the NPAPI plugin architecture that is supported by multiple browsers. Plugin is written partly in Java and partly in native code (C++), and it officially supports only Linux-based operating systems. There exists an opinion about NPAPI that this architecture is dated, overcomplicated, and insecure, and that modern web browsers have enough built-in capabilities to not require external plugins. And browsers have gradually reduced support for NPAPI. Despite that, at the time of writing this book, the IcedTea-Web browser plugin worked on all major Linux browsers (Firefox and derivatives, Chromium and derivatives, and Konqueror). We will build the IcedTea-Web browser plugin from sources using Ubuntu 12.04 LTS amd64. Getting ready For this recipe, we will need a clean Ubuntu 12.04 running with the Firefox web browser installed. How to do it... The following procedure will help you to build the IcedTea-Web browser plugin: Install prepackaged binaries of OpenJDK 7: sudo apt-get install openjdk-7-jdk Install the GCC toolchain and build dependencies: sudo apt-get build-dep openjdk-7 Install the specific dependency for the browser plugin: sudo apt-get install firefox-dev Download and decompress the IcedTea-Web source code tarball: wget http://icedtea.wildebeest.org/download/source/icedtea-web-1.4.2.tar.gz tar xzvf icedtea-web-1.4.2.tar.gz Run the configure script to set up the build environment: ./configure Run the build process: make Install the newly built plugin into the /usr/local directory: sudo make install Configure the Firefox web browser to use the newly built plugin library: mkdir ~/.mozilla/plugins cd ~/.mozilla/plugins ln -s /usr/local/IcedTeaPlugin.so libjavaplugin.so Check whether the IcedTea-Web plugin has appeared under Tools | Add-ons | Plugins. Open the http://java.com/en/download/installed.jsp web page to verify that the browser plugin works. How it works... The IcedTea browser plugin requires the IcedTea Java implementation to be compiled successfully. The prepackaged OpenJDK 7 binaries in Ubuntu 12.04 are based on IcedTea, so we installed them first. The plugin uses the GNU Autconf build system that is common between free software tools. The xulrunner-dev package is required to access the NPAPI headers. The built plugin may be installed into Firefox for the current user only without requiring administrator privileges. For that, we created a symbolic link to our plugin in the place where Firefox expects to find the libjavaplugin.so plugin library. There's more... The plugin can also be installed into other browsers with NPAPI support, but installation instructions can be different for different browsers and different Linux distributions. As the NPAPI architecture does not depend on the operating system, in theory, a plugin can be built for non-Linux operating systems. But currently, no such ports are planned. Using the IcedTea Java WebStart implementation on Linux On the Java platform, the JVM needs to perform the class load process for each class it wants to use. This process is opaque for the JVM and actual bytecode for loaded classes may come from one of many sources. 
For example, this method allows the Java Applet classes to be loaded from a remote server to the Java process inside the web browser. Remote class loading also may be used to run remotely loaded Java applications in standalone mode without integration with the web browser. This technique is called Java WebStart and was developed under Java Specification Request (JSR) number 56. To run the Java application remotely, WebStart requires an application descriptor file that should be written using the Java Network Launching Protocol (JNLP) syntax. This file is used to define the remote server to load the application form along with some metainformation. The WebStart application may be launched from the web page by clicking on the JNLP link, or without the web browser using the JNLP file obtained beforehand. In either case, running the application is completely separate from the web browser, but uses a sandboxed security model similar to Java Applets. The OpenJDK project does not contain the WebStart implementation; the Oracle Java distribution provides its own closed-source WebStart implementation. The open source WebStart implementation exists as part of the IcedTea-Web project. It was initially based on the NETwork eXecute (NetX) project. Contrary to the Applet technology, WebStart does not require any web browser integration. This allowed developers to implement the NetX module using pure Java without native code. For integration with Linux-based operating systems, IcedTea-Web implements the javaws command as shell script that launches the netx.jar file with proper arguments. In this recipe, we will build the NetX module from the official IcedTea-Web source tarball. Getting ready For this recipe, we will need a clean Ubuntu 12.04 running with the Firefox web browser installed. How to do it... The following procedure will help you to build a NetX module: Install prepackaged binaries of OpenJDK 7: sudo apt-get install openjdk-7-jdk Install the GCC toolchain and build dependencies: sudo apt-get build-dep openjdk-7 Download and decompress the IcedTea-Web source code tarball: wget http://icedtea.wildebeest.org/download/source/icedtea-web-1.4.2.tar.gz tar xzvf icedtea-web-1.4.2.tar.gz Run the configure script to set up a build environment excluding the browser plugin from the build: ./configure –disable-plugin Run the build process: make Install the newly-built plugin into the /usr/local directory: sudo make install Run the WebStart application example from the Java tutorial: javaws http://docs.oracle.com/javase/tutorialJWS/samples/ deployment/dynamictree_webstartJWSProject/dynamictree_webstart.jnlp How it works... The javaws shell script is installed into the /usr/local/* directory. When launched with a path or a link to the JNLP file, javaws launches the netx.jar file, adding it to the boot classpath (for security reasons) and providing the JNLP link as an argument. Preparing the IcedTea Java WebStart implementation for Mac OS X The NetX WebStart implementation from the IcedTea-Web project is written in pure Java, so it can also be used on Mac OS X. IcedTea-Web provides the javaws launcher implementation only for Linux-based operating systems. In this recipe, we will create a simple implementation of the WebStart launcher script for Mac OS X. Getting ready For this recipe, we will need Mac OS X Lion with Java 7 (the prebuilt OpenJDK or Oracle one) installed. We will also need the netx.jar module from the IcedTea-Web project, which can be built using instructions from the previous recipe. 
How to do it... The following procedure will help you to run WebStart applications on Mac OS X: Download the JNLP descriptor example from the Java tutorials at http://docs.oracle.com/javase/tutorialJWS/samples/deployment/dynamictree_webstartJWSProject/dynamictree_webstart.jnlp. Test that this application can be run from the terminal using netx.jar: java -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot dynamictree_webstart.jnlp Create the wslauncher.sh bash script with the following contents: #!/bin/bash if [ "x$JAVA_HOME" = "x" ] ; then JAVA="$( which java 2>/dev/null )" else JAVA="$JAVA_HOME"/bin/java fi if [ "x$JAVA" = "x" ] ; then echo "Java executable not found" exit 1 fi if [ "x$1" = "x" ] ; then echo "Please provide JNLP file as first argument" exit 1 fi $JAVA -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot $1 Mark the launcher script as executable: chmod 755 wslauncher.sh Run the application using the launcher script: ./wslauncher.sh dynamictree_webstart.jnlp How it works... The next.jar file contains a Java application that can read JNLP files and download and run classes described in JNLP. But for security reasons, next.jar cannot be launched directly as an application (using the java -jar netx.jar syntax). Instead, netx.jar is added to the privileged boot classpath and is run specifying the main class directly. This allows us to download applications in sandbox mode. The wslauncher.sh script tries to find the Java executable file using the PATH and JAVA_HOME environment variables and then launches specified JNLP through netx.jar. There's more... The wslauncher.sh script provides a basic solution to run WebStart applications from the terminal. To integrate netx.jar into your operating system environment properly (to be able to launch WebStart apps using JNLP links from the web browser), a native launcher or custom platform scripting solution may be used. Such solutions lay down the scope of this book. Preparing the IcedTea Java WebStart implementation for Windows The NetX WebStart implementation from the IcedTea-Web project is written in pure Java, so it can also be used on Windows; we also used it on Linux and Mac OS X in previous recipes in this article. In this recipe, we will create a simple implementation of the WebStart launcher script for Windows. Getting ready For this recipe, we will need a version of Windows running with Java 7 (the prebuilt OpenJDK or Oracle one) installed. We will also need the netx.jar module from the IcedTea-Web project, which can be built using instructions from the previous recipe in this article. How to do it... The following procedure will help you to run WebStart applications on Windows: Download the JNLP descriptor example from the Java tutorials at http://docs.oracle.com/javase/tutorialJWS/samples/deployment/dynamictree_webstartJWSProject/dynamictree_webstart.jnlp. 
Test that this application can be run from the terminal using netx.jar: java -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot dynamictree_webstart.jnlp Create the wslauncher.sh bash script with the following contents: #!/bin/bash if [ "x$JAVA_HOME" = "x" ] ; then JAVA="$( which java 2>/dev/null )" else JAVA="$JAVA_HOME"/bin/java fi if [ "x$JAVA" = "x" ] ; then echo "Java executable not found" exit 1 fi if [ "x$1" = "x" ] ; then echo "Please provide JNLP file as first argument" exit 1 fi $JAVA -Xbootclasspath/a:netx.jar net.sourceforge.jnlp.runtime.Boot $1 Mark the launcher script as executable: chmod 755 wslauncher.sh Run the application using the launcher script: ./wslauncher.sh dynamictree_webstart.jnlp How it works... The netx.jar module must be added to the boot classpath as it cannot be run directly because of security reasons. The wslauncher.bat script tries to find the Java executable using the JAVA_HOME environment variable and then launches specified JNLP through netx.jar. There's more... The wslauncher.bat script may be registered as a default application to run the JNLP files. This will allow you to run WebStart applications from the web browser. But the current script will show the batch window for a short period of time before launching the application. It also does not support looking for Java executables in the Windows Registry. A more advanced script without those problems may be written using Visual Basic script (or any other native scripting solution) or as a native executable launcher. Such solutions lay down the scope of this book. Summary In this article we covered the configuration and installation of WebStart and browser plugin components, which are the biggest parts of the Iced Tea project.
Visualforce Development with Apex

Packt
06 Feb 2015
12 min read
In this article by Matt Kaufman and Michael Wicherski, authors of the book Learning Apex Programming, we will see how we can use Apex to extend the Salesforce1 Platform. We will also see how to create a customized Force.com page. (For more resources related to this topic, see here.) Apex on its own is a powerful tool to extend the Salesforce1 Platform. It allows you to define your own database logic and fully customize the behavior of the platform. Sometimes, controlling "what happens behind the scenes isn't enough. You might have a complex process that needs to step users through a wizard or need to present data in a format that isn't native to the Salesforce1 Platform, or maybe even make things look like your corporate website. Anytime you need to go beyond custom logic and implement a custom interface, you can turn to Visualforce. Visualforce is the user interface framework for the Salesforce1 Platform. It supports the use of HTML, JavaScript, CSS, and Flash—all of which enable you to build your own custom web pages. These web pages are stored and hosted by the Salesforce1 Platform and can be exposed to just your internal users, your external community users, or publicly to the world. But wait, there's more! Also included with Visualforce is a robust markup language. This markup language (which is also referred to as Visualforce) allows you to bind your web pages to data and actions stored on the platform. It also allows you to leverage Apex for code-based objects and actions. Like the rest of the platform, the markup portion of Visualforce is upgraded three times a year with new tags and features. All of these features mean that Visualforce is very powerful. s-con-what? Before the "introduction of Visualforce, the Salesforce1 Platform had a feature called s-controls. These were simple files where you could write HTML, CSS, and JavaScript. There was no custom markup language included. In order to make things look like the Force.com GUI, a lot of HTML was required. If you wanted to create just a simple input form for a new Account record, so much HTML code was required. 
The following is just a" small, condensed excerpt of what the HTML would look like if you wanted to recreate such a screen from scratch: <div class="bPageTitle"><div class="ptBody"><div class="content"> <img src="/s.gif" class="pageTitleIcon" title="Account" /> <h1 class="pageType">    Account Edit<span class="titleSeparatingColon">:</span> </h1> <h2 class="pageDescription"> New Account</h2> <div class="blank">&nbsp;</div> </div> <div class="links"></div></div><div   class="ptBreadcrumb"></div></div> <form action="/001/e" method="post" onsubmit="if   (window.ffInAlert) { return false; }if (window.sfdcPage   &amp;&amp; window.sfdcPage.disableSaveButtons) { return   window.sfdcPage.disableSaveButtons(); }"> <div class="bPageBlock brandSecondaryBrd bEditBlock   secondaryPalette"> <div class="pbHeader">    <table border="0" cellpadding="0" cellspacing="0"><tbody>      <tr>      <td class="pbTitle">      <img src="/s.gif" width="12" height="1" class="minWidth"         style="margin-right: 0.25em;margin-right: 0.25em;margin-       right: 0.25em;">      <h2 class="mainTitle">Account Edit</h2>      </td>      <td class="pbButton" id="topButtonRow">      <input value="Save" class="btn" type="submit">      <input value="Cancel" class="btn" type="submit">      </td>      </tr>    </tbody></table> </div> <div class="pbBody">    <div class="pbSubheader brandTertiaryBgr first       tertiaryPalette" >    <span class="pbSubExtra"><span class="requiredLegend       brandTertiaryFgr"><span class="requiredExampleOuter"><span       class="requiredExample">&nbsp;</span></span>      <span class="requiredMark">*</span>      <span class="requiredText"> = Required Information</span>      </span></span>      <h3>Account Information<span         class="titleSeparatingColon">:</span> </h3>    </div>    <div class="pbSubsection">    <table class="detailList" border="0" cellpadding="0"     cellspacing="0"><tbody>      <tr>        <td class="labelCol requiredInput">        <label><span class="requiredMark">*</span>Account         Name</label>      </td>      <td class="dataCol col02">        <div class="requiredInput"><div         class="requiredBlock"></div>        <input id="acc2" name="acc2" size="20" type="text">        </div>      </td>      <td class="labelCol">        <label>Website</label>      </td>      <td class="dataCol">        <span>        <input id="acc12" name="acc12" size="20" type="text">        </span>      </td>      </tr>    </tbody></table>    </div> </div> <div class="pbBottomButtons">    <table border="0" cellpadding="0" cellspacing="0"><tbody>    <tr>      <td class="pbTitle"><img src="/s.gif" width="12" height="1"       class="minWidth" style="margin-right: 0.25em;margin-right:       0.25em;margin-right: 0.25em;">&nbsp;</td>      <td class="pbButtonb" id="bottomButtonRow">      <input value=" Save " class="btn" title="Save"         type="submit">      <input value="Cancel" class="btn" type="submit">      </td>    </tr>    </tbody></table> </div> <div class="pbFooter secondaryPalette"><div class="bg"> </div></div> </div> </form> We did our best to trim down this HTML to as little as possible. Despite all of our efforts, it still "took up more space than we wanted. The really sad part is that all of that code only results in the following screenshot: Not only was it time consuming to write all this HTML, but odds were that we wouldn't get it exactly right the first time. 
Worse still, every time the business requirements changed, we had to go through the exhausting effort of modifying the HTML code. Something had to change in order to provide us relief. That something was the introduction of Visualforce and its markup language. Your own personal Force.com The markup "tags in Visualforce correspond to various parts of the Force.com GUI. These tags allow you to quickly generate HTML markup without actually writing any HTML. It's really one of the greatest tricks of the Salesforce1 Platform. You can easily create your own custom screens that look just like the built-in ones with less effort than it would take you to create a web page for your corporate website. Take a look at the Visualforce markup that corresponds to the HTML and screenshot we showed you earlier: <apex:page standardController="Account" > <apex:sectionHeader title="Account Edit" subtitle="New Account"     /> <apex:form>    <apex:pageBlock title="Account Edit" mode="edit" >      <apex:pageBlockButtons>        <apex:commandButton value="Save" action="{!save}" />        <apex:commandButton value="Cancel" action="{!cancel}" />      </apex:pageBlockButtons>      <apex:pageBlockSection title="Account Information" >        <apex:inputField value="{!account.Name}" />        <apex:inputField value="{!account.Website}" />      </apex:pageBlockSection>    </apex:pageBlock> </apex:form> </apex:page> Impressive! With "merely these 15 lines of markup, we can render nearly 100 lines of earlier HTML. Don't believe us, you can try it out yourself. Creating a Visualforce page Just like" triggers and classes, Visualforce pages can "be created and edited using the Force.com IDE. The Force.com GUI also includes a web-based editor to work with Visualforce pages. To create a new Visualforce page, perform these simple steps: Right-click on your project and navigate to New | Visualforce Page. The Create New Visualforce Page window appears as shown: Enter" the label and name for your "new page in the Label and Name fields, respectively. For this example, use myTestPage. Select the API version for the page. For this example, keep it at the default value. Click on Finish. A progress bar will appear followed by your new Visualforce page. Remember that you always want to create your code in a Sandbox or Developer Edition org, not directly in Production. It is technically possible to edit Visualforce pages in Production, but you're breaking all sorts of best practices when you do. Similar to other markup languages, every tag in a Visualforce page must be closed. Tags and their corresponding closing tags must also occur in a proper order. The values of tag attributes are enclosed by double quotes; however, single quotes can be used inside the value to denote text values. Every Visualforce page starts with the <apex:page> tag and ends with </apex:page> as shown: <apex:page> <!-- Your content goes here --> </apex:page> Within "the <apex:page> tags, you can paste "your existing HTML as long as it is properly ordered and closed. The result will be a web page hosted by the Salesforce1 Platform. Not much to see here If you are" a web developer, then there's a lot you can "do with Visualforce pages. Using HTML, CSS, and images, you can create really pretty web pages that educate your users. If you have some programming skills, you can also use JavaScript in your pages to allow for interaction. If you have access to web services, you can use JavaScript to call the web services and make a really powerful application. 
Check out the following Visualforce page for an example of what you can do: <apex:page> <script type="text/javascript"> function doStuff(){    var x = document.getElementById("myId");    console.log(x); } </script> <img src="http://www.thisbook.com/logo.png" /> <h1>This is my title</h1> <h2>This is my subtitle</h2> <p>In a world where books are full of code, there was only one     that taught you everything you needed to know about Apex!</p> <ol>    <li>My first item</li>    <li>Etc.</li> </ol> <span id="myId"></span> <iframe src="http://www.thisbook.com/mypage.html" /> <form action="http://thisbook.com/submit.html" >    <input type="text" name="yoursecret" /> </form> </apex:page> All of this code is standalone and really has nothing to do with the Salesforce1 Platform other than being hosted by it. However, what really makes Visualforce powerful is its ability to interact with your data, which allows your pages to be more dynamic. Even better, you" can write Apex code to control how "your pages behave, so instead of relying on client-side JavaScript, your logic can run server side. Summary In this article we learned how a few features of Apex and how we can use it to extend the SalesForce1 Platform. We also created a custom Force.com page. Well, you've made a lot of progress. Not only can you write code to control how the database behaves, but you can create beautiful-looking pages too. You're an Apex rock star and nothing is going to hold you back. It's time to show your skills to the world. If you want to dig deeper, buy the book and read Learning Apex Programming in a simple step-by-step fashion by using Apex, the language for extension of the Salesforce1 Platform. Resources for Article: Further resources on this subject: Learning to Fly with Force.com [article] Building, Publishing, and Supporting Your Force.com Application [article] Adding a Geolocation Trigger to the Salesforce Account Object [article]
Contexts and Dependency Injection in NetBeans

Packt
06 Feb 2015
18 min read
In this article by David R. Heffelfinger, the author of Java EE 7 Development with NetBeans 8, we will introduce Contexts and Dependency Injection (CDI) and other aspects of it. CDI can be used to simplify integrating the different layers of a Java EE application. For example, CDI allows us to use a session bean as a managed bean, so that we can take advantage of the EJB features, such as transactions, directly in our managed beans. In this article, we will cover the following topics: Introduction to CDI Qualifiers Stereotypes Interceptor binding types Custom scopes (For more resources related to this topic, see here.) Introduction to CDI JavaServer Faces (JSF) web applications employing CDI are very similar to JSF applications without CDI; the main difference is that instead of using JSF managed beans for our model and controllers, we use CDI named beans. What makes CDI applications easier to develop and maintain are the excellent dependency injection capabilities of the CDI API. Just as with other JSF applications, CDI applications use facelets as their view technology. The following example illustrates a typical markup for a JSF page using CDI: <?xml version='1.0' encoding='UTF-8' ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"    "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html      >    <h:head>        <title>Create New Customer</title>    </h:head>    <h:body>        <h:form>            <h3>Create New Customer</h3>            <h:panelGrid columns="3">                <h:outputLabel for="firstName" value="First Name"/>                <h:inputText id="firstName" value="#{customer.firstName}"/>                <h:message for="firstName"/>                  <h:outputLabel for="middleName" value="Middle Name"/>                <h:inputText id="middleName"                  value="#{customer.middleName}"/>                <h:message for="middleName"/>                  <h:outputLabel for="lastName" value="Last Name"/>                <h:inputText id="lastName" value="#{customer.lastName}"/>                <h:message for="lastName"/>                  <h:outputLabel for="email" value="Email Address"/>                <h:inputText id="email" value="#{customer.email}"/>                <h:message for="email"/>                <h:panelGroup/>                <h:commandButton value="Submit"                  action="#{customerController.navigateToConfirmation}"/>            </h:panelGrid>        </h:form>    </h:body> </html> As we can see, the preceding markup doesn't look any different from the markup used for a JSF application that does not use CDI. The page renders as follows (shown after entering some data): In our page markup, we have JSF components that use Unified Expression Language expressions to bind themselves to CDI named bean properties and methods. 
Let's take a look at the customer bean first: package com.ensode.cdiintro.model;   import java.io.Serializable; import javax.enterprise.context.RequestScoped; import javax.inject.Named;   @Named @RequestScoped public class Customer implements Serializable {      private String firstName;    private String middleName;    private String lastName;    private String email;      public Customer() {    }      public String getFirstName() {        return firstName;    }      public void setFirstName(String firstName) {        this.firstName = firstName;    }      public String getMiddleName() {        return middleName;    }      public void setMiddleName(String middleName) {        this.middleName = middleName;    }      public String getLastName() {        return lastName;    }      public void setLastName(String lastName) {        this.lastName = lastName;    }      public String getEmail() {        return email;    }      public void setEmail(String email) {        this.email = email;    } } The @Named annotation marks this class as a CDI named bean. By default, the bean's name will be the class name with its first character switched to lowercase (in our example, the name of the bean is "customer", since the class name is Customer). We can override this behavior if we wish by passing the desired name to the value attribute of the @Named annotation, as follows: @Named(value="customerBean") A CDI named bean's methods and properties are accessible via facelets, just like regular JSF managed beans. Just like JSF managed beans, CDI named beans can have one of several scopes as listed in the following table. The preceding named bean has a scope of request, as denoted by the @RequestScoped annotation. Scope Annotation Description Request @RequestScoped Request scoped beans are shared through the duration of a single request. A single request could refer to an HTTP request, an invocation to a method in an EJB, a web service invocation, or sending a JMS message to a message-driven bean. Session @SessionScoped Session scoped beans are shared across all requests in an HTTP session. Each user of an application gets their own instance of a session scoped bean. Application @ApplicationScoped Application scoped beans live through the whole application lifetime. Beans in this scope are shared across user sessions. Conversation @ConversationScoped The conversation scope can span multiple requests, and is typically shorter than the session scope. Dependent @Dependent Dependent scoped beans are not shared. Any time a dependent scoped bean is injected, a new instance is created. As we can see, CDI has equivalent scopes to all JSF scopes. Additionally, CDI adds two additional scopes. The first CDI-specific scope is the conversation scope, which allows us to have a scope that spans across multiple requests, but is shorter than the session scope. The second CDI-specific scope is the dependent scope, which is a pseudo scope. A CDI bean in the dependent scope is a dependent object of another object; beans in this scope are instantiated when the object they belong to is instantiated and they are destroyed when the object they belong to is destroyed. Our application has two CDI named beans. We already discussed the customer bean. 
The other CDI named bean in our application is the controller bean: package com.ensode.cdiintro.controller;   import com.ensode.cdiintro.model.Customer; import javax.enterprise.context.RequestScoped; import javax.inject.Inject; import javax.inject.Named;   @Named @RequestScoped public class CustomerController {      @Inject    private Customer customer;      public Customer getCustomer() {        return customer;    }      public void setCustomer(Customer customer) {        this.customer = customer;    }      public String navigateToConfirmation() {        //In a real application we would        //Save customer data to the database here.          return "confirmation";    } } In the preceding class, an instance of the Customer class is injected at runtime; this is accomplished via the @Inject annotation. This annotation allows us to easily use dependency injection in CDI applications. Since the Customer class is annotated with the @RequestScoped annotation, a new instance of Customer will be injected for every request. The navigateToConfirmation() method in the preceding class is invoked when the user clicks on the Submit button on the page. The navigateToConfirmation() method works just like an equivalent method in a JSF managed bean would, that is, it returns a string and the application navigates to an appropriate page based on the value of that string. Like with JSF, by default, the target page's name with an .xhtml extension is the return value of this method. For example, if no exceptions are thrown in the navigateToConfirmation() method, the user is directed to a page named confirmation.xhtml: <?xml version='1.0' encoding='UTF-8' ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html      >    <h:head>        <title>Success</title>    </h:head>    <h:body>        New Customer created successfully.        <h:panelGrid columns="2" border="1" cellspacing="0">            <h:outputLabel for="firstName" value="First Name"/>            <h:outputText id="firstName" value="#{customer.firstName}"/>              <h:outputLabel for="middleName" value="Middle Name"/>            <h:outputText id="middleName"              value="#{customer.middleName}"/>              <h:outputLabel for="lastName" value="Last Name"/>            <h:outputText id="lastName" value="#{customer.lastName}"/>              <h:outputLabel for="email" value="Email Address"/>            <h:outputText id="email" value="#{customer.email}"/>          </h:panelGrid>    </h:body> </html> Again, there is nothing special we need to do to access the named beans properties from the preceding markup. It works just as if the bean was a JSF managed bean. The preceding page renders as follows: As we can see, CDI applications work just like JSF applications. However, CDI applications have several advantages over JSF, for example (as we mentioned previously) CDI beans have additional scopes not found in JSF. Additionally, using CDI allows us to decouple our Java code from the JSF API. Also, as we mentioned previously, CDI allows us to use session beans as named beans. Qualifiers In some instances, the type of bean we wish to inject into our code may be an interface or a Java superclass, but we may be interested in injecting a subclass or a class implementing the interface. For cases like this, CDI provides qualifiers we can use to indicate the specific type we wish to inject into our code. 
A CDI qualifier is an annotation that must be decorated with the @Qualifier annotation. This annotation can then be used to decorate the specific subclass or interface. In this section, we will develop a Premium qualifier for our customer bean; premium customers could get perks that are not available to regular customers, for example, discounts. Creating a CDI qualifier with NetBeans is very easy; all we need to do is go to File | New File, select the Contexts and Dependency Injection category, and select the Qualifier Type file type. In the next step in the wizard, we need to enter a name and a package for our qualifier. After these two simple steps, NetBeans generates the code for our qualifier: package com.ensode.cdiintro.qualifier;   import static java.lang.annotation.ElementType.TYPE; import static java.lang.annotation.ElementType.FIELD; import static java.lang.annotation.ElementType.PARAMETER; import static java.lang.annotation.ElementType.METHOD; import static java.lang.annotation.RetentionPolicy.RUNTIME; import java.lang.annotation.Retention; import java.lang.annotation.Target; import javax.inject.Qualifier;   @Qualifier @Retention(RUNTIME) @Target({METHOD, FIELD, PARAMETER, TYPE}) public @interface Premium { } Qualifiers are standard Java annotations. Typically, they have retention of runtime and can target methods, fields, parameters, or types. The only difference between a qualifier and a standard annotation is that qualifiers are decorated with the @Qualifier annotation. Once we have our qualifier in place, we need to use it to decorate the specific subclass or interface implementation, as shown in the following code: package com.ensode.cdiintro.model;   import com.ensode.cdiintro.qualifier.Premium; import javax.enterprise.context.RequestScoped; import javax.inject.Named;   @Named @RequestScoped @Premium public class PremiumCustomer extends Customer {      private Integer discountCode;      public Integer getDiscountCode() {        return discountCode;    }      public void setDiscountCode(Integer discountCode) {        this.discountCode = discountCode;    } } Once we have decorated the specific instance we need to qualify, we can use our qualifiers in the client code to specify the exact type of dependency we need: package com.ensode.cdiintro.controller;   import com.ensode.cdiintro.model.Customer; import com.ensode.cdiintro.model.PremiumCustomer; import com.ensode.cdiintro.qualifier.Premium;   import java.util.logging.Level; import java.util.logging.Logger; import javax.enterprise.context.RequestScoped; import javax.inject.Inject; import javax.inject.Named;   @Named @RequestScoped public class PremiumCustomerController {      private static final Logger logger = Logger.getLogger(            PremiumCustomerController.class.getName());    @Inject    @Premium    private Customer customer;      public String saveCustomer() {          PremiumCustomer premiumCustomer =          (PremiumCustomer) customer;          logger.log(Level.INFO, "Saving the following information n"                + "{0} {1}, discount code = {2}",                new Object[]{premiumCustomer.getFirstName(),                    premiumCustomer.getLastName(),                    premiumCustomer.getDiscountCode()});          //If this was a real application, we would have code to save        //customer data to the database here.          
return "premium_customer_confirmation";    } } Since we used our @Premium qualifier to decorate the customer field, an instance of the PremiumCustomer class is injected into that field. This is because this class is also decorated with the @Premium qualifier. As far as our JSF pages go, we simply access our named bean as usual using its name, as shown in the following code; <?xml version='1.0' encoding='UTF-8' ?> <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html      >    <h:head>        <title>Create New Premium Customer</title>    </h:head>    <h:body>        <h:form>            <h3>Create New Premium Customer</h3>            <h:panelGrid columns="3">                <h:outputLabel for="firstName" value="First Name"/>                 <h:inputText id="firstName"                    value="#{premiumCustomer.firstName}"/>                <h:message for="firstName"/>                  <h:outputLabel for="middleName" value="Middle Name"/>                <h:inputText id="middleName"                     value="#{premiumCustomer.middleName}"/>                <h:message for="middleName"/>                  <h:outputLabel for="lastName" value="Last Name"/>                <h:inputText id="lastName"                    value="#{premiumCustomer.lastName}"/>                <h:message for="lastName"/>                  <h:outputLabel for="email" value="Email Address"/>                <h:inputText id="email"                    value="#{premiumCustomer.email}"/>                <h:message for="email"/>                  <h:outputLabel for="discountCode" value="Discount Code"/>                <h:inputText id="discountCode"                    value="#{premiumCustomer.discountCode}"/>                <h:message for="discountCode"/>                   <h:panelGroup/>                <h:commandButton value="Submit"                      action="#{premiumCustomerController.saveCustomer}"/>            </h:panelGrid>        </h:form>    </h:body> </html> In this example, we are using the default name for our bean, which is the class name with the first letter switched to lowercase. Now, we are ready to test our application: After submitting the page, we can see the confirmation page. Stereotypes A CDI stereotype allows us to create new annotations that bundle up several CDI annotations. For example, if we need to create several CDI named beans with a scope of session, we would have to use two annotations in each of these beans, namely @Named and @SessionScoped. Instead of having to add two annotations to each of our beans, we could create a stereotype and annotate our beans with it. To create a CDI stereotype in NetBeans, we simply need to create a new file by selecting the Contexts and Dependency Injection category and the Stereotype file type. Then, we need to enter a name and package for our new stereotype. 
At this point, NetBeans generates the following code: package com.ensode.cdiintro.stereotype;   import static java.lang.annotation.ElementType.TYPE; import static java.lang.annotation.ElementType.FIELD; import static java.lang.annotation.ElementType.METHOD; import static java.lang.annotation.RetentionPolicy.RUNTIME; import java.lang.annotation.Retention; import java.lang.annotation.Target; import javax.enterprise.inject.Stereotype;   @Stereotype @Retention(RUNTIME) @Target({METHOD, FIELD, TYPE}) public @interface NamedSessionScoped { } Now, we simply need to add the CDI annotations that we want the classes annotated with our stereotype to use. In our case, we want them to be named beans and have a scope of session; therefore, we add the @Named and @SessionScoped annotations as shown in the following code: package com.ensode.cdiintro.stereotype;   import static java.lang.annotation.ElementType.TYPE; import static java.lang.annotation.ElementType.FIELD; import static java.lang.annotation.ElementType.METHOD; import static java.lang.annotation.RetentionPolicy.RUNTIME; import java.lang.annotation.Retention; import java.lang.annotation.Target; import javax.enterprise.context.SessionScoped; import javax.enterprise.inject.Stereotype; import javax.inject.Named;   @Named @SessionScoped @Stereotype @Retention(RUNTIME) @Target({METHOD, FIELD, TYPE}) public @interface NamedSessionScoped { } Now we can use our stereotype in our own code: package com.ensode.cdiintro.beans;   import com.ensode.cdiintro.stereotype.NamedSessionScoped; import java.io.Serializable;   @NamedSessionScoped public class StereotypeClient implements Serializable {      private String property1;    private String property2;      public String getProperty1() {        return property1;    }      public void setProperty1(String property1) {        this.property1 = property1;    }      public String getProperty2() {        return property2;    }      public void setProperty2(String property2) {        this.property2 = property2;    } } We annotated the StereotypeClient class with our NamedSessionScoped stereotype, which is equivalent to using the @Named and @SessionScoped annotations. Interceptor binding types One of the advantages of EJBs is that they allow us to easily perform aspect-oriented programming (AOP) via interceptors. CDI allows us to write interceptor binding types; this lets us bind interceptors to beans and the beans do not have to depend on the interceptor directly. Interceptor binding types are annotations that are themselves annotated with @InterceptorBinding. Creating an interceptor binding type in NetBeans involves creating a new file, selecting the Contexts and Dependency Injection category, and selecting the Interceptor Binding Type file type. Then, we need to enter a class name and select or enter a package for our new interceptor binding type. At this point, NetBeans generates the code for our interceptor binding type: package com.ensode.cdiintro.interceptorbinding;   import static java.lang.annotation.ElementType.TYPE; import static java.lang.annotation.ElementType.METHOD; import static java.lang.annotation.RetentionPolicy.RUNTIME; import java.lang.annotation.Inherited; import java.lang.annotation.Retention; import java.lang.annotation.Target; import javax.interceptor.InterceptorBinding;   @Inherited @InterceptorBinding @Retention(RUNTIME) @Target({METHOD, TYPE}) public @interface LoggingInterceptorBinding { } The generated code is fully functional; we don't need to add anything to it. 
In order to use our interceptor binding type, we need to write an interceptor and annotate it with our interceptor binding type, as shown in the following code: package com.ensode.cdiintro.interceptor;   import com.ensode.cdiintro.interceptorbinding.LoggingInterceptorBinding; import java.io.Serializable; import java.util.logging.Level; import java.util.logging.Logger; import javax.interceptor.AroundInvoke; import javax.interceptor.Interceptor; import javax.interceptor.InvocationContext;   @LoggingInterceptorBinding @Interceptor public class LoggingInterceptor implements Serializable{      private static final Logger logger = Logger.getLogger(            LoggingInterceptor.class.getName());      @AroundInvoke    public Object logMethodCall(InvocationContext invocationContext)            throws Exception {          logger.log(Level.INFO, new StringBuilder("entering ").append(                invocationContext.getMethod().getName()).append(                " method").toString());          Object retVal = invocationContext.proceed();          logger.log(Level.INFO, new StringBuilder("leaving ").append(                invocationContext.getMethod().getName()).append(                " method").toString());          return retVal;    } } As we can see, other than being annotated with our interceptor binding type, the preceding class is a standard interceptor similar to the ones we use with EJB session beans. In order for our interceptor binding type to work properly, we need to add a CDI configuration file (beans.xml) to our project. Then, we need to register our interceptor in beans.xml as follows: <?xml version="1.0" encoding="UTF-8"?> <beans               xsi_schemaLocation="http://>    <interceptors>          <class>        com.ensode.cdiintro.interceptor.LoggingInterceptor      </class>    </interceptors> </beans> To register our interceptor, we need to set bean-discovery-mode to all in the generated beans.xml and add the <interceptor> tag in beans.xml, with one or more nested <class> tags containing the fully qualified names of our interceptors. The final step before we can use our interceptor binding type is to annotate the class to be intercepted with our interceptor binding type: package com.ensode.cdiintro.controller;   import com.ensode.cdiintro.interceptorbinding.LoggingInterceptorBinding; import com.ensode.cdiintro.model.Customer; import com.ensode.cdiintro.model.PremiumCustomer; import com.ensode.cdiintro.qualifier.Premium; import java.util.logging.Level; import java.util.logging.Logger; import javax.enterprise.context.RequestScoped; import javax.inject.Inject; import javax.inject.Named;   @LoggingInterceptorBinding @Named @RequestScoped public class PremiumCustomerController {      private static final Logger logger = Logger.getLogger(            PremiumCustomerController.class.getName());    @Inject    @Premium    private Customer customer;      public String saveCustomer() {          PremiumCustomer premiumCustomer = (PremiumCustomer) customer;          logger.log(Level.INFO, "Saving the following information n"                + "{0} {1}, discount code = {2}",                new Object[]{premiumCustomer.getFirstName(),                    premiumCustomer.getLastName(),                    premiumCustomer.getDiscountCode()});          //If this was a real application, we would have code to save        //customer data to the database here.          return "premium_customer_confirmation";    } } Now, we are ready to use our interceptor. 
After executing the preceding code and examining the GlassFish log, we can see our interceptor binding type in action. The lines entering saveCustomer method and leaving saveCustomer method were added to the log by our interceptor, which was indirectly invoked by our interceptor binding type. Custom scopes In addition to providing several prebuilt scopes, CDI allows us to define our own custom scopes. This functionality is primarily meant for developers building frameworks on top of CDI, not for application developers. Nevertheless, NetBeans provides a wizard for us to create our own CDI custom scopes. To create a new CDI custom scope, we need to go to File | New File, select the Contexts and Dependency Injection category, and select the Scope Type file type. Then, we need to enter a package and a name for our custom scope. After clicking on Finish, our new custom scope is created, as shown in the following code: package com.ensode.cdiintro.scopes;   import static java.lang.annotation.ElementType.TYPE; import static java.lang.annotation.ElementType.FIELD; import static java.lang.annotation.ElementType.METHOD; import static java.lang.annotation.RetentionPolicy.RUNTIME; import java.lang.annotation.Inherited; import java.lang.annotation.Retention; import java.lang.annotation.Target; import javax.inject.Scope;   @Inherited @Scope // or @javax.enterprise.context.NormalScope @Retention(RUNTIME) @Target({METHOD, FIELD, TYPE}) public @interface CustomScope { } To actually use our scope in our CDI applications, we would need to create a custom context which, as mentioned previously, is primarily a concern for framework developers and not for Java EE application developers. Therefore, it is beyond the scope of this article. Interested readers can refer to JBoss Weld CDI for Java Platform, Ken Finnigan, Packt Publishing. (JBoss Weld is a popular CDI implementation and it is included with GlassFish.) Summary In this article, we covered NetBeans support for CDI, a new Java EE API introduced in Java EE 6. We provided an introduction to CDI and explained additional functionality that the CDI API provides over standard JSF. We also covered how to disambiguate CDI injected beans via CDI Qualifiers. Additionally, we covered how to group together CDI annotations via CDI stereotypes. We also, we saw how CDI can help us with AOP via interceptor binding types. Finally, we covered how NetBeans can help us create custom CDI scopes. Resources for Article: Further resources on this subject: Java EE 7 Performance Tuning and Optimization [article] Java EE 7 Developer Handbook [article] Java EE 7 with GlassFish 4 Application Server [article]

Multiplying Performance with Parallel Computing

Packt
06 Feb 2015
22 min read
In this article, by Aloysius Lim and William Tjhi, authors of the book R High Performance Programming, we will learn how to write and execute a parallel R code, where different parts of the code run simultaneously. So far, we have learned various ways to optimize the performance of R programs running serially, that is in a single process. This does not take full advantage of the computing power of modern CPUs with multiple cores. Parallel computing allows us to tap into all the computational resources available and to speed up the execution of R programs by many times. We will examine the different types of parallelism and how to implement them in R, and we will take a closer look at a few performance considerations when designing the parallel architecture of R programs. (For more resources related to this topic, see here.) Data parallelism versus task parallelism Many modern software applications are designed to run computations in parallel in order to take advantage of the multiple CPU cores available on almost any computer today. Many R programs can similarly be written in order to run in parallel. However, the extent of possible parallelism depends on the computing task involved. On one side of the scale are embarrassingly parallel tasks, where there are no dependencies between the parallel subtasks; such tasks can be made to run in parallel very easily. An example of this is, building an ensemble of decision trees in a random forest algorithm—randomized decision trees can be built independently from one another and in parallel across tens or hundreds of CPUs, and can be combined to form the random forest. On the other end of the scale are tasks that cannot be parallelized, as each step of the task depends on the results of the previous step. One such example is a depth-first search of a tree, where the subtree to search at each step depends on the path taken in previous steps. Most algorithms fall somewhere in between with some steps that must run serially and some that can run in parallel. With this in mind, careful thought must be given when designing a parallel code that works correctly and efficiently. Often an R program has some parts that have to be run serially and other parts that can run in parallel. Before making the effort to parallelize any of the R code, it is useful to have an estimate of the potential performance gains that can be achieved. Amdahl's law provides a way to estimate the best attainable performance gain when you convert a code from serial to parallel execution. It divides a computing task into its serial and potentially-parallel parts and states that the time needed to execute the task in parallel will be no less than this formula: T(n) = T(1)(P + (1-P)/n), where: T(n) is the time taken to execute the task using n parallel processes P is the proportion of the whole task that is strictly serial The theoretical best possible speed up of the parallel algorithm is thus: S(n) = T(1) / T(n) = 1 / (P + (1-P)/n) For example, given a task that takes 10 seconds to execute on one processor, where half of the task can be run in parallel, then the best possible time to run it on four processors is T(4) = 10(0.5 + (1-0.5)/4) = 6.25 seconds. The theoretical best possible speed up of the parallel algorithm with four processors is 1 / (0.5 + (1-0.5)/4) = 1.6x . The following figure shows you how the theoretical best possible execution time decreases as more CPU cores are added. Notice that the execution time reaches a limit that is just above five seconds. 
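If you want to reproduce the numbers behind this curve yourself, a quick sketch in R (not taken from the book's code; the variable names are invented for illustration) evaluates Amdahl's formula for this example, with T(1) = 10 seconds and P = 0.5:

# Best possible execution time under Amdahl's law for this example
T1 <- 10          # serial execution time in seconds
P  <- 0.5         # proportion of the task that is strictly serial
n  <- 1:16        # number of CPU cores
Tn <- T1 * (P + (1 - P) / n)
round(Tn, 2)      # starts at 10.00, 7.50, 6.67, 6.25, ... and approaches 5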
This corresponds to the half of the task that must be run serially, where parallelism does not help.

Best possible execution time versus number of CPU cores

In general, Amdahl's law means that the fastest execution time for any parallelized algorithm is limited by the time needed for the serial portions of the algorithm. Bear in mind that Amdahl's law provides only a theoretical estimate. It does not account for the overheads of parallel computing (such as starting and coordinating tasks) and assumes that the parallel portions of the algorithm are infinitely scalable. In practice, these factors might significantly limit the performance gains of parallelism, so use Amdahl's law only to get a rough estimate of the maximum speedup possible. There are two main classes of parallelism: data parallelism and task parallelism. Understanding these concepts helps to determine what types of tasks can be modified to run in parallel. In data parallelism, a dataset is divided into multiple partitions. Different partitions are distributed to multiple processors, and the same task is executed on each partition of data. Take, for example, the task of finding the maximum value in a vector dataset, say one that has one billion numeric data points. A serial algorithm to do this would look like the following code, which iterates over every element of the data in sequence to search for the largest value. (This code is intentionally verbose to illustrate how the algorithm works; in practice, the max() function in R, though also serial in nature, is much faster.)

serialmax <- function(data) {
    max = -Inf
    for (i in data) {
        if (i > max)
            max = i
    }
    return(max)
}

One way to parallelize this algorithm is to split the data into partitions. If we have a computer with eight CPU cores, we can split the data into eight partitions of 125 million numbers each. Here is the pseudocode for how to perform the same task in parallel:

# Run this in parallel across 8 CPU cores
part.results <- run.in.parallel(serialmax(data.part))
# Compute global max
global.max <- serialmax(part.results)

This pseudocode runs eight instances of serialmax() in parallel—one for each data partition—to find the local maximum value in each partition. Once all the partitions have been processed, the algorithm finds the global maximum value by finding the largest value among the local maxima. This parallel algorithm works because the global maximum of a dataset must be the largest of the local maxima from all the partitions. The following figure depicts data parallelism pictorially. The key behind data parallel algorithms is that each partition of data can be processed independently of the other partitions, and the results from all the partitions can be combined to compute the final results. This is similar to the mechanism of the MapReduce framework from Hadoop. Data parallelism allows algorithms to scale up easily as data volume increases—as more data is added to the dataset, more computing nodes can be added to a cluster to process new partitions of data.

Data parallelism

Other examples of computations and algorithms that can be run in a data parallel way include:

Element-wise matrix operations such as addition and subtraction: The matrices can be partitioned and the operations are applied to each pair of partitions.

Means: The sums and number of elements in each partition can be added to find the global sum and number of elements from which the mean can be computed.

K-means clustering: After data partitioning, the K centroids are distributed to all the partitions.
Finding the closest centroid is performed in parallel and independently across the partitions. The centroids are updated by first calculating the sums and the counts of their respective members in parallel, and then consolidating them in a single process to get the global means.

Frequent itemset mining using the Partition algorithm: In the first pass, the frequent itemsets are mined from each partition of data to generate a global set of candidate itemsets; in the second pass, the supports of the candidate itemsets are summed from each partition to filter out the globally infrequent ones.

The other main class of parallelism is task parallelism, where tasks are distributed to and executed on different processors in parallel. The tasks on each processor might be the same or different, and the data that they act on might also be the same or different. The key difference between task parallelism and data parallelism is that the data is not divided into partitions. An example of a task parallel algorithm performing the same task on the same data is the training of a random forest model. A random forest is a collection of decision trees built independently on the same data. During the training process for a particular tree, a random subset of the data is chosen as the training set, and the variables to consider at each branch of the tree are also selected randomly. Hence, even though the same data is used, the trees are different from one another. In order to train a random forest of, say, 100 decision trees, the workload could be distributed to a computing cluster with 100 processors, with each processor building one tree. All the processors perform the same task on the same data (or exact copies of the data), but the data is not partitioned. The parallel tasks can also be different. For example, computing a set of summary statistics on the same set of data can be done in a task parallel way. Each process can be assigned to compute a different statistic—the mean, standard deviation, percentiles, and so on. Pseudocode of a task parallel algorithm might look like this:

# Run 4 tasks in parallel across 4 cores
for (task in tasks)
    run.in.parallel(task)
# Collect the results of the 4 tasks
results <- collect.parallel.output()
# Continue processing after all 4 tasks are complete

Implementing data parallel algorithms

Several R packages allow code to be executed in parallel. The parallel package that comes with R provides the foundation for most parallel computing capabilities in other packages. Let's see how it works with an example. This example involves finding documents that match a regular expression. Regular expression matching is a fairly computationally expensive task, depending on the complexity of the regular expression. The corpus, or set of documents, for this example is a sample of the Reuters-21578 dataset for the topic corporate acquisitions (acq) from the tm package. Because this dataset contains only 50 documents, they are replicated 100,000 times to form a corpus of 5 million documents so that parallelizing the code will lead to meaningful savings in execution times.

library(tm)
data("acq")
textdata <- rep(sapply(content(acq), content), 1e5)

The task is to find documents that match the regular expression \d+(,\d+)? mln dlrs, which represents monetary amounts in millions of dollars. In this regular expression, \d+ matches a string of one or more digits, and (,\d+)? optionally matches a comma followed by one or more digits.
For example, the strings 12 mln dlrs, 1,234 mln dlrs and 123,456,789 mln dlrs will match the regular expression. First, we will measure the execution time to find these documents serially with grepl():

pattern <- "\\d+(,\\d+)? mln dlrs"
system.time(res1 <- grepl(pattern, textdata))
##   user  system elapsed
## 65.601   0.114  65.721

Next, we will modify the code to run in parallel and measure the execution time on a computer with four CPU cores:

library(parallel)
detectCores()
## [1] 4
cl <- makeCluster(detectCores())
part <- clusterSplit(cl, seq_along(textdata))
text.partitioned <- lapply(part, function(p) textdata[p])
system.time(res2 <- unlist(
    parSapply(cl, text.partitioned, grepl, pattern = pattern)))
##   user  system elapsed
##  3.708   8.007  50.806
stopCluster(cl)

In this code, the detectCores() function reveals how many CPU cores are available on the machine where this code is executed. Before running any parallel code, makeCluster() is called to create a local cluster of processing nodes with all four CPU cores. The corpus is then split into four partitions using the clusterSplit() function to determine the ideal split of the corpus such that each partition has roughly the same number of documents. The actual parallel execution of grepl() on each partition of the corpus is carried out by the parSapply() function. Each processing node in the cluster is given a copy of the partition of data that it is supposed to process along with the code to be executed and other variables that are needed to run the code (in this case, the pattern argument). When all four processing nodes have completed their tasks, the results are combined in a similar fashion to sapply(). Finally, the cluster is destroyed by calling stopCluster(). It is good practice to ensure that stopCluster() is always called in production code, even if an error occurs during execution. This can be done as follows:

doSomethingInParallel <- function(...) {
    cl <- makeCluster(...)
    on.exit(stopCluster(cl))
    # do something
}

In this example, running the task in parallel on four processors resulted in a 23 percent reduction in the execution time. This is not in proportion to the amount of compute resources used to perform the task; with four times as many CPU cores working on it, a perfectly parallelizable task might experience as much as a 75 percent runtime reduction. However, remember Amdahl's law—the speed of parallel code is limited by the serial parts, which include the overheads of parallelization. In this case, calling makeCluster() with the default arguments creates a socket-based cluster. When such a cluster is created, additional copies of R are run as workers. The workers communicate with the master R process using network sockets, hence the name. The worker R processes are initialized with the relevant packages loaded, and data partitions are serialized and sent to each worker process. These overheads can be significant, especially in data parallel algorithms where large volumes of data need to be transferred to the worker processes. Besides parSapply(), parallel also provides the parApply() and parLapply() functions; these functions are analogous to the standard sapply(), apply(), and lapply() functions, respectively. In addition, the parLapplyLB() and parSapplyLB() functions provide load balancing, which is useful when the execution of each parallel task takes variable amounts of time. Finally, parRapply() and parCapply() are parallel row and column apply() functions for matrices.
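To get a feel for the load-balancing variants just mentioned, here is a minimal sketch (not from the book; the task durations are invented purely for illustration) that uses parLapplyLB() to spread tasks of uneven length across the workers:

library(parallel)

cl <- makeCluster(detectCores())
task.durations <- c(0.2, 1.0, 0.1, 0.8, 0.3, 0.6)   # invented workloads

# parLapplyLB() hands the next task to whichever worker finishes first,
# instead of pre-assigning an equal share of tasks to every worker.
res <- parLapplyLB(cl, task.durations, function(d) {
    Sys.sleep(d)    # simulate work that takes a variable amount of time
    d * 2
})

stopCluster(cl)
unlist(res)

With tasks that all take about the same time, the plain parLapply() would perform just as well; load balancing only pays off when the task lengths vary.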
On non-Windows systems, parallel supports another type of cluster that often incurs lower overheads: forked clusters. In these clusters, new worker processes are forked from the parent R process with a copy of the data. However, the data is not actually copied in memory unless it is modified by a child process. This means that, compared to socket-based clusters, initializing child processes is quicker and the memory usage is often lower. Another advantage of using forked clusters is that parallel provides a convenient and concise way to run tasks on them via the mclapply(), mcmapply(), and mcMap() functions. (These functions start with mc because they were originally a part of the multicore package.) There is no need to explicitly create and destroy the cluster, as these functions do this automatically. We can simply call mclapply() and state the number of worker processes to fork via the mc.cores argument:

system.time(res3 <- unlist(
    mclapply(text.partitioned, grepl, pattern = pattern,
             mc.cores = detectCores())))
##    user  system elapsed
## 127.012   0.350  33.264

This shows a 49 percent reduction in execution time compared to the serial version, and a 35 percent reduction compared to parallelizing with a socket-based cluster. For this example, forked clusters provide the best performance. Due to differences in system configuration, you might see very different results when you try the examples in your own environment. When you develop parallel code, it is important to test the code in an environment that is similar to the one that it will eventually run in.

Implementing task parallel algorithms

Let's now see how to implement a task parallel algorithm using both socket-based and forked clusters. We will look at how to run the same task and different tasks on workers in a cluster.

Running the same task on workers in a cluster

To demonstrate how to run the same task on a cluster, the task for this example is to generate 500 million Poisson random numbers. We will do this by using L'Ecuyer's combined multiple-recursive generator, which is the only random number generator in base R that supports multiple streams to generate random numbers in parallel. The random number generator is selected by calling the RNGkind() function. We cannot just use any random number generator in parallel because the randomness of the data depends on the algorithm used to generate random data and the seed value given to each parallel task. Most other algorithms were not designed to produce random numbers in multiple parallel streams, and might produce multiple highly correlated streams of numbers, or worse, multiple identical streams! First, we will measure the execution time of the serial algorithm:

RNGkind("L'Ecuyer-CMRG")
nsamples <- 5e8
lambda <- 10
system.time(random1 <- rpois(nsamples, lambda))
##   user  system elapsed
## 51.905   0.636  52.544

To generate the random numbers on a cluster, we will first distribute the task evenly among the workers. In the following code, the integer vector samples.per.process contains the number of random numbers that each worker needs to generate on a four-core CPU. The seq() function produces ncores+1 numbers evenly distributed between 0 and nsamples, with the first number being 0 and the next ncores numbers indicating the approximate cumulative number of samples across the worker processes.
The round() function rounds off these numbers into integers and diff() computes the difference between them to give the number of random numbers that each worker process should generate.

ncores <- detectCores()
cl <- makeCluster(ncores)
samples.per.process <-
    diff(round(seq(0, nsamples, length.out = ncores+1)))

Before we can generate the random numbers on a cluster, each worker needs a different seed from which it can generate a stream of random numbers. The seeds need to be set on all the workers before running the task, to ensure that all the workers generate different random numbers. For a socket-based cluster, we can call clusterSetRNGStream() to set the seeds for the workers, and then run the random number generation task on the cluster. When the task is completed, we call stopCluster() to shut down the cluster:

clusterSetRNGStream(cl)
system.time(random2 <- unlist(
    parLapply(cl, samples.per.process, rpois,
              lambda = lambda)))
##   user  system elapsed
##  5.006   3.000  27.436
stopCluster(cl)

Using four parallel processes in a socket-based cluster reduces the execution time by 48 percent. The performance of this type of cluster for this example is better than that of the data parallel example because there is less data to copy to the worker processes—only an integer that indicates how many random numbers to generate. Next, we run the same task on a forked cluster (again, this is not supported on Windows). The mclapply() function can set the random number seeds for each worker for us when the mc.set.seed argument is set to TRUE; we do not need to call clusterSetRNGStream(). Otherwise, the code is similar to that of the socket-based cluster:

system.time(random3 <- unlist(
    mclapply(samples.per.process, rpois,
             lambda = lambda,
             mc.set.seed = TRUE, mc.cores = ncores)))
##   user  system elapsed
## 76.283   7.272  25.052

On our test machine, the forked cluster runs slightly faster than the socket-based cluster, but the times are close, indicating that the overheads for this task are similar for both types of clusters.
cores <- detectCores()cl <- makeCluster(cores)calls <- list(pois = list("rpois", list(n = nsamples,                                        lambda = pois.lambda)),              unif = list("runif", list(n = nsamples)),              norm = list("rnorm", list(n = nsamples)),              exp = list("rexp", list(n = nsamples)))clusterSetRNGStream(cl)system.time(    random2 <- parLapply(cl, calls,                         function(call) {                             do.call(call[[1]], call[[2]])                         }))##  user  system elapsed ## 2.185   1.629  10.403stopCluster(cl) On forked clusters on non-Windows machines, the mcparallel() and mccollect() functions offer a more intuitive way to run different tasks on different workers. For each task, mcparallel() sends the given task to an available worker. Once all the workers have been assigned their tasks, mccollect() waits for the workers to complete their tasks and collects the results from all the workers. mc.reset.stream()system.time({    jobs <- list()    jobs[[1]] <- mcparallel(rpois(nsamples, pois.lambda),                            "pois", mc.set.seed = TRUE)    jobs[[2]] <- mcparallel(runif(nsamples),                            "unif", mc.set.seed = TRUE)    jobs[[3]] <- mcparallel(rnorm(nsamples),                            "norm", mc.set.seed = TRUE)    jobs[[4]] <- mcparallel(rexp(nsamples),                            "exp", mc.set.seed = TRUE)    random3 <- mccollect(jobs)})##   user  system elapsed ## 14.535   3.569   7.97 Notice that we also had to call mc.reset.stream() to set the seeds for random number generation in each worker. This was not necessary when we used mclapply(), which calls mc.reset.stream() for us. However, mcparallel() does not, so we need to call it ourselves. Summary In this article, we learned about two classes of parallelism: data parallelism and task parallelism. Data parallelism is good for tasks that can be performed in parallel on partitions of a dataset. The dataset to be processed is split into partitions and each partition is processed on a different worker processes. Task parallelism, on the other hand, divides a set of similar or different tasks to amongst the worker processes. In either case, Amdahl's law states that the maximum improvement in speed that can be achieved by parallelizing code is limited by the proportion of that code that can be parallelized. Resources for Article: Further resources on this subject: Using R for Statistics, Research, and Graphics [Article] Learning Data Analytics with R and Hadoop [Article] Aspects of Data Manipulation in R [Article]

The Chain of Responsibility Pattern

Packt
05 Feb 2015
12 min read
In this article by Sakis Kasampalis, author of the book Mastering Python Design Patterns, we will see a detailed description of the Chain of Responsibility design pattern with the help of a real-life example as well as a software example. Also, its use cases and implementation are discussed. (For more resources related to this topic, see here.) When developing an application, most of the time we know which method should satisfy a particular request in advance. However, this is not always the case. For example, we can think of any broadcast computer network, such as the original Ethernet implementation [j.mp/wikishared]. In broadcast computer networks, all requests are sent to all nodes (broadcast domains are excluded for simplicity), but only the nodes that are interested in a sent request process it. All computers that participate in a broadcast network are connected to each other using a common medium such as the cable that connects the three nodes in the following figure: If a node is not interested or does not know how to handle a request, it can perform the following actions: Ignore the request and do nothing Forward the request to the next node The way in which the node reacts to a request is an implementation detail. However, we can use the analogy of a broadcast computer network to understand what the chain of responsibility pattern is all about. The Chain of Responsibility pattern is used when we want to give a chance to multiple objects to satisfy a single request, or when we don't know which object (from a chain of objects) should process a specific request in advance. The principle is the same as the following: There is a chain (linked list, tree, or any other convenient data structure) of objects. We start by sending a request to the first object in the chain. The object decides whether it should satisfy the request or not. The object forwards the request to the next object. This procedure is repeated until we reach the end of the chain. At the application level, instead of talking about cables and network nodes, we can focus on objects and the flow of a request. The following figure, courtesy of a title="Scala for Machine Learning" www.sourcemaking.com [j.mp/smchain], shows how the client code sends a request to all processing elements (also known as nodes or handlers) of an application: Note that the client code only knows about the first processing element, instead of having references to all of them, and each processing element only knows about its immediate next neighbor (called the successor), not about every other processing element. This is usually a one-way relationship, which in programming terms means a singly linked list in contrast to a doubly linked list; a singly linked list does not allow navigation in both ways, while a doubly linked list allows that. This chain organization is used for a good reason. It achieves decoupling between the sender (client) and the receivers (processing elements) [GOF95, page 254]. A real-life example ATMs and, in general, any kind of machine that accepts/returns banknotes or coins (for example, a snack vending machine) use the chain of responsibility pattern. There is always a single slot for all banknotes, as shown in the following figure, courtesy of www.sourcemaking.com: When a banknote is dropped, it is routed to the appropriate receptacle. When it is returned, it is taken from the appropriate receptacle [j.mp/smchain], [j.mp/c2chain]. 
We can think of the single slot as the shared communication medium and the different receptacles as the processing elements. The result contains cash from one or more receptacles. For example, in the preceding figure, we see what happens when we request $175 from the ATM. A software example I tried to find some good examples of Python applications that use the Chain of Responsibility pattern but I couldn't, most likely because Python programmers don't use this name. So, my apologies, but I will use other programming languages as a reference. The servlet filters of Java are pieces of code that are executed before an HTTP request arrives at a target. When using servlet filters, there is a chain of filters. Each filter performs a different action (user authentication, logging, data compression, and so forth), and either forwards the request to the next filter until the chain is exhausted, or it breaks the flow if there is an error (for example, the authentication failed three consecutive times) [j.mp/soservl]. Apple's Cocoa and Cocoa Touch frameworks use Chain of Responsibility to handle events. When a view receives an event that it doesn't know how to handle, it forwards the event to its superview. This goes on until a view is capable of handling the event or the chain of views is exhausted [j.mp/chaincocoa]. Use cases By using the Chain of Responsibility pattern, we give a chance to a number of different objects to satisfy a specific request. This is useful when we don't know which object should satisfy a request in advance. An example is a purchase system. In purchase systems, there are many approval authorities. One approval authority might be able to approve orders up to a certain value, let's say $100. If the order is more than $100, the order is sent to the next approval authority in the chain that can approve orders up to $200, and so forth. Another case where Chain of Responsibility is useful is when we know that more than one object might need to process a single request. This is what happens in an event-based programming. A single event such as a left mouse click can be caught by more than one listener. It is important to note that the Chain of Responsibility pattern is not very useful if all the requests can be taken care of by a single processing element, unless we really don't know which element that is. The value of this pattern is the decoupling that it offers. Instead of having a many-to-many relationship between a client and all processing elements (and the same is true regarding the relationship between a processing element and all other processing elements), a client only needs to know how to communicate with the start (head) of the chain. The following figure demonstrates the difference between tight and loose coupling. The idea behind loosely coupled systems is to simplify maintenance and make it easier for us to understand how they function [j.mp/loosecoup]: Implementation There are many ways to implement Chain of Responsibility in Python, but my favorite implementation is the one by Vespe Savikko [j.mp/savviko]. Vespe's implementation uses dynamic dispatching in a Pythonic style to handle requests [j.mp/ddispatch]. Let's implement a simple event-based system using Vespe's implementation as a guide. The following is the UML class diagram of the system: The Event class describes an event. 
We'll keep it simple, so in our case an event has only name:

class Event:
    def __init__(self, name):
        self.name = name

    def __str__(self):
        return self.name

The Widget class is the core class of the application. The parent aggregation shown in the UML diagram indicates that each widget can have a reference to a parent object, which, by convention, we assume is a Widget instance. Note, however, that according to the rules of inheritance, an instance of any of the subclasses of Widget (for example, an instance of MsgText) is also an instance of Widget. The default value of parent is None:

class Widget:
    def __init__(self, parent=None):
        self.parent = parent

The handle() method uses dynamic dispatching through hasattr() and getattr() to decide who is the handler of a specific request (event). If the widget that is asked to handle an event does not support it, there are two fallback mechanisms. If the widget has a parent, then the handle() method of the parent is executed. If the widget has no parent but a handle_default() method, handle_default() is executed:

    def handle(self, event):
        handler = 'handle_{}'.format(event)
        if hasattr(self, handler):
            method = getattr(self, handler)
            method(event)
        elif self.parent:
            self.parent.handle(event)
        elif hasattr(self, 'handle_default'):
            self.handle_default(event)

At this point, you might have realized why the Widget and Event classes are only associated (no aggregation or composition relationships) in the UML class diagram. The association is used to show that the Widget class "knows" about the Event class but does not have any strict references to it, since an event needs to be passed only as a parameter to handle(). MainWindow, MsgText, and SendDialog are all widgets with different behaviors. Not all of these three widgets are expected to be able to handle the same events, and even if they can handle the same event, they might behave differently. MainWindow can handle only the close and default events:

class MainWindow(Widget):
    def handle_close(self, event):
        print('MainWindow: {}'.format(event))

    def handle_default(self, event):
        print('MainWindow Default: {}'.format(event))

SendDialog can handle only the paint event:

class SendDialog(Widget):
    def handle_paint(self, event):
        print('SendDialog: {}'.format(event))

Finally, MsgText can handle only the down event:

class MsgText(Widget):
    def handle_down(self, event):
        print('MsgText: {}'.format(event))

The main() function shows how we can create a few widgets and events, and how the widgets react to those events. All events are sent to all the widgets. Note the parent relationship of each widget. The sd object (an instance of SendDialog) has as its parent the mw object (an instance of MainWindow). However, not all objects need to have a parent that is an instance of MainWindow.
For example, the msg object (an instance of MsgText) has the sd object as a parent: def main(): mw = MainWindow() sd = SendDialog(mw) msg = MsgText(sd) for e in ('down', 'paint', 'unhandled', 'close'): evt = Event(e) print('nSending event -{}- to MainWindow'.format(evt)) mw.handle(evt) print('Sending event -{}- to SendDialog'.format(evt)) sd.handle(evt) print('Sending event -{}- to MsgText'.format(evt)) msg.handle(evt) The following is the full code of the example (chain.py): class Event: def __init__(self, name): self.name = name def __str__(self): return self.name class Widget: def __init__(self, parent=None): self.parent = parent def handle(self, event): handler = 'handle_{}'.format(event) if hasattr(self, handler): method = getattr(self, handler) method(event) elif self.parent: self.parent.handle(event) elif hasattr(self, 'handle_default'): self.handle_default(event) class MainWindow(Widget): def handle_close(self, event): print('MainWindow: {}'.format(event)) def handle_default(self, event): print('MainWindow Default: {}'.format(event)) class SendDialog(Widget): def handle_paint(self, event): print('SendDialog: {}'.format(event)) class MsgText(Widget): def handle_down(self, event): print('MsgText: {}'.format(event)) def main(): mw = MainWindow() sd = SendDialog(mw) msg = MsgText(sd) for e in ('down', 'paint', 'unhandled', 'close'): evt = Event(e) print('nSending event -{}- to MainWindow'.format(evt)) mw.handle(evt) print('Sending event -{}- to SendDialog'.format(evt)) sd.handle(evt) print('Sending event -{}- to MsgText'.format(evt)) msg.handle(evt) if __name__ == '__main__': main() Executing chain.py gives us the following results: >>> python3 chain.py Sending event -down- to MainWindow MainWindow Default: down Sending event -down- to SendDialog MainWindow Default: down Sending event -down- to MsgText MsgText: down Sending event -paint- to MainWindow MainWindow Default: paint Sending event -paint- to SendDialog SendDialog: paint Sending event -paint- to MsgText SendDialog: paint Sending event -unhandled- to MainWindow MainWindow Default: unhandled Sending event -unhandled- to SendDialog MainWindow Default: unhandled Sending event -unhandled- to MsgText MainWindow Default: unhandled Sending event -close- to MainWindow MainWindow: close Sending event -close- to SendDialog MainWindow: close Sending event -close- to MsgText MainWindow: close There are some interesting things that we can see in the output. For instance, sending a down event to MainWindow ends up being handled by the default MainWindow handler. Another nice case is that although a close event cannot be handled directly by SendDialog and MsgText, all the close events end up being handled properly by MainWindow. That's the beauty of using the parent relationship as a fallback mechanism. If you want to spend some more creative time on the event example, you can replace the dumb print statements and add some actual behavior to the listed events. Of course, you are not limited to the listed events. Just add your favorite event and make it do something useful! Another exercise is to add a MsgText instance during runtime that has MainWindow as the parent. Is this hard? Do the same for an event (add a new event to an existing widget). Which is harder? Summary In this article, we covered the Chain of Responsibility design pattern. This pattern is useful to model requests / handle events when the number and type of handlers isn't known in advance. 
Examples of systems that fit well with Chain of Responsibility are event-based systems, purchase systems, and shipping systems. In the Chain Of Responsibility pattern, the sender has direct access to the first node of a chain. If the request cannot be satisfied by the first node, it forwards to the next node. This continues until either the request is satisfied by a node or the whole chain is traversed. This design is used to achieve loose coupling between the sender and the receiver(s). ATMs are an example of Chain Of Responsibility. The single slot that is used for all banknotes can be considered the head of the chain. From here, depending on the transaction, one or more receptacles is used to process the transaction. The receptacles can be considered the processing elements of the chain. Java's servlet filters use the Chain of Responsibility pattern to perform different actions (for example, compression and authentication) on an HTTP request. Apple's Cocoa frameworks use the same pattern to handle events such as button presses and finger gestures. Resources for Article: Further resources on this subject: Exploring Model View Controller [Article] Analyzing a Complex Dataset [Article] Automating Your System Administration and Deployment Tasks Over SSH [Article]

OpenLayers' Key Components

Packt
04 Feb 2015
13 min read
In this article by, Thomas Gratier, Paul Spencer, and Erik Hazzard, authors of the book OpenLayers 3 Beginner's Guide, we will see the various components of OpenLayers and a short description about them. (For more resources related to this topic, see here.) The OpenLayers library provides web developers with components useful for building web mapping applications. Following the principles of object-oriented design, these components are called classes. The relationship between all the classes in the OpenLayers library is part of the deliberate design, or architecture, of the library. There are two types of relationships that we, as developers using the library, need to know about: relationships between classes and inheritance between classes. Relationships between classes describe how classes, or more specifically, instances of classes, are related to each other. There are several different conceptual ways that classes can be related, but basically a relationship between two classes implies that one of the class uses the other in some way, and often vice-versa. Inheritance between classes shows how behavior of classes, and their relationships are shared with other classes. Inheritance is really just a way of sharing common behavior between several different classes. We'll start our discussion of the key components of OpenLayers by focusing on the first of these – the relationship between classes. We'll start by looking at the Map class – ol.Map. Its all about the map Instances of the Map class are at the center of every OpenLayers application. These objects are instances of the ol.Map class and they use instances of other classes to do their job, which is to put an interactive map onto a web page. Almost every other class in the OpenLayers is related to the Map class in some direct or indirect relationship. The following diagram illustrates the direct relationships that we are most interested in: The preceding diagram shows the most important relationships between the Map class and other classes it uses to do its job. It tells us several important things: A map has 0 or 1 view instances and it uses the name view to refer to it. A view may be associated with multiple maps, however. A map may have 0 or more instances of layers managed by a Collection class and a layer may be associated with 0 or one Map class. The Map class has a member variable named layers that it uses to refer to this collection. A map may have 0 or more instances of overlays managed by a Collection class and an overlay may be associated with 0 or one Map class. The Map class has a member variable named overlays that it uses to refer to this collection. A map may have 0 or more instances of controls managed by a class called ol.Collection and controls may be associated with 0 or one Map class. The Map class has a member variable named controls that it uses to refer to this collection. A map may have 0 or more instances of interactions managed by a Collection class and an interaction may be associated with 0 or one Map class. The Map class has a member variable named interactions that it uses to refer to this collection. Although these are not the only relationships between the Map class and other classes, these are the ones we'll be working with the most. The View class (ol.View) manages information about the current position of the Map class. If you are familiar with the programming concept of MVC (Model-View-Controller), be aware that the view class is not a View in the MVC sense. 
It does not provide the presentation layer for the map, rather it acts more like a controller (although there is not an exact parallel because OpenLayers was not designed with MVC in mind). The Layer class (ol.layer.Base) is the base class for classes that provide data to the map to be rendered. The Overlay class (ol.Overlay) is an interactive visual element like a control, but it is tied to a specific geographic position. The Control class (ol.control.Control) is the base class for a group of classes that collectively provide the ability to a user to interact with the Map. Controls have a visible user interface element (such as a button or a form input element) with which the user interacts. The Interaction class (ol.interaction.Interaction) is the base class for a group of classes that also allow the user to interact with the map, but differ from controls in which they have no visible user interface element. For example, the DragPan interaction allows the user to click on and drag the map to pan around. Controlling the Map's view The OpenLayers view class, ol.View, represents a simple two-dimensional view of the world. It is responsible for determining where, and to some degree how, the user is looking at the world. It is responsible for managing the following information: The geographic center of the map The resolution of the map, which is to say how much of the map we can see around the center The rotation of the map Although you can create a map without a view, it won't display anything until a view is assigned to it. Every map must have a view in order to display any map data at all. However, a view may be shared between multiple instances of the Map class. This effectively synchronizes the center, resolution, and rotation of each of the maps. In this way, you can create two or more maps in different HTML containers on a web page, even showing different information, and have them look at the same world position. Changing the position of any of the maps (for instance, by dragging one) automatically updates the other maps at the same time! Displaying map content So, if the view is responsible for managing where the user is looking in the world, which component is responsible for determining what the user sees there? That's the job of layers and overlays. A layer provides access to a source of geospatial data. There are two basic kinds of layers, that is, raster and vector layers: In computer graphics, the term raster (raster graphics) refers to a digital image. In OpenLayers, a raster layer is one that displays images in your map at specific geographic locations. In computer graphics, the term vector (vector graphics) refers to images that are defined in terms of geometric shapes, such as points, lines, and polygons—or mathematic formulae such as Bézier curves. In OpenLayers, a vector layer reads geospatial data from vector data (such as a KML file) and the data can then be drawn onto the map. Layers are not the only way to display spatial information on the map. The other way is to use an overlay. We can create instances of ol.Overlay and add them to the map at specific locations. The overlay then positions its content (an HTML element) on the map at the specified location. The HTML element can then be used like any other HTML element. The most common use of overlays is to display spatially relevant information in a pop-up dialog in response to the mouse moving over, or clicking on a geographic feature. 
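To make these components concrete, here is a minimal sketch of how they might fit together in code. It is not taken from the book's examples: it assumes a page that loads the OpenLayers 3 library as the global ol object and that contains two hypothetical elements, a div with id "map" and a div with id "popup".

// A map with a view, one raster layer, and an overlay tied to a coordinate
var map = new ol.Map({
    target: 'map',                      // id of the map's HTML container
    layers: [
        new ol.layer.Tile({             // a raster layer...
            source: new ol.source.OSM() // ...drawing OpenStreetMap tiles
        })
    ],
    view: new ol.View({
        center: [0, 0],                 // where the user is looking
        zoom: 2                         // how much of the world is visible
    })
});

// An overlay positions an ordinary HTML element at a geographic location
var popup = new ol.Overlay({
    element: document.getElementById('popup'),
    position: [0, 0]
});
map.addOverlay(popup);

Sharing the same ol.View instance between two maps, as described above, is just a matter of passing the same view object to both map constructors.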
Interacting with the map As mentioned earlier, the two components that allow users to interact with the map are Interactions and Controls. Let's look at them in a bit more detail. Using interactions Interactions are components that allow the user to interact with the map via some direct input, usually by using the mouse (or a finger with a touch screen). Interactions have no visible user interface. The default set of interactions are: ol.interaction.DoubleClickZoom: If you double-click the left mouse button, the map will zoom in by a factor of 2 ol.interaction.DragPan: If you drag the map, it will pan as you move the mouse ol.interaction.PinchRotate: On touch-enabled devices, placing two fingers on the device and rotating them in a circular motion will rotate the map ol.interaction.PinchZoom: On touch-enabled devices, placing two fingers on the device and pinching them together or spreading them apart will zoom the map out and in respectively ol.interaction.KeyboardPan: You can use the arrow keys to pan the map in the direction of the arrows ol.interaction.KeyboardZoom: You can use the + and – keys to zoom in and out ol.interaction.MouseWheelZoom: You can use the scroll wheel on a mouse to zoom the map in and out ol.interaction.DragZoom: If you hold the Shift key while dragging on map, a rectangular region will be drawn and when you release the mouse button, you will zoom into that area Controls Controls are components that allow the user to modify the map state via some visible user interface element, such as a button. In the examples we've seen so far, we've seen zoom buttons in the top-left corner of the map and an attribution control in the bottom-right corner of the map. In fact, the default controls are: ol.control.Zoom: This displays the zoom buttons in the top-left corner. ol.control.Rotate: This is a button to reset rotation to 0; by default, this is only displayed when the map's rotation is not 0. Ol.control.Attribution: This displays attribution text for the layers currently visible in the map. By default, the attributions are collapsed to a single icon in the bottom-right corner and clicking the icon will show the attributions. This concludes our brief overview of the central components of an OpenLayers application. We saw that the Map class is at the center of everything and there are some key components—the view, layers, overlays, interactions, and controls—that it uses to accomplish its job of putting an interactive map onto a web page. At the beginning of this article, we talked about both relationships and inheritance. So far, we've only covered the relationships. In the next section, we'll show the inheritance architecture of the key components and introduce three classes that have been working behind the scenes to make everything work. OpenLayers' super classes In this section, we will look at three classes in the OpenLayers library that we won't often work directly with, but which provide an enormous amount of functionality to most of the other classes in the library. The first two classes, Observable and Object, are at the base of the inheritance tree for OpenLayers—the so-called super classes that most classes inherit from. The third class, Collection, isn't actually a super class but is used as the basis for many relationships between classes in OpenLayers—we've already seen that the Map class relationships with layers, overlays, interactions, and controls are managed by instances of the Collection class. 
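As a small preview of how these pieces surface in application code (a sketch only, reusing the map from the earlier example and assuming someLayer is any layer instance you have created), observing the map's layer collection might look like this:

// map.getLayers() returns the ol.Collection holding the map's layers
var layers = map.getLayers();

// Collections are Observables, so we can listen for the 'add' event
layers.on('add', function(event) {
    console.log('Layer added; collection length is now ' + layers.getLength());
});

// Pushing onto the collection triggers the listener above
layers.push(someLayer);   // someLayer is assumed to be an existing layer

The add, remove, and length-related events used here are covered in more detail in the Collection sections that follow.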
This concludes our brief overview of the central components of an OpenLayers application. We saw that the Map class is at the center of everything and that there are some key components—the view, layers, overlays, interactions, and controls—that it uses to accomplish its job of putting an interactive map onto a web page. At the beginning of this article, we talked about both relationships and inheritance. So far, we've only covered the relationships. In the next section, we'll show the inheritance architecture of the key components and introduce three classes that have been working behind the scenes to make everything work.

OpenLayers' super classes

In this section, we will look at three classes in the OpenLayers library that we won't often work with directly, but which provide an enormous amount of functionality to most of the other classes in the library. The first two classes, Observable and Object, are at the base of the inheritance tree for OpenLayers—the so-called super classes that most classes inherit from. The third class, Collection, isn't actually a super class but is used as the basis for many relationships between classes in OpenLayers—we've already seen that the Map class's relationships with layers, overlays, interactions, and controls are managed by instances of the Collection class.

Before we jump into the details, take a look at the inheritance diagram for the components we've already discussed. As you can see, the Observable class, ol.Observable, is the base class for every component of OpenLayers that we've seen so far. In fact, there are very few classes in the OpenLayers library that do not inherit from the Observable class or one of its subclasses. Similarly, the Object class, ol.Object, is the base class for many classes in the library and is itself a subclass of Observable.

The Observable and Object classes aren't very glamorous. You can't see them in action, and they don't do anything very exciting from a user's perspective. What they do, though, is provide two common sets of behavior that you can expect to be able to use on almost every object you create or access through the OpenLayers library: event management and Key-Value Observing (KVO).

Event management with the Observable class

An event is basically what it sounds like—something happening. Events are a fundamental part of how various components of OpenLayers—the map, layers, controls, and pretty much everything else—communicate with each other. It is often important to know when something has happened and to react to it. One type of event that is very useful is a user-generated event, such as a mouse click or a touch on a mobile device's screen. Knowing when the user has clicked and dragged on the map allows code to react to this and move the map to simulate panning. Other types of events are internal, such as the map being moved or data finishing loading. To continue the previous example, once the map has moved to simulate panning, another event is issued by OpenLayers to say that the map has finished moving, so that other parts of OpenLayers can react by updating the user interface with the center coordinates or by loading more data.

Key-Value Observing with the Object class

OpenLayers' Object class inherits from Observable and implements a software pattern called Key-Value Observing (KVO). With KVO, an object representing some data maintains a list of other objects that wish to observe it. When the data value changes, the observers are notified automatically.

Working with Collections

The last section of this article is about OpenLayers' Collection class, ol.Collection. As mentioned, the Collection class is not a super class like Observable and Object, but it is an integral part of the relationship model. Many classes in OpenLayers use the Collection class to manage one-to-many relationships. At its core, the Collection class is a JavaScript array with additional convenience methods. It also inherits directly from the Object class, so it gains the functionality of both Observable and Object. This makes the Collection class extremely powerful.

Collection properties

A Collection, through the Object class it inherits from, has one observable property: length. When a collection changes (elements are added or removed), its length property is updated. This means it also emits an event, change:length, when the length property is changed.

Collection events

A Collection also inherits the functionality of the Observable class (via the Object class) and emits two other events: add and remove. Event handler functions registered for either event will receive a single argument, a CollectionEvent, which has an element property containing the element that was added or removed.
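A short sketch ties the three classes together. The event names are real OpenLayers events, but the handlers and log messages are purely illustrative:

// Observable: listen for an event on the map (Map inherits from Observable)
map.on('moveend', function() {
  console.log('The map has finished moving');
});

// Object / KVO: watch a property on the view and read it back with get()
map.getView().on('change:resolution', function() {
  console.log('New resolution:', map.getView().get('resolution'));
});

// Collection: the map's layers are kept in an ol.Collection
var layers = map.getLayers();
layers.on('add', function(event) {
  // event.element is the layer that was just added
  console.log('Layer added, count is now', layers.getLength());
});
layers.push(new ol.layer.Tile({source: new ol.source.OSM()}));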
Summary

This wraps up our overview of the key concepts in the OpenLayers library. We took a quick look at the key components of the library from two different aspects—relationships and inheritance. With the Map class as the central object of any OpenLayers application, we looked at its main relationships to other classes, including views, layers, overlays, interactions, and controls. We briefly introduced each of these classes to give an overview of their primary purpose. We then investigated inheritance related to these objects and reviewed the super classes that provide functionality to most classes in the OpenLayers library—the Observable and Object classes. The Observable class provides a basic event mechanism, and the Object class adds observable properties with a powerful binding feature. Lastly, we looked at the Collection class. Although it isn't part of the inheritance structure, it is central to how one-to-many relationships work throughout the library (including the Map class's relationships with layers, overlays, interactions, and controls).

Resources for Article:

Further resources on this subject:
OGC for ESRI Professionals [Article]
Improving proximity filtering with KNN [Article]
OpenLayers: Overview of Vector Layer [Article]

Adding Authentication

Packt
23 Jan 2015
15 min read
This article, written by Mat Ryer, the author of Go Programming Blueprints, focuses on high-performance transmission of messages from the clients to the server and back again, but our users have no way of knowing who they are talking to. One solution to this problem is to build some kind of signup and login functionality, letting our users create accounts and authenticate themselves before they can open the chat page.

(For more resources related to this topic, see here.)

Whenever we are about to build something from scratch, we must ask ourselves how others have solved this problem before (it is extremely rare to encounter genuinely original problems), and whether any open solutions or standards already exist that we can make use of. Authorization and authentication are hardly new problems, especially in the world of the Web, with many different protocols out there to choose from. So how do we decide the best option to pursue? As always, we must look at this question from the point of view of the user. A lot of websites these days allow you to sign in using accounts you already have elsewhere, on a variety of social media or community websites. This saves users the tedious job of entering all their account information over and over again as they decide to try out different products and services. It also has a positive effect on the conversion rates for new sites.

In this article, we will enhance our chat codebase to add authentication, which will allow our users to sign in using Google, Facebook, or GitHub, and you'll see how easy it is to add other sign-in portals too. In order to join the chat, users must first sign in. Following this, we will use the authorized data to augment our user experience so everyone knows who is in the room, and who said what.

In this article, you will learn to:

Use the decorator pattern to wrap http.Handler types to add additional functionality to handlers
Serve HTTP endpoints with dynamic paths
Use the Gomniauth open source project to access authentication services
Get and set cookies using the http package
Encode objects as Base64 and back to normal again
Send and receive JSON data over a web socket
Give different types of data to templates
Work with channels of your own types

Handlers all the way down

For our chat application, we implemented our own http.Handler type in order to easily compile, execute, and deliver HTML content to browsers. Since this is a very simple but powerful interface, we are going to continue to use it wherever possible when adding functionality to our HTTP processing. In order to determine whether a user is authenticated, we will create an authentication wrapper handler that performs the check, and passes execution on to the inner handler only if the user is authenticated. Our wrapper handler will satisfy the same http.Handler interface as the object inside it, allowing us to wrap any valid handler. In fact, even the authentication handler we are about to write could later be encapsulated inside a similar wrapper if needed.

Diagram of a chaining pattern when applied to HTTP handlers

The preceding figure shows how this pattern could be applied in a more complicated HTTP handler scenario. Each object implements the http.Handler interface, which means that the object could be passed into the http.Handle function to directly handle a request, or it can be given to another object, which adds some kind of extra functionality. The Logging handler might write to a logfile before and after the ServeHTTP method is called on the inner handler.
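To make the decorator idea concrete before we write the real authentication handler, here is a hedged sketch of what such a Logging wrapper could look like; it is not part of the chat application's code, just an illustration of the pattern:

package main

import (
    "log"
    "net/http"
)

// logHandler wraps another http.Handler and logs around it.
type logHandler struct {
    next http.Handler
}

func (h *logHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    log.Println("before:", r.URL.Path)
    h.next.ServeHTTP(w, r) // pass execution on to the wrapped handler
    log.Println("after:", r.URL.Path)
}

// MustLog decorates any http.Handler with logging.
func MustLog(handler http.Handler) http.Handler {
    return &logHandler{next: handler}
}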
Because the inner handler is just another http.Handler, any other handler can be wrapped in (or decorated with) the Logging handler. It is also common for an object to contain logic that decides which inner handler should be executed. For example, our authentication handler will either pass the execution on to the wrapped handler, or handle the request itself by issuing a redirect to the browser. That's plenty of theory for now; let's write some code. Create a new file called auth.go in the chat folder:

package main

import (
    "net/http"
)

type authHandler struct {
    next http.Handler
}

func (h *authHandler) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    if _, err := r.Cookie("auth"); err == http.ErrNoCookie {
        // not authenticated
        w.Header().Set("Location", "/login")
        w.WriteHeader(http.StatusTemporaryRedirect)
    } else if err != nil {
        // some other error
        panic(err.Error())
    } else {
        // success - call the next handler
        h.next.ServeHTTP(w, r)
    }
}

func MustAuth(handler http.Handler) http.Handler {
    return &authHandler{next: handler}
}

The authHandler type not only implements the ServeHTTP method (which satisfies the http.Handler interface) but also stores (wraps) an http.Handler in the next field. Our MustAuth helper function simply creates an authHandler value that wraps any other http.Handler. This pattern allows us to easily add authentication to our code in main.go. Let's tweak the following root mapping line:

http.Handle("/", &templateHandler{filename: "chat.html"})

Let's change the first argument to make it explicit that this page is meant for chatting. Next, let's use the MustAuth function to wrap templateHandler for the second argument:

http.Handle("/chat", MustAuth(&templateHandler{filename: "chat.html"}))

Wrapping templateHandler with the MustAuth function will cause execution to run first through our authHandler, and only reach templateHandler if the request is authenticated. The ServeHTTP method in our authHandler will look for a special cookie called auth, and use the Header and WriteHeader methods on http.ResponseWriter to redirect the user to a login page if the cookie is missing. Build and run the chat application and try to hit http://localhost:8080/chat:

go build -o chat
./chat -host=":8080"

You need to delete your cookies to clear out previous auth tokens, or any other cookies that might be left over from other development projects served through localhost. If you look in the address bar of your browser, you will notice that you are immediately redirected to the /login page. Since we cannot handle that path yet, you'll just get a 404 page not found error.
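If you would rather confirm the redirect without a browser, a small test along the following lines should work. The file name auth_test.go and the use of http.NotFoundHandler as a stand-in for templateHandler are assumptions for this sketch; only MustAuth and authHandler come from the code above:

package main

import (
    "net/http"
    "net/http/httptest"
    "testing"
)

func TestMustAuthRedirectsWithoutCookie(t *testing.T) {
    handler := MustAuth(http.NotFoundHandler()) // any inner handler will do
    req, err := http.NewRequest("GET", "/chat", nil)
    if err != nil {
        t.Fatal(err)
    }
    res := httptest.NewRecorder()
    handler.ServeHTTP(res, req)
    if res.Code != http.StatusTemporaryRedirect {
        t.Errorf("expected status %d, got %d", http.StatusTemporaryRedirect, res.Code)
    }
    if loc := res.Header().Get("Location"); loc != "/login" {
        t.Errorf("expected redirect to /login, got %q", loc)
    }
}

Run it with go test from inside the chat folder.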
Making a pretty social sign-in page

There is no excuse for building ugly apps, and so we will build a social sign-in page that is as pretty as it is functional. Bootstrap is a frontend framework used to develop responsive projects on the Web. It provides CSS and JavaScript code that solves many user-interface problems in a consistent and good-looking way. While sites built using Bootstrap all tend to look the same (although there are plenty of ways in which the UI can be customized), it is a great choice for early versions of apps, or for developers who don't have access to designers. If you build your application using the semantic standards set forth by Bootstrap, it becomes easy to make a Bootstrap theme for your site or application, and you know it will slot right into your code.

We will use the version of Bootstrap hosted on a CDN so we don't have to worry about downloading and serving our own version through our chat application. This means that in order to render our pages properly, we will need an active Internet connection, even during development. If you prefer to download and host your own copy of Bootstrap, you can do so. Keep the files in an assets folder and add the following call to your main function (it uses http.Handle to serve the assets via your application):

http.Handle("/assets/", http.StripPrefix("/assets",
    http.FileServer(http.Dir("/path/to/assets/"))))

Notice how the http.StripPrefix and http.FileServer functions return objects that satisfy the http.Handler interface, as per the decorator pattern that we implement with our MustAuth helper function. In main.go, let's add an endpoint for the login page:

http.Handle("/chat", MustAuth(&templateHandler{filename: "chat.html"}))
http.Handle("/login", &templateHandler{filename: "login.html"})
http.Handle("/room", r)

Obviously, we do not want to wrap our login page with MustAuth, because that would cause an infinite redirection loop. Create a new file called login.html inside our templates folder, and insert the following HTML code:

<html>
  <head>
    <title>Login</title>
    <link rel="stylesheet" href="//netdna.bootstrapcdn.com/bootstrap/3.1.1/css/bootstrap.min.css">
  </head>
  <body>
    <div class="container">
      <div class="page-header">
        <h1>Sign in</h1>
      </div>
      <div class="panel panel-danger">
        <div class="panel-heading">
          <h3 class="panel-title">In order to chat, you must be signed in</h3>
        </div>
        <div class="panel-body">
          <p>Select the service you would like to sign in with:</p>
          <ul>
            <li><a href="/auth/login/facebook">Facebook</a></li>
            <li><a href="/auth/login/github">GitHub</a></li>
            <li><a href="/auth/login/google">Google</a></li>
          </ul>
        </div>
      </div>
    </div>
  </body>
</html>

Restart the web server and navigate to http://localhost:8080/login. You will notice that it now displays our sign-in page.

Endpoints with dynamic paths

Pattern matching for the http package in the Go standard library isn't the most comprehensive and fully featured implementation out there. For example, Ruby on Rails makes it much easier to have dynamic segments inside the path:

"auth/:action/:provider_name"

This then provides a data map (or dictionary) containing the values that it automatically extracted from the matched path. So if you visit auth/login/google, then params[:provider_name] would equal google, and params[:action] would equal login. The most the http package lets us specify by default is a path prefix, which we can do by leaving a trailing slash at the end of the pattern:

"auth/"

We would then have to manually parse the remaining segments to extract the appropriate data. This is acceptable for relatively simple cases, which suits our needs for the time being, since we only need to handle a few different paths, such as:

/auth/login/google
/auth/login/facebook
/auth/callback/google
/auth/callback/facebook

If you need to handle more advanced routing situations, you might want to consider using dedicated packages such as Goweb, Pat, Routes, or mux. For extremely simple cases such as ours, the built-in capabilities will do. We are going to create a new handler that powers our login process. In auth.go, add the following loginHandler code:

// loginHandler handles the third-party login process.
// format: /auth/{action}/{provider}
func loginHandler(w http.ResponseWriter, r *http.Request) {
    segs := strings.Split(r.URL.Path, "/")
    action := segs[2]
    provider := segs[3]
    switch action {
    case "login":
        log.Println("TODO handle login for", provider)
    default:
        w.WriteHeader(http.StatusNotFound)
        fmt.Fprintf(w, "Auth action %s not supported", action)
    }
}

In the preceding code, we break the path into segments using strings.Split before pulling out the values for action and provider (note that loginHandler relies on the fmt, log, and strings packages, so make sure they appear in the import block of auth.go). If the action value is known, we will run the specific code; otherwise, we will write out an error message and return an http.StatusNotFound status code (which, in the language of HTTP status codes, is a 404). We will not bullet-proof our code right now, but it's worth noticing that if someone hits loginHandler with too few segments, our code will panic because it expects segs[2] and segs[3] to exist. For extra credit, see whether you can protect against this and return a nice error message instead of a panic if someone hits /auth/nonsense; one possible answer is sketched at the end of this section.

Our loginHandler is only a function and not an object that implements the http.Handler interface. This is because, unlike other handlers, we don't need it to store any state. The Go standard library supports this, so we can use the http.HandleFunc function to map it in a way similar to how we used http.Handle earlier. In main.go, update the handlers:

http.Handle("/chat", MustAuth(&templateHandler{filename: "chat.html"}))
http.Handle("/login", &templateHandler{filename: "login.html"})
http.HandleFunc("/auth/", loginHandler)
http.Handle("/room", r)

Rebuild and run the chat application:

go build -o chat
./chat -host=":8080"

Hit the following URLs and notice the output logged in the terminal:

http://localhost:8080/auth/login/google outputs TODO handle login for google
http://localhost:8080/auth/login/facebook outputs TODO handle login for facebook

We have successfully implemented a dynamic path-matching mechanism that so far just prints out TODO messages; we need to integrate with authentication services in order to make our login process work.
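As one possible answer to that extra-credit suggestion, the path parsing could be guarded before indexing into the slice. This is only a sketch, not the version used in the rest of the article:

func loginHandler(w http.ResponseWriter, r *http.Request) {
    segs := strings.Split(r.URL.Path, "/")
    if len(segs) < 4 {
        // e.g. /auth/nonsense - respond politely instead of panicking
        w.WriteHeader(http.StatusNotFound)
        fmt.Fprintln(w, "Auth path must look like /auth/{action}/{provider}")
        return
    }
    action := segs[2]
    provider := segs[3]
    switch action {
    case "login":
        log.Println("TODO handle login for", provider)
    default:
        w.WriteHeader(http.StatusNotFound)
        fmt.Fprintf(w, "Auth action %s not supported", action)
    }
}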
OAuth2

OAuth2 is an open authentication and authorization standard designed to allow resource owners to give clients delegated access to private data (such as wall posts or tweets) via an access token exchange handshake. Even if you do not wish to access the private data, OAuth2 is a great option that allows people to sign in using their existing credentials, without exposing those credentials to a third-party site. In this case, we are the third party and we want to allow our users to sign in using services that support OAuth2.

From a user's point of view, the OAuth2 flow is:

The user selects the provider with whom they wish to sign in to the client app.
The user is redirected to the provider's website (with a URL that includes the client app ID) where they are asked to give permission to the client app.
The user signs in from the OAuth2 service provider and accepts the permissions requested by the third-party application.
The user is redirected back to the client app with an authorization code.
In the background, the client app sends the authorization code to the provider, who sends back an access token.
The client app uses the access token to make authorized requests to the provider, such as to get user information or wall posts.

To avoid reinventing the wheel, we will look at a few open source projects that have already solved this problem for us.

Open source OAuth2 packages

Andrew Gerrand has been working on the core Go team since February 2010, that is, two years before Go 1.0 was officially released. His goauth2 package (see https://code.google.com/p/goauth2/) is an elegant implementation of the OAuth2 protocol written entirely in Go. Andrew's project inspired Gomniauth (see https://github.com/stretchr/gomniauth). An open source Go alternative to Ruby's omniauth project, Gomniauth provides a unified solution to access different OAuth2 services. In the future, when OAuth3 (or whatever next-generation authentication protocol it is) comes out, in theory, Gomniauth could take on the pain of implementing the details, leaving the user code untouched. For our application, we will use Gomniauth to access OAuth services provided by Google, Facebook, and GitHub, so make sure you have it installed by running the following command:

go get github.com/stretchr/gomniauth

Some of the project dependencies of Gomniauth are kept in Bazaar repositories, so you'll need to head over to http://wiki.bazaar.canonical.com to download them.

Tell the authentication providers about your app

Before we ask an authentication provider to help our users sign in, we must tell them about our application. Most providers have some kind of web tool or console where you can create applications to kick off this process. Here's one from Google:

In order to identify the client application, we need to create a client ID and secret. Despite the fact that OAuth2 is an open standard, each provider has its own language and mechanism to set things up, so you will most likely have to play around with the user interface or the documentation to figure it out in each case. At the time of writing this, in the Google Developer Console, you navigate to APIs & auth | Credentials and click on the Create new Client ID button. In most cases, for added security, you have to be explicit about the host URLs from which requests will come. For now, since we're hosting our app locally on localhost:8080, you should use that. You will also be asked for a redirect URI; this is the endpoint in our chat application to which the user will be redirected after successfully signing in. The callback will be another action on our loginHandler, so the redirection URL for the Google client will be http://localhost:8080/auth/callback/google. Once you finish the authentication process for the providers you want to support, you will be given a client ID and secret for each provider. Make a note of these, because we will need them when we set up the providers in our chat application. If we host our application on a real domain, we have to create new client IDs and secrets, or update the appropriate URL fields on our authentication providers to ensure that they point to the right place. Either way, it is good practice to keep separate sets of development and production keys for security.

Summary

This article showed how to add OAuth to our chat application so that we can keep track of who is saying what, while letting users log in using Google, Facebook, or GitHub. We also learned how to use handlers for efficient coding, and how to make a pretty social sign-in page.

Resources for Article:

Further resources on this subject:
WebSockets in Wildfly [article]
Using Socket.IO and Express together [article]
The Importance of Securing Web Services [article]

ArcGIS Spatial Analyst

Packt
20 Jan 2015
16 min read
In this article by Daniela Cristiana Docan, author of ArcGIS for Desktop Cookbook, we will learn that the ArcGIS Spatial Analyst extension offers a lot of great tools for geoprocessing raster data. Most of the Spatial Analyst tools generate a new raster output. Before starting a raster analysis session, it's best practice to set the main analysis environment parameters settings (for example, scratch the workspace, extent, and cell size of the output raster). In this article, you will store all raster datasets in file geodatabase as file geodatabase raster datasets. (For more resources related to this topic, see here.) Analyzing surfaces In this recipe, you will represent 3D surface data in a two-dimensional environment. To represent 3D surface data in the ArcMap 2D environment, you will use hillshades and contours. You can use the hillshade raster as a background for other raster or vector data in ArcMap. Using the surface analysis tools, you can derive new surface data, such as slope and aspect or locations visibility. Getting ready In the surface analysis context: The term slope refers to the steepness of raster cells Aspect defines the orientation or compass direction of a cell Visibility identifies which raster cells are visible from a surface location In this recipe, you will prepare your data for analysis by creating an elevation surface named Elevation from vector data. The two feature classes involved are the PointElevation point feature class and the ContourLine polyline feature class. All other output raster datasets will derive from the Elevation raster. How to do it... Follow these steps to prepare your data for spatial analysis: Start ArcMap and open the existing map document, SurfaceAnalysis.mxd, from <drive>:PacktPublishingDataSpatialAnalyst. Go to Customize | Extensions and check the Spatial Analyst extension. Open ArcToolbox, right-click on the ArcToolbox toolbox, and select Environments. Set the geoprocessing environment as follows: Workspace | Current Workspace: DataSpatialAnalystTOPO5000.gdb and Scratch Workspace: DataSpatialAnalystScratchTOPO5000.gdb. Output Coordinates: Same as Input. Raster Analysis | Cell Size: As Specified below: type 0.5 with unit as m. Mask: SpatialAnalystTOPO5000.gdbTrapezoid5k. Raster Storage | Pyramid: check Build pyramids and Pyramid levels: type 3. Click on OK. In ArcToolbox, expand Spatial Analyst Tools | Interpolation, and double-click on the Topo to Raster tool to open the dialog box. Click on Show Help to see the meaning of every parameter. Set the following parameters: Input feature data: PointElevation Field: Elevation and Type: PointElevation ContourLine Field: Elevation and Type: Contour WatercourseA Type: Lake Output surface raster: ...ScratchTOPO5000.gdbElevation Output extent (optional): ContourLine Drainage enforcement (optional): NO_ENFORCE Accept the default values for all other parameters. Click on OK. The Elevation raster is a continuous thematic raster. The raster cells are arranged in 4,967 rows and 4,656 columns. Open Layer Properties | Source of the raster and explore the following properties: Data Type (File Geodatabase Raster Dataset), Cell Size (0.5 meters) or Spatial Reference (EPSG: 3844). In the Layer Properties window, click on the Symbology tab. Select the Stretched display method for the continuous raster cell values as follows: Show: Stretched and Color Ramp: Surface. Click on OK. 
Explore the cell values using the following two options: Go to Layer Properties | Display and check Show MapTips Add the Spatial Analyst toolbar, and from Customize | Commands, add the Pixel Inspector tool Let's create a hillshade raster using the Elevation layer: Expand Spatial Analyst Tools | Interpolation and double-click on the Hillshade tool to open the dialog box. Set the following parameters: Input raster: ScratchTOPO5000.gdbElevation Output raster: ScratchTOPO5000.gdbHillshade Azimuth (optional): 315 and Altitude (optional): 45 Accept the default value for Z factor and leave the Model shadows option unchecked. Click on OK. From time to time, please ensure to save the map document as MySurfaceAnalysis.mxd at ...DataSpatialAnalyst. The Hillshade raster is a discrete thematic raster that has an associated attribute table known as Value Attribute Table (VAT). Right-click on the Hillshade raster layer and select Open Attribute Table. The Value field stores the illumination values of the raster cells based on the position of the light source. The 0 value (black) means that 25406 cells are not illuminated by the sun, and 254 value (white) means that 992 cells are entirely illuminated. Close the table. In the Table Of Contents section, drag the Hillshade layer below the Elevation layer, and use the Effects | Transparency tool to add a transparency effect for the Elevation raster layer, as shown in the following screenshot: In the next step, you will derive a raster of slope and aspect from the Elevation layer. Expand Spatial Analyst Tools | Interpolation and double-click on the Slope tool to open the dialog box. Set the following parameters: Input raster: Elevation Output raster: ScratchTOPO5000.gdbSlopePercent Output measurement (optional): PERCENT_RISE Click on OK. Symbolize the layer using the Classified method, as follows: Show: Classified. In the Classification section, click on Classify and select the Manual classification method. You will add seven classes. To add break values, right-click on the empty space of the Insert Break graph. To delete one, select the break value from the graph, and right-click to select Delete Break. Do not erase the last break value, which represents the maximum value. Secondly, in the Break Values section, edit the following six values: 5; 7; 15; 20; 60; 90, and leave unchanged the seventh value (496,6). Select Slope (green to red) for Color Ramp. Click on OK. The green areas represent flatter slopes, while the red areas represent steep slopes, as shown in the following screenshot: Expand Spatial Analyst Tools | Interpolation and double click on the Aspect tool to open the dialog box. Set the following parameters: Input raster: Elevation Output raster: ScratchTOPO5000.gdbAspect Click on OK. Symbolize the Aspect layer. For Classify, click on the Manual classification method. You will add five classes. To add or delete break values, right-click on the empty space of the graph, and select Insert / Delete Break. Secondly, edit the following four values: 0; 90; 180; 270, leaving unchanged the fifth value in the Break Values section. Click on OK. In the Symbology window, edit the labels of the five classes as shown in the following picture. Click on OK. In the Table Of Contents section, select the <VALUE> label, and type Slope Direction. The following screenshot is the result of this action: In the next step, you will create a raster of visibility between two geodetic points in order to plan some topographic measurements using an electronic theodolite. 
You will use the TriangulationPoint and Elevation layers: In the Table Of Contents section, turn on the TriangulationPoint layer, and open its attribute table to examine the fields. There are two geodetic points with the following supplementary fields: OffsetA and OffsetB. OffsetA is the proposed height of the instrument mounted on its tripod above stations 8 and 72. OffsetB is the proposed height of the reflector (or target) above the same points. Close the table. Expand Spatial Analyst Tools | Interpolation and double-click on the Visibility tool to open the dialog box. Click on Show Help to see the meaning of every parameter. Set the following parameters: Input raster: Elevation Input point or polyline observer features: TOPO5000.gdbGeodeticPointsTriangulationPoint Output raster: ScratchTOPO5000.gdbVisibility Analysis type (optional): OBSERVERS Observer parameters | Surface offset (optional): OffsetB Observer offset (optional): OffsetA Outer radius (optional): For this, type 1600 Notice that OffsetA and OffsetB were automatically assigned. The Outer radius parameter limits the search distance, and it is the rounded distance between the two geodetic points. All other cells beyond the 1,600-meter radius will be excluded from the visibility analysis. Click on OK. Open the attribute table of the Visibility layer to inspect the fields and values. The Value field stores the value of cells. Value 0 means that cells are not visible from the two points. Value 1 means that 6,608,948 cells are visible only from point 8 (first observer OBS1). Value 2 means that 1,813,578 cells are visible only from point 72 (second observer OBS2). Value 3 means that 4,351,861 cells are visible from both points. In conclusion, there is visibility between the two points if the height of the instrument and reflector is 1.5 meters. Close the table. Symbolize the Visibility layer, as follows: Show: Unique Values and Value Field: Value. Click on Add All Values and choose Color Scheme: Yellow-Green Bright. Select <Heading> and change the Label value to Height 1.5 meters. Double-click on the symbol for Value as 0, and select No Color. Click on OK. The Visibility layer is symbolized as shown in the following screenshot: Turn off all layers except the Visibility, TriangulationPoint, and Hillshade layers. Save your map as MySurfaceAnalysis.mxd and close ArcMap. You can find the final results at <drive>:PacktPublishingDataSpatialAnalystSurfaceAnalysis. How it works... You have started the exercise by setting the geoprocessing environment. You will override those settings in the next recipes. At the application level, you chose to build pyramids. By creating pyramids, your raster will be displayed faster when you zoom out. The pyramid levels contain the copy of the original raster at a low resolution. The original raster will have a cell size of 0.5 meters. The pixel size will double at each level of the pyramid, so the first level will have a cell size of 1 meter; the second level will have a cell size of 2 meters; and the third level will have a cell size of 4 meters. Even if the values of cells refer to heights measured above the local mean sea level (zero-level surface), you should consider the planimetric accuracy of the dataset. Please remember that TOPO5000.gdb refers to a product at the scale 1:5,000. This is the reason why you have chosen 0.5 meters for the raster cell size. At step 4, you used the PointElevation layer as supplementary data when you created the Elevation raster. 
If one of your ArcToolbox tools fails to execute or you have obtained an empty raster output, you have some options here: Open the Results dialog from the Geoprocessing menu to explore the error report. This will help you to identify the parameter errors. Right-click on the previous execution of the tool and choose Open (step 1). Change the parameters and click on OK to run the tool. Choose Re Run if you want to run the tool with the parameters unchanged (step 2) as shown in the following screenshot: Run the ArcToolbox tool from the ArcCatalog application. Before running the tool, check the geoprocessing environment in ArcCatalog by navigating to Geoprocessing | Environments. There's more... What if you have a model with all previous steps? Open ArcCatalog, and go to ...DataSpatialAnalystModelBuilder. In the ModelBuilder folder, you have a toolbox named MyToolbox, which contains the Surface Analysis model. Right-click on the model and select Properties. Take your time to study the information from the General, Parameters, and Environments tabs. The output (derived data) will be saved in Scratch Workspace: ModelBuilder ScratchTopo5000.gdb. Click on OK to close the Surface Analysis Properties window. Running the entire model will take you around 25 minutes. You have two options: Tool dialog option: Right-click on the Surface Analysis model and select Open. Notice the model parameters that you can modify and read the Help information. Click on OK to run the model. Edit mode: Right-click on the Surface Analysis model and select Edit. The colored model elements are in the second state—they are ready to run the Surface Analysis model by using one of those two options: To run the entire model at the same time, select Run Entire Model from the Model menu. To run the tools (yellow rounded rectangle) one by one, select the Topo to Raster tool with the Select tool, and click on the Run tool from the Standard toolbar. Please remember that a shadow behind a tool means that the model element has already been run. You used the Visibility tool to check the visibility between two points with 1.5 meters for the Observer offset and Surface offset parameters. Try yourself to see what happens if the offset value is less than 1.5 meters. To again run the Visibility tool in the Edit mode, right-click on the tool, and select Open. For Surface offset and Observer offset, type 0.5 meters and click on OK to run the tool. Repeat these steps for a 1 meter offset. Interpolating data Spatial interpolation is the process of estimating an unknown value between two known values taking into account Tobler's First Law: "Everything is related to everything else, but near things are more related than distant things." This recipe does not undertake to teach you the advanced concept of interpolation because it is too complex for this book. Instead, this recipe will guide you to create a terrain surface using the following: A feature class with sample elevation points Two interpolation methods: Inverse Distance Weighted (IDW) and Spline For further research, please refer to: Geographic Information Analysis, David O'Sullivan and David Unwin, John Wiley & Sons, Inc., 2003, specifically the 8.3 Spatial interpolation recipe of Chapter 8, Describing and Analyzing Fields, pp.220-234. Getting ready In this recipe, you will create a terrain surface stored as a raster using the PointElevation sample points. 
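If you prefer scripting to the tool dialogs or ModelBuilder, the same surface-analysis steps can also be driven from Python with the arcpy site package. The following is only a rough sketch, with an assumed workspace path, and it mirrors the Hillshade, Slope, and Aspect parameters used in the recipe above:

import arcpy
from arcpy.sa import Hillshade, Slope, Aspect

arcpy.CheckOutExtension("Spatial")   # requires a Spatial Analyst license
# Assumed path - point this at your own ScratchTOPO5000.gdb
arcpy.env.workspace = r"C:\PacktPublishing\Data\SpatialAnalyst\ScratchTOPO5000.gdb"
arcpy.env.cellSize = 0.5

elevation = "Elevation"              # the raster created by the Topo to Raster step

Hillshade(elevation, 315, 45).save("Hillshade")
Slope(elevation, "PERCENT_RISE").save("SlopePercent")
Aspect(elevation).save("Aspect")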
Your sample data has the following characteristics: The average distance between points is 150 meters The density of sample points is not the same on the entire area of interest There are not enough points to define the cliffs and the depressions There are not extreme differences in elevation values How to do it... Follow these steps to create a terrain surface using the IDW tool: Start ArcMap and open an existing map document Interpolation.mxd from <drive>:PacktPublishingDataSpatialAnalyst. Set the geoprocessing environment, as follows: Workspace | Current Workspace: DataSpatialAnalystTOPO5000.gdb and Scratch Workspace: DataSpatialAnalystScratchTOPO5000.gdb Output Coordinates: Same as the PointElevation layer Raster Analysis | Cell Size: As Specified Below: 1 Mask: DataSpatialAnalystTOPO5000.gdbTrapezoid5k In the next two steps, you will use the IDW tool. Running IDW with barrier polyline features will take you around 15 minutes: In ArcToolbox, go to Spatial Analyst Tools | Interpolation, and double-click on the IDW tool. Click on Show Help to see the meaning of every parameter. Set the following parameters: Input point features: PointElevation Z value field: Elevation Output raster: ScratchTOPO5000.gdbIDW_1 Power (optional): 0.5 Search radius (optional): Variable Search Radius Settings | Number of points: 6 Maximum distance: 500 Input barrier polyline features (optional): TOPO5000.gdbHydrographyWatercourseL Accept the default value of Output cell size (optional). Click on OK. Repeat step 3 by setting the following parameters: Input point features: PointElevation Z value field: Elevation Output raster: ScratchTOPO5000.gdbIDW_2 Power (optional): 2 The rest of the parameters are the same as in step 3. Click on OK. Symbolize the IDW_1 and IDW_2 layers as follows: Show: Classified; Classification: Equal Interval: 10 classes; Color Scheme: Surface. Click on OK. You should obtain the following results: In the following steps, you will use the Spline tool to generate the terrain surface: In ArcToolbox, go to Spatial Analyst Tools | Interpolation, and double-click on the Spline tool. Set the following parameters: Input point features: PointElevation Z value field: Elevation Output raster: ScratchTOPO5000.gdbSpline_Regular Spline type (optional): REGULARIZED Weight (optional): 0.1 and Number of points (optional): 6 Accept the default value of Output cell size (optional). Click on OK. Run again the Spline tool with the following parameters: Input point features: PointElevation Z value field: Elevation Output raster: ScratchTOPO5000.gdbSpline_Tension Spline type (optional): TENSION Weight (optional): 0.1 and Number of points (optional): 6 Accept the default value of Output cell size (optional). Click on OK. Symbolize the Spline_Regular and Spline_Tension raster layers using the Equal Interval method classification with 10 classes and the Surface color ramp: In the next steps, you will use the Spline with Barriers tool to generate a terrain surface using an increased number of sample points. You will transform the ContourLine layer in a point feature class. You will combine those new points with features from the PointElevation layer: In ArcToolbox, go to Data Management Tools | Features, and double-click on the Feature vertices to Points tool. Set the following parameters: Input features: ContourLine Output Feature Class: TOPO5000.gdbRelief ContourLine_FeatureVertices Point type (optional): ALL Click on OK. Inspect the attribute table of the newly created layer. 
In the Catalog window, go to ...TOPO5000.gdbRelief and create a copy of the PointElevation feature class. Rename the new feature class as ContourAndPoint. Right-click on ContourAndPoint and select Load | Load Data. Set the following parameters from the second and fourth panels: Input data: ContourLine_FeatureVertices Target Field: Elevation Matching Source Field: Elevation Accept the default values for the rest of the parameters and click on Finish. In ArcToolbox, go to Spatial Analyst Tools | Interpolation, and double-click on the Spline with Barriers tool. Set the following parameters: Input point features: ContourAndPoint Z value field: Elevation Input barrier features (optional): TOPO5000.gdbHydrographyWatercourseA Output raster: ScratchTOPO5000.gdbSpline_WaterA Smoothing Factor (optional): 0 Accept the default value of Output cell size (optional). Click on OK. You should obtain a similar terrain surface to what's shown here: Explore the results by comparing the similarities or differences of the terrain surface between interpolated raster layers and the ContourLine vector layer. The IDW method works well with a proper density of sample points. Try to create a new surface using the IDW tool and the ContourAndPoint layer as sample points. Save your map as MyInterpolation.mxd and close ArcMap. You can find the final results at <drive>:PacktPublishingDataSpatialAnalyst Interpolation. How it works... The IDW method generated an average surface that will not cross through the known point elevation values and will not estimate the values below the minimum or above the maximum given point values. The IDW tool allows you to define polyline barriers or limits in searching sample points for interpolation. Even if the WatercourseL polyline feature classes do not have elevation values, river features can be used to interrupt the continuity of interpolated surfaces. To obtain fewer averaged estimated values (reduce the IDW smoother effect) you have to: Reduce the sample size to 6 points Choose a variable search radius Increase the power to 2 The Power option defines the influence of sample point values. This value increases with the distance. There is a disadvantage because around a few sample points, there are small areas raised above the surrounding surface or small hollows below the surrounding surface. The Spline method has generated a surface that crosses through all the known point elevation values and estimates the values below the minimum or above the maximum sample point values. Because the density of points is quite low, we reduced the sample size to 6 points and defined a variable search radius of 500 meters in order to reduce the smoothening effect. The Regularized option estimates the hills or depressions that are not cached by the sample point values. The Tension option will force the interpolated values to stay closer to the sample point values. Starting from step 12, we increased the number of sample points in order to better estimate the surface. At step 14, notice that the Spline with Barriers tool allows you to use the polygon feature class as breaks or barriers in searching sample points for interpolation. Summary In this article, we learned about the ArcGIS Spatial Analyst extension and its tools. Resources for Article:   Further resources on this subject: Posting Reviews, Ratings, and Photos [article] Enterprise Geodatabase [article] Adding Graphics to the Map [article]

Customization in Microsoft Dynamics CRM

Packt
30 Dec 2014
24 min read
 In this article by Nicolae Tarla, author of the book Dynamics CRM Application Structure, we looked at the basic structure of Dynamics CRM, the modules comprising the application, and what each of these modules contain. Now, we'll delve deeper into the application and take a look at how we can customize it. In this chapter, we will take a look at the following topics: Solutions and publishers Entity elements Entity types Extending entities Entity forms, quick view, and quick create forms Entity views and charts Entity relationships Messages Business rules We'll be taking a look at how to work with each of the elements comprising the sales, service, and marketing modules. We will go through the customization options and see how we can extend the system to fit new business requirements. (For more resources related to this topic, see here.) Solutions When we are talking about customizations for Microsoft Dynamics CRM, one of the most important concepts is the solution. The solution is a container of all the configurations and customizations. This packaging method allows customizers to track customizations, export and reimport them into other environments, as well as group specific sets of customizations by functionality or project cycle. Managing solutions is an aspect that should not be taken lightly, as down the road, a properly designed solution packaging model can help a lot, or an incorrect one can create difficulties. Using solutions is a best practice. While you can implement customizations without using solutions, these customizations will be merged into the base solutions and "you will not be able to export the customizations separately from the core elements of the platform. For a comprehensive description of solutions, you can refer to the MSDN documentation available at http://msdn.microsoft.com/en-gb/library/gg334576.aspx#BKMK_UnmanagedandManagedSolutions. Types of solutions Within the context of Dynamics CRM, there are two types of solutions that you will commonly use while implementing customizations: Unmanaged solutions Managed solutions Each one of these solution types has its own strengths and properties and are recommended to be used in various circumstances. In order to create and manage solutions as well as perform system customizations, the user account must be configured as a system customizer or system administrator. Unmanaged solutions An unmanaged solution is the default state of a newly created solution. A solution is unmanaged for the period of time while customization work is being performed in the context of the solution. You cannot customize a managed solution. An unmanaged solution can be converted to a managed solution by exporting it as managed. When the work is completed and the unmanaged solution is ready to be distributed, it is recommended that you package it as a managed solution for distribution. A managed solution, if configured as such, prevents further customizations to the solution elements. For this reason, solution vendors package their solutions as managed. In an unmanaged solution, the system customizer can perform various tasks, "which include: Adding and removing components Deleting components that allow deletion Exporting and importing the solution as an unmanaged solution Exporting the solution as a managed solution Changes made to the components in an unmanaged solution are also applied to all the unmanaged solutions that include these components. 
This means that all changes from all unmanaged solutions are also applied to the default solution. Deleting an unmanaged solution results in the removal of the container alone, "while the unmanaged components of the solution remain in the system. Deleting a component in an unmanaged solution results in the deletion of this component from the system. In order to remove a component from an unmanaged solution, the component should be removed from the solution, not deleted. Managed solutions Once work is completed in an unmanaged solution and the solution is ready to be distributed, it can be exported as a managed solution. Packaging a solution as a managed solution presents the following advantages: Solution components cannot be added or removed from a managed solution A managed solution cannot be exported from the environment it was deployed in Deleting a managed solution results in the uninstallation of all the component customizations included with the solution. It also results "in the loss of data associated with the components being deleted. A managed solution cannot be installed in the same organization that contains the unmanaged solution which was used to create it. Within a managed solution, certain components can be configured to allow further customization. Through this mechanism, the managed solution provider can enable future customizations that modify aspects of the solution provided. The guidance provided by Microsoft when working with various solution types states that a solution should be used in an unmanaged state between development and test environments, and it should be exported as a managed solution when it is ready to be deployed to a production environment. Solution properties Besides the solution type, each solution contains a solution publisher. This is a set of properties that allow the solution creators to communicate different information to the solution's users, including ways to contact the publisher for additional support. The solution publisher record will be created in all the organizations where the solution is being deployed. The solution publisher record is also important when releasing an update to an existing solution. Based on this common record and the solution properties, an update solution can be released and deployed on top of an existing solution. Using a published solution also allows us to define a custom prefix for all new custom fields created in the context of the solution. The default format for new custom field names is a new field name. Using a custom publisher, we can change "the "new" prefix to a custom prefix specific to our solution. Solution layering When multiple solutions are deployed in an organization, there are two methods by which the system defines the order in which changes take precedence. These methods are merge and top wins. The user interface elements are merged by default. As such, elements such as the default forms, ribbons, command bars, and site map are merged, and all base elements and new custom elements are rendered. For all other solution components, the top wins approach is taken, where the last solution that makes a customization takes precedence. The top wins approach is also taken into consideration when a subset of customizations is being applied on top of a previously applied customization. The system checks the integrity of all solution exports, imports, and other operations. As such, when exporting a solution, if dependent entities are not included, a warning is presented. 
The customizer has the option to ignore this warning. When importing a solution, if the dependent entities are missing, the import is halted and it fails. Also, deleting a component from a solution is prevented if dependent entities require it to be present. The default solution Dynamics CRM allows you to customize the system without taking advantage of solutions. By default, the system comes with a solution. This is an unmanaged solution, and all system customizations are applied to it by default. The default solution includes all the default components and customizations defined within Microsoft Dynamics CRM. This solution defines the default application behavior. Most of the components in this solution can be further customized. "This solution includes all the out-of-the-box customizations. Also, customizations applied through unmanaged solutions are being merged into the default solution. Entity elements Within a solution, we work with various entities. In Dynamics CRM, there are three main entity types: System entities Business entities Custom entities Each entity is composed of various attributes, while each attribute is defined as a value with a specific data type. We can consider an entity to be a data table. Each "row represents and entity record, while each column represents an entity attribute. As with any table, each attribute has specific properties that define its data type. The system entities in Dynamics CRM are used internally by the application and "are not customizable. Also, they cannot be deleted. As a system customizer or developer, we will work mainly with business management entities and custom entities. Business management entities are the default entities that come with the application. Some are customizable and can "be extended as required. Custom entities are all net new entities that are created "as part of our system customizations. The aspects related to customizing an entity include renaming the entity; modifying, adding, or removing entity attributes; or changing various settings and properties. Let's take a look at all these in detail. Renaming an entity One of the ways to customize an entity is by renaming it. In the general properties "of the entity, the field's display name allows us to change the name of an entity. "The plural name can also be updated accordingly. When renaming an entity, make sure that all the references and messages are updated to reflect the new entity name. Views, charts, messages, business rules, hierarchy settings, and even certain fields can reference the original name, and they should be updated to reflect the new name assigned to the entity. The display name of an entity can be modified for the default value. This is a very common customization. In many instances, we need to modify the default entity name to match the business for which we are customizing the system. For instance, many customers use the term organization instead of account. This is a very easy customization achieved by updating the Display Name and Plural Name fields. While implementing this change, make sure that you also update the entity messages, as a lot of them use the original name of the entity by default.   You can change a message value by double-clicking on the message and entering the new message into the Custom Display String field. Changing entity settings and properties When creating and managing entities in Dynamics CRM, there are generic "entity settings that we have to pay attention to. 
We can easily get to these settings and properties by navigating to Components | Entities within a solution and selecting an entity from the list. We will get an account entity screen similar to "the following screenshot:   The settings are structured in two main tabs, with various categories on each tab. We will take a look at each set of settings and properties individually in the next sections. Entity definition This area of the General tab groups together general properties and settings related to entity naming properties, ownership, and descriptions. Once an entity is created, the Name value remains fixed and cannot be modified. If the internal Name field needs to be changed, a new entity with the new Name field must be created. Areas that display this entity This section sets the visibility of this entity. An entity can be made available in only one module or more standard modules of the application. The account is a good example as it is present in all the three areas of the application. Options for entity The Options for Entity section contains a subset of sections with various settings "and properties to configure the main properties of the entity, such as whether the entity can be customized by adding business process flows, notes and activities, "and auditing as well as other settings. Pay close attention to the settings marked with a plus, as once these settings are enabled, they cannot be disabled. If you are not sure whether you need these features, disable them. The Process section allows you to enable the entity for Business Process Flows. When enabling an entity for Business Process Flows, specific fields to support this functionality are created. For this reason, once an entity is enabled for Business Process Flows, it cannot be disabled at a later time. In the communication and collaboration area, we can enable the use of notes, related activities, and connections as well as enable sending of e-mails and queues on the entity. Enabling these configurations creates the required fields and relationships in the system, and you cannot disable them later. In addition, you can enable the entity for mail merge for use with access teams and also for document management. Enabling an entity for document management allows you to store documents related to the records of this type in SharePoint if the organization is configured to integrate with SharePoint. The data services section allows you to enable the quick create forms for this entity's records as well as to enable or disable duplicate detection and auditing. When you are enabling auditing, auditing must also be enabled at the organization level. Auditing is a two-step process. The next subsections deal with Outlook and mobile access. Here, we can define whether the entity can be accessed from various mobile devices as well as Outlook and whether the access is read-only or read/write on tablets. The last section allows us to define a custom help section for a specific entity. "Custom help must be enabled at the organization level first. Primary field settings The Primary Field settings tab contains the configuration properties for the entity's primary field. Each entity in the Dynamics CRM platform is defined by a primary field. This field can only be a text field, and the size can be customized as needed. The display name can be adjusted as needed. Also, the requirement level can be selected from one of the three values: optional, business-recommended, or business-required. 
When it is marked as business-required, the system will require users to enter a value if they are creating or making changes to an entity record form. The primary fields are also presented for customization in the entity field's listing. Business versus custom entities As mentioned previously, there are two types of customizable entities in Dynamics CRM. They are business entities and custom entities. Business entities are customizable entities that are created by Microsoft and come as part of the default solution package. They are part of the three modules: sales, service, and marketing. Custom entities are all the new entities that are being created as part of the customization and platform extending process. Business entities Business entities are part of the default customization provided with the application by Microsoft. They are either grouped into one of the three modules of functionality or are spread across all three. For example, the account and contact entities are present in all the modules, while the case entity belongs to the service module. "Some other business entities are opportunity, lead, marketing list, and so on. Most of the properties of business entities are customizable in Dynamics CRM. However, there are certain items that are not customizable across these entities. These are, in general, the same type of customizations that are not changeable "when creating a custom entity. For example, the entity internal name (the schema name) cannot be changed once an entity has been created. In addition, the primary field properties cannot be modified once an entity is created. Custom entities All new entities created as part of a customization and implemented in Dynamics CRM are custom entities. When creating a new custom entity, we have the freedom to configure all the settings and properties as needed from the beginning. We can use a naming convention that makes sense to the user and generate all the messages from the beginning, taking advantage of this name. A custom entity can be assigned by default to be displayed in one or more of the three main modules or in the settings and help section. If a new module is created and custom entities need to be part of this new module, we can achieve this by customizing the application navigation, commonly referred to as the application sitemap. While customizing the application navigation might not be such a straightforward process, the tools released to the community are available, which makes this job a lot easier and more visual. The default method to customize the navigation is described in detail in the SDK, and it involves exporting a solution with the navigation sitemap configuration, modifying the XML data, and reimporting the updated solution. Extending entities Irrespective of whether we want to extend a customizable business entity or a custom entity, the process is similar. We extend entities by creating new entity forms, views, charts, relationships, and business rules.   Starting with Dynamics CRM 2015, entities configured for hierarchical relationships now support the creation and visualization of hierarchies through hierarchy settings. We will be taking a look at each of these options in detail in the next sections. Entity forms Entities in Dynamics CRM can be accessed from various parts of the system, and their information can be presented in various formats. This feature contributes to "the 360-degree view of customer data. 
In order to enable this functionality, the entities in Dynamics CRM present a variety of standard forms that are available for customization. These include standard entity forms, quick create forms, and quick view forms. In addition, we can customize mobile forms for mobile devices.
Form types
With the current version of Dynamics CRM 2015, most of the updated entities now have four different form types, as follows:
The main form
The mobile form
The quick create form
The quick view form
Various other forms can be created on an entity, either from scratch or by opening an existing form and saving it with a new name. When complex forms need to be created, it is often much easier to start from an existing entity form rather than recreating everything. We have role-based forms, which change based on the user's security role, and we can also have more than one form available for users to select from. We can control which form is presented to the user based on specific form rules or other business requirements. It is good practice to define a fallback form for each entity and to give all users view permissions to this form. Once more than one main form is created for an entity, you can define the order in which the forms are presented based on permissions. If the user does not have access to any of the higher precedence forms, they will still be able to access the fallback form. Working with contingency forms is quite similar; here, a form is defined to be available to users who cannot access any other forms on an entity. The approach for configuring this is a little different though. You create a form with minimal information displayed on it, assign only the system administrator role to this form, and enable it for fallback. With this, you specify a form that will not be visible to anybody other than the system administrator. In addition, configuring the form in this manner also makes it available to users whose security roles do not have a form specified. With such a configuration, if a user is added to a restrictive group that does not allow them to see most forms, they will still have this one form available.
The main form
The main form is the default form associated with an entity. This form will be available by default when you open a record. There can be more than one main form, and these forms can be configured to be available to various security roles. A role must have at least one form available to it. If more than one form is available for a specific role, the users will be given the option to select the form they want to use to visualize a record. Forms that are available to various roles are called role-based forms. As an example, the human resources role can have a specific view of an account, showing more information than a form available to a sales role. At the time of editing, the main form of an entity will look similar to the following screenshot:
A mobile form
A mobile form is a stripped-down form that is available for mobile devices with small screens. When customizing mobile forms, you should pay attention not only to the fact that a small screen can only render so much before extensive scrolling becomes exhausting, but also to the fact that most mobile devices transfer data wirelessly and, as such, the amount of data should be limited. At the time of editing, the Mobile Entity form looks similar to the Account Mobile form shown in the following screenshot. This is basically just a listing of the fields that are available and the order in which they are presented to the user.
The quick create form
The quick create form, while serving a different purpose than the quick view form, is confined to the same minimalistic approach. Of course, a system customizer is not necessarily limited to a certain amount of data on these forms, but they should be mindful of where these forms are being used and how much screen real estate is dedicated to them. In a quick create form, the minimal amount of data to be added is the required fields. In order to save a new record, all business-required fields must be filled in; as such, they should be added to the quick create form. Quick create forms are created in the same way as any other type of form. In the solution package, navigate to entities, select the entity for which you want to customize an existing quick create form or add a new one, and expand the forms section; you will see all the existing forms for the specific entity. Here, you can select the form you want to modify or click on New to create a new one. Once the form is open for editing, the process of customizing the form is exactly the same for all forms. You can add or remove fields, customize labels, rearrange fields on the form, and so on. In order to remind the customizer that this is a quick create form, a minimalistic three-column grid is provided by default for this type of form in edit mode, as shown in the following screenshot: Pay close attention to the fact that you can add only a limited set of control types to a quick create form. Items such as iframes and sub-grids are not available. That's not to say that the layout cannot be changed; you can be as creative as needed when customizing the quick create form. Once you have created the form, save and publish it. Since we created a relationship between the account and the project earlier, we can add a grid view to the account displaying all the related child projects. Now, navigating to an account, we can quickly add a new child project by going to the project's grid view and clicking on the plus symbol to add a project. This will launch the quick create view of the project we just customized. This is how the project window will look: As you can see in the previous screenshot, the quick create view is displayed as an overlay over the main form. For this reason, the amount of data should be kept to a minimum. This type of form is not meant to replace a full-fledged form but to allow a user to create a new record with minimal input and with no navigation to other records. Another way to access the quick create view for an entity is by clicking on the Create button situated at the top-right corner of most Dynamics CRM pages, right before the field that displays your username. This presents the user with the option to create common out-of-the-box record types available in the system, as seen in the following screenshot: Selecting any one of the Records options presents the quick create view. If you opt to create activities in this way, you will not be presented with a quick create form; rather, you will be taken to the full activity form. Once a record is created through a quick create form, the form closes and a notification is displayed to the user with an option to navigate to the newly created record.
This is how the final window should look:   The quick view form The quick view form is a feature added with Dynamics CRM 2013 that allows system customizers to create a minimalistic view to be presented in a related record form. This form presents a summary of a record in a condensed format that allows you to insert it into a related record's form. The process to use a quick view form comprises the following two steps: Create the quick view form for an entity Add the quick view form to the related record The process of creating a quick view form is similar to the process of creating "any other form. The only requirement here is to keep the amount of information minimal, in order to avoid taking up too much real estate on the related record "form. The following screenshot describes the standard Account quick create form:   A very good example is the quick view form for the account entity. This view is created by default in the system. It only includes the account name, e-mail and "phone information, as well as a grid of recent cases and recent activities. We can use this view in a custom project entity. In the project's main form, add a lookup field to define the account related to the project. In the project's form customization, add a Quick View Form tab from the ribbon, as shown in the following screenshot:   Once you add a Quick View Form tab, you are presented with a Quick View Control Properties window. Here, define the name and label for the control and whether you want the label to be displayed in the form. In addition, on this form, you get to define the rules on what is to be displayed "on the form. In the Data Source section, select Account in the Lookup Field and Related Entity dropdown list and in the Quick View Form dropdown list, select "the account card form. This is the name of the account's quick view form defined "in the system. The following screenshot shows the Data Source configuration and the Selected quick view forms field:   Once complete, save and publish the form. Now, if we navigate to a project record, we can select the related account and the quick view will automatically be displayed on the project form, as shown in the "next screenshot:   The default quick view form created for the account entity is displayed now on the project form with all the specified account-related details. This way any updates to the account are immediately reflected in the project form. Taking this approach, it is now much easier to display all the needed information on the same screen so that the user does not have to navigate away and click through a maze to get to all the data needed. Summary Throughout this chapter, we looked at the main component of the three system modules: an entity. We defined what an entity is and we looked at what an entity is composed of. Then, we looked at each of the components in detail and we discussed ways in which we can customize the entities and extend the system. We investigated ways to visually represent the data related to entities and how to relate entities for data integrity. We also looked at how to enhance entity behavior with business rules and the limitations that the business rules have versus more advanced customizations, using scripts or other developer-specific methods. The next chapter will take you into the business aspect of the Dynamics CRM platform, with an in-depth look at all the available business processes. 
We will revisit business rules, and we will take a look at other ways to enforce business-specific rules and processes using the wizard-driven customizations available with the platform. Resources for Article: Further resources on this subject: Form customizations [article] Introduction to Reporting in Microsoft Dynamics CRM [article] Overview of Microsoft Dynamics CRM 2011 [article]

Middleware

Packt
30 Dec 2014
13 min read
In this article by Mario Casciaro, the author of the book Node.js Design Patterns, we will look at the importance of the middleware pattern. One of the most distinctive patterns in Node.js is definitely middleware. Unfortunately, it's also one of the most confusing for the inexperienced, especially for developers coming from the enterprise programming world. The reason for the disorientation is probably connected with the meaning of the term middleware, which in enterprise architecture jargon represents the various software suites that help to abstract lower-level mechanisms such as OS APIs, network communications, memory management, and so on, allowing the developer to focus only on the business case of the application. In this context, the term middleware recalls topics such as CORBA, Enterprise Service Bus, Spring, and JBoss, but in its more generic meaning it can also define any kind of software layer that acts like glue between lower-level services and the application (literally, the software in the middle). (For more resources related to this topic, see here.)
Middleware in Express
Express (http://expressjs.com) popularized the term middleware in the Node.js world, binding it to a very specific design pattern. In express, in fact, a middleware represents a set of services, typically functions, that are organized in a pipeline and are responsible for processing incoming HTTP requests and the corresponding responses. An express middleware has the following signature:
function(req, res, next) { ... }
Here, req is the incoming HTTP request, res is the response, and next is the callback to be invoked when the current middleware has completed its tasks; invoking it triggers the next middleware in the pipeline. Examples of the tasks carried out by an express middleware are the following:
Parsing the body of the request
Compressing/decompressing requests and responses
Producing access logs
Managing sessions
Providing Cross-site Request Forgery (CSRF) protection
If we think about it, these are all tasks that are not strictly related to the main functionality of an application; rather, they are accessories, components providing support to the rest of the application and allowing the actual request handlers to focus only on their main business logic. Essentially, those tasks are software in the middle.
Middleware as a pattern
The technique used to implement middleware in express is not new; in fact, it can be considered the Node.js incarnation of the Intercepting Filter pattern and the Chain of Responsibility pattern. In more generic terms, it also represents a processing pipeline, which reminds us of streams. Today, in Node.js, the word middleware is used well beyond the boundaries of the express framework, and indicates a particular pattern whereby a set of processing units, filters, and handlers, in the form of functions, are connected to form an asynchronous sequence in order to perform preprocessing and postprocessing of any kind of data. The main advantage of this pattern is flexibility; it allows us to obtain a plugin infrastructure with incredibly little effort, providing an unobtrusive way of extending a system with new filters and handlers. If you want to know more about the Intercepting Filter pattern, the following article is a good starting point: http://www.oracle.com/technetwork/java/interceptingfilter-142169.html.
A nice overview of the Chain of Responsibility pattern is available at this URL: http://java.dzone.com/articles/design-patterns-uncovered-chain-of-responsibility. The following diagram shows the components of the middleware pattern: The essential component of the pattern is the Middleware Manager, which is responsible for organizing and executing the middleware functions. The most important implementation details of the pattern are as follows: New middleware can be registered by invoking the use() function (the name of this function is a common convention in many implementations of this pattern, but we can choose any name). Usually, new middleware can only be appended at the end of the pipeline, but this is not a strict rule. When new data to process is received, the registered middleware is invoked in an asynchronous sequential execution flow. Each unit in the pipeline receives in input the result of the execution of the previous unit. Each middleware can decide to stop further processing of the data by simply not invoking its callback or by passing an error to the callback. An error situation usually triggers the execution of another sequence of middleware that is specifically dedicated to handling errors. There is no strict rule on how the data is processed and propagated in the pipeline. The strategies include: Augmenting the data with additional properties or functions Replacing the data with the result of some kind of processing Maintaining the immutability of the data and always returning fresh copies as result of the processing The right approach that we need to take depends on the way the Middleware Manager is implemented and on the type of processing carried out by the middleware itself. Creating a middleware framework for ØMQ Let's now demonstrate the pattern by building a middleware framework around the ØMQ (http://zeromq.org) messaging library. ØMQ (also known as ZMQ, or ZeroMQ) provides a simple interface for exchanging atomic messages across the network using a variety of protocols; it shines for its performances, and its basic set of abstractions are specifically built to facilitate the implementation of custom messaging architectures. For this reason, ØMQ is often chosen to build complex distributed systems. The interface of ØMQ is pretty low-level, it only allows us to use strings and binary buffers for messages, so any encoding or custom formatting of data has to be implemented by the users of the library. In the next example, we are going to build a middleware infrastructure to abstract the preprocessing and postprocessing of the data passing through a ØMQ socket, so that we can transparently work with JSON objects but also seamlessly compress the messages traveling over the wire. Before continuing with the example, please make sure to install the ØMQ native libraries following the instructions at this URL: http://zeromq.org/intro:get-the-software. Any version in the 4.0 branch should be enough for working on this example. The Middleware Manager The first step to build a middleware infrastructure around ØMQ is to create a component that is responsible for executing the middleware pipeline when a new message is received or sent. 
For the purpose, let's create a new module called zmqMiddlewareManager.js and let's start defining it: function ZmqMiddlewareManager(socket) { this.socket = socket; this.inboundMiddleware = []; //[1] this.outboundMiddleware = []; var self = this; socket.on('message', function(message) { //[2] self.executeMiddleware(self.inboundMiddleware, { data: message }); }); } module.exports = ZmqMiddlewareManager; This first code fragment defines a new constructor for our new component. It accepts a ØMQ socket as an argument and: Creates two empty lists that will contain our middleware functions, one for the inbound messages and another one for the outbound messages. Immediately, it starts listening for the new messages coming from the socket by attaching a new listener to the message event. In the listener, we process the inbound message by executing the inboundMiddleware pipeline. The next method of the ZmqMiddlewareManager prototype is responsible for executing the middleware when a new message is sent through the socket: ZmqMiddlewareManager.prototype.send = function(data) { var self = this; var message = { data: data}; self.executeMiddleware(self.outboundMiddleware, message,    function() {    self.socket.send(message.data);    } ); } This time the message is processed using the filters in the outboundMiddleware list and then passed to socket.send() for the actual network transmission. Now, we need a small method to append new middleware functions to our pipelines; we already mentioned that such a method is conventionally called use(): ZmqMiddlewareManager.prototype.use = function(middleware) { if(middleware.inbound) {    this.inboundMiddleware.push(middleware.inbound); }if(middleware.outbound) {    this.outboundMiddleware.unshift(middleware.outbound); } } Each middleware comes in pairs; in our implementation it's an object that contains two properties, inbound and outbound, that contain the middleware functions to be added to the respective list. It's important to observe here that the inbound middleware is pushed to the end of the inboundMiddleware list, while the outbound middleware is inserted at the beginning of the outboundMiddleware list. This is because complementary inbound/outbound middleware functions usually need to be executed in an inverted order. For example, if we want to decompress and then deserialize an inbound message using JSON, it means that for the outbound, we should instead first serialize and then compress. It's important to understand that this convention for organizing the middleware in pairs is not strictly part of the general pattern, but only an implementation detail of our specific example. Now, it's time to define the core of our component, the function that is responsible for executing the middleware: ZmqMiddlewareManager.prototype.executeMiddleware = function(middleware, arg, finish) {var self = this;(    function iterator(index) {      if(index === middleware.length) {        return finish && finish();      }      middleware[index].call(self, arg, function(err) { if(err) {        console.log('There was an error: ' + err.message);      }      iterator(++index);    }); })(0); } The preceding code should look very familiar; in fact, it is a simple implementation of the asynchronous sequential iteration pattern. 
Each function in the middleware array received in input is executed one after the other, and the same arg object is provided as an argument to each middleware function; this is the trickthat makes it possible to propagate the data from one middleware to the next. At the end of the iteration, the finish() callback is invoked. Please note that for brevity we are not supporting an error middleware pipeline. Normally, when a middleware function propagates an error, another set of middleware specifically dedicated to handling errors is executed. This can be easily implemented using the same technique that we are demonstrating here. A middleware to support JSON messages Now that we have implemented our Middleware Manager, we can create a pair of middleware functions to demonstrate how to process inbound and outbound messages. As we said, one of the goals of our middleware infrastructure is having a filter that serializes and deserializes JSON messages, so let's create a new middleware to take care of this. In a new module called middleware.js; let's include the following code: module.exports.json = function() { return {    inbound: function(message, next) {      message.data = JSON.parse(message.data.toString());      next();    },    outbound: function(message, next) {      message.data = new Buffer(JSON.stringify(message.data));      next();    } } } The json middleware that we just created is very simple: The inbound middleware deserializes the message received as an input and assigns the result back to the data property of message, so that it can be further processed along the pipeline The outbound middleware serializes any data found into message.data Design Patterns Please note how the middleware supported by our framework is quite different from the one used in express; this is totally normal and a perfect demonstration of how we can adapt this pattern to fit our specific need. Using the ØMQ middleware framework We are now ready to use the middleware infrastructure that we just created. To do that, we are going to build a very simple application, with a client sending a ping to a server at regular intervals and the server echoing back the message received. From an implementation perspective, we are going to rely on a request/reply messaging pattern using the req/rep socket pair provided by ØMQ (http://zguide. zeromq.org/page:all#Ask-and-Ye-Shall-Receive). We will then wrap the socketswith our zmqMiddlewareManager to get all the advantages from the middleware infrastructure that we built, including the middleware for serializing/deserializing JSON messages. The server Let's start by creating the server side (server.js). In the first part of the module we initialize our components: var zmq = require('zmq'); var ZmqMiddlewareManager = require('./zmqMiddlewareManager'); var middleware = require('./middleware'); var reply = zmq.socket('rep'); reply.bind('tcp://127.0.0.1:5000'); In the preceding code, we loaded the required dependencies and bind a ØMQ 'rep' (reply) socket to a local port. Next, we initialize our middleware: var zmqm = new ZmqMiddlewareManager(reply); zmqm.use(middleware.zlib()); zmqm.use(middleware.json()); We created a new ZmqMiddlewareManager object and then added two middlewares, one for compressing/decompressing the messages and another one for parsing/ serializing JSON messages. For brevity, we did not show the implementation of the zlib middleware. 
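The zlib middleware referenced above is not listed in this article. The following is a minimal sketch of how it could be added to middleware.js, mirroring the structure of the json middleware and using Node's built-in zlib module; the option-free deflate/inflate calls and the error handling are assumptions, not the book's actual implementation:

var zlib = require('zlib');

module.exports.zlib = function() {
  return {
    inbound: function(message, next) {
      // Decompress the raw buffer received from the socket before the
      // json middleware tries to parse it.
      zlib.inflate(message.data, function(err, decompressed) {
        if (err) return next(err); // the simple manager above just logs errors
        message.data = decompressed;
        next();
      });
    },
    outbound: function(message, next) {
      // Compress the buffer produced by the json middleware before it is
      // written to the socket.
      zlib.deflate(message.data, function(err, compressed) {
        if (err) return next(err);
        message.data = compressed;
        next();
      });
    }
  };
};

Because use() pushes inbound filters to the end of the list and unshifts outbound filters to the front, registering zlib before json gives the desired order: decompress then parse on the way in, serialize then compress on the way out.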
Now we are ready to handle a request coming from the client, we will do this by simply adding another middleware, this time using it as a request handler: zmqm.use({ inbound: function(message, next) { console.log('Received: ',    message.data); if(message.data.action === 'ping') {     this.send({action: 'pong', echo: message.data.echo});  }    next(); } }); Since this last middleware is defined after the zlib and json middlewares, we can transparently use the decompressed and deserialized message that is available in the message.data variable. On the other hand, any data passed to send() will be processed by the outbound middleware, which in our case will serialize then compress the data. The client On the client side of our little application, client.js, we will first have to initiate a new ØMQ req (request) socket connected to the port 5000, the one used by our server: var zmq = require('zmq'); var ZmqMiddlewareManager = require('./zmqMiddlewareManager'); var middleware = require('./middleware'); var request = zmq.socket('req'); request.connect('tcp://127.0.0.1:5000'); Then, we need to set up our middleware framework in the same way that we did for the server: var zmqm = new ZmqMiddlewareManager(request); zmqm.use(middleware.zlib()); zmqm.use(middleware.json()); Next, we create an inbound middleware to handle the responses coming from the server: zmqm.use({ inbound: function(message, next) {    console.log('Echoed back: ', message.data);    next(); } }); In the preceding code, we simply intercept any inbound response and print it to the console. Finally, we set up a timer to send some ping requests at regular intervals, always using the zmqMiddlewareManager to get all the advantages of our middleware: setInterval(function() { zmqm.send({action: 'ping', echo: Date.now()}); }, 1000); We can now try our application by first starting the server: node server We can then start the client with the following command: node client At this point, we should see the client sending messages and the server echoing them back. Our middleware framework did its job; it allowed us to decompress/compress and deserialize/serialize our messages transparently, leaving the handlers free to focus on their business logic! Summary In this article, we learned about the middleware pattern and the various facets of the pattern, and we also saw how to create a middleware framework and how to use. Resources for Article:  Further resources on this subject: Selecting and initializing the database [article] Exploring streams [article] So, what is Node.js? [article]

How Vector Features are Displayed

Packt
30 Dec 2014
23 min read
In this article by Erik Westra, author of the book Building Mapping Applications with QGIS, we will learn how QGIS symbols and renderers are used to control how vector features are displayed on a map. In addition to this, we will also learn saw how symbol layers work. The features within a vector map layer are displayed using a combination of renderer and symbol objects. The renderer chooses which symbol is to be used for a given feature, and the symbol does the actual drawing. There are three basic types of symbols defined by QGIS: Marker symbol: This displays a point as a filled circle Line symbol: This draws a line using a given line width and color Fill symbol: This draws the interior of a polygon with a given color These three types of symbols are implemented as subclasses of the qgis.core.QgsSymbolV2 class: qgis.core.QgsMarkerSymbolV2 qgis.core.QgsLineSymbolV2 qgis.core.QgsFillSymbolV2 You might be wondering why all these classes have "V2" in their name. This is a historical quirk of QGIS. Earlier versions of QGIS supported both an "old" and a "new" system of rendering, and the "V2" naming refers to the new rendering system. The old rendering system no longer exists, but the "V2" naming continues to maintain backward compatibility with existing code. Internally, symbols are rather complex, using "symbol layers" to draw multiple elements on top of each other. In most cases, however, you can make use of the "simple" version of the symbol. This makes it easier to create a new symbol without having to deal with the internal complexity of symbol layers. For example: symbol = QgsMarkerSymbolV2.createSimple({'width' : 1.0,                                        'color' : "255,0,0"}) While symbols draw the features onto the map, a renderer is used to choose which symbol to use to draw a particular feature. In the simplest case, the same symbol is used for every feature within a layer. This is called a single symbol renderer, and is represented by the qgis.core.QgsSingleSymbolRenderV2class. Other possibilities include: Categorized symbol renderer (qgis.core.QgsCategorizedSymbolRendererV2): This renderer chooses a symbol based on the value of an attribute. The categorized symbol renderer has a mapping from attribute values to symbols. Graduated symbol renderer (qgis.core.QgsGraduatedSymbolRendererV2): This type of renderer has a series of ranges of attribute values, and maps each range to an appropriate symbol. Using a single symbol renderer is very straightforward: symbol = ... renderer = QgsSingleSymbolRendererV2(symbol) layer.setRendererV2(renderer) To use a categorized symbol renderer, you first define a list of qgis.core.QgsRendererCategoryV2 objects, and then use that to create the renderer. For example: symbol_male = ... symbol_female = ...   categories = [] categories.append(QgsRendererCategoryV2("M", symbol_male, "Male")) categories.append(QgsRendererCategoryV2("F", symbol_female,                                        "Female"))   renderer = QgsCategorizedSymbolRendererV2("", categories) renderer.setClassAttribute("GENDER") layer.setRendererV2(renderer) Notice that the QgsRendererCategoryV2 constructor takes three parameters: the desired value, the symbol to use, and the label used to describe that category. Finally, to use a graduated symbol renderer, you define a list of qgis.core.QgsRendererRangeV2 objects and then use that to create your renderer. For example: symbol1 = ... symbol2 = ...   
ranges = [] ranges.append(QgsRendererRangeV2(0, 10, symbol1, "Range 1")) ranges.append(QgsRendererRange(11, 20, symbol2, "Range 2"))   renderer = QgsGraduatedSymbolRendererV2("", ranges) renderer.setClassAttribute("FIELD") layer.setRendererV2(renderer) Working with symbol layers Internally, symbols consist of one or more symbol layers that are displayed one on top of the other to draw the vector feature: The symbol layers are drawn in the order in which they are added to the symbol. So, in this example, Symbol Layer 1 will be drawn before Symbol Layer 2. This has the effect of drawing the second symbol layer on top of the first. Make sure you get the order of your symbol layers correct, or you may find a symbol layer completely obscured by another layer. While the symbols we have been working with so far have had only one layer, there are some clever tricks you can perform with multilayer symbols. When you create a symbol, it will automatically be initialized with a default symbol layer. For example, a line symbol (an instance of QgsLineSymbolV2) will be created with a single layer of type QgsSimpleLineSymbolLayerV2. This layer is used to draw the line feature onto the map. To work with symbol layers, you need to remove this default layer and replace it with your own symbol layer or layers. For example: symbol = QgsSymbolV2.defaultSymbol(layer.geometryType()) symbol.deleteSymbolLayer(0) # Remove default symbol layer.   symbol_layer_1 = QgsSimpleFillSymbolLayerV2() symbol_layer_1.setFillColor(QColor("yellow"))   symbol_layer_2 = QgsLinePatternFillSymbolLayer() symbol_layer_2.setLineAngle(30) symbol_layer_2.setDistance(2.0) symbol_layer_2.setLineWidth(0.5) symbol_layer_2.setColor(QColor("green"))   symbol.appendSymbolLayer(symbol_layer_1) symbol.appendSymbolLayer(symbol_layer_2) The following methods can be used to manipulate the layers within a symbol: symbol.symbolLayerCount(): This returns the number of symbol layers within this symbol symbol.symbolLayer(index): This returns the given symbol layer within the symbol. Note that the first symbol layer has an index of zero. symbol.changeSymbolLayer(index, symbol_layer): This replaces a given symbol layer within the symbol symbol.appendSymbolLayer(symbol_layer): This appends a new symbol layer to the symbol symbol.insertSymbolLayer(index, symbol_layer): This inserts a symbol layer at a given index symbol.deleteSymbolLayer(index): This removes the symbol layer at the given index Remember that to use the symbol once you've created it, you create an appropriate renderer and then assign that renderer to your map layer. For example: renderer = QgsSingleSymbolRendererV2(symbol) layer.setRendererV2(renderer) The following symbol layer classes are available for you to use: PyQGIS class Description Example QgsSimpleMarkerSymbolLayerV2 This displays a point geometry as a small colored circle.   QgsEllipseSymbolLayerV2 This displays a point geometry as an ellipse.   QgsFontMarkerSymbolLayerV2 This displays a point geometry as a single character. You can choose the font and character to be displayed.   QgsSvgMarkerSymbolLayerV2 This displays a point geometry using a single SVG format image.   QgsVectorFieldSymbolLayer This displays a point geometry by drawing a displacement line. One end of the line is the coordinate of the point, while the other end is calculated using attributes of the feature.   QgsSimpleLineSymbolLayerV2 This displays a line geometry or the outline of a polygon geometry using a line of a given color, width, and style.   
QgsMarkerLineSymbolLayerV2 This displays a line geometry or the outline of a polygon geometry by repeatedly drawing a marker symbol along the length of the line.   QgsSimpleFillSymbolLayerV2 This displays a polygon geometry by filling the interior with a given solid color and then drawing a line around the perimeter.   QgsGradientFillSymbolLayerV2 This fills the interior of a polygon geometry using a color or grayscale gradient.   QgsCentroidFillSymbolLayerV2 This draws a simple dot at the centroid of a polygon geometry.   QgsLinePatternFillSymbolLayer This draws the interior of a polygon geometry using a repeated line. You can choose the angle, width, and color to use for the line.   QgsPointPatternFillSymbolLayer This draws the interior of a polygon geometry using a repeated point.   QgsSVGFillSymbolLayer This draws the interior of a polygon geometry using a repeated SVG format image.   These predefined symbol layers, either individually or in various combinations, give you enormous flexibility in how features are to be displayed. However, if these aren't enough for you, you can also implement your own symbol layers using Python. We will look at how this can be done later in this article. Combining symbol layers By combining symbol layers, you can achieve a range of complex visual effects. For example, you could combine an instance of QgsSimpleMarkerSymbolLayerV2 with a QgsVectorFieldSymbolLayer to display a point geometry using two symbols at once: One of the main uses of symbol layers is to draw different LineString or PolyLine symbols to represent different types of roads. For example, you can draw a complex road symbol by combining multiple symbol layers, like this: This effect is achieved using three separate symbol layers: Here is the Python code used to generate the above map symbol: symbol = QgsLineSymbolV2.createSimple({}) symbol.deleteSymbolLayer(0) # Remove default symbol layer.   symbol_layer = QgsSimpleLineSymbolLayerV2() symbol_layer.setWidth(4) symbol_layer.setColor(QColor("light gray")) symbol_layer.setPenCapStyle(Qt.FlatCap) symbol.appendSymbolLayer(symbol_layer)   symbol_layer = QgsSimpleLineSymbolLayerV2() symbol_layer.setColor(QColor("black")) symbol_layer.setWidth(2) symbol_layer.setPenCapStyle(Qt.FlatCap) symbol.appendSymbolLayer(symbol_layer)   symbol_layer = QgsSimpleLineSymbolLayerV2() symbol_layer.setWidth(1) symbol_layer.setColor(QColor("white")) symbol_layer.setPenStyle(Qt.DotLine) symbol.appendSymbolLayer(symbol_layer) As you can see, you can set the line width, color, and style to create whatever effect you want. As always, you have to define the layers in the correct order, with the back-most symbol layer defined first. By combining line symbol layers in this way, you can create almost any type of road symbol that you want. You can also use symbol layers when displaying polygon geometries. For example, you can draw QgsPointPatternFillSymbolLayer on top of QgsSimpleFillSymbolLayerV2 to have repeated points on top of a simple filled polygon, like this: Finally, you can make use of transparency to allow the various symbol layers (or entire symbols) to blend into each other. For example, you can create a pinstripe effect by combining two symbol layers, like this: symbol = QgsFillSymbolV2.createSimple({}) symbol.deleteSymbolLayer(0) # Remove default symbol layer.   
symbol_layer = QgsGradientFillSymbolLayerV2() symbol_layer.setColor2(QColor("dark gray")) symbol_layer.setColor(QColor("white")) symbol.appendSymbolLayer(symbol_layer)   symbol_layer = QgsLinePatternFillSymbolLayer() symbol_layer.setColor(QColor(0, 0, 0, 20)) symbol_layer.setLineWidth(2) symbol_layer.setDistance(4) symbol_layer.setLineAngle(70) symbol.appendSymbolLayer(symbol_layer) The result is quite subtle and visually pleasing: In addition to changing the transparency for a symbol layer, you can also change the transparency for the symbol as a whole. This is done by using the setAlpha() method, like this: symbol.setAlpha(0.3) The result looks like this: Note that setAlpha() takes a floating point number between 0.0 and 1.0, while the transparency of a QColor object, like the ones we used earlier, is specified using an alpha value between 0 and 255. Implementing symbol layers in Python If the built-in symbol layers aren't flexible enough for your needs, you can implement your own symbol layers using Python. To do this, you create a subclass of the appropriate type of symbol layer (QgsMarkerSymbolLayerV2, QgsLineSymbolV2, or QgsFillSymbolV2) and implement the various drawing methods yourself. For example, here is a simple marker symbol layer that draws a cross for a Point geometry: class CrossSymbolLayer(QgsMarkerSymbolLayerV2):    def __init__(self, length=10.0, width=2.0):        QgsMarkerSymbolLayerV2.__init__(self)        self.length = length        self.width = width   def layerType(self):        return "Cross"   def properties(self):        return {'length' : self.length,               'width' : self.width}      def clone(self): return CrossSymbolLayer(self.length, self.width)      def startRender(self, context):        self.pen = QPen()        self.pen.setColor(self.color()) self.pen.setWidth(self.width)      def stopRender(self, context): self.pen = None   def renderPoint(self, point, context):        left = point.x() - self.length        right = point.x() + self.length        bottom = point.y() - self.length        top = point.y() + self.length          painter = context.renderContext().painter()        painter.setPen(self.pen)        painter.drawLine(left, bottom, right, top)        painter.drawLine(right, bottom, left, top) Using this custom symbol layer in your code is straightforward: symbol = QgsMarkerSymbolV2.createSimple({}) symbol.deleteSymbolLayer(0)   symbol_layer = CrossSymbolLayer() symbol_layer.setColor(QColor("gray"))   symbol.appendSymbolLayer(symbol_layer) Running this code will draw a cross at the location of each point geometry, as follows: Of course, this is a simple example, but it shows you how to use custom symbol layers implemented in Python. Let's now take a closer look at the implementation of the CrossSymbolLayer class, and see what each method does: __init__(): Notice how the __init__ method accepts parameters that customize the way the symbol layer works. These parameters, which should always have default values assigned to them, are the properties associated with the symbol layer. If you want to make your custom symbol available within the QGIS Layer Properties window, you will need to register your custom symbol layer and tell QGIS how to edit the symbol layer's properties. We will look at this shortly. layerType(): This method returns a unique name for your symbol layer. properties(): This should return a dictionary that contains the various properties used by this symbol layer. 
The properties returned by this method will be stored in the QGIS project file, and used later to restore the symbol layer. clone(): This method should return a copy of the symbol layer. Since we have defined our properties as parameters to the __init__ method, implementing this method simply involves creating a new instance of the class and copying the properties from the current symbol layer to the new instance. startRender(): This method is called before the first feature in the map layer is rendered. This can be used to define any objects that will be required to draw the feature. Rather than creating these objects each time, it is more efficient (and therefore faster) to create them only once to render all the features. In this example, we create the QPen object that we will use to draw the Point geometries. stopRender(): This method is called after the last feature has been rendered. This can be used to release the objects created by the startRender() method. renderPoint(): This is where all the work is done for drawing point geometries. As you can see, this method takes two parameters: the point at which to draw the symbol, and the rendering context (an instance of QgsSymbolV2RenderContext) to use for drawing the symbol. The rendering context provides various methods for accessing the feature being displayed, as well as information about the rendering operation, the current scale factor, etc. Most importantly, it allows you to access the PyQt QPainter object needed to actually draw the symbol onto the screen. The renderPoint() method is only used for symbol layers that draw point geometries. For line geometries, you should implement the renderPolyline() method, which has the following signature: def renderPolyline(self, points, context): The points parameter will be a QPolygonF object containing the various points that make up the LineString, and context will be the rendering context to use for drawing the geometry. If your symbol layer is intended to work with polygons, you should implement the renderPolygon() method, which looks like this: def renderPolygon(self, outline, rings, context): Here, outline is a QPolygonF object that contains the points that make up the exterior of the polygon, and rings is a list of QPolygonF objects that define the interior rings or "holes" within the polygon. As always, context is the rendering context to use when drawing the geometry. A custom symbol layer created in this way will work fine if you just want to use it within your own external PyQGIS application. However, if you want to use a custom symbol layer within a running copy of QGIS, and in particular, if you want to allow end users to work with the symbol layer using the Layer Properties window, there are some extra steps you will have to take, which are as follows: If you want the symbol to be visually highlighted when the user clicks on it, you will need to change your symbol layer's renderXXX() method to see if the feature being drawn has been selected by the user, and if so, change the way it is drawn. The easiest way to do this is to change the geometry's color. For example: if context.selected():    color = context.selectionColor() else:    color = self.color To allow the user to edit the symbol layer's properties, you should create a subclass of QgsSymbolLayerV2Widget, which defines the user interface to edit the properties. 
For example, a simple widget for the purpose of editing the length and width of a CrossSymbolLayer can be defined as follows: class CrossSymbolLayerWidget(QgsSymbolLayerV2Widget):    def __init__(self, parent=None):        QgsSymbolLayerV2Widget.__init__(self, parent)        self.layer = None          self.lengthField = QSpinBox(self)        self.lengthField.setMinimum(1)        self.lengthField.setMaximum(100)        self.connect(self.lengthField,                      SIGNAL("valueChanged(int)"),                      self.lengthChanged)          self.widthField = QSpinBox(self)        self.widthField.setMinimum(1)        self.widthField.setMaximum(100)        self.connect(self.widthField,                      SIGNAL("valueChanged(int)"),                      self.widthChanged)          self.form = QFormLayout()        self.form.addRow('Length', self.lengthField)        self.form.addRow('Width', self.widthField)          self.setLayout(self.form)      def setSymbolLayer(self, layer):        if layer.layerType() == "Cross":            self.layer = layer            self.lengthField.setValue(layer.length)            self.widthField.setValue(layer.width)      def symbolLayer(self):        return self.layer      def lengthChanged(self, n):        self.layer.length = n        self.emit(SIGNAL("changed()"))      def widthChanged(self, n):        self.layer.width = n        self.emit(SIGNAL("changed()")) We define the contents of our widget using the standard __init__() initializer. As you can see, we define two fields, lengthField and widthField, which let the user change the length and width properties respectively, for our symbol layer. The setSymbolLayer() method tells the widget which QgsSymbolLayerV2 object to use, while the symbolLayer() method returns the QgsSymbolLayerV2 object this widget is editing. Finally, the two XXXChanged() methods are called when the user changes the value of the fields, allowing us to update the symbol layer's properties to match the value set by the user. Finally, you will need to register your symbol layer. To do this, you create a subclass of QgsSymbolLayerV2AbstractMetadata and pass it to the QgsSymbolLayerV2Registry object's addSymbolLayerType() method. Here is an example implementation of the metadata for our CrossSymbolLayer class, along with the code to register it within QGIS: class CrossSymbolLayerMetadata(QgsSymbolLayerV2AbstractMetadata):    def __init__(self):        QgsSymbolLayerV2AbstractMetadata.__init__(self, "Cross", "Cross marker", QgsSymbolV2.Marker)      def createSymbolLayer(self, properties):        if "length" in properties:            length = int(properties['length'])        else:            length = 10        if "width" in properties:            width = int(properties['width'])        else:            width = 2        return CrossSymbolLayer(length, width)      def createSymbolLayerWidget(self, layer):        return CrossSymbolLayerWidget()   registry = QgsSymbolLayerV2Registry.instance() registry.addSymbolLayerType(CrossSymbolLayerMetadata()) Note that the parameters of QgsSymbolLayerV2AbstractMetadata.__init__() are as follows: The unique name for the symbol layer, which must match the name returned by the symbol layer's layerType() method. A display name for this symbol layer, as shown to the user within the Layer Properties window. The type of symbol that this symbol layer will be used for. 
The createSymbolLayer() method is used to restore the symbol layer based on the properties stored in the QGIS project file when the project was saved. The createSymbolLayerWidget() method is called to create the user interface widget that lets the user view and edit the symbol layer's properties. Implementing renderers in Python If you need to choose symbols based on more complicated criteria than what the built-in renderers will provide, you can write your own custom QgsFeatureRendererV2 subclass using Python. For example, the following Python code implements a simple renderer that alternates between odd and even symbols as point features are displayed: class OddEvenRenderer(QgsFeatureRendererV2):    def __init__(self): QgsFeatureRendererV2.__init__(self, "OddEvenRenderer")        self.evenSymbol = QgsMarkerSymbolV2.createSimple({})        self.evenSymbol.setColor(QColor("light gray"))        self.oddSymbol = QgsMarkerSymbolV2.createSimple({})        self.oddSymbol.setColor(QColor("black"))        self.n = 0      def clone(self):        return OddEvenRenderer()      def symbolForFeature(self, feature):        self.n = self.n + 1        if self.n % 2 == 0:            return self.evenSymbol        else:            return self.oddSymbol      def startRender(self, context, layer):        self.n = 0        self.oddSymbol.startRender(context)        self.evenSymbol.startRender(context)      def stopRender(self, context):        self.oddSymbol.stopRender(context)        self.evenSymbol.stopRender(context)      def usedAttributes(self):        return [] Using this renderer will cause the various point geometries to be displayed in alternating colors, for example: Let's take a closer look at how this class was implemented, and what the various methods do: __init__(): This is your standard Python initializer. Notice how we have to provide a unique name for the renderer when calling the QgsFeatureRendererV2.__init__() method; this is used to keep track of the various renderers within QGIS itself. clone(): This creates a copy of this renderer. If your renderer uses properties to control how it works, this method should copy those properties into the new renderer object. symbolForFeature(): This returns the symbol to use for drawing the given feature. startRender(): This prepares to start rendering the features within the map layer. As the renderer can make use of multiple symbols, you need to implement this so that your symbols are also given a chance to prepare for rendering. stopRender(): This finishes rendering the features. Once again, you need to implement this so that your symbols can have a chance to clean up once the rendering process has finished. usedAttributes():This method should be implemented to return the list of attributes that the renderer requires if your renderer makes use of feature attributes to choose between the various symbols,. If you wish, you can also implement your own widget that lets the user change the way the renderer works. This is done by subclassing QgsRendererV2Widget and setting up the widget to edit the renderer's various properties in the same way that we implemented a subclass of QgsSymbolLayerV2Widget to edit the properties for a symbol layer. You will also need to provide metadata about your new renderer (by subclassing QgsRendererV2AbstractMetadata) and use the QgsRendererV2Registry object to register your new renderer. 
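The renderer registration code itself is not shown in this article. As a rough sketch, assuming the same pattern used above for the symbol layer metadata and that the PyQGIS 2.x registry exposes an addRenderer() method, it could look like the following; check the exact signatures against the PyQGIS API documentation for your QGIS version:

class OddEvenRendererMetadata(QgsRendererV2AbstractMetadata):
    def __init__(self):
        # Unique renderer name and the display name shown to the user.
        QgsRendererV2AbstractMetadata.__init__(self, "OddEvenRenderer",
                                               "Odd/even renderer")

    def createRenderer(self, element):
        # Called when QGIS needs to recreate the renderer, for example when a
        # saved project is loaded. The element argument (assumed here to be the
        # renderer's saved XML element) is ignored because our renderer has no
        # properties to restore.
        return OddEvenRenderer()

    def createRendererWidget(self, layer, style, renderer):
        # Return a QgsRendererV2Widget subclass here to let the user edit the
        # renderer's properties; returning None means no configuration UI.
        return None

registry = QgsRendererV2Registry.instance()
registry.addRenderer(OddEvenRendererMetadata())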
If you do this, the user will be able to select your custom renderer for new map layers, and change the way your renderer works by editing the renderer's properties. Summary In this article, we learned how QGIS symbols and renderers are used to control how vector features are displayed on a map. We saw that there are three standard types of symbols: marker symbols for drawing points, line symbols for drawing lines, and fill symbols for drawing the interior of a polygon. We then learned how to instantiate a "simple" version of each of these symbols for use in your programs. We next looked at the built-in renderers, and how these can be used to choose the same symbol for every feature (using the QgsSingleSymbolRenderV2 class), to select a symbol based on the exact value of an attribute (using QgsCategorizedSymbolRendererV2), and to choose a symbol based on a range of attribute values (using the QgsGraduatedSymbolRendererV2 class). We then saw how symbol layers work, and how to manipulate the layers within a symbol. We looked at all the different types of symbol layers built into QGIS, and learned how they can be combined to produce sophisticated visual effects. Finally, we saw how to implement our own symbol layers using Python, and how to write your own renderer from scratch if one of the existing renderer classes doesn't meet your needs. Using these various PyQGIS classes, you have an extremely powerful set of tools at your disposal for displaying vector data within a map. While simple visual effects can be achieved with a minimum of fuss, you can produce practically any visual effect you want using an appropriate combination of built-in or custom-written QGIS symbols and renderers. Resources for Article: Further resources on this subject: Combining Vector and Raster Datasets [article] QGIS Feature Selection Tools [article] Creating a Map [article]