How-To Tutorials

06 Feb 2015

13 min read

Tour of Xcode

06 Feb 2015

In this article, written by Jayant Varma, the author of Xcode 6 Essentials, we shall look at Xcode closely as this is going to be the tool you would use quite a lot for all aspects of your app development for Apple devices. It is a good idea to know and be familiar with the interface, the sections, shortcut keys, and so on. (For more resources related to this topic, see here.) Starting Xcode Xcode, like many other Mac applications, is found in the Applications folder or the Launchpad. On starting Xcode, you will be greeted with the launch screen that offers some entry points for working with Xcode. Mostly, you will select Create a new Xcode project or Check out an existing project , if you have an existing project to continue work on. Xcode remembers what it was doing last, so if you had a project or file open, it will open up those windows again. Creating a new project After selecting the Create a new project option, we are guided via a wizard that helps us get started. Selecting the project type The first step is to select what type of project you want to create. At the moment, there are two distinct types of projects, mobile (iOS) or desktop (OS X) that you can create. Within each of those types, you can select the type of project you want. The screenshot displays a standard configuration for iOS application projects. The templates used when the selected type of project is created are self sufficient, that is, when the Run button is pressed, the app compiles and runs. It might do nothing, as this is a minimalistic template. On selecting the type of project, we can select the next step: Setting the project options This step allows selecting the options, namely setting the application name, the organization name, identifier, language, and devices to support. In the past, the language was always set to Objective-C, however with Xcode 6, there are two options: objective-C and Swift Setting the project properties On creation, the main screen is displayed. Here it offers the option to change other details related to the application such as the version number and build. It also allows you to configure the team ID and certificates used for signing the application to test on a mobile device or for distribution to the App Store. It also allows you to set the compatibility for earlier versions. The orientation and app icons, splash screens, and so on are also set from this screen. If you want to set these up later on in the project, it is fine, this can be accessed at any time and does not stop you from development. It needs to be set prior to deploying it on a device or creating an App Store ready application. Xcode overview Let us have a look at the Xcode interface to familiarize ourselves with the same as it would help improve productivity when building your application. The top section immediately following the traffic light (window chrome) displays a Play and Stop button. This allows the project to run and stop. The breadcrumb toolbar displays the project-specific settings with respect to the product and the target. With an iOS project, it could be a particular simulator for iPhone, iPad, and so on, or a physical device (number 5 in the following screenshot). Just under this are vertical areas that are the main content area with all the files, editors, UI, and so on. These can be displayed or hidden as required and can be stacked vertically or horizontally. The distinct areas in Xcode are as follows: Project navigation (number1) Editor and assistant editor (number 2 ) and (number 3 ) Utility/inspector (number 4 ) The toolbar (number 5 ) and (number 6 ) These sections can be switched on and off (shown or hidden) as required to make space for other sections or more screen space to work with: Sections in Xcode The project section The project navigation section has three sub sections, the topmost being the project toolbar that has eight icons. These can be seen as in the following screenshot. The next sub section contains the project files and all the assets required for this project. The bottom most section consists of recently edited files and filters: You can use the keyboard shortcuts to access these areas quickly with the CMD + 1...8 keys. The eight areas available under project navigation are key and for the beginner to Xcode, this could be a bit daunting. When you run the project, the current section might change and display another where you might wonder how to get back to the project (file) navigator. Getting familiar with these is always helpful and the easiest way to navigate between these is the CMD + 1..8 keys. Project navigator ( CMD + 1 ): This displays all of the files, folders, assets, frameworks, and so on that are part of this project. This is displayed as a hierarchical view and is the way that a majority of developers access their files, folders, and so on. Symbol navigator ( CMD + 2 ): This displays all of the classes, members, and methods that are available in them. This is the easiest way to navigate quickly to a method/function, attribute/property. Search navigator ( CMD + 3 ): This allows you to search the project for a particular match. This is quite useful to find and replace text. Issues navigator ( CMD + 4 ): This displays the warning and errors that occur while typing your code or on building and running it. This also displays the results of the static analyzer. Tests navigator ( CMD + 5 ); This displays the tests that you have present in your code either added by yourself or the default ones created with the project. Debug navigator ( CMD + 6 ): This displays the information about the application when you choose to run it. It has some amazing detailed information on CPU usage, memory usage, disk usage, threads, and so on. Breakpoint navigator ( CMD + 7 ): This displays all the breakpoints in your project from all files. This also allows you to create exception and symbolic breakpoints. Log navigator ( CMD + 8 ): This displays a log of all actions carried out, namely compiling, building, and running. This is more useful when used to determine the results of automated builds The editor and assistant editor sections The second area contains the editor and assistant editor sections. These display the code, the XIB (as appropriate), storyboard files, device previews, and so on. Each of the sub sections have a jump bar on the top that relates to files and allow for navigating back and forth in the files and display the location of the file in the workspace. To the right from this is a mini issues navigator that displays all warnings and errors. In the case of the assistant editors, it also displays two buttons: one to add a new assistant editor area and another to close it. Source code editors While we are looking at the interface, it is worth noting that the Xcode code editor is a very advanced editor with a lot of features, which is now seen as standard with a lot of text editors. Some of the features that make working with Xcode easier are as follows: Code folding : This feature helps to hide code at points such as the function declaration, loops, matching brace brackets, and so on. When a function or portion of code is folded, it hides it from view, thereby allowing you to view other areas of the code that would not be visible unless you scrolled. Syntax highlighting : This is one of the most useful features as it helps you, the developer, to visually, at a glance, differentiate your source code from variables, constants, and strings. Xcode has syntax highlighting for a couple of languages as mentioned earlier. Context help : This is one of the best features whereby when you hover over a word in the source code with OPT pressed, it shows a dotted underline and the cursor changes to a question mark. When you click on a word with the dotted underline and the question mark cursor, it displays a popup with details about that word. It also highlights all instances of that word in the file. The popup details as much information as available. If it is a variable or a function that you have added to the code, then it will display the name of the file where it was declared. If it is a word that is contained in the Apple libraries, then it displays the description and other additional details. Context jump : This is another cool feature that allows jumping to the point of declaration of that word. This is achieved by clicking on a word while keeping the CMD button pressed. In many cases, this is mainly helpful to know how the function is declared and what parameters it expects. It can also be useful to get information on other enumerators and constants used with that function. The jump could be in the same file as where you are editing the code or it could be to the header files where they are declared. Edit all in scope : This is a cool feature where you can edit all of the instances of the word together rather than using search and replace. A case scenario is if you want to change the name of a variable and ensure that all instances you are using in the file are changed but not the ones that are text, then you can use this option to quickly change it. Catching mistakes with fix-it : This is another cool feature in Xcode that will save you a lot of time and hassle. As you type text, Xcode keeps analyzing the code and looking for errors. If you have declared a variable and not used it in your code, Xcode immediately draws attention to it suggesting that the variable is an unused variable. However, if it was supposed to be a pointer and you have declared it without *; Xcode immediately flags it as an error that the interface type cannot be statically allocated. It offers a fix-it solution of inserting * and the code has a greyed * character showing where it will be added. This helps the developer fix commonly overlooked issues such as missing semicolons, missing declarations, or misspelled variable names. Code completion : This is the bit that makes writing code so much easier, type in a few letters of the function name and Xcode pops up a list of functions, constants, methods, and so on that start with those letters and displays all of the required parameters (as applicable) including the return type. When selected, it adds the token placeholders that can be replaced with the actual parameter values. The results might vary from person to person depending on the settings and the speed of the system you run Xcode on. The assistant editor The assistant editor is mainly used to display the counterparts and related files to the file open in the primary editor (generally used when working with Objective-C where the .h or.m files are the related files). The assistant editors track the contents of the editor. Xcode is quite intelligent and knows the corresponding sections and counterparts. When you click on a file, it opens up in the editor. However, pressing the OPT + Shift while clicking on the file, you would be provided with an interactive dialog to select where to open the file. The options include the primary editor or the assistant editor. You can also add assistant editors as required. Another way to open a file quickly is to use the Open Quickly option, which has a shortcut key of CMD + Shift + O . This displays a textbox that allows accessing a file from the project. The utility/inspector section The last section contains the inspector and library. This section changes based on the type of file selected in the current editor. The inspector has 6 tabs/sections and they are as follows: The file inspector ( CMD + OPT + 1 ): This displays the physical file information for the file selected. For code files, it is the text encoding, the targets that it belongs to, and the physical file path. While for the storyboard, it is the physical file path and allows setting attributes such as auto layout and size classes (new in Xcode 6). The quick help inspector ( CMD + OPT + 2 ): This displays information about the class or object selected. The identity inspector ( CMD + OPT + 3 ): This displays the class name, ID, and others that identify the object selected. The attributes inspector ( CMD + OPT + 4 ): This displays the attributes for the object selected as if it is the initial root view controller, does it extend under the top bars or not, if it has a navigation bar or not, and others. This also displays the user-defined attributes (a new feature with Xcode 6). The size inspector ( CMD + OPT + 5 ): This displays the size of the control selected and the associated constraints that help position it on the container. The connections inspector ( CMD + OPT + 6 ): This displays the connections created in the Interface Builder between the UI and the code. The lower half of this inspector contains four options that help you work efficiently, they are as follows: The file template library : This contains the options to create a new class, protocol. The options that are available when selecting the File | New option from the menu. The code snippets library : This is a wonderful but not widely used option. This can hold code snippets that can help you avoid writing repetitive blocks of code in your app. You can drag and drop the snippet to your code in the editor. This also offers features such as shortcuts, scopes, platforms, and languages. So you can have a shortcut such as appDidLoad (for example) that inserts the code to create and populate a button. This is achieved simply by setting the platform as appropriate to iOS or OS X. After creating a code snippet, as soon as you type the first few characters, the code snippet shows up in the list of autocomplete options; The object library : This is the toolbox that contains all of the controls that you need for creating your UI, be it a button, a label, a Table View, view, View Controller, or anything else. Adding a code snippet is as easy as dragging the selected code from the editor onto the snippet area. It is a little tricky because the moment you start dragging, it could break your selection highlight. You need to select the text, click (hold) and then drag it. The media library : This contains the list of all images and other media types that are available to this project/workspace. Summary In this article, you have seen a quick tour of Xcode, keeping the shortcuts and tips handy as they really do help get things done faster. The code snippets are a wonderful feature that allow for quickly setting up commonly used code with shortcut keywords. Resources for Article: Further resources on this subject: Introducing Xcode Tools for iPhone Development [article] Xcode 4 ios: Displaying Notification Messages [article] Linking OpenCV to an iOS project [article]

0
0
9665

article-image-basic-and-interactive-plots

Packt

06 Feb 2015

19 min read

Basic and Interactive Plots

Packt

06 Feb 2015

19 min read

0
0
2492

article-image-lync-2013-hybrid-and-lync-online

Packt

06 Feb 2015

27 min read

Lync 2013 Hybrid and Lync Online

Packt

06 Feb 2015

27 min read

0
0
12847

Packt

06 Feb 2015

12 min read

Qlik Sense's Vision

Packt

06 Feb 2015

12 min read

In this article by Christopher Ilacqua, Henric Cronström, and James Richardson, authors of the book Learning Qlik® Sense, we will look at the evolving requirements that compel organizations to readdress how they deliver business intelligence and support data-driven decision-making. This is important as it supplies some of the reasons as to why Qlik® Sense is relevant and important to their success. The purpose of covering these factors is so that you can consider and plan for them in your organization. Among other things, in this article, we will cover the following topics: The ongoing data explosion The rise of in-memory processing Barrierless BI through Human-Computer Interaction The consumerization of BI and the rise of self-service The use of information as an asset The changing role of IT (For more resources related to this topic, see here.) Evolving market factors Technologies are developed and evolved in response to the needs of the environment they are created and used within. The most successful new technologies anticipate upcoming changes in order to help people take advantage of altered circumstances or reimagine how things are done. Any market is defined by both the suppliers—in this case, Qlik®—and the buyers, that is, the people who want to get more use and value from their information. Buyers' wants and needs are driven by a variety of macro and micro factors, and these are always in flux in some markets more than others. This is obviously and apparently the case in the world of data, BI, and analytics, which has been changing at a great pace due to a number of factors discussed further in the rest of this article. Qlik Sense has been designed to be the means through which organizations and the people that are a part of them thrive in a changed environment. Big, big, and even bigger data A key factor is that there's simply much more data in many forms to analyze than before. We're in the middle of an ongoing, accelerating data boom. According to Science Daily, 90 percent of the world's data was generated over the past two years. The fact is that with technologies such as Hadoop and NoSQL databases, we now have unprecedented access to cost-effective data storage. With vast amounts of data now storable and available for analysis, people need a way to sort the signal from the noise. People from a wider variety of roles—not all of them BI users or business analysts—are demanding better, greater access to data, regardless of where it comes from. Qlik Sense's fundamental design centers on bringing varied data together for exploration in an easy and powerful way. The slow spinning down of the disk At the same time, we are seeing a shift in how computation occurs and potentially, how information is managed. Fundamentals of the computing architectures that we've used for decades, the spinning disk and moving read head, are becoming outmoded. This means storing and accessing data has been around since Edison invented the cylinder phonograph in 1877. It's about time this changed. This technology has served us very well; it was elegant and reliable, but it has limitations. Speed limitations primarily. Fundamentals that we take for granted today in BI, such as relational and multidimensional storage models, were built around these limitations. So were our IT skills, whether we realized it at the time. With the use of in-memory processing and 64-bit addressable memory spaces, these limitations are gone! This means a complete change in how we think about analysis. Processing data in memory means we can do analysis that was impractical or impossible before with the old approach. With in-memory computing, analysis that would've taken days before, now takes just seconds (or much less). However, why does it matter? Because it allows us to use the time more effectively; after all, time is the most finite resource of all. In-memory computing enables us to ask more questions, test more scenarios, do more experiments, debunk more hypotheses, explore more data, and run more simulations in the short window available to us. For IT, it means no longer trying to second-guess what users will do months or years in advance and trying to premodel it in order to achieve acceptable response times. People hate watching the hourglass spin. Qlik Sense's predecessor QlikView® was built on the exploitation of in-memory processing; Qlik Sense has it at its core too. Ubiquitous computing and the Internet of Things You may know that more than a billion people use Facebook, but did you know that the majority of those people do so from a mobile device? The growth in the number of devices connected to the Internet is absolutely astonishing. According to Cisco's Zettabyte Era report, Internet traffic from wireless devices will exceed traffic from wired devices in 2014. If we were writing this article even as recently as a year ago, we'd probably be talking about mobile BI as a separate thing from desktop or laptop delivered analytics. The fact of the matter is that we've quickly gone beyond that. For many people now, the most common way to use technology is on a mobile device, and they expect the kind of experience they've become used to on their iOS or Android device to be mirrored in complex software, such as the technology they use for visual discovery and analytics. From its inception, Qlik Sense has had mobile usage in the center of its design ethos. It's the first data discovery software to be built for mobiles, and that's evident in how it uses HTML5 to automatically render output for the device being used, whatever it is. Plug in a laptop running Qlik Sense to a 70-inch OLED TV and the visual output is resized and re-expressed to optimize the new form factor. So mobile is the new normal. This may be astonishing but it's just the beginning. Mobile technology isn't just a medium to deliver information to people, but an acceleration of data production for analysis too. By 2020, pretty much everyone and an increasing number of things will be connected to the Internet. There are 7 billion people on the planet today. Intel predicts that by 2020, more than 31 billion devices will be connected to the Internet. So, that's not just devices used by people directly to consume or share information. More and more things will be put online and communicate their state: cars, fridges, lampposts, shoes, rubbish bins, pets, plants, heating systems—you name it. These devices will generate a huge amount of data from sensors that monitor all kinds of measurable attributes: temperature, velocity, direction, orientation, and time. This means an increasing opportunity to understand a huge gamut of data, but without the right technology and approaches it will be complex to analyze what is going on. Old methods of analysis won't work, as they don't move quickly enough. The variety and volume of information that can be analyzed will explode at an exponential rate. The rise of this type of big data makes us redefine how we build, deliver, and even promote analytics. It is an opportunity for those organizations that can exploit it through analysis; this can sort the signals from the noise and make sense of the patterns in the data. Qlik Sense is designed as just such a signal booster; it takes how users can zoom and pan through information too large for them to easily understand the product. Unbound Human-Computer Interaction We touched on the boundary between the computing power and the humans using it in the previous section. Increasingly, we're removing barriers between humans and technology. Take the rise of touch devices. Users don't want to just view data presented to them in a static form. Instead, they want to "feel" the data and interact with it. The same is increasingly true of BI. The adoption of BI tools has been too low because the technology has been hard to use. Adoption has been low because in the past BI tools often required people to conform to the tool's way of working, rather than reflecting the user's way of thinking. The aspiration for Qlik Sense (when part of the QlikView.Next project) was that the software should be both "gorgeous and genius". The genius part obviously refers to the built-in intelligence, the smarts, the software will have. The gorgeous part is misunderstood or at least oversimplified. Yes, it means cosmetically attractive (which is important) but much more importantly, it means enjoyable to use and experience. In other words, Qlik Sense should never be jarring to users but seamless, perhaps almost transparent to them, inducing a state of mental flow that encourages thinking about the question being considered rather than the tool used to answer it. The aim was to be of most value to people. Qlik Sense will empower users to explore their data and uncover hidden insights, naturally. Evolving customer requirements It is not only the external market drivers that impact how we use information. Our organizations and the people that work within them are also changing in their attitude towards technology, how they express ideas through data, and how increasingly they make use of data as a competitive weapon. Consumerization of BI and the rise of self-service The consumerization of any technology space is all about how enterprises are affected by, and can take advantage of, new technologies and models that originate and develop in the consumer marker, rather than in the enterprise IT sector. The reality is that individuals react quicker than enterprises to changes in technology. As such, consumerization cannot be stopped, nor is it something to be adopted. It can be embraced. While it's not viable to build a BI strategy around consumerization alone, its impact must be considered. Consumerization makes itself felt in three areas: Technology: Most investment in innovation occurs in the consumer space first, with enterprise vendors incorporating consumer-derived features after the fact. (Think about how vendors added the browser as a UI for business software applications.) Economics: Consumer offerings are often less expensive or free (to try) with a low barrier of entry. This drives prices down, including enterprise sectors, and alters selection behavior. People: Demographics, which is the flow of Millennial Generation into the workplace, and the blurring of home/work boundaries and roles, which may be seen from a traditional IT perspective as rogue users, with demands to BYOPC or device. In line with consumerization, BI users want to be able to pick up and just use the technology to create and share engaging solutions; they don't want to read the manual. This places a high degree of importance on the Human-Computer Interaction (HCI) aspects of a BI product (refer to the preceding list) and governed access to information and deployment design. Add mobility to this and you get a brand new sourcing and adoption dynamic in BI, one that Qlik engendered, and Qlik Sense is designed to take advantage of. Think about how Qlik Sense Desktop was made available as a freemium offer. Information as an asset and differentiator As times change, so do differentiators. For example, car manufacturers in the 1980s differentiated themselves based on reliability, making sure their cars started every single time. Today, we expect that our cars will start; reliability is now a commodity. The same is true for ERP systems. Originally, companies implemented ERPs to improve reliability, but in today's post-ERP world, companies are shifting to differentiating their businesses based on information. This means our focus changes from apps to analytics. And analytics apps, like those delivered by Qlik Sense, help companies access the data they need to set themselves apart from the competition. However, to get maximum return from information, the analysis must be delivered fast enough, and in sync with the operational tempo people need. Things are speeding up all the time. For example, take the fashion industry. Large mainstream fashion retailers used to work two seasons per year. Those that stuck to that were destroyed by fast fashion retailers. The same is true for old style, system-of-record BI tools; they just can't cope with today's demands for speed and agility. The rise of information activism A new, tech-savvy generation is entering the workforce, and their expectations are different than those of past generations. The Beloit College Mindset List for the entering class of 2017 gives the perspective of students entering college this year, how they see the world, and the reality they've known all their lives. For this year's freshman class, Java has never been just a cup of coffee and a tablet is no longer something you take in the morning. This new generation of workers grew up with the Internet and is less likely to be passive with data. They bring their own devices everywhere they go, and expect it to be easy to mash-up data, communicate, and collaborate with their peers. The evolution and elevation of the role of IT We've all read about how the role of IT is changing, and the question CIOs today must ask themselves is: "How do we drive innovation?". IT must transform from being gatekeepers (doers) to storekeepers (enablers), providing business users with self-service tools they need to be successful. However, to achieve this transformation, they need to stock helpful tools and provide consumable information products or apps. Qlik Sense is a key part of the armory that IT needs to provide to be successful in this transformation. Summary In this article, we looked at the factors that provide the wider context for the use of Qlik Sense. The factors covered arise out of both increasing technical capability and demands to compete in a globalized, information-centric world, where out-analyzing your competitors is a key success factor. Resources for Article: Further resources on this subject: Securing QlikView Documents [article] Conozca QlikView [article] Introducing QlikView elements [article]

0
0
2208

article-image-five-kinds-python-functions-python-34-edition

Packt

06 Feb 2015

33 min read

The Five Kinds of Python Functions Python 3.4 Edition

Packt

06 Feb 2015

33 min read

This article is written by Steven Lott, author of the book Functional Python Programming. You can find more about him at http://slott-softwarearchitect.blogspot.com. (For more resources related to this topic, see here.) What's This About? We're going to look at various ways that Python 3 lets us define things which behave like functions. The proper term here is Callable – we're looking at objects that can be called like a function. We'll look at the following Python constructs: Function definitions Higher-order functions Function wrappers (around methods) Lambdas Callable objects Generator functions and the yield parameter And yes, we're aware that the list above has six items on it. That's because higher-order functions in Python aren't really all that complex or different. In some languages, functions that take functions are arguments involving special syntax. In Python, it's simple and common and barely worth mentioning as a separate topic. We'll look at when it's appropriate and inappropriate to use one or the other of these various functional forms. Some background Let's take a quick peek at a basic bit of mathematical formalism. We'll look at a function as an abstract formalism. We often annotate it like this: This shows us that f() is a function. It has one argument, x, and will map this to a single value, y. Some mathematical functions are written in front, for example, y=sin x. Some are written in other places around the argument, for example, y=|x|. In Python, the syntax is more consistent, for example, we use a function like this: >>> abs(-5)5 We've applied the abs() function to an argument value of -5. The argument value was mapped to a value of 5. Terminology Consider the following function: In this definition, the argument is a pair of values, (a,b). This is called the domain. We can summarize it as the domain of values for which the function is defined. Outside this domain, the function is not defined. In Python, we get a TypeError exception if we provide one value or three values as the argument. The function maps the domain pair to a pair of values, (q,r). This is the range of the function. We can call this the range of values that could be returned by the function. Mathematical function features As we look at the abstract mathematical definition of functions, we note that functions are generally assumed to have no hysteresis; they have no history or memory of prior use. This is sometimes called the property of being idempotent: the results are always the same for a given argument value. We see this in Python as a common feature. But it's not universally true. We'll look at a number of exceptions to the rule of idempotence. Here's an example of the usual situation: >>> int("10f1", 16)4337 The value returned from the evaluation of int("10f1", 16) never changes. There are, however, some common examples of non-idempotent functions in Python. Examples of hysteresis Here are three common situations where a function has hysteresis. In some cases, results vary based on history. In other cases, results vary based on events in some external environment, such as follows: Random number generators. We don't want them to produce the same value over and over again. The Python random.randrange() function, is not obviously idempotent. OS functions depend on the state of the machine as a whole. The os.listdir() function returns values that depend on the use of functions such as os.unlink(), os.rename(), and open() (among several others).While the rules are generally simple, it requires a stateful object outside the narrow world of the code itself. These are examples of Python functions that don't completely fit the formal mathematical definition; they lack idempotence, and their values depend on history, other functions, or both. Function Definitions Python has two statements that are essential features of function definition. The def statement specifies the domain and the return statement(s) specify the range. A simplified gloss of the syntax is as follows: def name(params): body return expression In effect, the function's domain is defined by the parameters provided in the def statement. This list of parameter names is not all the information on the domain, however. Even if we use one of the Python extensions to add type annotations, that's still not all the information. There may be if statements in the body of the function that impose additional explicit restrictions. There may be other functions that impose their own kind of implicit restrictions. If, for example, the body included math.sqrt() then there would be an implicit restriction on some values being non-negative. The return statements provide the function's range. An empty return statement means a range of simply None values. When there are multiple return statements, the range is the union of the ranges on all the return statements. This mapping between Python syntax and mathematical concepts isn't very complete. We need more information about a function. Example definition Here's an example of function definition: def odd(n): """odd(n) -> boolean, true if n is odd.""" return n % 2 == 1 What do does this definition tell us? Several things such as: Domain: We know that this function accepts n, a single object. Range: Boolean value, True if n is an odd number. This is the most likely interpretation. It's also remotely possible that the class of n has repurposed __mod__() or __rmod__() methods, in which case the semantics can be pretty obscure. Because of the inherent ambiguity in Python, this function has provided a triple-quoted """Docstring""" parameter with a summary of the function. This is a best practice, and should be followed universally except in articles like this where it gets too long-winded to include a docstring parameter everywhere. In this case, the doctoring parameter doesn't state unambiguously that n is intended to be a number. There are two ways to handle this gap, they are as follows: Actually include words like n is a number in the docstring parameter Include the docstring parameter test cases that show the required behavior Either is acceptable. Both are preferable. Using a function To complete this example, here's how we'd use this odd little function named odd(): >>> odd(3)True>>> odd(4)False This kind of example text can be included into the docstring parameter to create two test cases that offer insight into what the function really means. The lack of declarations More verbose type declarations—as used in many popular programming languages—aren't actually enough information to fully specify a function's domain and range. To be rigorously complete, we need type definitions that include optional predicates. Take a look at the following command: isinstance(n,int) and n >= 0 The assert statement is a good place for this kind of additional argument domain checking. This isn't the perfect solution because assert statements can be disabled very easily. It can help during design and testing and it can help people to read your code. The fussy formal declarations of data type used in other languages are not really needed in Python. Python replaces an up-front claim about required types with a runtime search for appropriate class methods. This works because each Python object has all the type information bound into it. Static compile-time type information is redundant, since the runtime type information is complete. A Python function definition is pretty spare. In includes the minimal amount of information about the function. There are no formal declaration of parameter types or return type. This odd little function will work with any object that implements the % operator: Generally, this means any object that implements __mod__() or __rmod__(). This means most subclasses of numbers.Number. It also means instances of any class that happen to provide these methods. That could become very weird, but still possible. We hesitate to think about non-numeric objects that work with the number-like % operator. Some Python features In Python, functions we declare are proper first-class objects. This means that they have attributes that can be assigned to variables and placed into collections. Quite a few clever things can be done with function objects. One of the most elegant things is to use a function as an argument or a return value from another function. The ability to do this means that we can easily create and use higher-order functions in Python. For folks who know languages such as C (and C++), functions aren't proper first-class objects. A pointer to a function, however, is a first class object in C. But the function itself is a block of code that can't easily be manipulated. We'll look at a number of simple ways in which we can write—and use—higher-order functions in Python. Functions are objects Consider the following command example: >>> not_even = odd>>> not_even(3)True We've assigned the odd little function object to a new variable, not_even. This creates an alias for a function. While this isn't always the best idea, there are times when we might want to provide an alternate name for a function as part of maintaining reverse compatibility with a previous release of a library. Using functions Consider the following function definition: def some_test(function, value): print(function, value) return function(value) This function's domain includes arguments named function and value. We can see that it prints the arguments, then applies the function argument to the given value. When we use the preceding function, it looks like this: >>> some_test(odd, 3)<function odd at 0x613978> 3True The some_test() function accepted a function as an argument. When we printed the function, we got a summary, <function odd at 0x613978>, that shows us some information about the object. We also show a summary of the argument value, 3. When we applied the function to a value, we got the expected result. We can—of course—extend this concept. In particular, we can apply a single function to many values. Higher-order Functions Higher-order functions become particularly useful when we apply them to collections of objects. The built-in map() function applies a simple function to each value in an argument sequence. Here's an example: >>> list(map(odd, [1,2,3,4]))[True, False, True, False] We've used the map() function to apply the odd() function to each value in the sequence. This is a lot like evaluating: >>> [odd(x) for x in [1,2,3,4]] We've created a list comprehension instead of applying a higher-order map() function. This is equivalent to the following command snippet: [odd(1), odd(2), odd(3), odd(4)] Here, we've manually applied the odd() function to each value in a sequence. Yes, that's a diesel engine alternator and some hoses: We'll use this alternator as a subject for some concrete examples of higher-order functions. Diesel engine background Some basic diesel engine mechanics. The following some basic information: The engine turns the alternator. The alternator generates pulses that drive the tachometer. Amongst other things, like charging the batteries. The alternator provides an indirect measurement of engine RPMs. Direct measurement would involve connecting to a small geared shaft. It's difficult and expensive. We already have a tachometer; it's just incorrect. The new alternator has new wheels. The ratios between engine and alternator have changed. We're not interested in installing a new tachometer. Instead, we'll create a conversion from a number on the tachometer, which is calibrated to the old alternator, to a proper number of engine RPMs. This has to allow the change in ratio between the original tachometer and the new tach. Let's collect some data and see what we can figure out about engine RPMs. New alternator First approximation: all we did was get new wheels. We can presume that the old tachometer was correct. Since the new wheel is smaller, we'll have higher alternator RPMs. That means higher readings on the old tachometer. Here's the key question: How far wrong are the RPMs? The old wheel was approximately 3.5 RPM and the new wheel is approximately 2.5 RPM. We can compute the potential ratio between what the tach says and what the engine is really doing: >>> 3.5/2.51.4>>> 1/_0.7142857142857143 That's nice. Is it right? Can we really just multiply and display RPMs by .7 to get actual engine RPMs? Let's create the conversion card first, then collect some more data. Use case Given RPM on the tachometer, what's the real RPM of the engine? Use the following command to find the RPM: def eng(r): return r/1.4 Use it like the following: >>> eng(2100)1500.0 This seems useful. Tach says 2100, engine (theoretically) spinning at 1500, more or less. Let's confirm our hypothesis with some real data. Data collection Over a period of time, we recorded tachometer readings and actual RPMs using a visual RPM measuring device. The visual device requires a strip of reflective tape on one of the engine wheels. It uses a laser and counts returns per minute. Simple. Elegant. Accurate. It's really inconvenient. But it got some data we could digest. Skipping some boring statistics, we wind up with the following function that maps displayed RPMs to actual RPMs, such as this: def eng2(r): return 0.7724*r**1.0134 Here's a sample result: >>> eng2(2100)1797.1291903589386 When tach says 2100, the engine is measured as spinning at about 1800 RPM. That's not quite the same as the theoretical model. But it's so close that it gives us a lot of confidence in this version. Of course, the number displayed is hideous. All that floating-point cruft is crazy. What can we do? Rounding is only part of the solution. We need to think through the use case. After all, we use this standing at the helm of the boat; how much detail is appropriate? Limits and ranges The engine has governors and only runs between 800 and 2500 RPM. There's a very tight limit here. Realistically, we're talking about this small range of values: >>> list(range(800, 2500, 200))[800, 1000, 1200, 1400, 1600, 1800, 2000, 2200, 2400] There's no sensible reason for proving any more detailed engine RPMs. It's a sailboat; top speed is 7.5 knots (Nautical miles per hour). Wind and current have far more impact on the boat speed than the difference between 1600 and 1700 RPMs. The tach can't be read to closer than 100-200 RPM. It's not digital, it's a red pointer near little tick lines. There's no reason to preserve more than a few bits of precision. Example of Tach translation Given the engine RPMs and the conversion function, we can deduce that the tachometer display will be between 1000 to 3200. This will map to engine RPMs in the range of about 800 to 2500. We can confirm this with a mapping like this: >>> list(map(eng2, range(1000,3200,200)))[847.3098694826986, 1019.258964596305, 1191.5942982618956, 1364.2609728487703, 1537.2178605443924, 1710.4329833319157, 1883.8807562755746, 2057.5402392829747, 2231.3939741669838, 2405.4271806626366, 2579.627182659544] We've applied the eng2() mapping from tach to engine RPM. For tach readings between 1000 and 3200 in steps of 200, we've computed the actual engine RPMs. For those who use spreadsheets a lot, the range() function is like filling a column with values. The map(eng2, …) function is like filling an adjacent column with a calculation. We've created the result of applying a function to each value of a given range. As shown, this is little difficult to use. We need to do a little more cleanup. What other function do we need to apply to the results? Round to 100 Here's a function that will round up to the nearest 100: def next100(n): return int(round(n, -2)) We could call this a kind of composite function built from a partial application of round() and int() functions. If we map this function to the previous results, we get something a little easier to work with. How does this look? >>> tach= range(1000,3200,200)>>> list(map(next100, map(eng2, tach)))[800, 1000, 1200, 1400, 1500, 1700, 1900, 2100, 2200, 2400, 2600] This expression is a bit complex; let's break it down into three discrete steps: First, map the eng2() function to tach numbers between 1000 and 3200. The result is effectively a sequence of values (it's not actually a list, it's a generator, a potential list) Second, map the next100() function to results of previous mapping Finally, collect a single list object from the results We've applied two functions, eng2() and next100(), to a list of values. In principle, we've created a kind of composite function, next100○eng20(rpm). Python doesn't support function composition directly, hence the complex-looking map of map syntax. Interleave sequences of values The final step is to create a table that shows both the tachometer reading and the computed engine RPMs. We need to interleave the input and output values into a single list of pairs. Here are the tach readings we're working with, as a list: >>> tach= range(1000,3200,200) Here are the engine RPMs: >>> engine= list(map(next100,map(eng2,tach))) Here's how we can interleave the two to create something that shows our tachometer reading and engine RPMs: >>> list(zip(tach, engine))[(1000, 800), (1200, 1000), (1400, 1200), (1600, 1400), (1800, 1500), (2000, 1700),(2200, 1900), (2400, 2100), (2600, 2200), (2800, 2400), (3000, 2600)] The rest is pretty-printing. What's important is that we could take functions like eng() or eng2() and apply it to columns of numbers, creating columns of results. The map() function means that we don't have to write explicit for loops to simply apply a function to a sequence of values. Map is lazy We have a few other observations about the Python higher-order functions. First, these functions are lazy, they don't compute any results until required by other statements or expressions. Because they don't actually create intermediate list objects, they may be quite fast. The laziness feature is true for the built-in higher-order functions map() and filter(). It's also true for many of the functions in the itertools library. Many of these functions don't simply create a list object, they yield values as requested. For debugging purposes, we use list() to see what's being produced. If we don't apply list() to the result of a lazy function, we simply see that it's a lazy function. Here's an example: >>> map(lambda x:x*1.4, range(1000,3200,200))<map object at 0x102130610> We don't see a proper result here, because the lazy map() function didn't do anything. The list(), tuple(), or set() functions will force a lazy map() function to actually get up off the couch and compute something. Function Wrappers There are a number of Python functions which are syntactic sugar for method functions. One example is the len() function. This function behaves as if it had the following definition: def len(obj): return obj.__len__() The function acts like it's simply invoking the object's built-in __len__() method. There are several Python functions that exist only to make the syntax a little more readable. Post-fix syntax purists would prefer to see syntax such as some_list.len(). Those who like their code to look a little more mathematical prefer len(some_list). Some people will go so far as to claim that the presence of prefix functions means that Python isn't strictly object-oriented. This is false; Python is very strictly object-oriented. It doesn't—however—use only postfix method notation. We can write function wrappers to make some method functions a little more palatable. Another good example is the divmod() function. This relies on two method functions, such as the following: a.__divmod__(b) b.__rdivmod__(a) The usual operator rules apply here. If the class for object a implements __divmod__(), then that's used to compute the result. If not, then the same test is made for the class of object b; if there's an implementation, that will be used to compute the results. Otherwise, it's undefined and we'll get an exception. Why wrap a method? Function wrappers for methods are syntactic sugar. They exist to make object methods look like simple functions. In some cases, the functional view is more succinct and expressive. Sometimes the object involved is obvious. For example, the os module functions provide access to OS-level libraries. The OS object is concealed inside the module. Sometimes the object is implied. For example, the random module makes a Random instance for us. We can simply call random.randint() without worrying about the object that was required for this to work properly. Lambdas A lambda is an anonymous function with a degenerate body. It's like a function in some respects and it's unlike a function because of the following two things: A lambda has no name A lambda has no statements A lambda's body is a single expression, nothing more. This expression can have parameters, however, which is why a lambda is a handy form of a callable function. The syntax is essentially as follows: lambda params : expression Here's a concrete example: lambda r: 0.7724*r**1.0134 You may recognize this as the eng2() function defined previously. We don't always need a complete, formal function. Sometimes, we just need an expression that has parameters. Speaking theoretically, a lambda is a one-argument function. When we have multi-argument functions, we can transform it to a series of one-argument lambda forms. This transformation can be helpful for optimization. None of that applies to Python. We'll move on. Using a Lambda with map Here are two equivalent results: map(eng2, tach) map(lambda r: 0.7724*r**1.0134, tach) Here's a previous example, using the lambda instead of the function: >>> tach= range(1000,3200,200)>>> list( map(lambda r: 0.7724*r**1.0134, tach))[847.3098694826986, 1019.258964596305, 1191.5942982618956, 1364.2609728487703, 1537.2178605443924, 1710.4329833319157, 1883.8807562755746, 2057.5402392829747, 2231.3939741669838, 2405.4271806626366, 2579.627182659544] You could scroll back to see that the results are the same. If we're doing a small thing once only, a lambda object might be more clear than a complete function definition. Emphasis here is on small once only. If we start trying to reuse a lambda object, or feel the need to assign a lambda object to a variable, we should really consider a function definition and the associated docstring and doctest features. Another use of Lambdas A common use of lambdas is with three other higher-order functions: sort(), min(), and max(). We might use one of these with a list object: list.sort(key= lambda x: expr) list.min(key= lambda x: expr) list.max(key= lambda x: expr) In each case, we're using a lambda object to embed an expression into the argument values for a function. In some cases, the expression might be very sophisticated; in other cases, it might be something as trivial as lambda x: x[1]. When the expression is trivial, a lambda object is a good idea. If the expression is going to get reused, however, a lambda object might be a bad idea. You can do this… But… The following kind of statement makes sense: some_name = lambda x: 3*x+1 We've created a callable object that takes a single argument value and returns a numeric value such as the following command snippet: def some_name(x): return 3*x+1. There are some differences. Most notably the following: A lambda object is all on one line of code. A possible advantage. There's no docstring. A disadvantage for lambdas of any complexity. Nor is there any doctest in the missing docstring. A significant problem for a lambda object that requires testing. There are ways to test lambdas with doctest outside a docstring, but it seems simpler to switch to a full function definition. We can't easily apply decorators to it. To do it, we lose the @decorator syntax. We can't use any Python statements in it. In particular, no try-except block is possible. For these reasons, we suggest limiting the use of lambdas to truly trivial situations. Callable Objects A callable object fits the model of a function. The unifying feature of all of the things we've looked at is that they're callable. Functions are the primary example of being callable but objects can also be callable. Callable objects can be subclasses of collections.abc.Callable. Because of Python's flexibility, this isn't a requirement, it's merely a good idea. To be callable, a class only needs to provide a __call__() method. Here's a complete callable class definition: from collections.abc import Callableclass Engine(Callable): def __call__(self, tach): return 0.7724*tach**1.0134 We've imported the collections.abc.Callable class. This will provide some assurance that any class that extends this abstract superclass will provide a definition for the __call__() method. This is a handy error-checking feature. Our class extends Callable by providing the needed __call__() method. In this case, the __call__() method performs a calculation on the single parameter value, returning a single result. Here's a callable object built from this class: eng= Engine() This creates a function that we can then use. We can evaluate eng(1000) to get the engine RPMs when the tach reads 1000. Callable objects step-by-step There are two parts to making a function a callable object. We'll emphasize these for folks who are new to object-oriented programming: Define a class. Generally, we make this a subclass of collections.abc.Callable. Technically, we only need to implement a __call__() method. It helps to use the proper superclass because it might help catch a few common mistakes. Create an instance of the class. This instance will be a callable object. The object that's created will be very similar to a defined function. And very similar to a lambda object that's been assigned to a variable. While it will be similar to a def statement, it will have one important additional feature: hysteresis. This can be the source of endless bugs. It can also be a way to improve performance. Callables can have hysteresis Here's an example of a callable object that uses hysteresis as a kind of optimization: class Factorial(Callable): def __init__(self): self.previous = {} def __call__(self, n): if n not in self.previous: self.previous[n]= self.compute(n) return self.previous[n] def compute(self, n): if n == 0 : return 1 return n*self.__call__(n-1)Here's how we can use this:>>> fact= Factorial()>>> fact(5)120 We create an instance of the class, and then call the instance to compute a value for us. The initializer The initialization method looks like this: def __init__(self): self.previous = {} This function creates a cache of previously computed values. This is a technique called memoization. If we've already computed a result once, it's in the self.previous cache; we don't need to compute it again, we already know the answer. The Callable interface The required __call__() method looks like this: def __call__(self, n): if n not in self.previous: self.previous[n]= self.compute(n) return self.previous[n] We've checked the memoization cache first. If the value is not there, we're forced to compute the answer, and insert it into the cache. The final answer is always a value in the cache. A common what if question is what if we have a function of multiple arguments? There are two minuscule changes to support more complex arguments. Use def __call__(self, *n): and self.compute(*n). Since we're only computing factorial, there's no need to over-generalize. The Compute method The essential computation has been allocated to a method called compute. It looks like this: def compute(self, n): if n == 0: return 1 return n*self.__call__(n-1) This does the real work of the callable object: it computes n!. In this case, we've used a pretty standard recursive factorial definition. This recursion relies on the __call__() method to check the cache for previous values. If we don't expect to compute values larger than 1000! (a 2,568 digit number, by the way) the recursion works nicely. If we think we need to compute really large factorials, we'll need to use a different approach. Execute the following code to compute very large factorials: functools.reduce(operator.mul, range(1,n+1)) Either way, we can depend on the internal memoization to leverage previous results. Note the potential issue Hysteresis—memory of what came before—is available to the callable objects. We call functions and lambdas stateless, where callable objects can be stateful. This may be desirable to optimize performance. We can memoize the previous results or we can design an object that's simply confusing. Consider a function like divmod() that returns two values. We could try to define a callable object that first returns the quotient and on the second call with the same arguments returns the remainder: >>> crazy_divmod(355,113)3>>> crazy_divmod(255,113)16 This is technically possible. But it's crazy. Warning: Stay away. We generally expect idempotence: functions do the same thing each time. Implementing memoization didn't alter the basic idempotence of our factorial function. Generator Functions Here's a fun generator, the Collatz function. The function creates a sequence using a simple pair of rules. We'll could call this rule, Half-Or-Three-Plus-One (HOTPO). We'll call it collatz(): def collatz(n): if n % 2 == 0: return n//2 else: return 3*n+1 Each integer argument yields a next integer. These can form a chain. For example, if we start with collatz(13), we get 40. The value of collatz(40) is 20. Here's the sequence of values: 13 → 40 → 20 → 10 → 5 → 16 → 8 → 4 → 2 → 1At 1, it loops: 1 → 4 → 2 → 1 … Interestingly, all chains seem to lead—eventually—to 1. To explore this, we need a simple function that will build a chain from a given starting value. Successive values Here's a generator function that will build a list object. This iterates through values in the sequence until it reaches 1, when it terminates: def col_list(n): seq= [n] while n != 1: n= collatz(n) seq.append(n) return seq This is not wrong. But it's not really the most useful implementation. This always creates a sequence object. In many cases, we don't really want an object, we only want information about the sequence. We might only want the length, or the largest numbers, or the sum. This is where a generator function might be more useful. A generator function yields elements from the sequence instead of building the sequence as a single object. Generator functions To create a generator function, we write a function that has a loop; inside the loop, there's a yield statement. A function with a yield statement is effectively an Iterable object, it can be used in a for statement to produce values. It doesn't create a big list object, it creates the items that can be accumulated into a list or tuple object. A generator function is lazy: it doesn't compute anything unless forced to by another function needing results. We can iterate through as many (or as few) of the results as we need. For example, list(some_generator()) forces all values to be returned. For another example of a lazy generator, look at how range() objects work. If we evaluate range(10), we only get a generator. If we evaluate list(range(10)), we get a list object. The Collatz generator Here's a generator function that will produce sequences of values using the collatz() method rule shown previously: def col_iter(n): yield n while n != 1: n= collatz(n) yield n When we use this in a for loop or with the list() function, this will yield the argument number. While the number is not 1, it will apply the collatz() function and yield successive values in the chain. When it has yielded 1, it will will terminate. One common pattern for generator functions is to replace all list-accumulation statements with yield statements. Instead of building a list one time at a time, we yield each item. The collatz() function it lazy. We don't get an answer unless we use list() or tuple() or some variation of a for statement context. Using a generator function Here's how this function looks in practice: >>> for i in col_iter(3):… print(i)3105168421 We've used the generator function in a for loop so that it will yield all of the values until it terminates. Collatz function sequences Now we can do some exploration of this Collatz sequence. Here are a few evaluations of the col_iter() function: >>> list(col_iter(3))[3, 10, 5, 16, 8, 4, 2, 1]>>> list(col_iter(5))[5, 16, 8, 4, 2, 1]>>> list(col_iter(6))[6, 3, 10, 5, 16, 8, 4, 2, 1]>>> list(syracuse_iter(13))[13, 40, 20, 10, 5, 16, 8, 4, 2, 1] There's an interesting pattern here. It seems that from 16, we know the rest. Generalizing this: from any number we've already seen, we know the rest. Wait. This means that memoization might be a big help in exploring the values created by this sequence. When we start combining function design patterns like this, we're doing functional programming. We're stepping outside the box of purely object-oriented Python. Alternate styles Here is an alternative version of the collatz() function: def collatz2(n): return n//2 if n%2 == 0 else 3*n+1 This simply collapses the if statements into a single if expression and may not help much. We also have this: collatz3= lambda n: n//2 if n%2 == 0 else 3*n+1 We've collapsed the expression into a lambda object. Helpful? Perhaps not. On the other hand, the function doesn't really need all of the overhead of a full function definition and multiple statements. The lambda object seems to capture everything relevant. Functions as object There's a higher-level function that will produce values until some ending condition is met. We can plug in one of the versions of the collatz() function and a termination test into this general-purpose function: def recurse_until(ending, the_function, n): yield n while not ending(n): n= the_function(n) yield n This requires two plug-in functions, they are as follows: ending() is a function to test to see whether we're done, for example, lambda n: n==1 the_function() is a form of the Collatz function We've completely uncoupled the general idea of recursively applying a function from a specific function and a specific terminating condition. Using the recurs_until() function We can apply this higher-order recurse_until() function like this: >>> recurse_until(lambda n: n==1, syracuse2, 9)<generator object recurse_until at 0x1021278c0> What's that? That's how a lazy generator looks: it didn't return any values because we didn't demand any values. We need to use it in a loop or some kind of expression that iterates through all available values. The list() function, for example, will collect all of the values. Getting the list of values Here's how we make the lazy generator do some work: >>> list(_)[9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1] The _ variable is the previously computed value. It relieves us from the burden of having to write an assignment statement. We can write an expression, see the results, and know the results were automatically saved in the _ variable. Project Euler #14 Which starting number, under one million, produces the longest chain? Try it without memoization. It's really simple: >>> collatz_len= [len(list(recurse_until(lambda n: n==1, collatz2, i))) ... for i in range(1,11)]>>> results = zip(collatz_len, range(1,11))>>> max(results)(20, 9)>>> list(col_iter(9))[9, 28, 14, 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1] We defined collatz_len as a list. We're writing a list comprehension that shows the values built from a generator expression. The generator expression evaluates len(something) for i in range(1,11). This means we'll be collecting ten values into the list, each of which is the length of something. The something is a list object built from the recurse_until(lambda n: n==1, collatz2, i) function that we discussed. This will compute the sequence of Collatz values starting from i and proceeding until the value in the sequence is 1. We've zipped the lengths with the original values of i. This will create pairs of lengths and starting numbers. The maximum length will now have a starting value associated with it so that we can confirm that the results match our expectations. And yes, this Project Euler problem could—in principle—be solved in a single line of code. Will this scale to 100? 1,000? 1,000,000? How much will memoization help? Summary In this article, we've looked at five (or six) kinds of Python callables. They all fit the y = f(x) model of a function to varying degrees. When is it appropriate to use each of these different ways to express the same essential concept? Functions are created with def and return. It shouldn't come as a surprise that this should cover most cases. This allows a docstring comment and doctest test cases. We could call these def functions, since they're built with the def statement. Higher-order functions—map(), filter(), and the itertools library—are generally written as plain-old def functions. They're higher-order because they accept functions as arguments or return functions as results. Otherwise, they're just functions. Function wrappers—len(), divmod(), pow(), str(), and repr()—are function syntax wrapped around object methods. These are def'd functions with very tiny bodies. We use them because a.pow(2) doesn't seem as clear as pow(2,a). Lambdas are appropriate for one-time use of something so simple that it doesn't deserve being wrapped in a def statement body. In some cases, we have a small nugget of code that seems more clear when written as a lambda expression rather than a more complete function definition. Simple filter rules, and simple computations are often more clearly shown as a lambda object. The Callable objects have a special property that other functions lack: hysteresis. They can retain the results of previous calculations. We've used this hysteresis property to implement memoizing. This can be a huge performance boost. Callable objects can be used badly, however, to create objects that have simply bizarre behavior. Most functions should strive for idempotence—the same arguments should yield the same results. Generator functions are created with a def statement and at least one yield statement. These functions are iterable. They can be used in a for statement to examine each resulting value. They can also be used with functions like list(), tuple(), and set() to create an actual object from the iterable sequence of values. We might combine them with higher-order functions to do complex processing, one item at a time. It's important to work with each of these kinds of callables. If you only have one tool—a hammer—then every problem has to be reshaped into a nail before you can solve it. Once you have multiple tools available, you can pick the tools that provides the most succinct and expressive solution to the problem. Resources for Article: Further resources on this subject: Expert Python Programming [article] Python Network Programming Cookbook [article] Learning Python Design Patterns [article]

0
0
2725

article-image-getting-your-own-video-and-feeds

Packt

06 Feb 2015

18 min read

Getting Your Own Video and Feeds

Packt

06 Feb 2015

18 min read

0
0
4608

How-To Tutorials

Packt

06 Feb 2015

4 min read

Upgrading the interface

Packt

06 Feb 2015

4 min read

In this article by Marco Schwartz and Oliver Manickum authors of the book Programming Arduino with LabVIEW, we will see how to design an interfave using LabVIEW. (For more resources related to this topic, see here.) At this stage, we know that we have our two sensors working and that they were interfaced correctly with the LabVIEW interface. However, we can do better; for now, we simply have a text display of the measurements, which is not elegant to read. Also, the light-level measurement goes from 0 to 5, which doesn't mean anything for somebody who will look at the interface for the first time. Therefore, we will modify the interface slightly. We will add a temperature gauge to display the data coming from the temperature sensor, and we will modify the output of the reading from the photocell to display the measurement from 0 (no light) to 100 percent (maximum brightness). We first need to place the different display elements. To do this, perform the following steps: Start with Front Panel. You can use a temperature gauge for the temperature and a simple slider indicator for Light Level. You will find both in the Indicators submenu of LabVIEW. After that, simply place them on the right-hand side of the interface and delete the other indicators we used earlier. Also, name the new indicators accordingly so that we can know to which element we have to connect them later. Then, it is time to go back to Block Diagram to connect the new elements we just added in Front Panel. For the temperature element, it is easy: you can simply connect the temperature gauge to the TMP36 output pin. For the light level, we will make slightly more complicated changes. We will divide the measured value beside the Analog Read element by 5, thus obtaining an output value between 0 and 1. Then, we will multiply this value by 100, to end up with a value going from 0 to 100 percent of the ambient light level. To do so perform the following steps: The first step is to place two elements corresponding to the two mathematical operations we want to do: a divide operator and a multiply operator. You can find both of them in the Functions panel of LabVIEW. Simply place them close to the Analog Read element in your program. After that, right-click on one of the inputs of each operator element, and go to Create | Constant to create a constant input for each block. Add a value of 5 for the division block, and add a value of 100 for the multiply block. Finally, connect the output of the Analog Read element to the input of the division block, the output of this block to the input of the multiply block, and the output of the multiply block to the input of the Light Level indicator. You can now go back to Front Panel to see the new interface in action. You can run the program again by clicking on the little arrow on the toolbar. You should immediately see that Temperature is now indicated by the gauge on the right and Light Level is immediately changing on the slider, depending on how you cover the sensor with your hand. Summary In this article, we connected a temperature sensor and a light-level sensor to Arduino and built a simple LabVIEW program to read data from these sensors. Then, we built a nice graphical interface to visualize the data coming from these sensors. There are many ways you can build other projects based on what you learned in this article. You can, for example, connect higher temperatures and/or more light-level sensors to the Arduino board and display these measurements in the interface. You can also connect other kinds of sensors that are supported by LabVIEW, for example, other analog sensors. For example, you can add a barometric pressure sensor or a humidity sensor to the project to build an even more complete weather-measurement station. One other interesting extension of this article will be to use the storage and plotting capabilities of LabVIEW to dynamically plot the history of the measured data inside the LabVIEW interface. Resources for Article: Further resources on this subject: The Arduino Mobile Robot [article] Using the Leap Motion Controller with Arduino [article] Avoiding Obstacles Using Sensors [article]

0
0
7274

How-To Tutorials

article-image-visualforce-development-apex

Packt

06 Feb 2015

12 min read

Visualforce Development with Apex

Packt

06 Feb 2015

12 min read

0
0
2474

Packt

06 Feb 2015

32 min read

Remote Access

Packt

06 Feb 2015

32 min read

0
0
1564

Packt

06 Feb 2015

22 min read

Event-driven Programming

Packt

06 Feb 2015

22 min read

0
0
6437

article-image-extending-elasticsearch-scripting

Packt

06 Feb 2015

21 min read

Extending ElasticSearch with Scripting

Packt

06 Feb 2015

21 min read

0
0
8475

article-image-introduction-apache-zookeeper

Packt

05 Feb 2015

26 min read

Introduction to Apache ZooKeeper

Packt

05 Feb 2015

26 min read

In this article by Saurav Haloi, author of the book a Apache Zookeeper Essentials, we will learn about Apache ZooKeeper is a software project of the Apache Software Foundation; it provides an open source solution to the various coordination problems in large distributed systems. ZooKeeper as a centralized coordination service is distributed and highly reliable, running on a cluster of servers called a ZooKeeper Ensemble. Distributed consensus, group management, presence protocols, and leader election are implemented by the service so that the applications do not need to reinvent the wheel by implementing them on its own. On top of these, the primitives exposed by ZooKeeper can be used by applications to build much more powerful abstractions for solving a wide variety of problems. (For more resources related to this topic, see here.) Apache ZooKeeper is implemented in Java. It ships with C, Java, Perl, and Python client bindings. Community contributed client libraries are available for a plethora of languages like Go, Scala, Erlang, and so on. Apache ZooKeeper is widely used by large number of organizations, such as Yahoo Inc., Twitter, Netflix and Facebook, in their distributed application platforms as a coordination service. In this article we will look into installation and configuration of Apache ZooKeeper, some of the concepts associated with it followed by programming using Python client library of ZooKeeper. We will also read how we can implement some of the important constructs of distributed programming using ZooKeeper. Download and installation ZooKeeper is supported by a wide variety of platforms. GNU/Linux and Oracle Solaris are supported as development and production platforms for both server and client. Windows and Mac OS X are recommended only as development platforms for both server and client. ZooKeeper is implemented in Java and requires Java 6 or later versions to run. Let's download the stable version from one of the mirrors, say Georgia Tech's Apache download mirror (http://b.gatech.edu/1xElxRb) in the following example: $ wgethttp://www.gtlib.gatech.edu/pub/apache/zookeeper/stable/zookeeper-3.4.6.tar.gz$ ls -alh zookeeper-3.4.6.tar.gz-rw-rw-r-- 1 saurav saurav 17M Feb 20 2014 zookeeper-3.4.6.tar.gz Once we have downloaded the ZooKeeper tarball, installing and setting up a standalone ZooKeeper node is pretty simple and straightforward. Let's extract the compressed tar archive into /usr/share: $ tar -C /usr/share -zxf zookeeper-3.4.6.tar.gz$ cd /usr/share/zookeeper-3.4.6/$ lsbin CHANGES.txt contrib docs ivy.xml LICENSE.txtREADME_packaging.txt recipes zookeeper-3.4.6.jar zookeeper-3.4.6.jar.md5build.xml conf dist-maven ivysettings.xml libNOTICE.txt README.txt src zookeeper-3.4.6.jar.asczookeeper-3.4.6.jar.sha1 The location where the ZooKeeper archive is extracted in our case, /usr/share/zookeeper-3.4.6, can be exported as ZK_HOME as follows: $ export ZK_HOME=/usr/share/zookeeper-3.4.6 Configuration Once we have extracted the tarball, the next thing is to configure ZooKeeper. The conf folder holds the configuration files for ZooKeeper. ZooKeeper needs a configuration file called zoo.cfg in the conf folder inside the extracted ZooKeeper folder. There is a sample configuration file that contains some of the configuration parameters for reference. Let's create our configuration file with the following minimal parameters and save it in the conf directory: $ cat conf/zoo.cfgtickTime=2000dataDir=/var/lib/zookeeperclientPort=2181 The configuration parameters' meanings are explained here: tickTime: This is measured in milliseconds; it is used for session registration and to do regular heartbeats by clients with the ZooKeeper service. The minimum session timeout will be twice the tickTime parameter. dataDir: This is the location to store the in-memory state of ZooKeeper; it includes database snapshots and the transaction log of updates to the database. Extracting the ZooKeeper archive won't create this directory, so if this directory doesn't exist in the system, you will need to create it and set writable permission to it. clientPort: This is the port that listens for client connections, so it is where the ZooKeeper clients will initiate a connection. The client port can be set to any number, and different servers can be configured to listen on different ports. The default is 2181. ZooKeeper needs the JAVA_HOME environment variable to be set correctly. To see if this is set in your system, run the following command: $ echo $JAVA_HOME Starting the ZooKeeper server Now, considering that Java is installed and working properly, let's go ahead and start the ZooKeeper server. All ZooKeeper administration scripts to start/stop the server and invoke the ZooKeeper command shell are shipped along with the archive in the bin folder with the following code: $ pwd /usr/share/zookeeper-3.4.6/bin $ ls README.txt zkCleanup.sh zkCli.cmd zkCli.sh zkEnv.cmd zkEnv.sh zkServer.cmd zkServer.sh The scripts with the .sh extension are for Unix platforms (GNU/Linux, Mac OS X, and so on), and the scripts with the .cmd extension are for Microsoft Windows operating systems. To start the ZooKeeper server in a GNU/Linux system, you need to execute the zkServer.sh script as follows. This script gives options to start, stop, restart, and see the status of the ZooKeeper server: $ ./zkServer.sh JMX enabled by default Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg Usage: ./zkServer.sh {start|start-foreground|stop|restart|status|upgrade|print-cmd} To avoid going to the ZooKeeper install directory to run these scripts, you can include it in your PATH variable as follows: export PATH=$PATH:/usr/share/zookeeper-3.4.6/bin Executing zkServer.sh with the start argument will start the ZooKeeper server. A successful start of the server will show the following output: $ zkServer.sh start JMX enabled by default Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg Starting zookeeper ... STARTED To verify that the ZooKeeper server has started, you can use the following ps command: $ ps –ef | grep zookeeper | grep –v grep | awk '{print $2}' 5511 The ZooKeeper server's status can be checked with the zkServer.sh script as follows: $ zkServer.sh status JMX enabled by default Using config: /usr/share/zookeeper-3.4.6/bin/../conf/zoo.cfg Mode: standalone Connecting to ZooKeeper with a Java-based shell To start the Java-based ZooKeeper command-line shell, we simply need to run zkCli.sh of the ZK_HOME/bin folder with the server IP and port as follows: ${ZK_HOME}/bin/zkCli.sh –server zk_server:port In our case, we are running our ZooKeeper server on the same machine, so the ZooKeeper server will be localhost, or the loop-back address will be 127.0.0.1. The default port we configured was 2181: $ zkCli.sh -server localhost:2181 As we connect to the running ZooKeeper instance, we will see the output similar to the following one in the terminal (some output is omitted): Connecting to localhost:2181 ............... ............... Welcome to ZooKeeper! JLine support is enabled ............. WATCHER:: WatchedEvent state:SyncConnected type:None path:null [zk: localhost:2181(CONNECTED) 0] To see a listing of the commands supported by the ZooKeeper Java shell, you can run the help command in the shell prompt: [zk: localhost:2181(CONNECTED) 0] help ZooKeeper -server host:port cmd args connect host:port get path [watch] ls path [watch] set path data [version] rmr path delquota [-n|-b] path quit printwatches on|off create [-s] [-e] path data acl stat path [watch] close ls2 path [watch] history listquota path setAcl path acl getAcl path sync path redo cmdno addauth scheme auth delete path [version] setquota -n|-b val path We can execute a few simple commands to get a feel of the command-line interface. Let's start by running the ls command, which, as in Unix, is used for listing: [zk: localhost:2181(CONNECTED) 1] ls / [zookeeper] Now, the ls command returned a string called zookeeper, which is a znode in the ZooKeeper terminology. We can create a znode through the ZooKeeper shell as follows: To begin with, let's create a HelloWorld znode with empty data. [zk: localhost:2181(CONNECTED) 2] create /HelloWorld "" Created /HelloWorld [zk: localhost:2181(CONNECTED) 3] ls / [zookeeper, HelloWorld] We can delete the znode created by issuing the delete command as follows: [zk: localhost:2181(CONNECTED) 4] delete /HelloWorld [zk: localhost:2181(CONNECTED) 5] ls / [zookeeper] The ZooKeeper data model ZooKeeper allows distributed processes to coordinate with each other through a shared hierarchical namespace of data registers. The namespace looks quite similar to a Unix filesystem. The data registers are known as znodes in the ZooKeeper nomenclature. ZooKeeper has two types of znodes: persistent and ephemeral. There is a third type that you might have heard of, called a sequential znode, which is a kind of a qualifier for the other two types. Both persistent and ephemeral znodes can be sequential znodes as well. The persistent znode As the name suggests, persistent znodes have a lifetime in the ZooKeeper’s namespace until they’re explicitly deleted. A znode can be deleted by calling the delete API call. The ephemeral znode An ephemeral znode is deleted by the ZooKeeper service when the creating client’s session ends. An end to a client’s session can happen because of disconnection due to a client crash or explicit termination of the connection. The sequential znode A sequential znode is assigned a sequence number by ZooKeeper as a part of its name during its creation. The value of a monotonously increasing counter (maintained by the parent znode) is appended to the name of the znode. The ZooKeeper Watches ZooKeeper is designed to be a scalable and robust centralized service for very large distributed applications. A common design anti-pattern associated while accessing such services by clients is through polling or a pull kind of a model. A pull model often suffers from scalability problems when implemented in large and complex distributed systems. To solve this problem, ZooKeeper designers implemented a mechanism where clients can get notifications from the ZooKeeper service instead of polling for events. This resembles a push model, where notifications are pushed to the registered clients of the ZooKeeper service. Clients can register with the ZooKeeper service for any changes associated with a znode. This registration is known as setting a watch on a znode in ZooKeeper terminology. Watches allow clients to get notifications when a znode changes in any way. A watch is a one-time operation, which means that it triggers only one notification. To continue receiving notifications over time, the client must reregister the watch upon receiving each event notification. ZooKeeper watches are a one-time trigger. What this means is that if a client receives a watch event and wants to get notified of future changes, it must set another watch. Whenever a watch is triggered, a notification is dispatched to the client that had set the watch. Watches are maintained in the ZooKeeper server to which a client is connected, and this makes it a fast and lean means of event notification. The watches are triggered for the following three changes to a znode: Any changes to the data of a znode, such as when new data is written to the znode’s data field using the setData operation. Any changes to the children of a znode. For instance, children of a znode are deleted with the delete operation. A znode being created or deleted, which could happen in the event that a new znode is added to a path or an existing one is deleted. Again, ZooKeeper asserts the following guarantees with respect to watches and notifications: ZooKeeper ensures that watches are always ordered in the FIFO manner and that notifications are always dispatched in order Watch notifications are delivered to a client before any other change is made to the same znode The order of the watch events are ordered with respect to the updates seen by the ZooKeeper service ZooKeeper operations ZooKeeper’s data model and its API support the following nine basic operations: Operation Description Operation Event-generating Actions exists A znode is created or deleted, or its data is updated getChildren A child of a znode is created or deleted, or the znode itself is deleted getData A znode is deleted or its data is updated Watches and ZooKeeper operations The read operations in znodes, such as exists, getChildren, and getData, allow watches to be set on them. On the other hand, the watches triggered by znode's write operations, such as create, delete, and setData. ACL operations do not participate in watches. The following are the types of watch events that might occur during a znode state change: NodeChildrenChanged: A znode’s child is created or deleted NodeCreated: A znode is created in a ZooKeeper path NodeDataChanged: The data associated with a znode is updated NodeDeleted: A znode is deleted in a ZooKeeper path Programming with Apache ZooKeeper with Python ZooKeeper is easily programmable and has client binding for a plethora of languages. Its shipped with official Java, C, Perl and Python client libraries. Here we will look at programming ZooKeeper with Python: Apache ZooKeeper is shipped with an official client binding for Python, which is developed on top of the C bindings. It can be found in the contrib/zkpython directory of the ZooKeeper distribution. To build and install the Python binding, refer to the instructions in the README file there. In this section, we will study about another popular Python client library for ZooKeeper, called Kazoo (https://kazoo.readthedocs.org/). Kazoo is a pure Python library for ZooKeeper, which means that unlike the official Python bindings, Kazoo is implemented fully in Python and has no dependency on the C bindings of ZooKeeper. Along with providing both synchronous and asynchronous APIs, the Kazoo library also provides APIs for some distributed data structure primitives such as distributed locks, leader election, distributed queues, and so on. Installation of Kazoo is very simple, which can be done either with pip or easy_install installers: Using pip, Kazoo can be installed with the following command: $ pip install kazoo Using easy_install, Kazoo is installed as follows: $ easy_install kazoo To verify whether Kazoo is installed properly, let's try to connect to the ZooKeeper instance and print the list of znodes in the root path of the tree, as shown in the following screenshot: In the preceding example, we imported the KazooClient, which is the main ZooKeeper client class. Then, we created an object of the class (an instance of KazooClient) by connecting to the ZooKeeper instance that is running on the localhost. Once we called the start() method, it initiates a connection to the ZooKeeper server. Once successfully connected, the instance contains the handle to the ZooKeeper session. Now, when we called the get_children() method on the root path of the ZooKeeper namespace, it returned a list of the children. Finally, we closed the connection by calling the stop() method. A watcher implementation Kazoo provides a higher-level child and data watching API's as a recipe through a module called kazoo.recipe.watchers. This module provides the implementation of DataWatch and ChildrenWatch along with another class called PatientChildrenWatch. The PatientChildrenWatch> class returns values after the children of a node don't change for a period of time, unlike the other two, which return each time an event is generated. Let's look at the implementation of a simple children watcher client, which will generate an event each time a znode is added or deleted from the ZooKeeper path: import signal from kazoo.client import KazooClient from kazoo.recipe.watchers import ChildrenWatch zoo_path = '/MyPath' zk = KazooClient(hosts='localhost:2181') zk.start() zk.ensure_path(zoo_path) @zk.ChildrenWatch(zoo_path) def child_watch_func(children): print "List of Children %s" % children while True: signal.pause() In this simple implementation of a children watcher, we connect to the ZooKeeper server that is running in the localhost, using the following code, and create a path /MyPath: zk.ensure_path(zoo_path) @zk.ChildrenWatch(zoo_path) We then set a children watcher on this path and register a callback method child_watch_func, which prints the current list of children on the event generated in /MyPath. When we run this client watcher in a terminal, it starts listening to events: On another terminal, we will create some znodes in/MyPath with the ZooKeeper shell: We observe that the children watcher client receives these znode creation events, and it prints the list of the current children in the terminal window: Similarly, if we delete the znodes that we just created, the watcher will receive the events and subsequently will print the children listing in the console: The messages shown in the following screenshot are printed in the terminal where the children watcher is running: ZooKeeper recipes In this section, you will learn to develop high-level distributed system constructs and data structures using ZooKeeper. As mentioned earlier, most of these constructs and functions are of utmost importance in building scalable distributed architectures, but they are fairly complicated to implement from scratch. Developers can often get bogged down while implementing these and integrating them with their application logic. In this section, you will learn how to develop algorithms to build some of these high-level functions using ZooKeeper primitives and data model and see how ZooKeeper makes it simple, scalable, and error free, with much lesser code. Barrier Barrier is a type of synchronization method used in distributed systems to block the processing of a set of nodes until a condition is satisfied. It defines a point where all nodes must stop their processing and cannot proceed until all the other nodes reach this barrier. The algorithm to implement a barrier using ZooKeeper is as follows: To start with, a znode is designated to be a barrier znode, say /zk_barrier. The barrier is said to be active in the system if this barrier znode exists . Each client calls the ZooKeeper API's exists() function on /zk_barrier by registering for watch events on the barrier znode (the watch event is set to true). If the exists() method returns false, the barrier no longer exists, and the client proceeds with its computation. Else, if the exists() method returns true, the clients just waits for watch events. Whenever the barrier exit condition is met, the client in charge of the barrier will delete /zk_barrier. The deletion triggers a watch event, and on getting this notification, the client calls the exists() function on /zk_barrier again. Step 7 returns true, and the clients can proceed further. The barrier exists until the barrier znode ceases to exist! In this way, we can implement a barrier using ZooKeeper without much of an effort. The example cited so far is for a simple barrier to stop a group of distributed processes from waiting on some condition and then proceed together when the condition is met. There is another type of barrier that aids in synchronizing the beginning and end of a computation; this is known as double barrier. The logic of a double barrier states that a computation is started when the required number of processes join the barrier. The processes leave after completing the computation, and when the number of processes participating in the barrier become zero, the computation is stated to end. The algorithm for a double barrier is implemented by having a barrier znode that serves the purpose of being a parent for individual process znodes participating in the computation. It's algorithm is outlined as follows: Phase 1: Joining the barrier znode can be done as follows: Suppose the barrier znode is represented by znode/barrier. Every client process registers with the barrier znode by creating an ephemeral znode with /barrier as the parent. In real scenarios, clients might register using their hostnames. The client process sets a watch event for the existence of another znode called ready under the /barrier znode and waits for the node to appear. A number N is predefined in the system; this governs the minimum number of clients to join the barrier before the computation can start. While joining the barrier, each client process finds the number of child znodes of /barrier: M = getChildren(/barrier, watch=false) 5. If M is less than N, the client waits for the watch event registered in step 3. Else, if M is equal to N, then the client process creates the ready znode under /barrier. The creation of the ready znode in step 5 triggers the watch event, and each client starts the computation that they were waiting so far to do. Phase 2: Leaving the barrier can be done as follows: Client processing on finishing the computation deletes the znode it created under /barrier (in step 2 of Phase 1: Joining the barrier). The client process then finds the number of children under /barrier: M = getChildren(/barrier, watch=True) If M is not equal to 0, this client waits for notifications (observe that we have set the watch event to True in the preceding call). If M is equal to 0, then the client exits the barrier znode The preceding procedure suffers from a potential herd effect where all client processes wake up to check the number of children left in the barrier when a notification is triggered. To get away with this, we can use a sequential ephemeral znode to be created in step 2 of Phase 1: Joining the barrier. Every client process watches it's next lowest sequential ephemeral znode to go away as an exit criterion. This way, only a single event is generated for any client completing the computation, and hence, not all clients need to wake up together to check on its exit condition. For a large number of client processes participating in a barrier, the herd effect can negatively impact the scalability of the ZooKeeper service, and developers should be aware of such scenarios. A Java language implementation of a double barrier can be found in the ZooKeeper documentation at http://zookeeper.apache.org/doc/r3.4.6/zookeeperTutorial.html. Queue A distributed queue is a very common data structure used in distributed systems. A special implementation of a queue, called a producer-consumer queue, is where a collection of processes called producers generate or create new items and put them in the queue, while consumer processes remove the items from the queue and process them. The addition and removal of items in the queue follow a strict ordering of first in first out (FIFO). A producer-consumer queue can be implemented using ZooKeeper. A znode will be designated to hold a queue instance, say queue-znode. All queue items are stored as znodes under this znode. Producers add an item to the queue by creating a znode under the queue-znode, and consumers retrieve the items by getting and then deleting a child from the queue-znode. The FIFO order of the items is maintained using sequential property of znode provided by ZooKeeper. When a producer process creates a znode for a queue item, it sets the sequential flag. This lets ZooKeeper append the znode name with a monotonically increasing sequence number as the suffix. ZooKeeper guarantees that the sequence numbers are applied in order and are not reused. The consumer process processes the items in the correct order by looking at the sequence number of the znode. The pseudocode for the algorithm to implement a producer-consumer queue using ZooKeeper is shown here: Let /_QUEUE_ represent the top-level znode for our queue implementation, which is also called the queue-node. Clients acting as producer processes put something into the queue by calling the create() method with the znode name as "queue-" and set the sequence and ephemeral flags if the create() method call is set true: create( “queue-“, SEQUENCE_EPHEMERAL) The sequence flag lets the new znode get a name like queue-N, where N is a monotonically increasing number Clients acting as consumer processes process a getChildren() method call on the queue-node with a watch event set to true: M = getChildren(/_QUEUE_, true) It sorts the children list M, takes out the lowest numbered child znode from the list, starts processing on it by taking out the data from the znode, and then deletes it. The client picks up items from the list and continues processing on them. On reaching the end of the list, the client should check again whether any new items are added to the queue by issuing another get_children() method call. > The algorithm continues when get_children() returns an empty list; this means that no more znodes or items are left under /_QUEUE_. It's quite possible that in step 3, the deletion of a znode by a client will fail because some other client has gained access to the znode while this client was retrieving the item. In such scenarios, the client should retry the delete call. Using this algorithm for implementation of a generic queue, we can also build a priority queue out of it, where each item can have a priority tagged to it. The algorithm and implementation is left as an exercise to the readers. C and Java implementations of the distributed queue recipe are shipped along with the ZooKeeper distribution under the recipes folder. Developers can use this recipe to implement distributed lock in their applications. Kazoo, the Python client library for ZooKeeper, has distributed queue implementations inside the kazoo.recipe.queue module. This queue implementation has priority assignment to the queue items support as well as queue locking support that are built into it. Lock A lock in a distributed system is an important primitive that provides the applications with a means to synchronize their access to shared resources. Distributed locks need to be globally synchronous to ensure that no two clients can hold the same lock at any instance of time. Typical scenarios where locks are inevitable are when the system as a whole needs to ensure that only one node of the cluster is allowed to carry out an operation at a given time, such as: Write to a shared database or file Act as a decision subsystem Process all I/O requests from other nodes ZooKeeper can be used to implement mutually exclusive locks for processes that run on different servers across different networks and even geographically apart. To build a distributed lock with ZooKeeper, a persistent znode is designated to be the main lock-znode. Client processes that want to acquire the lock will create an ephemeral znode with a sequential flag set under the lock-znode. The crux of the algorithm is that the lock is owned by the client process whose child znode has the lowest sequence number. ZooKeeper guarantees the order of the sequence number, as sequence znodes are numbered in a monotonically increasing order. Suppose there are three znodes under the lock-znode: l1, l2, and l3. The client process that created l1 will be the owner of the lock. If the client wants to release the lock, it simply deletes l1, and then, the owner of l2 will be the lock owner and so on. The pseudocode for the algorithm to implement a distributed lock service with ZooKeeper is shown here: Let the parent lock node be represented by a persistent znode, /_locknode_, in the Zookeeper tree. Phase 1: Acquire a lock with the following steps: Call the create("/_locknode_/lock-",CreateMode=EPHEMERAL_SEQUENTIAL) method. Call the getChildren("/_locknode_/lock-", false) method on the lock node. Here, the watch flag is set to false, as otherwise, it can lead to a herd effect. If the znode created by the client in step 1 has the lowest sequence number suffix, then the client is owner of the lock, and it exits the algorithm. Call the exists("/_locknode_/, True) method. If the exists() method returns false, go to step 2. If the exists() method returns true, wait for notifications for the watch event set in step 4. Phase 2: Release a lock as follows: The client holding the lock deletes the node, thereby triggering the next client in line to acquire the lock. The client that created the next higher sequence node will be notified and hold the lock. The watch for this event was set in step 4 of Phase 1,:Acquire a lock. While it's not recommended that you use a distributed system with a large number of clients due to the herd effect, if the other clients also need to know about the change of lock ownership, they could set a watch on the /_locknode_ lock node for events of the NodeChildrenChanged type and can determine the current owner. If there was a partial failure in the creation of znode due to connection loss, it's possible that the client won't be able to correctly determine whether it successfully created the child znode. To resolve such a situation, the client can store its session ID in the znode data field or even as a part of the znode name itself. As a client retains the same session ID after a reconnect, it can easily determine whether the child znode was created by it by looking at the session ID. The idea of creating an ephemeral znode prevents a potential dead-lock situation that might arise when a client dies while holding a lock. However, as the property of the ephemeral znode dictates that it gets deleted when the session times out or expires, ZooKeeper will delete the znode created by the dead client, and the algorithm runs as usual. However, if the client hangs for some reason but the ZooKeeper session is still active, then we might get into a deadlock. This can be solved by having a monitor client that triggers an alarm when the lock holding time for a client crosses a predefined time out. The ZooKeeper distribution is shipped with the C and Java language implementation of a distributed lock in the recipes folder. The recipe implements the algorithm you have learned so far and also takes into account the problems associated with partial failure and herd effect. The previous recipe of a mutually exclusive lock can be modified to implement a shared lock as well. Readers can find the algorithm and pseudocode for a shared lock using Zookeeper in the documentation at http://zookeeper.apache.org/doc/r3.4.6/recipes.html#Shared+Locks. More ZooKeeper recipes are available at: http://zookeeper.apache.org/doc/trunk/recipes.html Summary In this article, we read about the fundamentals of Apache ZooKeeper, programming it and how to implement common distributed data structures with ZooKeeper. For more details on Apache ZooKeeper, please visit its project page. Resources for Article: Further resources on this subject: Creating an Apache JMeter™ test workbench [article] Apache Maven and m2eclipse [article] Coverage with Apache Karaf Pax Exam tests [article]

0
0
5305

article-image-transformations-using-mapreduce

Packt

05 Feb 2015

19 min read

Transformations Using Map/Reduce

Packt

05 Feb 2015

19 min read

In this article written by Adam Boduch, author of the book Lo-Dash Essentials, we'll be looking at all the interesting things we can do with Lo-Dash and the map/reduce programming model. We'll start off with the basics, getting our feet wet with some basic mappings and basic reductions. As we progress through the article, we'll start introducing more advanced techniques to think in terms of map/reduce with Lo-Dash. The goal, once you've reached the end of this article, is to have a solid understanding of the Lo-Dash functions available that aid in mapping and reducing collections. Additionally, you'll start to notice how disparate Lo-Dash functions work together in the map/reduce domain. Ready? (For more resources related to this topic, see here.) Plucking values Consider that as your informal introduction to mapping because that's essentially what it's doing. It's taking an input collection and mapping it to a new collection, plucking only the properties we're interested in. This is shown in the following example: var collection = [ { name: 'Virginia', age: 45 }, { name: 'Debra', age: 34 }, { name: 'Jerry', age: 55 }, { name: 'Earl', age: 29 } ]; _.pluck(collection, 'age'); // → [ 45, 34, 55, 29 ] This is about as simple a mapping operation as you'll find. In fact, you can do the same thing with map(): var collection = [ { name: 'Michele', age: 58 }, { name: 'Lynda', age: 23 }, { name: 'William', age: 35 }, { name: 'Thomas', age: 41 } ]; _.map(collection, 'name'); // → // [ // "Michele", // "Lynda", // "William", // "Thomas" // ] As you'd expect, the output here is exactly the same as it would be with pluck(). In fact, pluck() is actually using the map() function under the hood. The callback passed to map() is constructed using property(), which just returns the specified property value. The map() function falls back to this plucking behavior when a string instead of a function is passed to it. With that brief introduction to the nature of mapping, let's dig a little deeper and see what's possible in mapping collections. Mapping collections In this section, we'll explore mapping collections. Mapping one collection to another ranges from composing really simple—as we saw in the preceding section—to sophisticated callbacks. These callbacks that map each item in the collection can include or exclude properties and can calculate new values. Besides, we can apply functions to these items. We'll also address the issue of filtering collections and how this can be done in conjunction with mapping. Including and excluding properties When applied to an object, the pick() function generates a new object containing only the specified properties. The opposite of this function, omit(), generates an object with every property except those specified. Since these functions work fine for individual object instances, why not use them in a collection? You can use both of these functions to shed properties from collections by mapping them to new ones, as shown in the following code: var collection = [ { first: 'Ryan', last: 'Coleman', age: 23 }, { first: 'Ann', last: 'Sutton', age: 31 }, { first: 'Van', last: 'Holloway', age: 44 }, { first: 'Francis', last: 'Higgins', age: 38 } ]; _.map(collection, function(item) { return _.pick(item, [ 'first', 'last' ]); }); // → // [ // { first: "Ryan", last: "Coleman" }, // { first: "Ann", last: "Sutton" }, // { first: "Van", last: "Holloway" }, // { first: "Francis", last: "Higgins" } // ] Here, we're creating a new collection using the map() function. The callback function supplied to map() is applied to each item in the collection. The item argument is the original item from the collection. The callback is expected to return the mapped version of that item and this version could be anything, including the original item itself. Be careful when manipulating the original item in map() callbacks. If the item is an object and it's referenced elsewhere in your application, it could have unintended consequences. We're returning a new object as the mapped item in the preceding code. This is done using the pick() function. We only care about the first and the last properties. Our newly mapped collection looks identical to the original, except that no item has an age property. This newly mapped collection is seen in the following code: var collection = [ { first: 'Clinton', last: 'Park', age: 19 }, { first: 'Dana', last: 'Hines', age: 36 }, { first: 'Pete', last: 'Ross', age: 31 }, { first: 'Annie', last: 'Cross', age: 48 } ]; _.map(collection, function(item) { return _.omit(item, 'first'); }); // → // [ // { last: "Park", age: 19 }, // { last: "Hines", age: 36 }, // { last: "Ross", age: 31 }, // { last: "Cross", age: 48 } // ] The preceding code follows the same approach as the pick() code. The only difference is that we're excluding the first property from the newly created collection. You'll also notice that we're passing a string containing a single property name instead of an array of property names. In addition to passing strings or arrays as the argument to pick() or omit(), we can pass in a function callback. This is suitable when it's not very clear which objects in a collection should have which properties. Using a callback like this inside a map() callback lets us perform detailed comparisons and transformations on collections while using very little code: function invalidAge(value, key) { return key === 'age' && value < 40; } var collection = [ { first: 'Kim', last: 'Lawson', age: 40 }, { first: 'Marcia', last: 'Butler', age: 31 }, { first: 'Shawna', last: 'Hamilton', age: 39 }, { first: 'Leon', last: 'Johnston', age: 67 } ]; _.map(collection, function(item) { return _.omit(item, invalidAge); }); // → // [ // { first: "Kim", last: "Lawson", age: 40 }, // { first: "Marcia", last: "Butler" }, // { first: "Shawna", last: "Hamilton" }, // { first: "Leon", last: "Johnston", age: 67 } // ] The new collection generated by this code excludes the age property for items where the age value is less than 40. The callback supplied to omit() is applied to each key-value pair in the object. This code is a good illustration of the conciseness achievable with Lo-Dash. There's a lot of iterative code running here and there is no for or while statement in sight. Performing calculations It's time now to turn our attention to performing calculations in our map() callbacks. This entails looking at the item and, based on its current state, computing a new value that will be ultimately mapped to the new collection. This could mean extending the original item's properties or replacing one with a newly computed value. Whichever the case, it's a lot easier to map these computations than to write your own logic that applies these functions to every item in your collection. This is explained using the following example: var collection = [ { name: 'Valerie', jqueryYears: 4, cssYears: 3 }, { name: 'Alonzo', jqueryYears: 1, cssYears: 5 }, { name: 'Claire', jqueryYears: 3, cssYears: 1 }, { name: 'Duane', jqueryYears: 2, cssYears: 0 } ]; _.map(collection, function(item) { return _.extend({ experience: item.jqueryYears + item.cssYears, specialty: item.jqueryYears >= item.cssYears ? 'jQuery' : 'CSS' }, item); }); // → // [ // { // experience": 7, // specialty": "jQuery", // name": "Valerie", // jqueryYears": 4, // cssYears: 3 // }, // { // experience: 6, // specialty: "CSS", // name: "Alonzo", // jqueryYears: 1, // cssYears: 5 // }, // { // experience: 4, // specialty: "jQuery", // name: "Claire", // jqueryYears: 3, // cssYears: 1 // }, // { // experience: 2, // specialty: "jQuery", // name: "Duane", // jqueryYears: 2, // cssYears: 0 // } // ] Here, we're mapping each item in the original collection to an extended version of it. Particularly, we're computing two new values for each item—experience and speciality. The experience property is simply the sum of the jqueryYears and cssYears properties. The speciality property is computed based on the larger value of the jqueryYears and cssYears properties. Earlier, I mentioned the need to be careful when modifying items in map() callbacks. In general, it's a bad idea. It's helpful to try and remember that map() is used to generate new collections, not to modify existing collections. Here's an illustration of the horrific consequences of not being careful: var app = {}, collection = [ { name: 'Cameron', supervisor: false }, { name: 'Lindsey', supervisor: true }, { name: 'Kenneth', supervisor: false }, { name: 'Caroline', supervisor: true } ]; app.supervisor = _.find(collection, { supervisor: true }); _.map(collection, function(item) { return _.extend(item, { supervisor: false }); }); console.log(app.supervisor); // → { name: "Lindsey", supervisor: false } The destructive nature of this callback is not obvious at all and next to impossible for programmers to track down and diagnose. Its nature is essentially resetting the supervisor attribute for each item. If these items are used anywhere else in the application, the supervisor property value will be clobbered whenever this map job is executed. If you need to reset values like this, ensure that the change is mapped to the new value and not made to the original. Mapping also works with primitive values as the item. Often, we'll have an array of primitive values that we'd like transformed into an alternative representation. For example, let's say you have an array of sizes, expressed in bytes. You can map those arrays to a new collection with those sizes expressed as human-readable values, using the following code: function bytes(b) { var units = [ 'B', 'K', 'M', 'G', 'T', 'P' ], target = 0; while (b >= 1024) { b = b / 1024; target++; } return (b % 1 === 0 ? b : b.toFixed(1)) + units[target] + (target === 0 ? '' : 'B'); } var collection = [ 1024, 1048576, 345198, 120120120 ]; _.map(collection, bytes); // → [ "1KB", "1MB", "337.1KB", "114.6MB" ] The bytes() function takes a numerical argument, which is the number of bytes to be formatted. This is the starting unit. We just keep incrementing the target unit until we have something that is less than 1024. For example, the last item in our collection maps to '114.6MB'. The bytes() function can be passed directly to map() since it's expecting values in our collection as they are. Calling functions We don't always have to write our own callback functions for map(). Wherever it makes sense, we're free to leverage Lo-Dash functions to map our collection items. For example, let's say we have a collection and we'd like to know the size of each item. There's a size() Lo-Dash function we can use as our map() callback, as follows: var collection = [ [ 1, 2 ], [ 1, 2, 3 ], { first: 1, second: 2 }, { first: 1, second: 2, third: 3 } ]; _.map(collection, _.size); // → [ 2, 3, 2, 3 ] This code has the added benefit that the size() function returns consistent results, no matter what kind of argument is passed to it. In fact, any function that takes a single argument and returns a new value based on that argument is a valid candidate for a map() callback. For instance, we could also map the minimum and maximum value of each item: var source = _.range(1000), collection = [ _.sample(source, 50), _.sample(source, 100), _.sample(source, 150) ]; _.map(collection, _.min); // → [ 20, 21, 1 ] _.map(collection, _.max); // → [ 931, 985, 991 ] What if we want to map each item of our collection to a sorted version? Since we do not sort the collection itself, we don't care about the item positions within the collection, but the items themselves, if they're arrays, for instance. Let's see what happens with the following code: var collection = [ [ 'Evan', 'Veronica', 'Dana' ], [ 'Lila', 'Ronald', 'Dwayne' ], [ 'Ivan', 'Alfred', 'Doug' ], [ 'Penny', 'Lynne', 'Andy' ] ]; _.map(collection, _.compose(_.first, function(item) { return _.sortBy(item); })); // → [ "Dana", "Dwayne", "Alfred", "Andy" ] This code uses the compose() function to construct a map() callback. The first function returns the sorted version of the item by passing it to sortBy(). The first() item of this sorted list is then returned as the mapped item. The end result is a new collection containing the alphabetically first item from each array in our collection, with three lines of code. This is not bad. Filtering and mapping Filtering and mapping are two closely related collection operations. Filtering extracts only those collection items that are of particular interest in a given context. Mapping transforms collections to produce new collections. But what if you only want to map a certain subset of your collection? Then it would make sense to chain together the filtering and mapping operations, right? Here's an example of what that might look like: var collection = [ { name: 'Karl', enabled: true }, { name: 'Sophie', enabled: true }, { name: 'Jerald', enabled: false }, { name: 'Angie', enabled: false } ]; _.compose( _.partialRight(_.map, 'name'), _.partialRight(_.filter, 'enabled') )(collection); // → [ "Karl", "Sophie" ] This map is executed using compose() to build a function that is called right away, with our collection as the argument. The function is composed of two partials. We're using partialRight() on both arguments because we want the collection supplied as the leftmost argument in both cases. The first partial function is filter(). We're partially applying the enabled argument. So this function will filter our collection before it's passed to map(). This brings us to our next partial in the function composition. The result of filtering the collection is passed to map(), which has the name argument partially applied. The end result is a collection with enabled name strings. The important thing to note about the preceding code is that the filtering operation takes place before the map() function is run. We could have stored the filtered collection in an intermediate variable instead of streamlining with compose(). Regardless of flavor, it's important that the items in your mapped collection correspond to the items in the source collection. It's conceivable to filter out the items in the map() callback by not returning anything, but this is ill-advised as it doesn't map well, both figuratively and literally. Mapping objects The previous section focused on collections and how to map them. But wait, objects are collections too, right? That is indeed correct, but it's worth differentiating between the more traditional collections, arrays, and plain objects. The main reason is that there are implications with ordering and keys when performing map/reduce. At the end of the day, arrays and objects serve different use cases with map/reduce, and this article tries to acknowledge these differences. Now we'll start looking at some techniques Lo-Dash programmers employ when working with objects and mapping them to collections. There are a number of factors to consider such as the keys within an object and calling methods on objects. We'll take a look at the relationship between key-value pairs and how they can be used in a mapping context. Working with keys We can use the keys of a given object in interesting ways to map the object to a new collection. For example, we can use the keys() function to extract the keys of an object and map them to values other than the property value, as shown in the following example: var object = { first: 'Ronald', last: 'Walters', employer: 'Packt' }; _.map(_.sortBy(_.keys(object)), function(item) { return object[item]; }); // → [ "Packt", "Ronald", "Walters" ] The preceding code builds an array of property values from object. It does so using map(), which is actually mapping the keys() array of object. These keys are sorted using sortBy(). So Packt is the first element of the resulting array because employer is alphabetically first in the object keys. Sometimes, it's desirable to perform lookups in other objects and map those values to a target object. For example, not all APIs return everything you need for a given page, packaged in a neat little object. You have to do joins and build the data you need. This is shown in the following code: var users = {}, preferences = {}; _.each(_.range(100), function() { var id = _.uniqueId('user-'); users[id] = { type: 'user' }; preferences[id] = { emailme: !!(_.random()) }; }); _.map(users, function(value, key) { return _.extend({ id: key }, preferences[key]); }); // → // [ // { id: "user-1", emailme: true }, // { id: "user-2", emailme: false }, // ... // ] This example builds two objects, users and preferences. In the case of each object, the keys are user identifiers that we're generating with uniqueId(). The user objects just have some dummy attribute in them, while the preferences objects have an emailme attribute, set to a random Boolean value. Now let's say we need quick access to this preference for all users in the users object. As you can see, it's straightforward to implement using map() on the users object. The callback function returns a new object with the user ID. We extend this object with the preference for that particular user by looking at them by key. Calling methods Objects aren't limited to storing primitive strings and numbers. Properties can store functions as their values, or methods, as they're commonly referred. However, depending on the context where you're using your object, methods aren't always callable, especially if you have little or no control over the context where your objects are used. One technique that's helpful in situations such as these is mapping the result of calling these methods and using this result in the context in question. Let's see how this can be done with the following code: var object = { first: 'Roxanne', last: 'Elliot', name: function() { return this.first + ' ' + this.last; }, age: 38, retirement: 65, working: function() { return this.retirement - this.age; } }; _.map(object, function(value, key) { var item = {}; item[key] = _.isFunction(value) ? object[key]() : value return item; }); // → // [ // { first: "Roxanne" }, // { last: "Elliot" }, // { name: "Roxanne Elliot" }, // { age: 38 }, // { retirement: 65 }, // { working: 27 } // ] _.map(object, function(value, key) { var item = {}; item[key] = _.result(object, key); return item; }); // → // [ // { first: "Roxanne" }, // { last: "Elliot" }, // { name: "Roxanne Elliot" }, // { age: 38 }, // { retirement: 65 }, // { working: 27 } // ] Here, we have an object with both primitive property values and methods that use these properties. Now we'd like to map the results of calling those methods and we will experiment with two different approaches. The first approach uses the isFunction() function to determine whether the property value is callable or not. If it is, we call it and return that value. The second approach is a little easier to implement and achieves the same outcome. The result() function is applied to the object using the current key. This tests whether we're working with a function or not, so our code doesn't have to. In the first approach to mapping method invocations, you might have noticed that we're calling the method using object[key]() instead of value(). The former retains the context as the object variable, but the latter loses the context, since it is invoked as a plain function without any object. So when you're writing mapping callbacks that call methods and not getting the expected results, make sure the method's context is intact. Perhaps, you have an object but you're not sure which properties are methods. You can use functions() to figure this out and then map the results of calling each method to an array, as shown in the following code: var object = { firstName: 'Fredrick', lastName: 'Townsend', first: function() { return this.firstName; }, last: function() { return this.lastName; } }; var methods = _.map(_.functions(object), function(item) { return [ _.bindKey(object, item) ]; }); _.invoke(methods, 0); // → [ "Fredrick", "Townsend" ] The object variable has two methods, first() and last(). Assuming we didn't know about these methods, we can find them using functions(). Here, we're building a methods array using map(). The input is an array containing the names of all the methods of the given object. The value we're returning is interesting. It's a single-value array; you'll see why in a moment. The value of this array is a function built by passing the object and the name of the method to bindKey(). This function, when invoked, will always use object as its context. Lastly, we use invoke() to invoke each method in our methods array, building a new result array. Recall that our map() callback returned an array. This was a simple hack to make invoke() work, since it's a convenient way to call methods. It generally expects a key as the second argument, but a numerical index works just as well, since they're both looked up as same. Mapping key-value pairs Just because you're working with an object doesn't mean it's ideal, or even necessary. That's what map() is for—mapping what you're given to what you need. For instance, the property values are sometimes all that matter for what you're doing, and you can dispense with the keys entirely. For that, we have the values() function and we feed the values to map(): var object = { first: 'Lindsay', last: 'Castillo', age: 51 }; _.map(_.filter(_.values(object), _.isString), function(item) { return '<strong>' + item + '</strong>'; }); // → [ "<strong>Lindsay</strong>", "<strong>Castillo</strong>" ] All we want from the object variable here is a list of property values, which are strings, so that we can format them. In other words, the fact that the keys are first, last, and age is irrelevant. So first, we call values() to build an array of values. Next, we pass that array to filter(), removing anything that's not a string. We then pass the output of this to map, where we're able to map the string using <strong/> tags. The opposite might also be true—the value is completely meaningless without its key. If that's the case, it may be fitting to map key-value pairs to a new collection, as shown in the following example: function capitalize(s) { return s.charAt(0).toUpperCase() + s.slice(1); } function format(label, value) { return '<label>' + capitalize(label) + ':</label>' + '<strong>' + value + '</strong>'; } var object = { first: 'Julian', last: 'Ramos', age: 43 }; _.map(_.pairs(object), function(pair) { return format.apply(undefined, pair); }); // → // [ // "<label>First:</label><strong>Julian</strong>", // "<label>Last:</label><strong>Ramos</strong>", // "<label>Age:</label><strong>43</strong>" // ] We're passing the result of running our object through the pairs() function to map(). The argument passed to our map callback function is an array, the first element being the key and the second being the value. It so happens that the format() function expects a key and a value to format the given string, so we're able to use format.apply() to call the function, passing it the pair array. This approach is just a matter of taste. There's no need to call pairs() before map(). We could just as easily have called format directly. But sometimes, this approach is preferred, and the reasons, not least of which is the style of the programmer, are wide and varied. Summary This article introduced you to the map/reduce programming model and how Lo-Dash tools help realize it in your application. First, we examined mapping collections, including how to choose which properties get included and how to perform calculations. We then moved on to mapping objects. Keys can have an important role in how objects get mapped to new objects and collections. There are also methods and functions to consider when mapping. Resources for Article: Further resources on this subject: The First Step [article] Recursive directives [article] AngularJS Project [article]

0
0
6209

How-To Tutorials

article-image-building-next-generation-web-meteor

Packt

05 Feb 2015

9 min read

Building the next generation Web with Meteor

Packt

05 Feb 2015

9 min read

This article by Fabian Vogelsteller, the author of Building Single-page Web Apps with Meteor, explores the full-stack framework of Meteor. Meteor is not just a JavaScript library such as jQuery or AngularJS. It's a full-stack solution that contains frontend libraries, a Node.js-based server, and a command-line tool. All this together lets us write large-scale web applications in JavaScript, on both the server and client, using a consistent API. (For more resources related to this topic, see here.) Even with Meteor being quite young, already a few companies such as https://lookback.io, https://respond.ly and https://madeye.io use Meteor already in their production environment. If you want to see for yourself what's made with Meteor, take a look at http://madewith.meteor.com. Meteor makes it easy for us to build web applications quickly and takes care of the boring processes such as file linking, minifying, and concatenating of files. Here are a few highlights of what is possible with Meteor: We can build complex web applications amazingly fast using templates that automatically update themselves when data changes We can push new code to all clients on the fly while they are using our app Meteor core packages come with a complete account solution, allowing a seamless integration with Facebook, Twitter, and more Data will automatically be synced across clients, keeping every client in the same state in almost real time Latency compensation will make our interface appear super fast while the server response happens in the background With Meteor, we never have to link files with the <script> tags in HTML. Meteor's command-line tool automatically collects JavaScript or CSS files in our application's folder and links them in the index.html file, which is served to clients on initial page load. This makes structuring our code in separate files as easy as creating them. Meteor's command-line tool also watches all files inside our application's folder for changes and rebuilds them on the fly when they change. Additionally, it starts a Meteor server that serves the app's files to the clients. When a file changes, Meteor reloads the site of every client while preserving its state. This is called a hot code reload. In production, the build process also concatenates and minifies our CSS and JavaScript files. By simply adding the less and coffee core packages, we can even write all styles in LESS and code in CoffeeScript with no extra effort. The command-line tool is also the tool for deploying and bundling our app so that we can run it on a remote server. Sounds awesome? Let's take a look at what's needed to use Meteor Adding basic packages Packages in Meteor are libraries that can be added to our projects. The nice thing about Meteor packages is that they are self-contained units, which run out of the box. They mostly add either some templating functionality or provide extra objects in the global namespace of our project. Packages can also add features to Meteor's build process like the stylus package, which lets us write our app's style files with the stylus pre-processor syntax. Writing templates in Meteor Normally when we build websites, we build the complete HTML on the server side. This was quite straightforward; every page is built on the server, then it is sent to the client, and at last JavaScript added some additional animation or dynamic behavior to it. This is not so in single-page apps, where each page needs to be already in the client's browser so that it can be shown at will. Meteor solves that problem by providing templates that exists in JavaScript and can be placed in the DOM at some point. These templates can have nested templates, allowing for and easy way to reuse and structure an app's HTML layout. Since Meteor is so flexible in terms of folder and file structure, any *.html page can contain a template and will be parsed during Meteor's build process. This allows us to put all templates in the my-meteor-blog/client/templates folder. This folder structure is chosen as it helps us organizing templates while our app grows. Meteor template engine is called Spacebars, which is a derivative of the handlebars template engine. Spacebars is built on top of Blaze, which is Meteor's reactive DOM update engine. Meteor and databases Meteor currently uses MongoDB by default to store data on the server, although there are drivers planned for relational databases, too. If you are adventurous, you can try one of the community-built SQL drivers, such as the numtel:mysql package from https://atmospherejs.com/numtel/mysql. MongoDB is a NoSQL database. This means it is based on a flat document structure instead of a relational table structure. Its document approach makes it ideal for JavaScript as documents are written in BJSON, which is very similar to the JSON format. Meteor has a database everywhere approach, which means we have the same API to query the database on the client as well as on the server. Yet, when we query the database on the client, we are only able to access data that we published to a client. MongoDB uses a datastructure called a collection, which is the equivalent of a table in an SQL database. Collections contain documents, where each document has its own unique ID. These documents are JSON-like structures and can contain properties with values, even with multiple dimensions: { "_id": "W7sBzpBbov48rR7jW", "myName": "My Document Name", "someProperty": 123456, "aNestedProperty": { "anotherOne": "With another string" } } These collections are used to store data in the servers MongoDB as well as the client-sides minimongo collections, which is an in-memory database mimicking the behavior of the real MongoDB. The MongoDB API let us use a simple JSON-based query language to get documents from a collection. We can pass additional options to only ask for specific fields or sort the returned documents. These are very powerful features, especially on the client side, to display data in various ways. Data everywhere In Meteor, we can use the browser console to update data, which means we update the database from the client. This works because Meteor automatically syncs these changes to the server and updates the database accordingly. This is happening because we have the autopublish and insecure core packages added to our project by default. The autopublish package publishes automatically all documents to every client, whereas the insecure package allows every client to update database records by its _id field. Obviously, this works well for prototyping but is infeasible for production, as every client could manipulate our database. If we remove the insecure package, we would need to add the "allow and deny" rules to determine what a client is allowed to update and what not; otherwise all updates will get denied. Differences between client and server collections Meteor has a database everywhere approach. This means it provides the same API on the client as on the server. The data flow is controlled using a publication subscription model. On the server sits the real MongoDB database, which stores data persistently. On the client Meteor has a package called minimongo, which is a pure in-memory database mimicking most of MongoDB's query and update functions. Every time a client connects to its Meteor server, Meteor downloads the documents the client subscribed to and stores them in its local minimongo database. From here, they can be displayed in a template or processed by functions. When the client updates a document, Meteor syncs it back to the server, where it is passed through any allow/deny functions before being persistently stored in the database. This works also in the other way, when a document in the server-side database changes, it will get automatically sync to every client that is subscribed to it, keeping every connected client up to date. Syncing data – the current Web versus the new Web In the current Web, most pages are either static files hosted on a server or dynamically generated by a server on a request. This is true for most server-side-rendered websites, for example, those written with PHP, Rails, or Django. Both of these techniques required no effort besides being displayed by the clients; therefore, they are called thin clients. In modern web applications, the idea of the browser has moved from thin clients to fat clients. This means most of the website's logic resides on the client and the client asks for the data it needs. Currently, this is mostly done via calls to an API server. This API server then returns data, commonly in JSON form, giving the client an easy way to handle it and use it appropriately. Most modern websites are a mixture of thin and fat clients. Normal pages are server-side-rendered, where only some functionality, such as a chat box or news feed, is updated using API calls. Meteor, however, is built on the idea that it's better to use the calculation power of all clients instead of one single server. A pure fat client or a single-page app contains the entire logic of a website's frontend, which is send down on the initial page load. The server then merely acts as a data source, sending only the data to the clients. This can happen by connecting to an API and utilizing AJAX calls, or as with Meteor, using a model called publication/subscription. In this model, the server offers a range of publications and each client decides which dataset it wants to subscribe to. Compared with AJAX calls, the developer doesn't have to take care of any downloading or uploading logic. The Meteor client syncs all of the data automatically in the background as soon as it subscribes to a specific dataset. When data on the server changes, the server sends the updated documents to the clients and vice versa, as shown in the following diagram: Summary Meteor comes with more great ways of building pure JavaScript applications such as simple routing and simple ways to make components, which can be packaged for others to use. Meteor's reactivity model, which allows you to rerun any function and template helpers at will, allows for great consistent interfaces and simple dependency tracking, which is a key for large-scale JavaScript applications. If you want to dig deeper, buy the book and read How to build your own blog as single-page web application in a simple step-by-step fashion by using Meteor, the next generation web! Resources for Article: Further resources on this subject: Quick start - creating your first application [article] Meteor.js JavaScript Framework: Why Meteor Rocks! [article] Marionette View Types and Their Use [article]

0
0
1897

How-To Tutorials

Packt

05 Feb 2015

11 min read

Google App Engine

Packt

05 Feb 2015

11 min read

In this article by Massimiliano Pippi, author of the book Python for Google App Engine, in this article, you will learn how to write a web application and seeing the platform in action. Web applications commonly provide a set of features such as user authentication and data storage. App Engine provides the services and tools needed to implement such features. (For more resources related to this topic, see here.) In this article, we will see: Details of the webapp2 framework How to authenticate users Storing data on Google Cloud Datastore Building HTML pages using templates Experimenting on the Notes application To better explore App Engine and Cloud Platform capabilities, we need a real-world application to experiment on; something that's not trivial to write, with a reasonable list of requirements. A good candidate is a note-taking application; we will name it Notes. Notes enable the users to add, remove, and modify a list of notes; a note has a title and a body of text. Users can only see their personal notes, so they must authenticate before using the application. The main page of the application will show the list of notes for logged-in users and a form to add new ones. The code from the helloworld example is a good starting point. We can simply change the name of the root folder and the application field in the app.yaml file to match the new name we chose for the application, or we can start a new project from scratch named notes. Authenticating users The first requirement for our Notes application is showing the home page only to users who are logged in and redirect others to the login form; the users service provided by App Engine is exactly what we need and adding it to our MainHandler class is quite simple: import webapp2 from google.appengine.api import users class MainHandler(webapp2.RequestHandler): def get(self): user = users.get_current_user() if user is not None: self.response.write('Hello Notes!') else: login_url = users.create_login_url(self.request.uri) self.redirect(login_url) app = webapp2.WSGIApplication([ ('/', MainHandler) ], debug=True) The user package we import on the second line of the previous code provides access to users' service functionalities. Inside the get() method of the MainHandler class, we first check whether the user visiting the page has logged in or not. If they have, the get_current_user() method returns an instance of the user class provided by App Engine and representing an authenticated user; otherwise, it returns None as output. If the user is valid, we provide the response as we did before; otherwise, we redirect them to the Google login form. The URL of the login form is returned using the create_login_url() method, and we call it, passing as a parameter the URL we want to redirect users to after a successful authentication. In this case, we want to redirect users to the same URL they are visiting, provided by webapp2 in the self.request.uri property. The webapp2 framework also provides handlers with a redirect() method we can use to conveniently set the right status and location properties of the response object so that the client browsers will be redirected to the login page. HTML templates with Jinja2 Web applications provide rich and complex HTML user interfaces, and Notes is no exception but, so far, response objects in our applications contained just small pieces of text. We could include HTML tags as strings in our Python modules and write them in the response body but we can imagine how easily it could become messy and hard to maintain the code. We need to completely separate the Python code from HTML pages and that's exactly what a template engine does. A template is a piece of HTML code living in its own file and possibly containing additional, special tags; with the help of a template engine, from the Python script, we can load this file, properly parse special tags, if any, and return valid HTML code in the response body. App Engine includes in the Python runtime a well-known template engine: the Jinja2 library. To make the Jinja2 library available to our application, we need to add this code to the app.yaml file under the libraries section: libraries: - name: webapp2 version: "2.5.2" - name: jinja2 version: latest We can put the HTML code for the main page in a file called main.html inside the application root. We start with a very simple page: <!DOCTYPE html> <html> <head lang="en"> <meta charset="UTF-8"> <title>Notes</title> </head> <body> <div class="container"> <h1>Welcome to Notes!</h1> <p> Hello, <b>{{user}}</b> - <a href="{{logout_url}}">Logout</a> </p> </div> </body> </html> Most of the content is static, which means that it will be rendered as standard HTML as we see it but there is a part that is dynamic and whose content depend on which data will be passed at runtime to the rendering process. This data is commonly referred to as template context. What has to be dynamic is the username of the current user and the link used to log out from the application. The HTML code contains two special elements written in the Jinja2 template syntax, {{user}} and {{logout_url}}, that will be substituted before the final output occurs. Back to the Python script; we need to add the code to initialize the template engine before the MainHandler class definition: import os import jinja2 jinja_env = jinja2.Environment( loader=jinja2.FileSystemLoader(os.path.dirname(__file__))) The environment instance stores engine configuration and global objects, and it's used to load templates instances; in our case, instances are loaded from HTML files on the filesystem in the same directory as the Python script. To load and render our template, we add the following code to the MainHandler.get() method: class MainHandler(webapp2.RequestHandler): def get(self): user = users.get_current_user() if user is not None: logout_url = users.create_logout_url(self.request.uri) template_context = { 'user': user.nickname(), 'logout_url': logout_url, } template = jinja_env.get_template('main.html') self.response.out.write( template.render(template_context)) else: login_url = users.create_login_url(self.request.uri) self.redirect(login_url) Similar to how we get the login URL, the create_logout_url() method provided by the user service returns the absolute URI to the logout procedure that we assign to the logout_url variable. We then create the template_context dictionary that contains the context values we want to pass to the template engine for the rendering process. We assign the nickname of the current user to the user key in the dictionary and the logout URL string to the logout_url key. The get_template() method from the jinja_env instance takes the name of the file that contains the HTML code and returns a Jinja2 template object. To obtain the final output, we call the render() method on the template object passing in the template_context dictionary whose values will be accessed, specifying their respective keys in the HTML file with the template syntax elements {{user}} and {{logout_url}}. Handling forms The main page of the application is supposed to list all the notes that belong to the current user but there isn't any way to create such notes at the moment. We need to display a web form on the main page so that users can submit details and create a note. To display a form to collect data and create notes, we put the following HTML code right below the username and the logout link in the main.html template file: {% if note_title %} <p>Title: {{note_title}}</p> <p>Content: {{note_content}}</p> {% endif %} <h4>Add a new note</h4> <form action="" method="post"> <div class="form-group"> <label for="title">Title:</label> <input type="text" id="title" name="title" /> </div> <div class="form-group"> <label for="content">Content:</label> <textarea id="content" name="content"></textarea> </div> <div class="form-group"> <button type="submit">Save note</button> </div> </form> Before showing the form, a message is displayed only when the template context contains a variable named note_title. To do this, we use an if statement, executed between the {% if note_title %} and {% endif %} delimiters; similar delimiters are used to perform for loops or assign values inside a template. The action property of the form tag is empty; this means that upon form submission, the browser will perform a POST request to the same URL, which in this case is the home page URL. As our WSGI application maps the home page to the MainHandler class, we need to add a method to this class so that it can handle POST requests: class MainHandler(webapp2.RequestHandler): def get(self): user = users.get_current_user() if user is not None: logout_url = users.create_logout_url(self.request.uri) template_context = { 'user': user.nickname(), 'logout_url': logout_url, } template = jinja_env.get_template('main.html') self.response.out.write( template.render(template_context)) else: login_url = users.create_login_url(self.request.uri) self.redirect(login_url) def post(self): user = users.get_current_user() if user is None: self.error(401) logout_url = users.create_logout_url(self.request.uri) template_context = { 'user': user.nickname(), 'logout_url': logout_url, 'note_title': self.request.get('title'), 'note_content': self.request.get('content'), } template = jinja_env.get_template('main.html') self.response.out.write( template.render(template_context)) When the form is submitted, the handler is invoked and the post() method is called. We first check whether a valid user is logged in; if not, we raise an HTTP 401: Unauthorized error without serving any content in the response body. Since the HTML template is the same served by the get() method, we still need to add the logout URL and the user name to the context. In this case, we also store the data coming from the HTML form in the context. To access the form data, we call the get() method on the self.request object. The last three lines are boilerplate code to load and render the home page template. We can move this code in a separate method to avoid duplication: def _render_template(self, template_name, context=None): if context is None: context = {} template = jinja_env.get_template(template_name) return template.render(context) In the handler class, we will then use something like this to output the template rendering result: self.response.out.write( self._render_template('main.html', template_context)) We can try to submit the form and check whether the note title and content are actually displayed above the form. Summary Thanks to App Engine, we have already implemented a rich set of features with a relatively small effort so far. We have discovered some more details about the webapp2 framework and its capabilities, implementing a nontrivial request handler. We have learned how to use the App Engine users service to provide users authentication. We have delved into some fundamental details of Datastore and now we know how to structure data in grouped entities and how to effectively retrieve data with ancestor queries. In addition, we have created an HTML user interface with the help of the Jinja2 template library, learning how to serve static content such as CSS files. Resources for Article: Further resources on this subject: Machine Learning in IPython with scikit-learn [Article] Introspecting Maya, Python, and PyMEL [Article] Driving Visual Analyses with Automobile Data (Python) [Article]

0
0
3192

How-To Tutorials

Tour of Xcode

Basic and Interactive Plots

Lync 2013 Hybrid and Lync Online

Qlik Sense's Vision

The Five Kinds of Python Functions Python 3.4 Edition

Getting Your Own Video and Feeds

Upgrading the interface

Visualforce Development with Apex

Remote Access

Event-driven Programming

Trending Topics

Extending ElasticSearch with Scripting

Introduction to Apache ZooKeeper

Transformations Using Map/Reduce

Building the next generation Web with Meteor

Google App Engine

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access