
The Essentials of Working with Python Collections

Packt
09 Jul 2015
14 min read
In this article by Steven F. Lott, the author of the book Python Essentials, we'll look at the break and continue statements; these modify a for or while loop to allow skipping items or exiting before the loop has processed all items. This is a fundamental change in the semantics of a collection-processing statement.

Processing collections with the for statement

The for statement is an extremely versatile way to process every item in a collection. We do this by defining a target variable, a source of items, and a suite of statements. The for statement iterates through the source of items, assigning each item to the target variable and executing the suite of statements. All of the collections in Python provide the necessary methods, which means that we can use anything as the source of items in a for statement.

Here's some sample data that we'll work with. This is part of Mike Keith's poem, Near a Raven. We'll remove the punctuation to make the text easier to work with:

>>> text = '''Poe, E.
...     Near a Raven
...
... Midnights so dreary, tired and weary.'''
>>> text = text.replace(",", "").replace(".", "").lower()

The first statement puts the original text, with its uppercase letters and punctuation, into the text variable; the second strips the commas and periods and folds everything to lowercase. When we use text.split(), we get a sequence of individual words. The for loop can iterate through this sequence of words so that we can process each one. The syntax looks like this:

>>> cadaeic = {}
>>> for word in text.split():
...     cadaeic[word] = len(word)

We've created an empty dictionary and assigned it to the cadaeic variable. The expression in the for loop, text.split(), will create a sequence of substrings. Each of these substrings will be assigned to the word variable. The for loop body, a single assignment statement, will be executed once for each value assigned to word. The resulting dictionary might look like this (irrespective of ordering):

{'raven': 5, 'midnights': 9, 'dreary': 6, 'e': 1, 'weary': 5, 'near': 4, 'a': 1, 'poe': 3, 'and': 3, 'so': 2, 'tired': 5}

There's no guaranteed order for mappings or sets, so your results may differ slightly.

In addition to iterating over a sequence, we can also iterate over the keys in a dictionary:

>>> for word in sorted(cadaeic):
...     print(word, cadaeic[word])

When we use sorted() on a tuple or a list, an interim list is created with sorted items. When we apply sorted() to a mapping, the sorting applies to the keys of the mapping, creating a sequence of sorted keys. This loop will print the various Pilish words used in this poem in alphabetical order. Pilish is a subset of English where the word lengths are important: they're used as mnemonic aids.

A for statement corresponds to the "for all" logical quantifier, ∀. At the end of a simple for loop we can assert that all items in the source collection have been processed. In order to build the "there exists" quantifier, ∃, we can either use the while statement, or the break statement inside the body of a for statement.

Using literal lists in a for statement

We can apply the for statement to a sequence of literal values. One of the most common ways to present literals is as a tuple. It might look like this:

for scheme in 'http', 'https', 'ftp':
    do_something(scheme)

This will assign three different values to the scheme variable. For each of those values, it will evaluate the do_something() function. From this, we can see that, strictly speaking, the () are not required to delimit a tuple object.
If the sequence of values grows, however, and we need to span more than one physical line, we'll want to add the (), making the tuple literal more explicit.

Using the range() and enumerate() functions

The range() object will provide a sequence of numbers, often used in a for loop. The range() object is iterable: rather than building a list up front, it produces its items when required. If we use range() outside a for statement, we need to use a function like list(range(x)) or tuple(range(a, b)) to consume all of the generated values and create a new sequence object. The range() object has three commonly used forms:

range(n) produces ascending numbers including 0 but not including n itself. This is a half-open interval: range(n) produces numbers, x, such that 0 <= x < n. The expression list(range(5)) returns [0, 1, 2, 3, 4]. This produces n values, from 0 through n - 1.

range(a, b) produces ascending numbers starting from a but not including b. The expression tuple(range(-1, 3)) will return (-1, 0, 1, 2). This produces b - a values, from a through b - 1.

range(x, y, z) produces ascending numbers in the sequence x, x + z, x + 2z, and so on, stopping before y. This produces roughly (y - x) // z values.

We can use the range() object like this:

for n in range(1, 21):
    status = str(n)
    if n % 5 == 0: status += " fizz"
    if n % 7 == 0: status += " buzz"
    print(status)

In this example, we've used a range() object to produce values, n, such that 1 <= n < 21.

We can use the range() object to generate the index values for all items in a list:

for n in range(len(some_list)):
    print(n, some_list[n])

We've used the range() function to generate values between 0 and the length of the sequence object named some_list.

The for statement allows multiple target variables. The rules for multiple target variables are the same as for a multiple-variable assignment statement: a sequence object will be decomposed and items assigned to each variable. Because of that, we can leverage the enumerate() function to iterate through a sequence and assign the index values at the same time. It looks like this:

for n, v in enumerate(some_list):
    print(n, v)

The enumerate() function is a generator function which iterates through the items in the source sequence and yields a sequence of two-tuples holding the index and the item. Since we've provided two variables, each two-tuple is decomposed and assigned to them. There are numerous use cases for this multiple-assignment for loop. We often have list-of-tuples data structures that can be handled very neatly with this multiple-assignment feature.
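As a small, hypothetical illustration of that list-of-tuples pattern (the words and counts below are invented for the example), plain decomposition and enumerate() can be combined:

word_counts = [("poe", 3), ("raven", 5), ("midnights", 9)]

# Each two-tuple is decomposed into the two target variables.
for word, length in word_counts:
    print(word, length)

# enumerate() adds an index; the nested tuple is decomposed in the target list.
for position, (word, length) in enumerate(word_counts, start=1):
    print(position, word, length)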
Iterating with the while statement

The while statement is a more general iteration than the for statement. We'll use a while loop in two situations. We'll use it in cases where we don't have a finite collection to impose an upper bound on the loop's iteration; we may suggest an upper bound in the while clause itself. We'll also use it when writing a "search" or "there exists" kind of loop; we aren't processing all items in a collection.

A desktop application that accepts input from a user, for example, will often have a while loop. The application runs until the user decides to quit; there's no upper bound on the number of user interactions. For this kind of potentially infinite iteration, a while True: loop is generally recommended.

If we want to write a character-mode user interface, we could do it like this:

quit_received = False
while not quit_received:
    command = input("prompt> ")
    quit_received = process(command)

This will iterate until the quit_received variable is set to True. This will process indefinitely; there's no upper boundary on the number of iterations. The process() function might use some kind of command processing. This should include a statement like this:

if command.lower().startswith("quit"):
    return True

When the user enters "quit", the process() function will return True. This will be assigned to the quit_received variable. The while expression, not quit_received, will become False, and the loop ends.

A "there exists" loop will iterate through a collection, stopping at the first item that meets certain criteria. This can look complex because we're forced to make two details of loop processing explicit. Here's an example of searching for the first value that meets a condition. This example assumes that we have a function, condition(), which will eventually be True for some number. Here's how we can use a while statement to locate the minimum value for which this function is True:

>>> n = 1
>>> while n != 101 and not condition(n):
...     n += 1
>>> assert n == 101 or condition(n)

The while statement will terminate when n == 101 or condition(n) is True. As long as the condition is False, we advance the n variable to the next value in the sequence. Since we're iterating through the values in order from the smallest to the largest, we know that n will be the smallest value for which the condition() function is true. At the end of the while statement we have included a formal assertion that either n is 101 or the condition() function is True for the given value of n. Writing an assertion like this can help in design as well as debugging because it will often summarize the loop invariant condition.
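The condition() function is left abstract above; as a minimal, runnable stand-in, the sketch below uses a hypothetical predicate (the first n whose square exceeds 1,000, a threshold invented purely for illustration) so the loop and the assertion can be exercised end to end:

def condition(n):
    # Hypothetical predicate, invented for illustration: True once n * n passes 1,000.
    return n * n > 1000

n = 1
while n != 101 and not condition(n):
    n += 1
assert n == 101 or condition(n)
print(n)  # prints 32, the smallest n with n * n > 1000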
We can also write this kind of loop using the break statement in a for loop, something we'll look at in the next section.

The continue and break statements

The continue statement is helpful for skipping items without writing deeply nested if statements. The effect of executing a continue statement is to skip the rest of the loop's suite. In a for loop, this means that the next item will be taken from the source iterable. In a while loop, it must be used carefully to avoid an otherwise infinite iteration. We might see file processing that looks like this:

for line in some_file:
    clean = line.strip()
    if len(clean) == 0:
        continue
    data, _, _ = clean.partition("#")
    data = data.rstrip()
    if len(data) == 0:
        continue
    process(data)

In this loop, we're relying on the way files act like sequences of individual lines. For each line in the file, we've stripped whitespace from the input line and assigned the resulting string to the clean variable. If the length of this string is zero, the line was entirely whitespace, and we'll continue the loop with the next line. The continue statement skips the remaining statements in the body of the loop.

We'll partition the line into three pieces: a portion in front of any "#", the "#" itself (if present), and the portion after the "#". We've assigned the "#" character and any text after it to the same easily ignored variable, _, because we don't have any use for these two results of the partition() method. We can then strip any trailing whitespace from the string assigned to the data variable. If the resulting string has a length of zero, then the line is entirely filled with "#" and any trailing comment text. Since there's no useful data, we can continue the loop, ignoring this line of input. If the line passes the two if conditions, we can process the resulting data. By using the continue statement, we have avoided complex-looking, deeply nested if statements.

It's important to note that a continue statement must always be part of the suite inside an if statement, inside a for or while loop. The condition on that if statement becomes a filter condition that applies to the collection of data being processed. continue always applies to the innermost loop.

Breaking early from a loop

The break statement is a profound change in the semantics of the loop. An ordinary for statement can be summarized by "for all": we can comfortably say that "for all items in a collection, the suite of statements was processed." When we use a break statement, a loop is no longer summarized by "for all." We need to change our perspective to "there exists." A break statement asserts that at least one item in the collection matches the condition that leads to the execution of the break statement. Here's a simple example of a break statement:

for n in range(1, 100):
    factors = []
    for x in range(1, n):
        if n % x == 0:
            factors.append(x)
    if sum(factors) == n:
        break

We've written a loop that is bound by 1 <= n < 100. This loop includes a break statement, so it will not process all values of n. Instead, it will determine the smallest value of n for which n is equal to the sum of its factors. Since the loop doesn't examine all values, it shows that at least one such number exists within the given range. We've used a nested loop to determine the factors of the number n. This nested loop creates a sequence, factors, of all values of x in the range 1 <= x < n such that x is a factor of the number n. The inner loop doesn't have a break statement, so we are sure it examines all values in the given range. The least value for which this is true is the number six.

It's important to note that a break statement must always be part of the suite inside an if statement inside a for or while loop. If the break isn't in an if suite, the loop will always terminate while processing the first item. The condition on that if statement becomes the "there exists" condition that summarizes the loop as a whole. Clearly, multiple if statements with multiple break statements mean that the overall loop can have a potentially confusing and difficult-to-summarize post-condition.

Using the else clause on a loop

Python's else clause can be used on a for or while statement as well as on an if statement. The else clause executes after the loop body if no break statement was executed. To see this, here's a contrived example:

>>> for item in 1, 2, 3:
...     print(item)
...     if item == 2:
...         print("Found", item)
...         break
... else:
...     print("Found Nothing")

The for statement here will iterate over a short list of literal values. When a specific target value has been found, a message is printed. Then, the break statement ends the loop, avoiding the else clause. When we run this, we'll see three lines of output, like this:

1
2
Found 2

The value of three isn't shown, nor is the "Found Nothing" message in the else clause. If we change the target value in the if statement from two to a value that won't be seen (for example, zero or four), then the output will change. If the break statement is not executed, then the else clause will be executed. The idea here is to allow us to write contrasting break and non-break suites of statements. An if statement suite that includes a break statement can do some processing in the suite before the break statement ends the loop. An else clause allows some processing at the end of the loop when none of the break-related suites were executed.
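To see the contrasting non-break path, here is the same contrived loop with the target changed to four, one of the values the text suggests trying, which never appears in the sequence:

for item in 1, 2, 3:
    print(item)
    if item == 4:            # never true, so break is never reached
        print("Found", item)
        break
else:
    print("Found Nothing")   # runs because the loop finished without a break

This prints 1, 2, 3, and then Found Nothing.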
Summary

In this article, we've looked at the for statement, which is the primary way we'll process the individual items in a collection. A simple for statement assures us that our processing has been done for all items in the collection. We've also looked at the general-purpose while loop.

Resources for Article:

Further resources on this subject:
Introspecting Maya, Python, and PyMEL [article]
Analyzing a Complex Dataset [article]
Geo-Spatial Data in Python: Working with Geometry [article]

Process of designing XenDesktop® deployments

Packt
09 Jul 2015
29 min read
In this article by Govardhan Gunnala and Daniele Tosatto, authors of the book Mastering Citrix® XenDesktop®, we will discuss the process of designing XenDesktop® deployments.

The uniqueness of the XenDesktop architecture is its modular five-layer model, which covers all the key decisions in designing a XenDesktop deployment:

- User layer: Defines the users and their requirements
- Access layer: Defines how the users will access the resources
- Desktop/resource layer: Defines what resources will be delivered
- Control layer: Defines managing and maintaining the solution
- Hardware layer: Defines what resources are needed for implementing the chosen solution

While FMA is simple at a high level, its implementation can become complex depending on the technologies and options that are chosen for each component across the layers of FMA. Along with great flexibility comes the responsibility of diligently choosing the technologies and options that fulfill your business requirements. Importantly, the decisions made in the first three layers impact the last two layers of the deployment. This means that fixing a wrong decision anywhere in the first three layers during or after the implementation stage leaves little or no room for correction, and may even force you to implement the solution from scratch again. Your design decisions determine how effectively your solution meets the given business requirements.

The layered architecture of the XenDesktop FMA features the components at each layer; each component of XenDesktop falls under one of these layers. We'll see what decisions are to be made for each of these components at each layer in the next subsection.

Decisions to be made at each layer

I would have to write a separate book to discuss all the possible technologies and options that are available at each layer. The following is a highly summarized list of the decisions to be made at each layer. It will help you realize the breadth of designing XenDesktop, and this high-level coverage of the various options will help you locate and consider all the possible options for making the right decisions without missing any considerations.

User layer

The user layer refers to the specification of the users who will utilize the XenDesktop deployment. A business requirement statement may mention that the service users can be either internal business users or external customers accessing the service from the Internet. Furthermore, both of these user groups may also need mobile access to the XenDesktop services.

Citrix Receiver is the only component that belongs to the user layer, and XenDesktop depends on it for successfully delivering a XenDesktop session. By correlating this technical aspect with the preceding business requirement statement, one needs to consider all the possible aspects of the Receiver software on the client devices. This involves making the following decisions:

Endpoint/user devices: What are the devices from which the users are supposed to access the services? Who owns and administers those devices throughout their lifecycle?
- Endpoints supported: Corporate computers, laptops, or mobiles running on Windows, or thin clients; user smart devices, such as Android tablets, Apple iPads, and so on. In the case of service providers, the endpoints can usually be any device, and they all need to be supported.
- Endpoint ownership: Device management includes security, availability, and compliance, and maintaining responsibility for the devices on the network.
- Endpoint lifecycle: Devices become outdated or limited very quickly. Define the minimum device hardware requirements to run your business workloads.
- Endpoint form factor: Choose devices that may be fully featured, limited thin clients, or a mix of both, to support features such as HDX graphics, multi-monitor setups, and so on.
- Thin client selection: Decide whether thin clients, such as Dell Wyse zero clients, running on limited-functionality operating systems would satisfy your user requirements, and understand their licensing cost.

Receiver selection: Once you determine your endpoint device and its capabilities, you need to decide on the Receiver that can run on the devices. The greatest thing is that Receiver is available for almost any device.
- Receiver type: Choose the Receiver that is required for your device. Since the Receiver software for each platform (OS) differs, it is important to use the appropriate Receiver software for your devices, considering the platform it runs on. You can download the appropriate Receiver software for your device from the http://www.Citrix.com/go/receiver.html page.
- Initial deployment: Receiver is like any other software that will fit into your overall application portfolio. Determine how you would deploy this application on your devices. For corporate desktops and mobiles, you may use enterprise application deployment and mobile device management software. Otherwise, the users will be prompted to install it when they access the StoreFront URL, or they can download it from Citrix to facilitate the installation process. For user-managed mobile devices, it is available from the respective Windows, Google, or Apple stores/marketplaces.
- Initial configuration: Similar to other applications, Receiver requires certain initial configuration. It can be configured manually, or by using a provisioning file, group policy, or e-mail-based discovery.
- Keeping the Receiver software up to date: Once you have installed Receiver on user devices, you will also require a mechanism for deploying updates to Receiver. This can be the same mechanism used for the initial deployment.

Access layer

The access layer refers to the specification of how the users gain access to the resources. A business requirement statement may state that the users should be validated before gaining access, and that the access should be secured when the user connects over the Internet.

The technical components that fall under this layer include firewall(s), NetScaler, and StoreFront. These components play a broader role in the overall networking infrastructure of the company, which also includes XenDesktop as well as the complete Citrix solutions in the environment. Their key activities include firewalling, external-to-internal IP address NATing, NetScaler Gateway for securing the connection between the virtual desktop and the user device, global load balancing, user validation/authentication, and GUI presentation of the enumerated resources to the end users. It involves making the following decisions:

Authentication:
- Authentication point: A user can be authenticated at the NetScaler Gateway or at StoreFront.
- Authentication policy: Various business use cases and compliance requirements make certain modes of authentication mandatory.
You can choose from the different authentication methods supported at:
- StoreFront: Basic authentication using a username and password, Domain pass-through, NetScaler Gateway pass-through, smart card, and even unauthenticated access.
- NetScaler Gateway: LDAP, RADIUS (token), and client certificates.

StoreFront: The decisions to be made around StoreFront are as follows:
- Unauthenticated access: Provides access to users who don't supply a username and a password but are still able to access the resources allowed by the administrator. Usually, this fits well with public or kiosk systems.
- High availability: Making the StoreFront servers available at all times, through hardware load balancing, DNS round robin, Windows network load balancing, and so on.
- Delivery controller high availability and StoreFront: Building high availability for the delivery controllers is recommended, since they are needed for forming successful connections. Defining more than one delivery controller for the stores makes StoreFront fail over automatically to the next server in the list.
- Security, inbound traffic: Consider securing the user connection to virtual desktops, from the internal StoreFront and the external NetScaler Gateway.
- Security, backend traffic: Consider securing the communication between StoreFront and the XML services running on the controller servers. As this traffic stays within the internal network, it can be secured using an internal private certificate.
- Routing Receiver with beacons: Receiver uses websites called beacons to identify whether the user connection is internal or external. StoreFront provides Receiver with the http(s) addresses of the beacon points during the initial connection.
- Resource presentation: StoreFront presents a webpage that provides self-service access to the resources by the user.
- Scalability: The StoreFront server load and capacity for the user workload.
- Multi-site app synchronization: StoreFront can connect to the controllers of multiple site deployments and can replicate the user-subscribed applications across the servers.

NetScaler Gateway: For the NetScaler Gateway, the decisions regarding secured external user access from the public Internet involve the following:
- Topology: NetScaler supports two topologies: 1-arm (normal security) and 2-arm (high security).
- High availability: The NetScaler Gateways can be configured in pairs to provide high availability.
- Platform: NetScaler is available on different platforms, such as VPX, MPX, and SDX, which have different SSL throughput and SSL Transactions Per Second (TPS) metrics.
- Pre-authentication policy: Specifies the Endpoint Analysis (EPA) scans for evaluating whether the endpoints meet the preset security criteria. This is available when NetScaler is chosen as the authentication point.
- Session management: The session policies define the overall user experience by classifying the endpoints into mobile and non-mobile devices. The session profile defines the details needed for gaining access to the environment; it comes in two forms: SSLVPN and HDX proxy.
- Preferred data center: In multi-active data center deployments, StoreFront can determine a user's primary data center for resources, and NetScaler can direct the user connections to it. Static and dynamic methods are used for specifying the preferred data center.

Desktop/resource layer

The desktop or resource layer refers to the specification of which resources (applications and desktops) users will receive.
This layer comes with various options, which are tailored to business user roles and their requirements. This layer makes XenDesktop a better fit for achieving the varying user needs across departments. It includes the specification of the FlexCast model (type of desktop), user personalization, and delivery of applications to the users in the desktop session.

An example business requirement statement may specify that all permanent employees require a desktop with all the basic applications pre-installed based on their team and role, with their user settings and data retained; and that all contract employees should get a basic desktop with controlled, on-demand access to applications and no retention of their user data.

This layer includes various components, such as profile management solutions (including Windows profiles, Citrix Profile Management, and AppSense), Citrix print server, Windows operating systems, application delivery, and so on. It involves making decisions such as:

Images: This involves choosing the FlexCast model that is tailored to the user requirements, thereby delivering the expected desktop behavior to the end users:
- Operating system: This requires choosing the desktop or server operating system for your master image, which depends on the FlexCast model you are choosing from:
  - Hosted shared
  - Hosted VDI: pooled-static, pooled-random, pooled with PvD, dedicated, existing, physical/Remote PC, streamed, and streamed with PvD
  - Streamed VHD
  - Local VM
  - On-demand apps
  - Local apps
  In the case of a desktop OS, it's also important to choose the right OS architecture (32-bit or 64-bit) according to the processor architecture of the desktop.
- Computer policies: Define the controls over the user connection, security and bandwidth settings, devices or connection types, and so on. Specify all the policy features similarly to the user policies.
- Machine catalogs: Define your catalog settings, including the FlexCast model, AD computer accounts, provisioning method, OS of the base image, and so on.
- Delivery groups: Assign desktops or applications to the user groups.
- Application folders: This is a tidy interface feature in Studio for organizing the applications into folders for easy management.
- StoreFront integration: This is an option for specifying the StoreFront URL for the Receiver in the master image so that users are automatically connected to the StoreFront in the session.
- Resource allocation: This defines the hardware resources for the desktop VMs. It primarily involves hosts and storage. Depending on your estimated workloads, you can define resources such as the number of virtual processors (vCPU), the amount of virtual memory (vRAM), and the storage required for the needed disk space, along with the following:
  - Graphics (GPU): For advanced use cases, you may choose to allocate pass-through GPUs, hardware vGPUs, or software vGPUs.
  - IOPS: Depending on the operating system, the FlexCast model, and the estimated workloads, you can analyze the overall IOPS load of the system and plan the corresponding hardware to support that load.
- Optimizations: Depending on the operating system, you can apply various optimizations to the Windows installation on the master image. This greatly reduces the overall load later.
- Bandwidth requirements: Bandwidth can be a limiting factor for WAN and remote user connections over slow networks.
Bandwidth consumption and user experience depend on various factors, such as the operating system being used, the application design, and the screen resolution. To retain a high-quality user experience, it's important to consider the bandwidth requirements and optimization technologies, as follows:
- Bandwidth-minimizing technologies: These include Quality of Service (QoS), HDX RealTime, and WAN optimization with Citrix's own CloudBridge solution.
- HDX encoding method: The HDX encoding method also affects the bandwidth usage. For XenDesktop 7.x, there are three encoding methods available, employed by the HDX protocol as appropriate: Desktop Composition Redirection, H.264 Enhanced SuperCodec, and Legacy Mode (XenDesktop 5.x Adaptive Display).
- Session bandwidth: The bandwidth needed in a session depends on the user's interaction with the desktop and applications.
- Latency: HDX can typically perform well up to 300 ms of latency; the experience begins to degrade as latency increases.

Personalization: This is an essential element of the desktop environment. It involves decisions that are critical for end user experience and acceptance, and for the overall success of the solution during implementation. The following decisions are involved in personalization:

User profiles: This involves the decisions related to user login, roaming of user settings, and a seamless profile experience across the overall Windows network:
- Profile type: Choose which profile type works for your user requirements. Possible options include local, roaming, mandatory, and hybrid profiles with Citrix Profile Management. Citrix Profile Management provides various additional features, such as profile streaming, active write back, configuring profiles using an .ini file, and so on.
- Folder redirection: This option covers the special folders, such as AppData, Desktop, and so on, in which the user's application settings are saved.
- Folder exclusion: This option sets which folders are excluded from being saved in the user profile. Usually, it covers the local and IE temp folders of a user profile.
- Profile caching: Caching profiles on the local system improves the user login experience and occurs by default. You need to consider this depending on the type of virtual desktop FlexCast model.
- Profile permissions: Specify whether administrators need access to the user profiles, based on information sensitivity.
- Profile path: The decision to place the user profiles on a network location for high availability. It affects logon performance depending on how close the profile is to the virtual desktop from which the user is logging on. It can be managed either from Active Directory or through Citrix Profile Management.
- User profile replication between data centers: This involves making the user profiles highly available and supporting profile roaming among multiple data centers.

User policies: This involves decisions about deploying user settings and controlling them using management policies, providing consistent settings for users:
- Preferred policy engine: This requires choosing how policies are processed for the Windows systems. Citrix policies can be defined and managed from either Citrix Studio or Active Directory group policy.
- Policy filtering: Citrix policies can be applied to users and their desktops with the various filter options available in the Citrix policy engine.
If group policies are used, you'll use the group policy filtering options.
- Policy precedence: The Citrix policies are processed in the order LCSDOU (Local, Citrix, Site, Domain, OU policies).
- Baseline policy: This defines the policy with default and common settings for all desktop images. Citrix provides policy templates that suit specific business use cases. A baseline should cover security requirements, common network conditions, and management of the user device or user profile requirements. Such a baseline can be configured using security policies, connection-based policies, device-based policies, and profile-based policies.

Printing: This is one of the most common desktop user requirements. XenDesktop supports printing for various scenarios. The printing technology involves deploying and using the appropriate drivers:
- Provisioning printers: These can be either a static or a dynamic set of printers. The options for dynamic printers are to auto-create all the client printers or to auto-create the non-network client printers only. You can also set options for session printers through Citrix policy, which can include either static or dynamic printers; furthermore, you can set proximity printers.
- Managing print drivers: This can be configured so that printer drivers are auto-installed during session creation, installed using the generic Citrix universal printer driver, or installed manually. You can also have all the known drivers preinstalled on the master image. Citrix even provides the Citrix Universal Print Server, which extends XenDesktop universal printing support to network printing.
- Print job routing: Jobs can be routed through the client device or through the network print server. The ICA protocol is used for compressing and sending the data.

Personal vDisk: Desktops with personal vDisks retain the user changes. Choosing a personal vDisk depends on the user requirements and the FlexCast model that was opted for. A personal vDisk can be thin provisioned for estimated growth, but it can't be shrunk later.

Applications: Separating applications into their own layer improves the scalability of the overall desktop solution. Applications are critical elements that users require from a desktop environment:
- Application delivery method: Applications can be installed on the base image, installed on the personal vDisks, streamed into the session, or delivered through the on-demand XenApp hosted mode. The choice also depends on application compatibility, and resolving compatibility issues effectively requires technical expertise and tools such as AppDNA.
- Application streaming: XenDesktop supports App-V to build isolated application packages, which can be streamed to desktops.
- 16-bit legacy application delivery: If any legacy 16-bit applications have to be supported, you can choose from a 32-bit OS, a VM-hosted app, or a parallel XenApp 5 deployment.

Control layer

The control layer covers all the backend systems that are required for managing and maintaining the overall solution through its life cycle. The control layer includes most of the XenDesktop components, which are further classified into categories: resource/access controllers, image/desktop controllers, and infrastructure controllers.
These correspond, respectively, to the first three layers of FMA, as shown here:
- Resource/access controllers: Support the access layer
- Image/desktop controllers: Support the desktop/resource layer
- Infrastructure controllers: Provide the underlying infrastructure for the overall FMA components/environment

This layer involves the specification of the capacity, configuration, and topology of the environment. Building the required/planned redundancy for each of these components enables enterprise capabilities such as HA, scalability, disaster recovery, load balancing, and so on. Components and technologies that operate under this layer include Active Directory, group policies, the site database, Citrix licensing, XenDesktop delivery controllers, the XenClient hypervisor, the Windows Server and desktop operating systems, provisioning services (either MCS or PVS) and their controllers, and so on.

An example business requirement statement may be as follows: build a highly available desktop environment for a fast-growing business user group; we currently have a head count of 30 users, which is expected to double in a year. It involves making the following decisions:

Infrastructure controllers: This includes the common infrastructure that is required for XenDesktop to function in a Windows domain network.

Active Directory: This is used for the authentication and authorization of users in a Citrix environment. It's also responsible for providing and synchronizing time on the systems, which is critical for Kerberos. For the most part, your AD structure will already be in place, and it may require certain changes to accommodate your XenDesktop requirements, such as:
- Forest design: This involves the AD forest and domain decisions, such as multi-domain, multi-forest, and domain and forest trusts, which will define the users of the XenDesktop resources.
- Site design: This involves choosing the number of sites that represent your geographical locations, the number of domain controllers, the subnets that accommodate the IP addresses, the site links for replication, and so on.
- Organizational unit structure: Planning the OU structure for easier management of XenDesktop workers and VDAs. In multi-forest deployment scenarios (as supported in App Orchestration), having the same OU structure is critical.
- Naming standards: Planning proper conventions for XenDesktop AD objects, which include users, security groups, XenDesktop servers, OUs, and so on.
- User groups: This helps in choosing between individual user names and groups. User security groups are recommended, as they reduce validation to just one object regardless of the number of users in it.
- Policy control: This helps in planning GPO ordering and sizing, inheritance, filtering, enforcement, blocking, and loopback processing for reducing the overall processing time on the VDAs and servers.

Database: Citrix uses the Microsoft SQL Server database for most of its products:
- Edition: Microsoft ships SQL Server in different editions, which provide varying features and capabilities. Using the Standard edition is recommended for typical XenDesktop production deployments. For larger/enterprise deployments, a higher edition may be required, depending on the requirements.
- Database and transaction log sizing: This involves estimating the storage requirements for the site configuration database, the monitoring database, and the configuration logging database.
- Database location: By default, the configuration logging and monitoring databases are located within the site configuration database. Separating these into their own databases and relocating the monitoring database to a different SQL server is recommended.
- High availability: Choose from VM-level HA, mirroring, an AlwaysOn failover cluster, or AlwaysOn availability groups.
- Database creation: Usually, the databases are created automatically during the XenDesktop installation. Alternatively, they can be created by using scripts.

Citrix licensing: Citrix licensing for XenDesktop requires a Citrix license server on the network. You can install and manage multiple Citrix licenses:
- License type: Choose from user, device, and concurrent licenses.
- Version: Citrix's new license servers are backward compatible.
- Sizing: A license server can be scaled out to support a higher number of license requests per second.
- High availability: The license server comes with a 30-day grace period, which usually helps in recovering from failures. High availability for the license server can be implemented through Windows clustering technology or duplication of the virtual server.
- Optimization: Optimize the number of receiving and processing threads depending on your hardware. This is generally required in large and heavily loaded enterprise environments.

Resource controllers: The resource controllers include the XenDesktop and XenApp delivery controllers and the XenClient synchronizer, as shown here:

XenDesktop and XenApp delivery controller:
- Number of sites: This is decided based on the network, risk tolerance, and security requirements.
- Delivery controller sizing: Delivery controller scalability is based on CPU utilization. The more processor cores are available, the more virtual desktops a controller can support.
- High availability: Always plan for an N+1 deployment of the controllers to achieve HA. Then, update the controller details on the VDAs through policy.
- Host connection configuration: Host connections define the hosts, storage repositories, and guest networks to be used by the virtual machines on the hypervisors.
- XML service encryption: The XML service protocol running on the delivery controllers uses clear text for exchanging all data except passwords. Consider using SSL encryption to send the StoreFront data over a secure HTTP connection.
- Server OS load management: The default maximum number of sessions per server is set to 250. Using real-time usage monitoring and load analysis, you can define appropriate load management policies.
- Session prelaunch and session linger: These are designed to help users access applications quickly, by starting sessions before they are requested (session prelaunch) and by keeping user sessions active after a user closes all the applications in a session (session linger).

XenClient synchronizer: This includes considerations for its architecture, processor specification, memory specification, network specification, high availability, the SQL database, remote synchronizer servers, storage repository size and location, external access, and Active Directory integration.

Image controllers: This includes all the image provisioning controllers. MCS is built into the delivery controller. For PVS, we'll have considerations such as the following:
- Farms: A farm represents the top level of the PVS infrastructure. Depending on your networking and administration boundaries, you can define the number of farms to be deployed in your environment.
- Sites: Each farm consists of one or more sites, which contain all the PVS objects. While multiple sites share the same database, the target devices can only fail over to other Provisioning Servers within the same site. Your networking and organization structure determines the number of sites in your deployment.
- High availability: If implemented, PVS will be a critical component of the virtual desktop infrastructure. HA should be considered for its database, the PVS servers, vDisks and storage, networking and TFTP, and so on.
- Bootstrap delivery: There are three methods by which the target device can receive the bootstrap program: DHCP options, PXE broadcasts, and the boot device manager.
- Write cache placement: The write cache uniquely identifies the target device by including the target device's MAC address and disk identifier. The write cache can be placed in the following locations: cache on the device hard drive, cache on the device hard drive persisted, cache in the device RAM, cache in the device RAM with overflow on the hard disk, cache on the Provisioning Server disk, and cache on the Provisioning Server disk persisted.
- vDisk format and replication: PVS supports the use of fixed-size or dynamic vDisks. vDisks hosted on SAN, local, or direct attached storage must be replicated between the vDisk stores whenever a vDisk is created or changed. They can be replicated either manually or automatically.
- Virtual or physical servers, processor and memory: Virtual Provisioning Servers are preferred when sufficient processor, memory, disk, and networking resources are guaranteed.
- Scale up or scale out: Determining whether to scale up or scale out the servers requires considering factors like redundancy, failover times, data center capacity, hardware costs, hosting costs, and so on.
- Bandwidth requirements and network configuration: PVS can boot 500 devices simultaneously. A 10 Gbps network is recommended for provisioning services. The network configuration should consider the PVS uplink, the hypervisor uplink, and the VM uplink. Recommended switch settings include Disable Spanning Tree or Enable PortFast, storm control, and broadcast helper.
- Network interfaces: Teaming multiple network interfaces with link aggregation can provide greater throughput. Consider the NIC features TCP offloading and Receive Side Scaling (RSS) while selecting NICs.
- Subnet affinity: This is a load balancing algorithm that helps ensure that the target devices are connected to the most appropriate Provisioning Server. It can be configured as Best Effort or Fixed.
- Auditing: By default, the auditing feature is disabled. When enabled, the audit trail information is written to the provisioning services database along with the general configuration data.
- Antivirus: Antivirus software can cause file-locking issues on the PVS server by contending for the files being accessed by the PVS services. The vDisk store and the write cache should be excluded from antivirus scans in order to prevent file contention issues.

Hardware layer

The hardware layer involves choosing the right capacity, make, and hardware features of the backend systems that are required for the overall solution as defined in the control layer. In line with the control layer, the hardware layer decisions will change if any of the first three layers' decisions change.
Components and technologies that operate under this layer include server hardware, storage technologies, hard disks and RAID configurations, hypervisors and their management software, backup solutions, monitoring, network devices and connectivity, and so on. It involves making the decisions shown here:

Hardware sizing: Hardware sizing is usually done in one of two ways. The first, and preferred, way is to plan ahead and purchase the hardware based on the workload requirements. The second way is to size the hosts so that the existing hardware is used in the best configuration to support the different workload requirements:
- Workload separation: Workloads can either be separated into dedicated resource clusters or be mixed on the same physical hosts.
- Control host sizing: The VM resource allocation for each control component should be determined in the control layer and allocated accordingly.
- Desktop host sizing: This involves choosing the physical resources required for the virtual desktops as well as the hosted server deployments. It includes estimating the pCPU, pRAM, GPU, and the number of hosts (a rough sizing sketch follows at the end of this section).

Hypervisors: This involves choosing from the supported hypervisors, which include the major players Hyper-V, XenServer, and ESX. Choosing between these requires considering a wide range of parameters, such as host hardware (processor and memory), storage requirements, network requirements, scale up/out, and host scalability. Further considerations include the following:
- Networking: Networks, physical NICs, NIC teaming, virtual NICs for hosts, virtual NICs for guests, and IP addressing
- VM provisioning: Templates
- High availability: Microsoft Hyper-V: failover clustering, cluster shared volumes, and CSV cache; VMware ESXi: VMware vSphere high availability cluster; Citrix XenServer: XenServer high availability using the server pool
- Monitoring: Use the hypervisor vendor's management and monitoring tools for hypervisor monitoring, and the hardware vendor's monitoring tools for hardware-level monitoring
- Backup and recovery: The backup method and the components to be backed up
- Storage: Storage architecture, RAID level, number of disks, disk type, storage bandwidth, tiered storage, thin provisioning, and data de-duplication
- Disaster recovery

Data center utilization: XenDesktop deployments can leverage multiple data centers for improving user performance and the availability of resources. Multiple data centers can be deployed in an active/active or an active/passive configuration. An active/active configuration allows both data centers to be utilized, although individual users are tied to a specific location.
- Data center connectivity: An active/active data center configuration utilizing GSLB (Global Server Load Balancing) ensures that users will be able to establish a connection even if one data center is unavailable. In the active/active configuration, the considerations that should be made are data center failover time, application servers, and StoreFront optimal routing.
- Capacity in the secondary data center: Planning the secondary data center capacity is determined by the cost and by the decision to support full capacity in each data center. A percentage of the overall users, or a percentage of the users per application, may be considered for the secondary data center facility. It also needs consideration of the type and amount of resources that will be made available in a failover scenario.
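As a rough illustration of the desktop host sizing arithmetic mentioned above, the short Python sketch below estimates the number of hosts needed from a handful of inputs. Every number in it (user count, vCPU and vRAM per desktop, overcommit ratio, cores and RAM per host) is a hypothetical placeholder, not a Citrix recommendation; real sizing should come from workload assessment and the vendor handbooks.

# Hypothetical desktop host sizing estimate; every input below is a placeholder.
import math

users = 500                 # concurrent virtual desktops to host
vcpu_per_desktop = 2        # vCPUs assigned to each desktop VM
ram_per_desktop_gb = 4      # vRAM assigned to each desktop VM
overcommit_ratio = 4        # vCPUs allowed per physical core
cores_per_host = 32         # physical cores in one host
ram_per_host_gb = 512       # usable RAM in one host

desktops_by_cpu = (cores_per_host * overcommit_ratio) // vcpu_per_desktop
desktops_by_ram = ram_per_host_gb // ram_per_desktop_gb
desktops_per_host = min(desktops_by_cpu, desktops_by_ram)

hosts = math.ceil(users / desktops_per_host) + 1  # +1 host for N+1 redundancy
print(f"{desktops_per_host} desktops per host, {hosts} hosts (N+1)")

With these placeholder inputs, the CPU budget is the limiting factor (64 desktops per host), giving 9 hosts including the N+1 spare.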
Tools for designing XenDesktop®

In the previous section, we saw a broad list of the components, technologies, and configuration options involved in the process of designing a XenDesktop deployment. Obviously, designing a XenDesktop deployment for large, advanced, and complex business scenarios is a mammoth task, which requires operational knowledge of a broad range of technologies. Understanding the maze of this complexity, Citrix constantly helps customers with great learning resources: handbooks, reviewer guides, blueprints, online eDocs, and training sessions. To ease the life of technical architects and XenDesktop design and deployment consultants, Citrix has developed an online design portal called Project Accelerator, which automates, streamlines, and covers all the broad aspects involved in a XenDesktop deployment.

Project Accelerator

Citrix designed Project Accelerator as a web-based design tool, and it is available to customers after they log in. Its design is based on the Citrix Consulting best practices for XenDesktop deployment and implementation. It follows the layered FMA and allows you to create a close-to-deployment architecture. It covers all the key decisions and facilitates modifying them and evaluating their impact on the overall architecture. Upon completion of the design, it generates an architectural diagram and a deployment sizing plan. One can define more than one project and customize them in parallel to achieve multiple deployment plans. I highly recommend starting your production XenDesktop deployment with the Project Accelerator architecture and sizing design.

Virtual Desktop Handbook

Citrix provides the handbook along with new XenDesktop releases. The handbook covers the latest features of that XenDesktop version and provides detailed information on the design decisions. It provides all the possible options for each of the decisions involved, and these options are evaluated and validated in depth by the Citrix Solutions Lab. They include the leading Citrix Consulting best practices as well. This helps architects and engineers to consider the recommended technologies, and then evaluate them further for fulfilling the business requirements. The Virtual Desktop Handbook for the latest version of XenDesktop, that is, 7.x, can be found at http://support.Citrix.com/article/CTX139331.

XenDesktop® Reviewer's Guide

The Reviewer's Guide is also released along with new versions of XenDesktop. It is designed to help businesses quickly install and configure XenDesktop for evaluation. It provides a step-by-step screencast of the installation and configuration wizards of XenDesktop, giving IT administrators practical guidance for successfully installing and delivering XenDesktop sessions. The XenDesktop Reviewer's Guide for the latest version of XenDesktop, that is, 7.6, can be found at https://www.citrix.com/content/dam/citrix/en_us/documents/products-solutions/xendesktop-reviewers-guide.pdf.

Summary

We learned about the decision making involved in designing XenDesktop in general, and we also saw the deployment designs of complex environments involving cloud capabilities. We also saw the different tools for designing XenDesktop.
Resources for Article:

Further resources on this subject:
Understanding Citrix® Provisioning Services 7.0 [article]
Installation and Deployment of Citrix Systems®' CPSM [article]
Designing, Sizing, Building, and Configuring Citrix VDI-in-a-Box [article]

Clustering and Other Unsupervised Learning Methods

Packt
09 Jul 2015
19 min read
In this article by Ferran Garcia Pagans, author of the book Predictive Analytics Using Rattle and Qlik Sense, we will learn about the following:
- Defining machine learning
- Introducing unsupervised and supervised methods
- Focusing on K-means, a classic machine learning algorithm, in detail

We'll create clusters of customers based on their annual money spent. This will give us a new insight: being able to group our customers based on their annual spend will allow us to see the profitability of each customer group and deliver more profitable marketing campaigns or create tailored discounts. Finally, we'll see hierarchical clustering, different clustering methods, and association rules. Association rules are generally used for market basket analysis.

Machine learning – unsupervised and supervised learning

Machine Learning (ML) is a set of techniques and algorithms that gives computers the ability to learn. These techniques are generic and can be used in various fields. Data mining uses ML techniques to create insights and predictions from data. In data mining, we usually divide ML methods into two main groups: supervised learning and unsupervised learning. A computer can learn with the help of a teacher (supervised learning) or can discover new knowledge without the assistance of a teacher (unsupervised learning).

In supervised learning, the learner is trained with a set of examples (a dataset) that contains the right answer; we call it the training dataset. We call a dataset that contains the answers a labeled dataset, because each observation is labeled with its answer. In supervised learning, you are supervising the computer by giving it the right answers. For example, a bank can try to predict a borrower's chance of defaulting on credit loans based on the experience of past credit loans. The training dataset would contain data from past credit loans, including whether the borrower was a defaulter or not.

In unsupervised learning, our dataset doesn't have the right answers, and the learner tries to discover hidden patterns in the data. We call it unsupervised learning because we're not supervising the computer by giving it the right answers. A classic example is trying to create a classification of customers: the model tries to discover similarities between customers. In some machine learning problems, we don't have a dataset that contains past observations. These datasets are not labeled with the correct answers, and we call them unlabeled datasets.

In traditional data mining, the terms descriptive analytics and predictive analytics are used for unsupervised learning and supervised learning, respectively. In unsupervised learning, there is no target variable. The objective of unsupervised learning, or descriptive analytics, is to discover the hidden structure of data. There are two main unsupervised learning techniques offered by Rattle:
- Cluster analysis
- Association analysis

Cluster analysis

Sometimes, we have a group of observations and we need to split it into a number of subsets of similar observations. Cluster analysis is a group of techniques that will help you to discover these similarities between observations. Market segmentation is an example of cluster analysis. You can use cluster analysis when you have a lot of customers and you want to divide them into different market segments, but you don't know how to create these segments. Sometimes, especially with a large number of customers, we need some help to understand our data.
Clustering can help us to create different customer groups based on their buying behavior. In Rattle's Cluster tab, there are four cluster algorithms:
KMeans
EwKm
Hierarchical
BiCluster
The two most popular families of cluster algorithms are hierarchical clustering and centroid-based clustering.
Centroid-based clustering using the K-means algorithm
I'm going to use K-means as an example of this family because it is the most popular. With this algorithm, a cluster is represented by a point or center called the centroid. In the initialization step of K-means, we need to create k centroids; usually, the centroids are initialized randomly. In the following diagram, the observations or objects are represented by points and three centroids are represented by three colored stars:
After this initialization step, the algorithm enters an iteration with two operations. First, the computer associates each object with the nearest centroid, creating k clusters. Then, the computer recalculates each centroid's position; the new position is the mean of each attribute over every cluster member. This example is very simple, but in real life, when the algorithm reassigns the observations to the new centroids, some observations move from one cluster to another. The algorithm keeps recalculating centroids and assigning observations to clusters until some finalization condition is reached, as shown in this diagram:
The inputs of a K-means algorithm are the observations and the number of clusters, k. The final result of a K-means algorithm is k centroids that represent each cluster, together with the observations associated with each cluster. The drawbacks of this technique are:
You need to know or decide the number of clusters, k, and the result of the algorithm depends heavily on k.
The result of the algorithm depends on where the centroids are initialized.
There is no guarantee that the result is the optimal result; the algorithm can iterate around a local optimum.
In order to avoid a local optimum, you can run the algorithm many times, starting from different centroid positions. To compare the different runs, you can use the cluster distortion – the sum of the squared distances between each observation and its centroid.
Customer segmentation with K-means clustering
We're going to use the wholesale customer dataset we downloaded from the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. You can download the dataset from here – https://archive.ics.uci.edu/ml/datasets/Wholesale+customers#. The dataset contains 440 customers (observations) of a wholesale distributor. It includes the annual spend in monetary units on six product categories – Fresh, Milk, Grocery, Frozen, Detergents_Paper, and Delicatessen. We've created a new field called Food that includes all categories except Detergents_Paper, as shown in the following screenshot:
Load the new dataset into Rattle and go to the Cluster tab. Remember that, in unsupervised learning, there is no target variable. I want to create a segmentation based only on buying behavior; for this reason, I set Region and Channel to Ignore, as shown here:
In the following screenshot, you can see the options Rattle offers for K-means. The most important one is Number of clusters; as we've seen, the analyst has to decide the number of clusters before running K-means. We have also seen that the initial position of the centroids can have some influence on the result of the algorithm.
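If you want to see the assignment and update steps outside Rattle, the following is a minimal sketch of the K-means loop in plain Python with NumPy. It illustrates the algorithm only, not Rattle's implementation; the toy data, the fixed number of iterations, and the missing empty-cluster guard are simplifications assumed for brevity.

# Minimal K-means sketch: assign each point to its nearest centroid,
# then move each centroid to the mean of its members. Toy data only.
import numpy as np

rng = np.random.default_rng(42)          # the "Seed": makes the run reproducible
X = rng.normal(size=(30, 2))             # 30 observations with 2 attributes
k = 3
centroids = X[rng.choice(len(X), k, replace=False)]  # random initialization

for _ in range(10):                      # fixed iteration budget for simplicity
    # Step 1: assignment -- distance from every observation to every centroid
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Step 2: update -- each centroid becomes the mean of its cluster members
    # (a production version would also guard against empty clusters)
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])

# Distortion (within-cluster sum of squares): used to compare different runs
distortion = ((X - centroids[labels]) ** 2).sum()
print(labels, round(distortion, 2))

Running the loop from a different random initialization can end in a different local optimum with a different distortion, which is exactly why the Seed and the number of runs matter in the next section.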
The position of the centroids is random, but we need to be able to reproduce the same experiment multiple times. When we're creating a model with K-means, we'll iteratively re-run the algorithm, tuning some options in order to improve the performance of the model; in this case, we need to be able to reproduce exactly the same experiment. Under the hood, R has a pseudo-random number generator based on a starting point called Seed. If you want to reproduce the exact same experiment, you need to re-run the algorithm using the same Seed.
Sometimes, the performance of K-means depends on the initial position of the centroids. For this reason, sometimes you need to be able to re-run the model using a different initial position for the centroids. To run the model with different initial positions, you need to run it with a different Seed.
After executing the model, Rattle will show some interesting information: the size of each cluster, the means of the variables in the dataset, the centroids' positions, and the Within cluster sum of squares value. This measure, also called distortion, is the sum of the squared differences between each point and its centroid; it's a measure of the quality of the model. Another interesting option is Runs; using this option, Rattle will run the model the specified number of times and will choose the model with the best performance based on the Within cluster sum of squares value.
Deciding on the number of clusters can be difficult. To choose the number of clusters, we need a way to evaluate the performance of the algorithm. The sum of the squared distances between the observations and their associated centroids can serve as a performance measure. Each time we add a centroid to K-means, the sum of the squared differences between the observations and the centroids decreases; the difference in this measure between two numbers of centroids is the gain associated with the added centroids. Rattle provides an option to automate this test, called Iterate Clusters. If you set the Number of clusters value to 10 and check the Iterate Clusters option, Rattle will run K-means iteratively, starting with 3 clusters and finishing with 10 clusters.
To compare each iteration, Rattle provides an iteration plot. In the iteration plot, the blue line shows the sum of the squared differences between each observation and its centroid. The red line shows the difference between the current sum of squared distances and the sum of squared distances of the previous iteration. For example, for four clusters, the red line has a very low value; this is because the difference between the sum of squared differences with three clusters and with four clusters is very small. In the following screenshot, the peak in the red line suggests that six clusters could be a good choice, because there is an important drop in the Sum of WithinSS value at this point:
In this way, to finish my model, I only need to set the Number of clusters to 6, uncheck the Re-Scale checkbox, and click on the Execute button:
Finally, Rattle returns the six centroids of my clusters:
Now we have the six centroids and we want Rattle to associate each observation with a centroid. Go to the Evaluate tab, select the KMeans option, select the Training dataset, mark All in the report type, and click on the Execute button as shown in the following screenshot. This process will generate a CSV file with the original dataset and a new column called kmeans.
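The same ideas (Seed, Runs, iterating over the number of clusters, and exporting the labeled data) can be expressed in Python with scikit-learn and pandas, purely as an illustration of what Rattle is doing through its GUI. The file names and the final choice of k = 6 are assumptions made for this sketch.

# Sketch of the Seed, Runs, and "iterate clusters" ideas with scikit-learn.
# wholesale_customers.csv is a hypothetical export of the dataset described above.
import pandas as pd
from sklearn.cluster import KMeans

data = pd.read_csv("wholesale_customers.csv")
X = data[["Fresh", "Milk", "Grocery", "Frozen", "Detergents_Paper", "Delicassen"]]

# random_state plays the role of Rattle's Seed (same value, same experiment);
# n_init plays the role of Runs (several starts, keep the lowest distortion).
for k in range(3, 11):                            # like iterating from 3 to 10 clusters
    model = KMeans(n_clusters=k, random_state=42, n_init=10).fit(X)
    print(k, round(model.inertia_))               # inertia_ = within-cluster sum of squares

# Pick the k where the drop in inertia levels off (the "elbow"), for example k = 6,
# then attach the cluster label to the data, much like Rattle's Evaluate step does.
final = KMeans(n_clusters=6, random_state=42, n_init=10).fit(X)
data["kmeans"] = final.labels_
data.to_csv("wholesale_customers_clustered.csv", index=False)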
The content of this attribute is a label (a number) representing the cluster associated with the observation (customer), as shown in the following screenshot: After clicking on the Execute button, you will need to choose a folder to save the resulting file to and will have to type in a filename. The generated data inside the CSV file will look similar to the following screenshot: In the previous screenshot, you can see ten lines of the resulting file; note that the last column is kmeans. Preparing the data in Qlik Sense Our objective is to create the data model, but using the new CSV file with the kmeans column. We're going to update our application by replacing the customer data file with this new data file. Save the new file in the same folder as the original file, open the Qlik Sense application, and go to Data load editor. There are two differences between the original file and this one. In the original file, we added a line to create a customer identifier called Customer_ID, and in this second file we have this field in the dataset. The second difference is that in this new file we have the kmeans column. From Data load editor, go to the Wholesale customer data sheet, modify line 2, and add line 3. In line 2, we just load the content of Customer_ID, and in line 3, we load the content of the kmeans field and rename it to Cluster, as shown in the following screenshot. Finally, update the name of the file to be the new one and click on the Load data button: When the data load process finishes, open the data model viewer to check your data model, as shown here: Note that you have the same data model with a new field called Cluster. Creating a customer segmentation sheet in Qlik Sense Now we can add a sheet to the application. We'll add three charts to see our clusters and how our customers are distributed in our clusters. The first chart will describe the buying behavior of each cluster, as shown here: The second chart will show all customers distributed in a scatter plot, and in the last chart we'll see the number of customers that belong to each cluster, as shown here: I'll start with the chart to the bottom-right; it's a bar chart with Cluster as the dimension and Count([Customer_ID]) as the measure. This simple bar chart has something special – colors. Each customer's cluster has a special color code that we use in all charts. In this way, cluster 5 is blue in the three charts. To obtain this effect, we use this expression to define the color as color(fieldindex('Cluster', Cluster)), which is shown in the following screenshot: You can find this color trick and more in this interesting blog by Rob Wunderlich – http://qlikviewcookbook.com/. My second chart is the one at the top. I copied the previous chart and pasted it onto a free place. I kept the dimension but I changed the measure by using six new measures: Avg([Detergents_Paper]) Avg([Delicassen]) Avg([Fresh]) Avg([Frozen]) Avg([Grocery]) Avg([Milk]) I placed my last chart at the bottom-left. I used a scatter plot to represent all of my 440 customers. I wanted to show the money spent by each customer on food and detergents, and its cluster. I used the y axis to show the money spent on detergents and the x axis for the money spent on food. Finally, I used colors to highlight the cluster. The dimension is Customer_Id and the measures are Delicassen+Fresh+Frozen+Grocery+Milk (or Food) and [Detergents_Paper]. As the final step, I reused the color expression from the earlier charts. 
Now our first Qlik Sense application has two sheets – the original one is 100 percent Qlik Sense and helps us to understand our customers, channels, and regions. This new sheet uses clustering to give us a different point of view; this second sheet groups the customers by their similar buying behavior. All this information is useful to deliver better campaigns to our customers. Cluster 5 is our least profitable cluster, but is the biggest one with 227 customers. The main difference between cluster 5 and cluster 2 is the amount of money spent on fresh products. Can we deliver any offer to customers in cluster 5 to try to sell more fresh products? Select retail customers and ask yourself, who are our best retail customers? To which cluster do they belong? Are they buying all our product categories? Hierarchical clustering Hierarchical clustering tries to group objects based on their similarity. To explain how this algorithm works, we're going to start with seven points (or observations) lying in a straight line: We start by calculating the distance between each point. I'll come back later to the term distance; in this example, distance is the difference between two positions in the line. The points D and E are the ones with the smallest distance in between, so we group them in a cluster, as shown in this diagram: Now, we substitute point D and point E for their mean (red point) and we look for the two points with the next smallest distance in between. In this second iteration, the closest points are B and C, as shown in this diagram: We continue iterating until we've grouped all observations in the dataset, as shown here: Note that, in this algorithm, we can decide on the number of clusters after running the algorithm. If we divide the dataset into two clusters, the first cluster is point G and the second cluster is A, B, C, D, E, and F. This gives the analyst the opportunity to see the big picture before deciding on the number of clusters. The lowest level of clustering is a trivial one; in this example, seven clusters with one point in each one. The chart I've created while explaining the algorithm is a basic form of a dendrogram. The dendrogram is a tree diagram used in Rattle and in other tools to illustrate the layout of the clusters produced by hierarchical clustering. In the following screenshot, we can see the dendrogram created by Rattle for the wholesale customer dataset. In Rattle's dendrogram, the y axis represent all observations or customers in the dataset, and the x axis represents the distance between the clusters: Association analysis Association rules or association analysis is also an important topic in data mining. This is an unsupervised method, so we start with an unlabeled dataset. An unlabeled dataset is a dataset without a variable that gives us the right answer. Association analysis attempts to find relationships between different entities. The classic example of association rules is market basket analysis. This means using a database of transactions in a supermarket to find items that are bought together. For example, a person who buys potatoes and burgers usually buys beer. This insight could be used to optimize the supermarket layout. Online stores are also a good example of association analysis. They usually suggest to you a new item based on the items you have bought. They analyze online transactions to find patterns in the buyer's behavior. These algorithms assume all variables are categorical; they perform poorly with numeric variables. 
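Before looking at Rattle's Associate tab in detail, here is a small Python sketch that computes, by brute force, the support and confidence measures defined in the next section for a single rule. The seven toy transactions mirror the article's Chicken, Potatoes → Clothes example, but the exact basket contents are invented for illustration.

# Support and confidence for the rule {Chicken, Potatoes} -> {Clothes},
# computed by brute force over a small, invented list of transactions.
transactions = [
    {"Chicken", "Potatoes", "Clothes"},
    {"Chicken", "Potatoes", "Clothes"},
    {"Chicken", "Potatoes", "Clothes"},
    {"Chicken", "Beer"},
    {"Potatoes", "Clothes"},
    {"Beer", "Chips"},
    {"Milk"},
]

antecedent, consequent = {"Chicken", "Potatoes"}, {"Clothes"}

n_total = len(transactions)
n_antecedent = sum(antecedent <= t for t in transactions)            # baskets with Chicken and Potatoes
n_rule = sum((antecedent | consequent) <= t for t in transactions)   # baskets that also contain Clothes

support = n_rule / n_total           # how often the whole rule appears: 3/7, about 0.43
confidence = n_rule / n_antecedent   # how often it holds when the antecedent appears: 3/3 = 1.0
print(f"support={support:.2f} confidence={confidence:.2f}")

With these two numbers you can already reason about a rule: support tells you how often it is relevant at all, and confidence tells you how reliable it is whenever its left-hand side appears in a basket.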
Association methods need a lot of time to complete; they use a lot of CPU and memory. Remember that Rattle runs on R, and the R engine loads all data into RAM.
Suppose we have a dataset such as the following. Our objective is to discover items that are purchased together. We'll create rules and represent them like this:
Chicken, Potatoes → Clothes
This rule means that when a customer buys Chicken and Potatoes, they tend to buy Clothes. As we'll see, the output of the model will be a set of rules. We need a way to evaluate the quality, or interest, of a rule. There are different measures, but we'll use only a few of them. Rattle provides three measures:
Support
Confidence
Lift
Support indicates how often the rule appears in the whole dataset. In our dataset, the rule Chicken, Potatoes → Clothes has a support of 42.86 percent (3 occurrences / 7 transactions).
Confidence measures how strong the association between the items is. In this dataset, the rule Chicken, Potatoes → Clothes has a confidence of 1: the items Chicken and Potatoes appear together three times in the dataset, the items Chicken, Potatoes, and Clothes also appear together three times, and 3/3 = 1. A confidence close to 1 indicates a strong association.
In the following screenshot, I've highlighted the options on the Associate tab we have to choose from before executing an association method in Rattle:
The first option is the Baskets checkbox. Depending on the kind of input data, we'll decide whether or not to check this option. If the option is checked, as in the preceding screenshot, Rattle needs an identification variable and a target variable. After this example, we'll try another example without this option.
The second option is the minimum Support value; by default, it is set to 0.1. Rattle will not return rules with a Support value lower than the one you set in this text box. If you choose a higher value, Rattle will only return rules that appear many times in your dataset; if you choose a lower value, Rattle will return rules that appear in your dataset only a few times. Usually, if you set a high value for Support, the system will return only the obvious relationships. I suggest you start with a high Support value and execute the method many times, with a lower value in each execution; in this way, new rules that you can analyze will appear in each execution.
The third parameter you have to set is Confidence. This parameter tells you how strong the rule has to be.
Finally, the length is the number of items a rule contains. A rule like Beer → Chips has a length of two. The default option for Min Length is 2; if you set this variable to 2, Rattle will return all rules with two or more items in them. After executing the model, you can see the rules created by Rattle by clicking on the Show Rules button, as illustrated here:
Rattle provides a very simple dataset for testing association rules in a file called dvdtrans.csv. Test this dataset to learn about association rules.
Further learning
In this article, we introduced supervised and unsupervised learning, the two main subgroups of machine learning algorithms. If you want to learn more about machine learning, I suggest you complete the MOOC course called Machine Learning at Coursera: https://www.coursera.org/learn/machine-learning
The acronym MOOC stands for Massive Open Online Course; these are courses open to participation via the Internet and are generally free. Coursera is one of the leading platforms for MOOC courses.
Machine Learning is a great course designed and taught by Andrew Ng, Associate Professor at Stanford University; Chief Scientist at Baidu; and Chairman and Co-founder at Coursera. This course is really interesting. A very interesting book is Machine Learning with R by Brett Lantz, Packt Publishing. Summary In this article, we were introduced to machine learning, and supervised and unsupervised methods. We focused on unsupervised methods and covered centroid-based clustering, hierarchical clustering, and association rules. We used a simple dataset, but we saw how a clustering algorithm can complement a 100 percent Qlik Sense approach by adding more information. Resources for Article: Further resources on this subject: Qlik Sense's Vision [article] Securing QlikView Documents [article] Conozca QlikView [article]
Essentials of VMware vSphere

Packt
09 Jul 2015
7 min read
In this article by Puthiyavan Udayakumar, author of the book VMware vSphere Design Essentials, we will cover the following topics: Essentials of designing VMware vSphere The PPP framework The challenges and encounters faced on virtual infrastructure (For more resources related to this topic, see here.) Let's get started with understanding the essentials of designing VMware vSphere. Designing is nothing but assembling and integrating VMware vSphere infrastructure components together to form the baseline for a virtualized datacenter. It has the following benefits: Saves power consumption Decreases the datacenter footprint and helps towards server consolidation Fastest server provisioning On-demand QA lab environments Decreases hardware vendor dependency Aids to move to the cloud Greater savings and affordability Superior security and High Availability Designing VMware vSphere Architecture design principles are usually developed by the VMware architect in concurrence with the enterprise CIO, Infrastructure Architecture Board, and other key business stakeholders. From my experience, I would always urge you to have frequent meetings to observe functional requirements as much as possible. This will create a win-win situation for you and the requestor and show you how to get things done. Please follow your own approach, if it works. Architecture design principles should be developed by the overall IT principles specific to the customer's demands, if they exist. If not, they should be selected to ensure positioning of IT strategies in line with business approaches. In nutshell, architect should aim to form an effective architecture principles that fulfills the infrastructure demands, following are high level principles that should be followed across any design: Design mission and plans Design strategic initiatives External influencing factors When you release a design to the customer, keep in mind that the design must have the following principles: Understandable and robust Complete and consistent Stable and capable of accepting continuous requirement-based changes Rational and controlled technical diversity Without the preceding principles, I wouldn't recommend you to release your design to anyone even for peer review. For every design, irrespective of the product that you are about to design, try the following approach; it should work well but if required I would recommend you make changes to the approach. The following approach is called PPP, which will focus on people's requirements, the product's capacity, and the process that helps to bridge the gap between the product capacity and people requirements: The preceding diagram illustrates three entities that should be considered while designing VMware vSphere infrastructure. Please keep in mind that your design is just a product designed by a process that is based on people's needs. In the end, using this unified framework will aid you in getting rid of any known risks and its implications. Functional requirements should be meaningful; while designing, please make sure there is a meaning to your design. Selecting VMware vSphere from other competitors should not be a random pick, you should always list the benefits of VMware vSphere. 
Some of them are as follows: Server consolidation and easy hardware changes Dynamic provisioning of resources to your compute node Templates, snapshots, vMotion, DRS, DPM, High Availability, fault tolerance, auto monitoring, and solutions for warnings and alerts Virtual Desktop Infrastructure (VDI), building a disaster recovery site, fast deployments, and decommissions The PPP framework Let's explore the components that integrate to form the PPP framework. Always keep in mind that the design should consist of people, processes, and products that meet the unified functional requirements and performance benchmark. Always expect the unexpected. Without these metrics, your design is incomplete; PPP always retains its own decision metrics. What does it do, who does it, and how is it done? We will see the answers in the following diagrams: The PPP Framework helps you to get started with requirements gathering, design vision, business architecture, infrastructure architecture, opportunities and solutions, migration planning, fixing the tone for implementing and design governance. The following table illustrates the essentials of the three-dimensional approach and the basic questions that are required to be answered before you start designing or documenting about designing, which will in turn help to understand the real requirements for a specific design: Phase Description Key components Product Results of what? In what hardware will the VM reside? What kind of CPU is required? What is the quantity of CPU, RAM, storage per host/VM? What kind of storage is required? What kind of network is required? What are the standard applications that need to be rolled out? What kind of power and cooling are required? How much rack and floor space is demanded? People Results of who? Who is responsible for infrastructure provisioning? Who manages the data center and supplies the power? Who is responsible for implementation of the hardware and software patches? Who is responsible for storage and back up? Who is responsible for security and hardware support? Process Results of how? How should we manage the virtual infrastructure? How should we manage hosted VMs? How should we provision VM on demand? How should a DR site be active during a primary site failure? How should we provision storage and backup? How should we take snapshots of VMs? How should we monitor and perform periodic health checks? Before we start to apply the PPP framework on VMware vSphere, we will discuss the list of challenges and encounters faced on the virtual infrastructure. List of challenges and encounters faced on the virtual infrastructure In this section, we will see a list of challenges and encounters faced with virtual infrastructure due to the simple reason that we fail to capture the functional and non-functional demands of business users, or do not understand the fit-for-purpose concept: Resource Estimate Misfire: If you underestimate the amount of memory required up-front, you could change the number of VMs you attempt to run on the VMware ESXi host hardware. Resource unavailability: Without capacity management and configuration management, you cannot create dozens or hundreds of VMs on a single host. Some of the VMs could consume all resources, leaving other VMs unknown. High utilization: An army of VMs can also throw workflows off-balance due to the complexities they can bring to provisioning and operational tasks. Business continuity: Unlike a PC environment, VMs cannot be backed up to an actual hard drive. 
This is why 80 percent of IT professionals believe that virtualization backup is a great technological challenge. Security: More than six out of ten IT professionals believe that data protection is a top technological challenge. Backward compatibility: This is especially challenging for certain apps and systems that are dependent on legacy systems. Monitoring performance: Unlike physical servers, you cannot monitor the performance of VMs with common hardware resources such as CPU, memory, and storage. Restriction of licensing: Before you install software on virtual machines, read the license agreements; they might not support this; hence, by hosting on VMs, you might violate the agreement. Sizing the database and mailbox: Proper sizing of databases and mailboxes is really critical to the organization's communication systems and for applications. Poor design of storage and network: A poor storage design or a networking design resulting from a failure to properly involve the required teams within an organization is a sure-fire way to ensure that this design isn't successful. Summary In this article we covered a brief introduction of the essentials of designing VMware vSphere which focused on the PPP framework. We also had look over the challenges and encounters faced on the virtual infrastructure. Resources for Article: Further resources on this subject: Creating and Managing VMFS Datastores [article] Networking Performance Design [article] The Design Documentation [article]
Responsive Web Design with WordPress

Packt
09 Jul 2015
13 min read
Welcome to the world of the Responsive Web Design! This article is written by Dejan Markovic, author of the book WordPress Responsive Theme Design, and it will introduce you to the Responsive Web Design and its concepts and techniques. It will also present crisp notes from WordPress Responsive Theme Design. (For more resources related to this topic, see here.) Responsive web design (RWD) is a web design approach aimed at crafting sites to provide an optimal viewing experience—easy reading and navigation with a minimum of resizing, panning, and scrolling—across a wide range of devices (from mobile phones to desktop computer monitors). Reference: http://en.wikipedia.org/wiki/Responsive_web_design. To say it simply, responsive web design (RWD) means that the responsive website should adapt to the screen size of the device it is being viewed on. When I began my web development journey in 2002, we didn't have to consider as many factors as we do today. We just had to create the website for a 17-inch screen (which was the standard at that time), and that was it. Yes, we also had to consider 15, 19, and 21-inch monitors, but since the 17-inch screen was the standard, that was the target screen size for us. In pixels, these sizes were usually 800 or 1024. We also had to consider a fewer number of browsers (Internet Explorer, Netscape, and Opera) and the styling for the print, and that was it. Since then, a lot of things have changed, and today, in 2015, for a website design, we have to consider multiple factors, such as: A lot of different web browsers (Internet Explorer, Firefox, Opera, Chrome, and Safari) A number of different operating systems (Windows (XP, 7, and 8), Mac OS X, Linux, Unix, iOS, Android, and Windows phones) Device screen sizes (desktop, mobile, and tablet) Is content accessible and readable with screen readers? How the content will look like when it's printed? Today, creating different design for all these listed factors & devices would take years. This is where a responsive web design comes to the rescue. The concepts of RWD I have to point out that the mobile environment is becoming more important factor than the desktop environment. Mobile browsing is becoming bigger than the desktop-based access, which makes the mobile environment very important factor to consider when developing a website. Simply put, the main point of RWD is that the layout changes based on the size and capabilities of the device its being viewed on. The concepts of RWD, that we will learn next, are: Viewport, scaling and screen density. Controlling Viewport On the desktop, Viewport is the screen size of the window in a browser. For example, when we resize the browser window, we are actually changing the Viewport size. On mobile devices, the Viewport size is also independent of the device screen size. For example, Viewport is 850 px for mobile Opera and 980 px for mobile Safari, and the screen size for iPhone is 320 px. If we compare the Viewport size of 980 px and the screen size of an iPhone of 320 px, we can see that Viewport is bigger than the screen size. This is because mobile browsers function differently. They first load the page into Viewport, and then they resize it to the device's screen size. This is why we are able to see the whole page on the mobile device. If the mobile browsers had Viewport the same as the screen size (320 px), we would be able to see only a part of the page on the mobile device. 
In the following screenshot, we can see the table with the list of Viewport sizes for some iPhone models:
We can control Viewport with CSS:
@viewport { width: device-width; }
Or, we can control it with the meta tag:
<meta name="viewport" content="width=device-width">
In the preceding code, we are matching the Viewport width with the device width. Because the Viewport meta tag approach is more widely adopted (it was first used on iOS, and the @viewport approach is not supported by some browsers), we will use the meta tag approach. We are setting the Viewport width in order to match our web content with our mobile content, as we want to make sure that our web content looks good on a mobile device as well. We can set Viewports in the code for each device separately, for example, 320 px for the iPhone. The better approach is to use content="width=device-width".
Scaling
Scaling is extremely important, as the initial scale controls the zoom aspect of the content for the initial look of the page. For example, if the initial scale is set to 3, the content will be loaded at 3 times the Viewport size, which means 3 times zoom. Here is the look of the screenshot for initial-scale=1 and initial-scale=3:
As we can see from the preceding screenshots, at an initial scale of 3 (three times zoom), the logo image takes up a much bigger part of the screen. It is important to note that this is just the initial scale, which means that the user can zoom in and zoom out later if they want to. Here is an example of the code with the initial scale:
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
In this example, we have used the maximum-scale=1 option, which means that the user will not be able to zoom. We should avoid using the maximum-scale property because of accessibility issues: if we forbid zooming on our pages, users with visual problems will not be able to see the content properly.
The screen density
As screen technology moves forward every year, or even faster than that, we have to consider the screen density aspect as well. Screen density is the number of pixels contained within a screen area; if the screen density is higher, we can have more detail, that is, more pixels in the same area. There are two measurements usually used for this: dots per inch (DPI) and pixels per inch (PPI). DPI is how many drops a printer can place in an inch of space. PPI is the number of pixels we can have in one inch of the screen.
If we go back to the preceding screenshot with the table showing Viewports and densities and compare the values of the iPhone 3G and iPhone 4S, we will see that the screen size stayed the same at 3.5 inches and the Viewport stayed the same at 320 px, but the screen density doubled from 163 dpi to 326 dpi, which means that the screen resolution also doubled, from 320x480 to 640x960. Screen density is very relevant to RWD, as newer devices have higher densities and we should do our best to cover as many densities as we can in order to provide a better experience for end users. Pixel density matters more than the resolution or screen size, because more pixels equals a sharper display.
There are other topics that need to be taken into consideration too, such as hardware and reference pixels, and the device-pixel-ratio.
Problems and solutions with the screen density
Scalable vector graphics and CSS graphics will scale to the resolution.
This is why I recommend using Font Awesome icons in your project. Font Awesome icons are available for download at http://fortawesome.github.io/Font-Awesome/icons/. A font icon is a font made up of symbols, icons, or pictograms (whatever you prefer to call them) that you can use in a web page just like a font. They can be instantly customized with properties such as size or drop shadow; anything you want can be done with the power of CSS.
The real problem triggered by the change in screen density is images, as for high-density screens we should provide higher resolution images. There are several ways to approach this problem:
By targeting high-density screens (providing high-resolution images to all screens)
By providing high-resolution images where appropriate (loading high-resolution images only on devices with high-resolution screens)
By not using high-resolution images
For beginner developers, I recommend the second approach: providing high-resolution images where appropriate.
Techniques in RWD
RWD consists of three coding techniques:
Media queries (adapt content to specific screen sizes)
Fluid grids (for flexible layouts)
Flexible images and media (that respond to changes in screen size)
More detailed information about RWD techniques by Ethan Marcotte, who coined the term Responsive Web Design, is available at http://alistapart.com/article/responsive-web-design.
Media queries
Media queries are CSS modules, or, as some people like to say, just conditional statements that tell the browser to use a specific type of style depending on the size of the screen and other factors, such as print (specific styles for print). They have been around for a long time; I was already using different styles for print in 2002. If you wish to know more about media queries, refer to W3C Candidate Recommendation 8 July 2002 at http://www.w3.org/TR/2002/CR-css3-mediaqueries-20020708/.
Here is an example of a media query declaration:
@media only screen and (min-width:500px) { font-family: sans-serif; }
Let's explain the preceding code. The @media keyword means that this is a media type declaration. The screen and part of the query is an expression or condition (in this case, it means only screens and not print). The following conditional statement means that everything above 500 px will get the sans-serif font family:
(min-width:500px) { font-family: sans-serif; }
Here is another example of a media query declaration:
@media only screen and (min-width: 500px), screen and (orientation: portrait) { font-family: sans-serif; }
In this case, we have two statements, and if either of the statements is true, the entire declaration is applied (either everything above 500 px or everything in portrait orientation will get the style). The only keyword hides the styles from older browsers.
As some older browsers don't support media queries, I recommend using the respond.js script, which will "patch" support for them. A polyfill (or polyfiller) is code that provides features that are not built into or supported by some web browsers. For example, a number of HTML5 features are not supported by older versions of IE (older than 8 or 9), but these features can be used if a polyfill is installed on the web page. This means that if developers want to use these features, they can just include that polyfill library and these features will work in older browsers.
Breakpoints Breakpoint is a moment when layout switches, from one layout to another, when some condition is fulfilled, for example, the screen has been resized. Almost all responsive designs cover the changes of the screen between the desktop, tablets, and smart phones. Here is an example with comments inside: @media only screen and (max-width: 480px) { //mobile styles // up to 480px size } Media query in the preceding code will only be used if the width of the screen is 480 px or less. @media only screen and (min-width:481px) and (max-width: 768px) { //tablet styles //between 481 and 768px } Media query in the preceding code will only be used if the width of the screen is between the 481 px and 768 px. @media only screen and (min-width:769px) { //desktop styles //from 769px and up } Media query in the preceding code will only be used when the width of the screen is 769 px and more. The minimum width value in desktop styles is 1 pixel over the maximum width value in tablet styles, and the same difference is there between values from tablet and mobile styles. We are doing this in order to avoid overlapping, as that could cause problem with our styles. There is also an approach to set the maximum width and minimum width with em values. Setting em of the screen for maximum will mean that the width of the screen is set relative to the device's font size. If the font size for the device is 16 px (which is the usual size), the maximum width for mobile styles would be 480/16=30. Why do we use em values? With pixel sizes, everything is fixed; for example, h1 is 19 px (or 1.5 em of the default size of 16 px), and that's it. With em sizes, everything is relative, so if we change the default value in the browser from, for example, 16 px to 18 px, everything relative to that will change. Therefore, all h1 values will change from 19 px to 22 px and make our layout "zoomable". Here is the example with sizes changed to em: @media only screen and (max-width: 30em) { //mobile styles // up to 480px size }   @media only screen and (min-width:30em) and (max-width: 48em) { //tablet styles //between 481 and 768px }   @media only screen and (min-width:48em) { //desktop styles //from 769px and up } Fluid grids The major point in RWD is that the content should adapt to any screen it's viewed on. One of the best solutions to do this is to use fluid layouts where our content can be resized on each breakpoint. In fluid grids, we define a maximum layout size for the design. The grid is divided into a specific number of columns to keep the layout clean and easy to handle. Then we design each element with proportional widths and heights instead of pixel based dimensions. So whenever the device or screen size is changed, elements will adjust their widths and heights by the specified proportions to its parent container. Reference: http://www.1stwebdesigner.com/tutorials/fluid-grids-in-responsive-design/. To make the grid flexible (or elastic), we can use the % points, or we can use the em values, whichever suits us better. We can make our own fluid grids, or we can use grid frameworks. As there are so many frameworks available, I would recommend that you use the existing framework rather than building your own. Grid frameworks could use a single grid that covers various screen sizes, or we can have multiple grids for each of the break points or screen size categories, such as mobiles, tablets, and desktops. Some of the notable frameworks are Twitter's Bootstrap, Foundation, and SemanticUI. 
I prefer Twitter's Bootstrap, as it really helps me speed up the process and it is the most used framework currently. Flexible images and media Last but not the least important, are images and media (videos). The problem with them is that they are elements that come with fixed sizes. There are several approaches to fix this: Replacing dimensions with percentage values Using maximum widths Using background images only for some cases, as these are not good for accessibility Using some libraries, such as Scott Jehl's picturefill (https://github.com/scottjehl/picturefill) Taking out the width and height parameters from the image tag and dealing with dimensions in CSS Summary In this article, you learned about the RWD concepts such as: Viewport, scaling and the screen density. Also, we have covered the RWD techniques: media queries, fluid grids, and flexible media. Resources for Article: Further resources on this subject: Deployment Preparations [article] Why Meteor Rocks! [article] Clustering and Other Unsupervised Learning Methods [article]
The Blueprint Class

Packt
08 Jul 2015
26 min read
In this article by Nitish Misra, author of the book, Learning Unreal Engine Android Game Development, mentions about the Blueprint class. You would need to do all the scripting and everything else only once. A Blueprint class is an entity that contains actors (static meshes, volumes, camera classes, trigger box, and so on) and functionalities scripted in it. Looking at our example once again of the lamp turning on/off, say you want to place 10 such lamps. With a Blueprint class, you would just have to create and script once, save it, and duplicate it. This is really an amazing feature offered by UE4. (For more resources related to this topic, see here.) Creating a Blueprint class To create a Blueprint class, click on the Blueprints button in the Viewport toolbar, and in the dropdown menu, select New Empty Blueprint Class. A window will then open, asking you to pick your parent class, indicating the kind of Blueprint class you wish to create. At the top, you will see the most common classes. These are as follows: Actor: An Actor, as already discussed, is an object that can be placed in the world (static meshes, triggers, cameras, volumes, and so on, all count as actors) Pawn: A Pawn is an actor that can be controlled by the player or the computer Character: This is similar to a Pawn, but has the ability to walk around Player Controller: This is responsible for giving the Pawn or Character inputs in the game, or controlling it Game Mode: This is responsible for all of the rules of gameplay Actor Component: You can create a component using this and add it to any actor Scene Component: You can create components that you can attach to other scene components Apart from these, there are other classes that you can choose from. To see them, click on All Classes, which will open a menu listing all the classes you can create a Blueprint with. For our key cube, we will need to create an Actor Blueprint Class. Select Actor, which will then open another window, asking you where you wish to save it and what to name it. Name it Key_Cube, and save it in the Blueprint folder. After you are satisfied, click on OK and the Actor Blueprint Class window will open. The Blueprint class user interface is similar to that of Level Blueprint, but with a few differences. It has some extra windows and panels, which have been described as follows: Components panel: The Components panel is where you can view, and add components to the Blueprint class. The default component in an empty Blueprint class is DefaultSceneRoot. It cannot be renamed, copied, or removed. However, as soon as you add a component, it will replace it. Similarly, if you were to delete all of the components, it will come back. To add a component, click on the Add Component button, which will open a menu, from where you can choose which component to add. Alternatively, you can drag an asset from the Content Browser and drop it in either the Graph Editor or the Components panel, and it will be added to the Blueprint class as a component. Components include actors such as static or skeletal meshes, light actors, camera, audio actors, trigger boxes, volumes, particle systems, to name a few. When you place a component, it can be seen in the Graph Editor, where you can set its properties, such as size, position, mobility, material (if it is a static mesh or a skeletal mesh), and so on, in the Details panel. 
Graph Editor: The Graph Editor is also slightly different from that of Level Blueprint, in that there are additional windows and editors in a Blueprint class. The first window is the Viewport, which is the same as that in the Editor. It is mainly used to place actors and set their positions, properties, and so on. Most of the tools you will find in the main Viewport (the editor's Viewport) toolbar are present here as well. Event Graph: The next window is the Event Graph window, which is the same as a Level Blueprint window. Here, you can script the components that you added in the Viewport and their functionalities (for example, scripting the toggling of the lamp on/off when the player is in proximity and moves away respectively). Keep in mind that you can script the functionalities of the components only present within the Blueprint class. You cannot use it directly to script the functionalities of any actor that is not a component of the Class. Construction Script: Lastly, there is the Construction Script window. This is also similar to the Event Graph, as in you can set up and connect nodes, just like in the Event Graph. The difference here is that these nodes are activated when you are constructing the Blueprint class. They do not work during runtime, since that is when the Event Graph scripts work. You can use the Construction Script to set properties, create and add your own property of any of the components you wish to alter during the construction, and so on. Let's begin creating the Blueprint class for our key cubes. Viewport The first thing we need are the components. We require three components: a cube, a trigger box, and a PostProcessVolume. In the Viewport, click on the Add Components button, and under Rendering, select Static Mesh. It will add a Static Mesh component to the class. You now need to specify which Static Mesh you want to add to the class. With the Static Mesh actor selected in the Components panel, in the actor's Details panel, under the Static Mesh section, click on the None button and select TemplateCube_Rounded. As soon as you set the mesh, it will appear in the Viewport. With the cube selected, decrease its scale (located in the Details panel) from 1 to 0.2 along all three axes. The next thing we need is a trigger box. Click on the Add Component button and select Box Collision in the Collision section. Once added, increase its scale from 1 to 9 along all three axes, and place it in such a way that its bottom is in line with the bottom of the cube. The Construction Script You could set its material in the Details panel itself by clicking on the Override Materials button in the Rendering section, and selecting the key cube material. However, we are going to assign its material using Construction Script. Switch to the Construction Script tab. You will see a node called Construction Script, which is present by default. You cannot delete this node; this is where the script starts. However, before we can script it in, we will need to create a variable of the type Material. In the My Blueprint section, click on Add New and select Variable in the dropdown menu. Name this variable Key Cube Material, and change its type from Bool (which is the default variable type) to Material in the Details panel. Also, be sure to check the Editable box so that we can edit it from outside the Blueprint class. Next, drag the Key Cube Material variable from the My Blueprint panel, drop it in the Graph Editor, and select Set when the window opens up. 
Connect this to the output pin of the Construction Script node. Repeat this process, only this time, select Get and connect it to the input pin of Key Cube Material. Right-click in the Graph Editor window and type in Set Material in the search bar. You should see Set Material (Static Mesh). Click on it and add it to the scene. This node already has a reference of the Static Mesh actor (TemplateCube_Rounded), so we will not have to create a reference node. Connect this to the Set node. Finally, drag Key Cube Material from My Blueprint, drop it in the Graph Editor, select Get, and connect it to the Material input pin. After you are done, hit Compile. We will now be able to set the cube's material outside of the Blueprint class. Let's test it out. Add the Blueprint class to the level. You will see a TemplateCube_Rounded actor added to the scene. In its Details panel, you will see a Key Cube Material option under the Default section. This is the variable we created inside our Construction Script. Any material we add here will be added to the cube. So, click on None and select KeyCube_Material. As soon as you select it, you will see the material on the cube. This is one of the many things you can do using Construction Script. For now, only this will do. The Event Graph We now need to script the key cube's functionalities. This is more or less the same as what we did in the Level Blueprint with our first key cube, with some small differences. In the Event Graph panel, the first thing we are going to script is enabling and disabling input when the player overlaps and stops overlapping the trigger box respectively. In the Components section, right-click on Box. This will open a menu. Mouse over Add Event and select Add OnComponentBeginOverlap. This will add a Begin Overlap node to the Graph Editor. Next, we are going to need a Cast node. A Cast node is used to specify which actor you want to use. Right-click in the Graph Editor and add a Cast to Character node. Connect this to the OnComponentBeginOverlap node and connect the other actor pin to the Object pin of the Cast to Character node. Finally, add an Enable Input node and a Get Player Controller node and connect them as we did in the Level Blueprint. Next, we are going to add an event for when the player stops overlapping the box. Again, right-click on Box and add an OnComponentEndOverlap node. Do the exact same thing you did with the OnComponentBeginOverlap node; only here, instead of adding an Enable Input node, add a Disable Input node. The setup should look something like this: You can move the key cube we had placed earlier on top of the pedestal, set it to hidden, and put the key cube Blueprint class in its place. Also, make sure that you set the collision response of the trigger actor to Ignore. The next step is scripting the destruction of the key cube when the player touches the screen. This, too, is similar to what we had done in Level Blueprint, with a few differences. Firstly, add a Touch node and a Sequence node, and connect them to each other. Next, we need a Destroy Component node, which you can find under Components | Destroy Component (Static Mesh). This node already has a reference to the key cube (Static Mesh) inside it, so you do not have to create an external reference and connect it to the node. Connect this to the Then 0 node. We also need to activate the trigger after the player has picked up the key cube. 
Now, since we cannot call functions on actors outside the Blueprint class directly (like we could in Level Blueprint), we need to create a variable. This variable will be of the type Trigger Box. The way this works is, when you have created a Trigger Box variable, you can assign it to any trigger in the level, and it will call that function to that particular trigger. With that in mind, in the My Blueprint panel, click on Add New and create a variable. Name this variable Activated Trigger Box, and set its type to Trigger Box. Finally, make sure you tick on the Editable box; otherwise, you will not be able to assign any trigger to it. After doing that, create a Set Collision Response to All Channels node (uncheck the Context Sensitive box), and set the New Response option to Overlap. For the target, drag the Activated Trigger Box variable, drop it in the Graph Editor, select Get, and connect it to the Target input. Finally, for the Post Process Volume, we will need to create another variable of the type PostProcessVolume. You can name this variable Visual Indicator, again, while ensuring that the Editable box is checked. Add this variable to the Graph Editor as well. Next, click on its pin, drag it out, and release it, which will open the actions menu. Here, type in Enabled, select Set Enabled, and check Enabled. Finally, add a Delay node and a Destroy Actor and connect them to the Set Enabled node, in that order. Your setup should look something like this: Back in the Viewport, you will find that under the Default section of the Blueprint class actor, two more options have appeared: Activated Trigger Box and Visual Indicator (the variables we had created). Using this, you can assign which particular trigger box's collision response you want to change, and which exact post process volume you want to activate and destroy. In front of both variables, you will see a small icon in the shape of an eye dropper. You can use this to choose which external actor you wish to assign the corresponding variable. Anything you scripted using those variables will take effect on the actor you assigned in the scene. This is one of the many amazing features offered by the Blueprint class. All we need to do now for the remaining key cubes is: Place them in the level Using the eye dropper icon that is located next to the name of the variables, pick the trigger to activate once the player has picked up the key cube, and which post process volume to activate and destroy. In the second room, we have two key cubes: one to activate the large door and the other to activate the door leading to the third room. The first key cube will be placed on the pedestal near the big door. So, with the first key cube selected, using the eye dropper, select the trigger box on the pedestal near the big door for the Activated Trigger Box variable. Then, pick the post process volume inside which the key cube is placed for the Visual Indicator variable. The next thing we need to do is to open Level Blueprint and script in what happens when the player places the key cube on the pedestal near the big door. Doing what we did in the previous room, we set up nodes that will unhide the hidden key cube on the pedestal, and change the collision response of the trigger box around the big door to Overlap, ensuring that it was set to Ignore initially. Test it out! You will find that everything is working as expected. Now, do the same with the remaining key cubes. 
Pick which trigger box and which post process volume to activate when you touch on the screen. Then, in the Level Blueprint, script in which key cube to unhide, and so on (place the key cubes we had placed earlier on the pedestals and set it to Hidden), and place the Blueprint class key cube in its place. This is one of the many ways you can use Blueprint class. You can see it takes a lot of work and hassle. Let us now move on to Artificial intelligence. Scripting basic AI Coming back to the third room, we are now going to implement AI in our game. We have an AI character in the third room which, when activated, moves. The main objective is to make a path for it with the help of switches and prevent it from falling. When the AI character reaches its destination, it will unlock the key cube, which the player can then pick up and place on the pedestal. We first need to create another Blueprint class of the type Character, and name it AI_Character. When created, double-click on it to open it. You will see a few components already set up in the Viewport. These are the CapsuleComponent (which is mainly used for collision), ArrowComponent (to specify which side is the front of the character, and which side is the back), Mesh (used for character animation), and CharacterMovement. All four are there by default, and cannot be removed. The only thing we need to do here is add a StaticMesh for our character, which will be TemplateCube_Rounded. Click on Add Components, add a StaticMesh, and assign it TemplateCube_Rounded (in its Details panel). Next, scale this cube to 0.2 along all three axes and move it towards the bottom of the CapsuleComponent, so that it does not float in midair. This is all we require for our AI character. The rest we will handle in Level Blueprints. Next, place AI_Character into the scene on the Player side of the pit, with all of the switches. Place it directly over the Target Point actor. Next, open up Level Blueprint, and let's begin scripting it. The left-most switch will be used to activate the AI character, and the remaining three will be used to draw the parts of a path on which it will walk to reach the other side. To move the AI character, we will need an AI Move To node. The first thing we need is an overlapping event for the trigger over the first switch, which will enable the input, otherwise the AI character will start moving whenever the player touches the screen, which we do not want. Set up an Overlap event, an Enable Input node, and a Gate event. Connect the Overlap event to the Enable Input event, and then to the Gate node's Open input. The next thing is to create a Touch node. To this, we will attach an AI Move To node. You can either type it in or find it under the AI section. Once created, attach it to the Gate node's Exit pin. We now need to specify to the node which character we want to move, and where it should move to. To specify which character we want to move, select the AI character in the Viewport, and in the Level Blueprint's Graph Editor, right-click and create a reference for it. Connect it to the Pawn input pin. Next, for the location, we want the AI character to move towards the second Target Point actor, located on the other side of the pit. But first, we need to get its location in the world. With it selected, right-click in the Graph Editor, and type in Get Actor Location. This node returns an actor's location (coordinates) in the world (the one connected to it). 
This will create a Get Actor Location, with the Target Point actor connect to its input pin. Finally, connect its Return Value to the Destination input of the AI Move To node. If you were to test it out, you would find that it works fine, except for one thing: the AI character stops when it reaches the edge of the pit. We want it to fall off the pit if there is no path. For that, we will need a Nav Proxy Link actor. A Nav Proxy Link actor is used when an AI character has to step outside the Nav Mesh temporarily (for example, jump between ledges). We will need this if we want our AI character to fall off the ledge. You can find it in the All Classes section in the Modes panel. Place it in the level. The actor is depicted by two cylinders with a curved arrow connecting them. We want the first cylinder to be on one side of the pit and the other cylinder on the other side. Using the Scale tool, increase the size of the Nav Proxy Link actor. When placing the Nav Proxy Link actor, keep two things in mind: Make sure that both cylinders intersect in the green area; otherwise, the actor will not work Ensure that both cylinders are in line with the AI character; otherwise, it will not move in a straight line but instead to where the cylinder is located Once placed, you will see that the AI character falls off when it reaches the edge of the pit. We are not done yet. We need to bring the AI character back to its starting position so that the player can start over (or else the player will not be able to progress). For that, we need to first place a trigger at the bottom of the pit, making sure that if the AI character does fall into it, it overlaps the trigger. This trigger will perform two actions: first, it will teleport the AI character to its initial location (with the help of the first Target Point); second, it will stop the AI Move To node, or it will keep moving even after it has been teleported. After placing the trigger, open Level Blueprint and create an Overlap event for the trigger box. To this, we will add a Sequence node, since we are calling two separate functions for when the player overlaps the trigger. The first node we are going to create is a Teleport node. Here, we can specify which actor to teleport, and where. The actor we want to teleport is the AI character, so create a reference for it and connect it to the Target input pin. As for the destination, first use the Get Actor Location function to get the location of the first Target Point actor (upon which the AI character is initially placed), and connect it to the Dest Location input. To stop the AI character's movement, right-click anywhere in the Graph Editor, and first uncheck the Context Sensitive box, since we cannot use this function directly on our AI character. What we need is a Stop Active Movement node. Type it into the search bar and create it. Connect this to the Then 1 output node, and attach a reference of the AI character to it. It will automatically convert from a Character Reference into Character Movement component reference. This is all that we need to script for our AI in the third room. There is one more thing left: how to unlock the key cube. In the fourth room, we are going to use the same principle. Here, we are going to make a chain of AI Move To nodes, each connected to the previous one's On Success output pin. This means that when the AI character has successfully reached the destination (Target Point actor), it should move to the next, and so on. 
Using this, and what we have just discussed about AI, script the path that the AI will follow. Packaging the project Another way of packaging the game and testing it on your device is to first package the game, import it to the device, install it, and then play it. But first, we should discuss some settings regarding packaging, and packaging for Android. The Maps & Modes settings These settings deal with the maps (scenes) and the game mode of the final game. In the Editor, click on Edit and select Project settings. In the Project settings, Project category, select Maps & Modes. Let's go over the various sections: Default Maps: Here, you can set which map the Editor should open when you open the Project. You can also set which map the game should open when it is run. The first thing you need to change is the main menu map we had created. To do this, click on the downward arrow next to Game Default Map and select Main_Menu. Local Multiplayer: If your game has local multiplayer, you can alter a few settings regarding whether the game should have a split screen. If so, you can set what the layout should be for two and three players. Default Modes: In this section, you can set the default game mode the game should run with. The game mode includes things such as the Default Pawn class, HUD class, Controller class, and the Game State Class. For our game, we will stick to MyGame. Game Instance: Here, you can set the default Game Instance Class. The Packaging settings There are settings you can tweak when packaging your game. To access those settings, first go to Edit and open the Project settings window. Once opened, under the Project section click on Packaging. Here, you can view and tweak the general settings related to packaging the project file. There are two sections: Project and Packaging. Under the Project section, you can set options such as the directory of the packaged project, the build configuration to either debug, development, or shipping, whether you want UE4 to build the whole project from scratch every time you build, or only build the modified files and assets, and so on. Under the Packaging settings, you can set things such as whether you want all files to be under one .pak file instead of many individual files, whether you want those .pak files in chunks, and so on. Clicking on the downward arrow will open the advanced settings. Here, since we are packaging our game for distribution check the For Distribution checkbox. The Android app settings In the preceding section, we talked about the general packaging settings. We will now talk about settings specific to Android apps. This can be found in Project Settings, under the Platforms section. In this section, click on Android to open the Android app settings. Here you will find all the settings and properties you need to package your game. At the top the first thing you should do is configure your project for Android. If your project is not configured, it will prompt you to do so (since version 4.7, UE4 automatically creates the androidmanifest.xml file for you). Do this before you do anything else. Here you have various sections. These are: APKPackaging: In this section, you can find options such as opening the folder where all of the build files are located, setting the package's name, setting the version number, what the default orientation of the game should be, and so on. Advanced APKPackaging: This section contains more advanced packaging options, such as one to add extra settings to the .apk files. 
Build: To tweak the settings in the Build section, you first need the source code, which is available from GitHub. Here, you can set things such as whether you want the build to support x86, OpenGL ES2, and so on.
Distribution Signing: This section deals with signing your app. It is a requirement on Android that all apps have a digital signature; this is so that Android can identify the developers of the app. You can learn more about digital signatures by clicking on the hyperlink at the top of the section. When you generate the key for your app, be sure to keep it in a safe and secure place, since if you lose it you will not be able to modify or update your app on Google Play.
Google Play Services: Android apps are downloaded via the Google Play store. This section deals with things such as enabling/disabling Google Play support, setting your app's ID, the Google Play license key, and so on.
Icons: In this section, you can set your game's icons. You can set icons in various sizes, depending upon the screen density of the devices you are developing for. You can get more information about icons by clicking on the hyperlink at the top of the section.
Data Cooker: Finally, in this section, you can set how you want the audio in the game to be encoded.
For our game, the first thing you need to set is the Android Package Name, which is found in the APKPackaging section. The format of the name is com.YourCompany.[PROJECT]. Here, replace YourCompany with the name of your company and [PROJECT] with the name of your project.
Building a package
To package your project, in the Editor go to File | Package Project | Android. You will see different texture formats to package the project in. These are as follows:
ATC: Use this format if you have a device with a Qualcomm Snapdragon processor.
DXT: Use this format if your device has a Tegra graphics processing unit (GPU).
ETC1: You can use this for any device. However, this format does not accept textures with alpha channels; those textures will be left uncompressed, making your game require more space.
ETC2: Use this format if you have a Mali-based device.
PVRTC: Use this format if you have a device with a PowerVR GPU.
Once you have decided which format to use, click on it to begin the packaging process. A window will open up asking you to specify which folder you want the package to be stored in. Once you have decided where to store the package file, click on OK and the build process will commence. When it starts, just like when launching the project, a small window will pop up at the bottom-right corner of the screen notifying you that the build process has begun. From there, you can open the output log or cancel the build process. Once the build process is complete, go to the folder you chose. You will find a .bat file of the game. Provided you have checked the Package game data inside .apk? option (located in the Project Settings, in the Android category under the APKPackaging section), you will also find an .apk file of the game. The .bat file installs the game directly from your system onto your device. To do so, first connect your device to the system, then double-click on the .bat file. This will open a command prompt window. Once it has opened, you do not need to do anything; just wait until the installation process finishes. Once the installation is done, the game will be on your device, ready to be played. To use the .apk file, you will have to do things a bit differently. An .apk file installs the game from the device itself.
For that, you need to perform the following steps:
Connect the device.
Create a copy of the .apk file.
Paste it into the device's storage.
Execute the .apk file from the device. The installation process will begin; once it has completed, you can play the game.
Summary
In this article, we worked with Blueprints and discussed how they work. We also discussed Level Blueprints and the Blueprint class, and covered how to script AI. Finally, we discussed how to package the final product and upload the game onto the Google Play Store for people to download.
Resources for Article:
Further resources on this subject:
Flash Game Development: Creation of a Complete Tetris Game [article]
Adding Finesse to Your Game [article]
Saying Hello to Unity and Android [article]

Why Meteor Rocks!

Packt
08 Jul 2015
23 min read
In this article by Isaac Strack, the author of the book, Getting Started with Meteor.js JavaScript Framework - Second Edition, has discussed some really amazing features of Meteor that has contributed a lot to the success of Meteor. Meteor is a disruptive (in a good way!) technology. It enables a new type of web application that is faster, easier to build, and takes advantage of modern techniques, such as Full Stack Reactivity, Latency Compensation, and Data On The Wire. (For more resources related to this topic, see here.) This article explains how web applications have changed over time, why that matters, and how Meteor specifically enables modern web apps through the above-mentioned techniques. By the end of this article, you will have learned: What a modern web application is What Data On The Wire means and how it's different How Latency Compensation can improve your app experience Templates and Reactivity—programming the reactive way! Modern web applications Our world is changing. With continual advancements in displays, computing, and storage capacities, things that weren't even possible a few years ago are now not only possible but are critical to the success of a good application. The Web in particular has undergone significant change. The origin of the web app (client/server) From the beginning, web servers and clients have mimicked the dumb terminal approach to computing where a server with significantly more processing power than a client will perform operations on data (writing records to a database, math calculations, text searches, and so on), transform the data and render it (turn a database record into HTML and so on), and then serve the result to the client, where it is displayed for the user. In other words, the server does all the work, and the client acts as more of a display, or a dumb terminal. This design pattern for this is called…wait for it…the client/server design pattern. The diagrammatic representation of the client-server architecture is shown in the following diagram: This design pattern, borrowed from the dumb terminals and mainframes of the 60s and 70s, was the beginning of the Web as we know it and has continued to be the design pattern that we think of when we think of the Internet. The rise of the machines (MVC) Before the Web (and ever since), desktops were able to run a program such as a spreadsheet or a word processor without needing to talk to a server. This type of application could do everything it needed to, right there on the big and beefy desktop machine. During the early 90s, desktop computers got even more beefy. At the same time, the Web was coming alive, and people started having the idea that a hybrid between the beefy desktop application (a fat app) and the connected client/server application (a thin app) would produce the best of both worlds. This kind of hybrid app—quite the opposite of a dumb terminal—was called a smart app. Many business-oriented smart apps were created, but the easiest examples can be found in computer games. Massively Multiplayer Online games (MMOs), first-person shooters, and real-time strategies are smart apps where information (the data model) is passed between machines through a server. The client in this case does a lot more than just display the information. It performs most of the processing (or acts as a controller) and transforms the data into something to be displayed (the view). This design pattern is simple but very effective. It's called the Model View Controller (MVC) pattern. 
The model is essentially the data for an application. In the context of a smart app, the model is provided by a server. The client makes requests to the server for data and stores that data as the model. Once the client has a model, it performs actions/logic on that data and then prepares it to be displayed on the screen. This part of the application (talking to the server, modifying the data model, and preparing data for display) is called the controller. The controller sends commands to the view, which displays the information. The view also reports back to the controller when something happens on the screen (a button click, for example). The controller receives the feedback, performs the logic, and updates the model. Lather, rinse, repeat! Since web browsers were built to be "dumb clients", the idea of using a browser as a smart app back then was out of question. Instead, smart apps were built on frameworks such as Microsoft .NET, Java, or Macromedia (now Adobe) Flash. As long as you had the framework installed, you could visit a web page to download/run a smart app. Sometimes, you could run the app inside the browser, and sometimes, you would download it first, but either way, you were running a new type of web app where the client application could talk to the server and share the processing workload. The browser grows up Beginning in the early 2000s, a new twist on the MVC pattern started to emerge. Developers started to realize that, for connected/enterprise "smart apps", there was actually a nested MVC pattern. The server code (controller) was performing business logic against the database (model) through the use of business objects and then sending processed/rendered data to the client application (a "view"). The client was receiving this data from the server and treating it as its own personal "model". The client would then act as a proper controller, perform logic, and send the information to the view to be displayed on the screen. So, the "view" for the server MVC was the "model" for the client MVC. As browser technologies (HTML and JavaScript) matured, it became possible to create smart apps that used the Nested MVC design pattern directly inside an HTML web page. This pattern makes it possible to run a full-sized application using only JavaScript. There is no longer any need to download multiple frameworks or separate apps. You can now get the same functionality from visiting a URL as you could previously by buying a packaged product. A giant Meteor appears! Meteor takes modern web apps to the next level. It enhances and builds upon the nested MVC design pattern by implementing three key features: Data On The Wire through the Distributed Data Protocol (DDP) Latency Compensation with Mini Databases Full Stack Reactivity with Blaze and Tracker Let's walk through these concepts to see why they're valuable, and then, we'll apply them to our Lending Library application. Data On The Wire The concept of Data On The Wire is very simple and in tune with the nested MVC pattern; instead of having a server process everything, render content, and then send HTML across the wire, why not just send the data across the wire and let the client decide what to do with it? This concept is implemented in Meteor using the Distributed Data Protocol, or DDP. DDP has a JSON-based syntax and sends messages similar to the REST protocol. Additions, deletions, and changes are all sent across the wire and handled by the receiving service/client/device. 
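To make this concrete, here is roughly what a couple of DDP frames look like on the wire when a document is added to, and then modified in, a collection. This is an illustrative sketch only — the collection name and document ID are made up, not captured from a real session. A document added to the lists collection might be announced to subscribers as:
   {"msg": "added", "collection": "lists", "id": "Xk2abJ9PqT", "fields": {"Category": "Games"}}
A later modification to that same document would be sent as:
   {"msg": "changed", "collection": "lists", "id": "Xk2abJ9PqT", "fields": {"Category": "Board games"}}
Because every frame carries just the collection name, the document ID, and the changed fields, any DDP-speaking process can react to it without knowing anything about the system that sent it.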
Since DDP uses WebSockets rather than HTTP, the data can be pushed whenever changes occur. But the true beauty of DDP lies in the generic nature of the communication. It doesn't matter what kind of system sends or receives data over DDP—it can be a server, a web service, or a client app—they all use the same protocol to communicate. This means that none of the systems know (or care) whether the other systems are clients or servers. With the exception of the browser, any system can be a server, and without exception, any server can act as a client. All the traffic looks the same and can be treated in a similar manner. In other words, the traditional concept of having a single server for a single client goes away. You can hook multiple servers together, each serving a discreet purpose, or you can have a client connect to multiple servers, interacting with each one differently. Think about what you can do with a system like that: Imagine multiple systems all coming together to create, for example, a health monitoring system. Some systems are built with C++, some with Arduino, some with…well, we don't really care. They all speak DDP. They send and receive data on the wire and decide individually what to do with that data. Suddenly, very difficult and complex problems become much easier to solve. DDP has been implemented in pretty much every major programming language, allowing you true freedom to architect an enterprise application. Latency Compensation Meteor employs a very clever technique called Mini Databases. A mini database is a "lite" version of a normal database that lives in the memory on the client side. Instead of the client sending requests to a server, it can make changes directly to the mini database on the client. This mini database then automatically syncs with the server (using DDP of course), which has the actual database. Out of the box, Meteor uses MongoDB and Minimongo: When the client notices a change, it first executes that change against the client-side Minimongo instance. The client then goes on its merry way and lets the Minimongo handlers communicate with the server over DDP. If the server accepts the change, it then sends out a "changed" message to all connected clients, including the one that made the change. If the server rejects the change, or if a newer change has come in from a different client, the Minimongo instance on the client is corrected, and any affected UI elements are updated as a result. All of this doesn't seem very groundbreaking, but here's the thing—it's all asynchronous, and it's done using DDP. This means that the client doesn't have to wait until it gets a response back from the server. It can immediately update the UI based on what is in the Minimongo instance. What if the change was illegal or other changes have come in from the server? This is not a problem as the client is updated as soon as it gets word from the server. Now, what if you have a slow internet connection or your connection goes down temporarily? In a normal client/server environment, you couldn't make any changes, or the screen would take a while to refresh while the client waits for permission from the server. However, Meteor compensates for this. Since the changes are immediately sent to Minimongo, the UI gets updated immediately. So, if your connection is down, it won't cause a problem: All the changes you make are reflected in your UI, based on the data in Minimongo. 
When your connection comes back, all the queued changes are sent to the server, and the server will send authorized changes to the client. Basically, Meteor lets the client take things on faith. If there's a problem, the data coming in from the server will fix it, but for the most part, the changes you make will be ratified and broadcast by the server immediately. Coding this type of behavior in Meteor is crazy easy (although you can make it more complex and therefore more controlled if you like): lists = new Mongo.Collection("lists"); This one line declares that there is a lists data model. Both the client and server will have a version of it, but they treat their versions differently. The client will subscribe to changes announced by the server and update its model accordingly. The server will publish changes, listen to change requests from the client, and update its model (its master copy) based on these change requests. Wow, one line of code that does all that! Of course, there is more to it, but that's beyond the scope of this article, so we'll move on. To better understand Meteor data synchronization, see the Publish and subscribe section of the meteor documentation at http://docs.meteor.com/#/full/meteor_publish. Full Stack Reactivity Reactivity is integral to every part of Meteor. On the client side, Meteor has the Blaze library, which uses HTML templates and JavaScript helpers to detect changes and render the data in your UI. Whenever there is a change, the helpers re-run themselves and add, delete, and change UI elements, as appropriate, based on the structure found in the templates. These functions that re-run themselves are called reactive computations. On both the client and the server, Meteor also offers reactive computations without having to use a UI. Called the Tracker library, these helpers also detect any data changes and rerun themselves accordingly. Because both the client and the server are JavaScript-based, you can use the Tracker library anywhere. This is defined as isomorphic or full stack reactivity because you're using the same language (and in some cases the same code!) on both the client and the server. Re-running functions on data changes has a really amazing benefit for you, the programmer: you get to write code declaratively, and Meteor takes care of the reactive part automatically. Just tell Meteor how you want the data displayed, and Meteor will manage any and all data changes. This declarative style is usually accomplished through the use of templates. Templates work their magic through the use of view data bindings. Without getting too deep, a view data binding is a shared piece of data that will be displayed differently if the data changes. Let's look at a very simple data binding—one for which you don't technically need Meteor—to illustrate the point. Let's perform the following set of steps to understand the concept in detail: In LendLib.html, you will see an HTML-based template expression: <div id="categories-container">      {{> categories}}   </div> This expression is a placeholder for an HTML template that is found just below it: <template name="categories">    <h2 class="title">my stuff</h2>.. So, {{> categories}} is basically saying, "put whatever is in the template categories right here." And the HTML template with the matching name is providing that. 
If you want to see how data changes will affect the display, change the h2 tag to an h4 tag and save the change: <template name="categories">    <h4 class="title">my stuff</h4> You'll see the effect in your browser. (my stuff will become itsy bitsy.) That's view data binding at work. Change the h4 tag back to an h2 tag and save the change, unless you like the change. No judgment here...okay, maybe a little bit of judgment. It's ugly, and tiny, and hard to read. Seriously, you should change it back before someone sees it and makes fun of you! Alright, now that we know what a view data binding is, let's see how Meteor uses it. Inside the categories template in LendLib.html, you'll find even more templates: <template name="categories"> <h4 class="title">my stuff</h4> <div id="categories" class="btn-group">    {{#each lists}}      <div class="category btn btn-primary">        {{Category}}      </div>    {{/each}} </div> </template> Meteor uses a template language called Spacebars to provide instructions inside templates. These instructions are called expressions, and they let us do things like add HTML for every record in a collection, insert the values of properties, and control layouts with conditional statements. The first Spacebars expression is part of a pair and is a for-each statement. {{#each lists}} tells the interpreter to perform the action below it (in this case, it tells it to make a new div element) for each item in the lists collection. lists is the piece of data, and {{#each lists}} is the placeholder. Now, inside the {{#each lists}} expression, there is one more Spacebars expression: {{Category}} Since the expression is found inside the #each expression, it is considered a property. That is to say that {{Category}} is the same as saying this.Category, where this is the current item in the for-each loop. So, the placeholder is saying, "add the value of the Category property for the current record." Now, if we look in LendLib.js, we will see the reactive values (called reactive contexts) behind the templates: lists : function () { return lists.find(... Here, Meteor is declaring a template helper named lists. The helper, lists, is found inside the template helpers belonging to categories. The lists helper happens to be a function that returns all the data in the lists collection, which we defined previously. Remember this line? lists = new Mongo.Collection("lists"); This lists collection is returned by the above-mentioned helper. When there is a change to the lists collection, the helper gets updated and the template's placeholder is changed as well. Let's see this in action. On your web page pointing to http://localhost:3000, open the browser console and enter the following line: > lists.insert({Category:"Games"}); This will update the lists data collection. The template will see this change and update the HTML code/placeholder. Each of the placeholders will run one additional time for the new entry in lists, and you'll see the following screen: When the lists collection was updated, the Template.categories.lists helper detected the change and reran itself (recomputed). This changed the contents of the code meant to be displayed in the {{> categories}} placeholder. Since the contents were changed, the affected part of the template was re-run. Now, take a minute here and think about how little we had to do to get this reactive computation to run: we simply created a template, instructing Blaze how we want the lists data collection to be displayed, and we put in a placeholder. 
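The Tracker library mentioned earlier gives you this same re-run-on-change behavior without a template. As a quick, optional illustration (this snippet is not part of the Lending Library code; you could paste it into the browser console at http://localhost:3000, where the lists collection is already defined on the client):
   Tracker.autorun(function () {
     // This function is a reactive computation: it re-runs whenever the
     // reactive data it reads (lists.find().count()) changes.
     console.log("There are now " + lists.find().count() + " categories");
   });
Insert another category from the console and the message prints again, with no event wiring on your part.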
This is simple, declarative programming at its finest! Let's create some templates We'll now see a real-life example of reactive computations and work on our Lending Library at the same time. Adding categories through the console has been a fun exercise, but it's not a long-term solution. Let's make it so that we can do that on the page instead as follows: Open LendLib.html and add a new button just before the {{#each lists}} expression: <div id="categories" class="btn-group"> <div class="category btn btn-primary" id="btnNewCat">    <span class="glyphicon glyphicon-plus"></span> </div> {{#each lists}} This will add a plus button on the page, as follows: Now, we want to change the button into a text field when we click on it. So let's build that functionality by using the reactive pattern. We will make it based on the value of a variable in the template. Add the following {{#if…else}} conditionals around our new button: <div id="categories" class="btn-group"> {{#if new_cat}} {{else}}    <div class="category btn btn-primary" id="btnNewCat">      <span class="glyphicon glyphicon-plus"></span>    </div> {{/if}} {{#each lists}} The first line, {{#if new_cat}}, checks to see whether new_cat is true or false. If it's false, the {{else}} section is triggered, and it means that we haven't yet indicated that we want to add a new category, so we should be displaying the button with the plus sign. In this case, since we haven't defined it yet, new_cat will always be false, and so the display won't change. Now, let's add the HTML code to display when we want to add a new category: {{#if new_cat}} <div class="category form-group" id="newCat">      <input type="text" id="add-category" class="form-control" value="" />    </div> {{else}} ... {{/if}} There's the smallest bit of CSS we need to take care of as well. Open ~/Documents/Meteor/LendLib/LendLib.css and add the following declaration: #newCat { max-width: 250px; } Okay, so now we've added an input field, which will show up when new_cat is true. The input field won't show up unless it is set to true; so, for now, it's hidden. So, how do we make new_cat equal to true? Save your changes if you haven't already done so, and open LendLib.js. First, we'll declare a Session variable, just below our Meteor.isClient check function, at the top of the file: if (Meteor.isClient) { // We are declaring the 'adding_category' flag Session.set('adding_category', false); Now, we'll declare the new template helper new_cat, which will be a function returning the value of adding_category. We need to place the new helper in the Template.categories.helpers() method, just below the declaration for lists: Template.categories.helpers({ lists: function () {    ... }, new_cat: function(){    //returns true if adding_category has been assigned    //a value of true    return Session.equals('adding_category',true); } }); Note the comma (,) on the line above new_cat. It's important that you add that comma, or your code will not execute. Save these changes, and you'll see that nothing has changed. Ta-da! In reality, this is exactly as it should be because we haven't done anything to change the value of adding_category yet. Let's do this now: First, we'll declare our click event handler, which will change the value in our Session variable. To do this, add the following highlighted code just below the Template.categories.helpers() block: Template.categories.helpers({ ... 
}); Template.categories.events({ 'click #btnNewCat': function (e, t) {    Session.set('adding_category', true);    Tracker.flush();    focusText(t.find("#add-category")); } }); Now, let's take a look at the following line of code: Template.categories.events({ This line declares that events will be found in the category template. Now, let's take a look at the next line: 'click #btnNewCat': function (e, t) { This tells us that we're looking for a click event on the HTML element with an id="btnNewCat" statement (which we already created in LendLib.html). Session.set('adding_category', true); Tracker.flush(); focusText(t.find("#add-category")); Next, we set the Session variable, adding_category = true, flush the DOM (to clear up anything wonky), and then set the focus onto the input box with the id="add-category" expression. There is one last thing to do, and that is to quickly add the focusText(). helper function. To do this, just before the closing tag for the if (Meteor.isClient) function, add the following code: /////Generic Helper Functions///// //this function puts our cursor where it needs to be. function focusText(i) { i.focus(); i.select(); }; } //<------closing bracket for if(Meteor.isClient){} Now, when you save the changes and click on the plus button, you will see the input box: Fancy! However, it's still not useful, and we want to pause for a second and reflect on what just happened; we created a conditional template in the HTML page that will either show an input box or a plus button, depending on the value of a variable. This variable is a reactive variable, called a reactive context. This means that if we change the value of the variable (like we do with the click event handler), then the view automatically updates because the new_cat helpers function (a reactive computation) will rerun. Congratulations, you've just used Meteor's reactive programming model! To really bring this home, let's add a change to the lists collection (which is also a reactive context, remember?) and figure out a way to hide the input field when we're done. First, we need to add a listener for the keyup event. Or, to put it another way, we want to listen when the user types something in the box and hits Enter. When this happens, we want to add a category based on what the user typed. To do this, let's first declare the event handler. Just after the click handler for #btnNewCat, let's add another event handler: 'click #btnNewCat': function (e, t) {    ... }, 'keyup #add-category': function (e,t){    if (e.which === 13)    {      var catVal = String(e.target.value || "");      if (catVal)      {        lists.insert({Category:catVal});        Session.set('adding_category', false);      }    } } We add a "," character at the end of the first click handler, and then add the keyup event handler. Now, let's check each of the lines in the preceding code: This line checks to see whether we hit the Enter/Return key. if (e.which === 13) This line of code checks to see whether the input field has any value in it: var catVal = String(e.target.value || ""); if (catVal) If it does, we want to add an entry to the lists collection: lists.insert({Category:catVal}); Then, we want to hide the input box, which we can do by simply modifying the value of adding_category: Session.set('adding_category', false); There is one more thing to add and then we'll be done. When we click away from the input box, we want to hide it and bring back the plus button. 
We already know how to do this reactively, so let's add a quick function that changes the value of adding_category. To do this, add one more comma after the keyup event handler and insert the following event handler: 'keyup #add-category': function (e,t){ ... }, 'focusout #add-category': function(e,t){    Session.set('adding_category',false); } Save your changes, and let's see this in action! In your web browser on http://localhost:3000, click on the plus sign, add the word Clothes, and hit Enter. Your screen should now resemble the following screenshot: Feel free to add more categories if you like. Also, experiment by clicking on the plus button, typing something in, and then clicking away from the input field. Summary In this article, you learned about the history of web applications and saw how we've moved from a traditional client/server model to a nested MVC design pattern. You learned what smart apps are, and you also saw how Meteor has taken smart apps to the next level with Data On The Wire, Latency Compensation, and Full Stack Reactivity. You saw how Meteor uses templates and helpers to automatically update content, using reactive variables and reactive computations. Lastly, you added more functionality to the Lending Library. You made a button and an input field to add categories, and you did it all using reactive programming rather than directly editing the HTML code. Resources for Article: Further resources on this subject: Building the next generation Web with Meteor [article] Quick start - creating your first application [article] Meteor.js JavaScript Framework: Why Meteor Rocks! [article]

Project Setup and Modeling a Residential Project

Packt
08 Jul 2015
20 min read
In this article by Scott H. MacKenzie and Adam Rendek, authors of the book ArchiCAD 19 – The Definitive Guide, we will see how our journey, into ArchiCAD 19, begins with an introduction to the graphic user interface, also known as the GUI. As with any software program, there is a menu bar along the top that gives access to all the tools and features. There are also toolbars and tool palettes that can be docked anywhere you like. In addition to this, there are some special palettes that pop up only when you need them. After your introduction to ArchiCAD's user interface, you can jump right in and start creating the walls and floors for your new house. Then you will learn how to create ceilings and the stairs. Before too long you will have a 3D model to orbit around. It is really fun and probably easier than you would expect. (For more resources related to this topic, see here.) The ArchiCAD GUI The first time you open ArchiCAD you will find the toolbars along the top, just under the menu bar and there will be palettes docked to the left and right of the drawing area. We will focus on the 3 following palettes to get started: The Toolbox palette: This contains all of your selection, modeling, and drafting tools. It will be located on the left hand side by default. The Info Box palette: This is your context menu that changes according to whatever tool is currently in use. By default, this will be located directly under the toolbars at the top. It has a scrolling function; hover your cursor over the palette and spin the scroll wheel on your mouse to reveal everything on the palette. The Navigator palette: This is your project navigation window. This palette gives you access to all your views, sheets, and lists. It will be located on the right-hand side by default. These three palettes can be seen in the following screenshot: All of the mentioned palettes are dockable and can be arranged however you like on your screen. They can also be dragged away from the main ArchiCAD interface. For instance, you could have palettes on a second monitor. Panning and Zooming ArchiCAD has the same panning and zooming interface as most other CAD (Computer-aided design) and BIM (Building Information Modeling) programs. Rolling the scroll wheel on your mouse will zoom in and out. Pressing down on the scroll wheel (or middle button) and moving your cursor will execute a pan. Each drawing view window has a row of zoom commands along the bottom. You should try each one to get familiar with each of their functions. View toggling When you have multiple views open, you can toggle through them by pressing the Ctrl key and tapping on the Tab key. Or, you can pick any of the open views from the bottom of the Window pull-down menu. Pressing the F2 key will open a 2D floor plan view and pressing the F3 key will open the default 3D view. Pressing the F5 key will open a 3D view of selected items. In other words, if you want to isolate specific items in a 3D view, select those items and press F5. The function keys are second nature to those that have been using ArchiCAD for a long time. If a feature has a function key shortcut, you should use it. Project setup ArchiCAD is available in multiple different language versions. The exercises in this book use the USA version of ArchiCAD. Obviously this version is in English. There is another version in English and that is referred to as the International (INT) version. 
You can use the International version to do the exercises in the book, just be aware that there may be some subtle differences in the way that something is named or designed. When you create a new project in ArchiCAD, you start by opening a project template. The template will have all the basic stuff you need to get started including layers, line types, wall types, doors, windows, and more. The following lesson will take you through the first steps in creating a new ArchiCAD project: Open ArchiCAD. The Start ArchiCAD dialog box will appear. Select the Create a New Project radio button at the top. Select the Use a Template radio button under Set up Project Settings. Select ArchiCAD 19 Residential Template.tpl from the drop-down list. If you have the International version of ArchiCAD, then the residential template may not be available. Therefore you can use ArchiCAD 19 Template.tpl. Click on New. This will open a blank project file. Project Settings Now that you have opened your new project, we are going to create a house with 4 stories (which includes a story for the roof). We create a story for the roof in order to facilitate a workspace to model the elements on that level. The template we just opened only has 2 stories, so we will need to add 2 more. Then we need to look at some other settings. Stories The settings for the stories are as follows: On the Navigator palette, select the Project Map icon . Double click on 1st FLOOR. Right click on Stories and select Create New Story. You will be prompted to give the new story a name. Enter the name BASEMENT. Click on the button next to Below. Enter 9' into the Height box and click on the Create button. Then double click on 2. 2nd FLOOR. Right click on Stories and then select Create New Story. You will be prompted to give the new story a name. Enter the name ROOF. Click on the button next to Above. Enter 9' into the Height box and click on the Create button. Your list of stories should now look like this 3. ROOF 2. 2nd Floor 1. 1st Floor -1. BASEMENT The International version of ArchiCAD (INT) will give the first floor the index number of 0. The second floor index number will be 1. And the roof will be 2. Now we need to adjust the heights of the other stories: Right click on Stories (on the Navigator palette) and select Story Settings. Change the number in the Height to Next box for 1st FLOOR to 9'. Do the same for 2nd FLOOR. Units On the menu bar, go to Options | Project Preferences | Working Units and perform the following steps: Ensure Model Units is set to feet & fractional inches. Ensure that Fractions is set to 1/64. Ensure that Layout Units is set to feet & fractional inches. Ensure that Angle Unit is set to Decimal degrees. Ensure that Decimals is set to 2. You are now ready to begin modeling your house, but first let's save the project. To save the project, perform the following steps: Navigate to the File menu and click on Save. If by chance you have saved it already, then click on Save As. Name your file Colonial House. Click on Save. Renovation filters The Renovation Filter feature allows you to differentiate how your drawing elements will appear in different construction phases. For renovation projects that have demolition and new work phases, you need to show the items to be demolished differently than the existing items that are to remain, or that are new. The projects we will work on in this book do not require this feature to manage phases because we will only be creating a new construction. 
However, it is essential that your renovation filter setting is set to New Construction. We will do this in the first modeling exercise. Selection methods Before you can do much in ArchiCAD, you need to be familiar with selecting elements. There are several ways to select something in ArchiCAD, which are as follows: Single cursor click Pick the Arrow tool from the toolbox or hold the Shift key down on the keyboard and click on what you want to select. As you click on the elements, hold the Shift key down to add them to your selection set. To remove elements from the selection set, just click on them again with the Shift key pressed. There is a mode within this mode called Quick Selection. It is toggled on and off from the Info Box palette. The icon looks like a magnet. When it is on, it works like a magnet because it will stick to faces or surfaces, such as slabs or fill patterns. If this mode is not on, then you are required to find an edge, endpoint, or hotspot node to select an element with a single click. Hold the Space button down to temporarily change the mode while selecting elements. Window Pick the Arrow tool from the toolbox or hold the Shift key down and draw your selection window. Click once for the window starting corner and click a second time for the end corner. This works just as windowing does in AutoCAD. Not as Revit does, where you need to hold the mouse button down while you draw your window. There are 3 different windowing methods. Each one is set from the Info Box palette: Partial Elements: Anything that is inside of or touching the window will be selected. AutoCAD users will know this as a Crossing Window. Entire Elements: Anything completely encapsulated by the window will be selected. If something is not completely inside the window then it will not be selected. Direction Dependent: Click and window to the left, the Partial Elements window will be used. Click and window to the right, the Entire Elements window will be used. Marquee A marquee is a selection window that stays on the screen after you create it. If you are a MicroStation CAD program user, this will be similar to a selection window. It can be used for printing a specific area in a drawing view and performing what AutoCAD users would refer to as a Stretch command. There are 2 types of marquees; single story (skinny) and multi story (fat). The single story marquee is used when you want to select elements on your current story view only. The multi-story marquee will select everything on your current story as well as the stories above and below your selections. The Find & Select tool This lets ArchiCAD select elements for you, based on the attribute criteria that you define, such as element type, layer, and pen number. When you have the criteria defined, click on the plus sign button on the palette and all the elements within that criterion inside your current view or marquee will be selected. The quickest way to open the Find & Select tool is with the Ctrl + F key combination Modification commands As you draw, you will inevitably need to move, copy, stretch, or trim something. Select your items first, and then execute the modification command. 
Here are the basic commands you will need to get things moving: Adjust (Extend): Press Ctrl + - or navigate to Edit | Reshape | Adjust Drag (Move): Press Ctrl + D or…navigate to Edit | Move | Drag Drag a Copy (Copy): Press Ctrl + Shift + D or navigate to Edit | Move | Drag a Copy Intersect (Fillet): Click on the Intersect button on the Standard toolbar or navigate to Edit | Reshape | Intersect Resize (Scale): Press Ctrl + K or navigate to Edit | Reshape | Resize Rotate: Press Ctrl + E or navigate to Edit | Move | Rotate Stretch: Press Ctrl + H or navigate to Edit | Reshape | Stretch Trim: Press Ctrl or click on the Trim button on the Standard toolbar or navigate to Edit | Reshape | Trim. Hold the Ctrl key down and click on the portion of wall or line that you want trimmed off. This is the fastest way to trim anything! Memorizing the keyboard combinations above is a sure way to increase your productivity. Modeling – part I We will start with the wall tool to create the main exterior walls on the 1st floor of our house, and then create the floor with the slab tool. However, before we begin, let's make sure your Renovation Filter is set to New Construction. Setting the Renovation Filter The Renovation Filter is an active setting that controls how the elements you create are displayed. Everything we create in this project is for new construction so we need the new construction filter to be active. To do so, go to the Document menu, click on Renovation and then click on 04 New Construction. Using the Wall tool The Wall tool has settings for height, width, composite, layer, pen weight and more. We will learn about these things as we go along, and learn a little bit more each time we progress into to the project. Double click on 1. 1st Story in the Navigator palette to ensure we are working on story 1. Select the Wall tool from the Toolbox palette or from the menu bar under Design | Design Tools | Wall. Notice that this will automatically change the contents of the Info Box palette. Click on the wall icon inside Info Box. This will bring up the active properties of the wall tool in the form of the Wall Default Settings window. (This can also be achieved by double clicking on the wall tool button in Toolbox). Change the composite type to Siding 2x6 Wd. Stud. Click on the wall composite button to do this.   Creating the exterior walls of the 1st Story To create the exterior walls of the 1st story perform the following steps: Select the Wall tool from the Toolbox palette, or from the menu bar under Design | Design Tools | Wall. Double click on 1. 1st Story in the Navigator palette to ensure that we are working on story 1. Select the Wall tool from the Toolbox palette, or from the menu bar under Design | Design Tools | Wall. Change the composite type to be Siding 2x6 Wd. Stud. Click on the wall composite button to do this. Notice at the bottom of the Wall Default Settings window is the layer currently assigned to the wall tool. It should be set to A-WALL-EXTR. Click on OK to start your first wall. Click near the center of the drawing screen and move your cursor to the left, notice the orange dashed line that appears. That is your guide line. Keep your cursor over the guide line so that it keeps you locked in the orthogonal direction. You should also immediately see the Tracker palette pop up, displaying your distance drawn and angle after your first click. Before you make your second click, enter the number 24 from your keyboard and press Enter. You should now have 24-0" long wall. 
If your Tracker palette does not appear, it may be toggled off. Go up to the Standard tool bar and click on the Tracker button to turn it on. Select this again and make your first click on the upper left end corner of your first wall. Move your cursor down, so that it snaps to the guideline, enter the number 28, and press the Enter key. Draw your third wall by clicking on the bottom left endpoint of your second wall, move your cursor to the right, snapped over the guide line, type in the number 24 and press Enter. Draw your fourth wall by clicking on the bottom right end point of your third wall and the starting point of your first wall. You should now have four walls that measure 24'-0" x 28"-0, outside edge to outside edge. Move your four walls to the center of the drawing view and perform the following steps: Click on the Arrow tool at the top of the Toolbox. Click outside one of the corners of the walls, and then click on the opposite side. All four walls should be selected now. Use the Drag command to move the walls. The quickest way to activate the Drag command is by pressing Ctrl + D. The long way is from the menu bar by navigating to Edit | Move | Drag. Drag (move) the walls to the center of your drawing window. Press the Esc key or click on a blank space in your drawing window to deselect the walls. You can select all the walls in a view by activating the Wall tool and pressing Ctrl + A. You are now ready to create a floor with the slab tool. But first, let's have a little fun and see how it looks in 3D (press the F3 key): From the Navigator palette, double click on Generic Axonometry under the 3D folder icon. This will open a 3D view window. Hold your Shift key down, press down on your scroll wheel button, and slowly move your mouse around. You are now orbiting! Play around with it a little, then get back to work and go to the next step to create your first floor slab. Press the F2 key to get back to a 2D view. You can also perform a 3D orbit via the Orbit button at the bottom of any 3D view window. Creating the first story's floor with the Slab tool The slab tool is used to create floors. It is also used to create ceilings. We will begin using it now to create the first floor for our house. Similar to the Wall tool, it also has settings for layer, pen weight and composite. To create the first story's floor using the Slab tool, perform the following steps: Select the Slab tool from the Toolbox palette or from the menu bar under Design | Design Tools | Slab. This will change the contents of the Info Box palette. Click on the Slab icon in Info Box. This will bring up the Slab Default Settings (active properties) window for the Slab tool. As with the Wall tool, you have a composite setting for the slab tool. Set the composite type for the slab tool to FLR Wd Flr + 2x10. The layer should be set to A-FLOR. Click OK. You could draw the shape of the slab by tracing over the outside lines of your walls but we are going to use the Magic Wand feature. Hover your cursor over the space inside your four walls and press the space bar on your keyboard. This will automatically create the slab using the boundary created by the walls. Then, open a 3D view and look at your floor. Instead of using the tool icon inside the Info Box palette, double click on any tool icon inside the Toolbox palette to bring up the default settings window for that tool. 
Creating the exterior walls and floor slabs for the basement and the second story We could repeat all of the previous steps to create the floor and walls for the second story and the basement, but in this case, it will be quicker to copy what we have already drawn on the first story and copy it up with the Edit Elements by Stories tool. Perform the following steps to create the exterior walls and floor slabs for the basement and second story: Go to the Navigator palette and right click over Stories, select Edit Elements by Stories. The Edit Elements by Stories window will open. Under Select Action, you want to set it to Copy. Under From Story, set it to 1. 1st FLOOR. In the To Story section, check the box for 2nd FLOOR and -1. BASEMENT. Click on OK. You should see a dialog box appear, stating that as a result of the last operation, elements have been created and/or have changed their position on currently unseen stories. Whenever you get this message, you should confirm that you have not created any unwanted elements. Click on the Continue button. Now you should have walls and a floor on three stories; Basement, 1st FLOOR, and 2nd FLOOR. The quickest way to jump to the next story up or the next story down is with the Ctrl + Arrow Up or Ctrl + Arrow Down key combination. Basement element modification The floor and the walls on the BASEMENT story need to be changed to a different composite type. Do this by performing the following steps: Open the BASEMENT view and select the four walls by clicking on one at a time while holding down the Shift key. Right click over your selection and click on Wall Selection Settings. Change the walls to the EIFS on 8" CMU composite type. Then, click on OK. Move your cursor over the floor slab. The quick selection cursor should appear. This selection mode allows you to click on an object without needing to find an edge or endpoint. Click on the slab. Open the Slab Selection Setting window but this time, do it by pressing the Ctrl + T key combination. Change the floor slab composite to Conc. Slab: 4" on gravel. Click on OK. The Ctrl + T key combination is the quickest way to bring up an element's selection settings window when an element is selected. Open a 3D view (by pressing the F3 key) and orbit around your house. It should look similar to the following screenshot: Adding the garage We need to add the garage and the laundry room, which connects the garage to the house. Do this by performing the following steps: Open the 1st FLOOR story from the project map. Start the Wall tool. From the Info Box palette, set the wall composite setting to Siding 2x6 Wd. Stud. Click on the upper-left corner of your house for your wall starting point. Move your cursor to the left, snap to the guide line, type 6'-10", and press Enter. Change the Geometry Method setting on Info Box to Chained. Refer to the following screenshot: Start your next wall by clicking on the endpoint of your last wall, move your cursor up, snap to the guideline and type 5', and press Enter. Move your cursor to the left, snap to grid line, type in 12'-6", and press Enter. Move your cursor down, snap to grid line, type in 22'-4", and press Enter. Move your cursor to the right, snap to grid line and double click on the perpendicular west wall (double pressing your Enter key will work the same as a double click). Now we want to create the floor for this new set of walls. To do that, perform the following steps: Start the Slab tool. Change the composite to Conc. Slab: 4" on gravel. 
3. Hover your cursor inside the new set of walls and press the Space key to use the Magic Wand. This will create the floor slab for the garage and laundry room.

There is still one more wall to create, but this time we will use the Adjust command to, in effect, create a new wall:

1. Select the 5'-0" wall drawn in the previous exercise.
2. Go to the Edit menu, click on Reshape, and then click on Adjust.
3. Click on the bottom edge of the perpendicular wall down below. The wall should extend down. Refer to the following screenshot:

Then change to a 3D view (by pressing F3) and examine your work.

The 3D view

If you switch to a 3D view and your new modeling does not show, zoom in or out to refresh the view, or double click your scroll wheel (middle button). Your new work will appear.

Summary

In this article, you were introduced to the ArchiCAD Graphical User Interface (GUI) and project settings, and you learned how to select elements. You created all the major modeling for your house and got a primer on layers. Now you should have a good understanding of the ArchiCAD way of creating architectural elements and how to control their parameters.

Resources for Article:

Further resources on this subject:

Let There be Light! [article]
Creating an AutoCAD command [article]
Setting Up for Photoreal Rendering [article]

What is Apache Camel?

Packt
08 Jul 2015
9 min read
In this article by Jean-Baptiste Onofré, author of the book Mastering Apache Camel, we will see how Apache Camel originated in Apache ServiceMix. Apache ServiceMix 3 was powered by the Spring framework and implemented the JBI specification. The Java Business Integration (JBI) specification proposed a Plug and Play approach for integration problems. JBI was based on WebService concepts and standards. For instance, it directly reuses the Message Exchange Patterns (MEP) concept that comes from the Web Services Description Language (WSDL). Camel reuses some of these concepts; for instance, you will see that we have the concept of MEP in Camel. However, JBI suffered mostly from two issues:

- In JBI, all messages between endpoints are transported in the Normalized Messages Router (NMR). In the NMR, a message has a standard XML format. As all messages in the NMR have the same format, it's easy to audit messages and the format is predictable. However, the JBI XML format has an important drawback for performance: it needs to marshall and unmarshall the messages. Some protocols (such as REST or RMI) are not easy to describe in XML. For instance, REST can work in stream mode; it doesn't make sense to marshall streams in XML. Camel is payload-agnostic. This means that you can transport any kind of message with Camel (not necessarily XML formatted).
- JBI describes a packaging. We distinguish the binding components (responsible for the interaction with the systems outside of the NMR and the handling of the messages in the NMR) and the service engines (responsible for transforming the messages inside the NMR). However, it's not possible to directly deploy the endpoints based on these components. JBI requires a service unit (a ZIP file) per endpoint, and these have to be packaged together in a service assembly (another ZIP file). JBI also splits the description of the endpoint from its configuration. This does not result in a very flexible packaging: with definitions and configurations scattered across different files, it is not easy to maintain. In Camel, the configuration and definition of an endpoint are gathered in a simple URI, which is easier to read. Moreover, Camel doesn't force any packaging; the same definition can be packaged in a simple XML file, an OSGi bundle, or a regular JAR file.

In addition to JBI, another foundation of Camel is the book Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf. It describes design patterns answering classical problems while dealing with enterprise application integration and message-oriented middleware. The book describes the problems and the patterns to solve them. Camel strives to implement the patterns described in the book to make them easy to use and let the developer concentrate on the task at hand.

This is what Camel is: an open source framework that allows you to integrate systems and that comes with a lot of connectors and Enterprise Integration Patterns (EIP) components out of the box. And if that is not enough, one can extend and implement custom components.

Components and bean support

Apache Camel ships with a wide variety of components out of the box; currently, there are more than 100 components available. We can see:

- The connectivity components that allow exposure of endpoints for external systems or communicate with external systems. For instance, the FTP, HTTP, JMX, WebServices, JMS, and a lot more components are connectivity components. Creating an endpoint and the associated configuration for these components is easy, by directly using a URI (a short route example appears at the end of this section).
- The internal components applying rules to the messages internally to Camel. These kinds of components apply validation or transformation rules to the inflight message. For instance, validation or XSLT are internal components.

Camel brings a very powerful connectivity and mediation framework. Moreover, it's pretty easy to create new custom components, allowing you to extend Camel if the default component set doesn't match your requirements. It's also very easy to implement complex integration logic by creating your own processors and reusing your beans. Camel supports bean frameworks (IoC), such as Spring or Blueprint.
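As promised above, here is a minimal sketch of such a URI-defined endpoint pair in Camel's Java DSL. The FTP host, credentials, and queue name are invented placeholders for illustration, not values taken from this article:

import org.apache.camel.builder.RouteBuilder;

public class OrdersRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // The endpoint component (ftp, jms) and its whole configuration
        // (host, credentials, polling delay, queue name) live in the URI.
        from("ftp://user@ftp.example.com/orders?password=secret&delay=60000")
            .to("jms:queue:incomingOrders");
    }
}

The same pair of URIs could be swapped for any other components (file, http, and so on) without changing the structure of the route.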
Predicates and expressions

As we will see later, most of the EIPs need a rule definition to apply a routing logic to a message. The rule is described using an expression. This means that we have to define expressions or predicates in the Enterprise Integration Patterns. An expression returns any kind of value, whereas a predicate returns true or false only.

Camel supports a lot of different languages to declare expressions or predicates. It doesn't force you to use one; it allows you to use the most appropriate one. For instance, Camel supports xpath, mvel, ognl, python, ruby, PHP, JavaScript, SpEL (Spring Expression Language), Groovy, and so on as expression languages. It also provides native Camel prebuilt functions and languages that are easy to use, such as the header, constant, or simple languages (a simple-language predicate appears in the standalone sketch further below).

Data format and type conversion

Camel is payload-agnostic. This means that it can support any kind of message. Depending on the endpoints, it could be required to convert from one format to another. That's why Camel supports different data formats, in a pluggable way. This means that Camel can marshall or unmarshall a message in a given format. For instance, in addition to the standard JVM serialization, Camel natively supports Avro, JSON, protobuf, JAXB, XmlBeans, XStream, JiBX, SOAP, and so on. Depending on the endpoints and your needs, you can explicitly define the data format during the processing of the message.

On the other hand, Camel knows the expected format and type of endpoints. Thanks to this, Camel looks for a type converter, allowing it to implicitly transform a message from one format to another. You can also explicitly define the type converter of your choice at some points during the processing of the message. Camel provides a set of ready-to-use type converters but, as Camel supports a pluggable model, you can extend it by providing your own type converters. It's a simple POJO to implement.

Easy configuration and URI

Camel uses a different approach, based on URIs. The endpoint itself and its configuration are on the URI. The URI is human readable and provides the details of the endpoint, which is the endpoint component and the endpoint configuration. As this URI is part of the complete configuration (which defines what we name a route, as we will see later), it's possible to have a complete overview of the integration logic and connectivity at a glance.

Lightweight and different deployment topologies

Camel itself is very light. The Camel core is only around 2 MB, and contains everything required to run Camel. As it's based on a pluggable architecture, all Camel components are provided as external modules, allowing you to install only what you need, without installing superfluous and needlessly heavy modules.

Camel is based on simple POJOs, which means that the Camel core doesn't depend on other frameworks: it's an atomic framework and is ready to use. All other modules (components, DSL, and so on) are built on top of this Camel core.

Moreover, Camel is not tied to one container for deployment. Camel supports a wide range of containers to run in. They are as follows:

- A J2EE application server such as WebSphere, WebLogic, JBoss, and so on
- A web container such as Apache Tomcat
- An OSGi container such as Apache Karaf
- A standalone application using frameworks such as Spring

Camel gives a lot of flexibility, allowing you to embed it into your application or to use an enterprise-ready container.
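To illustrate the embedding option — and the simple-language predicate promised earlier — here is a hedged sketch of booting Camel inside a plain Java program. The directory names and the country header are invented for the example; they are not part of this article's use case:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

public class EmbeddedCamel {
    public static void main(String[] args) throws Exception {
        // The Camel core boots inside any plain Java application.
        DefaultCamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // A content-based router: the simple-language predicate
                // inspects a message header and picks the target endpoint.
                from("file:inbox")
                    .choice()
                        .when(simple("${header.country} == 'US'"))
                            .to("file:outbox/us")
                        .otherwise()
                            .to("file:outbox/other");
            }
        });
        context.start();
        Thread.sleep(10000); // let the route poll for a little while
        context.stop();
    }
}

The same context could just as easily be created by a Spring or Blueprint container, or deployed into one of the servers listed above; only the bootstrap code changes, not the route.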
Quick prototyping and testing support

In any integration project, it's typical that some part of the integration logic is not yet available. For instance:

- The application to integrate with has not yet been purchased or is not yet ready
- The remote system to integrate with has a heavy cost, not acceptable during the development phase
- Multiple teams work in parallel, so we may have some kind of deadlock between the teams

As a complete integration framework, Camel provides a very easy way to prototype part of the integration logic. Even if you don't have the actual system to integrate, you can simulate this system (mock it), which allows you to implement your integration logic without waiting for dependencies. The mocking support is directly part of the Camel core and doesn't require any additional dependency.

Along the same lines, testing is also crucial in an integration project. In this kind of project, a lot of errors can happen and most are unforeseen. Moreover, a small change in an integration process might impact a lot of other processes. Camel provides the tools to easily test your design and integration logic, allowing you to integrate this in a continuous integration platform.

Management and monitoring using JMX

Apache Camel uses the Java Management Extensions (JMX) standard and provides a lot of insight into the system using MBeans (Management Beans), providing a detailed view of the current system:

- The different integration processes with the associated metrics
- The different components and endpoints with the associated metrics

Moreover, these MBeans provide more than metrics. They also provide operations to manage Camel. For instance, the operations allow you to stop an integration process, suspend an endpoint, and so on. Using a combination of metrics and operations, you can configure a very agile integration solution.

Active community

The Apache Camel community is very active. This means that potential issues are identified very quickly and a fix is available soon after. However, it also means that a lot of ideas and contributions are proposed, giving more and more features to Camel. Another big advantage of an active community is that you will never be alone; a lot of people are active on the mailing lists who are ready to answer your questions and provide advice.

Summary

Apache Camel is an enterprise integration solution used in many large organizations, with enterprise support available through RedHat or Talend.

Resources for Article:

Further resources on this subject:

Getting Started [article]
A Quick Start Guide to Flume [article]
Best Practices [article]

Developing a JavaFX Application for iOS

Packt
08 Jul 2015
10 min read
In this article by Mohamed Taman, author of the book JavaFX Essentials, we will learn how to develop a JavaFX application for iOS.

Apple has a great market share in the mobile and PC/laptop world, with many different devices, from mobile phones such as the iPhone to musical devices such as the iPod and tablets such as the iPad.

(For more resources related to this topic, see here.)

It has a rapidly growing application market, called the App Store, serving its community, where the number of available apps increases daily. Mobile application developers should be ready for such a market.

Mobile application developers targeting both iOS and Android face many challenges. By just comparing the native development environments of these two platforms, you will find that they differ substantially. iOS development, according to Apple, is based on the Xcode IDE (https://developer.apple.com/xcode/) and its programming languages. Traditionally, this was Objective-C and, in June 2014, Apple introduced Swift (https://developer.apple.com/swift/). On the other hand, Android development, as defined by Google, is based on the IntelliJ IDEA IDE and the Java programming language. Not many developers are proficient in both environments. In addition, these differences rule out any code reuse between the platforms.

JavaFX 8 is filling the gap for reusable code between the platforms, as we will see in this article, by sharing the same application on both platforms. Here are some skills that you will have gained by the end of this article:

- Installing and configuring iOS environment tools and software
- Creating iOS JavaFX 8 applications
- Simulating and debugging JavaFX mobile applications
- Packaging and deploying applications on iOS mobile devices

Using RoboVM to run JavaFX on iOS

RoboVM is the bridge from Java to Objective-C. Using it, it becomes easy to develop JavaFX 8 applications that are to be run on iOS-based devices, as the ultimate goal of the RoboVM project is to solve this problem without compromising on developer experience or app user experience.

As we saw in the article about Android, using JavaFXPorts to generate APKs was a relatively easy task, due to the fact that Android is based on Java and the Dalvik VM. On the contrary, iOS doesn't have a VM for Java, and it doesn't allow dynamic loading of native libraries. Another approach is required.

The RoboVM open source project tries to close the gap for Java developers by creating a bridge between Java and Objective-C, using an ahead-of-time compiler that translates Java bytecode into native ARM or x86 machine code.
Features

Let's go through the RoboVM features:

- Brings Java and other JVM languages, such as Scala, Clojure, and Groovy, to iOS-based devices
- Translates Java bytecode into machine code ahead of time for fast execution directly on the CPU without any overhead
- The main target is iOS and the ARM processor (32- and 64-bit), but there is also support for Mac OS X and Linux running on x86 CPUs (both 32- and 64-bit)
- Does not impose any restrictions on the Java platform features accessible to the developer, such as reflection or file I/O
- Supports standard JAR files that let the developer reuse the vast ecosystem of third-party Java libraries
- Provides access to the full native iOS APIs through a Java-to-Objective-C bridge, enabling the development of apps with truly native UIs and with full hardware access
- Integrates with the most popular tools such as NetBeans, Eclipse, IntelliJ IDEA, Maven, and Gradle
- App Store ready, with hundreds of apps already in the store

Limitations

Mainly due to the restrictions of the iOS platform, there are a few limitations when using RoboVM:

- Loading custom bytecode at runtime is not supported. All class files comprising the app have to be available at compile time on the developer machine.
- The Java Native Interface technology as used on the desktop or on servers usually loads native code from dynamic libraries, but Apple does not permit custom dynamic libraries to be shipped with an iOS app. RoboVM supports a variant of JNI based on static libraries.
- Another big limitation is that RoboVM is an alpha-state project under development and not yet recommended for production usage.

RoboVM has full support for reflection.

How it works

Since February 2015, there has been an agreement between the companies behind RoboVM and JavaFXPorts, and now a single plugin called jfxmobile-plugin allows us to build applications for three platforms—desktop, Android, and iOS—from the same codebase.

The JavaFXMobile plugin adds a number of tasks to your Java application that allow you to create .ipa packages that can be submitted to the App Store.

Android mostly uses Java as the main development language, so it is easy to merge your JavaFX 8 code with it. On iOS, the situation is internally totally different—but with similar Gradle commands. The plugin will download and install the RoboVM compiler, and it will use RoboVM compiler commands to create an iOS application in build/javafxports/ios.

Getting started

In this section, you will learn how to install the RoboVM compiler using the JavaFXMobile plugin, and make sure the tool chain works correctly by reusing the same application, Phone Dial version 1.0.

Prerequisites

In order to use the RoboVM compiler to build iOS apps, the following tools are required:

- Gradle 2.4 or higher is required to build applications with the jfxmobile plugin.
- A Mac running Mac OS X 10.9 or later.
- Xcode 6.x from the Mac App Store (https://itunes.apple.com/us/app/xcode/id497799835?mt=12).

The first time you install Xcode, and every time you update to a new version, you have to open it once to agree to the Xcode terms.

Preparing a project for iOS

We will reuse the project we developed before, for the Android platform, since there is no difference in code, project structure, or Gradle build script when targeting iOS. They share the same properties and features, but with different Gradle commands that serve iOS development, and a minor change in the Gradle build script for the RoboVM compiler. Therefore, we will see the power of WORA (Write Once, Run Anywhere) with the same application.
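The piece that makes this reuse possible is a small platform abstraction: the shared UI code only talks to an interface, and each target supplies its own implementation (IosPlatform, shown later under Interoperability with low-level iOS APIs, is the iOS one). The book's actual interface may declare more members; as a hedged sketch inferred from that implementation, the shared contract needs little more than this:

package packt.taman.jfx8.ch4;

// Shared contract for platform-specific features; the JavaFX UI code
// depends only on this interface, never on an iOS or Android class.
public interface Platform {

    // Ask the host platform to dial the given phone number.
    void callNumber(String number);
}

With that abstraction in place, the rest of the project really is identical across platforms.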
Project structure

Based on the same project structure from the Android project, the project structure for our iOS app should be as shown in the following figure:

The application

We are going to reuse the same application, the Phone DialPad version 2.0 JavaFX 8 application:

As you can see, reusing the same codebase is a very powerful and useful feature, especially when you are developing to target many mobile platforms such as iOS and Android at the same time.

Interoperability with low-level iOS APIs

To have the same functionality of natively calling the default iOS phone dialer from our application as we did with Android, we have to provide the native solution for iOS as the following IosPlatform implementation:

import org.robovm.apple.foundation.NSURL;
import org.robovm.apple.uikit.UIApplication;
import packt.taman.jfx8.ch4.Platform;

public class IosPlatform implements Platform {

    @Override
    public void callNumber(String number) {
        if (!number.equals("")) {
            NSURL nsURL = new NSURL("telprompt://" + number);
            UIApplication.getSharedApplication().openURL(nsURL);
        }
    }
}

Gradle build files

We will use the same Gradle build script file, but with a minor change: adding the following lines to the end of the script:

jfxmobile {
    ios {
        forceLinkClasses = [ 'packt.taman.jfx8.ch4.**.*' ]
    }
    android {
        manifest = 'lib/android/AndroidManifest.xml'
    }
}

All the work involved in installing and using the RoboVM compilers is done by the jfxmobile plugin. The purpose of those lines is to give the RoboVM compiler the location of the main application class that has to be loaded at runtime, as it is not visible by default to the compiler. The forceLinkClasses property ensures that those classes are linked in during RoboVM compilation.

Building the application

After we have added the necessary configuration to the build script for iOS, it is time to build the application in order to deploy it to different iOS target devices. To do so, we have to run the following command:

$ gradle build

We should have the following output:

BUILD SUCCESSFUL

Total time: 44.74 secs

We have built our application successfully; next, we need to generate the .ipa and, in the case of production, you have to test it by deploying it to as many iOS versions as you can.

Generating the iOS .ipa package file

In order to generate the final .ipa iOS package for our JavaFX 8 application, which is necessary for the final distribution to any device or the App Store, you have to run the following gradle command:

gradle ios

This will generate the .ipa file in the directory build/javafxports/ios.

Deploying the application

During development, we need to check our application GUI and final application prototype on iOS simulators, and measure the application performance and functionality on different devices. These procedures are very useful, especially for testers. Let's see how easy it is to run our application on either simulators or real devices.
Deploying to a simulator

On a simulator, you can simply run the following command to check if your application is running:

$ gradle launchIPhoneSimulator

This command will package and launch the application in an iPhone simulator, as shown in the following screenshot:

DialPad2 JavaFX 8 application running on the iOS 8.3/iPhone 4s simulator

This command will launch the application in an iPad simulator:

$ gradle launchIPadSimulator

Deploying to an Apple device

In order to package a JavaFX 8 application and deploy it to an Apple device, simply run the following command:

$ gradle launchIOSDevice

This command will launch the JavaFX 8 application on the device that is connected to your desktop/laptop. Once the application is launched on your device, type in any number and then tap Call. The iPhone will ask for permission to dial using the default mobile dialer; tap OK. The default mobile dialer will be launched and will dial the number, as shown in the following figure:

To be able to test and deploy your apps on your devices, you will need an active subscription with the Apple Developer Program. Visit the Apple Developer Portal, https://developer.apple.com/register/index.action, to sign up. You will also need to provision your device for development. You can find information on device provisioning in the Apple Developer Portal, or follow this guide: http://www.bignerdranch.com/we-teach/how-to-prepare/ios-device-provisioning/.

Summary

This article gave us a very good understanding of how JavaFX-based applications can be developed and customized using RoboVM for iOS, making it possible to run your applications on Apple platforms. You learned about RoboVM features and limitations, and how it works; you also gained skills that you can use for development. You then learned how to install the required software and tools for iOS development, and how to enable Xcode along with the RoboVM compiler to package and install the Phone Dial JavaFX-8-based application on iOS simulators. Finally, we provided tips on how to run and deploy your application on real devices.

Resources for Article:

Further resources on this subject:

Function passing [article]
Creating Java EE Applications [article]
Contexts and Dependency Injection in NetBeans [article]

To Be or Not to Be – Optionals

Packt
08 Jul 2015
21 min read
In this article by Andrew J Wagner, author of the book Learning Swift, we will cover:

- What is an optional?
- How to unwrap an optional
- Optional chaining
- Implicitly unwrapped optionals
- How to debug optionals
- The underlying implementation of an optional

(For more resources related to this topic, see here.)

Introducing optionals

So, we know that the purpose of optionals in Swift is to allow the representation of an absent value, but what does that look like and how does it work? An optional is a special type that can wrap any other type. This means that you can make an optional String, an optional Array, and so on. You can do this by adding a question mark (?) to the type name:

var possibleString: String?
var possibleArray: [Int]?

Note that this code does not specify any initial values. This is because all optionals, by default, are set to no value at all. If we want to provide an initial value, we can do so like any other variable:

var possibleInt: Int? = 10

Also note that, if we leave out the type specification (: Int?), possibleInt would be inferred to be of the Int type instead of an Int optional.

It is pretty verbose to say that a variable lacks a value. Instead, if an optional lacks a value, we say that it is nil. So, both possibleString and possibleArray are nil, while possibleInt is 10. However, possibleInt is not truly 10. It is still wrapped in an optional. You can see all the forms a variable can take by putting the following code into a playground:

var actualInt = 10
var possibleInt: Int? = 10
var nilInt: Int?
println(actualInt) // "10"
println(possibleInt) // "Optional(10)"
println(nilInt) // "nil"

As you can see, actualInt prints out as we expect it to, but possibleInt prints out as an optional that contains the value 10 instead of just 10. This is a very important distinction because an optional cannot be used as if it were the value it wraps. The nilInt optional just reports that it is nil.

At any point, you can update the value within an optional, including giving it a value for the first time, using the assignment operator (=):

nilInt = 2
println(nilInt) // "Optional(2)"

You can even remove the value within an optional by assigning it to nil:

nilInt = nil
println(nilInt) // "nil"

So, we have this wrapped form of a variable that may or may not contain a value. What do we do if we need to access the value within an optional? The answer is that we must unwrap it.

Unwrapping an optional

There are multiple ways to unwrap an optional. All of them essentially assert that there is truly a value within the optional. This is a wonderful safety feature of Swift. The compiler forces you to consider the possibility that an optional lacks any value at all. In other languages, this is a very commonly overlooked scenario that can cause obscure bugs.

Optional binding

The safest way to unwrap an optional is by using something called optional binding. With this technique, you can assign a temporary constant or variable to the value contained within the optional. This process is contained within an if statement, so that you can use an else statement for when there is no value. An optional binding looks like this:

if let string = possibleString {
    println("possibleString has a value: \(string)")
} else {
    println("possibleString has no value")
}

An optional binding is distinguished from an if statement primarily by the if let syntax.
Semantically, this code says: "if you can let the constant string be equal to the value within possibleString, print out its value; otherwise, print that it has no value." The primary purpose of an optional binding is to create a temporary constant that is the normal (nonoptional) version of the optional.

It is also possible to use a temporary variable in an optional binding:

possibleInt = 10
if var int = possibleInt {
    int *= 2
}
println(possibleInt) // Optional(10)

Note that an asterisk (*) is used for multiplication in Swift.

You should also note something important about this code: if you put it into a playground, even though we multiplied int by 2, the value does not change. When we print out possibleInt later, the value still remains Optional(10). This is because even though we made int a variable (otherwise known as mutable), it is simply a temporary copy of the value within possibleInt. No matter what we do with int, nothing will be changed about the value within possibleInt. If we need to update the actual value stored within possibleInt, we simply need to assign int back to possibleInt after we are done modifying it:

possibleInt = 10
if var int = possibleInt {
    int *= 2
    possibleInt = int
}
println(possibleInt) // Optional(20)

Now the value wrapped inside possibleInt has actually been updated.

A common scenario that you will probably come across is the need to unwrap multiple optional values. One way of doing this is by simply nesting the optional bindings:

if let actualString = possibleString {
    if let actualArray = possibleArray {
        if let actualInt = possibleInt {
            println(actualString)
            println(actualArray)
            println(actualInt)
        }
    }
}

However, this can be a pain, as it increases the indentation level each time to keep the code organized. Instead, you can actually list multiple optional bindings in a single statement, separated by commas:

if let actualString = possibleString,
    let actualArray = possibleArray,
    let actualInt = possibleInt {
    println(actualString)
    println(actualArray)
    println(actualInt)
}

This generally produces more readable code.

This way of unwrapping is great, but saying that optional binding is the safe way to access the value within an optional implies that there is an unsafe way to unwrap an optional. This way is called forced unwrapping.

Forced unwrapping

The shortest way to unwrap an optional is by forced unwrapping. This is done using an exclamation mark (!) after the variable name when it is used:

possibleInt = 10
possibleInt! *= 2

println(possibleInt) // "Optional(20)"

However, the reason it is considered unsafe is that your entire program crashes if you try to unwrap an optional that is currently nil:

nilInt! *= 2 // fatal error

The full error you get is "unexpectedly found nil while unwrapping an Optional value". This is because forced unwrapping is essentially your personal guarantee that the optional truly holds a value. This is why it is called forced. Therefore, forced unwrapping should be used in limited circumstances. It should never be used just to shorten up the code. Instead, it should only be used when you can guarantee, from the structure of the code, that it cannot be nil, even though it is defined as an optional. Even in this case, you should check whether it is possible to use a nonoptional variable instead. The only other place you may use it is when your program truly cannot recover if an optional is nil.
In these circumstances, you should at least consider presenting an error to the user, which is always better than simply having your program crash.

An example of a scenario where forced unwrapping may be used effectively is with lazily calculated values. A lazily calculated value is a value that is not created until the first time it is accessed. To illustrate this, let's consider a hypothetical class that represents a filesystem directory. It would have a property that lists its contents, which is lazily calculated. The code would look something like this:

class FileSystemItem {}
class File: FileSystemItem {}
class Directory: FileSystemItem {
    private var realContents: [FileSystemItem]?
    var contents: [FileSystemItem] {
        if self.realContents == nil {
            self.realContents = self.loadContents()
        }
        return self.realContents!
    }

    private func loadContents() -> [FileSystemItem] {
        // Do some loading
        return []
    }
}

Here, we defined a superclass called FileSystemItem that both File and Directory inherit from. The contents of a directory is a list of any kind of FileSystemItem. We define contents as a calculated property and store the real value within the realContents property. The calculated property checks whether there is a value yet loaded for realContents; if there isn't, it loads the contents and puts it into the realContents property. Based on this logic, we know with 100 percent certainty that there will be a value within realContents by the time we get to the return statement, so it is perfectly safe to use forced unwrapping.

Nil coalescing

In addition to optional binding and forced unwrapping, Swift also provides an operator called the nil coalescing operator to unwrap an optional. This is represented by a double question mark (??). Basically, this operator lets us provide a default value for a variable or operation result in case it is nil. This is a safe way to turn an optional value into a nonoptional value, and it would look something like this:

var possibleString: String? = "An actual string"
println(possibleString ?? "Default String") // "An actual string"

Here, we ask the program to print out possibleString unless it is nil, in which case it will just print Default String. Since we did give it a value, it printed out that value, and it is important to note that it printed out as a regular variable, not as an optional. This is because, one way or another, an actual value will be printed. This is a great tool for concisely and safely unwrapping an optional when a default value makes sense.

Optional chaining

A common scenario in Swift is to have an optional that you must calculate something from. If the optional has a value, you want to store the result of the calculation, but if it is nil, the result should just be set to nil:

var invitee: String? = "Sarah"
var uppercaseInvitee: String?
if let actualInvitee = invitee {
    uppercaseInvitee = actualInvitee.uppercaseString
}

This is pretty verbose. To shorten this up in an unsafe way, we could use forced unwrapping:

uppercaseInvitee = invitee!.uppercaseString

However, optional chaining will allow us to do this safely. Essentially, it allows optional operations on an optional. When the operation is called, if the optional is nil, it immediately returns nil; otherwise, it returns the result of performing the operation on the value within the optional:

uppercaseInvitee = invitee?.uppercaseString

So in this call, invitee is an optional.
Instead of unwrapping it, we will use optional chaining by placing a question mark (?) after it, followed by the optional operation. In this case, we asked for the uppercaseString property on it. If invitee is nil, uppercaseInvitee is immediately set to nil without it even trying to access uppercaseString. If it actually does contain a value, uppercaseInvitee gets set to the uppercaseString property of the contained value. Note that all optional chains return an optional result.

You can chain as many calls, both optional and nonoptional, as you want in this way:

var myNumber: String? = "27"
myNumber?.toInt()?.advancedBy(10).description

This code attempts to add 10 to myNumber, which is represented as a String. First, the code uses an optional chain in case myNumber is nil. Then, the call to toInt uses an additional optional chain because that method returns an optional Int type. We then call advancedBy, which does not return an optional, allowing us to access the description of the result without using another optional chain. If at any point any of the optionals are nil, the result will be nil. This can happen for two different reasons:

- This can happen because myNumber is nil
- This can also happen because toInt returns nil, as it cannot convert the String to the Int type

If the chain makes it all the way to advancedBy, there is no longer a failure path and it will definitely return an actual value. You will notice that there are exactly two question marks used in this chain and there are two possible failure reasons.

At first, it can be hard to understand when you should and should not use a question mark to create a chain of calls. The rule is that you should always use a question mark if the previous element in the chain returns an optional. However, since you are prepared, let's look at what happens if you use an optional chain improperly:

myNumber.toInt() // Value of optional type 'String?' not unwrapped

In this case, we try to call a method directly on an optional without a chain, so we get an error. We also have the case where we try to inappropriately use an optional chain:

var otherNumber = "10"
otherNumber?.toInt() // Operand of postfix '?' should have optional type

Here, we get an error that says a question mark can only be used on an optional type. It is great to develop a good sense of the errors you will see when you make mistakes, so that you can quickly correct them, because we all make silly mistakes from time to time.

Another great feature of optional chaining is that it can be used for method calls on an optional that do not actually return a value:

var invitees: [String]? = []
invitees?.removeAll(keepCapacity: false)

In this case, we only want to call removeAll if there is truly a value within the optional array. So, with this code, if there is a value, all the elements are removed from it; otherwise, it remains nil. In the end, optional chaining is a great choice for writing concise code that still remains expressive and understandable.

Implicitly unwrapped optionals

There is a second type of optional called an implicitly unwrapped optional. There are two ways to look at what an implicitly unwrapped optional is. One way is to say that it is a normal variable that can also be nil. The other way is to say that it is an optional that you don't have to unwrap to use. The important thing to understand about them is that, like optionals, they can be nil but, like a normal variable, you do not have to unwrap them.
You can define an implicitly unwrapped optional with an exclamation mark (!) instead of a question mark (?) after the type name:

var name: String!

Just like with regular optionals, implicitly unwrapped optionals do not need to be given an initial value because they are nil by default. At first, this may sound like it is the best of both worlds but, in reality, it is more like the worst of both worlds. Even though an implicitly unwrapped optional does not have to be unwrapped, it will crash your entire program if it is nil when used:

name.uppercaseString // Crash

A great way to think about them is that every time an implicitly unwrapped optional is used, it is implicitly performing a forced unwrapping. The exclamation mark is placed in its type declaration instead of using it every time. This is particularly bad because it appears the same as any other variable, except for how it is declared. This means that it is very unsafe to use, unlike a normal optional.

So, if implicitly unwrapped optionals are the worst of both worlds and are so unsafe, why do they even exist? The reality is that in rare circumstances, they are necessary. They are used in circumstances where a variable is not truly optional, but you also cannot give an initial value to it. This is almost always true in the case of custom types that have a member variable that is nonoptional, but cannot be set during initialization.

A rare example of this is a view in iOS. UIKit, as we discussed earlier, is the framework that Apple provides for iOS development. In it, Apple has a class called UIView that is used for displaying content on the screen. Apple also provides a tool in Xcode called Interface Builder that lets you design these views in a visual editor instead of in code. Many views designed in this way need references to other views that can be accessed programmatically later. When one of these views is loaded, it is initialized without anything connected, and then all the connections are made. Once all the connections are made, a function called awakeFromNib is called on the view. This means that these connections are not available for use during initialization, but are available once awakeFromNib is called. This order of operations also ensures that awakeFromNib is always called before anything actually uses the view.

This is a circumstance where it is necessary to use an implicitly unwrapped optional. A member variable may not be able to be set when the view is initialized, but only once it is completely loaded:

import UIKit

class MyView: UIView {
    @IBOutlet var button: UIButton!
    var buttonOriginalWidth: CGFloat!

    override func awakeFromNib() {
        self.buttonOriginalWidth = self.button.frame.size.width
    }
}

Note that we have actually declared two implicitly unwrapped optionals. The first is a connection to button. We know this is a connection because it is preceded by @IBOutlet. This is declared as an implicitly unwrapped optional because the connections are not set up until after initialization, but they are still guaranteed to be set up before any other methods are called on the view. This also then leads us to make our second variable, buttonOriginalWidth, implicitly unwrapped, because we need to wait until the connection is made before we can determine the width of button. After awakeFromNib is called, it is safe to treat both button and buttonOriginalWidth as nonoptional.
You may have noticed that we had to dive pretty deep into app development in order to find a valid use case for implicitly unwrapped optionals, and this is arguably only because UIKit is implemented in Objective-C.

Debugging optionals

We already saw a couple of compiler errors that we commonly see because of optionals. If we try to call a method on an optional that we intended to call on the wrapped value, we will get an error. If we try to unwrap a value that is not actually optional, we will get an error that the variable or constant is not optional.

We also need to be prepared for runtime errors that optionals can cause. As discussed, optionals cause runtime errors if you try to forcefully unwrap an optional that is nil. This can happen with both explicit and implicit forced unwrapping. If you followed my advice so far in this article, this should be a rare occurrence. However, we all end up working with third-party code, and maybe they were lazy or maybe they used forced unwrapping to enforce their expectations about how their code should be used.

Also, we all suffer from laziness from time to time. It can be exhausting or discouraging to worry about all the edge cases when you are excited about programming the main functionality of your app. We may use forced unwrapping temporarily while we worry about that main functionality and plan to come back to handle it later. After all, during development, it is better to have forced unwrapping crash the development version of your app than it is for it to fail silently if you have not yet handled that edge case. We may even decide that an edge case is not worth the development effort of handling, because everything about developing an app is a trade-off. Either way, we need to recognize a crash from forced unwrapping quickly, so that we don't waste extra time trying to figure out what went wrong.

When an app tries to unwrap a nil value, if you are currently debugging the app, Xcode shows you the line that tries to do the unwrapping. The line reports that there was EXC_BAD_INSTRUCTION, and you will also get a message in the console saying fatal error: unexpectedly found nil while unwrapping an Optional value:

You will also sometimes have to look at which code currently calls the code that failed. To do that, you can use the call stack in Xcode. When your program crashes, Xcode automatically displays the call stack, but you can also manually show it by going to View | Navigators | Show Debug Navigator. This will look something like the following:

Here, you can click on different levels of code to see the state of things. This becomes even more important if the program crashes within one of Apple's frameworks, where you do not have access to the code. In that case, you should move up the call stack to the point where your code is called in the framework. You may also be able to look at the names of the functions to help you figure out what may have gone wrong. Anywhere on the call stack, you can look at the state of the variables in the debugger, as shown in the following screenshot:

If you do not see this variables view, you can display it by clicking on the button at the bottom-left corner that is second from the right, which will be grayed out. Here, you can see that invitee is indeed nil, which is what caused the crash.

As powerful as the debugger is, if you find that it isn't helping you find the problem, you can always put println statements in important parts of the code.
It is always safe to print out an optional as long as you don't forcefully unwrap it like in the preceding example. As we saw earlier, when an optional is printed, it will print nil if it doesn't have a value, or it will print Optional(<value>) if it does have a value.

Debugging is an extremely important part of becoming a productive developer because we all make mistakes and create bugs. Being a great developer means that you can identify problems quickly and understand how to fix them soon after that. This will largely come from practice, but it will also come when you have a firm grasp of what really happens with your code, instead of simply adapting some code you find online to fit your needs through trial and error.

The underlying implementation

At this point, you should have a pretty strong grasp of what an optional is and how to use and debug it, but it is valuable to look deeper at optionals and see how they actually work. In reality, the question mark syntax for optionals is just a special shorthand. Writing String? is equivalent to writing Optional<String>. Writing String! is equivalent to writing ImplicitlyUnwrappedOptional<String>. The Swift compiler has shorthand versions because they are so commonly used. This allows the code to be more concise and readable.

If you declare an optional using the long form, you can see Swift's implementation by holding command and clicking on the word Optional. Here, you can see that Optional is implemented as an enumeration. If we simplify the code a little, we have:

enum Optional<T> {
    case None
    case Some(T)
}

So, we can see that Optional really has two cases: None and Some. None stands for the nil case, while the Some case has an associated value, which is the value wrapped inside Optional. Unwrapping is then the process of retrieving the associated value out of the Some case. One part of this that you have not seen yet is the angled bracket syntax (<T>). This is a generic and essentially allows the enumeration to have an associated value of any type.

Realizing that optionals are simply enumerations will help you to understand how to use them. It also gives you some insight into how concepts are built on top of other concepts. Optionals seem really complex until you realize that they are just two-case enumerations. Once you understand enumerations, you can pretty easily understand optionals as well.

Summary

We only covered a single concept, optionals, in this article, but we saw that this is a pretty dense topic. We saw that, at the surface level, optionals are pretty straightforward. They offer a way to represent a variable that has no value. However, there are multiple ways to get access to the value wrapped within an optional, and they have very specific use cases. Optional binding is always preferred, as it is the safest method, but we can also use forced unwrapping if we are confident that an optional is not nil. We also have a type called implicitly unwrapped optional to delay the assigning of a variable that is not intended to be optional, but we should use it sparingly because there is almost always a better alternative.

Resources for Article:

Further resources on this subject:

Network Development with Swift [article]
Flappy Swift [article]
Playing with Swift [article]

How to build a Remote-controlled TV with Node-Webkit

Roberto González
08 Jul 2015
14 min read
Node-webkit is one of the most promising technologies to come out in the last few years. It lets you ship a native desktop app for Windows, Mac, and Linux just using HTML, CSS, and some JavaScript. These are the exact same languages you use to build any web app. You basically get your very own frameless Webkit to build your app, which is then supercharged with NodeJS, giving you access to some powerful libraries that are not available in a typical browser.

As a demo, we are going to build a remote-controlled YouTube app. This involves creating a native app that displays YouTube videos on your computer, as well as a mobile client that will let you search for and select the videos you want to watch straight from your couch.

You can download the finished project from https://github.com/Aerolab/youtube-tv. You need to follow the first part of this guide (Getting started) to set up the environment and then run run.sh (on Mac) or run.bat (on Windows) to start the app.

Getting started

First of all, you need to install Node.JS (a JavaScript platform), which you can download from http://nodejs.org/download/. The installer comes bundled with NPM (Node.JS Package Manager), which lets you install everything you need for this project.

Since we are going to be building two apps (a desktop app and a mobile app), it's better if we get the boring HTML+CSS part out of the way, so we can concentrate on the JavaScript part of the equation. Download the project files from https://github.com/Aerolab/youtube-tv/blob/master/assets/basics.zip and put them in a new folder. You can name the project's folder youtube-tv or whatever you want. The folder should look like this:

- index.html // This is the starting point for our desktop app
- css // Our desktop app styles
- js // This is where the magic happens
- remote // This is where the magic happens (Part 2)
- libraries // FFMPEG libraries, which give you H.264 video support in Node-Webkit
- player // Our youtube player
- Gruntfile.js // Build scripts
- run.bat // run.bat runs the app on Windows
- run.sh // sh run.sh runs the app on Mac

Now open the Terminal (on Mac or Linux) or a new command prompt (on Windows) right in that folder. We'll install a couple of dependencies we need for this project, so type these commands to install node-gyp and grunt-cli. Each one will take a few seconds to download and install.

On Mac or Linux:

sudo npm install node-gyp -g
sudo npm install grunt-cli -g

On Windows:

npm install node-gyp -g
npm install grunt-cli -g

Leave the Terminal open. We'll be using it again in a bit.

All Node.JS apps start with a package.json file (our manifest), which holds most of the settings for your project, including which dependencies you are using. Go ahead and create your own package.json file (right inside the project folder) with the following contents. Feel free to change anything you like, such as the project name, the icon, or anything else. Check out the documentation at https://github.com/rogerwang/node-webkit/wiki/Manifest-format:

{
  "//": "The // keys in package.json are comments.",

  "//": "Your project's name. Go ahead and change it!",
  "name": "Remote",

  "//": "A simple description of what the app does.",
  "description": "An example of node-webkit",

  "//": "This is the first html the app will load. Just leave this this way",
  "main": "app://host/index.html",

  "//": "The version number. 0.0.1 is a good start :D",
  "version": "0.0.1",

  "//": "This is used by Node-Webkit to set up your app.",
  "window": {
    "//": "The Window Title for the app",
    "title": "Remote",

    "//": "The Icon for the app",
    "icon": "css/images/icon.png",

    "//": "Do you want the File/Edit/Whatever toolbar?",
    "toolbar": false,

    "//": "Do you want a standard window around your app (a title bar and some borders)?",
    "frame": true,

    "//": "Can you resize the window?",
    "resizable": true
  },

  "webkit": {
    "plugin": false,
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Safari/537.36"
  },

  "//": "These are the libraries we'll be using:",
  "//": "Express is a web server, which will handle the files for the remote",
  "//": "Socket.io lets you handle events in real time, which we'll use with the remote as well.",
  "dependencies": {
    "express": "^4.9.5",
    "socket.io": "^1.1.0"
  },

  "//": "And these are just task handlers to make things easier",
  "devDependencies": {
    "grunt": "^0.4.5",
    "grunt-contrib-copy": "^0.6.0",
    "grunt-node-webkit-builder": "^0.1.21"
  }
}

You'll also find Gruntfile.js, which takes care of downloading all of the node-webkit assets and building the app once we are ready to ship. Feel free to take a look into it, but it's mostly boilerplate code.

Once you've set everything up, go back to the Terminal and install everything you need by typing:

npm install
grunt nodewebkitbuild

You may run into some issues when doing this on Mac or Linux. In that case, try using sudo npm install and sudo grunt nodewebkitbuild.

npm install installs all of the dependencies you mentioned in package.json, both the regular dependencies and the development ones, like grunt and grunt-node-webkit-builder, which downloads the Windows and Mac versions of node-webkit, setting them up so they can play videos, and building the app. Wait a bit for everything to install properly and we're ready to get started.

Note that if you are using Windows, you might get a scary error related to Visual C++ when running npm install. Just ignore it.

Building the desktop app

All web apps (or websites, for that matter) start with an index.html file. We are going to be creating just that to get our app to run:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8" />
  <title>Youtube TV</title>

  <link href='http://fonts.googleapis.com/css?family=Roboto:500,400' rel='stylesheet' type='text/css' />
  <link href="css/normalize.css" rel="stylesheet" type="text/css" />
  <link href="css/styles.css" rel="stylesheet" type="text/css" />
</head>
<body>

  <div id="serverInfo">
    <h1>Youtube TV</h1>
  </div>

  <div id="videoPlayer">
  </div>

  <script src="js/jquery-1.11.1.min.js"></script>
  <script src="js/youtube.js"></script>
  <script src="js/app.js"></script>
</body>
</html>

As you may have noticed, we are using three scripts for our app: jQuery (pretty well known at this point), a YouTube video player, and finally app.js, which contains our app's logic. Let's dive into that!

First of all, we need to create the basic elements for our remote control. The easiest way of doing this is to create a basic web server and serve a small web app that can search YouTube, select a video, and have some play/pause controls, so we don't have any good reasons to get up from the couch. Open js/app.js and type the following:

// Show the Developer Tools. And yes, Node-Webkit has developer tools built in!
// Uncomment it to open it automatically
//require('nw.gui').Window.get().showDevTools();

// Express is a web server, which will allow us to create a small web app with which to control the player
var express = require('express');
var app = express();
var server = require('http').Server(app);
var io = require('socket.io')(server);

// We'll be opening up our web server on Port 8080 (which doesn't require root privileges)
// You can access this server at http://127.0.0.1:8080
var serverPort = 8080;
server.listen(serverPort);

// All the static files (css, js, html) for the remote will be served using Express.
// These assets are in the /remote folder
app.use('/', express.static('remote'));

With those 7 lines of code (not counting comments) we just got a neat web server working on port 8080. If you were paying attention to the code, you may have noticed that we required something called socket.io. This lets us use websockets with minimal effort, which means we can communicate with, from, and to our remote instantly. You can learn more about socket.io at http://socket.io/. Let's set that up next in app.js:

// Socket.io handles the communication between the remote and our app in real time,
// so we can instantly send commands from a computer to our remote and back
io.on('connection', function (socket) {

  // When a remote connects to the app, let it know immediately the current status of the video (play/pause)
  socket.emit('statusChange', Youtube.status);

  // This is what happens when we receive the watchVideo command (picking a video from the list)
  socket.on('watchVideo', function (video) {
    // video contains a bit of info about our video (id, title, thumbnail)
    // Order our Youtube Player to watch that video
    Youtube.watchVideo(video);
  });

  // These are playback controls. They receive the "play" and "pause" events from the remote
  socket.on('play', function () {
    Youtube.playVideo();
  });
  socket.on('pause', function () {
    Youtube.pauseVideo();
  });

});

// Notify all the remotes when the playback status changes (play/pause)
// This is done with io.emit, which sends the same message to all the remotes
Youtube.onStatusChange = function(status) {
  io.emit('statusChange', status);
};

That's the desktop part done! In a few dozen lines of code we got a web server running at http://127.0.0.1:8080 that can receive commands from a remote to watch a specific video, as well as handling some basic playback controls (play and pause). We are also notifying the remotes of the status of the player as soon as they connect, so they can update their UI with the correct buttons (if it's playing, show the pause button and vice versa).

Now we just need to build the remote.

Building the remote control

The server is just half of the equation. We also need to add the corresponding logic on the remote control, so it's able to communicate with our app.
In remote/index.html, add the following HTML:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"/>
  <title>TV Remote</title>
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1"/>
  <link rel="stylesheet" href="/css/normalize.css"/>
  <link rel="stylesheet" href="/css/styles.css"/>
</head>
<body>
  <div class="controls">
    <div class="search">
      <input id="searchQuery" type="search" value="" placeholder="Search on Youtube..."/>
    </div>
    <div class="playback">
      <button class="play">&gt;</button>
      <button class="pause">||</button>
    </div>
  </div>
  <div id="results" class="video-list">
  </div>
  <div class="__templates" style="display:none;">
    <article class="video">
      <figure><img src="" alt=""/></figure>
      <div class="info">
        <h2></h2>
      </div>
    </article>
  </div>
  <script src="/socket.io/socket.io.js"></script>
  <script src="/js/jquery-1.11.1.min.js"></script>
  <script src="/js/search.js"></script>
  <script src="/js/remote.js"></script>
</body>
</html>

Again, we have a few libraries: Socket.io is served automatically by our desktop app at /socket.io/socket.io.js, and it manages the communication with the server. jQuery is somehow always there, search.js manages the integration with the Youtube API (you can take a look if you want), and remote.js handles the logic for the remote.

The remote itself is pretty simple. It can look for videos on Youtube, and when we click on a video it connects with the app, telling it to play the video with socket.emit. Let's dive into remote/js/remote.js to make this thing work:

// First of all, connect to the server (our desktop app)
var socket = io.connect();

// Search Youtube when the user stops typing. This gives us an automatic search.
var searchTimeout = null;
$('#searchQuery').on('keyup', function (event) {
  clearTimeout(searchTimeout);
  searchTimeout = setTimeout(function () {
    searchYoutube($('#searchQuery').val());
  }, 500);
});

// When we click on a video, watch it on the App
$('#results').on('click', '.video', function (event) {
  // Send an event to notify the server we want to watch this video
  socket.emit('watchVideo', $(this).data());
});

// When the server tells us that the player changed status (play/pause), alter the playback controls
socket.on('statusChange', function (status) {
  if (status === 'play') {
    $('.playback .pause').show();
    $('.playback .play').hide();
  } else if (status === 'pause' || status === 'stop') {
    $('.playback .pause').hide();
    $('.playback .play').show();
  }
});

// Notify the app when we hit the play button
$('.playback .play').on('click', function (event) {
  socket.emit('play');
});

// Notify the app when we hit the pause button
$('.playback .pause').on('click', function (event) {
  socket.emit('pause');
});

This is very similar to our server, except we are using socket.emit a lot more often to send commands back to our desktop app, telling it which videos to play and handling our basic play/pause controls.

The only thing left to do is make the app run. Ready? Go to the terminal again and type:

If you are on a Mac: sh run.sh
If you are on Windows: run.bat

If everything worked properly, you should be seeing the app, and if you open a web browser to http://127.0.0.1:8080 the remote client will open up. Search for a video, pick anything you like, and it'll play in the app. This also works if you point any other device on the same network to your computer's IP, which brings me to the next (and last) point.
Finishing touches There is one small improvement we can make: print out the computer’s IP to make it easier to connect to the app from any other device on the same Wi-Fi network (like a smartphone). On js/app.js add the following code to find out the IP and update our UI so it’s the first thing we see when we open the app: // Find the local IPfunction getLocalIP(callback) { require('dns').lookup( require('os').hostname(), function (err, add, fam) { typeof callback =='function'? callback(add) :null; }); } // To make things easier, find out the machine's ip and communicate itgetLocalIP(function(ip){ $('#serverInfo h1').html('Go to<br/><strong>http://'+ip+':'+serverPort+'</strong><br/>to open the remote'); }); The next time you run the app, the first thing you’ll see is the IP for your computer, so you just need to type that URL in your smartphone to open the remote and control the player from any computer, tablet, or smartphone (as long as they are in the same Wi-Fi network). That's it! You can start expanding on this to improve the app: Why not open the app on a fullscreen by default? Why not get rid of the horrible default frame and create your own? You can actually designate any div as a window handle with CSS (using -webkit-app-region: drag), so you can drag the window by that div and create your own custom title bar. Summary While the app has a lot of interlocking parts, it's a good first project to find out what you can achieve with node-webkit in just a few minutes. I hope you enjoyed this post! About the author Roberto González is the co-founder of Aerolab, “an awesome place where we really push the barriers to create amazing, well-coded designs for the best digital products”. He can be reached at @robertcode.

Understanding Mesos Internals

Packt
08 Jul 2015
26 min read
 In this article by, Dharmesh Kakadia, author of the book Apache Mesos Essentials, explains how Mesos works internally in detail. We will start off with cluster scheduling and fairness concepts, understanding the Mesos architecture, and we will move on towards resource isolation and fault tolerance implementation in Mesos. In this article, we will cover the following topics: The Mesos architecture Resource allocation (For more resources related to this topic, see here.) The Mesos architecture Modern organizations have a lot of different kinds of applications for different business needs. Modern applications are distributed and they are deployed across commodity hardware. Organizations today run different applications in siloed environments, where separate clusters are created for different applications. This static partitioning of cluster leads to low utilization, and all the applications will duplicate the effort of dealing with distributed infrastructures. Not only is this a wasted effort, but it also undermines the fact that distributed systems are hard to build and maintain. This is challenging for both developers and operators. For developers, it is a challenge to build applications that scale elastically and can handle faults that are inevitable in large-scale environment. Operators, on the other hand, have to manage and scale all of these applications individually in siloed environments. The preceding situation is like trying to develop applications without having an operating system and managing all the devices in a computer. Mesos solves the problems mentioned earlier by providing a data center kernel. Mesos provides a higher-level abstraction to develop applications that treat distributed infrastructure just like a large computer. Mesos abstracts the hardware infrastructure away from the applications from the physical infrastructure. Mesos makes developers more productive by providing an SDK to easily write data center scale applications. Now, developers can focus on their application logic and do not have to worry about the infrastructure that runs it. Mesos SDK provides primitives to build large-scale distributed systems, such as resource allocation, deployment, and monitoring isolation. They only need to know and implement what resources are needed, and not how they get the resources. Mesos allows you to treat the data center just as a computer. Mesos makes the infrastructure operations easier by providing elastic infrastructure. Mesos aggregates all the resources in a single shared pool of resources and avoids static partitioning. This makes it easy to manage and increases the utilization. The data center kernel has to provide resource allocation, isolation, and fault tolerance in a scalable, robust, and extensible way. We will discuss how Mesos fulfills these requirements, as well as some other important considerations of modern data center kernel: Scalability: The kernel should be scalable in terms of the number of machines and number of applications. As the number of machines and applications increase, the response time of the scheduler should remain acceptable. Flexibility: The kernel should support a wide range of applications. It should also support diverse frameworks currently running on the cluster and future frameworks as well. The framework should also be able to cope up with the heterogeneity in the hardware, as most clusters are built over time and have a variety of hardware running. 
Maintainability: The kernel would be one of the very important pieces of modern infrastructure. As the requirements evolve, the scheduler should be able to accommodate new requirements. Utilization and dynamism: The kernel should adapt to the changes in resource requirements and available hardware resources and utilize resources in an optimal manner. Fairness: The kernel should be fair in allocating resources to the different users and/or frameworks. We will see what it means to be fair in detail in the next section. The design philosophy behind Mesos was to define a minimal interface to enable efficient resource sharing across frameworks and defer the task scheduling and execution to the frameworks. This allows the frameworks to implement diverse approaches toward scheduling and fault tolerance. It also makes the Mesos core simple, and the frameworks and core can evolve independently. The preceding figure shows the overall architecture (http://mesos.apache.org/documentation/latest/mesos-architecture) of a Mesos cluster. It has the following entities: The Mesos masters The Mesos slaves Frameworks Communication Auxiliary services We will describe each of these entities and their role, followed by how Mesos implements different requirements of the data center kernel. The Mesos slave The Mesos slaves are responsible for executing tasks from frameworks using the resources they have. The slave has to provide proper isolation while running multiple tasks. The isolation mechanism should also make sure that the tasks get resources that they are promised, and not more or less. The resources on slaves that are managed by Mesos can be described using slave resources and slave attributes. Resources are elements of slaves that can be consumed by a task, while we use attributes to tag slaves with some information. Slave resources are managed by the Mesos master and are allocated to different frameworks. Attributes identify something about the node, such as the slave having a specific OS or software version, it's part of a particular network, or it has a particular hardware, and so on. The attributes are simple key-value pairs of strings that are passed along with the offers to frameworks. Since attributes cannot be consumed by a running task, they will always be offered for that slave. Mesos doesn't understand the slave attribute, and interpretation of the attributes is left to the frameworks. More information about resource and attributes in Mesos can be found at https://mesos.apache.org/documentation/attributes-resources. A Mesos resource or attribute can be described as one of the following types: Scalar values are floating numbers Range values are a range of scalar values; they are represented as [minValue-maxValue] Set values are arbitrary strings Text values are arbitrary strings; they are applicable only to attributes Names of the resources can be an arbitrary string consisting of alphabets, numbers, "-", "/", ".", "-". The Mesos master handles the cpus, mem, disk, and ports resources in a special way. A slave without the cpus and mem resources will not be advertised to the frameworks. The mem and disk scalars are interpreted in MB. The ports resource is represented as ranges. The list of resources a slave has to offer to various frameworks can be specified as the resources flag. Resources and attributes are separated by a semicolon. 
For example:

--resources='cpus:30;mem:122880;disk:921600;ports:[21000-29000];bugs:{a,b,c}' --attributes='rack:rack-2;datacenter:europe;os:ubuntu14.4'

This slave offers 30 cpus, 120 GB mem, 900 GB disk, and ports from 21000 to 29000, and has bugs a, b, and c. The slave has three attributes: rack with value rack-2, datacenter with value europe, and os with value ubuntu14.4.

Mesos does not yet provide direct support for GPUs, but does support custom resource types. This means that if we specify gpu(*):8 as part of --resources, then it will be part of the resource offers made to frameworks. Frameworks can use it just like other resources. Once some of the GPU resources are in use by a task, only the remaining resources will be offered. Mesos does not yet have support for GPU isolation, but it can be extended by implementing a custom isolator. Alternately, we can also specify which slaves have GPUs using attributes, such as --attributes="hasGpu:true".

The Mesos master

The Mesos master is primarily responsible for allocating resources to different frameworks and managing the task life cycle for them. The Mesos master implements fine-grained resource sharing using resource offers. The Mesos master acts as a resource broker for frameworks using pluggable policies. Based on these policies, the master decides to offer cluster resources to frameworks in the form of resource offers. A resource offer represents a unit of allocation in the Mesos world. It's a vector of resources available on a node. An offer represents some resources available on a slave being offered to a particular framework.

Frameworks

Distributed applications that run on top of Mesos are called frameworks. Frameworks implement the domain requirements using the general resource allocation API of Mesos. A typical framework wants to run a number of tasks. Tasks are the consumers of resources, and they do not have to be identical to one another. A framework in Mesos consists of two components: a framework scheduler and executors. Framework schedulers are responsible for coordinating the execution. An executor provides the ability to control the task execution. Executors can realize a task execution in many ways. An executor can choose to run multiple tasks in a single executor by spawning multiple threads, or it can run one task in each executor. Apart from the life cycle and task management-related functions, the Mesos framework API also provides functions to communicate with framework schedulers and executors.

Communication

Mesos currently uses an HTTP-like wire protocol to communicate with the Mesos components. Mesos uses the libprocess library, located in 3rdparty/libprocess, to implement the communication. The libprocess library provides asynchronous communication with processes. The communication primitives have actor-like message-passing semantics. The libprocess messages are immutable, which makes parallelizing the libprocess internals easier. Mesos communication happens over the following APIs:

Scheduler API: This is used to communicate with the framework scheduler and master. The internal communication is intended to be used only by the SchedulerDriver API.
Executor API: This is used to communicate with an executor and the Mesos slave.
Internal API: This is used to communicate with the Mesos master and slave.
Operator API: This is the API exposed by Mesos for operators and is used by the web UI, among other things. Unlike most of the Mesos APIs, the operator API is synchronous.

To send a message, the actor does an HTTP POST request.
The path is composed by the name of the actor followed by the name of the message. The User-Agent field is set to "libprocess/…" to distinguish from the normal HTTP requests. The message data is passed as the body of the HTTP request. Mesos uses protocol buffers to serialize all the messages (defined in src/messages/messages.proto). The parsing and interpretation of the message is left to the receiving actor. Here is an example header of a message sent to master to register the framework by scheduler(1) running at 10.0.1.7:53523 address: POST /master/mesos.internal.RegisterFrameworkMessage HTTP/1.1 User-Agent: libprocess/scheduler(1)@10.0.1.7:53523 The reply message header from the master that acknowledges the framework registration might look like this: POST /scheduler(1)/mesos.internal.FrameworkRegisteredMessage HTTP/1.1 User-Agent: libprocess/master@10.0.1.7:5050 At the time of writing, there is a very early discussion about rewiring the Mesos Scheduler API and Executor API as a pure HTTP API (https://issues.apache.org/jira/browse/MESOS-2288). This will make the API standard and integration with Mesos for various tools much easier without the need to be dependent on native libmesos. Also, there is an ongoing effort to convert all the internal messages into a standardized JSON or protocol buffer format (https://issues.apache.org/jira/browse/MESOS-1127). Auxiliary services Apart from the preceding main components, a Mesos cluster also needs some auxiliary services. These services are not part of Mesos itself, and are not strictly required, but they form a basis for operating the Mesos cluster in production environments. These services include, but are not limited to, the following: Shared filesystem: Mesos provides a view of the data center as a single computer and allows developers to develop for the data center scale application. With this unified view of resources, clusters need a shared filesystem to truly make the data center a computer. HDFS, NFS (Network File System), and Cloud-based storage options, such as S3, are popular among various Mesos deployments. Consensus service: Mesos uses a consensus service to be resilient in face of failure. Consensus services, such as ZooKeeper or etcd, provide a reliable leader election in a distributed environment. Service fabric: Mesos enables users to run a number of frameworks on unified computing resources. With a large number of applications and services running, it's important for users to be able to connect to them in a seamless manner. For example, how do users connect to Hive running on Mesos? How does the Ruby on Rails application discover and connect to the MongoDB database instances when one or both of them are running on Mesos? How is the website traffic routed to web servers running on Mesos? Answering these questions mainly requires service discovery and load balancing mechanisms, but also things such as IP/port management and security infrastructure. We are collectively referring to these services that connect frameworks to other frameworks and users as service fabric. Operational services: Operational services are essential in managing operational aspects of Mesos. Mesos deployments and upgrades, monitoring cluster health and alerting when human intervention is required, logging, and security are all part of the operational services that play a very important role in a Mesos cluster. 
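To make the operational side a bit more concrete, here is a small, hedged sketch of how an operator or a monitoring script might poll a master over the Operator API mentioned earlier. The host name is a placeholder, and the endpoint paths are the ones exposed by Mesos releases of this era (state.json, for instance, was later renamed to /state), so check them against your own version:

ubuntu@ops:~ $ curl -s http://master:5050/master/state.json | python -m json.tool | head
ubuntu@ops:~ $ curl -s http://master:5050/metrics/snapshot | python -m json.tool | grep "master/mem"

Feeding endpoints like these into whatever alerting stack you already run is usually the first step toward the cluster health monitoring described above.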
Resource allocation As a data center kernel, Mesos serves a large variety of workloads and no single scheduler will be able to satisfy the needs of all different frameworks. For example, the way in which a real-time processing framework schedules its tasks will be very different from how a long running service will schedule its task, which, in turn, will be very different from how a batch processing framework would like to use its resources. This observation leads to a very important design decision in Mesos: separation of resource allocation and task scheduling. Resource allocation is all about deciding who gets what resources, and it is the responsibility of the Mesos master. Task scheduling, on the other hand, is all about how to use the resources. This is decided by various framework schedulers according to their own needs. Another way to understand this would be that Mesos handles coarse-grain resource allocation across frameworks, and then each framework does fine-grain job scheduling via appropriate job ordering to achieve its needs. The Mesos master gets information on the available resources from the Mesos slaves, and based on resource policies, the Mesos master offers these resources to different frameworks. Different frameworks can choose to accept or reject the offer. If the framework accepts a resource offer, the framework allocates the corresponding resources to the framework, and then the framework is free to use them to launch tasks. The following image shows the high-level flow of Mesos resource allocation: Mesos two level scheduler Here is the typical flow of events for one framework in Mesos: The framework scheduler registers itself with the Mesos master. The Mesos master receives the resource offers from slaves. It invokes the allocation module and decides which frameworks should receive the resource offers. The framework scheduler receives the resource offers from the Mesos master. On receiving the resource offers, the framework scheduler inspects the offer to decide whether it's suitable. If it finds it satisfactory, the framework scheduler accepts the offer and replies to the master with the list of executors that should be run on the slave, utilizing the accepted resource offers. Alternatively, the framework can reject the offer and wait for a better offer. The slave allocates the requested resources and launches the task executors. The executor is launched on slave nodes and runs the framework's tasks. It is up to the framework scheduler to accept or reject the resource offers. Here is an example of events that can happen when allocating resources. The framework scheduler gets notified about the task's completion or failure. The framework scheduler will continue receiving the resource offers and task reports and launch tasks as it sees fit. The framework unregisters with the Mesos master and will not receive any further resource offers. Note that this is optional and a long running service, and meta-framework will not unregister during the normal operation. Because of this design, Mesos is also known as a two-level scheduler. Mesos' two-level scheduler design makes it simpler and more scalable, as the resource allocation process does not need to know how scheduling happens. This makes the Mesos core more stable and scalable. Frameworks and Mesos are not tied to each other and each can iterate independently. Also, this makes porting frameworks easier. 
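To make this two-level flow more tangible, here is a minimal, hedged sketch of a framework scheduler written against the pre-1.0 Mesos Python bindings (the mesos.interface and mesos.native packages). The framework name, master address, command, and resource amounts are illustrative placeholders, and all bookkeeping and error handling are omitted:

from mesos.interface import Scheduler, mesos_pb2
from mesos.native import MesosSchedulerDriver

class ExampleScheduler(Scheduler):
    def registered(self, driver, framework_id, master_info):
        # Step 1: we are registered and will start receiving offers
        print("Registered with framework id %s" % framework_id.value)

    def resourceOffers(self, driver, offers):
        # Steps 3 to 5: inspect each offer and either decline it or turn part of it into a task
        for offer in offers:
            cpus = sum(r.scalar.value for r in offer.resources if r.name == "cpus")
            mem = sum(r.scalar.value for r in offer.resources if r.name == "mem")
            if cpus < 1 or mem < 128:
                driver.declineOffer(offer.id)   # wait for a better offer
                continue
            task = mesos_pb2.TaskInfo()
            task.task_id.value = "task-" + offer.id.value
            task.slave_id.value = offer.slave_id.value
            task.name = "example task"
            task.command.value = "echo 'hello from Mesos'"
            cpu_res = task.resources.add()
            cpu_res.name, cpu_res.type, cpu_res.scalar.value = "cpus", mesos_pb2.Value.SCALAR, 1.0
            mem_res = task.resources.add()
            mem_res.name, mem_res.type, mem_res.scalar.value = "mem", mesos_pb2.Value.SCALAR, 128.0
            driver.launchTasks(offer.id, [task])

    def statusUpdate(self, driver, update):
        # Step 7: task completion or failure notifications arrive here
        print("Task %s is in state %d" % (update.task_id.value, update.state))

framework = mesos_pb2.FrameworkInfo()
framework.user = ""                    # let Mesos fill in the current user
framework.name = "example-framework"   # illustrative name
driver = MesosSchedulerDriver(ExampleScheduler(), framework, "zk://localhost:2181/mesos")
driver.run()

The important part is the division of labour: Mesos alone decides which offers this scheduler sees, while the scheduler alone decides whether to decline them or turn them into tasks.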
The choice of a two-level scheduler means that the scheduler does not have a global knowledge about resource utilization and the resource allocation decisions can be nonoptimal. One potential concern could be about the preferences that the frameworks have about the kind of resources needed for execution. Data locality, special hardware, and security constraints can be a few of the constraints on which tasks can run. In the Mesos realm, these preferences are not explicitly specified by a framework to the Mesos master, instead the framework rejects all the offers that do not meet its constraints. The Mesos scheduler Mesos was the first cluster scheduler to allow the sharing of resources to multiple frameworks. Mesos resource allocation is based on online Dominant Resource Fairness (DRF) called HierarchicalDRF. In a world of single resource static partitioning, fairness is easy to define. DRF extends this concept of fairness to multi-resource settings without the need for static partitioning. Resource utilization and fairness are equally important, and often conflicting, goals for a cluster scheduler. The fairness of resource allocation is important in a shared environment, such as data centers, to ensure that all the users/processes of the cluster get nearly an equal amount of resources. Min-max fairness provides a well-known mechanism to share a single resource among multiple users. Min-max fairness algorithm maximizes the minimum resources allocated to a user. In its simplest form, it allocates 1/Nth of the resource to each of the users. The weighted min-max fairness algorithm can also support priorities and reservations. Min-max resource fairness has been a basis for many well-known schedulers in operating systems and distributed frameworks, such as Hadoop's fair scheduler (http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html), capacity scheduler (https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html), Quincy scheduler (http://dl.acm.org/citation.cfm?id=1629601), and so on. However, it falls short when the cluster has multiple types of resources, such as the CPU, memory, disk, and network. When jobs in a distributed environment use different combinations of these resources to achieve the outcome, the fairness has to be redefined. For example, the two requests <1 CPU, 3 GB> and <3 CPU, 1 GB> come to the scheduler. How do they compare and what is the fair allocation? DRF generalizes the min-max algorithm for multiple resources. A user's dominant resource is the resource for which the user has a biggest share. For example, if the total resources are <8 CPU, 5 GB>, then for the user allocation of <2 CPU, 1 GB>, the user's dominant share is maximumOf(2/8,1/5) means CPU. A user's dominant share is the fraction of the dominant resource that's allocated to the user. In our example, it would be 25 percent (2/8). DRF applies the min-max algorithm to the dominant share of each user. It has many provable properties: Strategy proofness: A user cannot gain any advantage by lying about the demands. Sharing incentive: DRF has a minimum allocation guarantee for each user, and no user will prefer exclusive partitioned cluster of size 1/N over DRF allocation. Single resource fairness: In case of only one resource, DRF is equivalent to the min-max algorithm. Envy free: Every user prefers his allocation over any other allocation of other users. This also means that the users with the same requests get equivalent allocations. 
Bottleneck fairness: When one resource becomes a bottleneck, and each user has a dominant demand for it, DRF is equivalent to min-max. Monotonicity: Adding resources or removing users can only increase the allocation of the remaining users. Pareto efficiency: The allocation achieved by DRF will be pareto efficient, so it would be impossible to make a better allocation for any user without making allocation for some other user worse. We will not further discuss DRF but will encourage you to refer to the DRF paper for more details at http://static.usenix.org/event/nsdi11/tech/full_papers/Ghodsi.pdf. Mesos uses role specified in FrameworkInfo for resource allocation decision. A role can be per user or per framework or can be shared by multiple users and frameworks. If it's not set, Mesos will set it to the current user that runs the framework scheduler. An optimization is to use deny resource offers from particular slaves for a specified time period. Mesos can revoke tasks allocation killing those tasks. Before killing a task, Mesos gives the framework a grace period to clean up. Mesos asks the executor to kill the task, but if it does not oblige the request, it will kill the executor and all of its tasks. Weighted DRF DRF calculates each role's dominant share and allocates the available resources to the user with the smallest dominant share. In practice, an organization rarely wants to assign resources in a complete fair manner. Most organizations want to allocate resources in a weighted manner, such as 50 percent resources to ads team, 30 percent to QA, and 20 percent to R&D teams. To satisfy this command functionality, Mesos implements weighted DRF, where masters can be configured with weights for different roles. When weights are specified, a client's DRF share will be divided by the weight. For example, a role that has a weight of two will be offered twice as many resources as a role with weight of one. Mesos can be configured to use weighted DRF using the --weights and --roles flags on the master startup. The --weights flag expects a list of role/weight pairs in the form of role1=weight1 and role2=weight2. Weights do not need to be integers. We must provide weights for each role that appear in --roles on the master startup. Reservation One of the other most asked questions for requirement is the ability to reserve resources. For example, persistent or stateful services, such as memcache, or a database running on Mesos, would need a reservation mechanism to avoid being negatively affected on restart. Without reservation, memcache is not guaranteed to get a resource offer from the slave, which has all the data and would incur significant time in initialization and downtime for the service. Reservation can also be used to limit the resource per role. Reservation provides guaranteed resources for roles, but improper usage might lead to resource fragmentation and lower utilization of resources. Note that all the reservation requests go through a Mesos authorization mechanism to ensure that the operator or framework requesting the operation has the proper privileges. Reservation privileges are specified to the Mesos master through ACL along with the rest of the ACL configuration. Mesos supports the following two kinds of reservation: Static reservation Dynamic reservation Static reservation In static reservation, resources are reserved for a particular role. The restart of the slave after removing the checkpointed state is required to change static reservation. 
Static reservation is thus typically managed by operators using the --resources flag on the slave. The flag expects a list of name(role):value for different resources. If a resource is assigned to role A, then only frameworks with role A are eligible to get an offer for that resource. Any resources that do not include a role or resources that are not included in the --resources flag will be included in the default role (default *). For example, --resources="cpus:4;mem:2048;cpus(ads):8;mem(ads):4096" specifies that the slave has 8 CPUs and 4096 MB memory reserved for "ads" role and has 4 CPUs and 2048 MB memory unreserved. Nonuniform static reservation across slaves can quickly become difficult to manage. Dynamic reservation Dynamic reservation allows operators and frameworks to manage reservation more dynamically. Frameworks can use dynamic reservations to reserve offered resources, allowing those resources to only be reoffered to the same framework. At the time of writing, dynamic reservation is still being actively developed and is targeted toward the next release of Mesos (https://issues.apache.org/jira/browse/MESOS-2018). When asked for a reservation, Mesos will try to convert the unreserved resources to reserved resources. On the other hand, during the unreserve operation, the previously reserved resources are returned to the unreserved pool of resources. To support dynamic reservation, Mesos allows a sequence of Offer::Operations to be performed as a response to accepting resource offers. A framework manages reservation by sending Offer::Operations::Reserve and Offer::Operations::Unreserve as part of these operations, when receiving resource offers. For example, consider the framework that receives the following resource offer with 32 CPUs and 65536 MB memory: {   "id" : <offer_id>,   "framework_id" : <framework_id>,   "slave_id" : <slave_id>,   "hostname" : <hostname>,   "resources" : [     {       "name" : "cpus",       "type" : "SCALAR",       "scalar" : { "value" : 32 },       "role" : "*",     },     {       "name" : "mem",       "type" : "SCALAR",       "scalar" : { "value" : 65536 },       "role" : "*",     }   ] } The framework can decide to reserve 8 CPUs and 4096 MB memory by sending the Operation::Reserve message with resources field with the desired resources state: [   {     "type" : Offer::Operation::RESERVE,     "resources" : [       {         "name" : "cpus",         "type" : "SCALAR",         "scalar" : { "value" : 8 },         "role" : <framework_role>,         "reservation" : {           "framework_id" : <framework_id>,           "principal" : <framework_principal>         }       }       {         "name" : "mem",         "type" : "SCALAR",         "scalar" : { "value" : 4096 },         "role" : <framework_role>,         "reservation" : {           "framework_id" : <framework_id>,           "principal" : <framework_principal>         }       }     ]   } ] After a successful execution, the framework will receive resource offers with reservation. 
The next offer from the slave might look as follows: {   "id" : <offer_id>,   "framework_id" : <framework_id>,   "slave_id" : <slave_id>,   "hostname" : <hostname>,   "resources" : [     {       "name" : "cpus",       "type" : "SCALAR",       "scalar" : { "value" : 8 },       "role" : <framework_role>,       "reservation" : {         "framework_id" : <framework_id>,         "principal" : <framework_principal>       }     },     {       "name" : "mem",       "type" : "SCALAR",       "scalar" : { "value" : 4096 },       "role" : <framework_role>,       "reservation" : {         "framework_id" : <framework_id>,         "principal" : <framework_principal>       }     },     {       "name" : "cpus",       "type" : "SCALAR",       "scalar" : { "value" : 24 },       "role" : "*",     },     {       "name" : "mem",       "type" : "SCALAR",       "scalar" : { "value" : 61440 },       "role" : "*",     }   ] } As shown, the framework has 8 CPUs and 4096 MB memory reserved resources and 24 CPUs and 61440 MB memory underserved in the resource offer. The unreserve operation is similar. The framework on receiving the resource offer can send the unreserve operation message, and subsequent offers will not have reserved resources. The operators can use/reserve and/unreserve HTTP endpoints of the operator API to manage the reservation. The operator API allows operators to change the reservation specified when the slave starts. For example, the following command will reserve 4 CPUs and 4096 MB memory on slave1 for role1 with the operator authentication principal ops: ubuntu@master:~ $ curl -d slaveId=slave1 -d resources="{          {            "name" : "cpus",            "type" : "SCALAR",            "scalar" : { "value" : 4 },            "role" : "role1",            "reservation" : {              "principal" : "ops"            }          },          {            "name" : "mem",            "type" : "SCALAR",            "scalar" : { "value" : 4096 },            "role" : "role1",            "reservation" : {              "principal" : "ops"            }          },        }"        -X POST http://master:5050/master/reserve Before we end this discussion on resource allocation, it would be important to note that the Mesos community continues to innovate on the resource allocation front by incorporating interesting ideas, such as oversubscription (https://issues.apache.org/jira/browse/MESOS-354), from academic literature and other systems. Summary In this article, we looked at the Mesos architecture in detail and learned how Mesos deals with resource allocation, resource isolation, and fault tolerance. We also saw the various ways in which we can extend Mesos. Resources for Article: Further resources on this subject: Recommender systems dissected Tuning Solr JVM and Container [article] Transformation [article] Getting Started [article]

Working with large data sources

Packt
08 Jul 2015
20 min read
In this article, by Duncan M. McGreggor, author of the book Mastering matplotlib, we come across the use of NumPy in the world of matplotlib and big data, problems with large data sources, and the possible solutions to these problems. (For more resources related to this topic, see here.) Most of the data that users feed into matplotlib when generating plots is from NumPy. NumPy is one of the fastest ways of processing numerical and array-based data in Python (if not the fastest), so this makes sense. However by default, NumPy works on in-memory database. If the dataset that you want to plot is larger than the total RAM available on your system, performance is going to plummet. In the following section, we're going to take a look at an example that illustrates this limitation. But first, let's get our notebook set up, as follows: In [1]: import matplotlib        matplotlib.use('nbagg')        %matplotlib inline Here are the modules that we are going to use: In [2]: import glob, io, math, os         import psutil        import numpy as np        import pandas as pd        import tables as tb        from scipy import interpolate        from scipy.stats import burr, norm        import matplotlib as mpl        import matplotlib.pyplot as plt        from IPython.display import Image We'll use the custom style sheet that we created earlier, as follows: In [3]: plt.style.use("../styles/superheroine-2.mplstyle") An example problem To keep things manageable for an in-memory example, we're going to limit our generated dataset to 100 million points by using one of SciPy's many statistical distributions, as follows: In [4]: (c, d) = (10.8, 4.2)        (mean, var, skew, kurt) = burr.stats(c, d, moments='mvsk') The Burr distribution, also known as the Singh–Maddala distribution, is commonly used to model household income. Next, we'll use the burr object's method to generate a random population with our desired count, as follows: In [5]: r = burr.rvs(c, d, size=100000000) Creating 100 million data points in the last call took about 10 seconds on a moderately recent workstation, with the RAM usage peaking at about 2.25 GB (before the garbage collection kicked in). Let's make sure that it's the size we expect, as follows: In [6]: len(r) Out[6]: 100000000 If we save this to a file, it weighs in at about three-fourths of a gigabyte: In [7]: r.tofile("../data/points.bin") In [8]: ls -alh ../data/points.bin        -rw-r--r-- 1 oubiwann staff 763M Mar 20 11:35 points.bin This actually does fit in the memory on a machine with a RAM of 8 GB, but generating much larger files tends to be problematic. We can reuse it multiple times though, to reach a size that is larger than what can fit in the system RAM. Before we do this, let's take a look at what we've got by generating a smooth curve for the probability distribution, as follows: In [9]: x = np.linspace(burr.ppf(0.0001, c, d),                          burr.ppf(0.9999, c, d), 100)          y = burr.pdf(x, c, d) In [10]: (figure, axes) = plt.subplots(figsize=(20, 10))          axes.plot(x, y, linewidth=5, alpha=0.7)          axes.hist(r, bins=100, normed=True)          plt.show() The following plot is the result of the preceding code: Our plot of the Burr probability distribution function, along with the 100-bin histogram with a sample size of 100 million points, took about 7 seconds to render. This is due to the fact that NumPy handles most of the work, and we only displayed a limited number of visual elements. 
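Before pushing matplotlib to draw every point, it is worth noting that selectivity can also be achieved by plotting a modest random sample of the population instead of the whole thing. The following is an illustrative aside rather than part of the original notebook; the sample size is arbitrary, and it reuses the x and y curve computed above:

indexes = np.random.randint(0, len(r), 100000)   # 100,000 random positions
sample = r[indexes]                               # sampled with replacement
(figure, axes) = plt.subplots(figsize=(20, 10))
axes.plot(x, y, linewidth=5, alpha=0.7)
axes.hist(sample, bins=100, normed=True)
plt.show()

A histogram of the sample looks very much like the full-population histogram at a fraction of the rendering cost. Subsampling only sidesteps the memory question, though, which is what the rest of this article tackles.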
What would happen if we did try to plot all the 100 million points? This can be checked by the following code: In [11]: (figure, axes) = plt.subplots()          axes.plot(r)          plt.show() formatters.py:239: FormatterWarning: Exception in image/png formatter: Allocated too many blocks After about 30 seconds of crunching, the preceding error was thrown—the Agg backend (a shared library) simply couldn't handle the number of artists required to render all the points. But for now, this case clarifies the point that we stated a while back—our first plot rendered relatively quickly because we were selective about the data we chose to present, given the large number of points with which we are working. However, let's say we have data from the files that are too large to fit into the memory. What do we do about this? Possible ways to address this include the following: Moving the data out of the memory and into the filesystem Moving the data off the filesystem and into the databases We will explore examples of these in the following section. Big data on the filesystem The first of the two proposed solutions for large datasets involves not burdening the system memory with data, but rather leaving it on the filesystem. There are several ways to accomplish this, but the following two methods in particular are the most common in the world of NumPy and matplotlib: NumPy's memmap function: This function creates memory-mapped files that are useful if you wish to access small segments of large files on the disk without having to read the whole file into the memory. PyTables: This is a package that is used to manage hierarchical datasets. It is built on the top of the HDF5 and NumPy libraries and is designed to efficiently and easily cope with extremely large amounts of data. We will examine each in turn. NumPy's memmap function Let's restart the IPython kernel by going to the IPython menu at the top of notebook page, selecting Kernel, and then clicking on Restart. When the dialog box pops up, click on Restart. Then, re-execute the first few lines of the notebook by importing the required libraries and getting our style sheet set up. Once the kernel is restarted, take a look at the RAM utilization on your system for a fresh Python process for the notebook: In [4]: Image("memory-before.png") Out[4]: The following screenshot shows the RAM utilization for a fresh Python process: Now, let's load the array data that we previously saved to disk and recheck the memory utilization, as follows: In [5]: data = np.fromfile("../data/points.bin")        data_shape = data.shape        data_len = len(data)        data_len Out[5]: 100000000 In [6]: Image("memory-after.png") Out[6]: The following screenshot shows the memory utilization after loading the array data: This took about five seconds to load, with the memory consumption equivalent to the file size of the data. This means that if we wanted to build some sample data that was too large to fit in the memory, we'd need about 11 of those files concatenated, as follows: In [7]: 8 * 1024 Out[7]: 8192 In [8]: filesize = 763        8192 / filesize Out[8]: 10.73656618610747 However, this is only if the entire memory was available. Let's see how much memory is available right now, as follows: In [9]: del data In [10]: psutil.virtual_memory().available / 1024**2 Out[10]: 2449.1796875 That's 2.5 GB. So, to overrun our RAM, we'll just need a fraction of the total. 
This is done in the following way: In [11]: 2449 / filesize Out[11]: 3.2096985583224114 The preceding output means that we only need four of our original files to create a file that won't fit in memory. However, in the following section, we will still use 11 files to ensure that data, if loaded into the memory, will be much larger than the memory. How do we create this large file for demonstration purposes (knowing that in a real-life situation, the data would already be created and potentially quite large)? We can try to use numpy.tile to create a file of the desired size (larger than memory), but this can make our system unusable for a significant period of time. Instead, let's use numpy.memmap, which will treat a file on the disk as an array, thus letting us work with data that is too large to fit into the memory. Let's load the data file again, but this time as a memory-mapped array, as follows: In [12]: data = np.memmap(            "../data/points.bin", mode="r", shape=data_shape) The loading of the array to a memmap object was very quick (compared to the process of bringing the contents of the file into the memory), taking less than a second to complete. Now, let's create a new file to write the data to. This file must be larger in size as compared to our total system memory (if held on in-memory database, it will be smaller on the disk): In [13]: big_data_shape = (data_len * 11,)          big_data = np.memmap(              "../data/many-points.bin", dtype="uint8",              mode="w+", shape=big_data_shape) The preceding code creates a 1 GB file, which is mapped to an array that has the shape we requested and just contains zeros: In [14]: ls -alh ../data/many-points.bin          -rw-r--r-- 1 oubiwann staff 1.0G Apr 2 11:35 many-points.bin In [15]: big_data.shape Out[15]: (1100000000,) In [16]: big_data Out[16]: memmap([0, 0, 0, ..., 0, 0, 0], dtype=uint8) Now, let's fill the empty data structure with copies of the data we saved to the 763 MB file, as follows: In [17]: for x in range(11):              start = x * data_len              end = (x * data_len) + data_len              big_data[start:end] = data          big_data Out[17]: memmap([ 90, 71, 15, ..., 33, 244, 63], dtype=uint8) If you check your system memory before and after, you will only see minimal changes, which confirms that we are not creating an 8 GB data structure on in-memory. Furthermore, checking your system only takes a few seconds. Now, we can do some sanity checks on the resulting data and ensure that we have what we were trying to get, as follows: In [18]: big_data_len = len(big_data)          big_data_len Out[18]: 1100000000 In [19]: data[100000000 – 1] Out[19]: 63 In [20]: big_data[100000000 – 1] Out[20]: 63 Attempting to get the next index from our original dataset will throw an error (as shown in the following code), since it didn't have that index: In [21]: data[100000000] ----------------------------------------------------------- IndexError               Traceback (most recent call last) ... IndexError: index 100000000 is out of bounds ... But our new data does have an index, as shown in the following code: In [22]: big_data[100000000 Out[22]: 90 And then some: In [23]: big_data[1100000000 – 1] Out[23]: 63 We can also plot data from a memmaped array without having a significant lag time. 
However, note that in the following code, we will create a histogram from 1.1 million points of data, so the plotting won't be instantaneous: In [24]: (figure, axes) = plt.subplots(figsize=(20, 10))          axes.hist(big_data, bins=100)          plt.show() The following plot is the result of the preceding code: The plotting took about 40 seconds to generate. The odd shape of the histogram is due to the fact that, with our data file-hacking, we have radically changed the nature of our data since we've increased the sample size linearly without regard for the distribution. The purpose of this demonstration wasn't to preserve a sample distribution, but rather to show how one can work with large datasets. What we have seen is not too shabby. Thanks to NumPy, matplotlib can work with data that is too large for memory, even if it is a bit slow iterating over hundreds of millions of data points from the disk. Can matplotlib do better? HDF5 and PyTables A commonly used file format in the scientific computing community is Hierarchical Data Format (HDF). HDF is a set of file formats (namely HDF4 and HDF5) that were originally developed at the National Center for Supercomputing Applications (NCSA), a unit of the University of Illinois at Urbana-Champaign, to store and organize large amounts of numerical data. The NCSA is a great source of technical innovation in the computing industry—a Telnet client, the first graphical web browser, a web server that evolved into the Apache HTTP server, and HDF, which is of particular interest to us, were all developed here. It is a little known fact that NCSA's web browser code was the ancestor to both the Netscape web browser as well as a prototype of Internet Explorer that was provided to Microsoft by a third party. HDF is supported by Python, R, Julia, Java, Octave, IDL, and MATLAB, to name a few. HDF5 offers significant improvements and useful simplifications over HDF4. It uses B-trees to index table objects and, as such, works well for write-once/read-many time series data. Common use cases span fields such as meteorological studies, biosciences, finance, and aviation. The HDF5 files of multiterabyte sizes are common in these applications. Its typically constructed from the analyses of multiple HDF5 source files, thus providing a single (and often extensive) source of grouped data for a particular application. The PyTables library is built on the top of the Python HDF5 library and NumPy. As such, it not only provides access to one of the most widely used large data file formats in the scientific computing community, but also links data extracted from these files with the data types and objects provided by the fast Python numerical processing library. PyTables is also used in other projects. Pandas wraps PyTables, thus extending its convenient in-memory data structures, functions, and objects to large on-disk files. To use HDF data with Pandas, you'll want to create pandas.HDFStore, read from the HDF data sources with pandas.read_hdf, or write to one with pandas.to_hdf. Files that are too large to fit in the memory may be read and written by utilizing chunking techniques. Pandas does support the disk-based DataFrame operations, but these are not very efficient due to the required assembly on columns of data upon reading back into the memory. One project to keep an eye on under the PyData umbrella of projects is Blaze. 
It's an open wrapper and a utility framework that can be used when you wish to work with large datasets and generalize actions such as the creation, access, updates, and migration. Blaze supports not only HDF, but also SQL, CSV, and JSON. The API usage between Pandas and Blaze is very similar, and it offers a nice tool for developers who need to support multiple backends. In the following example, we will use PyTables directly to create an HDF5 file that is too large to fit in the memory (for an 8GB RAM machine). We will follow the following steps: Create a series of CSV source data files that take up approximately 14 GB of disk space Create an empty HDF5 file Create a table in the HDF5 file and provide the schema metadata and compression options Load the CSV source data into the HDF5 table Query the new data source once the data has been migrated Remember the temperature precipitation data for St. Francis, in Kansas, USA, from a previous notebook? We are going to generate random data with similar columns for the purposes of the HDF5 example. This data will be generated from a normal distribution, which will be used in the guise of the temperature and precipitation data for hundreds of thousands of fictitious towns across the globe for the last century, as follows: In [25]: head = "country,town,year,month,precip,tempn"          row = "{},{},{},{},{},{}n"          filename = "../data/{}.csv"          town_count = 1000          (start_year, end_year) = (1894, 2014)          (start_month, end_month) = (1, 13)          sample_size = (1 + 2                        * town_count * (end_year – start_year)                        * (end_month - start_month))          countries = range(200)          towns = range(town_count)          years = range(start_year, end_year)          months = range(start_month, end_month)          for country in countries:             with open(filename.format(country), "w") as csvfile:                  csvfile.write(head)                  csvdata = ""                  weather_data = norm.rvs(size=sample_size)                  weather_index = 0                  for town in towns:                    for year in years:                          for month in months:                              csvdata += row.format(                                  country, town, year, month,                                  weather_data[weather_index],                                  weather_data[weather_index + 1])                              weather_index += 2                  csvfile.write(csvdata) Note that we generated a sample data population that was twice as large as the expected size in order to pull both the simulated temperature and precipitation data at the same time (from the same set). This will take about 30 minutes to run. When complete, you will see the following files: In [26]: ls -rtm ../data/*.csv          ../data/0.csv, ../data/1.csv, ../data/2.csv,          ../data/3.csv, ../data/4.csv, ../data/5.csv,          ...          ../data/194.csv, ../data/195.csv, ../data/196.csv,          ../data/197.csv, ../data/198.csv, ../data/199.csv A quick look at just one of the files reveals the size of each, as follows: In [27]: ls -lh ../data/0.csv          -rw-r--r-- 1 oubiwann staff 72M Mar 21 19:02 ../data/0.csv With each file that is 72 MB in size, we have data that takes up 14 GB of disk space, which exceeds the size of the RAM of the system in question. Furthermore, running queries against so much data in the .csv files isn't going to be very efficient. 
It's going to take a long time. So what are our options? Well, to read this data, HDF5 is a very good fit. In fact, it is designed for jobs like this. We will use PyTables to convert the .csv files to a single HDF5. We'll start by creating an empty table file, as follows: In [28]: tb_name = "../data/weather.h5t"          h5 = tb.open_file(tb_name, "w")          h5 Out[28]: File(filename=../data/weather.h5t, title='', mode='w',              root_uep='/', filters=Filters(                  complevel=0, shuffle=False, fletcher32=False,                  least_significant_digit=None))          / (RootGroup) '' Next, we'll provide some assistance to PyTables by indicating the data types of each column in our table, as follows: In [29]: data_types = np.dtype(              [("country", "<i8"),              ("town", "<i8"),              ("year", "<i8"),              ("month", "<i8"),               ("precip", "<f8"),              ("temp", "<f8")]) Also, let's define a compression filter that can be used by PyTables when saving our data, as follows: In [30]: filters = tb.Filters(complevel=5, complib='blosc') Now, we can create a table inside our new HDF5 file, as follows: In [31]: tab = h5.create_table(              "/", "weather",              description=data_types,              filters=filters) With that done, let's load each CSV file, read it in chunks so that we don't overload the memory, and then append it to our new HDF5 table, as follows: In [32]: for filename in glob.glob("../data/*.csv"):          it = pd.read_csv(filename, iterator=True, chunksize=10000)          for chunk in it:              tab.append(chunk.to_records(index=False))            tab.flush() Depending on your machine, the entire process of loading the CSV file, reading it in chunks, and appending to a new HDF5 table can take anywhere from 5 to 10 minutes. However, what started out as a collection of the .csv files that weigh in at 14 GB is now a single compressed 4.8 GB HDF5 file, as shown in the following code: In [33]: h5.get_filesize() Out[33]: 4758762819 Here's the metadata for the PyTables-wrapped HDF5 table after the data insertion: In [34]: tab Out[34]: /weather (Table(288000000,), shuffle, blosc(5)) '' description := { "country": Int64Col(shape=(), dflt=0, pos=0), "town": Int64Col(shape=(), dflt=0, pos=1), "year": Int64Col(shape=(), dflt=0, pos=2), "month": Int64Col(shape=(), dflt=0, pos=3), "precip": Float64Col(shape=(), dflt=0.0, pos=4), "temp": Float64Col(shape=(), dflt=0.0, pos=5)} byteorder := 'little' chunkshape := (1365,) Now that we've created our file, let's use it. 
Let's excerpt a few lines with an array slice, as follows: In [35]: tab[100000:100010] Out[35]: array([(0, 69, 1947, 5, -0.2328834718674, 0.06810312195695),          (0, 69, 1947, 6, 0.4724989007889, 1.9529216219569),          (0, 69, 1947, 7, -1.0757216683235, 1.0415374480545),          (0, 69, 1947, 8, -1.3700249968748, 3.0971874991576),          (0, 69, 1947, 9, 0.27279758311253, 0.8263207523831),          (0, 69, 1947, 10, -0.0475253104621, 1.4530808932953),          (0, 69, 1947, 11, -0.7555493935762, -1.2665440609117),          (0, 69, 1947, 12, 1.540049376928, 1.2338186532516),          (0, 69, 1948, 1, 0.829743501445, -0.1562732708511),          (0, 69, 1948, 2, 0.06924900463163, 1.187193711598)],          dtype=[('country', '<i8'), ('town', '<i8'),                ('year', '<i8'), ('month', '<i8'),                ('precip', '<f8'), ('temp', '<f8')]) In [36]: tab[100000:100010]["precip"] Out[36]: array([-0.23288347, 0.4724989 , -1.07572167,                -1.370025 , 0.27279758, -0.04752531,                -0.75554939, 1.54004938, 0.8297435 ,                0.069249 ]) When we're done with the file, we do the same thing that we would do with any other file-like object: In [37]: h5.close() If we want to work with it again, simply load it, as follows: In [38]: h5 = tb.open_file(tb_name, "r")          tab = h5.root.weather Let's try plotting the data from our HDF5 file: In [39]: (figure, axes) = plt.subplots(figsize=(20, 10))          axes.hist(tab[:1000000]["temp"], bins=100)          plt.show() Here's a plot for the first million data points: This histogram was rendered quickly, with a much better response time than what we've seen before. Hence, the process of accessing the HDF5 data is very fast. The next question might be "What about executing calculations against this data?" Unfortunately, running the following will consume an enormous amount of RAM: tab[:]["temp"].mean() We've just asked for all of the data—all of its 288 million rows. We are going to end up loading everything into the RAM, grinding the average workstation to a halt. Ideally though, when you iterate through the source data and create the HDF5 file, you also crunch the numbers that you will need, adding supplemental columns or groups to the HDF5 file that can be used later by you and your peers. If we have data that we will mostly be selecting (extracting portions) and which has already been crunched and grouped as needed, HDF5 is a very good fit. This is why one of the most common use cases that you see for HDF5 is the sharing and distribution of the processed data. However, if we have data that we need to process repeatedly, then we will either need to use another method besides the one that will cause all the data to be loaded into the memory, or find a better match for our data processing needs. We saw in the previous section that the selection of data is very fast in HDF5. What about calculating the mean for a small section of data? If we've got a total of 288 million rows, let's select a divisor of the number that gives us several hundred thousand rows at a time—2,81,250 rows, to be more precise. Let's get the mean for the first slice, as follows: In [40]: tab[0:281250]["temp"].mean() Out[40]: 0.0030696632864265312 This took about 1 second to calculate. What about iterating through the records in a similar fashion? Let's break up the 288 million records into chunks of the same size; this will result in 1024 chunks. 
We saw in the previous section that the selection of data is very fast in HDF5. What about calculating the mean for a small section of data? If we've got a total of 288 million rows, let's pick a divisor that gives us several hundred thousand rows at a time: 281,250 rows, to be precise. Let's get the mean for the first slice, as follows:

In [40]: tab[0:281250]["temp"].mean()
Out[40]: 0.0030696632864265312

This took about 1 second to calculate. What about iterating through the records in a similar fashion? Let's break up the 288 million records into chunks of the same size; this will result in 1024 chunks. We'll start by getting the ranges needed for an increment of 281,250 and then examine the first and the last range as a sanity check, as follows:

In [41]: limit = 281250
         ranges = [(x * limit, x * limit + limit)
             for x in range(2 ** 10)]
         (ranges[0], ranges[-1])
Out[41]: ((0, 281250), (287718750, 288000000))

Now, we can use these ranges to generate the mean for each chunk of 281,250 rows of temperature data and print the total number of means that we generated to make sure that we're getting our counts right, as follows:

In [42]: means = [tab[x * limit:x * limit + limit]["temp"].mean()
             for x in range(2 ** 10)]
         len(means)
Out[42]: 1024

Depending on your machine, that should take between 30 and 60 seconds. With this work done, it's now easy to calculate the mean for all 288 million points of temperature data:

In [43]: sum(means) / len(means)
Out[43]: -5.3051780413782918e-05

Thanks to HDF5's efficient file format and implementation, combined with splitting our operations into tasks that never copy the full dataset into memory, we were able to perform calculations across nearly a third of a billion records in less than a minute. HDF5 even supports parallelization, so this could be improved upon further with a little more time and effort.

However, there are many cases where HDF5 is not a practical choice. You may have free-form data that is too expensive to preprocess, or datasets that are simply too large to fit on a single machine. These are the cases where you may consider using matplotlib with distributed data.

Summary

In this article, we covered the role of NumPy and matplotlib in the world of big data, along with the process of, and the problems involved in, working with large data sources. We also discussed possible solutions to these problems using NumPy's memmap function as well as HDF5 and PyTables.

Resources for Article:

Further resources on this subject:
First Steps [article]
Introducing Interactive Plotting [article]
The plot function [article]


File Sharing

Packt
08 Jul 2015
14 min read
In this article by Dan Ristic, author of the book Learning WebRTC, we will cover the following topics:

Getting a file with the File API
Setting up our page
Getting a reference to a file

The real power of a data channel comes when it is combined with other powerful technologies in the browser. By pairing the ability to send data peer-to-peer with the File API, we open up entirely new possibilities in the browser, such as adding file sharing functionality that is available to any user with an Internet connection.

The application that we will build is a simple one that shares files between two peers. Our application will be real-time, meaning that the two users have to be on the page at the same time to share a file. There is a finite number of steps that both users will go through to transfer an entire file between them:

1. User A opens the page and types a unique ID.
2. User B opens the same page and types the same unique ID.
3. The two users can then connect to each other using RTCPeerConnection.
4. Once the connection is established, one user selects a file to share.
5. The other user is notified of the file being shared; it is transferred to their computer over the connection, and they download it.

The main thing we will focus on throughout the article is how to work with the data channel in new and exciting ways. We will take the file data from the browser, break it down into pieces, and send it to the other user using only the RTCPeerConnection API. The interactivity that the API promotes will stand out in this article and can be reused in a simple project.

Getting a file with the File API

One of the first things that we will cover is how to use the File API to get a file from the user's computer. There is a good chance you have interacted with the File API on a web page without even realizing it! The API usually appears as a Browse or Choose File button attached to an input field on an HTML page.

Although the API has been around for quite a while, the part you are probably familiar with is the original specification, dating back as far as 1995. This was the Form-based File Upload in HTML specification, which focused on allowing a user to upload a file to a server using an HTML form. Before the days of the file input, application developers had to rely on third-party tools to request files from the user. The specification was proposed to provide a standard way to upload files for a server to receive, save, and interact with. The original standard, however, focused entirely on interacting with a file via an HTML form and did not detail any way to interact with a file via JavaScript. This was the origin of the File API.

Fast-forward to the groundbreaking days of HTML5 and we now have a fully fledged File API. The goal of the new specification was to open the doors to file manipulation for web applications, allowing them to interact with files much as a natively installed application would. This means providing not only a way for the user to upload a file, but also ways to read the file in different formats, manipulate its data, and ultimately do something with that data. Although the API has many great features, we are going to focus on just one small aspect of it: the ability to get binary file data from the user by asking them to upload a file.
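As a quick illustration of that one aspect, the following sketch shows the general pattern for pulling raw bytes out of a file input with the File API. The element ID here is a placeholder and is not part of the application built in this article:

// Hypothetical file input for illustration; any <input type="file"> element will do.
var exampleInput = document.querySelector('#example-file-input');

exampleInput.addEventListener('change', function () {
  var file = exampleInput.files[0];    // a File object: metadata plus a handle to the data
  var reader = new FileReader();

  reader.onload = function (event) {
    // event.target.result is an ArrayBuffer holding the raw bytes of the file
    console.log('Read ' + event.target.result.byteLength + ' bytes from ' + file.name);
  };

  reader.readAsArrayBuffer(file);      // read the binary contents asynchronously
});

Reading the file into an ArrayBuffer like this is one common way to prepare its contents for being broken into pieces and sent over a data channel.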
A typical application that works with files, such as Notepad on Windows, handles file data in much the same way: it asks the user to open a file, reads the binary data from that file, and displays the characters on the screen. The File API gives the browser access to the same binary data that any other application would use. This is the great thing about working with the File API: it works in most browsers from an HTML page, similar to the ones we have been building for our WebRTC demos. To start building our application, we will put together another simple web page. It will look similar to the previous ones and should be hosted with a static file server, as done in the earlier examples. By the end of the article, you will be a professional single-page application builder! Now let's take a look at the following HTML code that demonstrates file sharing:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="utf-8" />

   <title>Learning WebRTC - Article: File Sharing</title>

   <style>
     body {
       background-color: #404040;
       margin-top: 15px;
       font-family: sans-serif;
       color: white;
     }

     .thumb {
       height: 75px;
       border: 1px solid #000;
       margin: 10px 5px 0 0;
     }

     .page {
       position: relative;
       display: block;
       margin: 0 auto;
       width: 500px;
       height: 500px;
     }

     #byte_content {
       margin: 5px 0;
       max-height: 100px;
       overflow-y: auto;
       overflow-x: hidden;
     }

     #byte_range {
       margin-top: 5px;
     }
   </style>
</head>
<body>
   <div id="login-page" class="page">
     <h2>Login As</h2>
     <input type="text" id="username" />
     <button id="login">Login</button>
   </div>

   <div id="share-page" class="page">
     <h2>File Sharing</h2>

     <input type="text" id="their-username" />
     <button id="connect">Connect</button>
     <div id="ready">Ready!</div>

     <br />
     <br />

     <input type="file" id="files" name="file" /> Read bytes:
     <button id="send">Send</button>
   </div>

   <script src="client.js"></script>
</body>
</html>

The page should be fairly recognizable at this point. We will use the same CSS-based showing and hiding of pages as we did earlier. One of the main differences is the file input, which we will use to have the user upload a file to the page. I even picked a different background color this time to spice things up.

Setting up our page

Create a new folder for our file sharing application and add the HTML code shown in the preceding section. You will also need all the steps from our JavaScript file to log in two users, create a WebRTC peer connection, and create a data channel between them.
Copy the following code into your JavaScript file to get the page set up: var name, connectedUser;   var connection = new WebSocket('ws://localhost:8888');   connection.onopen = function () { console.log("Connected"); };   // Handle all messages through this callback connection.onmessage = function (message) { console.log("Got message", message.data);   var data = JSON.parse(message.data);   switch(data.type) {    case "login":      onLogin(data.success);      break;    case "offer":      onOffer(data.offer, data.name);      break;    case "answer":      onAnswer(data.answer);      break;    case "candidate":      onCandidate(data.candidate);      break;    case "leave":      onLeave();      break;    default:      break; } };   connection.onerror = function (err) { console.log("Got error", err); };   // Alias for sending messages in JSON format function send(message) { if (connectedUser) {    message.name = connectedUser; }   connection.send(JSON.stringify(message)); };   var loginPage = document.querySelector('#login-page'), usernameInput = document.querySelector('#username'), loginButton = document.querySelector('#login'), theirUsernameInput = document.querySelector('#their- username'), connectButton = document.querySelector('#connect'), sharePage = document.querySelector('#share-page'), sendButton = document.querySelector('#send'), readyText = document.querySelector('#ready'), statusText = document.querySelector('#status');   sharePage.style.display = "none"; readyText.style.display = "none";   // Login when the user clicks the button loginButton.addEventListener("click", function (event) { name = usernameInput.value;   if (name.length > 0) {    send({      type: "login",      name: name    }); } });   function onLogin(success) { if (success === false) {    alert("Login unsuccessful, please try a different name."); } else {    loginPage.style.display = "none";    sharePage.style.display = "block";      // Get the plumbing ready for a call    startConnection(); } };   var yourConnection, connectedUser, dataChannel, currentFile, currentFileSize, currentFileMeta;   function startConnection() { if (hasRTCPeerConnection()) {    setupPeerConnection(); } else {    alert("Sorry, your browser does not support WebRTC."); } }   function setupPeerConnection() { var configuration = {    "iceServers": [{ "url": "stun:stun.1.google.com:19302 " }] }; yourConnection = new RTCPeerConnection(configuration, {optional: []});   // Setup ice handling yourConnection.onicecandidate = function (event) {    if (event.candidate) {      send({        type: "candidate",       candidate: event.candidate      });    } };   openDataChannel(); }   function openDataChannel() { var dataChannelOptions = {    ordered: true,    reliable: true,    negotiated: true,    id: "myChannel" }; dataChannel = yourConnection.createDataChannel("myLabel", dataChannelOptions);   dataChannel.onerror = function (error) {    console.log("Data Channel Error:", error); };   dataChannel.onmessage = function (event) {    // File receive code will go here };   dataChannel.onopen = function () {    readyText.style.display = "inline-block"; };   dataChannel.onclose = function () {    readyText.style.display = "none"; }; }   function hasUserMedia() { navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia || navigator.msGetUserMedia; return !!navigator.getUserMedia; }   function hasRTCPeerConnection() { window.RTCPeerConnection = window.RTCPeerConnection || window.webkitRTCPeerConnection || 
window.mozRTCPeerConnection; window.RTCSessionDescription = window.RTCSessionDescription || window.webkitRTCSessionDescription || window.mozRTCSessionDescription; window.RTCIceCandidate = window.RTCIceCandidate || window.webkitRTCIceCandidate || window.mozRTCIceCandidate; return !!window.RTCPeerConnection; }   function hasFileApi() { return window.File && window.FileReader && window.FileList && window.Blob; }   connectButton.addEventListener("click", function () { var theirUsername = theirUsernameInput.value;   if (theirUsername.length > 0) {    startPeerConnection(theirUsername); } });   function startPeerConnection(user) { connectedUser = user;   // Begin the offer yourConnection.createOffer(function (offer) {    send({      type: "offer",      offer: offer    });    yourConnection.setLocalDescription(offer); }, function (error) {    alert("An error has occurred."); }); };   function onOffer(offer, name) { connectedUser = name; yourConnection.setRemoteDescription(new RTCSessionDescription(offer));   yourConnection.createAnswer(function (answer) {    yourConnection.setLocalDescription(answer);      send({      type: "answer",      answer: answer    }); }, function (error) {    alert("An error has occurred"); }); };   function onAnswer(answer) { yourConnection.setRemoteDescription(new RTCSessionDescription(answer)); };   function onCandidate(candidate) { yourConnection.addIceCandidate(new RTCIceCandidate(candidate)); };   function onLeave() { connectedUser = null; yourConnection.close(); yourConnection.onicecandidate = null; setupPeerConnection(); }; We set up references to our elements on the screen as well as get the peer connection ready to be processed. When the user decides to log in, we send a login message to the server. The server will return with a success message telling the user they are logged in. From here, we allow the user to connect to another WebRTC user who is given their username. This sends offer and response, connecting the two users together through the peer connection. Once the peer connection is created, we connect the users through a data channel so that we can send arbitrary data across. Hopefully, this is pretty straightforward and you are able to get this code up and running in no time. It should all be familiar to you by now. This is the last time we are going to refer to this code, so get comfortable with it before moving on! Getting a reference to a file Now that we have a simple page up and running, we can start working on the file sharing part of the application. The first thing the user needs to do is select a file from their computer's filesystem. This is easily taken care of already by the input element on the page. The browser will allow the user to select a file from their computer and then save a reference to that file in the browser for later use. When the user presses the Send button, we want to get a reference to the file that the user has selected. To do this, you need to add an event listener, as shown in the following code: sendButton.addEventListener("click", function (event) { var files = document.querySelector('#files').files;   if (files.length > 0) {    dataChannelSend({      type: "start",      data: files[0]    });      sendFile(files[0]); } }); You might be surprised at how simple the code is to get this far! This is the amazing thing about working within a browser. Much of the hard work has already been done for you. Here, we will get a reference to our input element and the files that it has selected. 
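As a small aside, the files property we read here is a FileList, so it behaves like an array of File objects. If we ever wanted to allow more than one file, a sketch might look like the following; note that the multiple attribute on the input is an assumption for this sketch and is not part of the page we built above:

// Hypothetical variation: the input would need the multiple attribute, for
// example <input type="file" id="files" multiple />, which our page does not use.
var selectedFiles = document.querySelector('#files').files;

for (var i = 0; i < selectedFiles.length; i++) {
  // Each entry is a File object with metadata such as name and size in bytes.
  console.log('Selected', selectedFiles[i].name, selectedFiles[i].size);
}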
The input element supports both multiple and single selection of files, but in this example we will only work with one file at a time. We then make sure that we have a file to work with, tell the other user that we want to start sending data, and call our sendFile function, which we will implement later in this article. Now, you might think that the object we get back contains the entire contents of our file. What we actually get back from the input element is an object representing metadata about the file itself. Let's take a look at this metadata:

{
  lastModified: 1364868324000,
  lastModifiedDate: "2013-04-02T02:05:24.000Z",
  name: "example.gif",
  size: 1745559,
  type: "image/gif"
}

This gives us the information we need to tell the other user that we want to start sending a file named example.gif. It also gives a few other important details, such as the type of file we are sending and when it was last modified. The next step is to read the file's data and send it through the data channel. This is no easy task, however, and we will require some special logic to do so.

Summary

In this article, we covered the basics of using the File API and retrieving a file from a user's computer. We also discussed the page setup for the application using JavaScript and how to get a reference to a file.

Resources for Article:

Further resources on this subject:
WebRTC with SIP and IMS [article]
Using the WebRTC Data API [article]
Applications of WebRTC [article]