
The Essentials of Working with Python Collections

Packt
09 Jul 2015
14 min read
In this article by Steven F. Lott, the author of the book Python Essentials, we'll look at the break and continue statements; these modify a for or while loop to allow skipping items or exiting before the loop has processed all items. This is a fundamental change in the semantics of a collection-processing statement.

Processing collections with the for statement

The for statement is an extremely versatile way to process every item in a collection. We do this by defining a target variable, a source of items, and a suite of statements. The for statement iterates through the source of items, assigning each item to the target variable and executing the suite of statements. All of the collections in Python provide the necessary methods, which means that we can use anything as the source of items in a for statement.

Here's some sample data that we'll work with. This is part of Mike Keith's poem, Near a Raven. We'll remove the punctuation to make the text easier to work with:

>>> text = '''Poe, E.
...     Near a Raven
...
... Midnights so dreary, tired and weary.'''
>>> text = text.replace(",", "").replace(".", "").lower()

The first statement puts the original text, with its uppercase letters and punctuation, into the text variable; the second strips the commas and periods and folds everything to lowercase. When we use text.split(), we get a sequence of individual words. The for loop can iterate through this sequence of words so that we can process each one. The syntax looks like this:

>>> cadaeic = {}
>>> for word in text.split():
...     cadaeic[word] = len(word)

We've created an empty dictionary and assigned it to the cadaeic variable. The expression in the for loop, text.split(), will create a sequence of substrings. Each of these substrings will be assigned to the word variable. The for loop body, a single assignment statement, will be executed once for each value assigned to word. The resulting dictionary might look like this (irrespective of ordering):

{'raven': 5, 'midnights': 9, 'dreary': 6, 'e': 1, 'weary': 5, 'near': 4, 'a': 1, 'poe': 3, 'and': 3, 'so': 2, 'tired': 5}

There's no guaranteed order for mappings or sets, so your results may differ slightly.

In addition to iterating over a sequence, we can also iterate over the keys in a dictionary:

>>> for word in sorted(cadaeic):
...     print(word, cadaeic[word])

When we use sorted() on a tuple or a list, an interim list is created with sorted items. When we apply sorted() to a mapping, the sorting applies to the keys of the mapping, creating a sequence of sorted keys. This loop will print the various Pilish words used in this poem in alphabetical order. Pilish is a subset of English where the word lengths are important: they're used as mnemonic aids.

A for statement corresponds to the "for all" logical quantifier, ∀. At the end of a simple for loop we can assert that all items in the source collection have been processed. In order to build the "there exists" quantifier, ∃, we can either use the while statement, or the break statement inside the body of a for statement.

Using literal lists in a for statement

We can apply the for statement to a sequence of literal values. One of the most common ways to present literals is as a tuple. It might look like this:

for scheme in 'http', 'https', 'ftp':
    do_something(scheme)

This will assign three different values to the scheme variable. For each of those values, it will evaluate the do_something() function. From this, we can see that, strictly speaking, the () are not required to delimit a tuple object.
If the sequence of values grows, however, and we need to span more than one physical line, we'll want to add the (), making the tuple literal more explicit.

Using the range() and enumerate() functions

The range() object will provide a sequence of numbers, often used in a for loop. The range() object is iterable: rather than building a list up front, it produces its items when required. If we use range() outside a for statement, we need to use a function like list(range(x)) or tuple(range(a, b)) to consume all of the generated values and create a new sequence object. The range() object has three commonly used forms:

range(n) produces ascending numbers including 0 but not including n itself. This is a half-open interval: range(n) produces numbers, x, such that 0 <= x < n. The expression list(range(5)) returns [0, 1, 2, 3, 4]. This produces n values, from 0 through n - 1.

range(a, b) produces ascending numbers starting from a but not including b. The expression tuple(range(-1, 3)) will return (-1, 0, 1, 2). This produces b - a values, from a through b - 1.

range(x, y, z) produces ascending numbers in the sequence x, x + z, x + 2z, and so on, stopping before y. This produces roughly (y - x) // z values.

We can use the range() object like this:

for n in range(1, 21):
    status = str(n)
    if n % 5 == 0: status += " fizz"
    if n % 7 == 0: status += " buzz"
    print(status)

In this example, we've used a range() object to produce values, n, such that 1 <= n < 21.

We can use the range() object to generate the index values for all items in a list:

for n in range(len(some_list)):
    print(n, some_list[n])

We've used the range() function to generate values between 0 and the length of the sequence object named some_list.

The for statement allows multiple target variables. The rules for multiple target variables are the same as for a multiple-variable assignment statement: a sequence object will be decomposed and items assigned to each variable. Because of that, we can leverage the enumerate() function to iterate through a sequence and assign the index values at the same time. It looks like this:

for n, v in enumerate(some_list):
    print(n, v)

The enumerate() function is a generator function which iterates through the items in the source sequence and yields a sequence of two-tuples holding the index and the item. Since we've provided two variables, each two-tuple is decomposed and assigned to them. There are numerous use cases for this multiple-assignment for loop. We often have list-of-tuples data structures that can be handled very neatly with this multiple-assignment feature.
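As a small, hypothetical illustration of that list-of-tuples pattern (the words and counts below are invented for the example), plain decomposition and enumerate() can be combined:

word_counts = [("poe", 3), ("raven", 5), ("midnights", 9)]

# Each two-tuple is decomposed into the two target variables.
for word, length in word_counts:
    print(word, length)

# enumerate() adds an index; the nested tuple is decomposed in the target list.
for position, (word, length) in enumerate(word_counts, start=1):
    print(position, word, length)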
Iterating with the while statement

The while statement is a more general iteration than the for statement. We'll use a while loop in two situations. We'll use it in cases where we don't have a finite collection to impose an upper bound on the loop's iteration; we may suggest an upper bound in the while clause itself. We'll also use it when writing a "search" or "there exists" kind of loop; we aren't processing all items in a collection.

A desktop application that accepts input from a user, for example, will often have a while loop. The application runs until the user decides to quit; there's no upper bound on the number of user interactions. For this kind of potentially infinite iteration, a while True: loop is generally recommended.

If we want to write a character-mode user interface, we could do it like this:

quit_received = False
while not quit_received:
    command = input("prompt> ")
    quit_received = process(command)

This will iterate until the quit_received variable is set to True. This will process indefinitely; there's no upper boundary on the number of iterations. The process() function might use some kind of command processing. This should include a statement like this:

if command.lower().startswith("quit"):
    return True

When the user enters "quit", the process() function will return True. This will be assigned to the quit_received variable. The while expression, not quit_received, will become False, and the loop ends.

A "there exists" loop will iterate through a collection, stopping at the first item that meets certain criteria. This can look complex because we're forced to make two details of loop processing explicit. Here's an example of searching for the first value that meets a condition. This example assumes that we have a function, condition(), which will eventually be True for some number. Here's how we can use a while statement to locate the minimum value for which this function is True:

>>> n = 1
>>> while n != 101 and not condition(n):
...     n += 1
>>> assert n == 101 or condition(n)

The while statement will terminate when n == 101 or condition(n) is True. As long as the condition is False, we advance the n variable to the next value in the sequence. Since we're iterating through the values in order from the smallest to the largest, we know that n will be the smallest value for which the condition() function is true. At the end of the while statement we have included a formal assertion that either n is 101 or the condition() function is True for the given value of n. Writing an assertion like this can help in design as well as debugging because it will often summarize the loop invariant condition.
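The condition() function is left abstract above; as a minimal, runnable stand-in, the sketch below uses a hypothetical predicate (the first n whose square exceeds 1,000, a threshold invented purely for illustration) so the loop and the assertion can be exercised end to end:

def condition(n):
    # Hypothetical predicate, invented for illustration: True once n * n passes 1,000.
    return n * n > 1000

n = 1
while n != 101 and not condition(n):
    n += 1
assert n == 101 or condition(n)
print(n)  # prints 32, the smallest n with n * n > 1000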
We can also write this kind of loop using the break statement in a for loop, something we'll look at in the next section.

The continue and break statements

The continue statement is helpful for skipping items without writing deeply nested if statements. The effect of executing a continue statement is to skip the rest of the loop's suite. In a for loop, this means that the next item will be taken from the source iterable. In a while loop, it must be used carefully to avoid an otherwise infinite iteration. We might see file processing that looks like this:

for line in some_file:
    clean = line.strip()
    if len(clean) == 0:
        continue
    data, _, _ = clean.partition("#")
    data = data.rstrip()
    if len(data) == 0:
        continue
    process(data)

In this loop, we're relying on the way files act like sequences of individual lines. For each line in the file, we've stripped whitespace from the input line and assigned the resulting string to the clean variable. If the length of this string is zero, the line was entirely whitespace, and we'll continue the loop with the next line. The continue statement skips the remaining statements in the body of the loop.

We'll partition the line into three pieces: a portion in front of any "#", the "#" itself (if present), and the portion after the "#". We've assigned the "#" character and any text after it to the same easily ignored variable, _, because we don't have any use for these two results of the partition() method. We can then strip any trailing whitespace from the string assigned to the data variable. If the resulting string has a length of zero, then the line is entirely filled with "#" and any trailing comment text. Since there's no useful data, we can continue the loop, ignoring this line of input. If the line passes the two if conditions, we can process the resulting data. By using the continue statement, we have avoided complex-looking, deeply nested if statements.

It's important to note that a continue statement must always be part of the suite inside an if statement, inside a for or while loop. The condition on that if statement becomes a filter condition that applies to the collection of data being processed. continue always applies to the innermost loop.

Breaking early from a loop

The break statement is a profound change in the semantics of the loop. An ordinary for statement can be summarized by "for all": we can comfortably say that "for all items in a collection, the suite of statements was processed." When we use a break statement, a loop is no longer summarized by "for all." We need to change our perspective to "there exists." A break statement asserts that at least one item in the collection matches the condition that leads to the execution of the break statement. Here's a simple example of a break statement:

for n in range(1, 100):
    factors = []
    for x in range(1, n):
        if n % x == 0:
            factors.append(x)
    if sum(factors) == n:
        break

We've written a loop that is bound by 1 <= n < 100. This loop includes a break statement, so it will not process all values of n. Instead, it will determine the smallest value of n for which n is equal to the sum of its factors. Since the loop doesn't examine all values, it shows that at least one such number exists within the given range. We've used a nested loop to determine the factors of the number n. This nested loop creates a sequence, factors, of all values of x in the range 1 <= x < n such that x is a factor of the number n. The inner loop doesn't have a break statement, so we are sure it examines all values in the given range. The least value for which this is true is the number six.

It's important to note that a break statement must always be part of the suite inside an if statement inside a for or while loop. If the break isn't in an if suite, the loop will always terminate while processing the first item. The condition on that if statement becomes the "there exists" condition that summarizes the loop as a whole. Clearly, multiple if statements with multiple break statements mean that the overall loop can have a potentially confusing and difficult-to-summarize post-condition.

Using the else clause on a loop

Python's else clause can be used on a for or while statement as well as on an if statement. The else clause executes after the loop body if no break statement was executed. To see this, here's a contrived example:

>>> for item in 1, 2, 3:
...     print(item)
...     if item == 2:
...         print("Found", item)
...         break
... else:
...     print("Found Nothing")

The for statement here will iterate over a short list of literal values. When a specific target value has been found, a message is printed. Then, the break statement ends the loop, avoiding the else clause. When we run this, we'll see three lines of output, like this:

1
2
Found 2

The value of three isn't shown, nor is the "Found Nothing" message in the else clause. If we change the target value in the if statement from two to a value that won't be seen (for example, zero or four), then the output will change. If the break statement is not executed, then the else clause will be executed. The idea here is to allow us to write contrasting break and non-break suites of statements. An if statement suite that includes a break statement can do some processing in the suite before the break statement ends the loop. An else clause allows some processing at the end of the loop when none of the break-related suites were executed.
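To see the contrasting non-break path, here is the same contrived loop with the target changed to four, one of the values the text suggests trying, which never appears in the sequence:

for item in 1, 2, 3:
    print(item)
    if item == 4:            # never true, so break is never reached
        print("Found", item)
        break
else:
    print("Found Nothing")   # runs because the loop finished without a break

This prints 1, 2, 3, and then Found Nothing.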
Summary

In this article, we've looked at the for statement, which is the primary way we'll process the individual items in a collection. A simple for statement assures us that our processing has been done for all items in the collection. We've also looked at the general-purpose while loop.

Resources for Article:

Further resources on this subject:
Introspecting Maya, Python, and PyMEL [article]
Analyzing a Complex Dataset [article]
Geo-Spatial Data in Python: Working with Geometry [article]

Process of designing XenDesktop® deployments

Packt
09 Jul 2015
29 min read
In this article by Govardhan Gunnala and Daniele Tosatto, authors of the book Mastering Citrix® XenDesktop®, we will discuss the process of designing XenDesktop® deployments.

The uniqueness of the XenDesktop architecture is its modular five-layer model, which covers all the key decisions in designing a XenDesktop deployment:

- User layer: Defines the users and their requirements
- Access layer: Defines how the users will access the resources
- Desktop/resource layer: Defines what resources will be delivered
- Control layer: Defines managing and maintaining the solution
- Hardware layer: Defines what resources are needed for implementing the chosen solution

While FMA is simple at a high level, its implementation can become complex depending on the technologies and options that are chosen for each component across the layers of FMA. Along with great flexibility comes the responsibility of diligently choosing the technologies and options that fulfill your business requirements. Importantly, the decisions made in the first three layers impact the last two layers of the deployment. This means that fixing a wrong decision anywhere in the first three layers during or after the implementation stage leaves little or no room for correction, and may even force you to implement the solution from scratch again. Your design decisions determine how effectively your solution meets the given business requirements.

The layered architecture of the XenDesktop FMA features the components at each layer; each component of XenDesktop falls under one of these layers. We'll see what decisions are to be made for each of these components at each layer in the next subsection.

Decisions to be made at each layer

I would have to write a separate book to discuss all the possible technologies and options that are available at each layer. The following is a highly summarized list of the decisions to be made at each layer. It will help you realize the breadth of designing XenDesktop, and this high-level coverage of the various options will help you locate and consider all the possible options for making the right decisions without missing any considerations.

User layer

The user layer refers to the specification of the users who will utilize the XenDesktop deployment. A business requirement statement may mention that the service users can be either internal business users or external customers accessing the service from the Internet. Furthermore, both of these user groups may also need mobile access to the XenDesktop services.

Citrix Receiver is the only component that belongs to the user layer, and XenDesktop depends on it for successfully delivering a XenDesktop session. By correlating this technical aspect with the preceding business requirement statement, one needs to consider all the possible aspects of the Receiver software on the client devices. This involves making the following decisions:

Endpoint/user devices: What are the devices from which the users are supposed to access the services? Who owns and administers those devices throughout their lifecycle?
- Endpoints supported: Corporate computers, laptops, or mobiles running on Windows, or thin clients; user smart devices, such as Android tablets, Apple iPads, and so on. In the case of service providers, the endpoints can usually be any device, and they all need to be supported.
- Endpoint ownership: Device management includes security, availability, and compliance, and maintaining responsibility for the devices on the network.
- Endpoint lifecycle: Devices become outdated or limited very quickly. Define the minimum device hardware requirements to run your business workloads.
- Endpoint form factor: Choose devices that may be fully featured, limited thin clients, or a mix of both, to support features such as HDX graphics, multi-monitor setups, and so on.
- Thin client selection: Decide whether thin clients, such as Dell Wyse zero clients, running on limited-functionality operating systems would satisfy your user requirements, and understand their licensing cost.

Receiver selection: Once you determine your endpoint device and its capabilities, you need to decide on the Receiver that can run on the devices. The greatest thing is that Receiver is available for almost any device.
- Receiver type: Choose the Receiver that is required for your device. Since the Receiver software for each platform (OS) differs, it is important to use the appropriate Receiver software for your devices, considering the platform it runs on. You can download the appropriate Receiver software for your device from the http://www.Citrix.com/go/receiver.html page.
- Initial deployment: Receiver is like any other software that will fit into your overall application portfolio. Determine how you would deploy this application on your devices. For corporate desktops and mobiles, you may use enterprise application deployment and mobile device management software. Otherwise, the users will be prompted to install it when they access the StoreFront URL, or they can download it from Citrix to facilitate the installation process. For user-managed mobile devices, it is available from the respective Windows, Google, or Apple stores/marketplaces.
- Initial configuration: Similar to other applications, Receiver requires certain initial configuration. It can be configured manually, or by using a provisioning file, group policy, or e-mail-based discovery.
- Keeping the Receiver software up to date: Once you have installed Receiver on user devices, you will also require a mechanism for deploying updates to Receiver. This can be the same mechanism used for the initial deployment.

Access layer

The access layer refers to the specification of how the users gain access to the resources. A business requirement statement may state that the users should be validated before gaining access, and that the access should be secured when the user connects over the Internet.

The technical components that fall under this layer include firewall(s), NetScaler, and StoreFront. These components play a broader role in the overall networking infrastructure of the company, which also includes XenDesktop as well as the complete Citrix solutions in the environment. Their key activities include firewalling, external-to-internal IP address NATing, NetScaler Gateway for securing the connection between the virtual desktop and the user device, global load balancing, user validation/authentication, and GUI presentation of the enumerated resources to the end users. It involves making the following decisions:

Authentication:
- Authentication point: A user can be authenticated at the NetScaler Gateway or at StoreFront.
- Authentication policy: Various business use cases and compliance requirements make certain modes of authentication mandatory.
You can choose from the different authentication methods supported at:
- StoreFront: Basic authentication using a username and password, Domain pass-through, NetScaler Gateway pass-through, smart card, and even unauthenticated access.
- NetScaler Gateway: LDAP, RADIUS (token), and client certificates.

StoreFront: The decisions to be made around StoreFront are as follows:
- Unauthenticated access: Provides access to users who don't supply a username and a password but are still able to access the resources allowed by the administrator. Usually, this fits well with public or kiosk systems.
- High availability: Making the StoreFront servers available at all times, through hardware load balancing, DNS round robin, Windows network load balancing, and so on.
- Delivery controller high availability and StoreFront: Building high availability for the delivery controllers is recommended, since they are needed for forming successful connections. Defining more than one delivery controller for the stores makes StoreFront fail over automatically to the next server in the list.
- Security, inbound traffic: Consider securing the user connection to virtual desktops, from the internal StoreFront and the external NetScaler Gateway.
- Security, backend traffic: Consider securing the communication between StoreFront and the XML services running on the controller servers. As this traffic stays within the internal network, it can be secured using an internal private certificate.
- Routing Receiver with beacons: Receiver uses websites called beacons to identify whether the user connection is internal or external. StoreFront provides Receiver with the http(s) addresses of the beacon points during the initial connection.
- Resource presentation: StoreFront presents a webpage that provides self-service access to the resources by the user.
- Scalability: The StoreFront server load and capacity for the user workload.
- Multi-site app synchronization: StoreFront can connect to the controllers of multiple site deployments and can replicate the user-subscribed applications across the servers.

NetScaler Gateway: For the NetScaler Gateway, the decisions regarding secured external user access from the public Internet involve the following:
- Topology: NetScaler supports two topologies: 1-arm (normal security) and 2-arm (high security).
- High availability: The NetScaler Gateways can be configured in pairs to provide high availability.
- Platform: NetScaler is available on different platforms, such as VPX, MPX, and SDX, which have different SSL throughput and SSL Transactions Per Second (TPS) metrics.
- Pre-authentication policy: Specifies the Endpoint Analysis (EPA) scans for evaluating whether the endpoints meet the preset security criteria. This is available when NetScaler is chosen as the authentication point.
- Session management: The session policies define the overall user experience by classifying the endpoints into mobile and non-mobile devices. The session profile defines the details needed for gaining access to the environment; it comes in two forms: SSLVPN and HDX proxy.
- Preferred data center: In multi-active data center deployments, StoreFront can determine a user's primary data center for resources, and NetScaler can direct the user connections to it. Static and dynamic methods are used for specifying the preferred data center.

Desktop/resource layer

The desktop or resource layer refers to the specification of which resources (applications and desktops) users will receive.
This layer comes with various options, which are tailored to business user roles and their requirements. This layer makes XenDesktop a better fit for achieving the varying user needs across departments. It includes the specification of the FlexCast model (type of desktop), user personalization, and delivery of applications to the users in the desktop session.

An example business requirement statement may specify that all permanent employees require a desktop with all the basic applications pre-installed based on their team and role, with their user settings and data retained; and that all contract employees should get a basic desktop with controlled, on-demand access to applications and no retention of their user data.

This layer includes various components, such as profile management solutions (including Windows profiles, Citrix Profile Management, and AppSense), Citrix print server, Windows operating systems, application delivery, and so on. It involves making decisions such as:

Images: This involves choosing the FlexCast model that is tailored to the user requirements, thereby delivering the expected desktop behavior to the end users:
- Operating system: This requires choosing the desktop or server operating system for your master image, which depends on the FlexCast model you are choosing from:
  - Hosted shared
  - Hosted VDI: pooled-static, pooled-random, pooled with PvD, dedicated, existing, physical/Remote PC, streamed, and streamed with PvD
  - Streamed VHD
  - Local VM
  - On-demand apps
  - Local apps
  In the case of a desktop OS, it's also important to choose the right OS architecture (32-bit or 64-bit) according to the processor architecture of the desktop.
- Computer policies: Define the controls over the user connection, security and bandwidth settings, devices or connection types, and so on. Specify all the policy features similarly to the user policies.
- Machine catalogs: Define your catalog settings, including the FlexCast model, AD computer accounts, provisioning method, OS of the base image, and so on.
- Delivery groups: Assign desktops or applications to the user groups.
- Application folders: This is a tidy interface feature in Studio for organizing the applications into folders for easy management.
- StoreFront integration: This is an option for specifying the StoreFront URL for the Receiver in the master image so that users are automatically connected to the StoreFront in the session.
- Resource allocation: This defines the hardware resources for the desktop VMs. It primarily involves hosts and storage. Depending on your estimated workloads, you can define resources such as the number of virtual processors (vCPU), the amount of virtual memory (vRAM), and the storage required for the needed disk space, along with the following:
  - Graphics (GPU): For advanced use cases, you may choose to allocate pass-through GPUs, hardware vGPUs, or software vGPUs.
  - IOPS: Depending on the operating system, the FlexCast model, and the estimated workloads, you can analyze the overall IOPS load of the system and plan the corresponding hardware to support that load.
- Optimizations: Depending on the operating system, you can apply various optimizations to the Windows installation on the master image. This greatly reduces the overall load later.
- Bandwidth requirements: Bandwidth can be a limiting factor for WAN and remote user connections over slow networks.
Bandwidth consumption and user experience depend on various factors, such as the operating system being used, the application design, and the screen resolution. To retain a high-quality user experience, it's important to consider the bandwidth requirements and optimization technologies, as follows:
- Bandwidth-minimizing technologies: These include Quality of Service (QoS), HDX RealTime, and WAN optimization with Citrix's own CloudBridge solution.
- HDX encoding method: The HDX encoding method also affects the bandwidth usage. For XenDesktop 7.x, there are three encoding methods available, employed by the HDX protocol as appropriate: Desktop Composition Redirection, H.264 Enhanced SuperCodec, and Legacy Mode (XenDesktop 5.x Adaptive Display).
- Session bandwidth: The bandwidth needed in a session depends on the user's interaction with the desktop and applications.
- Latency: HDX can typically perform well up to 300 ms of latency; the experience begins to degrade as latency increases.

Personalization: This is an essential element of the desktop environment. It involves decisions that are critical for end user experience and acceptance, and for the overall success of the solution during implementation. The following decisions are involved in personalization:

User profiles: This involves the decisions related to user login, roaming of user settings, and a seamless profile experience across the overall Windows network:
- Profile type: Choose which profile type works for your user requirements. Possible options include local, roaming, mandatory, and hybrid profiles with Citrix Profile Management. Citrix Profile Management provides various additional features, such as profile streaming, active write back, configuring profiles using an .ini file, and so on.
- Folder redirection: This option covers the special folders, such as AppData, Desktop, and so on, in which the user's application settings are saved.
- Folder exclusion: This option sets which folders are excluded from being saved in the user profile. Usually, it covers the local and IE temp folders of a user profile.
- Profile caching: Caching profiles on the local system improves the user login experience and occurs by default. You need to consider this depending on the type of virtual desktop FlexCast model.
- Profile permissions: Specify whether administrators need access to the user profiles, based on information sensitivity.
- Profile path: The decision to place the user profiles on a network location for high availability. It affects logon performance depending on how close the profile is to the virtual desktop from which the user is logging on. It can be managed either from Active Directory or through Citrix Profile Management.
- User profile replication between data centers: This involves making the user profiles highly available and supporting profile roaming among multiple data centers.

User policies: This involves decisions about deploying user settings and controlling them using management policies, providing consistent settings for users:
- Preferred policy engine: This requires choosing how policies are processed for the Windows systems. Citrix policies can be defined and managed from either Citrix Studio or Active Directory group policy.
- Policy filtering: Citrix policies can be applied to users and their desktops with the various filter options available in the Citrix policy engine.
If group policies are used, you'll use the group policy filtering options.
- Policy precedence: The Citrix policies are processed in the order LCSDOU (Local, Citrix, Site, Domain, OU policies).
- Baseline policy: This defines the policy with default and common settings for all desktop images. Citrix provides policy templates that suit specific business use cases. A baseline should cover security requirements, common network conditions, and management of the user device or user profile requirements. Such a baseline can be configured using security policies, connection-based policies, device-based policies, and profile-based policies.

Printing: This is one of the most common desktop user requirements. XenDesktop supports printing for various scenarios. The printing technology involves deploying and using the appropriate drivers:
- Provisioning printers: These can be either a static or a dynamic set of printers. The options for dynamic printers are to auto-create all the client printers or to auto-create the non-network client printers only. You can also set options for session printers through Citrix policy, which can include either static or dynamic printers; furthermore, you can set proximity printers.
- Managing print drivers: This can be configured so that printer drivers are auto-installed during session creation, installed using the generic Citrix universal printer driver, or installed manually. You can also have all the known drivers preinstalled on the master image. Citrix even provides the Citrix Universal Print Server, which extends XenDesktop universal printing support to network printing.
- Print job routing: Jobs can be routed through the client device or through the network print server. The ICA protocol is used for compressing and sending the data.

Personal vDisk: Desktops with personal vDisks retain the user changes. Choosing a personal vDisk depends on the user requirements and the FlexCast model that was opted for. A personal vDisk can be thin provisioned for estimated growth, but it can't be shrunk later.

Applications: Separating applications into their own layer improves the scalability of the overall desktop solution. Applications are critical elements that users require from a desktop environment:
- Application delivery method: Applications can be installed on the base image, installed on the personal vDisks, streamed into the session, or delivered through the on-demand XenApp hosted mode. The choice also depends on application compatibility, and resolving compatibility issues effectively requires technical expertise and tools such as AppDNA.
- Application streaming: XenDesktop supports App-V to build isolated application packages, which can be streamed to desktops.
- 16-bit legacy application delivery: If any legacy 16-bit applications have to be supported, you can choose from a 32-bit OS, a VM-hosted app, or a parallel XenApp 5 deployment.

Control layer

The control layer covers all the backend systems that are required for managing and maintaining the overall solution through its life cycle. The control layer includes most of the XenDesktop components, which are further classified into categories: resource/access controllers, image/desktop controllers, and infrastructure controllers.
These correspond, respectively, to the first three layers of FMA, as shown here:
- Resource/access controllers: Support the access layer
- Image/desktop controllers: Support the desktop/resource layer
- Infrastructure controllers: Provide the underlying infrastructure for the overall FMA components/environment

This layer involves the specification of the capacity, configuration, and topology of the environment. Building the required/planned redundancy for each of these components enables enterprise capabilities such as HA, scalability, disaster recovery, load balancing, and so on. Components and technologies that operate under this layer include Active Directory, group policies, the site database, Citrix licensing, XenDesktop delivery controllers, the XenClient hypervisor, the Windows Server and desktop operating systems, provisioning services (either MCS or PVS) and their controllers, and so on.

An example business requirement statement may be as follows: build a highly available desktop environment for a fast-growing business user group; we currently have a head count of 30 users, which is expected to double in a year. It involves making the following decisions:

Infrastructure controllers: This includes the common infrastructure that is required for XenDesktop to function in a Windows domain network.

Active Directory: This is used for the authentication and authorization of users in a Citrix environment. It's also responsible for providing and synchronizing time on the systems, which is critical for Kerberos. For the most part, your AD structure will already be in place, and it may require certain changes to accommodate your XenDesktop requirements, such as:
- Forest design: This involves the AD forest and domain decisions, such as multi-domain, multi-forest, and domain and forest trusts, which will define the users of the XenDesktop resources.
- Site design: This involves choosing the number of sites that represent your geographical locations, the number of domain controllers, the subnets that accommodate the IP addresses, the site links for replication, and so on.
- Organizational unit structure: Planning the OU structure for easier management of XenDesktop workers and VDAs. In multi-forest deployment scenarios (as supported in App Orchestration), having the same OU structure is critical.
- Naming standards: Planning proper conventions for XenDesktop AD objects, which include users, security groups, XenDesktop servers, OUs, and so on.
- User groups: This helps in choosing between individual user names and groups. User security groups are recommended, as they reduce validation to just one object regardless of the number of users in it.
- Policy control: This helps in planning GPO ordering and sizing, inheritance, filtering, enforcement, blocking, and loopback processing for reducing the overall processing time on the VDAs and servers.

Database: Citrix uses the Microsoft SQL Server database for most of its products:
- Edition: Microsoft ships SQL Server in different editions, which provide varying features and capabilities. Using the Standard edition is recommended for typical XenDesktop production deployments. For larger/enterprise deployments, a higher edition may be required, depending on the requirements.
- Database and transaction log sizing: This involves estimating the storage requirements for the site configuration database, the monitoring database, and the configuration logging database.
- Database location: By default, the configuration logging and monitoring databases are located within the site configuration database. Separating these into their own databases and relocating the monitoring database to a different SQL server is recommended.
- High availability: Choose from VM-level HA, mirroring, an AlwaysOn failover cluster, or AlwaysOn availability groups.
- Database creation: Usually, the databases are created automatically during the XenDesktop installation. Alternatively, they can be created by using scripts.

Citrix licensing: Citrix licensing for XenDesktop requires a Citrix license server on the network. You can install and manage multiple Citrix licenses:
- License type: Choose from user, device, and concurrent licenses.
- Version: Citrix's new license servers are backward compatible.
- Sizing: A license server can be scaled out to support a higher number of license requests per second.
- High availability: The license server comes with a 30-day grace period, which usually helps in recovering from failures. High availability for the license server can be implemented through Windows clustering technology or duplication of the virtual server.
- Optimization: Optimize the number of receiving and processing threads depending on your hardware. This is generally required in large and heavily loaded enterprise environments.

Resource controllers: The resource controllers include the XenDesktop and XenApp delivery controllers and the XenClient synchronizer, as shown here:

XenDesktop and XenApp delivery controller:
- Number of sites: This is decided based on the network, risk tolerance, and security requirements.
- Delivery controller sizing: Delivery controller scalability is based on CPU utilization. The more processor cores are available, the more virtual desktops a controller can support.
- High availability: Always plan for an N+1 deployment of the controllers to achieve HA. Then, update the controller details on the VDAs through policy.
- Host connection configuration: Host connections define the hosts, storage repositories, and guest networks to be used by the virtual machines on the hypervisors.
- XML service encryption: The XML service protocol running on the delivery controllers uses clear text for exchanging all data except passwords. Consider using SSL encryption to send the StoreFront data over a secure HTTP connection.
- Server OS load management: The default maximum number of sessions per server is set to 250. Using real-time usage monitoring and load analysis, you can define appropriate load management policies.
- Session prelaunch and session linger: These are designed to help users access applications quickly, by starting sessions before they are requested (session prelaunch) and by keeping user sessions active after a user closes all the applications in a session (session linger).

XenClient synchronizer: This includes considerations for its architecture, processor specification, memory specification, network specification, high availability, the SQL database, remote synchronizer servers, storage repository size and location, external access, and Active Directory integration.

Image controllers: This includes all the image provisioning controllers. MCS is built into the delivery controller. For PVS, we'll have considerations such as the following:
- Farms: A farm represents the top level of the PVS infrastructure. Depending on your networking and administration boundaries, you can define the number of farms to be deployed in your environment.
- Sites: Each farm consists of one or more sites, which contain all the PVS objects. While multiple sites share the same database, the target devices can only fail over to other Provisioning Servers within the same site. Your networking and organization structure determines the number of sites in your deployment.
- High availability: If implemented, PVS will be a critical component of the virtual desktop infrastructure. HA should be considered for its database, the PVS servers, vDisks and storage, networking and TFTP, and so on.
- Bootstrap delivery: There are three methods by which the target device can receive the bootstrap program: DHCP options, PXE broadcasts, and the boot device manager.
- Write cache placement: The write cache uniquely identifies the target device by including the target device's MAC address and disk identifier. The write cache can be placed in the following locations: cache on the device hard drive, cache on the device hard drive persisted, cache in the device RAM, cache in the device RAM with overflow on the hard disk, cache on the Provisioning Server disk, and cache on the Provisioning Server disk persisted.
- vDisk format and replication: PVS supports the use of fixed-size or dynamic vDisks. vDisks hosted on SAN, local, or direct attached storage must be replicated between the vDisk stores whenever a vDisk is created or changed. They can be replicated either manually or automatically.
- Virtual or physical servers, processor and memory: Virtual Provisioning Servers are preferred when sufficient processor, memory, disk, and networking resources are guaranteed.
- Scale up or scale out: Determining whether to scale up or scale out the servers requires considering factors like redundancy, failover times, data center capacity, hardware costs, hosting costs, and so on.
- Bandwidth requirements and network configuration: PVS can boot 500 devices simultaneously. A 10 Gbps network is recommended for provisioning services. The network configuration should consider the PVS uplink, the hypervisor uplink, and the VM uplink. Recommended switch settings include Disable Spanning Tree or Enable PortFast, storm control, and broadcast helper.
- Network interfaces: Teaming multiple network interfaces with link aggregation can provide greater throughput. Consider the NIC features TCP offloading and Receive Side Scaling (RSS) while selecting NICs.
- Subnet affinity: This is a load balancing algorithm that helps ensure that the target devices are connected to the most appropriate Provisioning Server. It can be configured as Best Effort or Fixed.
- Auditing: By default, the auditing feature is disabled. When enabled, the audit trail information is written to the provisioning services database along with the general configuration data.
- Antivirus: Antivirus software can cause file-locking issues on the PVS server by contending for the files being accessed by the PVS services. The vDisk store and the write cache should be excluded from antivirus scans in order to prevent file contention issues.

Hardware layer

The hardware layer involves choosing the right capacity, make, and hardware features of the backend systems that are required for the overall solution as defined in the control layer. In line with the control layer, the hardware layer decisions will change if any of the first three layers' decisions change.
Components and technologies that operate under this layer include server hardware, storage technologies, hard disks and RAID configurations, hypervisors and their management software, backup solutions, monitoring, network devices and connectivity, and so on. It involves making the decisions shown here:

Hardware sizing: Hardware sizing is usually done in one of two ways. The first, and preferred, way is to plan ahead and purchase the hardware based on the workload requirements. The second way is to size the hosts so that the existing hardware is used in the best configuration to support the different workload requirements:
- Workload separation: Workloads can either be separated into dedicated resource clusters or be mixed on the same physical hosts.
- Control host sizing: The VM resource allocation for each control component should be determined in the control layer and allocated accordingly.
- Desktop host sizing: This involves choosing the physical resources required for the virtual desktops as well as the hosted server deployments. It includes estimating the pCPU, pRAM, GPU, and the number of hosts (a rough sizing sketch follows at the end of this section).

Hypervisors: This involves choosing from the supported hypervisors, which include the major players Hyper-V, XenServer, and ESX. Choosing between these requires considering a wide range of parameters, such as host hardware (processor and memory), storage requirements, network requirements, scale up/out, and host scalability. Further considerations include the following:
- Networking: Networks, physical NICs, NIC teaming, virtual NICs for hosts, virtual NICs for guests, and IP addressing
- VM provisioning: Templates
- High availability: Microsoft Hyper-V: failover clustering, cluster shared volumes, and CSV cache; VMware ESXi: VMware vSphere high availability cluster; Citrix XenServer: XenServer high availability using the server pool
- Monitoring: Use the hypervisor vendor's management and monitoring tools for hypervisor monitoring, and the hardware vendor's monitoring tools for hardware-level monitoring
- Backup and recovery: The backup method and the components to be backed up
- Storage: Storage architecture, RAID level, number of disks, disk type, storage bandwidth, tiered storage, thin provisioning, and data de-duplication
- Disaster recovery

Data center utilization: XenDesktop deployments can leverage multiple data centers for improving user performance and the availability of resources. Multiple data centers can be deployed in an active/active or an active/passive configuration. An active/active configuration allows both data centers to be utilized, although individual users are tied to a specific location.
- Data center connectivity: An active/active data center configuration utilizing GSLB (Global Server Load Balancing) ensures that users will be able to establish a connection even if one data center is unavailable. In the active/active configuration, the considerations that should be made are data center failover time, application servers, and StoreFront optimal routing.
- Capacity in the secondary data center: Planning the secondary data center capacity is determined by the cost and by the decision to support full capacity in each data center. A percentage of the overall users, or a percentage of the users per application, may be considered for the secondary data center facility. It also needs consideration of the type and amount of resources that will be made available in a failover scenario.
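As a rough illustration of the desktop host sizing arithmetic mentioned above, the short Python sketch below estimates the number of hosts needed from a handful of inputs. Every number in it (user count, vCPU and vRAM per desktop, overcommit ratio, cores and RAM per host) is a hypothetical placeholder, not a Citrix recommendation; real sizing should come from workload assessment and the vendor handbooks.

# Hypothetical desktop host sizing estimate; every input below is a placeholder.
import math

users = 500                 # concurrent virtual desktops to host
vcpu_per_desktop = 2        # vCPUs assigned to each desktop VM
ram_per_desktop_gb = 4      # vRAM assigned to each desktop VM
overcommit_ratio = 4        # vCPUs allowed per physical core
cores_per_host = 32         # physical cores in one host
ram_per_host_gb = 512       # usable RAM in one host

desktops_by_cpu = (cores_per_host * overcommit_ratio) // vcpu_per_desktop
desktops_by_ram = ram_per_host_gb // ram_per_desktop_gb
desktops_per_host = min(desktops_by_cpu, desktops_by_ram)

hosts = math.ceil(users / desktops_per_host) + 1  # +1 host for N+1 redundancy
print(f"{desktops_per_host} desktops per host, {hosts} hosts (N+1)")

With these placeholder inputs, the CPU budget is the limiting factor (64 desktops per host), giving 9 hosts including the N+1 spare.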
Tools for designing XenDesktop®

In the previous section, we saw a broad list of the components, technologies, and configuration options involved in the process of designing a XenDesktop deployment. Obviously, designing a XenDesktop deployment for large, advanced, and complex business scenarios is a mammoth task, which requires operational knowledge of a broad range of technologies. Understanding the maze of this complexity, Citrix constantly helps customers with great learning resources: handbooks, reviewer guides, blueprints, online eDocs, and training sessions. To ease the life of technical architects and XenDesktop design and deployment consultants, Citrix has developed an online design portal called Project Accelerator, which automates, streamlines, and covers all the broad aspects involved in a XenDesktop deployment.

Project Accelerator

Citrix designed Project Accelerator as a web-based design tool, and it is available to customers after they log in. Its design is based on the Citrix Consulting best practices for XenDesktop deployment and implementation. It follows the layered FMA and allows you to create a close-to-deployment architecture. It covers all the key decisions and facilitates modifying them and evaluating their impact on the overall architecture. Upon completion of the design, it generates an architectural diagram and a deployment sizing plan. One can define more than one project and customize them in parallel to achieve multiple deployment plans. I highly recommend starting your production XenDesktop deployment with the Project Accelerator architecture and sizing design.

Virtual Desktop Handbook

Citrix provides the handbook along with new XenDesktop releases. The handbook covers the latest features of that XenDesktop version and provides detailed information on the design decisions. It provides all the possible options for each of the decisions involved, and these options are evaluated and validated in depth by the Citrix Solutions Lab. They include the leading Citrix Consulting best practices as well. This helps architects and engineers to consider the recommended technologies, and then evaluate them further for fulfilling the business requirements. The Virtual Desktop Handbook for the latest version of XenDesktop, that is, 7.x, can be found at http://support.Citrix.com/article/CTX139331.

XenDesktop® Reviewer's Guide

The Reviewer's Guide is also released along with new versions of XenDesktop. It is designed to help businesses quickly install and configure XenDesktop for evaluation. It provides a step-by-step screencast of the installation and configuration wizards of XenDesktop, giving IT administrators practical guidance for successfully installing and delivering XenDesktop sessions. The XenDesktop Reviewer's Guide for the latest version of XenDesktop, that is, 7.6, can be found at https://www.citrix.com/content/dam/citrix/en_us/documents/products-solutions/xendesktop-reviewers-guide.pdf.

Summary

We learned about the decision making involved in designing XenDesktop in general, and we also saw the deployment designs of complex environments involving cloud capabilities. We also saw the different tools for designing XenDesktop.
Resources for Article:

Further resources on this subject:
Understanding Citrix® Provisioning Services 7.0 [article]
Installation and Deployment of Citrix Systems®' CPSM [article]
Designing, Sizing, Building, and Configuring Citrix VDI-in-a-Box [article]

Clustering and Other Unsupervised Learning Methods

Packt
09 Jul 2015
19 min read
In this article by Ferran Garcia Pagans, author of the book Predictive Analytics Using Rattle and Qlik Sense, we will learn about the following:
- Defining machine learning
- Introducing unsupervised and supervised methods
- Focusing on K-means, a classic machine learning algorithm, in detail

We'll create clusters of customers based on their annual money spent. This will give us a new insight: being able to group our customers based on their annual spend will allow us to see the profitability of each customer group and deliver more profitable marketing campaigns or create tailored discounts. Finally, we'll see hierarchical clustering, different clustering methods, and association rules. Association rules are generally used for market basket analysis.

Machine learning – unsupervised and supervised learning

Machine Learning (ML) is a set of techniques and algorithms that gives computers the ability to learn. These techniques are generic and can be used in various fields. Data mining uses ML techniques to create insights and predictions from data. In data mining, we usually divide ML methods into two main groups: supervised learning and unsupervised learning. A computer can learn with the help of a teacher (supervised learning) or can discover new knowledge without the assistance of a teacher (unsupervised learning).

In supervised learning, the learner is trained with a set of examples (a dataset) that contains the right answer; we call it the training dataset. We call a dataset that contains the answers a labeled dataset, because each observation is labeled with its answer. In supervised learning, you are supervising the computer by giving it the right answers. For example, a bank can try to predict a borrower's chance of defaulting on credit loans based on the experience of past credit loans. The training dataset would contain data from past credit loans, including whether the borrower was a defaulter or not.

In unsupervised learning, our dataset doesn't have the right answers, and the learner tries to discover hidden patterns in the data. We call it unsupervised learning because we're not supervising the computer by giving it the right answers. A classic example is trying to create a classification of customers: the model tries to discover similarities between customers. In some machine learning problems, we don't have a dataset that contains past observations. These datasets are not labeled with the correct answers, and we call them unlabeled datasets.

In traditional data mining, the terms descriptive analytics and predictive analytics are used for unsupervised learning and supervised learning, respectively. In unsupervised learning, there is no target variable. The objective of unsupervised learning, or descriptive analytics, is to discover the hidden structure of data. There are two main unsupervised learning techniques offered by Rattle:
- Cluster analysis
- Association analysis

Cluster analysis

Sometimes, we have a group of observations and we need to split it into a number of subsets of similar observations. Cluster analysis is a group of techniques that will help you to discover these similarities between observations. Market segmentation is an example of cluster analysis. You can use cluster analysis when you have a lot of customers and you want to divide them into different market segments, but you don't know how to create these segments. Sometimes, especially with a large number of customers, we need some help to understand our data.
Clustering can help us to create different customer groups based on their buying behavior. In Rattle's Cluster tab, there are four cluster algorithms:
KMeans
EwKm
Hierarchical
BiCluster
The two most popular families of cluster algorithms are hierarchical clustering and centroid-based clustering.
Centroid-based clustering using the K-means algorithm
I'm going to use K-means as an example of this family because it is the most popular. With this algorithm, a cluster is represented by a point or center called the centroid. In the initialization step of K-means, we need to create k centroids; usually, the centroids are initialized randomly. In the following diagram, the observations or objects are represented by points and three centroids are represented by three colored stars:
After this initialization step, the algorithm enters an iteration with two operations. First, the computer associates each object with the nearest centroid, creating k clusters. Then, the computer recalculates each centroid's position; the new position is the mean of each attribute over every cluster member. This example is very simple, but in real life, when the algorithm reassigns the observations to the new centroids, some observations move from one cluster to another. The algorithm keeps recalculating centroids and assigning observations to clusters until some finalization condition is reached, as shown in this diagram:
The inputs of a K-means algorithm are the observations and the number of clusters, k. The final result of a K-means algorithm is k centroids that represent each cluster, together with the observations associated with each cluster. The drawbacks of this technique are:
You need to know or decide the number of clusters, k, and the result of the algorithm depends heavily on k.
The result of the algorithm depends on where the centroids are initialized.
There is no guarantee that the result is the optimal result; the algorithm can iterate around a local optimum.
In order to avoid a local optimum, you can run the algorithm many times, starting from different centroid positions. To compare the different runs, you can use the cluster distortion – the sum of the squared distances between each observation and its centroid.
Customer segmentation with K-means clustering
We're going to use the wholesale customer dataset we downloaded from the Center for Machine Learning and Intelligent Systems at the University of California, Irvine. You can download the dataset from here – https://archive.ics.uci.edu/ml/datasets/Wholesale+customers#. The dataset contains 440 customers (observations) of a wholesale distributor. It includes the annual spend in monetary units on six product categories – Fresh, Milk, Grocery, Frozen, Detergents_Paper, and Delicatessen. We've created a new field called Food that includes all categories except Detergents_Paper, as shown in the following screenshot:
Load the new dataset into Rattle and go to the Cluster tab. Remember that, in unsupervised learning, there is no target variable. I want to create a segmentation based only on buying behavior; for this reason, I set Region and Channel to Ignore, as shown here:
In the following screenshot, you can see the options Rattle offers for K-means. The most important one is Number of clusters; as we've seen, the analyst has to decide the number of clusters before running K-means. We have also seen that the initial position of the centroids can have some influence on the result of the algorithm.
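If you want to see the assignment and update steps outside Rattle, the following is a minimal sketch of the K-means loop in plain Python with NumPy. It illustrates the algorithm only, not Rattle's implementation; the toy data, the fixed number of iterations, and the missing empty-cluster guard are simplifications assumed for brevity.

# Minimal K-means sketch: assign each point to its nearest centroid,
# then move each centroid to the mean of its members. Toy data only.
import numpy as np

rng = np.random.default_rng(42)          # the "Seed": makes the run reproducible
X = rng.normal(size=(30, 2))             # 30 observations with 2 attributes
k = 3
centroids = X[rng.choice(len(X), k, replace=False)]  # random initialization

for _ in range(10):                      # fixed iteration budget for simplicity
    # Step 1: assignment -- distance from every observation to every centroid
    distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    labels = distances.argmin(axis=1)
    # Step 2: update -- each centroid becomes the mean of its cluster members
    # (a production version would also guard against empty clusters)
    centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])

# Distortion (within-cluster sum of squares): used to compare different runs
distortion = ((X - centroids[labels]) ** 2).sum()
print(labels, round(distortion, 2))

Running the loop from a different random initialization can end in a different local optimum with a different distortion, which is exactly why the Seed and the number of runs matter in the next section.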
The position of the centroids is random, but we need to be able to reproduce the same experiment multiple times. When we're creating a model with K-means, we'll iteratively re-run the algorithm, tuning some options in order to improve the performance of the model; in this case, we need to be able to reproduce exactly the same experiment. Under the hood, R has a pseudo-random number generator based on a starting point called Seed. If you want to reproduce the exact same experiment, you need to re-run the algorithm using the same Seed.
Sometimes, the performance of K-means depends on the initial position of the centroids. For this reason, sometimes you need to be able to re-run the model using a different initial position for the centroids. To run the model with different initial positions, you need to run it with a different Seed.
After executing the model, Rattle will show some interesting information: the size of each cluster, the means of the variables in the dataset, the centroids' positions, and the Within cluster sum of squares value. This measure, also called distortion, is the sum of the squared differences between each point and its centroid; it's a measure of the quality of the model. Another interesting option is Runs; using this option, Rattle will run the model the specified number of times and will choose the model with the best performance based on the Within cluster sum of squares value.
Deciding on the number of clusters can be difficult. To choose the number of clusters, we need a way to evaluate the performance of the algorithm. The sum of the squared distances between the observations and their associated centroids can serve as a performance measure. Each time we add a centroid to K-means, the sum of the squared differences between the observations and the centroids decreases; the difference in this measure between two numbers of centroids is the gain associated with the added centroids. Rattle provides an option to automate this test, called Iterate Clusters. If you set the Number of clusters value to 10 and check the Iterate Clusters option, Rattle will run K-means iteratively, starting with 3 clusters and finishing with 10 clusters.
To compare each iteration, Rattle provides an iteration plot. In the iteration plot, the blue line shows the sum of the squared differences between each observation and its centroid. The red line shows the difference between the current sum of squared distances and the sum of squared distances of the previous iteration. For example, for four clusters, the red line has a very low value; this is because the difference between the sum of squared differences with three clusters and with four clusters is very small. In the following screenshot, the peak in the red line suggests that six clusters could be a good choice, because there is an important drop in the Sum of WithinSS value at this point:
In this way, to finish my model, I only need to set the Number of clusters to 6, uncheck the Re-Scale checkbox, and click on the Execute button:
Finally, Rattle returns the six centroids of my clusters:
Now we have the six centroids and we want Rattle to associate each observation with a centroid. Go to the Evaluate tab, select the KMeans option, select the Training dataset, mark All in the report type, and click on the Execute button as shown in the following screenshot. This process will generate a CSV file with the original dataset and a new column called kmeans.
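The same ideas (Seed, Runs, iterating over the number of clusters, and exporting the labeled data) can be expressed in Python with scikit-learn and pandas, purely as an illustration of what Rattle is doing through its GUI. The file names and the final choice of k = 6 are assumptions made for this sketch.

# Sketch of the Seed, Runs, and "iterate clusters" ideas with scikit-learn.
# wholesale_customers.csv is a hypothetical export of the dataset described above.
import pandas as pd
from sklearn.cluster import KMeans

data = pd.read_csv("wholesale_customers.csv")
X = data[["Fresh", "Milk", "Grocery", "Frozen", "Detergents_Paper", "Delicassen"]]

# random_state plays the role of Rattle's Seed (same value, same experiment);
# n_init plays the role of Runs (several starts, keep the lowest distortion).
for k in range(3, 11):                            # like iterating from 3 to 10 clusters
    model = KMeans(n_clusters=k, random_state=42, n_init=10).fit(X)
    print(k, round(model.inertia_))               # inertia_ = within-cluster sum of squares

# Pick the k where the drop in inertia levels off (the "elbow"), for example k = 6,
# then attach the cluster label to the data, much like Rattle's Evaluate step does.
final = KMeans(n_clusters=6, random_state=42, n_init=10).fit(X)
data["kmeans"] = final.labels_
data.to_csv("wholesale_customers_clustered.csv", index=False)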
The content of this attribute is a label (a number) representing the cluster associated with the observation (customer), as shown in the following screenshot: After clicking on the Execute button, you will need to choose a folder to save the resulting file to and will have to type in a filename. The generated data inside the CSV file will look similar to the following screenshot: In the previous screenshot, you can see ten lines of the resulting file; note that the last column is kmeans. Preparing the data in Qlik Sense Our objective is to create the data model, but using the new CSV file with the kmeans column. We're going to update our application by replacing the customer data file with this new data file. Save the new file in the same folder as the original file, open the Qlik Sense application, and go to Data load editor. There are two differences between the original file and this one. In the original file, we added a line to create a customer identifier called Customer_ID, and in this second file we have this field in the dataset. The second difference is that in this new file we have the kmeans column. From Data load editor, go to the Wholesale customer data sheet, modify line 2, and add line 3. In line 2, we just load the content of Customer_ID, and in line 3, we load the content of the kmeans field and rename it to Cluster, as shown in the following screenshot. Finally, update the name of the file to be the new one and click on the Load data button: When the data load process finishes, open the data model viewer to check your data model, as shown here: Note that you have the same data model with a new field called Cluster. Creating a customer segmentation sheet in Qlik Sense Now we can add a sheet to the application. We'll add three charts to see our clusters and how our customers are distributed in our clusters. The first chart will describe the buying behavior of each cluster, as shown here: The second chart will show all customers distributed in a scatter plot, and in the last chart we'll see the number of customers that belong to each cluster, as shown here: I'll start with the chart to the bottom-right; it's a bar chart with Cluster as the dimension and Count([Customer_ID]) as the measure. This simple bar chart has something special – colors. Each customer's cluster has a special color code that we use in all charts. In this way, cluster 5 is blue in the three charts. To obtain this effect, we use this expression to define the color as color(fieldindex('Cluster', Cluster)), which is shown in the following screenshot: You can find this color trick and more in this interesting blog by Rob Wunderlich – http://qlikviewcookbook.com/. My second chart is the one at the top. I copied the previous chart and pasted it onto a free place. I kept the dimension but I changed the measure by using six new measures: Avg([Detergents_Paper]) Avg([Delicassen]) Avg([Fresh]) Avg([Frozen]) Avg([Grocery]) Avg([Milk]) I placed my last chart at the bottom-left. I used a scatter plot to represent all of my 440 customers. I wanted to show the money spent by each customer on food and detergents, and its cluster. I used the y axis to show the money spent on detergents and the x axis for the money spent on food. Finally, I used colors to highlight the cluster. The dimension is Customer_Id and the measures are Delicassen+Fresh+Frozen+Grocery+Milk (or Food) and [Detergents_Paper]. As the final step, I reused the color expression from the earlier charts. 
Now our first Qlik Sense application has two sheets – the original one is 100 percent Qlik Sense and helps us to understand our customers, channels, and regions. This new sheet uses clustering to give us a different point of view; this second sheet groups the customers by their similar buying behavior. All this information is useful to deliver better campaigns to our customers. Cluster 5 is our least profitable cluster, but is the biggest one with 227 customers. The main difference between cluster 5 and cluster 2 is the amount of money spent on fresh products. Can we deliver any offer to customers in cluster 5 to try to sell more fresh products? Select retail customers and ask yourself, who are our best retail customers? To which cluster do they belong? Are they buying all our product categories? Hierarchical clustering Hierarchical clustering tries to group objects based on their similarity. To explain how this algorithm works, we're going to start with seven points (or observations) lying in a straight line: We start by calculating the distance between each point. I'll come back later to the term distance; in this example, distance is the difference between two positions in the line. The points D and E are the ones with the smallest distance in between, so we group them in a cluster, as shown in this diagram: Now, we substitute point D and point E for their mean (red point) and we look for the two points with the next smallest distance in between. In this second iteration, the closest points are B and C, as shown in this diagram: We continue iterating until we've grouped all observations in the dataset, as shown here: Note that, in this algorithm, we can decide on the number of clusters after running the algorithm. If we divide the dataset into two clusters, the first cluster is point G and the second cluster is A, B, C, D, E, and F. This gives the analyst the opportunity to see the big picture before deciding on the number of clusters. The lowest level of clustering is a trivial one; in this example, seven clusters with one point in each one. The chart I've created while explaining the algorithm is a basic form of a dendrogram. The dendrogram is a tree diagram used in Rattle and in other tools to illustrate the layout of the clusters produced by hierarchical clustering. In the following screenshot, we can see the dendrogram created by Rattle for the wholesale customer dataset. In Rattle's dendrogram, the y axis represent all observations or customers in the dataset, and the x axis represents the distance between the clusters: Association analysis Association rules or association analysis is also an important topic in data mining. This is an unsupervised method, so we start with an unlabeled dataset. An unlabeled dataset is a dataset without a variable that gives us the right answer. Association analysis attempts to find relationships between different entities. The classic example of association rules is market basket analysis. This means using a database of transactions in a supermarket to find items that are bought together. For example, a person who buys potatoes and burgers usually buys beer. This insight could be used to optimize the supermarket layout. Online stores are also a good example of association analysis. They usually suggest to you a new item based on the items you have bought. They analyze online transactions to find patterns in the buyer's behavior. These algorithms assume all variables are categorical; they perform poorly with numeric variables. 
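Before looking at Rattle's Associate tab in detail, here is a small Python sketch that computes, by brute force, the support and confidence measures defined in the next section for a single rule. The seven toy transactions mirror the article's Chicken, Potatoes → Clothes example, but the exact basket contents are invented for illustration.

# Support and confidence for the rule {Chicken, Potatoes} -> {Clothes},
# computed by brute force over a small, invented list of transactions.
transactions = [
    {"Chicken", "Potatoes", "Clothes"},
    {"Chicken", "Potatoes", "Clothes"},
    {"Chicken", "Potatoes", "Clothes"},
    {"Chicken", "Beer"},
    {"Potatoes", "Clothes"},
    {"Beer", "Chips"},
    {"Milk"},
]

antecedent, consequent = {"Chicken", "Potatoes"}, {"Clothes"}

n_total = len(transactions)
n_antecedent = sum(antecedent <= t for t in transactions)            # baskets with Chicken and Potatoes
n_rule = sum((antecedent | consequent) <= t for t in transactions)   # baskets that also contain Clothes

support = n_rule / n_total           # how often the whole rule appears: 3/7, about 0.43
confidence = n_rule / n_antecedent   # how often it holds when the antecedent appears: 3/3 = 1.0
print(f"support={support:.2f} confidence={confidence:.2f}")

With these two numbers you can already reason about a rule: support tells you how often it is relevant at all, and confidence tells you how reliable it is whenever its left-hand side appears in a basket.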
Association methods need a lot of time to complete; they use a lot of CPU and memory. Remember that Rattle runs on R, and the R engine loads all data into RAM.
Suppose we have a dataset such as the following. Our objective is to discover items that are purchased together. We'll create rules and represent them like this:
Chicken, Potatoes → Clothes
This rule means that when a customer buys Chicken and Potatoes, they tend to buy Clothes. As we'll see, the output of the model will be a set of rules. We need a way to evaluate the quality, or interest, of a rule. There are different measures, but we'll use only a few of them. Rattle provides three measures:
Support
Confidence
Lift
Support indicates how often the rule appears in the whole dataset. In our dataset, the rule Chicken, Potatoes → Clothes has a support of 42.86 percent (3 occurrences / 7 transactions).
Confidence measures how strong the association between the items is. In this dataset, the rule Chicken, Potatoes → Clothes has a confidence of 1: the items Chicken and Potatoes appear together three times in the dataset, the items Chicken, Potatoes, and Clothes also appear together three times, and 3/3 = 1. A confidence close to 1 indicates a strong association.
In the following screenshot, I've highlighted the options on the Associate tab we have to choose from before executing an association method in Rattle:
The first option is the Baskets checkbox. Depending on the kind of input data, we'll decide whether or not to check this option. If the option is checked, as in the preceding screenshot, Rattle needs an identification variable and a target variable. After this example, we'll try another example without this option.
The second option is the minimum Support value; by default, it is set to 0.1. Rattle will not return rules with a Support value lower than the one you set in this text box. If you choose a higher value, Rattle will only return rules that appear many times in your dataset; if you choose a lower value, Rattle will return rules that appear in your dataset only a few times. Usually, if you set a high value for Support, the system will return only the obvious relationships. I suggest you start with a high Support value and execute the method many times, with a lower value in each execution; in this way, new rules that you can analyze will appear in each execution.
The third parameter you have to set is Confidence. This parameter tells you how strong the rule has to be.
Finally, the length is the number of items a rule contains. A rule like Beer → Chips has a length of two. The default option for Min Length is 2; if you set this variable to 2, Rattle will return all rules with two or more items in them. After executing the model, you can see the rules created by Rattle by clicking on the Show Rules button, as illustrated here:
Rattle provides a very simple dataset for testing association rules in a file called dvdtrans.csv. Test this dataset to learn about association rules.
Further learning
In this article, we introduced supervised and unsupervised learning, the two main subgroups of machine learning algorithms. If you want to learn more about machine learning, I suggest you complete the MOOC course called Machine Learning at Coursera: https://www.coursera.org/learn/machine-learning
The acronym MOOC stands for Massive Open Online Course; these are courses open to participation via the Internet and are generally free. Coursera is one of the leading platforms for MOOC courses.
Machine Learning is a great course designed and taught by Andrew Ng, Associate Professor at Stanford University; Chief Scientist at Baidu; and Chairman and Co-founder at Coursera. This course is really interesting. A very interesting book is Machine Learning with R by Brett Lantz, Packt Publishing. Summary In this article, we were introduced to machine learning, and supervised and unsupervised methods. We focused on unsupervised methods and covered centroid-based clustering, hierarchical clustering, and association rules. We used a simple dataset, but we saw how a clustering algorithm can complement a 100 percent Qlik Sense approach by adding more information. Resources for Article: Further resources on this subject: Qlik Sense's Vision [article] Securing QlikView Documents [article] Conozca QlikView [article]
Essentials of VMware vSphere

Packt
09 Jul 2015
7 min read
In this article by Puthiyavan Udayakumar, author of the book VMware vSphere Design Essentials, we will cover the following topics: Essentials of designing VMware vSphere The PPP framework The challenges and encounters faced on virtual infrastructure (For more resources related to this topic, see here.) Let's get started with understanding the essentials of designing VMware vSphere. Designing is nothing but assembling and integrating VMware vSphere infrastructure components together to form the baseline for a virtualized datacenter. It has the following benefits: Saves power consumption Decreases the datacenter footprint and helps towards server consolidation Fastest server provisioning On-demand QA lab environments Decreases hardware vendor dependency Aids to move to the cloud Greater savings and affordability Superior security and High Availability Designing VMware vSphere Architecture design principles are usually developed by the VMware architect in concurrence with the enterprise CIO, Infrastructure Architecture Board, and other key business stakeholders. From my experience, I would always urge you to have frequent meetings to observe functional requirements as much as possible. This will create a win-win situation for you and the requestor and show you how to get things done. Please follow your own approach, if it works. Architecture design principles should be developed by the overall IT principles specific to the customer's demands, if they exist. If not, they should be selected to ensure positioning of IT strategies in line with business approaches. In nutshell, architect should aim to form an effective architecture principles that fulfills the infrastructure demands, following are high level principles that should be followed across any design: Design mission and plans Design strategic initiatives External influencing factors When you release a design to the customer, keep in mind that the design must have the following principles: Understandable and robust Complete and consistent Stable and capable of accepting continuous requirement-based changes Rational and controlled technical diversity Without the preceding principles, I wouldn't recommend you to release your design to anyone even for peer review. For every design, irrespective of the product that you are about to design, try the following approach; it should work well but if required I would recommend you make changes to the approach. The following approach is called PPP, which will focus on people's requirements, the product's capacity, and the process that helps to bridge the gap between the product capacity and people requirements: The preceding diagram illustrates three entities that should be considered while designing VMware vSphere infrastructure. Please keep in mind that your design is just a product designed by a process that is based on people's needs. In the end, using this unified framework will aid you in getting rid of any known risks and its implications. Functional requirements should be meaningful; while designing, please make sure there is a meaning to your design. Selecting VMware vSphere from other competitors should not be a random pick, you should always list the benefits of VMware vSphere. 
Some of them are as follows: Server consolidation and easy hardware changes Dynamic provisioning of resources to your compute node Templates, snapshots, vMotion, DRS, DPM, High Availability, fault tolerance, auto monitoring, and solutions for warnings and alerts Virtual Desktop Infrastructure (VDI), building a disaster recovery site, fast deployments, and decommissions The PPP framework Let's explore the components that integrate to form the PPP framework. Always keep in mind that the design should consist of people, processes, and products that meet the unified functional requirements and performance benchmark. Always expect the unexpected. Without these metrics, your design is incomplete; PPP always retains its own decision metrics. What does it do, who does it, and how is it done? We will see the answers in the following diagrams: The PPP Framework helps you to get started with requirements gathering, design vision, business architecture, infrastructure architecture, opportunities and solutions, migration planning, fixing the tone for implementing and design governance. The following table illustrates the essentials of the three-dimensional approach and the basic questions that are required to be answered before you start designing or documenting about designing, which will in turn help to understand the real requirements for a specific design: Phase Description Key components Product Results of what? In what hardware will the VM reside? What kind of CPU is required? What is the quantity of CPU, RAM, storage per host/VM? What kind of storage is required? What kind of network is required? What are the standard applications that need to be rolled out? What kind of power and cooling are required? How much rack and floor space is demanded? People Results of who? Who is responsible for infrastructure provisioning? Who manages the data center and supplies the power? Who is responsible for implementation of the hardware and software patches? Who is responsible for storage and back up? Who is responsible for security and hardware support? Process Results of how? How should we manage the virtual infrastructure? How should we manage hosted VMs? How should we provision VM on demand? How should a DR site be active during a primary site failure? How should we provision storage and backup? How should we take snapshots of VMs? How should we monitor and perform periodic health checks? Before we start to apply the PPP framework on VMware vSphere, we will discuss the list of challenges and encounters faced on the virtual infrastructure. List of challenges and encounters faced on the virtual infrastructure In this section, we will see a list of challenges and encounters faced with virtual infrastructure due to the simple reason that we fail to capture the functional and non-functional demands of business users, or do not understand the fit-for-purpose concept: Resource Estimate Misfire: If you underestimate the amount of memory required up-front, you could change the number of VMs you attempt to run on the VMware ESXi host hardware. Resource unavailability: Without capacity management and configuration management, you cannot create dozens or hundreds of VMs on a single host. Some of the VMs could consume all resources, leaving other VMs unknown. High utilization: An army of VMs can also throw workflows off-balance due to the complexities they can bring to provisioning and operational tasks. Business continuity: Unlike a PC environment, VMs cannot be backed up to an actual hard drive. 
This is why 80 percent of IT professionals believe that virtualization backup is a great technological challenge. Security: More than six out of ten IT professionals believe that data protection is a top technological challenge. Backward compatibility: This is especially challenging for certain apps and systems that are dependent on legacy systems. Monitoring performance: Unlike physical servers, you cannot monitor the performance of VMs with common hardware resources such as CPU, memory, and storage. Restriction of licensing: Before you install software on virtual machines, read the license agreements; they might not support this; hence, by hosting on VMs, you might violate the agreement. Sizing the database and mailbox: Proper sizing of databases and mailboxes is really critical to the organization's communication systems and for applications. Poor design of storage and network: A poor storage design or a networking design resulting from a failure to properly involve the required teams within an organization is a sure-fire way to ensure that this design isn't successful. Summary In this article we covered a brief introduction of the essentials of designing VMware vSphere which focused on the PPP framework. We also had look over the challenges and encounters faced on the virtual infrastructure. Resources for Article: Further resources on this subject: Creating and Managing VMFS Datastores [article] Networking Performance Design [article] The Design Documentation [article]
Responsive Web Design with WordPress

Packt
09 Jul 2015
13 min read
Welcome to the world of the Responsive Web Design! This article is written by Dejan Markovic, author of the book WordPress Responsive Theme Design, and it will introduce you to the Responsive Web Design and its concepts and techniques. It will also present crisp notes from WordPress Responsive Theme Design. (For more resources related to this topic, see here.) Responsive web design (RWD) is a web design approach aimed at crafting sites to provide an optimal viewing experience—easy reading and navigation with a minimum of resizing, panning, and scrolling—across a wide range of devices (from mobile phones to desktop computer monitors). Reference: http://en.wikipedia.org/wiki/Responsive_web_design. To say it simply, responsive web design (RWD) means that the responsive website should adapt to the screen size of the device it is being viewed on. When I began my web development journey in 2002, we didn't have to consider as many factors as we do today. We just had to create the website for a 17-inch screen (which was the standard at that time), and that was it. Yes, we also had to consider 15, 19, and 21-inch monitors, but since the 17-inch screen was the standard, that was the target screen size for us. In pixels, these sizes were usually 800 or 1024. We also had to consider a fewer number of browsers (Internet Explorer, Netscape, and Opera) and the styling for the print, and that was it. Since then, a lot of things have changed, and today, in 2015, for a website design, we have to consider multiple factors, such as: A lot of different web browsers (Internet Explorer, Firefox, Opera, Chrome, and Safari) A number of different operating systems (Windows (XP, 7, and 8), Mac OS X, Linux, Unix, iOS, Android, and Windows phones) Device screen sizes (desktop, mobile, and tablet) Is content accessible and readable with screen readers? How the content will look like when it's printed? Today, creating different design for all these listed factors & devices would take years. This is where a responsive web design comes to the rescue. The concepts of RWD I have to point out that the mobile environment is becoming more important factor than the desktop environment. Mobile browsing is becoming bigger than the desktop-based access, which makes the mobile environment very important factor to consider when developing a website. Simply put, the main point of RWD is that the layout changes based on the size and capabilities of the device its being viewed on. The concepts of RWD, that we will learn next, are: Viewport, scaling and screen density. Controlling Viewport On the desktop, Viewport is the screen size of the window in a browser. For example, when we resize the browser window, we are actually changing the Viewport size. On mobile devices, the Viewport size is also independent of the device screen size. For example, Viewport is 850 px for mobile Opera and 980 px for mobile Safari, and the screen size for iPhone is 320 px. If we compare the Viewport size of 980 px and the screen size of an iPhone of 320 px, we can see that Viewport is bigger than the screen size. This is because mobile browsers function differently. They first load the page into Viewport, and then they resize it to the device's screen size. This is why we are able to see the whole page on the mobile device. If the mobile browsers had Viewport the same as the screen size (320 px), we would be able to see only a part of the page on the mobile device. 
In the following screenshot, we can see the table with the list of Viewport sizes for some iPhone models:
We can control Viewport with CSS:
@viewport { width: device-width; }
Or, we can control it with the meta tag:
<meta name="viewport" content="width=device-width">
In the preceding code, we are matching the Viewport width with the device width. Because the Viewport meta tag approach is more widely adopted (it was first used on iOS, and the @viewport approach is not supported by some browsers), we will use the meta tag approach. We are setting the Viewport width in order to match our web content with our mobile content, as we want to make sure that our web content looks good on a mobile device as well. We can set Viewports in the code for each device separately, for example, 320 px for the iPhone. The better approach is to use content="width=device-width".
Scaling
Scaling is extremely important, as the initial scale controls the zoom aspect of the content for the initial look of the page. For example, if the initial scale is set to 3, the content will be loaded at 3 times the Viewport size, which means 3 times zoom. Here is the look of the screenshot for initial-scale=1 and initial-scale=3:
As we can see from the preceding screenshots, at an initial scale of 3 (three times zoom), the logo image takes up a much bigger part of the screen. It is important to note that this is just the initial scale, which means that the user can zoom in and zoom out later if they want to. Here is an example of the code with the initial scale:
<meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1">
In this example, we have used the maximum-scale=1 option, which means that the user will not be able to zoom. We should avoid using the maximum-scale property because of accessibility issues: if we forbid zooming on our pages, users with visual problems will not be able to see the content properly.
The screen density
As screen technology moves forward every year, or even faster than that, we have to consider the screen density aspect as well. Screen density is the number of pixels contained within a screen area; if the screen density is higher, we can have more detail, that is, more pixels in the same area. There are two measurements usually used for this: dots per inch (DPI) and pixels per inch (PPI). DPI is how many drops a printer can place in an inch of space. PPI is the number of pixels we can have in one inch of the screen.
If we go back to the preceding screenshot with the table showing Viewports and densities and compare the values of the iPhone 3G and iPhone 4S, we will see that the screen size stayed the same at 3.5 inches and the Viewport stayed the same at 320 px, but the screen density doubled from 163 dpi to 326 dpi, which means that the screen resolution also doubled, from 320x480 to 640x960. Screen density is very relevant to RWD, as newer devices have higher densities and we should do our best to cover as many densities as we can in order to provide a better experience for end users. Pixel density matters more than the resolution or screen size, because more pixels equals a sharper display.
There are other topics that need to be taken into consideration too, such as hardware and reference pixels, and the device-pixel-ratio.
Problems and solutions with the screen density
Scalable vector graphics and CSS graphics will scale to the resolution.
This is why I recommend using Font Awesome icons in your project. Font Awesome icons are available for download at http://fortawesome.github.io/Font-Awesome/icons/. A font icon is a font made up of symbols, icons, or pictograms (whatever you prefer to call them) that you can use in a web page just like a font. They can be instantly customized with properties such as size or drop shadow; anything you want can be done with the power of CSS.
The real problem triggered by the change in screen density is images, as for high-density screens we should provide higher resolution images. There are several ways to approach this problem:
By targeting high-density screens (providing high-resolution images to all screens)
By providing high-resolution images where appropriate (loading high-resolution images only on devices with high-resolution screens)
By not using high-resolution images
For beginner developers, I recommend the second approach: providing high-resolution images where appropriate.
Techniques in RWD
RWD consists of three coding techniques:
Media queries (adapt content to specific screen sizes)
Fluid grids (for flexible layouts)
Flexible images and media (that respond to changes in screen size)
More detailed information about RWD techniques by Ethan Marcotte, who coined the term Responsive Web Design, is available at http://alistapart.com/article/responsive-web-design.
Media queries
Media queries are CSS modules, or, as some people like to say, just conditional statements that tell the browser to use a specific type of style depending on the size of the screen and other factors, such as print (specific styles for print). They have been around for a long time; I was already using different styles for print in 2002. If you wish to know more about media queries, refer to W3C Candidate Recommendation 8 July 2002 at http://www.w3.org/TR/2002/CR-css3-mediaqueries-20020708/.
Here is an example of a media query declaration:
@media only screen and (min-width:500px) { font-family: sans-serif; }
Let's explain the preceding code. The @media keyword means that this is a media type declaration. The screen and part of the query is an expression or condition (in this case, it means only screens and not print). The following conditional statement means that everything above 500 px will get the sans-serif font family:
(min-width:500px) { font-family: sans-serif; }
Here is another example of a media query declaration:
@media only screen and (min-width: 500px), screen and (orientation: portrait) { font-family: sans-serif; }
In this case, we have two statements, and if either of the statements is true, the entire declaration is applied (either everything above 500 px or everything in portrait orientation will get the style). The only keyword hides the styles from older browsers.
As some older browsers don't support media queries, I recommend using the respond.js script, which will "patch" support for them. A polyfill (or polyfiller) is code that provides features that are not built into or supported by some web browsers. For example, a number of HTML5 features are not supported by older versions of IE (older than 8 or 9), but these features can be used if a polyfill is installed on the web page. This means that if developers want to use these features, they can just include that polyfill library and these features will work in older browsers.
Breakpoints Breakpoint is a moment when layout switches, from one layout to another, when some condition is fulfilled, for example, the screen has been resized. Almost all responsive designs cover the changes of the screen between the desktop, tablets, and smart phones. Here is an example with comments inside: @media only screen and (max-width: 480px) { //mobile styles // up to 480px size } Media query in the preceding code will only be used if the width of the screen is 480 px or less. @media only screen and (min-width:481px) and (max-width: 768px) { //tablet styles //between 481 and 768px } Media query in the preceding code will only be used if the width of the screen is between the 481 px and 768 px. @media only screen and (min-width:769px) { //desktop styles //from 769px and up } Media query in the preceding code will only be used when the width of the screen is 769 px and more. The minimum width value in desktop styles is 1 pixel over the maximum width value in tablet styles, and the same difference is there between values from tablet and mobile styles. We are doing this in order to avoid overlapping, as that could cause problem with our styles. There is also an approach to set the maximum width and minimum width with em values. Setting em of the screen for maximum will mean that the width of the screen is set relative to the device's font size. If the font size for the device is 16 px (which is the usual size), the maximum width for mobile styles would be 480/16=30. Why do we use em values? With pixel sizes, everything is fixed; for example, h1 is 19 px (or 1.5 em of the default size of 16 px), and that's it. With em sizes, everything is relative, so if we change the default value in the browser from, for example, 16 px to 18 px, everything relative to that will change. Therefore, all h1 values will change from 19 px to 22 px and make our layout "zoomable". Here is the example with sizes changed to em: @media only screen and (max-width: 30em) { //mobile styles // up to 480px size }   @media only screen and (min-width:30em) and (max-width: 48em) { //tablet styles //between 481 and 768px }   @media only screen and (min-width:48em) { //desktop styles //from 769px and up } Fluid grids The major point in RWD is that the content should adapt to any screen it's viewed on. One of the best solutions to do this is to use fluid layouts where our content can be resized on each breakpoint. In fluid grids, we define a maximum layout size for the design. The grid is divided into a specific number of columns to keep the layout clean and easy to handle. Then we design each element with proportional widths and heights instead of pixel based dimensions. So whenever the device or screen size is changed, elements will adjust their widths and heights by the specified proportions to its parent container. Reference: http://www.1stwebdesigner.com/tutorials/fluid-grids-in-responsive-design/. To make the grid flexible (or elastic), we can use the % points, or we can use the em values, whichever suits us better. We can make our own fluid grids, or we can use grid frameworks. As there are so many frameworks available, I would recommend that you use the existing framework rather than building your own. Grid frameworks could use a single grid that covers various screen sizes, or we can have multiple grids for each of the break points or screen size categories, such as mobiles, tablets, and desktops. Some of the notable frameworks are Twitter's Bootstrap, Foundation, and SemanticUI. 
I prefer Twitter's Bootstrap, as it really helps me speed up the process and it is the most used framework currently. Flexible images and media Last but not the least important, are images and media (videos). The problem with them is that they are elements that come with fixed sizes. There are several approaches to fix this: Replacing dimensions with percentage values Using maximum widths Using background images only for some cases, as these are not good for accessibility Using some libraries, such as Scott Jehl's picturefill (https://github.com/scottjehl/picturefill) Taking out the width and height parameters from the image tag and dealing with dimensions in CSS Summary In this article, you learned about the RWD concepts such as: Viewport, scaling and the screen density. Also, we have covered the RWD techniques: media queries, fluid grids, and flexible media. Resources for Article: Further resources on this subject: Deployment Preparations [article] Why Meteor Rocks! [article] Clustering and Other Unsupervised Learning Methods [article]
The Blueprint Class

Packt
08 Jul 2015
26 min read
In this article by Nitish Misra, author of the book, Learning Unreal Engine Android Game Development, mentions about the Blueprint class. You would need to do all the scripting and everything else only once. A Blueprint class is an entity that contains actors (static meshes, volumes, camera classes, trigger box, and so on) and functionalities scripted in it. Looking at our example once again of the lamp turning on/off, say you want to place 10 such lamps. With a Blueprint class, you would just have to create and script once, save it, and duplicate it. This is really an amazing feature offered by UE4. (For more resources related to this topic, see here.) Creating a Blueprint class To create a Blueprint class, click on the Blueprints button in the Viewport toolbar, and in the dropdown menu, select New Empty Blueprint Class. A window will then open, asking you to pick your parent class, indicating the kind of Blueprint class you wish to create. At the top, you will see the most common classes. These are as follows: Actor: An Actor, as already discussed, is an object that can be placed in the world (static meshes, triggers, cameras, volumes, and so on, all count as actors) Pawn: A Pawn is an actor that can be controlled by the player or the computer Character: This is similar to a Pawn, but has the ability to walk around Player Controller: This is responsible for giving the Pawn or Character inputs in the game, or controlling it Game Mode: This is responsible for all of the rules of gameplay Actor Component: You can create a component using this and add it to any actor Scene Component: You can create components that you can attach to other scene components Apart from these, there are other classes that you can choose from. To see them, click on All Classes, which will open a menu listing all the classes you can create a Blueprint with. For our key cube, we will need to create an Actor Blueprint Class. Select Actor, which will then open another window, asking you where you wish to save it and what to name it. Name it Key_Cube, and save it in the Blueprint folder. After you are satisfied, click on OK and the Actor Blueprint Class window will open. The Blueprint class user interface is similar to that of Level Blueprint, but with a few differences. It has some extra windows and panels, which have been described as follows: Components panel: The Components panel is where you can view, and add components to the Blueprint class. The default component in an empty Blueprint class is DefaultSceneRoot. It cannot be renamed, copied, or removed. However, as soon as you add a component, it will replace it. Similarly, if you were to delete all of the components, it will come back. To add a component, click on the Add Component button, which will open a menu, from where you can choose which component to add. Alternatively, you can drag an asset from the Content Browser and drop it in either the Graph Editor or the Components panel, and it will be added to the Blueprint class as a component. Components include actors such as static or skeletal meshes, light actors, camera, audio actors, trigger boxes, volumes, particle systems, to name a few. When you place a component, it can be seen in the Graph Editor, where you can set its properties, such as size, position, mobility, material (if it is a static mesh or a skeletal mesh), and so on, in the Details panel. 
Graph Editor: The Graph Editor is also slightly different from that of Level Blueprint, in that there are additional windows and editors in a Blueprint class. The first window is the Viewport, which is the same as that in the Editor. It is mainly used to place actors and set their positions, properties, and so on. Most of the tools you will find in the main Viewport (the editor's Viewport) toolbar are present here as well. Event Graph: The next window is the Event Graph window, which is the same as a Level Blueprint window. Here, you can script the components that you added in the Viewport and their functionalities (for example, scripting the toggling of the lamp on/off when the player is in proximity and moves away respectively). Keep in mind that you can script the functionalities of the components only present within the Blueprint class. You cannot use it directly to script the functionalities of any actor that is not a component of the Class. Construction Script: Lastly, there is the Construction Script window. This is also similar to the Event Graph, as in you can set up and connect nodes, just like in the Event Graph. The difference here is that these nodes are activated when you are constructing the Blueprint class. They do not work during runtime, since that is when the Event Graph scripts work. You can use the Construction Script to set properties, create and add your own property of any of the components you wish to alter during the construction, and so on. Let's begin creating the Blueprint class for our key cubes. Viewport The first thing we need are the components. We require three components: a cube, a trigger box, and a PostProcessVolume. In the Viewport, click on the Add Components button, and under Rendering, select Static Mesh. It will add a Static Mesh component to the class. You now need to specify which Static Mesh you want to add to the class. With the Static Mesh actor selected in the Components panel, in the actor's Details panel, under the Static Mesh section, click on the None button and select TemplateCube_Rounded. As soon as you set the mesh, it will appear in the Viewport. With the cube selected, decrease its scale (located in the Details panel) from 1 to 0.2 along all three axes. The next thing we need is a trigger box. Click on the Add Component button and select Box Collision in the Collision section. Once added, increase its scale from 1 to 9 along all three axes, and place it in such a way that its bottom is in line with the bottom of the cube. The Construction Script You could set its material in the Details panel itself by clicking on the Override Materials button in the Rendering section, and selecting the key cube material. However, we are going to assign its material using Construction Script. Switch to the Construction Script tab. You will see a node called Construction Script, which is present by default. You cannot delete this node; this is where the script starts. However, before we can script it in, we will need to create a variable of the type Material. In the My Blueprint section, click on Add New and select Variable in the dropdown menu. Name this variable Key Cube Material, and change its type from Bool (which is the default variable type) to Material in the Details panel. Also, be sure to check the Editable box so that we can edit it from outside the Blueprint class. Next, drag the Key Cube Material variable from the My Blueprint panel, drop it in the Graph Editor, and select Set when the window opens up. 
Connect this to the output pin of the Construction Script node. Repeat this process, only this time, select Get and connect it to the input pin of Key Cube Material. Right-click in the Graph Editor window and type in Set Material in the search bar. You should see Set Material (Static Mesh). Click on it and add it to the scene. This node already has a reference of the Static Mesh actor (TemplateCube_Rounded), so we will not have to create a reference node. Connect this to the Set node. Finally, drag Key Cube Material from My Blueprint, drop it in the Graph Editor, select Get, and connect it to the Material input pin. After you are done, hit Compile. We will now be able to set the cube's material outside of the Blueprint class. Let's test it out. Add the Blueprint class to the level. You will see a TemplateCube_Rounded actor added to the scene. In its Details panel, you will see a Key Cube Material option under the Default section. This is the variable we created inside our Construction Script. Any material we add here will be added to the cube. So, click on None and select KeyCube_Material. As soon as you select it, you will see the material on the cube. This is one of the many things you can do using Construction Script. For now, only this will do. The Event Graph We now need to script the key cube's functionalities. This is more or less the same as what we did in the Level Blueprint with our first key cube, with some small differences. In the Event Graph panel, the first thing we are going to script is enabling and disabling input when the player overlaps and stops overlapping the trigger box respectively. In the Components section, right-click on Box. This will open a menu. Mouse over Add Event and select Add OnComponentBeginOverlap. This will add a Begin Overlap node to the Graph Editor. Next, we are going to need a Cast node. A Cast node is used to specify which actor you want to use. Right-click in the Graph Editor and add a Cast to Character node. Connect this to the OnComponentBeginOverlap node and connect the other actor pin to the Object pin of the Cast to Character node. Finally, add an Enable Input node and a Get Player Controller node and connect them as we did in the Level Blueprint. Next, we are going to add an event for when the player stops overlapping the box. Again, right-click on Box and add an OnComponentEndOverlap node. Do the exact same thing you did with the OnComponentBeginOverlap node; only here, instead of adding an Enable Input node, add a Disable Input node. The setup should look something like this: You can move the key cube we had placed earlier on top of the pedestal, set it to hidden, and put the key cube Blueprint class in its place. Also, make sure that you set the collision response of the trigger actor to Ignore. The next step is scripting the destruction of the key cube when the player touches the screen. This, too, is similar to what we had done in Level Blueprint, with a few differences. Firstly, add a Touch node and a Sequence node, and connect them to each other. Next, we need a Destroy Component node, which you can find under Components | Destroy Component (Static Mesh). This node already has a reference to the key cube (Static Mesh) inside it, so you do not have to create an external reference and connect it to the node. Connect this to the Then 0 node. We also need to activate the trigger after the player has picked up the key cube. 
Now, since we cannot call functions on actors outside the Blueprint class directly (like we could in Level Blueprint), we need to create a variable. This variable will be of the type Trigger Box. The way this works is, when you have created a Trigger Box variable, you can assign it to any trigger in the level, and it will call that function to that particular trigger. With that in mind, in the My Blueprint panel, click on Add New and create a variable. Name this variable Activated Trigger Box, and set its type to Trigger Box. Finally, make sure you tick on the Editable box; otherwise, you will not be able to assign any trigger to it. After doing that, create a Set Collision Response to All Channels node (uncheck the Context Sensitive box), and set the New Response option to Overlap. For the target, drag the Activated Trigger Box variable, drop it in the Graph Editor, select Get, and connect it to the Target input. Finally, for the Post Process Volume, we will need to create another variable of the type PostProcessVolume. You can name this variable Visual Indicator, again, while ensuring that the Editable box is checked. Add this variable to the Graph Editor as well. Next, click on its pin, drag it out, and release it, which will open the actions menu. Here, type in Enabled, select Set Enabled, and check Enabled. Finally, add a Delay node and a Destroy Actor and connect them to the Set Enabled node, in that order. Your setup should look something like this: Back in the Viewport, you will find that under the Default section of the Blueprint class actor, two more options have appeared: Activated Trigger Box and Visual Indicator (the variables we had created). Using this, you can assign which particular trigger box's collision response you want to change, and which exact post process volume you want to activate and destroy. In front of both variables, you will see a small icon in the shape of an eye dropper. You can use this to choose which external actor you wish to assign the corresponding variable. Anything you scripted using those variables will take effect on the actor you assigned in the scene. This is one of the many amazing features offered by the Blueprint class. All we need to do now for the remaining key cubes is: Place them in the level Using the eye dropper icon that is located next to the name of the variables, pick the trigger to activate once the player has picked up the key cube, and which post process volume to activate and destroy. In the second room, we have two key cubes: one to activate the large door and the other to activate the door leading to the third room. The first key cube will be placed on the pedestal near the big door. So, with the first key cube selected, using the eye dropper, select the trigger box on the pedestal near the big door for the Activated Trigger Box variable. Then, pick the post process volume inside which the key cube is placed for the Visual Indicator variable. The next thing we need to do is to open Level Blueprint and script in what happens when the player places the key cube on the pedestal near the big door. Doing what we did in the previous room, we set up nodes that will unhide the hidden key cube on the pedestal, and change the collision response of the trigger box around the big door to Overlap, ensuring that it was set to Ignore initially. Test it out! You will find that everything is working as expected. Now, do the same with the remaining key cubes. 
Pick which trigger box and which post process volume to activate when you touch on the screen. Then, in the Level Blueprint, script in which key cube to unhide, and so on (place the key cubes we had placed earlier on the pedestals and set it to Hidden), and place the Blueprint class key cube in its place. This is one of the many ways you can use Blueprint class. You can see it takes a lot of work and hassle. Let us now move on to Artificial intelligence. Scripting basic AI Coming back to the third room, we are now going to implement AI in our game. We have an AI character in the third room which, when activated, moves. The main objective is to make a path for it with the help of switches and prevent it from falling. When the AI character reaches its destination, it will unlock the key cube, which the player can then pick up and place on the pedestal. We first need to create another Blueprint class of the type Character, and name it AI_Character. When created, double-click on it to open it. You will see a few components already set up in the Viewport. These are the CapsuleComponent (which is mainly used for collision), ArrowComponent (to specify which side is the front of the character, and which side is the back), Mesh (used for character animation), and CharacterMovement. All four are there by default, and cannot be removed. The only thing we need to do here is add a StaticMesh for our character, which will be TemplateCube_Rounded. Click on Add Components, add a StaticMesh, and assign it TemplateCube_Rounded (in its Details panel). Next, scale this cube to 0.2 along all three axes and move it towards the bottom of the CapsuleComponent, so that it does not float in midair. This is all we require for our AI character. The rest we will handle in Level Blueprints. Next, place AI_Character into the scene on the Player side of the pit, with all of the switches. Place it directly over the Target Point actor. Next, open up Level Blueprint, and let's begin scripting it. The left-most switch will be used to activate the AI character, and the remaining three will be used to draw the parts of a path on which it will walk to reach the other side. To move the AI character, we will need an AI Move To node. The first thing we need is an overlapping event for the trigger over the first switch, which will enable the input, otherwise the AI character will start moving whenever the player touches the screen, which we do not want. Set up an Overlap event, an Enable Input node, and a Gate event. Connect the Overlap event to the Enable Input event, and then to the Gate node's Open input. The next thing is to create a Touch node. To this, we will attach an AI Move To node. You can either type it in or find it under the AI section. Once created, attach it to the Gate node's Exit pin. We now need to specify to the node which character we want to move, and where it should move to. To specify which character we want to move, select the AI character in the Viewport, and in the Level Blueprint's Graph Editor, right-click and create a reference for it. Connect it to the Pawn input pin. Next, for the location, we want the AI character to move towards the second Target Point actor, located on the other side of the pit. But first, we need to get its location in the world. With it selected, right-click in the Graph Editor, and type in Get Actor Location. This node returns an actor's location (coordinates) in the world (the one connected to it). 
This will create a Get Actor Location, with the Target Point actor connect to its input pin. Finally, connect its Return Value to the Destination input of the AI Move To node. If you were to test it out, you would find that it works fine, except for one thing: the AI character stops when it reaches the edge of the pit. We want it to fall off the pit if there is no path. For that, we will need a Nav Proxy Link actor. A Nav Proxy Link actor is used when an AI character has to step outside the Nav Mesh temporarily (for example, jump between ledges). We will need this if we want our AI character to fall off the ledge. You can find it in the All Classes section in the Modes panel. Place it in the level. The actor is depicted by two cylinders with a curved arrow connecting them. We want the first cylinder to be on one side of the pit and the other cylinder on the other side. Using the Scale tool, increase the size of the Nav Proxy Link actor. When placing the Nav Proxy Link actor, keep two things in mind: Make sure that both cylinders intersect in the green area; otherwise, the actor will not work Ensure that both cylinders are in line with the AI character; otherwise, it will not move in a straight line but instead to where the cylinder is located Once placed, you will see that the AI character falls off when it reaches the edge of the pit. We are not done yet. We need to bring the AI character back to its starting position so that the player can start over (or else the player will not be able to progress). For that, we need to first place a trigger at the bottom of the pit, making sure that if the AI character does fall into it, it overlaps the trigger. This trigger will perform two actions: first, it will teleport the AI character to its initial location (with the help of the first Target Point); second, it will stop the AI Move To node, or it will keep moving even after it has been teleported. After placing the trigger, open Level Blueprint and create an Overlap event for the trigger box. To this, we will add a Sequence node, since we are calling two separate functions for when the player overlaps the trigger. The first node we are going to create is a Teleport node. Here, we can specify which actor to teleport, and where. The actor we want to teleport is the AI character, so create a reference for it and connect it to the Target input pin. As for the destination, first use the Get Actor Location function to get the location of the first Target Point actor (upon which the AI character is initially placed), and connect it to the Dest Location input. To stop the AI character's movement, right-click anywhere in the Graph Editor, and first uncheck the Context Sensitive box, since we cannot use this function directly on our AI character. What we need is a Stop Active Movement node. Type it into the search bar and create it. Connect this to the Then 1 output node, and attach a reference of the AI character to it. It will automatically convert from a Character Reference into Character Movement component reference. This is all that we need to script for our AI in the third room. There is one more thing left: how to unlock the key cube. In the fourth room, we are going to use the same principle. Here, we are going to make a chain of AI Move To nodes, each connected to the previous one's On Success output pin. This means that when the AI character has successfully reached the destination (Target Point actor), it should move to the next, and so on. 
Using this, and what we have just discussed about AI, script the path that the AI will follow. Packaging the project Another way of packaging the game and testing it on your device is to first package the game, import it to the device, install it, and then play it. But first, we should discuss some settings regarding packaging, and packaging for Android. The Maps & Modes settings These settings deal with the maps (scenes) and the game mode of the final game. In the Editor, click on Edit and select Project settings. In the Project settings, Project category, select Maps & Modes. Let's go over the various sections: Default Maps: Here, you can set which map the Editor should open when you open the Project. You can also set which map the game should open when it is run. The first thing you need to change is the main menu map we had created. To do this, click on the downward arrow next to Game Default Map and select Main_Menu. Local Multiplayer: If your game has local multiplayer, you can alter a few settings regarding whether the game should have a split screen. If so, you can set what the layout should be for two and three players. Default Modes: In this section, you can set the default game mode the game should run with. The game mode includes things such as the Default Pawn class, HUD class, Controller class, and the Game State Class. For our game, we will stick to MyGame. Game Instance: Here, you can set the default Game Instance Class. The Packaging settings There are settings you can tweak when packaging your game. To access those settings, first go to Edit and open the Project settings window. Once opened, under the Project section click on Packaging. Here, you can view and tweak the general settings related to packaging the project file. There are two sections: Project and Packaging. Under the Project section, you can set options such as the directory of the packaged project, the build configuration to either debug, development, or shipping, whether you want UE4 to build the whole project from scratch every time you build, or only build the modified files and assets, and so on. Under the Packaging settings, you can set things such as whether you want all files to be under one .pak file instead of many individual files, whether you want those .pak files in chunks, and so on. Clicking on the downward arrow will open the advanced settings. Here, since we are packaging our game for distribution check the For Distribution checkbox. The Android app settings In the preceding section, we talked about the general packaging settings. We will now talk about settings specific to Android apps. This can be found in Project Settings, under the Platforms section. In this section, click on Android to open the Android app settings. Here you will find all the settings and properties you need to package your game. At the top the first thing you should do is configure your project for Android. If your project is not configured, it will prompt you to do so (since version 4.7, UE4 automatically creates the androidmanifest.xml file for you). Do this before you do anything else. Here you have various sections. These are: APKPackaging: In this section, you can find options such as opening the folder where all of the build files are located, setting the package's name, setting the version number, what the default orientation of the game should be, and so on. Advanced APKPackaging: This section contains more advanced packaging options, such as one to add extra settings to the .apk files. 
Build: To tweak the settings in the Build section, you first need the source code, which is available from GitHub. Here, you can set things such as whether you want the build to support x86, OpenGL ES2, and so on.
Distribution Signing: This section deals with signing your app. It is a requirement on Android that all apps have a digital signature; this is so that Android can identify the developers of the app. You can learn more about digital signatures by clicking on the hyperlink at the top of the section. When you generate the key for your app, be sure to keep it in a safe and secure place, since if you lose it you will not be able to modify or update your app on Google Play.
Google Play Services: Android apps are downloaded via the Google Play store. This section deals with things such as enabling/disabling Google Play support, setting your app's ID, the Google Play license key, and so on.
Icons: In this section, you can set your game's icons. You can set icons in various sizes, depending upon the screen density of the devices you are developing for. You can get more information about icons by clicking on the hyperlink at the top of the section.
Data Cooker: Finally, in this section, you can set how you want the audio in the game to be encoded.
For our game, the first thing you need to set is the Android Package Name, which is found in the APKPackaging section. The format of the name is com.YourCompany.[PROJECT]. Here, replace YourCompany with the name of your company and [PROJECT] with the name of your project.
Building a package
To package your project, in the Editor go to File | Package Project | Android. You will see different texture formats to package the project in. These are as follows:
ATC: Use this format if you have a device with a Qualcomm Snapdragon processor.
DXT: Use this format if your device has a Tegra graphics processing unit (GPU).
ETC1: You can use this for any device. However, this format does not accept textures with alpha channels; those textures will be left uncompressed, making your game require more space.
ETC2: Use this format if you have a Mali-based device.
PVRTC: Use this format if you have a device with a PowerVR GPU.
Once you have decided which format to use, click on it to begin the packaging process. A window will open up asking you to specify which folder you want the package to be stored in. Once you have decided where to store the package file, click on OK and the build process will commence. When it starts, just like when launching the project, a small window will pop up at the bottom-right corner of the screen notifying you that the build process has begun. From there, you can open the output log or cancel the build process. Once the build process is complete, go to the folder you chose. You will find a .bat file of the game. Provided you have checked the Package game data inside .apk? option (located in the Project Settings, in the Android category under the APKPackaging section), you will also find an .apk file of the game. The .bat file installs the game directly from your system onto your device. To do so, first connect your device to the system, then double-click on the .bat file. This will open a command prompt window. Once it has opened, you do not need to do anything; just wait until the installation process finishes. Once the installation is done, the game will be on your device, ready to be played. To use the .apk file, you will have to do things a bit differently. An .apk file installs the game from the device itself.
For that, you need to perform the following steps:
Connect the device.
Create a copy of the .apk file.
Paste it into the device's storage.
Execute the .apk file from the device. The installation process will begin; once it has completed, you can play the game.
Summary
In this article, we worked with Blueprints and discussed how they work. We also discussed Level Blueprints and the Blueprint class, and covered how to script AI. Finally, we discussed how to package the final product and upload the game onto the Google Play Store for people to download.
Resources for Article:
Further resources on this subject:
Flash Game Development: Creation of a Complete Tetris Game [article]
Adding Finesse to Your Game [article]
Saying Hello to Unity and Android [article]

Why Meteor Rocks!

Packt
08 Jul 2015
23 min read
In this article by Isaac Strack, the author of the book, Getting Started with Meteor.js JavaScript Framework - Second Edition, has discussed some really amazing features of Meteor that has contributed a lot to the success of Meteor. Meteor is a disruptive (in a good way!) technology. It enables a new type of web application that is faster, easier to build, and takes advantage of modern techniques, such as Full Stack Reactivity, Latency Compensation, and Data On The Wire. (For more resources related to this topic, see here.) This article explains how web applications have changed over time, why that matters, and how Meteor specifically enables modern web apps through the above-mentioned techniques. By the end of this article, you will have learned: What a modern web application is What Data On The Wire means and how it's different How Latency Compensation can improve your app experience Templates and Reactivity—programming the reactive way! Modern web applications Our world is changing. With continual advancements in displays, computing, and storage capacities, things that weren't even possible a few years ago are now not only possible but are critical to the success of a good application. The Web in particular has undergone significant change. The origin of the web app (client/server) From the beginning, web servers and clients have mimicked the dumb terminal approach to computing where a server with significantly more processing power than a client will perform operations on data (writing records to a database, math calculations, text searches, and so on), transform the data and render it (turn a database record into HTML and so on), and then serve the result to the client, where it is displayed for the user. In other words, the server does all the work, and the client acts as more of a display, or a dumb terminal. This design pattern for this is called…wait for it…the client/server design pattern. The diagrammatic representation of the client-server architecture is shown in the following diagram: This design pattern, borrowed from the dumb terminals and mainframes of the 60s and 70s, was the beginning of the Web as we know it and has continued to be the design pattern that we think of when we think of the Internet. The rise of the machines (MVC) Before the Web (and ever since), desktops were able to run a program such as a spreadsheet or a word processor without needing to talk to a server. This type of application could do everything it needed to, right there on the big and beefy desktop machine. During the early 90s, desktop computers got even more beefy. At the same time, the Web was coming alive, and people started having the idea that a hybrid between the beefy desktop application (a fat app) and the connected client/server application (a thin app) would produce the best of both worlds. This kind of hybrid app—quite the opposite of a dumb terminal—was called a smart app. Many business-oriented smart apps were created, but the easiest examples can be found in computer games. Massively Multiplayer Online games (MMOs), first-person shooters, and real-time strategies are smart apps where information (the data model) is passed between machines through a server. The client in this case does a lot more than just display the information. It performs most of the processing (or acts as a controller) and transforms the data into something to be displayed (the view). This design pattern is simple but very effective. It's called the Model View Controller (MVC) pattern. 
The model is essentially the data for an application. In the context of a smart app, the model is provided by a server. The client makes requests to the server for data and stores that data as the model. Once the client has a model, it performs actions/logic on that data and then prepares it to be displayed on the screen. This part of the application (talking to the server, modifying the data model, and preparing data for display) is called the controller. The controller sends commands to the view, which displays the information. The view also reports back to the controller when something happens on the screen (a button click, for example). The controller receives the feedback, performs the logic, and updates the model. Lather, rinse, repeat! Since web browsers were built to be "dumb clients", the idea of using a browser as a smart app back then was out of question. Instead, smart apps were built on frameworks such as Microsoft .NET, Java, or Macromedia (now Adobe) Flash. As long as you had the framework installed, you could visit a web page to download/run a smart app. Sometimes, you could run the app inside the browser, and sometimes, you would download it first, but either way, you were running a new type of web app where the client application could talk to the server and share the processing workload. The browser grows up Beginning in the early 2000s, a new twist on the MVC pattern started to emerge. Developers started to realize that, for connected/enterprise "smart apps", there was actually a nested MVC pattern. The server code (controller) was performing business logic against the database (model) through the use of business objects and then sending processed/rendered data to the client application (a "view"). The client was receiving this data from the server and treating it as its own personal "model". The client would then act as a proper controller, perform logic, and send the information to the view to be displayed on the screen. So, the "view" for the server MVC was the "model" for the client MVC. As browser technologies (HTML and JavaScript) matured, it became possible to create smart apps that used the Nested MVC design pattern directly inside an HTML web page. This pattern makes it possible to run a full-sized application using only JavaScript. There is no longer any need to download multiple frameworks or separate apps. You can now get the same functionality from visiting a URL as you could previously by buying a packaged product. A giant Meteor appears! Meteor takes modern web apps to the next level. It enhances and builds upon the nested MVC design pattern by implementing three key features: Data On The Wire through the Distributed Data Protocol (DDP) Latency Compensation with Mini Databases Full Stack Reactivity with Blaze and Tracker Let's walk through these concepts to see why they're valuable, and then, we'll apply them to our Lending Library application. Data On The Wire The concept of Data On The Wire is very simple and in tune with the nested MVC pattern; instead of having a server process everything, render content, and then send HTML across the wire, why not just send the data across the wire and let the client decide what to do with it? This concept is implemented in Meteor using the Distributed Data Protocol, or DDP. DDP has a JSON-based syntax and sends messages similar to the REST protocol. Additions, deletions, and changes are all sent across the wire and handled by the receiving service/client/device. 
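To make this concrete, here is roughly what a couple of DDP frames look like on the wire when a document is added to, and then modified in, a collection. This is an illustrative sketch only — the collection name and document ID are made up, not captured from a real session. A document added to the lists collection might be announced to subscribers as:
   {"msg": "added", "collection": "lists", "id": "Xk2abJ9PqT", "fields": {"Category": "Games"}}
A later modification to that same document would be sent as:
   {"msg": "changed", "collection": "lists", "id": "Xk2abJ9PqT", "fields": {"Category": "Board games"}}
Because every frame carries just the collection name, the document ID, and the changed fields, any DDP-speaking process can react to it without knowing anything about the system that sent it.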
Since DDP uses WebSockets rather than HTTP, the data can be pushed whenever changes occur. But the true beauty of DDP lies in the generic nature of the communication. It doesn't matter what kind of system sends or receives data over DDP—it can be a server, a web service, or a client app—they all use the same protocol to communicate. This means that none of the systems know (or care) whether the other systems are clients or servers. With the exception of the browser, any system can be a server, and without exception, any server can act as a client. All the traffic looks the same and can be treated in a similar manner. In other words, the traditional concept of having a single server for a single client goes away. You can hook multiple servers together, each serving a discreet purpose, or you can have a client connect to multiple servers, interacting with each one differently. Think about what you can do with a system like that: Imagine multiple systems all coming together to create, for example, a health monitoring system. Some systems are built with C++, some with Arduino, some with…well, we don't really care. They all speak DDP. They send and receive data on the wire and decide individually what to do with that data. Suddenly, very difficult and complex problems become much easier to solve. DDP has been implemented in pretty much every major programming language, allowing you true freedom to architect an enterprise application. Latency Compensation Meteor employs a very clever technique called Mini Databases. A mini database is a "lite" version of a normal database that lives in the memory on the client side. Instead of the client sending requests to a server, it can make changes directly to the mini database on the client. This mini database then automatically syncs with the server (using DDP of course), which has the actual database. Out of the box, Meteor uses MongoDB and Minimongo: When the client notices a change, it first executes that change against the client-side Minimongo instance. The client then goes on its merry way and lets the Minimongo handlers communicate with the server over DDP. If the server accepts the change, it then sends out a "changed" message to all connected clients, including the one that made the change. If the server rejects the change, or if a newer change has come in from a different client, the Minimongo instance on the client is corrected, and any affected UI elements are updated as a result. All of this doesn't seem very groundbreaking, but here's the thing—it's all asynchronous, and it's done using DDP. This means that the client doesn't have to wait until it gets a response back from the server. It can immediately update the UI based on what is in the Minimongo instance. What if the change was illegal or other changes have come in from the server? This is not a problem as the client is updated as soon as it gets word from the server. Now, what if you have a slow internet connection or your connection goes down temporarily? In a normal client/server environment, you couldn't make any changes, or the screen would take a while to refresh while the client waits for permission from the server. However, Meteor compensates for this. Since the changes are immediately sent to Minimongo, the UI gets updated immediately. So, if your connection is down, it won't cause a problem: All the changes you make are reflected in your UI, based on the data in Minimongo. 
When your connection comes back, all the queued changes are sent to the server, and the server will send authorized changes to the client. Basically, Meteor lets the client take things on faith. If there's a problem, the data coming in from the server will fix it, but for the most part, the changes you make will be ratified and broadcast by the server immediately. Coding this type of behavior in Meteor is crazy easy (although you can make it more complex and therefore more controlled if you like): lists = new Mongo.Collection("lists"); This one line declares that there is a lists data model. Both the client and server will have a version of it, but they treat their versions differently. The client will subscribe to changes announced by the server and update its model accordingly. The server will publish changes, listen to change requests from the client, and update its model (its master copy) based on these change requests. Wow, one line of code that does all that! Of course, there is more to it, but that's beyond the scope of this article, so we'll move on. To better understand Meteor data synchronization, see the Publish and subscribe section of the meteor documentation at http://docs.meteor.com/#/full/meteor_publish. Full Stack Reactivity Reactivity is integral to every part of Meteor. On the client side, Meteor has the Blaze library, which uses HTML templates and JavaScript helpers to detect changes and render the data in your UI. Whenever there is a change, the helpers re-run themselves and add, delete, and change UI elements, as appropriate, based on the structure found in the templates. These functions that re-run themselves are called reactive computations. On both the client and the server, Meteor also offers reactive computations without having to use a UI. Called the Tracker library, these helpers also detect any data changes and rerun themselves accordingly. Because both the client and the server are JavaScript-based, you can use the Tracker library anywhere. This is defined as isomorphic or full stack reactivity because you're using the same language (and in some cases the same code!) on both the client and the server. Re-running functions on data changes has a really amazing benefit for you, the programmer: you get to write code declaratively, and Meteor takes care of the reactive part automatically. Just tell Meteor how you want the data displayed, and Meteor will manage any and all data changes. This declarative style is usually accomplished through the use of templates. Templates work their magic through the use of view data bindings. Without getting too deep, a view data binding is a shared piece of data that will be displayed differently if the data changes. Let's look at a very simple data binding—one for which you don't technically need Meteor—to illustrate the point. Let's perform the following set of steps to understand the concept in detail: In LendLib.html, you will see an HTML-based template expression: <div id="categories-container">      {{> categories}}   </div> This expression is a placeholder for an HTML template that is found just below it: <template name="categories">    <h2 class="title">my stuff</h2>.. So, {{> categories}} is basically saying, "put whatever is in the template categories right here." And the HTML template with the matching name is providing that. 
If you want to see how data changes will affect the display, change the h2 tag to an h4 tag and save the change: <template name="categories">    <h4 class="title">my stuff</h4> You'll see the effect in your browser. (my stuff will become itsy bitsy.) That's view data binding at work. Change the h4 tag back to an h2 tag and save the change, unless you like the change. No judgment here...okay, maybe a little bit of judgment. It's ugly, and tiny, and hard to read. Seriously, you should change it back before someone sees it and makes fun of you! Alright, now that we know what a view data binding is, let's see how Meteor uses it. Inside the categories template in LendLib.html, you'll find even more templates: <template name="categories"> <h4 class="title">my stuff</h4> <div id="categories" class="btn-group">    {{#each lists}}      <div class="category btn btn-primary">        {{Category}}      </div>    {{/each}} </div> </template> Meteor uses a template language called Spacebars to provide instructions inside templates. These instructions are called expressions, and they let us do things like add HTML for every record in a collection, insert the values of properties, and control layouts with conditional statements. The first Spacebars expression is part of a pair and is a for-each statement. {{#each lists}} tells the interpreter to perform the action below it (in this case, it tells it to make a new div element) for each item in the lists collection. lists is the piece of data, and {{#each lists}} is the placeholder. Now, inside the {{#each lists}} expression, there is one more Spacebars expression: {{Category}} Since the expression is found inside the #each expression, it is considered a property. That is to say that {{Category}} is the same as saying this.Category, where this is the current item in the for-each loop. So, the placeholder is saying, "add the value of the Category property for the current record." Now, if we look in LendLib.js, we will see the reactive values (called reactive contexts) behind the templates: lists : function () { return lists.find(... Here, Meteor is declaring a template helper named lists. The helper, lists, is found inside the template helpers belonging to categories. The lists helper happens to be a function that returns all the data in the lists collection, which we defined previously. Remember this line? lists = new Mongo.Collection("lists"); This lists collection is returned by the above-mentioned helper. When there is a change to the lists collection, the helper gets updated and the template's placeholder is changed as well. Let's see this in action. On your web page pointing to http://localhost:3000, open the browser console and enter the following line: > lists.insert({Category:"Games"}); This will update the lists data collection. The template will see this change and update the HTML code/placeholder. Each of the placeholders will run one additional time for the new entry in lists, and you'll see the following screen: When the lists collection was updated, the Template.categories.lists helper detected the change and reran itself (recomputed). This changed the contents of the code meant to be displayed in the {{> categories}} placeholder. Since the contents were changed, the affected part of the template was re-run. Now, take a minute here and think about how little we had to do to get this reactive computation to run: we simply created a template, instructing Blaze how we want the lists data collection to be displayed, and we put in a placeholder. 
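The Tracker library mentioned earlier gives you this same re-run-on-change behavior without a template. As a quick, optional illustration (this snippet is not part of the Lending Library code; you could paste it into the browser console at http://localhost:3000, where the lists collection is already defined on the client):
   Tracker.autorun(function () {
     // This function is a reactive computation: it re-runs whenever the
     // reactive data it reads (lists.find().count()) changes.
     console.log("There are now " + lists.find().count() + " categories");
   });
Insert another category from the console and the message prints again, with no event wiring on your part.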
This is simple, declarative programming at its finest! Let's create some templates We'll now see a real-life example of reactive computations and work on our Lending Library at the same time. Adding categories through the console has been a fun exercise, but it's not a long-term solution. Let's make it so that we can do that on the page instead as follows: Open LendLib.html and add a new button just before the {{#each lists}} expression: <div id="categories" class="btn-group"> <div class="category btn btn-primary" id="btnNewCat">    <span class="glyphicon glyphicon-plus"></span> </div> {{#each lists}} This will add a plus button on the page, as follows: Now, we want to change the button into a text field when we click on it. So let's build that functionality by using the reactive pattern. We will make it based on the value of a variable in the template. Add the following {{#if…else}} conditionals around our new button: <div id="categories" class="btn-group"> {{#if new_cat}} {{else}}    <div class="category btn btn-primary" id="btnNewCat">      <span class="glyphicon glyphicon-plus"></span>    </div> {{/if}} {{#each lists}} The first line, {{#if new_cat}}, checks to see whether new_cat is true or false. If it's false, the {{else}} section is triggered, and it means that we haven't yet indicated that we want to add a new category, so we should be displaying the button with the plus sign. In this case, since we haven't defined it yet, new_cat will always be false, and so the display won't change. Now, let's add the HTML code to display when we want to add a new category: {{#if new_cat}} <div class="category form-group" id="newCat">      <input type="text" id="add-category" class="form-control" value="" />    </div> {{else}} ... {{/if}} There's the smallest bit of CSS we need to take care of as well. Open ~/Documents/Meteor/LendLib/LendLib.css and add the following declaration: #newCat { max-width: 250px; } Okay, so now we've added an input field, which will show up when new_cat is true. The input field won't show up unless it is set to true; so, for now, it's hidden. So, how do we make new_cat equal to true? Save your changes if you haven't already done so, and open LendLib.js. First, we'll declare a Session variable, just below our Meteor.isClient check function, at the top of the file: if (Meteor.isClient) { // We are declaring the 'adding_category' flag Session.set('adding_category', false); Now, we'll declare the new template helper new_cat, which will be a function returning the value of adding_category. We need to place the new helper in the Template.categories.helpers() method, just below the declaration for lists: Template.categories.helpers({ lists: function () {    ... }, new_cat: function(){    //returns true if adding_category has been assigned    //a value of true    return Session.equals('adding_category',true); } }); Note the comma (,) on the line above new_cat. It's important that you add that comma, or your code will not execute. Save these changes, and you'll see that nothing has changed. Ta-da! In reality, this is exactly as it should be because we haven't done anything to change the value of adding_category yet. Let's do this now: First, we'll declare our click event handler, which will change the value in our Session variable. To do this, add the following highlighted code just below the Template.categories.helpers() block: Template.categories.helpers({ ... 
}); Template.categories.events({ 'click #btnNewCat': function (e, t) {    Session.set('adding_category', true);    Tracker.flush();    focusText(t.find("#add-category")); } }); Now, let's take a look at the following line of code: Template.categories.events({ This line declares that events will be found in the category template. Now, let's take a look at the next line: 'click #btnNewCat': function (e, t) { This tells us that we're looking for a click event on the HTML element with an id="btnNewCat" statement (which we already created in LendLib.html). Session.set('adding_category', true); Tracker.flush(); focusText(t.find("#add-category")); Next, we set the Session variable, adding_category = true, flush the DOM (to clear up anything wonky), and then set the focus onto the input box with the id="add-category" expression. There is one last thing to do, and that is to quickly add the focusText(). helper function. To do this, just before the closing tag for the if (Meteor.isClient) function, add the following code: /////Generic Helper Functions///// //this function puts our cursor where it needs to be. function focusText(i) { i.focus(); i.select(); }; } //<------closing bracket for if(Meteor.isClient){} Now, when you save the changes and click on the plus button, you will see the input box: Fancy! However, it's still not useful, and we want to pause for a second and reflect on what just happened; we created a conditional template in the HTML page that will either show an input box or a plus button, depending on the value of a variable. This variable is a reactive variable, called a reactive context. This means that if we change the value of the variable (like we do with the click event handler), then the view automatically updates because the new_cat helpers function (a reactive computation) will rerun. Congratulations, you've just used Meteor's reactive programming model! To really bring this home, let's add a change to the lists collection (which is also a reactive context, remember?) and figure out a way to hide the input field when we're done. First, we need to add a listener for the keyup event. Or, to put it another way, we want to listen when the user types something in the box and hits Enter. When this happens, we want to add a category based on what the user typed. To do this, let's first declare the event handler. Just after the click handler for #btnNewCat, let's add another event handler: 'click #btnNewCat': function (e, t) {    ... }, 'keyup #add-category': function (e,t){    if (e.which === 13)    {      var catVal = String(e.target.value || "");      if (catVal)      {        lists.insert({Category:catVal});        Session.set('adding_category', false);      }    } } We add a "," character at the end of the first click handler, and then add the keyup event handler. Now, let's check each of the lines in the preceding code: This line checks to see whether we hit the Enter/Return key. if (e.which === 13) This line of code checks to see whether the input field has any value in it: var catVal = String(e.target.value || ""); if (catVal) If it does, we want to add an entry to the lists collection: lists.insert({Category:catVal}); Then, we want to hide the input box, which we can do by simply modifying the value of adding_category: Session.set('adding_category', false); There is one more thing to add and then we'll be done. When we click away from the input box, we want to hide it and bring back the plus button. 
We already know how to do this reactively, so let's add a quick function that changes the value of adding_category. To do this, add one more comma after the keyup event handler and insert the following event handler: 'keyup #add-category': function (e,t){ ... }, 'focusout #add-category': function(e,t){    Session.set('adding_category',false); } Save your changes, and let's see this in action! In your web browser on http://localhost:3000, click on the plus sign, add the word Clothes, and hit Enter. Your screen should now resemble the following screenshot: Feel free to add more categories if you like. Also, experiment by clicking on the plus button, typing something in, and then clicking away from the input field. Summary In this article, you learned about the history of web applications and saw how we've moved from a traditional client/server model to a nested MVC design pattern. You learned what smart apps are, and you also saw how Meteor has taken smart apps to the next level with Data On The Wire, Latency Compensation, and Full Stack Reactivity. You saw how Meteor uses templates and helpers to automatically update content, using reactive variables and reactive computations. Lastly, you added more functionality to the Lending Library. You made a button and an input field to add categories, and you did it all using reactive programming rather than directly editing the HTML code. Resources for Article: Further resources on this subject: Building the next generation Web with Meteor [article] Quick start - creating your first application [article] Meteor.js JavaScript Framework: Why Meteor Rocks! [article]

Project Setup and Modeling a Residential Project

Packt
08 Jul 2015
20 min read
In this article by Scott H. MacKenzie and Adam Rendek, authors of the book ArchiCAD 19 – The Definitive Guide, we will see how our journey, into ArchiCAD 19, begins with an introduction to the graphic user interface, also known as the GUI. As with any software program, there is a menu bar along the top that gives access to all the tools and features. There are also toolbars and tool palettes that can be docked anywhere you like. In addition to this, there are some special palettes that pop up only when you need them. After your introduction to ArchiCAD's user interface, you can jump right in and start creating the walls and floors for your new house. Then you will learn how to create ceilings and the stairs. Before too long you will have a 3D model to orbit around. It is really fun and probably easier than you would expect. (For more resources related to this topic, see here.) The ArchiCAD GUI The first time you open ArchiCAD you will find the toolbars along the top, just under the menu bar and there will be palettes docked to the left and right of the drawing area. We will focus on the 3 following palettes to get started: The Toolbox palette: This contains all of your selection, modeling, and drafting tools. It will be located on the left hand side by default. The Info Box palette: This is your context menu that changes according to whatever tool is currently in use. By default, this will be located directly under the toolbars at the top. It has a scrolling function; hover your cursor over the palette and spin the scroll wheel on your mouse to reveal everything on the palette. The Navigator palette: This is your project navigation window. This palette gives you access to all your views, sheets, and lists. It will be located on the right-hand side by default. These three palettes can be seen in the following screenshot: All of the mentioned palettes are dockable and can be arranged however you like on your screen. They can also be dragged away from the main ArchiCAD interface. For instance, you could have palettes on a second monitor. Panning and Zooming ArchiCAD has the same panning and zooming interface as most other CAD (Computer-aided design) and BIM (Building Information Modeling) programs. Rolling the scroll wheel on your mouse will zoom in and out. Pressing down on the scroll wheel (or middle button) and moving your cursor will execute a pan. Each drawing view window has a row of zoom commands along the bottom. You should try each one to get familiar with each of their functions. View toggling When you have multiple views open, you can toggle through them by pressing the Ctrl key and tapping on the Tab key. Or, you can pick any of the open views from the bottom of the Window pull-down menu. Pressing the F2 key will open a 2D floor plan view and pressing the F3 key will open the default 3D view. Pressing the F5 key will open a 3D view of selected items. In other words, if you want to isolate specific items in a 3D view, select those items and press F5. The function keys are second nature to those that have been using ArchiCAD for a long time. If a feature has a function key shortcut, you should use it. Project setup ArchiCAD is available in multiple different language versions. The exercises in this book use the USA version of ArchiCAD. Obviously this version is in English. There is another version in English and that is referred to as the International (INT) version. 
You can use the International version to do the exercises in the book, just be aware that there may be some subtle differences in the way that something is named or designed. When you create a new project in ArchiCAD, you start by opening a project template. The template will have all the basic stuff you need to get started including layers, line types, wall types, doors, windows, and more. The following lesson will take you through the first steps in creating a new ArchiCAD project: Open ArchiCAD. The Start ArchiCAD dialog box will appear. Select the Create a New Project radio button at the top. Select the Use a Template radio button under Set up Project Settings. Select ArchiCAD 19 Residential Template.tpl from the drop-down list. If you have the International version of ArchiCAD, then the residential template may not be available. Therefore you can use ArchiCAD 19 Template.tpl. Click on New. This will open a blank project file. Project Settings Now that you have opened your new project, we are going to create a house with 4 stories (which includes a story for the roof). We create a story for the roof in order to facilitate a workspace to model the elements on that level. The template we just opened only has 2 stories, so we will need to add 2 more. Then we need to look at some other settings. Stories The settings for the stories are as follows: On the Navigator palette, select the Project Map icon . Double click on 1st FLOOR. Right click on Stories and select Create New Story. You will be prompted to give the new story a name. Enter the name BASEMENT. Click on the button next to Below. Enter 9' into the Height box and click on the Create button. Then double click on 2. 2nd FLOOR. Right click on Stories and then select Create New Story. You will be prompted to give the new story a name. Enter the name ROOF. Click on the button next to Above. Enter 9' into the Height box and click on the Create button. Your list of stories should now look like this 3. ROOF 2. 2nd Floor 1. 1st Floor -1. BASEMENT The International version of ArchiCAD (INT) will give the first floor the index number of 0. The second floor index number will be 1. And the roof will be 2. Now we need to adjust the heights of the other stories: Right click on Stories (on the Navigator palette) and select Story Settings. Change the number in the Height to Next box for 1st FLOOR to 9'. Do the same for 2nd FLOOR. Units On the menu bar, go to Options | Project Preferences | Working Units and perform the following steps: Ensure Model Units is set to feet & fractional inches. Ensure that Fractions is set to 1/64. Ensure that Layout Units is set to feet & fractional inches. Ensure that Angle Unit is set to Decimal degrees. Ensure that Decimals is set to 2. You are now ready to begin modeling your house, but first let's save the project. To save the project, perform the following steps: Navigate to the File menu and click on Save. If by chance you have saved it already, then click on Save As. Name your file Colonial House. Click on Save. Renovation filters The Renovation Filter feature allows you to differentiate how your drawing elements will appear in different construction phases. For renovation projects that have demolition and new work phases, you need to show the items to be demolished differently than the existing items that are to remain, or that are new. The projects we will work on in this book do not require this feature to manage phases because we will only be creating a new construction. 
However, it is essential that your renovation filter setting is set to New Construction. We will do this in the first modeling exercise. Selection methods Before you can do much in ArchiCAD, you need to be familiar with selecting elements. There are several ways to select something in ArchiCAD, which are as follows: Single cursor click Pick the Arrow tool from the toolbox or hold the Shift key down on the keyboard and click on what you want to select. As you click on the elements, hold the Shift key down to add them to your selection set. To remove elements from the selection set, just click on them again with the Shift key pressed. There is a mode within this mode called Quick Selection. It is toggled on and off from the Info Box palette. The icon looks like a magnet. When it is on, it works like a magnet because it will stick to faces or surfaces, such as slabs or fill patterns. If this mode is not on, then you are required to find an edge, endpoint, or hotspot node to select an element with a single click. Hold the Space button down to temporarily change the mode while selecting elements. Window Pick the Arrow tool from the toolbox or hold the Shift key down and draw your selection window. Click once for the window starting corner and click a second time for the end corner. This works just as windowing does in AutoCAD. Not as Revit does, where you need to hold the mouse button down while you draw your window. There are 3 different windowing methods. Each one is set from the Info Box palette: Partial Elements: Anything that is inside of or touching the window will be selected. AutoCAD users will know this as a Crossing Window. Entire Elements: Anything completely encapsulated by the window will be selected. If something is not completely inside the window then it will not be selected. Direction Dependent: Click and window to the left, the Partial Elements window will be used. Click and window to the right, the Entire Elements window will be used. Marquee A marquee is a selection window that stays on the screen after you create it. If you are a MicroStation CAD program user, this will be similar to a selection window. It can be used for printing a specific area in a drawing view and performing what AutoCAD users would refer to as a Stretch command. There are 2 types of marquees; single story (skinny) and multi story (fat). The single story marquee is used when you want to select elements on your current story view only. The multi-story marquee will select everything on your current story as well as the stories above and below your selections. The Find & Select tool This lets ArchiCAD select elements for you, based on the attribute criteria that you define, such as element type, layer, and pen number. When you have the criteria defined, click on the plus sign button on the palette and all the elements within that criterion inside your current view or marquee will be selected. The quickest way to open the Find & Select tool is with the Ctrl + F key combination Modification commands As you draw, you will inevitably need to move, copy, stretch, or trim something. Select your items first, and then execute the modification command. 
Here are the basic commands you will need to get things moving: Adjust (Extend): Press Ctrl + - or navigate to Edit | Reshape | Adjust Drag (Move): Press Ctrl + D or…navigate to Edit | Move | Drag Drag a Copy (Copy): Press Ctrl + Shift + D or navigate to Edit | Move | Drag a Copy Intersect (Fillet): Click on the Intersect button on the Standard toolbar or navigate to Edit | Reshape | Intersect Resize (Scale): Press Ctrl + K or navigate to Edit | Reshape | Resize Rotate: Press Ctrl + E or navigate to Edit | Move | Rotate Stretch: Press Ctrl + H or navigate to Edit | Reshape | Stretch Trim: Press Ctrl or click on the Trim button on the Standard toolbar or navigate to Edit | Reshape | Trim. Hold the Ctrl key down and click on the portion of wall or line that you want trimmed off. This is the fastest way to trim anything! Memorizing the keyboard combinations above is a sure way to increase your productivity. Modeling – part I We will start with the wall tool to create the main exterior walls on the 1st floor of our house, and then create the floor with the slab tool. However, before we begin, let's make sure your Renovation Filter is set to New Construction. Setting the Renovation Filter The Renovation Filter is an active setting that controls how the elements you create are displayed. Everything we create in this project is for new construction so we need the new construction filter to be active. To do so, go to the Document menu, click on Renovation and then click on 04 New Construction. Using the Wall tool The Wall tool has settings for height, width, composite, layer, pen weight and more. We will learn about these things as we go along, and learn a little bit more each time we progress into to the project. Double click on 1. 1st Story in the Navigator palette to ensure we are working on story 1. Select the Wall tool from the Toolbox palette or from the menu bar under Design | Design Tools | Wall. Notice that this will automatically change the contents of the Info Box palette. Click on the wall icon inside Info Box. This will bring up the active properties of the wall tool in the form of the Wall Default Settings window. (This can also be achieved by double clicking on the wall tool button in Toolbox). Change the composite type to Siding 2x6 Wd. Stud. Click on the wall composite button to do this.   Creating the exterior walls of the 1st Story To create the exterior walls of the 1st story perform the following steps: Select the Wall tool from the Toolbox palette, or from the menu bar under Design | Design Tools | Wall. Double click on 1. 1st Story in the Navigator palette to ensure that we are working on story 1. Select the Wall tool from the Toolbox palette, or from the menu bar under Design | Design Tools | Wall. Change the composite type to be Siding 2x6 Wd. Stud. Click on the wall composite button to do this. Notice at the bottom of the Wall Default Settings window is the layer currently assigned to the wall tool. It should be set to A-WALL-EXTR. Click on OK to start your first wall. Click near the center of the drawing screen and move your cursor to the left, notice the orange dashed line that appears. That is your guide line. Keep your cursor over the guide line so that it keeps you locked in the orthogonal direction. You should also immediately see the Tracker palette pop up, displaying your distance drawn and angle after your first click. Before you make your second click, enter the number 24 from your keyboard and press Enter. You should now have 24-0" long wall. 
If your Tracker palette does not appear, it may be toggled off. Go up to the Standard tool bar and click on the Tracker button to turn it on. Select this again and make your first click on the upper left end corner of your first wall. Move your cursor down, so that it snaps to the guideline, enter the number 28, and press the Enter key. Draw your third wall by clicking on the bottom left endpoint of your second wall, move your cursor to the right, snapped over the guide line, type in the number 24 and press Enter. Draw your fourth wall by clicking on the bottom right end point of your third wall and the starting point of your first wall. You should now have four walls that measure 24'-0" x 28"-0, outside edge to outside edge. Move your four walls to the center of the drawing view and perform the following steps: Click on the Arrow tool at the top of the Toolbox. Click outside one of the corners of the walls, and then click on the opposite side. All four walls should be selected now. Use the Drag command to move the walls. The quickest way to activate the Drag command is by pressing Ctrl + D. The long way is from the menu bar by navigating to Edit | Move | Drag. Drag (move) the walls to the center of your drawing window. Press the Esc key or click on a blank space in your drawing window to deselect the walls. You can select all the walls in a view by activating the Wall tool and pressing Ctrl + A. You are now ready to create a floor with the slab tool. But first, let's have a little fun and see how it looks in 3D (press the F3 key): From the Navigator palette, double click on Generic Axonometry under the 3D folder icon. This will open a 3D view window. Hold your Shift key down, press down on your scroll wheel button, and slowly move your mouse around. You are now orbiting! Play around with it a little, then get back to work and go to the next step to create your first floor slab. Press the F2 key to get back to a 2D view. You can also perform a 3D orbit via the Orbit button at the bottom of any 3D view window. Creating the first story's floor with the Slab tool The slab tool is used to create floors. It is also used to create ceilings. We will begin using it now to create the first floor for our house. Similar to the Wall tool, it also has settings for layer, pen weight and composite. To create the first story's floor using the Slab tool, perform the following steps: Select the Slab tool from the Toolbox palette or from the menu bar under Design | Design Tools | Slab. This will change the contents of the Info Box palette. Click on the Slab icon in Info Box. This will bring up the Slab Default Settings (active properties) window for the Slab tool. As with the Wall tool, you have a composite setting for the slab tool. Set the composite type for the slab tool to FLR Wd Flr + 2x10. The layer should be set to A-FLOR. Click OK. You could draw the shape of the slab by tracing over the outside lines of your walls but we are going to use the Magic Wand feature. Hover your cursor over the space inside your four walls and press the space bar on your keyboard. This will automatically create the slab using the boundary created by the walls. Then, open a 3D view and look at your floor. Instead of using the tool icon inside the Info Box palette, double click on any tool icon inside the Toolbox palette to bring up the default settings window for that tool. 
Creating the exterior walls and floor slabs for the basement and the second story We could repeat all of the previous steps to create the floor and walls for the second story and the basement, but in this case, it will be quicker to copy what we have already drawn on the first story and copy it up with the Edit Elements by Stories tool. Perform the following steps to create the exterior walls and floor slabs for the basement and second story: Go to the Navigator palette and right click over Stories, select Edit Elements by Stories. The Edit Elements by Stories window will open. Under Select Action, you want to set it to Copy. Under From Story, set it to 1. 1st FLOOR. In the To Story section, check the box for 2nd FLOOR and -1. BASEMENT. Click on OK. You should see a dialog box appear, stating that as a result of the last operation, elements have been created and/or have changed their position on currently unseen stories. Whenever you get this message, you should confirm that you have not created any unwanted elements. Click on the Continue button. Now you should have walls and a floor on three stories; Basement, 1st FLOOR, and 2nd FLOOR. The quickest way to jump to the next story up or the next story down is with the Ctrl + Arrow Up or Ctrl + Arrow Down key combination. Basement element modification The floor and the walls on the BASEMENT story need to be changed to a different composite type. Do this by performing the following steps: Open the BASEMENT view and select the four walls by clicking on one at a time while holding down the Shift key. Right click over your selection and click on Wall Selection Settings. Change the walls to the EIFS on 8" CMU composite type. Then, click on OK. Move your cursor over the floor slab. The quick selection cursor should appear. This selection mode allows you to click on an object without needing to find an edge or endpoint. Click on the slab. Open the Slab Selection Setting window but this time, do it by pressing the Ctrl + T key combination. Change the floor slab composite to Conc. Slab: 4" on gravel. Click on OK. The Ctrl + T key combination is the quickest way to bring up an element's selection settings window when an element is selected. Open a 3D view (by pressing the F3 key) and orbit around your house. It should look similar to the following screenshot: Adding the garage We need to add the garage and the laundry room, which connects the garage to the house. Do this by performing the following steps: Open the 1st FLOOR story from the project map. Start the Wall tool. From the Info Box palette, set the wall composite setting to Siding 2x6 Wd. Stud. Click on the upper-left corner of your house for your wall starting point. Move your cursor to the left, snap to the guide line, type 6'-10", and press Enter. Change the Geometry Method setting on Info Box to Chained. Refer to the following screenshot: Start your next wall by clicking on the endpoint of your last wall, move your cursor up, snap to the guideline and type 5', and press Enter. Move your cursor to the left, snap to grid line, type in 12'-6", and press Enter. Move your cursor down, snap to grid line, type in 22'-4", and press Enter. Move your cursor to the right, snap to grid line and double click on the perpendicular west wall (double pressing your Enter key will work the same as a double click). Now we want to create the floor for this new set of walls. To do that, perform the following steps: Start the Slab tool. Change the composite to Conc. Slab: 4" on gravel. 
3. Hover your cursor inside the new set of walls and press the Space key to use the Magic Wand. This will create the floor slab for the garage and laundry room.

There is still one more wall to create, but this time we will use the Adjust command to, in effect, create a new wall:

1. Select the 5'-0" wall drawn in the previous exercise.
2. Go to the Edit menu, click on Reshape, and then click on Adjust.
3. Click on the bottom edge of the perpendicular wall down below. The wall should extend down. Refer to the following screenshot:

Then change to a 3D view (by pressing F3) and examine your work.

The 3D view

If you switch to a 3D view and your new modeling does not show, zoom in or out to refresh the view, or double click your scroll wheel (middle button). Your new work will appear.

Summary

In this article, you were introduced to the ArchiCAD Graphical User Interface (GUI) and project settings, and you learned how to select elements. You created all the major modeling for your house and got a primer on layers. Now you should have a good understanding of the ArchiCAD way of creating architectural elements and how to control their parameters.

Resources for Article:

Further resources on this subject:

Let There be Light! [article]
Creating an AutoCAD command [article]
Setting Up for Photoreal Rendering [article]

What is Apache Camel?

Packt
08 Jul 2015
9 min read
In this article by Jean-Baptiste Onofré, author of the book Mastering Apache Camel, we will see how Apache Camel originated in Apache ServiceMix. Apache ServiceMix 3 was powered by the Spring framework and implemented the JBI specification. The Java Business Integration (JBI) specification proposed a Plug and Play approach for integration problems. JBI was based on WebService concepts and standards. For instance, it directly reuses the Message Exchange Patterns (MEP) concept that comes from the Web Services Description Language (WSDL). Camel reuses some of these concepts; for instance, you will see that we have the concept of MEP in Camel. However, JBI suffered mostly from two issues:

- In JBI, all messages between endpoints are transported in the Normalized Messages Router (NMR). In the NMR, a message has a standard XML format. As all messages in the NMR have the same format, it's easy to audit messages and the format is predictable. However, the JBI XML format has an important drawback for performance: it needs to marshall and unmarshall the messages. Some protocols (such as REST or RMI) are not easy to describe in XML. For instance, REST can work in stream mode; it doesn't make sense to marshall streams in XML. Camel is payload-agnostic. This means that you can transport any kind of message with Camel (not necessarily XML formatted).
- JBI describes a packaging. We distinguish the binding components (responsible for the interaction with the systems outside of the NMR and the handling of the messages in the NMR) and the service engines (responsible for transforming the messages inside the NMR). However, it's not possible to directly deploy the endpoints based on these components. JBI requires a service unit (a ZIP file) per endpoint, and these have to be packaged together in a service assembly (another ZIP file). JBI also splits the description of the endpoint from its configuration. This does not result in a very flexible packaging: with definitions and configurations scattered across different files, it is not easy to maintain. In Camel, the configuration and definition of an endpoint are gathered in a simple URI, which is easier to read. Moreover, Camel doesn't force any packaging; the same definition can be packaged in a simple XML file, an OSGi bundle, or a regular JAR file.

In addition to JBI, another foundation of Camel is the book Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf. It describes design patterns answering classical problems while dealing with enterprise application integration and message-oriented middleware. The book describes the problems and the patterns to solve them. Camel strives to implement the patterns described in the book to make them easy to use and let the developer concentrate on the task at hand.

This is what Camel is: an open source framework that allows you to integrate systems and that comes with a lot of connectors and Enterprise Integration Patterns (EIP) components out of the box. And if that is not enough, one can extend and implement custom components.

Components and bean support

Apache Camel ships with a wide variety of components out of the box; currently, there are more than 100 components available. We can see:

- The connectivity components that allow exposure of endpoints for external systems or communicate with external systems. For instance, the FTP, HTTP, JMX, WebServices, JMS, and a lot more components are connectivity components. Creating an endpoint and the associated configuration for these components is easy, by directly using a URI (a short route example appears at the end of this section).
- The internal components applying rules to the messages internally to Camel. These kinds of components apply validation or transformation rules to the inflight message. For instance, validation or XSLT are internal components.

Camel brings a very powerful connectivity and mediation framework. Moreover, it's pretty easy to create new custom components, allowing you to extend Camel if the default component set doesn't match your requirements. It's also very easy to implement complex integration logic by creating your own processors and reusing your beans. Camel supports bean frameworks (IoC), such as Spring or Blueprint.
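As promised above, here is a minimal sketch of such a URI-defined endpoint pair in Camel's Java DSL. The FTP host, credentials, and queue name are invented placeholders for illustration, not values taken from this article:

import org.apache.camel.builder.RouteBuilder;

public class OrdersRoute extends RouteBuilder {
    @Override
    public void configure() throws Exception {
        // The endpoint component (ftp, jms) and its whole configuration
        // (host, credentials, polling delay, queue name) live in the URI.
        from("ftp://user@ftp.example.com/orders?password=secret&delay=60000")
            .to("jms:queue:incomingOrders");
    }
}

The same pair of URIs could be swapped for any other components (file, http, and so on) without changing the structure of the route.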
Predicates and expressions

As we will see later, most of the EIPs need a rule definition to apply a routing logic to a message. The rule is described using an expression. This means that we have to define expressions or predicates in the Enterprise Integration Patterns. An expression returns any kind of value, whereas a predicate returns true or false only.

Camel supports a lot of different languages to declare expressions or predicates. It doesn't force you to use one; it allows you to use the most appropriate one. For instance, Camel supports xpath, mvel, ognl, python, ruby, PHP, JavaScript, SpEL (Spring Expression Language), Groovy, and so on as expression languages. It also provides native Camel prebuilt functions and languages that are easy to use, such as the header, constant, or simple languages (a simple-language predicate appears in the standalone sketch further below).

Data format and type conversion

Camel is payload-agnostic. This means that it can support any kind of message. Depending on the endpoints, it could be required to convert from one format to another. That's why Camel supports different data formats, in a pluggable way. This means that Camel can marshall or unmarshall a message in a given format. For instance, in addition to the standard JVM serialization, Camel natively supports Avro, JSON, protobuf, JAXB, XmlBeans, XStream, JiBX, SOAP, and so on. Depending on the endpoints and your needs, you can explicitly define the data format during the processing of the message.

On the other hand, Camel knows the expected format and type of endpoints. Thanks to this, Camel looks for a type converter, allowing it to implicitly transform a message from one format to another. You can also explicitly define the type converter of your choice at some points during the processing of the message. Camel provides a set of ready-to-use type converters but, as Camel supports a pluggable model, you can extend it by providing your own type converters. It's a simple POJO to implement.

Easy configuration and URI

Camel uses a different approach, based on URIs. The endpoint itself and its configuration are on the URI. The URI is human readable and provides the details of the endpoint, which is the endpoint component and the endpoint configuration. As this URI is part of the complete configuration (which defines what we name a route, as we will see later), it's possible to have a complete overview of the integration logic and connectivity at a glance.

Lightweight and different deployment topologies

Camel itself is very light. The Camel core is only around 2 MB, and contains everything required to run Camel. As it's based on a pluggable architecture, all Camel components are provided as external modules, allowing you to install only what you need, without installing superfluous and needlessly heavy modules.

Camel is based on simple POJOs, which means that the Camel core doesn't depend on other frameworks: it's an atomic framework and is ready to use. All other modules (components, DSL, and so on) are built on top of this Camel core.

Moreover, Camel is not tied to one container for deployment. Camel supports a wide range of containers to run in. They are as follows:

- A J2EE application server such as WebSphere, WebLogic, JBoss, and so on
- A web container such as Apache Tomcat
- An OSGi container such as Apache Karaf
- A standalone application using frameworks such as Spring

Camel gives a lot of flexibility, allowing you to embed it into your application or to use an enterprise-ready container.
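To illustrate the embedding option — and the simple-language predicate promised earlier — here is a hedged sketch of booting Camel inside a plain Java program. The directory names and the country header are invented for the example; they are not part of this article's use case:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.impl.DefaultCamelContext;

public class EmbeddedCamel {
    public static void main(String[] args) throws Exception {
        // The Camel core boots inside any plain Java application.
        DefaultCamelContext context = new DefaultCamelContext();
        context.addRoutes(new RouteBuilder() {
            @Override
            public void configure() {
                // A content-based router: the simple-language predicate
                // inspects a message header and picks the target endpoint.
                from("file:inbox")
                    .choice()
                        .when(simple("${header.country} == 'US'"))
                            .to("file:outbox/us")
                        .otherwise()
                            .to("file:outbox/other");
            }
        });
        context.start();
        Thread.sleep(10000); // let the route poll for a little while
        context.stop();
    }
}

The same context could just as easily be created by a Spring or Blueprint container, or deployed into one of the servers listed above; only the bootstrap code changes, not the route.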
Quick prototyping and testing support

In any integration project, it's typical that some part of the integration logic is not yet available. For instance:

- The application to integrate with has not yet been purchased or is not yet ready
- The remote system to integrate with has a heavy cost, not acceptable during the development phase
- Multiple teams work in parallel, so we may have some kind of deadlock between the teams

As a complete integration framework, Camel provides a very easy way to prototype part of the integration logic. Even if you don't have the actual system to integrate, you can simulate this system (mock it), which allows you to implement your integration logic without waiting for dependencies. The mocking support is directly part of the Camel core and doesn't require any additional dependency.

Along the same lines, testing is also crucial in an integration project. In this kind of project, a lot of errors can happen and most are unforeseen. Moreover, a small change in an integration process might impact a lot of other processes. Camel provides the tools to easily test your design and integration logic, allowing you to integrate this in a continuous integration platform.

Management and monitoring using JMX

Apache Camel uses the Java Management Extensions (JMX) standard and provides a lot of insight into the system using MBeans (Management Beans), providing a detailed view of the current system:

- The different integration processes with the associated metrics
- The different components and endpoints with the associated metrics

Moreover, these MBeans provide more than metrics. They also provide operations to manage Camel. For instance, the operations allow you to stop an integration process, suspend an endpoint, and so on. Using a combination of metrics and operations, you can configure a very agile integration solution.

Active community

The Apache Camel community is very active. This means that potential issues are identified very quickly and a fix is available soon after. However, it also means that a lot of ideas and contributions are proposed, giving more and more features to Camel. Another big advantage of an active community is that you will never be alone; a lot of people are active on the mailing lists who are ready to answer your questions and provide advice.

Summary

Apache Camel is an enterprise integration solution used in many large organizations, with enterprise support available through RedHat or Talend.

Resources for Article:

Further resources on this subject:

Getting Started [article]
A Quick Start Guide to Flume [article]
Best Practices [article]

Developing a JavaFX Application for iOS

Packt
08 Jul 2015
10 min read
In this article by Mohamed Taman, author of the book JavaFX Essentials, we will learn how to develop a JavaFX application for iOS.

Apple has a great market share in the mobile and PC/laptop world, with many different devices, from mobile phones such as the iPhone to musical devices such as the iPod and tablets such as the iPad.

(For more resources related to this topic, see here.)

It has a rapidly growing application market, called the App Store, serving its community, where the number of available apps increases daily. Mobile application developers should be ready for such a market.

Mobile application developers targeting both iOS and Android face many challenges. By just comparing the native development environments of these two platforms, you will find that they differ substantially. iOS development, according to Apple, is based on the Xcode IDE (https://developer.apple.com/xcode/) and its programming languages. Traditionally, this was Objective-C and, in June 2014, Apple introduced Swift (https://developer.apple.com/swift/). On the other hand, Android development, as defined by Google, is based on the IntelliJ IDEA IDE and the Java programming language. Not many developers are proficient in both environments. In addition, these differences rule out any code reuse between the platforms.

JavaFX 8 is filling the gap for reusable code between the platforms, as we will see in this article, by sharing the same application on both platforms. Here are some skills that you will have gained by the end of this article:

- Installing and configuring iOS environment tools and software
- Creating iOS JavaFX 8 applications
- Simulating and debugging JavaFX mobile applications
- Packaging and deploying applications on iOS mobile devices

Using RoboVM to run JavaFX on iOS

RoboVM is the bridge from Java to Objective-C. Using it, it becomes easy to develop JavaFX 8 applications that are to be run on iOS-based devices, as the ultimate goal of the RoboVM project is to solve this problem without compromising on developer experience or app user experience.

As we saw in the article about Android, using JavaFXPorts to generate APKs was a relatively easy task, due to the fact that Android is based on Java and the Dalvik VM. On the contrary, iOS doesn't have a VM for Java, and it doesn't allow dynamic loading of native libraries. Another approach is required.

The RoboVM open source project tries to close the gap for Java developers by creating a bridge between Java and Objective-C, using an ahead-of-time compiler that translates Java bytecode into native ARM or x86 machine code.
Features

Let's go through the RoboVM features:

- Brings Java and other JVM languages, such as Scala, Clojure, and Groovy, to iOS-based devices
- Translates Java bytecode into machine code ahead of time for fast execution directly on the CPU without any overhead
- The main target is iOS and the ARM processor (32- and 64-bit), but there is also support for Mac OS X and Linux running on x86 CPUs (both 32- and 64-bit)
- Does not impose any restrictions on the Java platform features accessible to the developer, such as reflection or file I/O
- Supports standard JAR files that let the developer reuse the vast ecosystem of third-party Java libraries
- Provides access to the full native iOS APIs through a Java-to-Objective-C bridge, enabling the development of apps with truly native UIs and with full hardware access
- Integrates with the most popular tools such as NetBeans, Eclipse, IntelliJ IDEA, Maven, and Gradle
- App Store ready, with hundreds of apps already in the store

Limitations

Mainly due to the restrictions of the iOS platform, there are a few limitations when using RoboVM:

- Loading custom bytecode at runtime is not supported. All class files comprising the app have to be available at compile time on the developer machine.
- The Java Native Interface technology as used on the desktop or on servers usually loads native code from dynamic libraries, but Apple does not permit custom dynamic libraries to be shipped with an iOS app. RoboVM supports a variant of JNI based on static libraries.
- Another big limitation is that RoboVM is an alpha-state project under development and not yet recommended for production usage.

RoboVM has full support for reflection.

How it works

Since February 2015, there has been an agreement between the companies behind RoboVM and JavaFXPorts, and now a single plugin called jfxmobile-plugin allows us to build applications for three platforms—desktop, Android, and iOS—from the same codebase.

The JavaFXMobile plugin adds a number of tasks to your Java application that allow you to create .ipa packages that can be submitted to the App Store.

Android mostly uses Java as the main development language, so it is easy to merge your JavaFX 8 code with it. On iOS, the situation is internally totally different—but with similar Gradle commands. The plugin will download and install the RoboVM compiler, and it will use RoboVM compiler commands to create an iOS application in build/javafxports/ios.

Getting started

In this section, you will learn how to install the RoboVM compiler using the JavaFXMobile plugin, and make sure the tool chain works correctly by reusing the same application, Phone Dial version 1.0.

Prerequisites

In order to use the RoboVM compiler to build iOS apps, the following tools are required:

- Gradle 2.4 or higher is required to build applications with the jfxmobile plugin.
- A Mac running Mac OS X 10.9 or later.
- Xcode 6.x from the Mac App Store (https://itunes.apple.com/us/app/xcode/id497799835?mt=12).

The first time you install Xcode, and every time you update to a new version, you have to open it once to agree to the Xcode terms.

Preparing a project for iOS

We will reuse the project we developed before, for the Android platform, since there is no difference in code, project structure, or Gradle build script when targeting iOS. They share the same properties and features, but with different Gradle commands that serve iOS development, and a minor change in the Gradle build script for the RoboVM compiler. Therefore, we will see the power of WORA (Write Once, Run Anywhere) with the same application.
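The piece that makes this reuse possible is a small platform abstraction: the shared UI code only talks to an interface, and each target supplies its own implementation (IosPlatform, shown later under Interoperability with low-level iOS APIs, is the iOS one). The book's actual interface may declare more members; as a hedged sketch inferred from that implementation, the shared contract needs little more than this:

package packt.taman.jfx8.ch4;

// Shared contract for platform-specific features; the JavaFX UI code
// depends only on this interface, never on an iOS or Android class.
public interface Platform {

    // Ask the host platform to dial the given phone number.
    void callNumber(String number);
}

With that abstraction in place, the rest of the project really is identical across platforms.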
Project structure

Based on the same project structure from the Android project, the project structure for our iOS app should be as shown in the following figure:

The application

We are going to reuse the same application, the Phone DialPad version 2.0 JavaFX 8 application:

As you can see, reusing the same codebase is a very powerful and useful feature, especially when you are developing to target many mobile platforms such as iOS and Android at the same time.

Interoperability with low-level iOS APIs

To have the same functionality of natively calling the default iOS phone dialer from our application as we did with Android, we have to provide the native solution for iOS as the following IosPlatform implementation:

import org.robovm.apple.foundation.NSURL;
import org.robovm.apple.uikit.UIApplication;
import packt.taman.jfx8.ch4.Platform;

public class IosPlatform implements Platform {

    @Override
    public void callNumber(String number) {
        if (!number.equals("")) {
            NSURL nsURL = new NSURL("telprompt://" + number);
            UIApplication.getSharedApplication().openURL(nsURL);
        }
    }
}

Gradle build files

We will use the same Gradle build script file, but with a minor change: adding the following lines to the end of the script:

jfxmobile {
    ios {
        forceLinkClasses = [ 'packt.taman.jfx8.ch4.**.*' ]
    }
    android {
        manifest = 'lib/android/AndroidManifest.xml'
    }
}

All the work involved in installing and using the RoboVM compilers is done by the jfxmobile plugin. The purpose of those lines is to give the RoboVM compiler the location of the main application class that has to be loaded at runtime, as it is not visible by default to the compiler. The forceLinkClasses property ensures that those classes are linked in during RoboVM compilation.

Building the application

After we have added the necessary configuration to the build script for iOS, it is time to build the application in order to deploy it to different iOS target devices. To do so, we have to run the following command:

$ gradle build

We should have the following output:

BUILD SUCCESSFUL

Total time: 44.74 secs

We have built our application successfully; next, we need to generate the .ipa and, in the case of production, you have to test it by deploying it to as many iOS versions as you can.

Generating the iOS .ipa package file

In order to generate the final .ipa iOS package for our JavaFX 8 application, which is necessary for the final distribution to any device or the App Store, you have to run the following gradle command:

gradle ios

This will generate the .ipa file in the directory build/javafxports/ios.

Deploying the application

During development, we need to check our application GUI and final application prototype on iOS simulators, and measure the application performance and functionality on different devices. These procedures are very useful, especially for testers. Let's see how easy it is to run our application on either simulators or real devices.
Deploying to a simulator

On a simulator, you can simply run the following command to check if your application is running:

$ gradle launchIPhoneSimulator

This command will package and launch the application in an iPhone simulator, as shown in the following screenshot:

DialPad2 JavaFX 8 application running on the iOS 8.3/iPhone 4s simulator

This command will launch the application in an iPad simulator:

$ gradle launchIPadSimulator

Deploying to an Apple device

In order to package a JavaFX 8 application and deploy it to an Apple device, simply run the following command:

$ gradle launchIOSDevice

This command will launch the JavaFX 8 application on the device that is connected to your desktop/laptop. Once the application is launched on your device, type in any number and then tap Call. The iPhone will ask for permission to dial using the default mobile dialer; tap OK. The default mobile dialer will be launched and will dial the number, as shown in the following figure:

To be able to test and deploy your apps on your devices, you will need an active subscription with the Apple Developer Program. Visit the Apple Developer Portal, https://developer.apple.com/register/index.action, to sign up. You will also need to provision your device for development. You can find information on device provisioning in the Apple Developer Portal, or follow this guide: http://www.bignerdranch.com/we-teach/how-to-prepare/ios-device-provisioning/.

Summary

This article gave us a very good understanding of how JavaFX-based applications can be developed and customized using RoboVM for iOS, making it possible to run your applications on Apple platforms. You learned about RoboVM features and limitations, and how it works; you also gained skills that you can use for development. You then learned how to install the required software and tools for iOS development, and how to enable Xcode along with the RoboVM compiler to package and install the Phone Dial JavaFX-8-based application on iOS simulators. Finally, we provided tips on how to run and deploy your application on real devices.

Resources for Article:

Further resources on this subject:

Function passing [article]
Creating Java EE Applications [article]
Contexts and Dependency Injection in NetBeans [article]

To Be or Not to Be – Optionals

Packt
08 Jul 2015
21 min read
In this article by Andrew J Wagner, author of the book Learning Swift, we will cover:

- What is an optional?
- How to unwrap an optional
- Optional chaining
- Implicitly unwrapped optionals
- How to debug optionals
- The underlying implementation of an optional

(For more resources related to this topic, see here.)

Introducing optionals

So, we know that the purpose of optionals in Swift is to allow the representation of an absent value, but what does that look like and how does it work? An optional is a special type that can wrap any other type. This means that you can make an optional String, an optional Array, and so on. You can do this by adding a question mark (?) to the type name:

var possibleString: String?
var possibleArray: [Int]?

Note that this code does not specify any initial values. This is because all optionals, by default, are set to no value at all. If we want to provide an initial value, we can do so like any other variable:

var possibleInt: Int? = 10

Also note that, if we leave out the type specification (: Int?), possibleInt would be inferred to be of the Int type instead of an Int optional.

It is pretty verbose to say that a variable lacks a value. Instead, if an optional lacks a value, we say that it is nil. So, both possibleString and possibleArray are nil, while possibleInt is 10. However, possibleInt is not truly 10. It is still wrapped in an optional. You can see all the forms a variable can take by putting the following code into a playground:

var actualInt = 10
var possibleInt: Int? = 10
var nilInt: Int?
println(actualInt) // "10"
println(possibleInt) // "Optional(10)"
println(nilInt) // "nil"

As you can see, actualInt prints out as we expect it to, but possibleInt prints out as an optional that contains the value 10 instead of just 10. This is a very important distinction because an optional cannot be used as if it were the value it wraps. The nilInt optional just reports that it is nil.

At any point, you can update the value within an optional, including giving it a value for the first time, using the assignment operator (=):

nilInt = 2
println(nilInt) // "Optional(2)"

You can even remove the value within an optional by assigning it to nil:

nilInt = nil
println(nilInt) // "nil"

So, we have this wrapped form of a variable that may or may not contain a value. What do we do if we need to access the value within an optional? The answer is that we must unwrap it.

Unwrapping an optional

There are multiple ways to unwrap an optional. All of them essentially assert that there is truly a value within the optional. This is a wonderful safety feature of Swift. The compiler forces you to consider the possibility that an optional lacks any value at all. In other languages, this is a very commonly overlooked scenario that can cause obscure bugs.

Optional binding

The safest way to unwrap an optional is by using something called optional binding. With this technique, you can assign a temporary constant or variable to the value contained within the optional. This process is contained within an if statement, so that you can use an else statement for when there is no value. An optional binding looks like this:

if let string = possibleString {
    println("possibleString has a value: \(string)")
} else {
    println("possibleString has no value")
}

An optional binding is distinguished from an if statement primarily by the if let syntax.
Semantically, this code says: "if you can let the constant string be equal to the value within possibleString, print out its value; otherwise, print that it has no value." The primary purpose of an optional binding is to create a temporary constant that is the normal (nonoptional) version of the optional.

It is also possible to use a temporary variable in an optional binding:

possibleInt = 10
if var int = possibleInt {
    int *= 2
}
println(possibleInt) // Optional(10)

Note that an asterisk (*) is used for multiplication in Swift.

You should also note something important about this code: if you put it into a playground, even though we multiplied int by 2, the value does not change. When we print out possibleInt later, the value still remains Optional(10). This is because even though we made int a variable (otherwise known as mutable), it is simply a temporary copy of the value within possibleInt. No matter what we do with int, nothing will be changed about the value within possibleInt. If we need to update the actual value stored within possibleInt, we simply need to assign int back to possibleInt after we are done modifying it:

possibleInt = 10
if var int = possibleInt {
    int *= 2
    possibleInt = int
}
println(possibleInt) // Optional(20)

Now the value wrapped inside possibleInt has actually been updated.

A common scenario that you will probably come across is the need to unwrap multiple optional values. One way of doing this is by simply nesting the optional bindings:

if let actualString = possibleString {
    if let actualArray = possibleArray {
        if let actualInt = possibleInt {
            println(actualString)
            println(actualArray)
            println(actualInt)
        }
    }
}

However, this can be a pain, as it increases the indentation level each time to keep the code organized. Instead, you can actually list multiple optional bindings in a single statement, separated by commas:

if let actualString = possibleString,
    let actualArray = possibleArray,
    let actualInt = possibleInt {
    println(actualString)
    println(actualArray)
    println(actualInt)
}

This generally produces more readable code.

This way of unwrapping is great, but saying that optional binding is the safe way to access the value within an optional implies that there is an unsafe way to unwrap an optional. This way is called forced unwrapping.

Forced unwrapping

The shortest way to unwrap an optional is by forced unwrapping. This is done using an exclamation mark (!) after the variable name when it is used:

possibleInt = 10
possibleInt! *= 2

println(possibleInt) // "Optional(20)"

However, the reason it is considered unsafe is that your entire program crashes if you try to unwrap an optional that is currently nil:

nilInt! *= 2 // fatal error

The full error you get is "unexpectedly found nil while unwrapping an Optional value". This is because forced unwrapping is essentially your personal guarantee that the optional truly holds a value. This is why it is called forced. Therefore, forced unwrapping should be used in limited circumstances. It should never be used just to shorten up the code. Instead, it should only be used when you can guarantee, from the structure of the code, that it cannot be nil, even though it is defined as an optional. Even in this case, you should check whether it is possible to use a nonoptional variable instead. The only other place you may use it is when your program truly cannot recover if an optional is nil.
In these circumstances, you should at least consider presenting an error to the user, which is always better than simply having your program crash.

An example of a scenario where forced unwrapping may be used effectively is with lazily calculated values. A lazily calculated value is a value that is not created until the first time it is accessed. To illustrate this, let's consider a hypothetical class that represents a filesystem directory. It would have a property that lists its contents, which is lazily calculated. The code would look something like this:

class FileSystemItem {}
class File: FileSystemItem {}
class Directory: FileSystemItem {
    private var realContents: [FileSystemItem]?
    var contents: [FileSystemItem] {
        if self.realContents == nil {
            self.realContents = self.loadContents()
        }
        return self.realContents!
    }

    private func loadContents() -> [FileSystemItem] {
        // Do some loading
        return []
    }
}

Here, we defined a superclass called FileSystemItem that both File and Directory inherit from. The contents of a directory is a list of any kind of FileSystemItem. We define contents as a calculated property and store the real value within the realContents property. The calculated property checks whether there is a value yet loaded for realContents; if there isn't, it loads the contents and puts it into the realContents property. Based on this logic, we know with 100 percent certainty that there will be a value within realContents by the time we get to the return statement, so it is perfectly safe to use forced unwrapping.

Nil coalescing

In addition to optional binding and forced unwrapping, Swift also provides an operator called the nil coalescing operator to unwrap an optional. This is represented by a double question mark (??). Basically, this operator lets us provide a default value for a variable or operation result in case it is nil. This is a safe way to turn an optional value into a nonoptional value, and it would look something like this:

var possibleString: String? = "An actual string"
println(possibleString ?? "Default String") // "An actual string"

Here, we ask the program to print out possibleString unless it is nil, in which case it will just print Default String. Since we did give it a value, it printed out that value, and it is important to note that it printed out as a regular variable, not as an optional. This is because, one way or another, an actual value will be printed. This is a great tool for concisely and safely unwrapping an optional when a default value makes sense.

Optional chaining

A common scenario in Swift is to have an optional that you must calculate something from. If the optional has a value, you want to store the result of the calculation, but if it is nil, the result should just be set to nil:

var invitee: String? = "Sarah"
var uppercaseInvitee: String?
if let actualInvitee = invitee {
    uppercaseInvitee = actualInvitee.uppercaseString
}

This is pretty verbose. To shorten this up in an unsafe way, we could use forced unwrapping:

uppercaseInvitee = invitee!.uppercaseString

However, optional chaining will allow us to do this safely. Essentially, it allows optional operations on an optional. When the operation is called, if the optional is nil, it immediately returns nil; otherwise, it returns the result of performing the operation on the value within the optional:

uppercaseInvitee = invitee?.uppercaseString

So in this call, invitee is an optional.
Instead of unwrapping it, we will use optional chaining by placing a question mark (?) after it, followed by the optional operation. In this case, we asked for the uppercaseString property on it. If invitee is nil, uppercaseInvitee is immediately set to nil without it even trying to access uppercaseString. If it actually does contain a value, uppercaseInvitee gets set to the uppercaseString property of the contained value. Note that all optional chains return an optional result.

You can chain as many calls, both optional and nonoptional, as you want in this way:

var myNumber: String? = "27"
myNumber?.toInt()?.advancedBy(10).description

This code attempts to add 10 to myNumber, which is represented as a String. First, the code uses an optional chain in case myNumber is nil. Then, the call to toInt uses an additional optional chain because that method returns an optional Int type. We then call advancedBy, which does not return an optional, allowing us to access the description of the result without using another optional chain. If at any point any of the optionals are nil, the result will be nil. This can happen for two different reasons:

- This can happen because myNumber is nil
- This can also happen because toInt returns nil, as it cannot convert the String to the Int type

If the chain makes it all the way to advancedBy, there is no longer a failure path and it will definitely return an actual value. You will notice that there are exactly two question marks used in this chain and there are two possible failure reasons.

At first, it can be hard to understand when you should and should not use a question mark to create a chain of calls. The rule is that you should always use a question mark if the previous element in the chain returns an optional. However, since you are prepared, let's look at what happens if you use an optional chain improperly:

myNumber.toInt() // Value of optional type 'String?' not unwrapped

In this case, we try to call a method directly on an optional without a chain, so we get an error. We also have the case where we try to inappropriately use an optional chain:

var otherNumber = "10"
otherNumber?.toInt() // Operand of postfix '?' should have optional type

Here, we get an error that says a question mark can only be used on an optional type. It is great to develop a good sense of the errors you will see when you make mistakes, so that you can quickly correct them, because we all make silly mistakes from time to time.

Another great feature of optional chaining is that it can be used for method calls on an optional that do not actually return a value:

var invitees: [String]? = []
invitees?.removeAll(keepCapacity: false)

In this case, we only want to call removeAll if there is truly a value within the optional array. So, with this code, if there is a value, all the elements are removed from it; otherwise, it remains nil. In the end, optional chaining is a great choice for writing concise code that still remains expressive and understandable.

Implicitly unwrapped optionals

There is a second type of optional called an implicitly unwrapped optional. There are two ways to look at what an implicitly unwrapped optional is. One way is to say that it is a normal variable that can also be nil. The other way is to say that it is an optional that you don't have to unwrap to use. The important thing to understand about them is that, like optionals, they can be nil but, like a normal variable, you do not have to unwrap them.
You can define an implicitly unwrapped optional with an exclamation mark (!) instead of a question mark (?) after the type name:

var name: String!

Just like with regular optionals, implicitly unwrapped optionals do not need to be given an initial value because they are nil by default. At first, this may sound like it is the best of both worlds but, in reality, it is more like the worst of both worlds. Even though an implicitly unwrapped optional does not have to be unwrapped, it will crash your entire program if it is nil when used:

name.uppercaseString // Crash

A great way to think about them is that every time an implicitly unwrapped optional is used, it is implicitly performing a forced unwrapping. The exclamation mark is placed in its type declaration instead of using it every time. This is particularly bad because it appears the same as any other variable, except for how it is declared. This means that it is very unsafe to use, unlike a normal optional.

So, if implicitly unwrapped optionals are the worst of both worlds and are so unsafe, why do they even exist? The reality is that in rare circumstances, they are necessary. They are used in circumstances where a variable is not truly optional, but you also cannot give an initial value to it. This is almost always true in the case of custom types that have a member variable that is nonoptional, but cannot be set during initialization.

A rare example of this is a view in iOS. UIKit, as we discussed earlier, is the framework that Apple provides for iOS development. In it, Apple has a class called UIView that is used for displaying content on the screen. Apple also provides a tool in Xcode called Interface Builder that lets you design these views in a visual editor instead of in code. Many views designed in this way need references to other views that can be accessed programmatically later. When one of these views is loaded, it is initialized without anything connected, and then all the connections are made. Once all the connections are made, a function called awakeFromNib is called on the view. This means that these connections are not available for use during initialization, but are available once awakeFromNib is called. This order of operations also ensures that awakeFromNib is always called before anything actually uses the view.

This is a circumstance where it is necessary to use an implicitly unwrapped optional. A member variable may not be able to be set when the view is initialized, but only once it is completely loaded:

import UIKit

class MyView: UIView {
    @IBOutlet var button: UIButton!
    var buttonOriginalWidth: CGFloat!

    override func awakeFromNib() {
        self.buttonOriginalWidth = self.button.frame.size.width
    }
}

Note that we have actually declared two implicitly unwrapped optionals. The first is a connection to button. We know this is a connection because it is preceded by @IBOutlet. This is declared as an implicitly unwrapped optional because the connections are not set up until after initialization, but they are still guaranteed to be set up before any other methods are called on the view. This also then leads us to make our second variable, buttonOriginalWidth, implicitly unwrapped, because we need to wait until the connection is made before we can determine the width of button. After awakeFromNib is called, it is safe to treat both button and buttonOriginalWidth as nonoptional.
You may have noticed that we had to dive pretty deep into app development in order to find a valid use case for implicitly unwrapped optionals, and this is arguably only because UIKit is implemented in Objective-C.

Debugging optionals

We already saw a couple of compiler errors that we commonly see because of optionals. If we try to call a method on an optional that we intended to call on the wrapped value, we will get an error. If we try to unwrap a value that is not actually optional, we will get an error that the variable or constant is not optional.

We also need to be prepared for runtime errors that optionals can cause. As discussed, optionals cause runtime errors if you try to forcefully unwrap an optional that is nil. This can happen with both explicit and implicit forced unwrapping. If you followed my advice so far in this article, this should be a rare occurrence. However, we all end up working with third-party code, and maybe they were lazy or maybe they used forced unwrapping to enforce their expectations about how their code should be used.

Also, we all suffer from laziness from time to time. It can be exhausting or discouraging to worry about all the edge cases when you are excited about programming the main functionality of your app. We may use forced unwrapping temporarily while we worry about that main functionality and plan to come back to handle it later. After all, during development, it is better to have forced unwrapping crash the development version of your app than it is for it to fail silently if you have not yet handled that edge case. We may even decide that an edge case is not worth the development effort of handling, because everything about developing an app is a trade-off. Either way, we need to recognize a crash from forced unwrapping quickly, so that we don't waste extra time trying to figure out what went wrong.

When an app tries to unwrap a nil value, if you are currently debugging the app, Xcode shows you the line that tries to do the unwrapping. The line reports that there was EXC_BAD_INSTRUCTION, and you will also get a message in the console saying fatal error: unexpectedly found nil while unwrapping an Optional value:

You will also sometimes have to look at which code currently calls the code that failed. To do that, you can use the call stack in Xcode. When your program crashes, Xcode automatically displays the call stack, but you can also manually show it by going to View | Navigators | Show Debug Navigator. This will look something like the following:

Here, you can click on different levels of code to see the state of things. This becomes even more important if the program crashes within one of Apple's frameworks, where you do not have access to the code. In that case, you should move up the call stack to the point where your code is called in the framework. You may also be able to look at the names of the functions to help you figure out what may have gone wrong. Anywhere on the call stack, you can look at the state of the variables in the debugger, as shown in the following screenshot:

If you do not see this variables view, you can display it by clicking on the button at the bottom-left corner that is second from the right, which will be grayed out. Here, you can see that invitee is indeed nil, which is what caused the crash.

As powerful as the debugger is, if you find that it isn't helping you find the problem, you can always put println statements in important parts of the code.
It is always safe to print out an optional as long as you don't forcefully unwrap it like in the preceding example. As we saw earlier, when an optional is printed, it will print nil if it doesn't have a value, or it will print Optional(<value>) if it does have a value.

Debugging is an extremely important part of becoming a productive developer because we all make mistakes and create bugs. Being a great developer means that you can identify problems quickly and understand how to fix them soon after that. This will largely come from practice, but it will also come when you have a firm grasp of what really happens with your code, instead of simply adapting some code you find online to fit your needs through trial and error.

The underlying implementation

At this point, you should have a pretty strong grasp of what an optional is and how to use and debug it, but it is valuable to look deeper at optionals and see how they actually work. In reality, the question mark syntax for optionals is just a special shorthand. Writing String? is equivalent to writing Optional<String>. Writing String! is equivalent to writing ImplicitlyUnwrappedOptional<String>. The Swift compiler has shorthand versions because they are so commonly used. This allows the code to be more concise and readable.

If you declare an optional using the long form, you can see Swift's implementation by holding command and clicking on the word Optional. Here, you can see that Optional is implemented as an enumeration. If we simplify the code a little, we have:

enum Optional<T> {
    case None
    case Some(T)
}

So, we can see that Optional really has two cases: None and Some. None stands for the nil case, while the Some case has an associated value, which is the value wrapped inside Optional. Unwrapping is then the process of retrieving the associated value out of the Some case. One part of this that you have not seen yet is the angled bracket syntax (<T>). This is a generic and essentially allows the enumeration to have an associated value of any type.

Realizing that optionals are simply enumerations will help you to understand how to use them. It also gives you some insight into how concepts are built on top of other concepts. Optionals seem really complex until you realize that they are just two-case enumerations. Once you understand enumerations, you can pretty easily understand optionals as well.

Summary

We only covered a single concept, optionals, in this article, but we saw that this is a pretty dense topic. We saw that, at the surface level, optionals are pretty straightforward. They offer a way to represent a variable that has no value. However, there are multiple ways to get access to the value wrapped within an optional, and they have very specific use cases. Optional binding is always preferred, as it is the safest method, but we can also use forced unwrapping if we are confident that an optional is not nil. We also have a type called implicitly unwrapped optional to delay the assigning of a variable that is not intended to be optional, but we should use it sparingly because there is almost always a better alternative.

Resources for Article:

Further resources on this subject:

Network Development with Swift [article]
Flappy Swift [article]
Playing with Swift [article]

How to build a Remote-controlled TV with Node-Webkit

Roberto González
08 Jul 2015
14 min read
Node-webkit is one of the most promising technologies to come out in the last few years. It lets you ship a native desktop app for Windows, Mac, and Linux just using HTML, CSS, and some JavaScript. These are the exact same languages you use to build any web app. You basically get your very own frameless Webkit to build your app, which is then supercharged with NodeJS, giving you access to some powerful libraries that are not available in a typical browser.

As a demo, we are going to build a remote-controlled YouTube app. This involves creating a native app that displays YouTube videos on your computer, as well as a mobile client that will let you search for and select the videos you want to watch straight from your couch.

You can download the finished project from https://github.com/Aerolab/youtube-tv. You need to follow the first part of this guide (Getting started) to set up the environment and then run run.sh (on Mac) or run.bat (on Windows) to start the app.

Getting started

First of all, you need to install Node.JS (a JavaScript platform), which you can download from http://nodejs.org/download/. The installer comes bundled with NPM (Node.JS Package Manager), which lets you install everything you need for this project.

Since we are going to be building two apps (a desktop app and a mobile app), it's better if we get the boring HTML+CSS part out of the way, so we can concentrate on the JavaScript part of the equation. Download the project files from https://github.com/Aerolab/youtube-tv/blob/master/assets/basics.zip and put them in a new folder. You can name the project's folder youtube-tv or whatever you want. The folder should look like this:

- index.html // This is the starting point for our desktop app
- css // Our desktop app styles
- js // This is where the magic happens
- remote // This is where the magic happens (Part 2)
- libraries // FFMPEG libraries, which give you H.264 video support in Node-Webkit
- player // Our youtube player
- Gruntfile.js // Build scripts
- run.bat // run.bat runs the app on Windows
- run.sh // sh run.sh runs the app on Mac

Now open the Terminal (on Mac or Linux) or a new command prompt (on Windows) right in that folder. We'll install a couple of dependencies we need for this project, so type these commands to install node-gyp and grunt-cli. Each one will take a few seconds to download and install.

On Mac or Linux:

sudo npm install node-gyp -g
sudo npm install grunt-cli -g

On Windows:

npm install node-gyp -g
npm install grunt-cli -g

Leave the Terminal open. We'll be using it again in a bit.

All Node.JS apps start with a package.json file (our manifest), which holds most of the settings for your project, including which dependencies you are using. Go ahead and create your own package.json file (right inside the project folder) with the following contents. Feel free to change anything you like, such as the project name, the icon, or anything else. Check out the documentation at https://github.com/rogerwang/node-webkit/wiki/Manifest-format:

{
  "//": "The // keys in package.json are comments.",

  "//": "Your project's name. Go ahead and change it!",
  "name": "Remote",

  "//": "A simple description of what the app does.",
  "description": "An example of node-webkit",

  "//": "This is the first html the app will load. Just leave this this way",
  "main": "app://host/index.html",

  "//": "The version number. 0.0.1 is a good start :D",
  "version": "0.0.1",

  "//": "This is used by Node-Webkit to set up your app.",
  "window": {
    "//": "The Window Title for the app",
    "title": "Remote",

    "//": "The Icon for the app",
    "icon": "css/images/icon.png",

    "//": "Do you want the File/Edit/Whatever toolbar?",
    "toolbar": false,

    "//": "Do you want a standard window around your app (a title bar and some borders)?",
    "frame": true,

    "//": "Can you resize the window?",
    "resizable": true
  },

  "webkit": {
    "plugin": false,
    "user-agent": "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit/537.36 (KHTML, like Gecko) Safari/537.36"
  },

  "//": "These are the libraries we'll be using:",
  "//": "Express is a web server, which will handle the files for the remote",
  "//": "Socket.io lets you handle events in real time, which we'll use with the remote as well.",
  "dependencies": {
    "express": "^4.9.5",
    "socket.io": "^1.1.0"
  },

  "//": "And these are just task handlers to make things easier",
  "devDependencies": {
    "grunt": "^0.4.5",
    "grunt-contrib-copy": "^0.6.0",
    "grunt-node-webkit-builder": "^0.1.21"
  }
}

You'll also find Gruntfile.js, which takes care of downloading all of the node-webkit assets and building the app once we are ready to ship. Feel free to take a look into it, but it's mostly boilerplate code.

Once you've set everything up, go back to the Terminal and install everything you need by typing:

npm install
grunt nodewebkitbuild

You may run into some issues when doing this on Mac or Linux. In that case, try using sudo npm install and sudo grunt nodewebkitbuild.

npm install installs all of the dependencies you mentioned in package.json, both the regular dependencies and the development ones, like grunt and grunt-node-webkit-builder, which downloads the Windows and Mac versions of node-webkit, setting them up so they can play videos, and building the app. Wait a bit for everything to install properly and we're ready to get started.

Note that if you are using Windows, you might get a scary error related to Visual C++ when running npm install. Just ignore it.

Building the desktop app

All web apps (or websites, for that matter) start with an index.html file. We are going to be creating just that to get our app to run:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8" />
  <title>Youtube TV</title>

  <link href='http://fonts.googleapis.com/css?family=Roboto:500,400' rel='stylesheet' type='text/css' />
  <link href="css/normalize.css" rel="stylesheet" type="text/css" />
  <link href="css/styles.css" rel="stylesheet" type="text/css" />
</head>
<body>

  <div id="serverInfo">
    <h1>Youtube TV</h1>
  </div>

  <div id="videoPlayer">
  </div>

  <script src="js/jquery-1.11.1.min.js"></script>
  <script src="js/youtube.js"></script>
  <script src="js/app.js"></script>
</body>
</html>

As you may have noticed, we are using three scripts for our app: jQuery (pretty well known at this point), a YouTube video player, and finally app.js, which contains our app's logic. Let's dive into that!

First of all, we need to create the basic elements for our remote control. The easiest way of doing this is to create a basic web server and serve a small web app that can search YouTube, select a video, and have some play/pause controls, so we don't have any good reasons to get up from the couch. Open js/app.js and type the following:

// Show the Developer Tools. And yes, Node-Webkit has developer tools built in!
// Uncomment it to open it automatically
//require('nw.gui').Window.get().showDevTools();

// Express is a web server, which will allow us to create a small web app with which to control the player
var express = require('express');
var app = express();
var server = require('http').Server(app);
var io = require('socket.io')(server);

// We'll be opening up our web server on Port 8080 (which doesn't require root privileges)
// You can access this server at http://127.0.0.1:8080
var serverPort = 8080;
server.listen(serverPort);

// All the static files (css, js, html) for the remote will be served using Express.
// These assets are in the /remote folder
app.use('/', express.static('remote'));

With those 7 lines of code (not counting comments) we just got a neat web server working on port 8080. If you were paying attention to the code, you may have noticed that we required something called socket.io. This lets us use websockets with minimal effort, which means we can communicate with, from, and to our remote instantly. You can learn more about socket.io at http://socket.io/. Let's set that up next in app.js:

// Socket.io handles the communication between the remote and our app in real time,
// so we can instantly send commands from a computer to our remote and back
io.on('connection', function (socket) {

  // When a remote connects to the app, let it know immediately the current status of the video (play/pause)
  socket.emit('statusChange', Youtube.status);

  // This is what happens when we receive the watchVideo command (picking a video from the list)
  socket.on('watchVideo', function (video) {
    // video contains a bit of info about our video (id, title, thumbnail)
    // Order our Youtube Player to watch that video
    Youtube.watchVideo(video);
  });

  // These are playback controls. They receive the "play" and "pause" events from the remote
  socket.on('play', function () {
    Youtube.playVideo();
  });
  socket.on('pause', function () {
    Youtube.pauseVideo();
  });

});

// Notify all the remotes when the playback status changes (play/pause)
// This is done with io.emit, which sends the same message to all the remotes
Youtube.onStatusChange = function(status) {
  io.emit('statusChange', status);
};

That's the desktop part done! In a few dozen lines of code we got a web server running at http://127.0.0.1:8080 that can receive commands from a remote to watch a specific video, as well as handling some basic playback controls (play and pause). We are also notifying the remotes of the status of the player as soon as they connect, so they can update their UI with the correct buttons (if it's playing, show the pause button and vice versa).

Now we just need to build the remote.

Building the remote control

The server is just half of the equation. We also need to add the corresponding logic on the remote control, so it's able to communicate with our app.
In remote/index.html, add the following HTML:

<!DOCTYPE html>
<html>
<head>
  <meta charset="utf-8"/>
  <title>TV Remote</title>
  <meta name="viewport" content="width=device-width, initial-scale=1, maximum-scale=1"/>
  <link rel="stylesheet" href="/css/normalize.css"/>
  <link rel="stylesheet" href="/css/styles.css"/>
</head>
<body>
  <div class="controls">
    <div class="search">
      <input id="searchQuery" type="search" value="" placeholder="Search on Youtube..."/>
    </div>
    <div class="playback">
      <button class="play">&gt;</button>
      <button class="pause">||</button>
    </div>
  </div>
  <div id="results" class="video-list">
  </div>
  <div class="__templates" style="display:none;">
    <article class="video">
      <figure><img src="" alt=""/></figure>
      <div class="info">
        <h2></h2>
      </div>
    </article>
  </div>
  <script src="/socket.io/socket.io.js"></script>
  <script src="/js/jquery-1.11.1.min.js"></script>
  <script src="/js/search.js"></script>
  <script src="/js/remote.js"></script>
</body>
</html>

Again, we have a few libraries: Socket.io is served automatically by our desktop app at /socket.io/socket.io.js, and it manages the communication with the server. jQuery is somehow always there, search.js manages the integration with the Youtube API (you can take a look if you want), and remote.js handles the logic for the remote.

The remote itself is pretty simple. It can look for videos on Youtube, and when we click on a video it connects with the app, telling it to play the video with socket.emit. Let's dive into remote/js/remote.js to make this thing work:

// First of all, connect to the server (our desktop app)
var socket = io.connect();

// Search Youtube when the user stops typing. This gives us an automatic search.
var searchTimeout = null;
$('#searchQuery').on('keyup', function (event) {
  clearTimeout(searchTimeout);
  searchTimeout = setTimeout(function () {
    searchYoutube($('#searchQuery').val());
  }, 500);
});

// When we click on a video, watch it on the App
$('#results').on('click', '.video', function (event) {
  // Send an event to notify the server we want to watch this video
  socket.emit('watchVideo', $(this).data());
});

// When the server tells us that the player changed status (play/pause), alter the playback controls
socket.on('statusChange', function (status) {
  if (status === 'play') {
    $('.playback .pause').show();
    $('.playback .play').hide();
  } else if (status === 'pause' || status === 'stop') {
    $('.playback .pause').hide();
    $('.playback .play').show();
  }
});

// Notify the app when we hit the play button
$('.playback .play').on('click', function (event) {
  socket.emit('play');
});

// Notify the app when we hit the pause button
$('.playback .pause').on('click', function (event) {
  socket.emit('pause');
});

This is very similar to our server, except we are using socket.emit a lot more often to send commands back to our desktop app, telling it which videos to play and handling our basic play/pause controls.

The only thing left to do is make the app run. Ready? Go to the terminal again and type:

If you are on a Mac: sh run.sh
If you are on Windows: run.bat

If everything worked properly, you should be seeing the app, and if you open a web browser to http://127.0.0.1:8080 the remote client will open up. Search for a video, pick anything you like, and it'll play in the app. This also works if you point any other device on the same network to your computer's IP, which brings me to the next (and last) point.
Finishing touches There is one small improvement we can make: print out the computer’s IP to make it easier to connect to the app from any other device on the same Wi-Fi network (like a smartphone). On js/app.js add the following code to find out the IP and update our UI so it’s the first thing we see when we open the app: // Find the local IPfunction getLocalIP(callback) { require('dns').lookup( require('os').hostname(), function (err, add, fam) { typeof callback =='function'? callback(add) :null; }); } // To make things easier, find out the machine's ip and communicate itgetLocalIP(function(ip){ $('#serverInfo h1').html('Go to<br/><strong>http://'+ip+':'+serverPort+'</strong><br/>to open the remote'); }); The next time you run the app, the first thing you’ll see is the IP for your computer, so you just need to type that URL in your smartphone to open the remote and control the player from any computer, tablet, or smartphone (as long as they are in the same Wi-Fi network). That's it! You can start expanding on this to improve the app: Why not open the app on a fullscreen by default? Why not get rid of the horrible default frame and create your own? You can actually designate any div as a window handle with CSS (using -webkit-app-region: drag), so you can drag the window by that div and create your own custom title bar. Summary While the app has a lot of interlocking parts, it's a good first project to find out what you can achieve with node-webkit in just a few minutes. I hope you enjoyed this post! About the author Roberto González is the co-founder of Aerolab, “an awesome place where we really push the barriers to create amazing, well-coded designs for the best digital products”. He can be reached at @robertcode.

Understanding Mesos Internals

Packt
08 Jul 2015
26 min read
 In this article by, Dharmesh Kakadia, author of the book Apache Mesos Essentials, explains how Mesos works internally in detail. We will start off with cluster scheduling and fairness concepts, understanding the Mesos architecture, and we will move on towards resource isolation and fault tolerance implementation in Mesos. In this article, we will cover the following topics: The Mesos architecture Resource allocation (For more resources related to this topic, see here.) The Mesos architecture Modern organizations have a lot of different kinds of applications for different business needs. Modern applications are distributed and they are deployed across commodity hardware. Organizations today run different applications in siloed environments, where separate clusters are created for different applications. This static partitioning of cluster leads to low utilization, and all the applications will duplicate the effort of dealing with distributed infrastructures. Not only is this a wasted effort, but it also undermines the fact that distributed systems are hard to build and maintain. This is challenging for both developers and operators. For developers, it is a challenge to build applications that scale elastically and can handle faults that are inevitable in large-scale environment. Operators, on the other hand, have to manage and scale all of these applications individually in siloed environments. The preceding situation is like trying to develop applications without having an operating system and managing all the devices in a computer. Mesos solves the problems mentioned earlier by providing a data center kernel. Mesos provides a higher-level abstraction to develop applications that treat distributed infrastructure just like a large computer. Mesos abstracts the hardware infrastructure away from the applications from the physical infrastructure. Mesos makes developers more productive by providing an SDK to easily write data center scale applications. Now, developers can focus on their application logic and do not have to worry about the infrastructure that runs it. Mesos SDK provides primitives to build large-scale distributed systems, such as resource allocation, deployment, and monitoring isolation. They only need to know and implement what resources are needed, and not how they get the resources. Mesos allows you to treat the data center just as a computer. Mesos makes the infrastructure operations easier by providing elastic infrastructure. Mesos aggregates all the resources in a single shared pool of resources and avoids static partitioning. This makes it easy to manage and increases the utilization. The data center kernel has to provide resource allocation, isolation, and fault tolerance in a scalable, robust, and extensible way. We will discuss how Mesos fulfills these requirements, as well as some other important considerations of modern data center kernel: Scalability: The kernel should be scalable in terms of the number of machines and number of applications. As the number of machines and applications increase, the response time of the scheduler should remain acceptable. Flexibility: The kernel should support a wide range of applications. It should also support diverse frameworks currently running on the cluster and future frameworks as well. The framework should also be able to cope up with the heterogeneity in the hardware, as most clusters are built over time and have a variety of hardware running. 
Maintainability: The kernel would be one of the very important pieces of modern infrastructure. As the requirements evolve, the scheduler should be able to accommodate new requirements. Utilization and dynamism: The kernel should adapt to the changes in resource requirements and available hardware resources and utilize resources in an optimal manner. Fairness: The kernel should be fair in allocating resources to the different users and/or frameworks. We will see what it means to be fair in detail in the next section. The design philosophy behind Mesos was to define a minimal interface to enable efficient resource sharing across frameworks and defer the task scheduling and execution to the frameworks. This allows the frameworks to implement diverse approaches toward scheduling and fault tolerance. It also makes the Mesos core simple, and the frameworks and core can evolve independently. The preceding figure shows the overall architecture (http://mesos.apache.org/documentation/latest/mesos-architecture) of a Mesos cluster. It has the following entities: The Mesos masters The Mesos slaves Frameworks Communication Auxiliary services We will describe each of these entities and their role, followed by how Mesos implements different requirements of the data center kernel. The Mesos slave The Mesos slaves are responsible for executing tasks from frameworks using the resources they have. The slave has to provide proper isolation while running multiple tasks. The isolation mechanism should also make sure that the tasks get resources that they are promised, and not more or less. The resources on slaves that are managed by Mesos can be described using slave resources and slave attributes. Resources are elements of slaves that can be consumed by a task, while we use attributes to tag slaves with some information. Slave resources are managed by the Mesos master and are allocated to different frameworks. Attributes identify something about the node, such as the slave having a specific OS or software version, it's part of a particular network, or it has a particular hardware, and so on. The attributes are simple key-value pairs of strings that are passed along with the offers to frameworks. Since attributes cannot be consumed by a running task, they will always be offered for that slave. Mesos doesn't understand the slave attribute, and interpretation of the attributes is left to the frameworks. More information about resource and attributes in Mesos can be found at https://mesos.apache.org/documentation/attributes-resources. A Mesos resource or attribute can be described as one of the following types: Scalar values are floating numbers Range values are a range of scalar values; they are represented as [minValue-maxValue] Set values are arbitrary strings Text values are arbitrary strings; they are applicable only to attributes Names of the resources can be an arbitrary string consisting of alphabets, numbers, "-", "/", ".", "-". The Mesos master handles the cpus, mem, disk, and ports resources in a special way. A slave without the cpus and mem resources will not be advertised to the frameworks. The mem and disk scalars are interpreted in MB. The ports resource is represented as ranges. The list of resources a slave has to offer to various frameworks can be specified as the resources flag. Resources and attributes are separated by a semicolon. 
For example:

--resources='cpus:30;mem:122880;disk:921600;ports:[21000-29000];bugs:{a,b,c}' --attributes='rack:rack-2;datacenter:europe;os:ubuntu14.4'

This slave offers 30 cpus, 120 GB mem, 900 GB disk, and ports from 21000 to 29000, and has bugs a, b, and c. The slave has three attributes: rack with value rack-2, datacenter with value europe, and os with value ubuntu14.4.

Mesos does not yet provide direct support for GPUs, but does support custom resource types. This means that if we specify gpu(*):8 as part of --resources, then it will be part of the resource offers made to frameworks. Frameworks can use it just like other resources. Once some of the GPU resources are in use by a task, only the remaining resources will be offered. Mesos does not yet have support for GPU isolation, but it can be extended by implementing a custom isolator. Alternately, we can also specify which slaves have GPUs using attributes, such as --attributes="hasGpu:true".

The Mesos master

The Mesos master is primarily responsible for allocating resources to different frameworks and managing the task life cycle for them. The Mesos master implements fine-grained resource sharing using resource offers. The Mesos master acts as a resource broker for frameworks using pluggable policies. Based on these policies, the master decides to offer cluster resources to frameworks in the form of resource offers. A resource offer represents a unit of allocation in the Mesos world. It's a vector of resources available on a node. An offer represents some resources available on a slave being offered to a particular framework.

Frameworks

Distributed applications that run on top of Mesos are called frameworks. Frameworks implement the domain requirements using the general resource allocation API of Mesos. A typical framework wants to run a number of tasks. Tasks are the consumers of resources, and they do not have to be identical to one another. A framework in Mesos consists of two components: a framework scheduler and executors. Framework schedulers are responsible for coordinating the execution. An executor provides the ability to control the task execution. Executors can realize a task execution in many ways. An executor can choose to run multiple tasks in a single executor by spawning multiple threads, or it can run one task in each executor. Apart from the life cycle and task management-related functions, the Mesos framework API also provides functions to communicate with framework schedulers and executors.

Communication

Mesos currently uses an HTTP-like wire protocol to communicate with the Mesos components. Mesos uses the libprocess library, located in 3rdparty/libprocess, to implement the communication. The libprocess library provides asynchronous communication with processes. The communication primitives have actor-like message-passing semantics. The libprocess messages are immutable, which makes parallelizing the libprocess internals easier. Mesos communication happens over the following APIs:

Scheduler API: This is used to communicate with the framework scheduler and master. The internal communication is intended to be used only by the SchedulerDriver API.
Executor API: This is used to communicate with an executor and the Mesos slave.
Internal API: This is used to communicate with the Mesos master and slave.
Operator API: This is the API exposed by Mesos for operators and is used by the web UI, among other things. Unlike most of the Mesos APIs, the operator API is synchronous.

To send a message, the actor does an HTTP POST request.
The path is composed by the name of the actor followed by the name of the message. The User-Agent field is set to "libprocess/…" to distinguish from the normal HTTP requests. The message data is passed as the body of the HTTP request. Mesos uses protocol buffers to serialize all the messages (defined in src/messages/messages.proto). The parsing and interpretation of the message is left to the receiving actor. Here is an example header of a message sent to master to register the framework by scheduler(1) running at 10.0.1.7:53523 address: POST /master/mesos.internal.RegisterFrameworkMessage HTTP/1.1 User-Agent: libprocess/scheduler(1)@10.0.1.7:53523 The reply message header from the master that acknowledges the framework registration might look like this: POST /scheduler(1)/mesos.internal.FrameworkRegisteredMessage HTTP/1.1 User-Agent: libprocess/master@10.0.1.7:5050 At the time of writing, there is a very early discussion about rewiring the Mesos Scheduler API and Executor API as a pure HTTP API (https://issues.apache.org/jira/browse/MESOS-2288). This will make the API standard and integration with Mesos for various tools much easier without the need to be dependent on native libmesos. Also, there is an ongoing effort to convert all the internal messages into a standardized JSON or protocol buffer format (https://issues.apache.org/jira/browse/MESOS-1127). Auxiliary services Apart from the preceding main components, a Mesos cluster also needs some auxiliary services. These services are not part of Mesos itself, and are not strictly required, but they form a basis for operating the Mesos cluster in production environments. These services include, but are not limited to, the following: Shared filesystem: Mesos provides a view of the data center as a single computer and allows developers to develop for the data center scale application. With this unified view of resources, clusters need a shared filesystem to truly make the data center a computer. HDFS, NFS (Network File System), and Cloud-based storage options, such as S3, are popular among various Mesos deployments. Consensus service: Mesos uses a consensus service to be resilient in face of failure. Consensus services, such as ZooKeeper or etcd, provide a reliable leader election in a distributed environment. Service fabric: Mesos enables users to run a number of frameworks on unified computing resources. With a large number of applications and services running, it's important for users to be able to connect to them in a seamless manner. For example, how do users connect to Hive running on Mesos? How does the Ruby on Rails application discover and connect to the MongoDB database instances when one or both of them are running on Mesos? How is the website traffic routed to web servers running on Mesos? Answering these questions mainly requires service discovery and load balancing mechanisms, but also things such as IP/port management and security infrastructure. We are collectively referring to these services that connect frameworks to other frameworks and users as service fabric. Operational services: Operational services are essential in managing operational aspects of Mesos. Mesos deployments and upgrades, monitoring cluster health and alerting when human intervention is required, logging, and security are all part of the operational services that play a very important role in a Mesos cluster. 
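To make the operational side a bit more concrete, here is a small, hedged sketch of how an operator or a monitoring script might poll a master over the Operator API mentioned earlier. The host name is a placeholder, and the endpoint paths are the ones exposed by Mesos releases of this era (state.json, for instance, was later renamed to /state), so check them against your own version:

ubuntu@ops:~ $ curl -s http://master:5050/master/state.json | python -m json.tool | head
ubuntu@ops:~ $ curl -s http://master:5050/metrics/snapshot | python -m json.tool | grep "master/mem"

Feeding endpoints like these into whatever alerting stack you already run is usually the first step toward the cluster health monitoring described above.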
Resource allocation As a data center kernel, Mesos serves a large variety of workloads and no single scheduler will be able to satisfy the needs of all different frameworks. For example, the way in which a real-time processing framework schedules its tasks will be very different from how a long running service will schedule its task, which, in turn, will be very different from how a batch processing framework would like to use its resources. This observation leads to a very important design decision in Mesos: separation of resource allocation and task scheduling. Resource allocation is all about deciding who gets what resources, and it is the responsibility of the Mesos master. Task scheduling, on the other hand, is all about how to use the resources. This is decided by various framework schedulers according to their own needs. Another way to understand this would be that Mesos handles coarse-grain resource allocation across frameworks, and then each framework does fine-grain job scheduling via appropriate job ordering to achieve its needs. The Mesos master gets information on the available resources from the Mesos slaves, and based on resource policies, the Mesos master offers these resources to different frameworks. Different frameworks can choose to accept or reject the offer. If the framework accepts a resource offer, the framework allocates the corresponding resources to the framework, and then the framework is free to use them to launch tasks. The following image shows the high-level flow of Mesos resource allocation: Mesos two level scheduler Here is the typical flow of events for one framework in Mesos: The framework scheduler registers itself with the Mesos master. The Mesos master receives the resource offers from slaves. It invokes the allocation module and decides which frameworks should receive the resource offers. The framework scheduler receives the resource offers from the Mesos master. On receiving the resource offers, the framework scheduler inspects the offer to decide whether it's suitable. If it finds it satisfactory, the framework scheduler accepts the offer and replies to the master with the list of executors that should be run on the slave, utilizing the accepted resource offers. Alternatively, the framework can reject the offer and wait for a better offer. The slave allocates the requested resources and launches the task executors. The executor is launched on slave nodes and runs the framework's tasks. It is up to the framework scheduler to accept or reject the resource offers. Here is an example of events that can happen when allocating resources. The framework scheduler gets notified about the task's completion or failure. The framework scheduler will continue receiving the resource offers and task reports and launch tasks as it sees fit. The framework unregisters with the Mesos master and will not receive any further resource offers. Note that this is optional and a long running service, and meta-framework will not unregister during the normal operation. Because of this design, Mesos is also known as a two-level scheduler. Mesos' two-level scheduler design makes it simpler and more scalable, as the resource allocation process does not need to know how scheduling happens. This makes the Mesos core more stable and scalable. Frameworks and Mesos are not tied to each other and each can iterate independently. Also, this makes porting frameworks easier. 
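To make this two-level flow more tangible, here is a minimal, hedged sketch of a framework scheduler written against the pre-1.0 Mesos Python bindings (the mesos.interface and mesos.native packages). The framework name, master address, command, and resource amounts are illustrative placeholders, and all bookkeeping and error handling are omitted:

from mesos.interface import Scheduler, mesos_pb2
from mesos.native import MesosSchedulerDriver

class ExampleScheduler(Scheduler):
    def registered(self, driver, framework_id, master_info):
        # Step 1: we are registered and will start receiving offers
        print("Registered with framework id %s" % framework_id.value)

    def resourceOffers(self, driver, offers):
        # Steps 3 to 5: inspect each offer and either decline it or turn part of it into a task
        for offer in offers:
            cpus = sum(r.scalar.value for r in offer.resources if r.name == "cpus")
            mem = sum(r.scalar.value for r in offer.resources if r.name == "mem")
            if cpus < 1 or mem < 128:
                driver.declineOffer(offer.id)   # wait for a better offer
                continue
            task = mesos_pb2.TaskInfo()
            task.task_id.value = "task-" + offer.id.value
            task.slave_id.value = offer.slave_id.value
            task.name = "example task"
            task.command.value = "echo 'hello from Mesos'"
            cpu_res = task.resources.add()
            cpu_res.name, cpu_res.type, cpu_res.scalar.value = "cpus", mesos_pb2.Value.SCALAR, 1.0
            mem_res = task.resources.add()
            mem_res.name, mem_res.type, mem_res.scalar.value = "mem", mesos_pb2.Value.SCALAR, 128.0
            driver.launchTasks(offer.id, [task])

    def statusUpdate(self, driver, update):
        # Step 7: task completion or failure notifications arrive here
        print("Task %s is in state %d" % (update.task_id.value, update.state))

framework = mesos_pb2.FrameworkInfo()
framework.user = ""                    # let Mesos fill in the current user
framework.name = "example-framework"   # illustrative name
driver = MesosSchedulerDriver(ExampleScheduler(), framework, "zk://localhost:2181/mesos")
driver.run()

The important part is the division of labour: Mesos alone decides which offers this scheduler sees, while the scheduler alone decides whether to decline them or turn them into tasks.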
The choice of a two-level scheduler means that the scheduler does not have a global knowledge about resource utilization and the resource allocation decisions can be nonoptimal. One potential concern could be about the preferences that the frameworks have about the kind of resources needed for execution. Data locality, special hardware, and security constraints can be a few of the constraints on which tasks can run. In the Mesos realm, these preferences are not explicitly specified by a framework to the Mesos master, instead the framework rejects all the offers that do not meet its constraints. The Mesos scheduler Mesos was the first cluster scheduler to allow the sharing of resources to multiple frameworks. Mesos resource allocation is based on online Dominant Resource Fairness (DRF) called HierarchicalDRF. In a world of single resource static partitioning, fairness is easy to define. DRF extends this concept of fairness to multi-resource settings without the need for static partitioning. Resource utilization and fairness are equally important, and often conflicting, goals for a cluster scheduler. The fairness of resource allocation is important in a shared environment, such as data centers, to ensure that all the users/processes of the cluster get nearly an equal amount of resources. Min-max fairness provides a well-known mechanism to share a single resource among multiple users. Min-max fairness algorithm maximizes the minimum resources allocated to a user. In its simplest form, it allocates 1/Nth of the resource to each of the users. The weighted min-max fairness algorithm can also support priorities and reservations. Min-max resource fairness has been a basis for many well-known schedulers in operating systems and distributed frameworks, such as Hadoop's fair scheduler (http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/FairScheduler.html), capacity scheduler (https://hadoop.apache.org/docs/r2.4.1/hadoop-yarn/hadoop-yarn-site/CapacityScheduler.html), Quincy scheduler (http://dl.acm.org/citation.cfm?id=1629601), and so on. However, it falls short when the cluster has multiple types of resources, such as the CPU, memory, disk, and network. When jobs in a distributed environment use different combinations of these resources to achieve the outcome, the fairness has to be redefined. For example, the two requests <1 CPU, 3 GB> and <3 CPU, 1 GB> come to the scheduler. How do they compare and what is the fair allocation? DRF generalizes the min-max algorithm for multiple resources. A user's dominant resource is the resource for which the user has a biggest share. For example, if the total resources are <8 CPU, 5 GB>, then for the user allocation of <2 CPU, 1 GB>, the user's dominant share is maximumOf(2/8,1/5) means CPU. A user's dominant share is the fraction of the dominant resource that's allocated to the user. In our example, it would be 25 percent (2/8). DRF applies the min-max algorithm to the dominant share of each user. It has many provable properties: Strategy proofness: A user cannot gain any advantage by lying about the demands. Sharing incentive: DRF has a minimum allocation guarantee for each user, and no user will prefer exclusive partitioned cluster of size 1/N over DRF allocation. Single resource fairness: In case of only one resource, DRF is equivalent to the min-max algorithm. Envy free: Every user prefers his allocation over any other allocation of other users. This also means that the users with the same requests get equivalent allocations. 
Bottleneck fairness: When one resource becomes a bottleneck, and each user has a dominant demand for it, DRF is equivalent to min-max. Monotonicity: Adding resources or removing users can only increase the allocation of the remaining users. Pareto efficiency: The allocation achieved by DRF will be pareto efficient, so it would be impossible to make a better allocation for any user without making allocation for some other user worse. We will not further discuss DRF but will encourage you to refer to the DRF paper for more details at http://static.usenix.org/event/nsdi11/tech/full_papers/Ghodsi.pdf. Mesos uses role specified in FrameworkInfo for resource allocation decision. A role can be per user or per framework or can be shared by multiple users and frameworks. If it's not set, Mesos will set it to the current user that runs the framework scheduler. An optimization is to use deny resource offers from particular slaves for a specified time period. Mesos can revoke tasks allocation killing those tasks. Before killing a task, Mesos gives the framework a grace period to clean up. Mesos asks the executor to kill the task, but if it does not oblige the request, it will kill the executor and all of its tasks. Weighted DRF DRF calculates each role's dominant share and allocates the available resources to the user with the smallest dominant share. In practice, an organization rarely wants to assign resources in a complete fair manner. Most organizations want to allocate resources in a weighted manner, such as 50 percent resources to ads team, 30 percent to QA, and 20 percent to R&D teams. To satisfy this command functionality, Mesos implements weighted DRF, where masters can be configured with weights for different roles. When weights are specified, a client's DRF share will be divided by the weight. For example, a role that has a weight of two will be offered twice as many resources as a role with weight of one. Mesos can be configured to use weighted DRF using the --weights and --roles flags on the master startup. The --weights flag expects a list of role/weight pairs in the form of role1=weight1 and role2=weight2. Weights do not need to be integers. We must provide weights for each role that appear in --roles on the master startup. Reservation One of the other most asked questions for requirement is the ability to reserve resources. For example, persistent or stateful services, such as memcache, or a database running on Mesos, would need a reservation mechanism to avoid being negatively affected on restart. Without reservation, memcache is not guaranteed to get a resource offer from the slave, which has all the data and would incur significant time in initialization and downtime for the service. Reservation can also be used to limit the resource per role. Reservation provides guaranteed resources for roles, but improper usage might lead to resource fragmentation and lower utilization of resources. Note that all the reservation requests go through a Mesos authorization mechanism to ensure that the operator or framework requesting the operation has the proper privileges. Reservation privileges are specified to the Mesos master through ACL along with the rest of the ACL configuration. Mesos supports the following two kinds of reservation: Static reservation Dynamic reservation Static reservation In static reservation, resources are reserved for a particular role. The restart of the slave after removing the checkpointed state is required to change static reservation. 
Static reservation is thus typically managed by operators using the --resources flag on the slave. The flag expects a list of name(role):value for different resources. If a resource is assigned to role A, then only frameworks with role A are eligible to get an offer for that resource. Any resources that do not include a role or resources that are not included in the --resources flag will be included in the default role (default *). For example, --resources="cpus:4;mem:2048;cpus(ads):8;mem(ads):4096" specifies that the slave has 8 CPUs and 4096 MB memory reserved for "ads" role and has 4 CPUs and 2048 MB memory unreserved. Nonuniform static reservation across slaves can quickly become difficult to manage. Dynamic reservation Dynamic reservation allows operators and frameworks to manage reservation more dynamically. Frameworks can use dynamic reservations to reserve offered resources, allowing those resources to only be reoffered to the same framework. At the time of writing, dynamic reservation is still being actively developed and is targeted toward the next release of Mesos (https://issues.apache.org/jira/browse/MESOS-2018). When asked for a reservation, Mesos will try to convert the unreserved resources to reserved resources. On the other hand, during the unreserve operation, the previously reserved resources are returned to the unreserved pool of resources. To support dynamic reservation, Mesos allows a sequence of Offer::Operations to be performed as a response to accepting resource offers. A framework manages reservation by sending Offer::Operations::Reserve and Offer::Operations::Unreserve as part of these operations, when receiving resource offers. For example, consider the framework that receives the following resource offer with 32 CPUs and 65536 MB memory: {   "id" : <offer_id>,   "framework_id" : <framework_id>,   "slave_id" : <slave_id>,   "hostname" : <hostname>,   "resources" : [     {       "name" : "cpus",       "type" : "SCALAR",       "scalar" : { "value" : 32 },       "role" : "*",     },     {       "name" : "mem",       "type" : "SCALAR",       "scalar" : { "value" : 65536 },       "role" : "*",     }   ] } The framework can decide to reserve 8 CPUs and 4096 MB memory by sending the Operation::Reserve message with resources field with the desired resources state: [   {     "type" : Offer::Operation::RESERVE,     "resources" : [       {         "name" : "cpus",         "type" : "SCALAR",         "scalar" : { "value" : 8 },         "role" : <framework_role>,         "reservation" : {           "framework_id" : <framework_id>,           "principal" : <framework_principal>         }       }       {         "name" : "mem",         "type" : "SCALAR",         "scalar" : { "value" : 4096 },         "role" : <framework_role>,         "reservation" : {           "framework_id" : <framework_id>,           "principal" : <framework_principal>         }       }     ]   } ] After a successful execution, the framework will receive resource offers with reservation. 
The next offer from the slave might look as follows: {   "id" : <offer_id>,   "framework_id" : <framework_id>,   "slave_id" : <slave_id>,   "hostname" : <hostname>,   "resources" : [     {       "name" : "cpus",       "type" : "SCALAR",       "scalar" : { "value" : 8 },       "role" : <framework_role>,       "reservation" : {         "framework_id" : <framework_id>,         "principal" : <framework_principal>       }     },     {       "name" : "mem",       "type" : "SCALAR",       "scalar" : { "value" : 4096 },       "role" : <framework_role>,       "reservation" : {         "framework_id" : <framework_id>,         "principal" : <framework_principal>       }     },     {       "name" : "cpus",       "type" : "SCALAR",       "scalar" : { "value" : 24 },       "role" : "*",     },     {       "name" : "mem",       "type" : "SCALAR",       "scalar" : { "value" : 61440 },       "role" : "*",     }   ] } As shown, the framework has 8 CPUs and 4096 MB memory reserved resources and 24 CPUs and 61440 MB memory underserved in the resource offer. The unreserve operation is similar. The framework on receiving the resource offer can send the unreserve operation message, and subsequent offers will not have reserved resources. The operators can use/reserve and/unreserve HTTP endpoints of the operator API to manage the reservation. The operator API allows operators to change the reservation specified when the slave starts. For example, the following command will reserve 4 CPUs and 4096 MB memory on slave1 for role1 with the operator authentication principal ops: ubuntu@master:~ $ curl -d slaveId=slave1 -d resources="{          {            "name" : "cpus",            "type" : "SCALAR",            "scalar" : { "value" : 4 },            "role" : "role1",            "reservation" : {              "principal" : "ops"            }          },          {            "name" : "mem",            "type" : "SCALAR",            "scalar" : { "value" : 4096 },            "role" : "role1",            "reservation" : {              "principal" : "ops"            }          },        }"        -X POST http://master:5050/master/reserve Before we end this discussion on resource allocation, it would be important to note that the Mesos community continues to innovate on the resource allocation front by incorporating interesting ideas, such as oversubscription (https://issues.apache.org/jira/browse/MESOS-354), from academic literature and other systems. Summary In this article, we looked at the Mesos architecture in detail and learned how Mesos deals with resource allocation, resource isolation, and fault tolerance. We also saw the various ways in which we can extend Mesos. Resources for Article: Further resources on this subject: Recommender systems dissected Tuning Solr JVM and Container [article] Transformation [article] Getting Started [article]

Working with large data sources

Packt
08 Jul 2015
20 min read
In this article, by Duncan M. McGreggor, author of the book Mastering matplotlib, we come across the use of NumPy in the world of matplotlib and big data, problems with large data sources, and the possible solutions to these problems. (For more resources related to this topic, see here.) Most of the data that users feed into matplotlib when generating plots is from NumPy. NumPy is one of the fastest ways of processing numerical and array-based data in Python (if not the fastest), so this makes sense. However by default, NumPy works on in-memory database. If the dataset that you want to plot is larger than the total RAM available on your system, performance is going to plummet. In the following section, we're going to take a look at an example that illustrates this limitation. But first, let's get our notebook set up, as follows: In [1]: import matplotlib        matplotlib.use('nbagg')        %matplotlib inline Here are the modules that we are going to use: In [2]: import glob, io, math, os         import psutil        import numpy as np        import pandas as pd        import tables as tb        from scipy import interpolate        from scipy.stats import burr, norm        import matplotlib as mpl        import matplotlib.pyplot as plt        from IPython.display import Image We'll use the custom style sheet that we created earlier, as follows: In [3]: plt.style.use("../styles/superheroine-2.mplstyle") An example problem To keep things manageable for an in-memory example, we're going to limit our generated dataset to 100 million points by using one of SciPy's many statistical distributions, as follows: In [4]: (c, d) = (10.8, 4.2)        (mean, var, skew, kurt) = burr.stats(c, d, moments='mvsk') The Burr distribution, also known as the Singh–Maddala distribution, is commonly used to model household income. Next, we'll use the burr object's method to generate a random population with our desired count, as follows: In [5]: r = burr.rvs(c, d, size=100000000) Creating 100 million data points in the last call took about 10 seconds on a moderately recent workstation, with the RAM usage peaking at about 2.25 GB (before the garbage collection kicked in). Let's make sure that it's the size we expect, as follows: In [6]: len(r) Out[6]: 100000000 If we save this to a file, it weighs in at about three-fourths of a gigabyte: In [7]: r.tofile("../data/points.bin") In [8]: ls -alh ../data/points.bin        -rw-r--r-- 1 oubiwann staff 763M Mar 20 11:35 points.bin This actually does fit in the memory on a machine with a RAM of 8 GB, but generating much larger files tends to be problematic. We can reuse it multiple times though, to reach a size that is larger than what can fit in the system RAM. Before we do this, let's take a look at what we've got by generating a smooth curve for the probability distribution, as follows: In [9]: x = np.linspace(burr.ppf(0.0001, c, d),                          burr.ppf(0.9999, c, d), 100)          y = burr.pdf(x, c, d) In [10]: (figure, axes) = plt.subplots(figsize=(20, 10))          axes.plot(x, y, linewidth=5, alpha=0.7)          axes.hist(r, bins=100, normed=True)          plt.show() The following plot is the result of the preceding code: Our plot of the Burr probability distribution function, along with the 100-bin histogram with a sample size of 100 million points, took about 7 seconds to render. This is due to the fact that NumPy handles most of the work, and we only displayed a limited number of visual elements. 
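Before pushing matplotlib to draw every point, it is worth noting that selectivity can also be achieved by plotting a modest random sample of the population instead of the whole thing. The following is an illustrative aside rather than part of the original notebook; the sample size is arbitrary, and it reuses the x and y curve computed above:

indexes = np.random.randint(0, len(r), 100000)   # 100,000 random positions
sample = r[indexes]                               # sampled with replacement
(figure, axes) = plt.subplots(figsize=(20, 10))
axes.plot(x, y, linewidth=5, alpha=0.7)
axes.hist(sample, bins=100, normed=True)
plt.show()

A histogram of the sample looks very much like the full-population histogram at a fraction of the rendering cost. Subsampling only sidesteps the memory question, though, which is what the rest of this article tackles.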
What would happen if we did try to plot all the 100 million points? This can be checked by the following code: In [11]: (figure, axes) = plt.subplots()          axes.plot(r)          plt.show() formatters.py:239: FormatterWarning: Exception in image/png formatter: Allocated too many blocks After about 30 seconds of crunching, the preceding error was thrown—the Agg backend (a shared library) simply couldn't handle the number of artists required to render all the points. But for now, this case clarifies the point that we stated a while back—our first plot rendered relatively quickly because we were selective about the data we chose to present, given the large number of points with which we are working. However, let's say we have data from the files that are too large to fit into the memory. What do we do about this? Possible ways to address this include the following: Moving the data out of the memory and into the filesystem Moving the data off the filesystem and into the databases We will explore examples of these in the following section. Big data on the filesystem The first of the two proposed solutions for large datasets involves not burdening the system memory with data, but rather leaving it on the filesystem. There are several ways to accomplish this, but the following two methods in particular are the most common in the world of NumPy and matplotlib: NumPy's memmap function: This function creates memory-mapped files that are useful if you wish to access small segments of large files on the disk without having to read the whole file into the memory. PyTables: This is a package that is used to manage hierarchical datasets. It is built on the top of the HDF5 and NumPy libraries and is designed to efficiently and easily cope with extremely large amounts of data. We will examine each in turn. NumPy's memmap function Let's restart the IPython kernel by going to the IPython menu at the top of notebook page, selecting Kernel, and then clicking on Restart. When the dialog box pops up, click on Restart. Then, re-execute the first few lines of the notebook by importing the required libraries and getting our style sheet set up. Once the kernel is restarted, take a look at the RAM utilization on your system for a fresh Python process for the notebook: In [4]: Image("memory-before.png") Out[4]: The following screenshot shows the RAM utilization for a fresh Python process: Now, let's load the array data that we previously saved to disk and recheck the memory utilization, as follows: In [5]: data = np.fromfile("../data/points.bin")        data_shape = data.shape        data_len = len(data)        data_len Out[5]: 100000000 In [6]: Image("memory-after.png") Out[6]: The following screenshot shows the memory utilization after loading the array data: This took about five seconds to load, with the memory consumption equivalent to the file size of the data. This means that if we wanted to build some sample data that was too large to fit in the memory, we'd need about 11 of those files concatenated, as follows: In [7]: 8 * 1024 Out[7]: 8192 In [8]: filesize = 763        8192 / filesize Out[8]: 10.73656618610747 However, this is only if the entire memory was available. Let's see how much memory is available right now, as follows: In [9]: del data In [10]: psutil.virtual_memory().available / 1024**2 Out[10]: 2449.1796875 That's 2.5 GB. So, to overrun our RAM, we'll just need a fraction of the total. 
This is done in the following way: In [11]: 2449 / filesize Out[11]: 3.2096985583224114 The preceding output means that we only need four of our original files to create a file that won't fit in memory. However, in the following section, we will still use 11 files to ensure that data, if loaded into the memory, will be much larger than the memory. How do we create this large file for demonstration purposes (knowing that in a real-life situation, the data would already be created and potentially quite large)? We can try to use numpy.tile to create a file of the desired size (larger than memory), but this can make our system unusable for a significant period of time. Instead, let's use numpy.memmap, which will treat a file on the disk as an array, thus letting us work with data that is too large to fit into the memory. Let's load the data file again, but this time as a memory-mapped array, as follows: In [12]: data = np.memmap(            "../data/points.bin", mode="r", shape=data_shape) The loading of the array to a memmap object was very quick (compared to the process of bringing the contents of the file into the memory), taking less than a second to complete. Now, let's create a new file to write the data to. This file must be larger in size as compared to our total system memory (if held on in-memory database, it will be smaller on the disk): In [13]: big_data_shape = (data_len * 11,)          big_data = np.memmap(              "../data/many-points.bin", dtype="uint8",              mode="w+", shape=big_data_shape) The preceding code creates a 1 GB file, which is mapped to an array that has the shape we requested and just contains zeros: In [14]: ls -alh ../data/many-points.bin          -rw-r--r-- 1 oubiwann staff 1.0G Apr 2 11:35 many-points.bin In [15]: big_data.shape Out[15]: (1100000000,) In [16]: big_data Out[16]: memmap([0, 0, 0, ..., 0, 0, 0], dtype=uint8) Now, let's fill the empty data structure with copies of the data we saved to the 763 MB file, as follows: In [17]: for x in range(11):              start = x * data_len              end = (x * data_len) + data_len              big_data[start:end] = data          big_data Out[17]: memmap([ 90, 71, 15, ..., 33, 244, 63], dtype=uint8) If you check your system memory before and after, you will only see minimal changes, which confirms that we are not creating an 8 GB data structure on in-memory. Furthermore, checking your system only takes a few seconds. Now, we can do some sanity checks on the resulting data and ensure that we have what we were trying to get, as follows: In [18]: big_data_len = len(big_data)          big_data_len Out[18]: 1100000000 In [19]: data[100000000 – 1] Out[19]: 63 In [20]: big_data[100000000 – 1] Out[20]: 63 Attempting to get the next index from our original dataset will throw an error (as shown in the following code), since it didn't have that index: In [21]: data[100000000] ----------------------------------------------------------- IndexError               Traceback (most recent call last) ... IndexError: index 100000000 is out of bounds ... But our new data does have an index, as shown in the following code: In [22]: big_data[100000000 Out[22]: 90 And then some: In [23]: big_data[1100000000 – 1] Out[23]: 63 We can also plot data from a memmaped array without having a significant lag time. 
However, note that in the following code, we will create a histogram from 1.1 million points of data, so the plotting won't be instantaneous: In [24]: (figure, axes) = plt.subplots(figsize=(20, 10))          axes.hist(big_data, bins=100)          plt.show() The following plot is the result of the preceding code: The plotting took about 40 seconds to generate. The odd shape of the histogram is due to the fact that, with our data file-hacking, we have radically changed the nature of our data since we've increased the sample size linearly without regard for the distribution. The purpose of this demonstration wasn't to preserve a sample distribution, but rather to show how one can work with large datasets. What we have seen is not too shabby. Thanks to NumPy, matplotlib can work with data that is too large for memory, even if it is a bit slow iterating over hundreds of millions of data points from the disk. Can matplotlib do better? HDF5 and PyTables A commonly used file format in the scientific computing community is Hierarchical Data Format (HDF). HDF is a set of file formats (namely HDF4 and HDF5) that were originally developed at the National Center for Supercomputing Applications (NCSA), a unit of the University of Illinois at Urbana-Champaign, to store and organize large amounts of numerical data. The NCSA is a great source of technical innovation in the computing industry—a Telnet client, the first graphical web browser, a web server that evolved into the Apache HTTP server, and HDF, which is of particular interest to us, were all developed here. It is a little known fact that NCSA's web browser code was the ancestor to both the Netscape web browser as well as a prototype of Internet Explorer that was provided to Microsoft by a third party. HDF is supported by Python, R, Julia, Java, Octave, IDL, and MATLAB, to name a few. HDF5 offers significant improvements and useful simplifications over HDF4. It uses B-trees to index table objects and, as such, works well for write-once/read-many time series data. Common use cases span fields such as meteorological studies, biosciences, finance, and aviation. The HDF5 files of multiterabyte sizes are common in these applications. Its typically constructed from the analyses of multiple HDF5 source files, thus providing a single (and often extensive) source of grouped data for a particular application. The PyTables library is built on the top of the Python HDF5 library and NumPy. As such, it not only provides access to one of the most widely used large data file formats in the scientific computing community, but also links data extracted from these files with the data types and objects provided by the fast Python numerical processing library. PyTables is also used in other projects. Pandas wraps PyTables, thus extending its convenient in-memory data structures, functions, and objects to large on-disk files. To use HDF data with Pandas, you'll want to create pandas.HDFStore, read from the HDF data sources with pandas.read_hdf, or write to one with pandas.to_hdf. Files that are too large to fit in the memory may be read and written by utilizing chunking techniques. Pandas does support the disk-based DataFrame operations, but these are not very efficient due to the required assembly on columns of data upon reading back into the memory. One project to keep an eye on under the PyData umbrella of projects is Blaze. 
It's an open wrapper and a utility framework that can be used when you wish to work with large datasets and generalize actions such as the creation, access, updates, and migration. Blaze supports not only HDF, but also SQL, CSV, and JSON. The API usage between Pandas and Blaze is very similar, and it offers a nice tool for developers who need to support multiple backends. In the following example, we will use PyTables directly to create an HDF5 file that is too large to fit in the memory (for an 8GB RAM machine). We will follow the following steps: Create a series of CSV source data files that take up approximately 14 GB of disk space Create an empty HDF5 file Create a table in the HDF5 file and provide the schema metadata and compression options Load the CSV source data into the HDF5 table Query the new data source once the data has been migrated Remember the temperature precipitation data for St. Francis, in Kansas, USA, from a previous notebook? We are going to generate random data with similar columns for the purposes of the HDF5 example. This data will be generated from a normal distribution, which will be used in the guise of the temperature and precipitation data for hundreds of thousands of fictitious towns across the globe for the last century, as follows: In [25]: head = "country,town,year,month,precip,tempn"          row = "{},{},{},{},{},{}n"          filename = "../data/{}.csv"          town_count = 1000          (start_year, end_year) = (1894, 2014)          (start_month, end_month) = (1, 13)          sample_size = (1 + 2                        * town_count * (end_year – start_year)                        * (end_month - start_month))          countries = range(200)          towns = range(town_count)          years = range(start_year, end_year)          months = range(start_month, end_month)          for country in countries:             with open(filename.format(country), "w") as csvfile:                  csvfile.write(head)                  csvdata = ""                  weather_data = norm.rvs(size=sample_size)                  weather_index = 0                  for town in towns:                    for year in years:                          for month in months:                              csvdata += row.format(                                  country, town, year, month,                                  weather_data[weather_index],                                  weather_data[weather_index + 1])                              weather_index += 2                  csvfile.write(csvdata) Note that we generated a sample data population that was twice as large as the expected size in order to pull both the simulated temperature and precipitation data at the same time (from the same set). This will take about 30 minutes to run. When complete, you will see the following files: In [26]: ls -rtm ../data/*.csv          ../data/0.csv, ../data/1.csv, ../data/2.csv,          ../data/3.csv, ../data/4.csv, ../data/5.csv,          ...          ../data/194.csv, ../data/195.csv, ../data/196.csv,          ../data/197.csv, ../data/198.csv, ../data/199.csv A quick look at just one of the files reveals the size of each, as follows: In [27]: ls -lh ../data/0.csv          -rw-r--r-- 1 oubiwann staff 72M Mar 21 19:02 ../data/0.csv With each file that is 72 MB in size, we have data that takes up 14 GB of disk space, which exceeds the size of the RAM of the system in question. Furthermore, running queries against so much data in the .csv files isn't going to be very efficient. 
It's going to take a long time. So what are our options? Well, to read this data, HDF5 is a very good fit. In fact, it is designed for jobs like this. We will use PyTables to convert the .csv files to a single HDF5. We'll start by creating an empty table file, as follows: In [28]: tb_name = "../data/weather.h5t"          h5 = tb.open_file(tb_name, "w")          h5 Out[28]: File(filename=../data/weather.h5t, title='', mode='w',              root_uep='/', filters=Filters(                  complevel=0, shuffle=False, fletcher32=False,                  least_significant_digit=None))          / (RootGroup) '' Next, we'll provide some assistance to PyTables by indicating the data types of each column in our table, as follows: In [29]: data_types = np.dtype(              [("country", "<i8"),              ("town", "<i8"),              ("year", "<i8"),              ("month", "<i8"),               ("precip", "<f8"),              ("temp", "<f8")]) Also, let's define a compression filter that can be used by PyTables when saving our data, as follows: In [30]: filters = tb.Filters(complevel=5, complib='blosc') Now, we can create a table inside our new HDF5 file, as follows: In [31]: tab = h5.create_table(              "/", "weather",              description=data_types,              filters=filters) With that done, let's load each CSV file, read it in chunks so that we don't overload the memory, and then append it to our new HDF5 table, as follows: In [32]: for filename in glob.glob("../data/*.csv"):          it = pd.read_csv(filename, iterator=True, chunksize=10000)          for chunk in it:              tab.append(chunk.to_records(index=False))            tab.flush() Depending on your machine, the entire process of loading the CSV file, reading it in chunks, and appending to a new HDF5 table can take anywhere from 5 to 10 minutes. However, what started out as a collection of the .csv files that weigh in at 14 GB is now a single compressed 4.8 GB HDF5 file, as shown in the following code: In [33]: h5.get_filesize() Out[33]: 4758762819 Here's the metadata for the PyTables-wrapped HDF5 table after the data insertion: In [34]: tab Out[34]: /weather (Table(288000000,), shuffle, blosc(5)) '' description := { "country": Int64Col(shape=(), dflt=0, pos=0), "town": Int64Col(shape=(), dflt=0, pos=1), "year": Int64Col(shape=(), dflt=0, pos=2), "month": Int64Col(shape=(), dflt=0, pos=3), "precip": Float64Col(shape=(), dflt=0.0, pos=4), "temp": Float64Col(shape=(), dflt=0.0, pos=5)} byteorder := 'little' chunkshape := (1365,) Now that we've created our file, let's use it. 
Let's excerpt a few lines with an array slice, as follows: In [35]: tab[100000:100010] Out[35]: array([(0, 69, 1947, 5, -0.2328834718674, 0.06810312195695),          (0, 69, 1947, 6, 0.4724989007889, 1.9529216219569),          (0, 69, 1947, 7, -1.0757216683235, 1.0415374480545),          (0, 69, 1947, 8, -1.3700249968748, 3.0971874991576),          (0, 69, 1947, 9, 0.27279758311253, 0.8263207523831),          (0, 69, 1947, 10, -0.0475253104621, 1.4530808932953),          (0, 69, 1947, 11, -0.7555493935762, -1.2665440609117),          (0, 69, 1947, 12, 1.540049376928, 1.2338186532516),          (0, 69, 1948, 1, 0.829743501445, -0.1562732708511),          (0, 69, 1948, 2, 0.06924900463163, 1.187193711598)],          dtype=[('country', '<i8'), ('town', '<i8'),                ('year', '<i8'), ('month', '<i8'),                ('precip', '<f8'), ('temp', '<f8')]) In [36]: tab[100000:100010]["precip"] Out[36]: array([-0.23288347, 0.4724989 , -1.07572167,                -1.370025 , 0.27279758, -0.04752531,                -0.75554939, 1.54004938, 0.8297435 ,                0.069249 ]) When we're done with the file, we do the same thing that we would do with any other file-like object: In [37]: h5.close() If we want to work with it again, simply load it, as follows: In [38]: h5 = tb.open_file(tb_name, "r")          tab = h5.root.weather Let's try plotting the data from our HDF5 file: In [39]: (figure, axes) = plt.subplots(figsize=(20, 10))          axes.hist(tab[:1000000]["temp"], bins=100)          plt.show() Here's a plot for the first million data points: This histogram was rendered quickly, with a much better response time than what we've seen before. Hence, the process of accessing the HDF5 data is very fast. The next question might be "What about executing calculations against this data?" Unfortunately, running the following will consume an enormous amount of RAM: tab[:]["temp"].mean() We've just asked for all of the data—all of its 288 million rows. We are going to end up loading everything into the RAM, grinding the average workstation to a halt. Ideally though, when you iterate through the source data and create the HDF5 file, you also crunch the numbers that you will need, adding supplemental columns or groups to the HDF5 file that can be used later by you and your peers. If we have data that we will mostly be selecting (extracting portions) and which has already been crunched and grouped as needed, HDF5 is a very good fit. This is why one of the most common use cases that you see for HDF5 is the sharing and distribution of the processed data. However, if we have data that we need to process repeatedly, then we will either need to use another method besides the one that will cause all the data to be loaded into the memory, or find a better match for our data processing needs. We saw in the previous section that the selection of data is very fast in HDF5. What about calculating the mean for a small section of data? If we've got a total of 288 million rows, let's select a divisor of the number that gives us several hundred thousand rows at a time—2,81,250 rows, to be more precise. Let's get the mean for the first slice, as follows: In [40]: tab[0:281250]["temp"].mean() Out[40]: 0.0030696632864265312 This took about 1 second to calculate. What about iterating through the records in a similar fashion? Let's break up the 288 million records into chunks of the same size; this will result in 1024 chunks. 
We saw in the previous section that the selection of data is very fast in HDF5. What about calculating the mean for a small section of data? If we've got a total of 288 million rows, let's pick a divisor that gives us several hundred thousand rows at a time: 281,250 rows, to be precise. Let's get the mean for the first slice, as follows:

In [40]: tab[0:281250]["temp"].mean()
Out[40]: 0.0030696632864265312

This took about 1 second to calculate. What about iterating through the records in a similar fashion? Let's break up the 288 million records into chunks of the same size; this will result in 1024 chunks. We'll start by getting the ranges needed for an increment of 281,250 and then examine the first and the last range as a sanity check, as follows:

In [41]: limit = 281250
         ranges = [(x * limit, x * limit + limit)
             for x in range(2 ** 10)]
         (ranges[0], ranges[-1])
Out[41]: ((0, 281250), (287718750, 288000000))

Now, we can use these ranges to generate the mean for each chunk of 281,250 rows of temperature data and print the total number of means that we generated to make sure that we're getting our counts right, as follows:

In [42]: means = [tab[x * limit:x * limit + limit]["temp"].mean()
             for x in range(2 ** 10)]
         len(means)
Out[42]: 1024

Depending on your machine, that should take between 30 and 60 seconds. With this work done, it's now easy to calculate the mean for all 288 million points of temperature data:

In [43]: sum(means) / len(means)
Out[43]: -5.3051780413782918e-05

Thanks to HDF5's efficient file format and implementation, combined with splitting our operations into tasks that never copy the full dataset into memory, we were able to perform calculations across nearly a third of a billion records in less than a minute. HDF5 even supports parallelization, so this could be improved upon further with a little more time and effort.

However, there are many cases where HDF5 is not a practical choice. You may have free-form data that is too expensive to preprocess, or datasets that are simply too large to fit on a single machine. These are the cases where you may consider using matplotlib with distributed data.

Summary

In this article, we covered the role of NumPy and matplotlib in the world of big data, along with the process of, and the problems involved in, working with large data sources. We also discussed possible solutions to these problems using NumPy's memmap function as well as HDF5 and PyTables.

Resources for Article:

Further resources on this subject:
First Steps [article]
Introducing Interactive Plotting [article]
The plot function [article]


File Sharing

Packt
08 Jul 2015
14 min read
In this article by Dan Ristic, author of the book Learning WebRTC, we will cover the following topics:

Getting a file with the File API
Setting up our page
Getting a reference to a file

The real power of a data channel comes when it is combined with other powerful technologies in the browser. By pairing the ability to send data peer-to-peer with the File API, we open up entirely new possibilities in the browser, such as adding file sharing functionality that is available to any user with an Internet connection.

The application that we will build is a simple one that shares files between two peers. Our application will be real-time, meaning that the two users have to be on the page at the same time to share a file. There is a finite number of steps that both users will go through to transfer an entire file between them:

1. User A opens the page and types a unique ID.
2. User B opens the same page and types the same unique ID.
3. The two users can then connect to each other using RTCPeerConnection.
4. Once the connection is established, one user selects a file to share.
5. The other user is notified of the file being shared; it is transferred to their computer over the connection, and they download it.

The main thing we will focus on throughout the article is how to work with the data channel in new and exciting ways. We will take the file data from the browser, break it down into pieces, and send it to the other user using only the RTCPeerConnection API. The interactivity that the API promotes will stand out in this article and can be reused in a simple project.

Getting a file with the File API

One of the first things that we will cover is how to use the File API to get a file from the user's computer. There is a good chance you have interacted with the File API on a web page without even realizing it! The API usually appears as a Browse or Choose File button attached to an input field on an HTML page.

Although the API has been around for quite a while, the part you are probably familiar with is the original specification, dating back as far as 1995. This was the Form-based File Upload in HTML specification, which focused on allowing a user to upload a file to a server using an HTML form. Before the days of the file input, application developers had to rely on third-party tools to request files from the user. The specification was proposed to provide a standard way to upload files for a server to receive, save, and interact with. The original standard, however, focused entirely on interacting with a file via an HTML form and did not detail any way to interact with a file via JavaScript. This was the origin of the File API.

Fast-forward to the groundbreaking days of HTML5 and we now have a fully fledged File API. The goal of the new specification was to open the doors to file manipulation for web applications, allowing them to interact with files much as a natively installed application would. This means providing not only a way for the user to upload a file, but also ways to read the file in different formats, manipulate its data, and ultimately do something with that data. Although the API has many great features, we are going to focus on just one small aspect of it: the ability to get binary file data from the user by asking them to upload a file.
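As a quick illustration of that one aspect, the following sketch shows the general pattern for pulling raw bytes out of a file input with the File API. The element ID here is a placeholder and is not part of the application built in this article:

// Hypothetical file input for illustration; any <input type="file"> element will do.
var exampleInput = document.querySelector('#example-file-input');

exampleInput.addEventListener('change', function () {
  var file = exampleInput.files[0];    // a File object: metadata plus a handle to the data
  var reader = new FileReader();

  reader.onload = function (event) {
    // event.target.result is an ArrayBuffer holding the raw bytes of the file
    console.log('Read ' + event.target.result.byteLength + ' bytes from ' + file.name);
  };

  reader.readAsArrayBuffer(file);      // read the binary contents asynchronously
});

Reading the file into an ArrayBuffer like this is one common way to prepare its contents for being broken into pieces and sent over a data channel.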
A typical application that works with files, such as Notepad on Windows, handles file data in much the same way: it asks the user to open a file, reads the binary data from that file, and displays the characters on the screen. The File API gives the browser access to the same binary data that any other application would use. This is the great thing about working with the File API: it works in most browsers from an HTML page, similar to the ones we have been building for our WebRTC demos. To start building our application, we will put together another simple web page. It will look similar to the previous ones and should be hosted with a static file server, as done in the earlier examples. By the end of the article, you will be a professional single-page application builder! Now let's take a look at the following HTML code that demonstrates file sharing:

<!DOCTYPE html>
<html lang="en">
<head>
   <meta charset="utf-8" />

   <title>Learning WebRTC - Article: File Sharing</title>

   <style>
     body {
       background-color: #404040;
       margin-top: 15px;
       font-family: sans-serif;
       color: white;
     }

     .thumb {
       height: 75px;
       border: 1px solid #000;
       margin: 10px 5px 0 0;
     }

     .page {
       position: relative;
       display: block;
       margin: 0 auto;
       width: 500px;
       height: 500px;
     }

     #byte_content {
       margin: 5px 0;
       max-height: 100px;
       overflow-y: auto;
       overflow-x: hidden;
     }

     #byte_range {
       margin-top: 5px;
     }
   </style>
</head>
<body>
   <div id="login-page" class="page">
     <h2>Login As</h2>
     <input type="text" id="username" />
     <button id="login">Login</button>
   </div>

   <div id="share-page" class="page">
     <h2>File Sharing</h2>

     <input type="text" id="their-username" />
     <button id="connect">Connect</button>
     <div id="ready">Ready!</div>

     <br />
     <br />

     <input type="file" id="files" name="file" /> Read bytes:
     <button id="send">Send</button>
   </div>

   <script src="client.js"></script>
</body>
</html>

The page should be fairly recognizable at this point. We will use the same CSS-based showing and hiding of pages as we did earlier. One of the main differences is the file input, which we will use to have the user upload a file to the page. I even picked a different background color this time to spice things up.

Setting up our page

Create a new folder for our file sharing application and add the HTML code shown in the preceding section. You will also need all the steps from our JavaScript file to log in two users, create a WebRTC peer connection, and create a data channel between them.
Copy the following code into your JavaScript file to get the page set up: var name, connectedUser;   var connection = new WebSocket('ws://localhost:8888');   connection.onopen = function () { console.log("Connected"); };   // Handle all messages through this callback connection.onmessage = function (message) { console.log("Got message", message.data);   var data = JSON.parse(message.data);   switch(data.type) {    case "login":      onLogin(data.success);      break;    case "offer":      onOffer(data.offer, data.name);      break;    case "answer":      onAnswer(data.answer);      break;    case "candidate":      onCandidate(data.candidate);      break;    case "leave":      onLeave();      break;    default:      break; } };   connection.onerror = function (err) { console.log("Got error", err); };   // Alias for sending messages in JSON format function send(message) { if (connectedUser) {    message.name = connectedUser; }   connection.send(JSON.stringify(message)); };   var loginPage = document.querySelector('#login-page'), usernameInput = document.querySelector('#username'), loginButton = document.querySelector('#login'), theirUsernameInput = document.querySelector('#their- username'), connectButton = document.querySelector('#connect'), sharePage = document.querySelector('#share-page'), sendButton = document.querySelector('#send'), readyText = document.querySelector('#ready'), statusText = document.querySelector('#status');   sharePage.style.display = "none"; readyText.style.display = "none";   // Login when the user clicks the button loginButton.addEventListener("click", function (event) { name = usernameInput.value;   if (name.length > 0) {    send({      type: "login",      name: name    }); } });   function onLogin(success) { if (success === false) {    alert("Login unsuccessful, please try a different name."); } else {    loginPage.style.display = "none";    sharePage.style.display = "block";      // Get the plumbing ready for a call    startConnection(); } };   var yourConnection, connectedUser, dataChannel, currentFile, currentFileSize, currentFileMeta;   function startConnection() { if (hasRTCPeerConnection()) {    setupPeerConnection(); } else {    alert("Sorry, your browser does not support WebRTC."); } }   function setupPeerConnection() { var configuration = {    "iceServers": [{ "url": "stun:stun.1.google.com:19302 " }] }; yourConnection = new RTCPeerConnection(configuration, {optional: []});   // Setup ice handling yourConnection.onicecandidate = function (event) {    if (event.candidate) {      send({        type: "candidate",       candidate: event.candidate      });    } };   openDataChannel(); }   function openDataChannel() { var dataChannelOptions = {    ordered: true,    reliable: true,    negotiated: true,    id: "myChannel" }; dataChannel = yourConnection.createDataChannel("myLabel", dataChannelOptions);   dataChannel.onerror = function (error) {    console.log("Data Channel Error:", error); };   dataChannel.onmessage = function (event) {    // File receive code will go here };   dataChannel.onopen = function () {    readyText.style.display = "inline-block"; };   dataChannel.onclose = function () {    readyText.style.display = "none"; }; }   function hasUserMedia() { navigator.getUserMedia = navigator.getUserMedia || navigator.webkitGetUserMedia || navigator.mozGetUserMedia || navigator.msGetUserMedia; return !!navigator.getUserMedia; }   function hasRTCPeerConnection() { window.RTCPeerConnection = window.RTCPeerConnection || window.webkitRTCPeerConnection || 
window.mozRTCPeerConnection; window.RTCSessionDescription = window.RTCSessionDescription || window.webkitRTCSessionDescription || window.mozRTCSessionDescription; window.RTCIceCandidate = window.RTCIceCandidate || window.webkitRTCIceCandidate || window.mozRTCIceCandidate; return !!window.RTCPeerConnection; }   function hasFileApi() { return window.File && window.FileReader && window.FileList && window.Blob; }   connectButton.addEventListener("click", function () { var theirUsername = theirUsernameInput.value;   if (theirUsername.length > 0) {    startPeerConnection(theirUsername); } });   function startPeerConnection(user) { connectedUser = user;   // Begin the offer yourConnection.createOffer(function (offer) {    send({      type: "offer",      offer: offer    });    yourConnection.setLocalDescription(offer); }, function (error) {    alert("An error has occurred."); }); };   function onOffer(offer, name) { connectedUser = name; yourConnection.setRemoteDescription(new RTCSessionDescription(offer));   yourConnection.createAnswer(function (answer) {    yourConnection.setLocalDescription(answer);      send({      type: "answer",      answer: answer    }); }, function (error) {    alert("An error has occurred"); }); };   function onAnswer(answer) { yourConnection.setRemoteDescription(new RTCSessionDescription(answer)); };   function onCandidate(candidate) { yourConnection.addIceCandidate(new RTCIceCandidate(candidate)); };   function onLeave() { connectedUser = null; yourConnection.close(); yourConnection.onicecandidate = null; setupPeerConnection(); }; We set up references to our elements on the screen as well as get the peer connection ready to be processed. When the user decides to log in, we send a login message to the server. The server will return with a success message telling the user they are logged in. From here, we allow the user to connect to another WebRTC user who is given their username. This sends offer and response, connecting the two users together through the peer connection. Once the peer connection is created, we connect the users through a data channel so that we can send arbitrary data across. Hopefully, this is pretty straightforward and you are able to get this code up and running in no time. It should all be familiar to you by now. This is the last time we are going to refer to this code, so get comfortable with it before moving on! Getting a reference to a file Now that we have a simple page up and running, we can start working on the file sharing part of the application. The first thing the user needs to do is select a file from their computer's filesystem. This is easily taken care of already by the input element on the page. The browser will allow the user to select a file from their computer and then save a reference to that file in the browser for later use. When the user presses the Send button, we want to get a reference to the file that the user has selected. To do this, you need to add an event listener, as shown in the following code: sendButton.addEventListener("click", function (event) { var files = document.querySelector('#files').files;   if (files.length > 0) {    dataChannelSend({      type: "start",      data: files[0]    });      sendFile(files[0]); } }); You might be surprised at how simple the code is to get this far! This is the amazing thing about working within a browser. Much of the hard work has already been done for you. Here, we will get a reference to our input element and the files that it has selected. 
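As a small aside, the files property we read here is a FileList, so it behaves like an array of File objects. If we ever wanted to allow more than one file, a sketch might look like the following; note that the multiple attribute on the input is an assumption for this sketch and is not part of the page we built above:

// Hypothetical variation: the input would need the multiple attribute, for
// example <input type="file" id="files" multiple />, which our page does not use.
var selectedFiles = document.querySelector('#files').files;

for (var i = 0; i < selectedFiles.length; i++) {
  // Each entry is a File object with metadata such as name and size in bytes.
  console.log('Selected', selectedFiles[i].name, selectedFiles[i].size);
}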
The input element supports both multiple and single selection of files, but in this example we will only work with one file at a time. We then make sure that we have a file to work with, tell the other user that we want to start sending data, and call our sendFile function, which we will implement later in this article. Now, you might think that the object we get back contains the entire contents of our file. What we actually get back from the input element is an object representing metadata about the file itself. Let's take a look at this metadata:

{
  lastModified: 1364868324000,
  lastModifiedDate: "2013-04-02T02:05:24.000Z",
  name: "example.gif",
  size: 1745559,
  type: "image/gif"
}

This gives us the information we need to tell the other user that we want to start sending a file named example.gif. It also gives a few other important details, such as the type of file we are sending and when it was last modified. The next step is to read the file's data and send it through the data channel. This is no easy task, however, and we will require some special logic to do so.

Summary

In this article, we covered the basics of using the File API and retrieving a file from a user's computer. We also discussed the page setup for the application using JavaScript and how to get a reference to a file.

Resources for Article:

Further resources on this subject:
WebRTC with SIP and IMS [article]
Using the WebRTC Data API [article]
Applications of WebRTC [article]