How-To Tutorials

09 Jul 2015

29 min read

Process of designing XenDesktop® deployments

09 Jul 2015

In this article by, Govardhan Gunnala and Daniele Tosatto, authors of the book Mastering Citrix® XenDesktop®, we will discuss the process of designing XenDesktop® deployments. The uniqueness of the XenDesktop architecture is its modular five layer model. It covers all the key decisions in designing the XenDesktop deployment. User layer: Defines the users and their requirements Access layer: Defines how the users will access the resources Desktop/resource layer: Defines what resources will be delivered Control layer: Defines managing and maintaining the solution Hardware layer: Defines what resources it needs for implementing the chosen solution (For more resources related to this topic, see here.) While FMA is simple at a high level, its implementation can become complex depending on the technologies/options that are chosen for each component across the layers of FMA. Along with great flexibility, comes the responsibility of diligently choosing the technologies/options for fulfilling your business requirements. Importantly, the decisions made in the first three layers impact the last two layers of the deployment. It means that fixing a wrong decision anywhere in the first three layers during/after implementation stage would have less or no scope, and may even lead to implement the solution from the scratch again. Your design decisions speak for your solution's effectiveness in helping with the given business requirements. The layered architecture of the XenDesktop FMA, featuring the components at each layer is given in the following diagram. Each component of XenDesktop will fall under one of the layers shown in the succeeding diagram. We'll see what decisions are to be made for each of these components at each layer in the next sub section. Decisions to be made at each layer I will have to write a separate book for discussing all the possible technologies/options that are available at each layer. Following is a highly summarized list of the decisions to be made at each layer. This will help you in realizing the breadth of designing XenDesktop. This high level coverage of the various options will help you in locating and considering all the possible options that are available for making the right decisions and avoiding any slippages and missing any considerations. User layer The user layer refers to the specification of the users who will utilize the XenDesktop deployment. A business requirement statement may mention that the service users can either be the internal business users or the external customers accessing the service from Internet. Furthermore, both of these users may also need mobile access to the XenDesktop services. The Citrix receiver is the only component that belongs to the user layer, and XenDesktop is dependent on it for successfully delivering a XenDesktop session. By correlating this technical aspect with the preceding business requirement statement, one needs to consider all the possible aspects of receiver software on the client devices. This involves making the following decisions: Endpoint/user devices related: What are the devices that the users are supposed to access the services from? Who owns and administrates those devices throughout their lifecycle? Endpoints supported: Corporate computers, laptops, or mobiles running on Windows or thin clients. User smart devices, such as Android tablets, Apple iPads, and so on. In case of service providers, the endpoints can usually be any device and they need to be supported. Endpoint ownership: Device management includes security, availability, and compliance. Maintaining the responsibility of the devices on network. Endpoint lifecycle: Devices become either outdated or limited very quickly. Define minimum device hardware requirements to run your business workloads. Endpoint form factor: Choose the devices that may either be fully featured or have limited thin clients, or be a mix of both to support features, such as HDX graphics, multi-monitors, and so on. Thin client selection: Choose if the thin clients, such as Dell Wyse Zero clients, running on the limited functionality operating systems would suffice your user requirements. Understand its licensing cost. Receiver selection: Once you determine your endpoint device and its capabilities, you need to decide on the receiver selection that can be run on the devices. The greatest thing is that receiver is available for almost any device. Receiver type: Choose the receiver that is required for your device. Since the Receiver software for each platform (OS) differs, it is important to use the appropriate Receiver software for your devices while considering the platform that it runs on. You can download the appropriate Receiver software for your device from http://www.Citrix.com/go/receiver.html page. Initial deployment: Receiver is like any other software that will fit into your overall application portfolio. Determine how you would deploy this application on your devices. For corporate desktops and mobiles, you may use the enterprise application deployment and the mobile device management software. Otherwise, the users will be prompted to install it when they access the StoreFront URL, or they can even download it from Citrix for facilitating the installation process. For user-managed mobile devices, you can get it from the respective Windows or Google Apple stores/marketplaces. Initial configuration: Similar to other applications, Receiver requires certain initial configuration. It can be configured either manually or by using a provisioning file, group policy, and e-mail based discovery. Keeping the Receiver software up-to-date: Once you have installed Receiver on user devices, you will also require a mechanism for deploying the updates to Receiver. This can also be the way of initial deployments. Access layer An access layer refers to the specification of how the users gain access to the resources. A business requirement statement may usually state that the users should be validated for gaining access, and the access should be secured when the user is connected over the Internet. The technical components that fall under this layer include firewall(s), NetScaler, and StoreFront. These components play a broader role in the overall networking infrastructure of the company, which also includes the XenDesktop, as well as complete Citrix solutions in the environment. Their key activities include firewalling, external to internal IP address NATing, NetScaler Gateway to secure the connection between the virtual desktop and the user device, global load balancing, user validation/authentication, and GUI presentation of the enumerated resources to the end users. It involves making the following decisions: Authentication: Authentication point: A user can be authenticated at the NetScaler Gateway or StoreFront. Authentication policy: Various business use cases and compliance makes certain modes of authentication mandatory. You can choose from the different authentication methods supported at: StoreFront: Basic authentication by using a username and a password; Domain Pass-through, the NS Gateway pass-through, smart card, and even unauthenticated access. NetScaler Gateway: LDAP, RADIUS (token), client certificates. StoreFront: The decisions that are to be made around the scope of StoreFront are as follows: Unauthenticated access: Provides access to the users who don't require a username and a password, but they are still able to access the administrator allowed resources. Usually, this fits well with public or Kiosk systems. High availability: Making the StoreFront servers available at all times. Hardware load balancing, DNS Round Robin, Windows network load balancing, and so on. Delivery controller high availability and StoreFront: Building high availability for the delivery controller is recommended since they are needed for forming successful connections. Defining more than one delivery controller for the stores makes StoreFront auto failover to the next server in the list. Security - Inbound traffic: Consider securing the user connection to virtual desktops from the internal StoreFront and the external NetScaler Gateway. Security – Backend traffic: Consider securing the communication between the StoreFront and the XML services running on the controller servers. As these will be within the internal network, they can be secured by using the internal private certificate. Routing Receiver with Beacons: Receiver supports websites called Beacons to identify whether the user connection is internal or external. StoreFront provides Receiver with the http(s) addresses of the Beacon points during the initial connection. Resource Presentation: StoreFront presents a webpage, which provides self-service of the resources by the user. Scalability: The StoreFront server load and capacity for the user workload. Multi-site App synchronization: StoreFront can connect to the controllers at multiple site deployments. StoreFront can replicate the user subscribed applications across the servers. NetScaler Gateway: In the NetScaler Gateway, the decision regarding the secured external user access from public Internet involves the following: Topology: NetScaler supports two topologies: 1-Arm (normal security) and 2-Arm (high security). High availability: The NetScaler Gateways can be configured in pairs to provide high availability. Platform: NetScaler is available in different platforms, such as VPX, MDX, and SDX. They have different SSL throughput and SSL Transaction Per Second (TPS) metrics. Pre-authentication policy: Specifies about the Endpoint Analysis (EPA) scans for evaluating whether the endpoints meet the pre-set security criteria. This is available when NetScaler is chosen as the authentication point. Session management: The session policies define the overall user experience by classifying the endpoints into mobile and non-mobile devices. Session profile defines the details needed for gaining access to the environment. These are in two forms: SSLVPN and HDX proxy. Preferred data center: In multi-active data center deployments, StoreFront can determine the user resources primary data center and NetScaler can direct the user connections to that. Static and dynamic methods are used for specifying the preferred data center. Desktop/resource layer The desktop or resource layer refers to the specification of which resources (applications and desktops) users will receive. This layer comes with various options, which are tailored for business user roles and their requirements. This layer makes XenDesktop a better fit for achieving the varying user needs across each of their departments. It includes specification of the FlexCast model (type of desktop), user personalization, and delivering the application to the users in the desktop session. An example business requirement statement may specify that all the permanent employees would require a desktop with all the basic applications pre-installed based on their team and role, with their user settings and data to be retained. For all the contract employees, provide a basic desktop with controlled access to the applications on-demand and do not retain their user data. It includes various components, such as profile management solutions (including Windows profiles, the Citrix profile management, AppSense), Citrix print server, Windows operating systems, application delivery, and so on. It involves making decisions, such as: Images: It involves choosing the FlexCast model that is tailored to the user requirements, thereby delivering the expected desktop behavior to the end users, as follows: Operating system related: It requires choosing the desktop or the server operating systems for your master image, which depends on the FlexCast model that you are choosing from. Hosted Shared Hosted VDI: Pooled-static, pooled-random, pooled with PvD, dedicated, existing, physical/remote PC, streamed and streamed with PvD Streamed VHD Local VM On-demand apps Local apps In case of the desktop OS, it's also important to choose the right OS architecture according to the 32-bit or 64-bit processor architecture of the desktop. Computer policies: Define the controls over the user connection, security and bandwidth settings, devices or connection types, and so on. Specify all the policy features similar to that of the user policies. Machine catalogs: Define your catalog settings, including the FlexCast model, AD computer accounts, provisioning method, OS of the base image, and so on. Delivery groups: Assign desktops or applications to the user groups. Application folders: This is a tidy interface feature in Studio for organizing the applications into folders for easy management. StoreFront integration: This is an option for specifying the StoreFront URL for the Receiver in the master image so that the users will be auto connected to the storefront in the session. Resource allocation: This defines the hardware resources for the desktop VMs. It primarily involves hosts and storage. Depending on your estimated workloads, you can define the resources, such as number of virtual processors (vCPU), amount of virtual memory (vRAM), storage requirements for the needed disk space, and also the following resources Graphics (GPU): For the advanced use cases, you may choose to allocate the pass-through GPU, hardware vGPU, or the software vGPUs. IOPS: Depending on the operating system, the FlexCast model, and estimated workloads, you can analyze the overall IOPS load from the system and plan the corresponding hardware to support that load. Optimizations: Depending on the operating system, you can apply various optimizations to Windows that run on the master image. This greatly reduces the overall load later. Bandwidth requirements: Bandwidth can be a limiting factor in case of WAN and remote user connections of slow networks. Bandwidth consumption and user experience depend on various factors, such as the operating system being used, the application design, and screen resolution. To retain high user experience, it's important to consider the bandwidth requirements and optimization technologies, as follows: Bandwidth minimizing technologies: These include Quality of Service (QoS), HDX RealTime, and WAN Optimization, with Citrix's own CloudBridge solution. HDX Encoding Method: HDX encoding method also affects the bandwidth usage. For XenDesktop 7.x, there are three encoding methods that are available. These will appropriately be employed by the HDX protocol. These are Desktop Composition Redirection, H.264 Enhanced SuperCodec, and Legacy Mode (XenDesktop 5.X Adaptive Display). Session Bandwidth: Bandwidth needed in a session depends on the user interaction with desktop and applications. Latency: HDX can typically perform well up to 300 ms latency and the experience begins to degrade as latency increases. Personalization: This is an essential element of the desktop environment. It involves the decisions that are critical for the end user experience/acceptance and for the overall success of the solution during implementation. Following are the decisions that are involved in personalization. User profiles: This involves the decisions that are related to the user login, roaming of their settings, and seamless profile experience across overall Windows network: Profile type: Choose which profile type works for your user requirements. Possible options include local, roaming, mandatory, and hybrid profile with Citrix Profile Management. Citrix Profile Management provides various additional features, such as profile streaming, active write back, and configuring profiles using an .ini file, and so on. Folder redirection: This option saves the user's application settings in the profile. Represents special folders, such as AppData, Desktop, and so on. Folder exclusion: This option is for setting the exclusion of folders that are to be saved in the user profile. Usually, it refers to the local and IE Temp folders of a user profile. Profile caching: Caching profiles on a local system improves the user login experience and it occurs by default. You need to consider this depending on the type of virtual desktop FlexCast mode. Profile permissions: Specify whether the administrator needs access to the user profiles based on information sensitivity. Profile path: The decision to place the user profiles on a network location for high availability. It affects the logon performance depending on how close the profile is to the virtual desktop from which the user is logging on. It can be managed either from Active Directory or through Citrix Profile Management. User profile replication between data centers: This involves making the user profiles highly available and supporting the profile roaming among multiple data centers. User policies: Involves the decision regarding deploying the user settings and controlling those using management policies providing consistent settings for users, such as: Preferred policy engine: This requires choosing the policy processing for the Windows systems. The Citrix policies can be defined and managed from either Citrix Studio or the Active Directory group policy. Policy filtering: The Citrix policies can be applied to the users and their desktop with the various filter options that are available in the Citrix policy engine. If the group policies have been used, then you'll use the group policy filtering options. Policy precedence: The Citrix policies are processed in the order of LCSDOU (Local, Citrix, Site, Domain, OU policies). Baseline policy: This defines the policy with default and common settings for all the desktop images. Citrix provides the policy templates that suit specific business use cases. A baseline should cover security requirements, common network conditions, and managing the user device or the user profile requirements. Such a baseline can be configured using the security policies, connection-based policies, device-based policies, and profile-based policies. Printing: This is one of the most common desktop user requirements. XenDesktop supports printing, which can work for various scenarios. The printing technology involves deploying and using appropriate drivers. Provisioning printers: These can either be a static or dynamic set of printers. The options for dynamic printers do and do not auto-create all the client printers and auto-create the non-network client printers only. You can also set the options for session printers through the Citrix policy, which can include either static or dynamic printers. Furthermore, you can also set proximity printers. Managing print drivers: This option can be configured so that printer drivers are auto-installed during the session creation. It can be installed by using either the generic Citrix universal printer driver, or the manual option. You can also have all the known drivers preinstalled on the master image. Citrix even provides the Citrix universal print server, which extends XenDesktop universal printing support to network printing. Print job routing: It can be routed among client device or through the network server. The ICA protocol is used for compressing and sending data. Personal vDisk: Desktops with personal vDisks retain the user changes. Choosing the personal vDisk depends on the user requirements and the FlexCast Model that was opted for. Personal vDisk can be set to thin provisioned for estimated growth, but it can't be shrunk later. Applications: The application separation into another layer improves the scalability of the overall desktop solution. Applications are critical elements, which the users require from a desktop environment: Application delivery method: Applications can be installed on the base image, on the Personal vDisks, streamed into the session, or through the on-demand XenApp hosted mode. It also depends on application compatibility, and it requires technical expertise and tools, such as AppDNA, for effectively resolving them. Application streaming: XenDesktop supports App-V to build isolated application packages, which can be streamed to desktops. 16-bit legacy application delivery: If there are any legacy 16-bit applications to be supported, then you can choose from the 32 bit OS, VM hosted App, or a parallel XenApp5 deployment. Control layer Control layer speaks about all the backend systems that are required for managing and maintaining the overall solution through its life cycle. The control layer includes most of the XenDesktop components that are further classified into categories, such as resource/access controllers, image/desktop controllers, and infrastructure controllers. These respectively correspond to the first three layers of FMA, as shown here: Resource/access controllers: Supports the access layer Image/desktop controllers: Supports the desktop/resource layer Infrastructure controllers: Provides the underlying hardware for the overall FMA components/environment This layer involves the specification of capacity, configuration, and the topology of the environment. Building required/planned redundancy for each of these components enables achieving the enterprise business capabilities, such as HA, scalability, disaster recovery, load balancing, and so on. Components and technologies that operate under this layer include Active Directory, group policies, site database, Citrix licensing, XenDesktop delivery controllers, XenClient hypervisor, the Windows server and the Desktop operating systems, provisioning services, which can be either MCS or PVS and their controllers, and so on. An example business requirement statement may be as follows: Build a highly available desktop environment for a fast growing business users group. We currently have a head count of 30 users, which is expected to double in a year. It involves making the following decisions: Infrastructure controllers: It includes common infrastructure, which is required for XenDesktop to function in the Windows domain network. Active Directory: This is used for the authentication and authorization of users in a Citrix environment. It's also responsible for providing and synchronizing time on the systems, which is critical for Kerberos. For the most part, your AD structure will be in-place, and it may require certain changes for accommodating your XenDesktop requirements, such as: Forest design: It involves choosing the AD forest and domain decisions, such as multi-domain, multi-forest, domain and forest trusts, and so on, which will define the users of the XenDesktop resources. Site design: It involves choosing the number of sites that represent your geographical locations, the number of domain controllers, the subnets that accommodate the IP addresses, site links for replication, and so on. Organizational unit structure: Planning the OU structure for easier management of XenDesktop Workers and VDAs. In the case of multi-forest deployment scenarios (as supported in App Orchestration), having the same OU structure is critical. Naming standards: Planning proper conventions for XenDesktop AD objects, which includes users, security groups, XenDesktop servers, OUs, and so on. User groups: This helps in choosing the individual user names or groups. The user security groups are recommended as they reduce validation to just one object despite the number of users in it. Policy control: This helps in planning GPOs ordering and sizing, inheritance, filtering, enforcement, blocking, and loopback processing for reducing the overall processing time on the VDAs and servers. Database: Citrix uses the Microsoft SQL server database for most of its products, as follows: Edition: Microsoft ships the SQL server database in different editions, which provide varying features and capabilities. Using the standard edition for typical XenDesktop production deployments is recommended. For larger/enterprise deployments, depending on the requirement, a higher edition may be required. Database and Transaction Log Sizing: This involves estimating the storage requirements for the Site Configuration database, Monitoring database, and configuration logging databases. Database Location: By default, the Configuration Logging and the Monitoring databases are located within the Site Configuration database. Separating these into separate databases and relocating the Monitoring database to a different SQL server is recommended. High availability: Choose from VM-level HA, Mirroring, AlwaysOn Failover Cluster, and AlwaysOn Availability Groups. Database Creation: Usually, the database is automatically recreated during the XenDesktop installation. Alternatively, they can be created by using the scripts. Citrix licensing: Citrix licensing for XenDesktop requires the existence of a Citrix license server on the network. You can install and manage the multiple Citrix licenses. License type: Choose from user, device, and concurrent licenses. Version: Citrix's new license servers are backward compatible. Sizing: A license server can be scaled out to support a higher number of license requests per second. High availability: License server comes with a 30 day grace period to usually help in recovering from failures. High Availability for license server can be implemented through Window clustering technology or duplication of the virtual server. Optimization: Optimize the number of the receiving and processing threads depending on your hardware. This is generally required in large and heavily-loaded enterprise environments. Resource controllers: The resource controllers include the XenDesktop, the XenApp controllers, and the XenClient synchronizer, as shown here: XenDesktop and XenApp delivery controller. Number of sites: It is considered to have been based on network, risk tolerance, security requirements. Delivery controller sizing: Delivery controller scalability is based on CPU utilization. The more processor cores are available, the more virtual desktops a controller can support. High availability: Always plan for the N+1 deployment of the controllers for achieving the HA. Then, update the controllers' details on VDA through policy. Host connection configuration: Host connections define the hosts, storage repositories, and guest network to be used by the virtual machines on hypervisors. XML service encryption: The XML service protocol running on delivery controllers uses clear text for exchanging all data except passwords. Consider using an SSL encryption for sending the StoreFront data over a secure HTTP connection. Server OS load management: The default maximum number of sessions per server has been set to 250. Using real time usage monitoring and loading analysis, you can define appropriate load management policies. Session PreLaunch and Session Linger: Designed for helping the users in quickly accessing the applications by starting the sessions before they are requested (session prelaunch) and by keeping the user sessions active after a user closes all the applications in a session (session linger). XenClient synchronizer: It includes considerations for its architecture, processor specification, memory specification, network specification, high availability, the SQL database, remote synchronizer servers, storage repository size and location, and external access, and Active Directory integration. Image controllers: This includes all the image provisioning controllers. MCS is built-into the delivery controller. We'll have PVS considerations, such as the following: Farms: A farm represents the top level of the PVS infrastructure. Depending on your networking and administration boundaries, you can define the number of farms to be deployed in your environment. Sites: Each Farm consists of one or more sites, which contain all the PVS objects. While multiple sites share the same database, the target devices can only failover to the other Provisioning Servers that are within the same site. Your networking and organization structure determines the number of sites in your deployment. High availability: If implemented, PVS will be a critical component of the virtual desktop infrastructure. HA should be considered for its database, PVS servers, vDisks and storage, networking and TFTP, and so on. Bootstrap delivery: There are three methods in which the target device can receive the bootstrap program. This can be done by using the DHCP options, the PXE broadcasts, and the boot device manager. Write cache placement: Write cache uniquely identifies the target device by including the target device's MAC address and disk identifier. Write cache can be placed on the following: Cache on the Device Hard Drive, Cache on the Device Hard Drive Persisted, Cache in the Device RAM, Cache in the Device RAM with overflow on the hard disk, and Cache on the Provisioning Server Disk, and Cache on the Provisioning Server Disk Persisted. vDisk format and replication: PVS supports the use of fixed-size or dynamic vDisks. vDisks hosted on a SAN, local, or Direct Attached Storage must be replicated between the vDisk stores whenever a vDisk is created or changed. It can be replicated either manually or automatically. Virtual or physical servers, processor and memory: The virtual Provisioning Servers are preferred when sufficient processor, memory, disk and networking resources are guaranteed. Scale up or scale out: Determining whether to scale up or scale out the servers requires considering factors like redundancy, failover times, datacenter capacity, hardware costs, hosting costs, and so on. Bandwidth requirements and network configuration: PVS can boot 500 devices simultaneously. A 10Gbps network is recommended for provisioning services. Network configuration should consider the PVS Uplink, the Hypervisor Uplink, and the VM Uplink. Recommended switch settings include either Disable Spanning Tree or Enable Portfast, Storm Control, and Broadcast Helper. Network interfaces: Teaming the multiple network interfaces with link aggregation can provide a greater throughput. Consider the NIC features TCP Offloading and Receive Side Scaling (RSS) while selecting NICs. Subnet affinity: It is a load balancing algorithm, which helps in ensuring that the target devices are connected to the most appropriate Provisioning Server. It can be configured to Best Effort and Fixed. Auditing: By default, the auditing feature is disabled. When enabled, the audit trail information is written in the provisioning services database along with the general configuration data. Antivirus: The antivirus software can cause file-locking issues on the PVS server by contending with the files being accessed by PVS Services. The vDisk store and the write cache should be excluded from any antivirus scans in order to prevent file contention issues. Hardware layer The hardware layer involves choosing the right capacity, make, and hardware features of the backend systems that are required for the overall solution as defined in the control layer. In-line with the control layer, the hardware layer decisions will change if any of the first three layer decisions are changed. Components and technologies that operate under this layer include server hardware, storage technologies, hard disks and the RAID configurations, hypervisors and their management software, backup solutions, monitoring, network devices and connectivity, and so on. It involves making the decisions shown here: Hardware Sizing: The hardware sizing is usually done in two ways. The first, and the preferred, way is to plan ahead and purchase the hardware based on the workload requirements. The second way to size the hosts to use the existing hardware in the best configuration to support the different workload requirements, as follows: Workload separation: Workloads can either be separated into dedicated resource clusters or be mixed in the same physical hosts. Control host sizing: The VM resource allocation for each control component should be determined in the control layer and it should be allocated accordingly. Desktop host sizing: This involves choosing the physical resources required for the virtual desktops as well as the hosted server deployments. It includes estimating the pCPU, pRAM, GPU, and the number of hosts. Hypervisors: This involves choosing from the supported hypervisors that include major players, such as Hyper-V, XenServer, and ESX. Choosing from these requires considering a vast range of parameters, such as host hardware - processor and memory, storage requirements, network requirements, scale up/out, and host scalability. Further considerations to be made also include the following: Networking: Networks, physical NIC, NIC teaming, virtual NICs—hosts, virtual NICs—guests, and IP addressing VM provisioning: Templates High availability: Microsoft Hyper-V: Failover clustering, cluster shared volumes, CSV cache VMware ESXi: VMware vSphere high availability cluster Citrix XenServer: XenServer high availability by using the server pool Monitoring: Use the hypervisor specific vendor provided management and monitoring tools for hypervisor monitoring; use hardware specific vendor provided monitoring tools for hardware level monitoring. Backup and recovery: Backup method and components to be backed up. Storage: Storage architecture, RAID level, numbers of disks, disk type, storage bandwidth, tiered storage, thin provisioning, and data de-duplication Disaster recovery Data center utilization: The XenDesktop deployments can leverage multiple data centers for improving the user performance and the availability of resources. Multiple data centers can be deployed in an active/active or an active/passive configuration. An active/active configuration allows for both data centers to be utilized, although the individual users are tied to a specific location. Data center connectivity: An active/active data center configuration utilizing GSLB (Global Server Load Balancing) ensures that the users will be able to establish a connection even if one datacenter is unavailable. In the active/active configuration, the considerations that should be made are as follows: data center failover time, application servers, and StoreFront optimal routing. Capacity in the secondary data center: Planning of the secondary data center capacity is determined by the cost and by the management to support full capacity in each data center. A percent of the overall users, or a percent of the users per application, may be considered for the secondary data center facility. Then, it also needs the consideration of the type and amount of resources that will be made available in a failover scenario. Tools for designing XenDesktop® In the previous section, we saw a broad list of components, technologies, and configuration options, and so on, which we learned are involved in the process of designing the XenDesktop deployment. Obviously, designing the XenDesktop deployment for large, advanced, and complex business scenarios is a mammoth task, which requires operational knowledge of a broad range of technologies. Understanding the maze of this complexity, Citrix constantly helps the customers with great learning resources through handbooks, reviewer guides, blueprints, online eDocs, and training sessions. To ease the life of technical architects and XenDesktop designing and deployment consultants, Citrix has developed an online designing portal called Project Accelerator, which automates, streamlines, and covers all the broad aspects that are involved in the XenDesktop deployment. Project Accelerator Citrix designed the Project Accelerator web based designing tool, and it is available to the customers after they login. Its design is based on the Citrix consulting best practices for the XenDesktop deployment and implementation. It follows the layered FMA and allows you to create a close to deployment architecture. It covers all the key decisions and facilitates modifying them and evaluating their impact on the overall architecture. Upon completion of the design, it generates an architectural diagram and a deployment sizing plan. One can define more than one project and customize them in parallel to achieve multiple deployment plans. I highly recommended starting your Production XenDesktop deployment with the Project Accelerator architecture and the sizing design. Virtual Desktop Handbook Citrix provides the handbook along with new XenDesktop releases. The handbook covers the latest features of that XenDesktop version and provides detailed information on the design decisions. It provides all the possible options for each of the decisions involved, and these options are evaluated and validated in an in-depth manner by the Citrix Solutions lab. They include the Citrix Consulting leading best practices as well. This helps architects and engineers to consider the recommended technologies, and then evaluate them further for fulfilling the business requirements. The Virtual Desktop Handbook for latest the version of XenDesktop, that is, 7.x, can be found at: http://support.Citrix.com/article/CTX139331. XenDesktop® Reviewer's Guide The Reviewer's Guide is also released along with the new versions of XenDesktop. They are designed for helping businesses in quickly installing and configuring the XenDesktop for evaluation. They provide a step-by-step screencast of the installation and configuration wizards of XenDesktop. This provides practical guidance to the IT administrators for successfully installing and delivering the XenDesktop sessions. The XenDesktop Reviewers Guide for the latest version of XenDesktop, that is, 7.6, can be found at https://www.citrix.com/content/dam/citrix/en_us/documents/products-solutions/xendesktop-reviewers-guide.pdf. Summary We learnt the decision making that is involved in designing the XenDesktop in general, and we also saw the deployment designs of the complex environments involving the cloud capabilities. We also saw different tools for designing XenDesktop. Resources for Article: Further resources on this subject: Understanding Citrix®Provisioning Services 7.0 [article] Installation and Deployment of Citrix Systems®' CPSM [article] Designing, Sizing, Building, and Configuring Citrix VDI-in-a-Box [article]

0
0
9311

article-image-architecting-and-coding-high-performance-net-applications

Packt

09 Jul 2015

15 min read

Architecting and coding high performance .NET applications

Packt

09 Jul 2015

15 min read

In this article by Antonio Esposito, author of Learning .NET High Performance Programming, we will learn about low-pass audio filtering implemented using .NET, and also learn about MVVM and XAML. Model-View-ViewModel and XAML The MVVM pattern is another descendant of the MVC pattern. Born from an extensive update to the MVP pattern, it is at the base of all eXtensible Application Markup Language (XAML) language-based frameworks, such as Windows presentation foundation (WPF), Silverlight, Windows Phone applications, and Store Apps (formerly known as Metro-style apps). MVVM is different from MVC, which is used by Microsoft in its main web development framework in that it is used for desktop or device class applications. The first and (still) the most powerful application framework using MVVM in Microsoft is WPF, a desktop class framework that can use the full .NET 4.5.3 environment. Future versions within Visual Studio 2015 will support built-in .NET 4.6. On the other hand, all other frameworks by Microsoft that use the XAML language supporting MVVM patterns are based on a smaller edition of .NET. This happens with Silverlight, Windows Store Apps, Universal Apps, or Windows Phone Apps. This is why Microsoft made the Portable Library project within Visual Studio, which allows us to create shared code bases compatible with all frameworks. While a Controller in MVC pattern is sort of a router for requests to catch any request and parsing input/output Models, the MVVM lies behind any View with a full two-way data binding that is always linked to a View's controls and together at Model's properties. Actually, multiple ViewModels may run the same View and many Views can use the same single/multiple instance of a given ViewModel. A simple MVC/MVVM design comparative We could assert that the experience offered by MVVM is like a film, while the experience offered by MVC is like photography, because while a Controller always makes one-shot elaborations regarding the application user requests in MVC, in MVVM, the ViewModel is definitely the view! Not only does a ViewModel lie behind a View, but we could also say that if a VM is a body, then a View is its dress. While the concrete View is the graphical representation, the ViewModel is the virtual view, the un-concrete view, but still the View. In MVC, the View contains the user state (the value of all items showed in the UI) until a GET/POST invocation is sent to the web server. Once sent, in the MVC framework, the View simply binds one-way reading data from a Model. In MVVM, behaviors, interaction logic, and user state actually live within the ViewModel. Moreover, it is again in the ViewModel that any access to the underlying Model, domain, and any persistence provider actually flows. Between a ViewModel and View, a data connection called data binding is established. This is a declarative association between a source and target property, such as Person.Name with TextBox.Text. Although it is possible to configure data binding by imperative coding (while declarative means decorating or setting the property association in XAML), in frameworks such as WPF and other XAML-based frameworks, this is usually avoided because of the more decoupled result made by the declarative choice. The most powerful technology feature provided by any XAML-based language is actually the data binding, other than the simpler one that was available in Windows Forms. XAML allows one-way binding (also reverted to the source) and two-way binding. Such data binding supports any source or target as a property from a Model or ViewModel or any other control's dependency property. This binding subsystem is so powerful in XAML-based languages that events are handled in specific objects named Command, and this can be data-bound to specific controls, such as buttons. In the .NET framework, an event is an implementation of the Observer pattern that lies within a delegate object, allowing a 1-N association between the only source of the event (the owner of the event) and more observers that can handle the event with some specific code. The only object that can raise the event is the owner itself. In XAML-based languages, a Command is an object that targets a specific event (in the meaning of something that can happen) that can be bound to different controls/classes, and all of those can register handlers or raise the signaling of all handlers. An MVVM performance map analysis Performance concerns Regarding performance, MVVM behaves very well in several scenarios in terms of data retrieval (latency-driven) and data entry (throughput- and scalability-driven). The ability to have an impressive abstraction of the view in the VM without having to rely on the pipelines of MVC (the actions) makes the programming very pleasurable and give the developer the choice to use different designs and optimization techniques. Data binding itself is done by implementing specific .NET interfaces that can be easily centralized. Talking about latency, it is slightly different from previous examples based on web request-response time, unavailable in MVVM. Theoretically speaking, in the design pattern of MVVM, there is no latency at all. In a concrete implementation within XAML-based languages, latency can refer to two different kinds of timings. During data binding, latency is the time between when a VM makes new data available and a View actually renders it. Instead, during a command execution, latency is the time between when a command is invoked and all relative handlers complete their execution. We usually use the first definition until differently specified. Although the nominal latency is near zero (some milliseconds because of the dictionary-based configuration of data binding), specific implementation concerns about latency actually exist. In any Model or ViewModel, an updated data notification is made by triggering the View with the INotifyPropertyChanged interface. The .NET interface causes the View to read the underlying data again. Because all notifications are made by a single .NET event, this can easily become a bottleneck because of the serialized approach used by any delegate or event handlers in the .NET world. On the contrary, when dealing with data that flows from the View to the Model, such an inverse binding is usually configured declaratively within the {Binding …} keyword, which supports specifying binding directions and trigger timing (to choose from the control's lost focus CLR event or anytime the property value changes). The XAML data binding does not add any measurable time during its execution. Although this, as said, such binding may link multiple properties or the control's dependency properties together. Linking this interaction logic could increase latency time heavily, adding some annoying delay at the View level. One fact upon all, is the added latency by any validation logic. It is even worse if such validation is other than formal, such as validating some ID or CODE against a database value. Talking about scalability, MVVM patterns does some work here, while we can make some concrete analysis concerning the XAML implementation. It is easy to say that scaling out is impossible because MVVM is a desktop class layered architecture that cannot scale. Instead, we can say that in a multiuser scenario with multiple client systems connected in a 2-tier or 3-tier system architecture, simple MVVM and XAML-based frameworks will never act as bottlenecks. The ability to use the full .NET stack in WPF gives us the chance to use all synchronization techniques available, in order to use a directly connected DBMS or middleware tier. Instead of scaling up by moving the application to an increased CPU clock system, the XAML-based application would benefit more from an increased CPU core count system. Obviously, to profit from many CPU cores, mastering parallel techniques is mandatory. About the resource usage, MVVM-powered architectures require only a simple POCO class as a Model and ViewModel. The only additional requirement is the implementation of the INotifyPropertyChanged interface that costs next to nothing. Talking about the pattern, unlike MVC, which has a specific elaboration workflow, MVVM does not offer this functionality. Multiple commands with multiple logic can process their respective logic (together with asynchronous invocation) with the local VM data or by going down to the persistence layer to grab missing information. We have all the choices here. Although MVVM does not cost anything in terms of graphical rendering, XAML-based frameworks make massive use of hardware-accelerated user controls. Talking about an extreme choice, Windows Forms with Graphics Device Interface (GDI)-based rendering require a lot less resources and can give a higher frame rate on highly updatable data. Thus, if a very high FPS is needed, the choice of still rendering a WPF area in GDI is available. For other XAML languages, the choice is not so easy to obtain. Obviously, this does not mean that XAML is slow in rendering with its DirectX based engine. Simply consider that WPF animations need a good Graphics Processing Unit (GPU), while a basic GDI animation will execute on any system, although it is obsolete. Talking about availability, MVVM-based architectures usually lead programmers to good programming. As MVC allows it, MVVM designs can be tested because of the great modularity. While a Controller uses a pipelined workflow to process any requests, a ViewModel is more flexible and can be tested with multiple initialization conditions. This makes it more powerful but also less predictable than a Controller, and hence is tricky to use. In terms of design, the Controller acts as a transaction script, while the ViewModel acts in a more realistic, object-oriented approach. Finally, yet importantly, throughput and efficiency are simply unaffected by MVVM-based architectures. However, because of the flexibility the solution gives to the developer, any interaction and business logic design may be used inside a ViewModel and their underlying Models. Therefore, any success or failure regarding those performance aspects are usually related to programmer work. In XAML frameworks, throughput is achieved by an intensive use of asynchronous and parallel programming assisted by a built-in thread synchronization subsystem, based on the Dispatcher class that deals with UI updates. Low-pass filtering for Audio Low-pass filtering has been available since 2008 in the native .NET code. NAudio is a powerful library helping any CLR programmer to create, manipulate, or analyze audio data in any format. Available through NuGet Package Manager, NAudio offers a simple and .NET-like programming framework, with specific classes and stream-reader for audio data files. Let's see how to apply the low-pass digital filter in a real audio uncompressed file in WAVE format. For this test, we will use the Windows start-up default sound file. The chart is still made in a legacy Windows Forms application with an empty Form1 file, as shown in the previous example: private async void Form1_Load(object sender, EventArgs e) { //stereo wave file channels var channels = await Task.Factory.StartNew(() => { //the wave stream-like reader using (var reader = new WaveFileReader("startup.wav")) { var leftChannel = new List<float>(); var rightChannel = new List<float>(); //let's read all frames as normalized floats while (reader.Position < reader.Length) { var frame = reader.ReadNextSampleFrame(); leftChannel.Add(frame[0]); rightChannel.Add(frame[1]); } return new { Left = leftChannel.ToArray(), Right = rightChannel.ToArray(), }; } }); //make a low-pass digital filter on floating point data //at 200hz var leftLowpassTask = Task.Factory.StartNew(() => LowPass(channels.Left, 200).ToArray()); var rightLowpassTask = Task.Factory.StartNew(() => LowPass(channels.Right, 200).ToArray()); //this let the two tasks work together in task-parallelism var leftChannelLP = await leftLowpassTask; var rightChannelLP = await rightLowpassTask; //create and databind a chart var chart1 = CreateChart(); chart1.DataSource = Enumerable.Range(0, channels.Left.Length).Select(i => new { Index = i, Left = channels.Left[i], Right = channels.Right[i], LeftLP = leftChannelLP[i], RightLP = rightChannelLP[i], }).ToArray(); chart1.DataBind(); //add the chart to the form this.Controls.Add(chart1); } private static Chart CreateChart() { //creates a chart //namespace System.Windows.Forms.DataVisualization.Charting var chart1 = new Chart(); //shows chart in fullscreen chart1.Dock = DockStyle.Fill; //create a default area chart1.ChartAreas.Add(new ChartArea()); //left and right channel series chart1.Series.Add(new Series { XValueMember = "Index", XValueType = ChartValueType.Auto, YValueMembers = "Left", ChartType = SeriesChartType.Line, }); chart1.Series.Add(new Series { XValueMember = "Index", XValueType = ChartValueType.Auto, YValueMembers = "Right", ChartType = SeriesChartType.Line, }); //left and right channel low-pass (bass) series chart1.Series.Add(new Series { XValueMember = "Index", XValueType = ChartValueType.Auto, YValueMembers = "LeftLP", ChartType = SeriesChartType.Line, BorderWidth = 2, }); chart1.Series.Add(new Series { XValueMember = "Index", XValueType = ChartValueType.Auto, YValueMembers = "RightLP", ChartType = SeriesChartType.Line, BorderWidth = 2, }); return chart1; } Let's see the graphical result: The Windows start-up sound waveform. In bolt, the bass waveform with a low-pass filter at 200hz. The usage of parallelism in elaborations such as this is mandatory. Audio elaboration is a canonical example of engineering data computation because it works on a huge dataset of floating points values. A simple file, such as the preceding one that contains less than 2 seconds of audio sampled at (only) 22,050 Hz, produces an array greater than 40,000 floating points per channel (stereo = 2 channels). Just to have an idea of how hard processing audio files is, note that an uncompressed CD quality song of 4 minutes sampled at 44,100 samples per second * 60 (seconds) * 4 (minutes) will create an array greater than 10 million floating-point items per channel. Because of the FFT intrinsic logic, any low-pass filtering run must run in a single thread. This means that the only optimization we can apply when running FFT based low-pass filtering is parallelizing in a per channel basis. For most cases, this choice can only bring a 2X throughput improvement, regardless of the processor count of the underlying system. Summary In this article we got introduced to the applications of .NET high-performance performance. We learned how MVVM and XAML play their roles in .NET to create applications for various platforms, also we learned about its performance characteristics. Next we learned how high-performance .NET had applications in engineering aspects through a practical example of low-pass audio filtering. It showed you how versatile it is to apply high-performance programming to specific engineering applications. Resources for Article: Further resources on this subject: Windows Phone 8 Applications [article] Core .NET Recipes [article] Parallel Programming Patterns [article]

0
0
3138

article-image-essentials-working-python-collections

Packt

09 Jul 2015

14 min read

The Essentials of Working with Python Collections

Packt

09 Jul 2015

14 min read

In this article by Steven F. Lott, the author of the book Python Essentials, we'll look at the break and continue statements; these modify a for or while loop to allow skipping items or exiting before the loop has processed all items. This is a fundamental change in the semantics of a collection-processing statement. (For more resources related to this topic, see here.) Processing collections with the for statement The for statement is an extremely versatile way to process every item in a collection. We do this by defining a target variable, a source of items, and a suite of statements. The for statement will iterate through the source of items, assigning each item to the target variable, and also execute the suite of statements. All of the collections in Python provide the necessary methods, which means that we can use anything as the source of items in a for statement. Here's some sample data that we'll work with. This is part of Mike Keith's poem, Near a Raven. We'll remove the punctuation to make the text easier to work with: >>> text = '''Poe, E. ... Near a Raven ... ... Midnights so dreary, tired and weary.''' >>> text = text.replace(",","").replace(".","").lower() This will put the original text, with uppercase and lowercase and punctuation into the text variable. When we use text.split(), we get a sequence of individual words. The for loop can iterate through this sequence of words so that we can process each one. The syntax looks like this: >>> cadaeic= {} >>> for word in text.split(): ... cadaeic[word]= len(word) We've created an empty dictionary, and assigned it to the cadaeic variable. The expression in the for loop, text.split(), will create a sequence of substrings. Each of these substrings will be assigned to the word variable. The for loop body—a single assignment statement—will be executed once for each value assigned to word. The resulting dictionary might look like this (irrespective of ordering): {'raven': 5, 'midnights': 9, 'dreary': 6, 'e': 1, 'weary': 5, 'near': 4, 'a': 1, 'poe': 3, 'and': 3, 'so': 2, 'tired': 5} There's no guaranteed order for mappings or sets. Your results may differ slightly. In addition to iterating over a sequence, we can also iterate over the keys in a dictionary. >>> for word in sorted(cadaeic): ... print(word, cadaeic[word]) When we use sorted() on a tuple or a list, an interim list is created with sorted items. When we apply sorted() to a mapping, the sorting applies to the keys of the mapping, creating a sequence of sorted keys. This loop will print a list in alphabetical order of the various pilish words used in this poem. Pilish is a subset of English where the word lengths are important: they're used as mnemonic aids. A for statement corresponds to the "for all" logical quantifier, . At the end of a simple for loop we can assert that all items in the source collection have been processed. In order to build the "there exists" quantifier, , we can either use the while statement, or the break statement inside the body of a for statement. Using literal lists in a for statement We can apply the for statement to a sequence of literal values. One of the most common ways to present literals is as a tuple. It might look like this: for scheme in 'http', 'https', 'ftp': do_something(scheme) This will assign three different values to the scheme variable. For each of those values, it will evaluate the do_something() function. From this, we can see that, strictly-speaking, the () are not required to delimit a tuple object. If the sequence of values grows, however, and we need to span more than one physical line, we'll want to add (), making the tuple literal more explicit. Using the range() and enumerate() functions The range() object will provide a sequence of numbers, often used in a for loop. The range() object is iterable, it's not itself a sequence object. It's a generator, which will produce items when required. If we use range() outside a for statement, we need to use a function like list(range(x)) or tuple(range(a,b)) to consume all of the generated values and create a new sequence object. The range() object has three commonly-used forms: range(n) produces ascending numbers including 0 but not including n itself. This is a half-open interval. We could say that range(n) produces numbers, x, such that . The expression list(range(5)) returns [0, 1, 2, 3, 4]. This produces n values including 0 and n - 1. range(a,b) produces ascending numbers starting from a but not including b. The expression tuple(range(-1,3)) will return (-1, 0, 1, 2). This produces b - a values including a and b - 1. range(x,y,z) produces ascending numbers in the sequence . This produces (y-x)//z values. We can use the range() object like this: for n in range(1, 21): status= str(n) if n % 5 == 0: status += " fizz" if n % 7 == 0: status += " buzz" print(status) In this example, we've used a range() object to produce values, n, such that . We use the range() object to generate the index values for all items in a list: for n in range(len(some_list)): print(n, some_list[n]) We've used the range() function to generate values between 0 and the length of the sequence object named some_list. The for statement allows multiple target variables. The rules for multiple target variables are the same as for a multiple variable assignment statement: a sequence object will be decomposed and items assigned to each variable. Because of that, we can leverage the enumerate() function to iterate through a sequence and assign the index values at the same time. It looks like this: for n, v in enumerate(some_list): print(n, v) The enumerate() function is a generator function which iterates through the items in source sequence and yields a sequence of two-tuple pairs with the index and the item. Since we've provided two variables, the two-tuple is decomposed and assigned to each variable. There are numerous use cases for this multiple-assignment for loop. We often have list-of-tuples data structures that can be handled very neatly with this multiple-assignment feature. Iterating with the while statement The while statement is a more general iteration than the for statement. We'll use a while loop in two situations. We'll use this in cases where we don't have a finite collection to impose an upper bound on the loop's iteration; we may suggest an upper bound in the while clause itself. We'll also use this when writing a "search" or "there exists" kind of loop; we aren't processing all items in a collection. A desktop application that accepts input from a user, for example, will often have a while loop. The application runs until the user decides to quit; there's no upper bound on the number of user interactions. For this, we generally use a while True: loop. Infinite iteration is recommended. If we want to write a character-mode user interface, we could do it like this: quit_received= False while not quit_received: command= input("prompt> ") quit_received= process(command) This will iterate until the quit_received variable is set to True. This will process indefinitely; there's no upper boundary on the number of iterations. This process() function might use some kind of command processing. This should include a statement like this: if command.lower().startswith("quit"): return True When the user enters "quit", the process() function will return True. This will be assigned to the quit_received variable. The while expression, not quit_received, will become False, and the loop ends. A "there exists" loop will iterate through a collection, stopping at the first item that meets certain criteria. This can look complex because we're forced to make two details of loop processing explicit. Here's an example of searching for the first value that meets a condition. This example assumes that we have a function, condition(), which will eventually be True for some number. Here's how we can use a while statement to locate the minimum for which this function is True: >>> n = 1 >>> while n != 101 and not condition(n): ... n += 1 >>> assert n == 101 or condition(n) The while statement will terminate when n == 101 or the condition(n) is True. If this expression is False, we can advance the n variable to the next value in the sequence of values. Since we're iterating through the values in order from the smallest to the largest, we know that n will be the smallest value for which the condition() function is true. At the end of the while statement we have included a formal assertion that either n is 101 or the condition() function is True for the given value of n. Writing an assertion like this can help in design as well as debugging because it will often summarize the loop invariant condition. We can also write this kind of loop using the break statement in a for loop, something we'll look at in the next section. The continue and break statements The continue statement is helpful for skipping items without writing deeply-nested if statements. The effect of executing a continue statement is to skip the rest of the loop's suite. In a for loop, this means that the next item will be taken from the source iterable. In a while loop, this must be used carefully to avoid an otherwise infinite iteration. We might see file processing that looks like this: for line in some_file: clean = line.strip() if len(clean) == 0: continue data, _, _ = clean.partition("#") data = data.rstrip() if len(data) == 0: continue process(data) In this loop, we're relying on the way files act like sequences of individual lines. For each line in the file, we've stripped whitespace from the input line, and assigned the resulting string to the clean variable. If the length of this string is zero, the line was entirely whitespace, and we'll continue the loop with the next line. The continue statement skips the remaining statements in the body of the loop. We'll partition the line into three pieces: a portion in front of any "#", the "#" (if present), and the portion after any "#". We've assigned the "#" character and any text after the "#" character to the same easily-ignored variable, _, because we don't have any use for these two results of the partition() method. We can then strip any trailing whitespace from the string assigned to the data variable. If the resulting string has a length of zero, then the line is entirely filled with "#" and any trailing comment text. Since there's no useful data, we can continue the loop, ignoring this line of input. If the line passes the two if conditions, we can process the resulting data. By using the continue statement, we have avoided complex-looking, deeply-nested if statements. It's important to note that a continue statement must always be part of the suite inside an if statement, inside a for or while loop. The condition on that if statement becomes a filter condition that applies to the collection of data being processed. continue always applies to the innermost loop. Breaking early from a loop The break statement is a profound change in the semantics of the loop. An ordinary for statement can be summarized by "for all." We can comfortably say that "for all items in a collection, the suite of statements was processed." When we use a break statement, a loop is no longer summarized by "for all." We need to change our perspective to "there exists". A break statement asserts that at least one item in the collection matches the condition that leads to the execution of the break statement. Here's a simple example of a break statement: for n in range(1, 100): factors = [] for x in range(1,n): if n % x == 0: factors.append(x) if sum(factors) == n: break We've written a loop that is bound by . This loop includes a break statement, so it will not process all values of n. Instead, it will determine the smallest value of n, for which n is equal to the sum of its factors. Since the loop doesn't examine all values, it shows that at least one such number exists within the given range. We've used a nested loop to determine the factors of the number n. This nested loop creates a sequence, factors, for all values of x in the range , such that x, is a factor of the number n. This inner loop doesn't have a break statement, so we are sure it examines all values in the given range. The least value for which this is true is the number six. It's important to note that a break statement must always be part of the suite inside an if statement inside a for or while loop. If the break isn't in an if suite, the loop will always terminate while processing the first item. The condition on that if statement becomes the "where exists" condition that summarizes the loop as a whole. Clearly, multiple if statements with multiple break statements mean that the overall loop can have a potentially confusing and difficult-to-summarize post-condition. Using the else clause on a loop Python's else clause can be used on a for or while statement as well as on an if statement. The else clause executes after the loop body if there was no break statement executed. To see this, here's a contrived example: >>> for item in 1,2,3: ... print(item) ... if item == 2: ... print("Found",item) ... break ... else: ... print("Found Nothing") The for statement here will iterate over a short list of literal values. When a specific target value has been found, a message is printed. Then, the break statement will end the loop, avoiding the else clause. When we run this, we'll see three lines of output, like this: 1 2 Found 2 The value of three isn't shown, nor is the "Found Nothing" message in the else clause. If we change the target value in the if statement from two to a value that won't be seen (for example, zero or four), then the output will change. If the break statement is not executed, then the else clause will be executed. The idea here is to allow us to write contrasting break and non-break suites of statements. An if statement suite that includes a break statement can do some processing in the suite before the break statement ends the loop. An else clause allows some processing at the end of the loop when none of the break-related suites statements were executed. Summary In this article, we've looked at the for statement, which is the primary way we'll process the individual items in a collection. A simple for statement assures us that our processing has been done for all items in the collection. We've also looked at the general purpose while loop. Resources for Article: Further resources on this subject: Introspecting Maya, Python, and PyMEL [article] Analyzing a Complex Dataset [article] Geo-Spatial Data in Python: Working with Geometry [article]

0
0
2320

article-image-responsive-web-design-wordpress

Packt

09 Jul 2015

13 min read

Responsive Web Design with WordPress

Packt

09 Jul 2015

13 min read

0
0
29501

article-image-clustering-and-other-unsupervised-learning-methods

Packt

09 Jul 2015

19 min read

Clustering and Other Unsupervised Learning Methods

Packt

09 Jul 2015

19 min read

0
0
32729

Packt

08 Jul 2015

9 min read

What is Apache Camel?

Packt

08 Jul 2015

9 min read

In this article Jean-Baptiste Onofré, author of the book Mastering Apache Camel, we will see how Apache Camel originated in Apache ServiceMix. Apache ServiceMix 3 was powered by the Spring framework and implemented in the JBI specification. The Java Business Integration (JBI) specification proposed a Plug and Play approach for integration problems. JBI was based on WebService concepts and standards. For instance, it directly reusesthe Message Exchange Patterns (MEP) concept that comes from WebService Description Language (WSDL). Camel reuses some of these concepts, for instance, you will see that we have the concept of MEP in Camel. However, JBI suffered mostly from two issues: In JBI, all messages between endpoints are transported in the Normalized Messages Router (NMR). In the NMR, a message has a standard XML format. As all messages in the NMR havethe same format, it's easy to audit messages and the format is predictable. However, the JBI XML format has an important drawback for performances: it needs to marshall and unmarshall the messages. Some protocols (such as REST or RMI) are not easy to describe in XML. For instance, REST can work in stream mode. It doesn't make sense to marshall streamsin XML. Camel is payload-agnostic. This means that you can transport any kind of messages with Camel (not necessary XML formatted). JBI describes a packaging. We distinguish the binding components (responsible for the interaction with the system outside of the NMR and the handling of the messages in the NMR), and the service engines (responsible for transforming the messages inside the NMR). However, it's not possible to directly deploy the endpoints based on these components. JBI requires a service unit (a ZIP file) per endpoint, and for each package in a service assembly (another ZIP file). JBI also splits the description of the endpoint from its configuration. It does not result in a very flexible packaging: with definitions and configurations scattered in different files, not easy to maintain. In Camel, the configuration and definition of the endpoints are gatheredin a simple URI. It's easier to read. Moreover, Camel doesn't force any packaging; the same definition can be packaged in a simple XML file, OSGi bundle, andregular JAR file. In addition to JBI, another foundation of Camel is the book Enterprise Integration Patterns by Gregor Hohpe and Bobby Woolf. It describes design patterns answering classical problems while dealing with enterprise application integration and message oriented middleware. The book describes the problems and the patterns to solve them. Camel strives to implement the patterns described in the book to make them easy to use and let the developer concentrate on the task at hand. This is what Camel is: an open source framework that allows you to integrate systems and that comes with a lot of connectors and Enterprise Integration Patterns (EIP) components out of the box. And if that is not enough, one can extend and implement custom components. Components and bean support Apache Camel ships with a wide variety of components out of the box; currently, there are more than 100 components available. We can see: The connectivity components that allow exposure of endpoints for external systems or communicate with external systems. For instance, the FTP, HTTP, JMX, WebServices, JMS, and a lot more components are connectivity components. Creating an endpoint and the associated configuration for these components is easy, by directly using a URI. The internal components applying rules to the messages internally to Camel. These kinds of components apply validation or transformation rules to the inflight message. For instance, validation or XSLT are internal components. Camel brings a very powerful connectivity and mediation framework. Moreover, it's pretty easy to create new custom components, allowing you to extend Camel if the default components set doesn't match your requirements. It's also very easy to implement complex integration logic by creating your own processors and reusing your beans. Camel supports beans frameworks (IoC), such as Spring or Blueprint. Predicates and expressions As we will see later, most of the EIP need a rule definition to apply a routing logic to a message. The rule is described using an expression. It means that we have to define expressions or predicates in the Enterprise Integration Patterns. An expression returns any kind of value, whereas a predicate returns true or false only. Camel supports a lot of different languages to declare expressions or predicates. It doesn't force you to use one, it allows you to use the most appropriate one. For instance, Camel supports xpath, mvel, ognl, python, ruby, PHP, JavaScript, SpEL (Spring Expression Language), Groovy, and so on as expression languages. It also provides native Camel prebuilt functions and languages that are easy to use such as header, constant, or simple languages. Data format and type conversion Camel is payload-agnostic. This means that it can support any kind of message. Depending on the endpoints, it could be required to convert from one format to another. That's why Camel supports different data formats, in a pluggable way. This means that Camel can marshall or unmarshall a message in a given format. For instance, in addition to the standard JVM serialization, Camel natively supports Avro, JSON, protobuf, JAXB, XmlBeans, XStream, JiBX, SOAP, and so on. Depending on the endpoints and your need, you can explicitly define the data format during the processing of the message. On the other hand, Camel knows the expected format and type of endpoints. Thanks to this, Camel looks for a type converter, allowing to implicitly transform a message from one format to another. You can also explicitly define the type converter of your choice at some points during the processing of the message. Camel provides a set of ready-to-use type converters, but, as Camel supports a pluggable model, you can extend it by providing your own type converters. It's a simple POJO to implement. Easy configuration and URI Camel uses a different approach based on URI. The endpoint itself and its configuration are on the URI. The URI is human readable and provides the details of the endpoint, which is the endpoint component and the endpoint configuration. As this URI is part of the complete configuration (which defines what we name a route, as we will see later), it's possible to have a complete overview of the integration logic and connectivity in a row. Lightweight and different deployment topologies Camel itself is very light. The Camel core is only around 2 MB, and contains everythingrequired to run Camel. As it's based on a pluggable architecture, all Camel components are provided as external modules, allowing you to install only what you need, without installing superfluous and needlessly heavy modules. Camel is based on simple POJO, which means that the Camel core doesn't depend on other frameworks: it's an atomic framework and is ready to use. All other modules (components, DSL, and so on) are built on top of this Camel core. Moreover, Camel is not tied to one container for deployment. Camel supports a wide range of containers to run. They are as follows: A J2EE application server such as WebSphere, WebLogic, JBoss, and so on A Web container such as Apache Tomcat An OSGi container such as Apache Karaf A standalone application using frameworks such as Spring Camel gives a lot of flexibility, allowing you to embed it into your application or to use an enterprise-ready container. Quick prototyping and testing support In any integration project, it's typical that we have some part of the integration logic not yet available. For instance: The application to integrate with has not yet been purchased or not yet ready The remote system to integrate with has a heavy cost, not acceptable during the development phase Multiple teams work in parallel, so we may have some kinds of deadlocks between the teams As a complete integration framework, Camel provides a very easy way to prototype part of the integration logic. Even if you don't have the actual system to integrate, you can simulate this system (mock), as it allows you to implement your integration logic without waiting for dependencies. The mocking support is directly part of the Camel core and doesn't require any additional dependency. Along the same lines, testing is also crucial in an integration project. In such a kind of project, a lot of errors can happen and most are unforeseen. Moreover, a small change in an integration process might impact a lot of other processes. Camel provides the tools to easily test your design and integration logic, allowing you to integrate this in a continuous integration platform. Management and monitoring using JMX Apache Camel uses the Java Management Extension (JMX) standard and provides a lot of insights into the system using MBeans (Management Beans), providing a detailed view of the following current system: The different integration processes with the associated metrics The different components and endpoints with the associated metrics Moreover, these MBeans provide more insights than metrics. They also provide the operations to manage Camel. For instance, the operations allow you to stop an integration process, to suspend an endpoint, and so on. Using a combination of metrics and operations, you can configure a very agile integration solution. Active community The Apache Camel community is very active. This means that potential issues are identified very quickly and a fix is available soon after. However, it also means that a lot of ideas and contributions are proposed, giving more and more features to Camel. Another big advantage of an active community is that you will never be alone; a lot of people are active on the mailing lists who are ready to answer your question and provide advice. Summary Apache Camel is an enterprise integration solution used in many large organizations with enterprise support available through RedHat or Talend. Resources for Article: Further resources on this subject: Getting Started [article] A Quick Start Guide to Flume [article] Best Practices [article]

0
0
2520

article-image-integrating-google-play-services

Packt

08 Jul 2015

41 min read

Integrating Google Play Services

Packt

08 Jul 2015

41 min read

In this article Integrating Google Play Services by Raul Portales, author of the book Mastering Android Game Development, we will cover the tools that Google Play Services offers for game developers. We'll see the integration of achievements and leaderboards in detail, take an overview of events and quests, save games, and use turn-based and real-time multiplaying. Google provides Google Play Services as a way to use special features in apps. Being the game services subset the one that interests us the most. Note that Google Play Services are updated as an app that is independent from the operating system. This allows us to assume that most of the players will have the latest version of Google Play Services installed. (For more resources related to this topic, see here.) More and more features are being moved from the Android SDK to the Play Services because of this. Play Services offer much more than just services for games, but there is a whole section dedicated exclusively to games, Google Play Game Services (GPGS). These features include achievements, leaderboards, quests, save games, gifts, and even multiplayer support. GPGS also comes with a standalone app called "Play Games" that shows the user the games he or she has been playing, the latest achievements, and the games his or her friends play. It is a very interesting way to get exposure for your game. Even as a standalone feature, achievements and leaderboards are two concepts that most games use nowadays, so why make your own custom ones when you can rely on the ones made by Google? GPGS can be used on many platforms: Android, iOS and web among others. It is more used on Android, since it is included as a part of Google apps. There is extensive step-by-step documentation online, but the details are scattered over different places. We will put them together here and link you to the official documentation for more detailed information. For this article, you are supposed to have a developer account and have access to the Google Play Developer Console. It is also advisable for you to know the process of signing and releasing an app. If you are not familiar with it, there is very detailed official documentation at http://developer.android.com/distribute/googleplay/start.html. There are two sides of GPGS: the developer console and the code. We will alternate from one to the other while talking about the different features. Setting up the developer console Now that we are approaching the release state, we have to start working with the developer console. The first thing we need to do is to get into the Game services section of the console to create and configure a new game. In the left menu, we have an option labeled Game services. This is where you have to click. Once in the Game services section, click on Add new game: This bring us to the set up dialog. If you are using other Google services like Google Maps or Google Cloud Messaging (GCM) in your game, you should select the second option and move forward. Otherwise, you can just fill in the fields for I don't use any Google APIs on my game yet and continue. If you don't know whether you are already using them, you probably aren't. Now, it is time to link a game to it. I recommend you publish your game beforehand as an alpha release. This will let you select it from the list when you start typing the package name. Publishing the game to the alpha channel before adding it to Game services makes it much easier to configure. If you are not familiar with signing and releasing your app, check out the official documentation at http://developer.android.com/tools/publishing/app-signing.html. Finally, there are only two steps that we have to take when we link the first app. We need to authorize it and provide branding information. The authorization will generate an OAuth key—that we don't need to use since it is required for other platforms—and also a game ID. This ID is unique to all the linked apps and we will need it to log in. But there is no need to write it down now, it can be found easily in the console at anytime. Authorizing the app will generate the game ID, which is unique to all linked apps. Note that the app we have added is configured with the release key. If you continue and try the login integration, you will get an error telling you that the app was signed with the wrong certificate: You have two ways to work with this limitation: Always make a release build to test GPGS integration Add your debug-signed game as a linked app I recommend that you add the debug signed app as a linked app. To do this, we just need to link another app and configure it with the SHA1 fingerprint of the debug key. To obtain it, we have to open a terminal and run the keytool utility: keytool -exportcert -alias androiddebugkey -keystore <path-to-debug-keystore> -list -v Note that in Windows, the debug keystore can be found at C:Users<USERNAME>.androiddebug.keystore. On Mac and Linux, the debug keystore is typically located at ~/.android/debug.keystore. Dialog to link the debug application on the Game Services console Now, we have the game configured. We could continue creating achievements and leaderboards in the console, but we will put it aside and make sure that we can sign in and connect with GPGS. The only users who can sign in to GPGS while a game is not published are the testers. You can make the alpha and/or beta testers of a linked app become testers of the game services, and you can also add e-mail addresses by hand for this. You can modify this in the Testing tab. Only test accounts can access a game that is not published. The e-mail of the owner of the developer console is prefilled as a tester. Just in case you have problems logging in, double-check the list of testers. A game service that is not published will not appear in the feed of the Play Services app, but it will be possible to test and modify it. This is why it is a good idea to keep it in draft mode until the game itself is ready and publish both the game and the game services at the same time. Setting up the code The first thing we need to do is to add the Google Play Services library to our project. This should already have been done by the wizard when we created the project, but I recommend you to double-check it now. The library needs to be added to the build.gradle file of the main module. Note that Android Studio projects contain a top-level build.gradle and a module-level build.gradle for each module. We will modify the one that is under the mobile module. Make sure that the play services' library is listed under dependencies: apply plugin: 'com.android.application' dependencies { compile 'com.android.support:appcompat-v7:22.1.1' compile 'com.google.android.gms:play-services:7.3.0' } At the point of writing, the latest version is 7.3.0. The basic features have not changed much and they are unlikely to change. You could force Gradle to use a specific version of the library, but in general I recommend you use the latest version. Once you have it, save the changes and click on Sync Project with Gradle Files. To be able to connect with GPGS, we need to let the game know what the game ID is. This is done through the <meta-data> tag on AndroidManifest.xml. You could hardcode the value here, but it is highly recommended that you set it as a resource in your Android project. We are going to create a new file for this under res/values, which we will name play_services.xml. In this file we will put the game ID, but later we will also have the achievements and leaderboard IDs in it. Using a separate file for these values is recommended because they are constants that do not need to be translated: <application> <meta-data android_name="com.google.android.gms.games.APP_ID" android_value="@string/app_id" /> <meta-data android_name="com.google.android.gms.version" android_value="@integer/google_play_services_version"/> [...] </application> Adding this metadata is extremely important. If you forget to update the AndroidManifest.xml, the app will crash when you try to sign in to Google Play services. Note that the integer for the gms version is defined in the library and we do not need to add it to our file. If you forget to add the game ID to the strings the app will crash. Now, it is time to proceed to sign in. The process is quite tedious and requires many checks, so Google has released an open source project named BaseGameUtils, which makes it easier. Unfortunately this project is not a part of the play services' library and it is not even available as a library. So, we have to get it from GitHub (either check it out or download the source as a ZIP file). BaseGameUtils abstracts us from the complexity of handling the connection with Play Services. Even more cumbersome, BaseGameUtils is not available as a standalone download and has to be downloaded together with another project. The fact that this significant piece of code is not a part of the official library makes it quite tedious to set up. Why it has been done like this is something that I do not comprehend myself. The project that contains BaseGameUtils is called android-basic-samples and it can be downloaded from https://github.com/playgameservices/android-basic-samples. Adding BaseGameUtils is not as straightforward as we would like it to be. Once android-basic-samples is downloaded, open your game project in Android Studio. Click on File > Import Module and navigate to the directory where you downloaded android-basic-samples. Select the BaseGameUtils module in the BasicSamples/libraries directory and click on OK. Finally, update the dependencies in the build.gradle file for the mobile module and sync gradle again: dependencies { compile project(':BaseGameUtils') [...] } After all these steps to set up the project, we are finally ready to begin the sign in. We will make our main Activity extend from BaseGamesActivity, which takes care of all the handling of the connections, and sign in with Google Play Services. One more detail: until now, we were using Activity and not FragmentActivity as the base class for YassActivity (BaseGameActivity extends from FragmentActivity) and this change will mess with the behavior of our dialogs while calling navigateBack. We can change the base class of BaseGameActivity or modify navigateBack to perform a pop-on fragment navigation hierarchy. I recommend the second approach: public void navigateBack() { // Do a pop on the navigation history getFragmentManager().popBackStack(); } This util class has been designed to work with single-activity games. It can be used in multiple activities, but it is not straightforward. This is another good reason to keep the game in a single activity. The BaseGameUtils is designed to be used in single-activity games. The default behavior of BaseGameActivity is to try to log in each time the Activity is started. If the user agrees to sign in, the sign in will happen automatically. But if the user rejects doing so, he or she will be asked again several times. I personally find this intrusive and annoying, and I recommend you to only prompt to log in to Google Play services once (and again, if the user logs out). We can always provide a login entry point in the app. This is very easy to change. The default number of attempts is set to 3 and it is a part of the code of GameHelper: // Should we start the flow to sign the user in automatically on startup? If // so, up to // how many times in the life of the application? static final int DEFAULT_MAX_SIGN_IN_ATTEMPTS = 3; int mMaxAutoSignInAttempts = DEFAULT_MAX_SIGN_IN_ATTEMPTS; So, we just have to configure it for our activity, adding one line of code during onCreate to change the default behavior with the one we want: just try it once: getGameHelper().setMaxAutoSignInAttempts(1); Finally, there are two methods that we can override to act when the user successfully logs in and when there is a problem: onSignInSucceeded and onSignInFailed. We will use them when we update the main menu at the end of the article. Further use of GPGS is to be made via the GameHelper and/or the GoogleApiClient, which is a part of the GameHelper. We can obtain a reference to the GameHelper using the getGameHelper method of BaseGameActivity. Now that the user can sign into Google Play services we can continue with achievements and leaderboards. Let's go back to the developer console. Achievements We will first define a few achievements in the developer console and then see how to unlock them in the game. Note that to publish any game with GPGS, you need to define at least five achievements. No other feature is mandatory, but achievements are. We need to define at least five achievements to publish a game with Google Play Game services. If you want to use GPGS with a game that has no achievements, I recommend you to add five dummy secret achievements and let them be. To add an achievement, we just need to navigate to the Achievements tab on the left and click on Add achievement: The menu to add a new achievement has a few fields that are mostly self-explanatory. They are as follows: Name: the name that will be shown (can be localized to different languages). Description: the description of the achievement to be shown (can also be localized to different languages). Icon: the icon of the achievement as a 512x512 px PNG image. This will be used to show the achievement in the list and also to generate the locked image and the in-game popup when it is unlocked. Incremental achievements: if the achievement requires a set of steps to be completed, it is called an incremental achievement and can be shown with a progress bar. We will have an incremental achievement to illustrate this. Initial state: Revealed/Hidden depending on whether we want the achievement to be shown or not. When an achievement is shown, the name and description are visible, players know what they have to do to unlock it. A hidden achievement, on the other hand, is a secret and can be a funny surprise when unlocked. We will have two secret achievements. Points: GPGS allows each game to have 1,000 points to give for unlocking achievements. This gets converted to XP in the player profile on Google Play games. This can be used to highlight that some achievements are harder than others, and therefore grant a bigger reward. You cannot change these once they are published, so if you plan to have more achievements in the future, plan ahead with the points. List order: The order of the achievements is shown. It is not followed all the time, since on the Play Games app the unlocked ones are shown before the locked ones. It is still handy to rearrange them. Dialog to add an achievement on the developer console As we already decided, we will have five achievements in our game and they will be as follows: Big Score: score over 100,000 points in one game. This is to be granted while playing. Asteroid killer: destroy 100 asteroids. This will count them across different games and is an incremental achievement. Survivor: survive for 60 seconds. Target acquired: a hidden achievement. Hit 20 asteroids in a row without missing a hit. This is meant to reward players that only shoot when they should. Target lost: this is supposed to be a funny achievement, granted when you miss with 10 bullets in a row. It is also hidden, because otherwise it would be too easy to unlock. So, we created some images for them and added them to the console. The developer console with all the configured achievements Each achievement has a string ID. We will need these ids to unlock the achievements in our game, but Google has made it easy for us. We have a link at the bottom named Get resources that pops up a dialog with the string resources we need. We can just copy them from there and paste them in our project in the play_services.xml file we have already created. Architecture For our game, given that we only have five achievements, we are going to add the code for achievements directly into the ScoreObject. This will make it less code for you to read so we can focus on how it is done. However, for a real production code I recommend you define a dedicated architecture for achievements. The recommended architecture is to have an AchievementsManager class that loads all the achievements when the game starts and stores them in three lists: All achievements Locked achievements Unlocked achievements Then, we have an Achievement base class with an abstract check method that we implement for each one of them: public boolean check (GameEngine gameEngine, GameEvent gameEvent) { } This base class takes care of loading the achievement state from local storage (I recommend using SharedPreferences for this) and modify it as per the result of check. The achievements check is done at AchievementManager level using a checkLockedAchievements method that iterates over the list of achievements that can be unlocked. This method should be called as a part of onEventReceived of GameEngine. This architecture allows you to check only the achievements that are yet to be unlocked and also all the achievements included in the game in a specific dedicated place. In our case, since we are keeping the score inside the ScoreGameObject, we are going to add all achievements code there. Note that making the GameEngine take care of the score and having it as a variable that other objects can read are also recommended design patterns, but it was simpler to do this as a part of ScoreGameObject. Unlocking achievements To handle achievements, we need to have access to an object of the class GoogleApiClient. We can get a reference to it in the constructor of ScoreGameObject: private final GoogleApiClient mApiClient; public ScoreGameObject(YassBaseFragment parent, View view, int viewResId) { […] mApiClient = parent.getYassActivity().getGameHelper().getApiClient(); } The parent Fragment has a reference to the Activity, which has a reference to the GameHelper, which has a reference to the GoogleApiClient. Unlocking an achievement requires just a single line of code, but we also need to check whether the user is connected to Google Play services or not before trying to unlock an achievement. This is necessary because if the user has not signed it, an exception is thrown and the game crashes. Unlocking an achievement requires just a single line of code. But this check is not enough. In the edge case, when the user logs out manually from Google Play services (which can be done in the achievements screen), the connection will not be closed and there is no way to know whether he or she has logged out. We are going to create a utility method to unlock the achievements that does all the checks and also wraps the unlock method into a try/catch block and make the API client disconnect if an exception is raised: private void unlockSafe(int resId) { if (mApiClient.isConnecting() || mApiClient.isConnected()) { try { Games.Achievements.unlock(mApiClient, getString(resId)); } catch (Exception e) { mApiClient.disconnect(); } } } Even with all the checks, the code is still very simple. Let's work on the particular achievements we have defined for the game. Even though they are very specific, the methodology to track game events and variables and then check for achievements to unlock is in itself generic, and serves as a real-life example of how to deal with achievements. The achievements we have designed require us to count some game events and also the running time. For the last two achievements, we need to make a new GameEvent for the case when a bullet misses, which we have not created until now. The code in the Bullet object to trigger this new GameEvent is as follows: @Override public void onUpdate(long elapsedMillis, GameEngine gameEngine) { mY += mSpeedFactor * elapsedMillis; if (mY < -mHeight) { removeFromGameEngine(gameEngine); gameEngine.onGameEvent(GameEvent.BulletMissed); } } Now, let's work inside ScoreGameObject. We are going to have a method that checks achievements each time an asteroid is hit. There are three achievements that can be unlocked when that event happens: Big score, because hitting an asteroid gives us points Target acquired, because it requires consecutive asteroid hits Asteroid killer, because it counts the total number of asteroids that have been destroyed The code is like this: private void checkAsteroidHitRelatedAchievements() { if (mPoints > 100000) { // Unlock achievement unlockSafe(R.string.achievement_big_score); } if (mConsecutiveHits >= 20) { unlockSafe(R.string.achievement_target_acquired); } // Increment achievement of asteroids hit if (mApiClient.isConnecting() || mApiClient.isConnected()) { try { Games.Achievements.increment(mApiClient, getString(R.string.achievement_asteroid_killer), 1); } catch (Exception e) { mApiClient.disconnect(); } } } We check the total points and the number of consecutive hits to unlock the corresponding achievements. The "Asteroid killer" achievement is a bit of a different case, because it is an incremental achievement. These type of achievements do not have an unlock method, but rather an increment method. Each time we increment the value, progress on the achievement is updated. Once the progress is 100 percent, it is unlocked automatically. Incremental achievements are automatically unlocked, we just have to increment their value. This makes incremental achievements much easier to use than tracking the progress locally. But we still need to do all the checks as we did for unlockSafe. We are using a variable named mConsecutiveHits, which we have not initialized yet. This is done inside onGameEvent, which is the place where the other hidden achievement target lost is checked. Some initialization for the "Survivor" achievement is also done here: public void onGameEvent(GameEvent gameEvent) { if (gameEvent == GameEvent.AsteroidHit) { mPoints += POINTS_GAINED_PER_ASTEROID_HIT; mPointsHaveChanged = true; mConsecutiveMisses = 0; mConsecutiveHits++; checkAsteroidHitRelatedAchievements(); } else if (gameEvent == GameEvent.BulletMissed) { mConsecutiveMisses++; mConsecutiveHits = 0; if (mConsecutiveMisses >= 20) { unlockSafe(R.string.achievement_target_lost); } } else if (gameEvent == GameEvent.SpaceshipHit) { mTimeWithoutDie = 0; } […] } Each time we hit an asteroid, we increment the number of consecutive asteroid hits and reset the number of consecutive misses. Similarly, each time we miss a bullet, we increment the number of consecutive misses and reset the number of consecutive hits. As a side note, each time the spaceship is destroyed we reset the time without dying, which is used for "Survivor", but this is not the only time when the time without dying should be updated. We have to reset it when the game starts, and modify it inside onUpdate by just adding the elapsed milliseconds that have passed: @Override public void startGame(GameEngine gameEngine) { mTimeWithoutDie = 0; […] } @Override public void onUpdate(long elapsedMillis, GameEngine gameEngine) { mTimeWithoutDie += elapsedMillis; if (mTimeWithoutDie > 60000) { unlockSafe(R.string.achievement_survivor); } } So, once the game has been running for 60,000 milliseconds since it started or since a spaceship was destroyed, we unlock the "Survivor" achievement. With this, we have all the code we need to unlock the achievements we have created for the game. Let's finish this section with some comments on the system and the developer console: As a rule of thumb, you can edit most of the details of an achievement until you publish it to production. Once your achievement has been published, it cannot be deleted. You can only delete an achievement in its prepublished state. There is a button labeled Delete at the bottom of the achievement screen for this. You can also reset the progress for achievements while they are in draft. This reset happens for all players at once. There is a button labeled Reset achievement progress at the bottom of the achievement screen for this. Also note that GameBaseActivity does a lot of logging. So, if your device is connected to your computer and you run a debug build, you may see that it lags sometimes. This does not happen in a release build for which the log is removed. Leaderboards Since YASS has only one game mode and one score in the game, it makes sense to have only one leaderboard on Google Play Game Services. Leaderboards are managed from their own tab inside the Game services area of the developer console. Unlike achievements, it is not mandatory to have any leaderboard to be able to publish your game. If your game has different levels of difficulty, you can have a leaderboard for each of them. This also applies if the game has several values that measure player progress, you can have a leaderboard for each of them. Managing leaderboards on Play Games console Leaderboards can be created and managed in the Leaderboards tag. When we click on Add leaderboard, we are presented with a form that has several fields to be filled. They are as follows: Name: the display name of the leaderboard, which can be localized. We will simply call it High Scores. Score formatting: this can be Numeric, Currency, or Time. We will use Numeric for YASS. Icon: a 512x512 px icon to identify the leaderboard. Ordering: Larger is better / Smaller is better. We are going to use Larger is better, but other score types may be Smaller is better as in a racing game. Enable tamper protection: this automatically filters out suspicious scores. You should keep this on. Limits: if you want to limit the score range that is shown on the leaderboard, you can do it here. We are not going to use this List order: the order of the leaderboards. Since we only have one, it is not really important for us. Setting up a leaderboard on the Play Games console Now that we have defined the leaderboard, it is time to use it in the game. As happens with achievements, we have a link where we can get all the resources for the game in XML. So, we proceed to get the ID of the leaderboard and add it to the strings defined in the play_services.xml file. We have to submit the scores at the end of the game (that is, a GameOver event), but also when the user exits a game via the pause button. To unify this, we will create a new GameEvent called GameFinished that is triggered after a GameOver event and after the user exits the game. We will update the stopGame method of GameEngine, which is called in both cases to trigger the event: public void stopGame() { if (mUpdateThread != null) { synchronized (mLayers) { onGameEvent(GameEvent.GameFinished); } mUpdateThread.stopGame(); mUpdateThread = null; } […] } We have to set the updateThread to null after sending the event, to prevent this code being run twice. Otherwise, we could send each score more than once. Similarly, as happens for achievements, submitting a score is very simple, just a single line of code. But we also need to check that the GoogleApiClient is connected and we still have the same edge case when an Exception is thrown. So, we need to wrap it in a try/catch block. To keep everything in the same place, we will put this code inside ScoreGameObject: @Override public void onGameEvent(GameEvent gameEvent) { […] else if (gameEvent == GameEvent.GameFinished) { // Submit the score if (mApiClient.isConnecting() || mApiClient.isConnected()) { try { Games.Leaderboards.submitScore(mApiClient, getLeaderboardId(), mPoints); } catch (Exception e){ mApiClient.disconnect(); } } } } private String getLeaderboardId() { return mParent.getString(R.string.leaderboard_high_scores); } This is really straightforward. GPGS is now receiving our scores and it takes care of the timestamp of the score to create daily, weekly, and all time leaderboards. It also uses your Google+ circles to show the social score of your friends. All this is done automatically for you. The final missing piece is to let the player open the leaderboards and achievements UI from the main menu as well as trigger a sign in if they are signed out. Opening the Play Games UI To complete the integration of achievements and leaderboards, we are going to add buttons to open the native UI provided by GPGS to our main menu. For this, we are going to place two buttons in the bottom–left corner of the screen, opposite the music and sound buttons. We will also check whether we are connected or not; if not, we will show a single sign-in button. For these buttons we will use the official images of GPGS, which are available for developers to use. Note that you must follow the brand guidelines while using the icons and they must be displayed as they are and not modified. This also provides a consistent look and feel across all the games that support Play Games. Since we have seen a lot of layouts already, we are not going to include another one that is almost the same as something we already have. The main menu with the buttons to view achievements and leaderboards. To handle these new buttons we will, as usual, set the MainMenuFragment as OnClickListener for the views. We do this in the same place as the other buttons, that is, inside onViewCreated: @Override public void onViewCreated(View view, Bundle savedInstanceState) { super.onViewCreated(view, savedInstanceState); [...] view.findViewById( R.id.btn_achievements).setOnClickListener(this); view.findViewById( R.id.btn_leaderboards).setOnClickListener(this); view.findViewById(R.id.btn_sign_in).setOnClickListener(this); } As happened with achievements and leaderboards, the work is done using static methods that receive a GoogleApiClient object. We can get this object from the GameHelper that is a part of the BaseGameActivity, like this: GoogleApiClient apiClient = getYassActivity().getGameHelper().getApiClient(); To open the native UI, we have to obtain an Intent and then start an Activity with it. It is important that you use startActivityForResult, since some data is passed back and forth. To open the achievements UI, the code is like this: Intent achievementsIntent = Games.Achievements.getAchievementsIntent(apiClient); startActivityForResult(achievementsIntent, REQUEST_ACHIEVEMENTS); This works out of the box. It automatically grays out the icons for the unlocked achievements, adds a counter and progress bar to the one that is in progress, and a padlock to the hidden ones. Similarly, to open the leaderboards UI we obtain an intent from the Games.Leaderboards class instead: Intent leaderboardsIntent = Games.Leaderboards.getLeaderboardIntent( apiClient, getString(R.string.leaderboard_high_scores)); startActivityForResult(leaderboardsIntent, REQUEST_LEADERBOARDS); In this case, we are asking for a specific leaderboard, since we only have one. We could use getLeaderboardsIntent instead, which will open the Play Games UI for the list of all the leaderboards. We can have an intent to open the list of leaderboards or a specific one. What remains to be done is to replace the buttons for the login one when the user is not connected. For this, we will create a method that reads the state and shows and hides the views accordingly: private void updatePlayButtons() { GameHelper gameHelper = getYassActivity().getGameHelper(); if (gameHelper.isConnecting() || gameHelper.isSignedIn()) { getView().findViewById( R.id.btn_achievements).setVisibility(View.VISIBLE); getView().findViewById( R.id.btn_leaderboards).setVisibility(View.VISIBLE); getView().findViewById( R.id.btn_sign_in).setVisibility(View.GONE); } else { getView().findViewById( R.id.btn_achievements).setVisibility(View.GONE); getView().findViewById( R.id.btn_leaderboards).setVisibility(View.GONE); getView().findViewById( R.id.btn_sign_in).setVisibility(View.VISIBLE); } } This method decides whether to remove or make visible the views based on the state. We will call it inside the important state-changing methods: onLayoutCompleted: the first time we open the game to initialize the UI. onSignInSucceeded: when the user successfully signs in to GPGS. onSignInFailed: this can be triggered when we auto sign in and there is no connection. It is important to handle it. onActivityResult: when we come back from the Play Games UI, in case the user has logged out. But nothing is as easy as it looks. In fact, when the user signs out and does not exit the game, GoogleApiClient keeps the connection open. Therefore the value of isSignedIn from GameHelper still returns true. This is the edge case we have been talking about all through the article. As a result of this edge case, there is an inconsistency in the UI that shows the achievements and leaderboards buttons when it should show the login one. When the user logs out from Play Games, GoogleApiClient keeps the connection open. This can lead to confusion. Unfortunately, this has been marked as work as expected by Google. The reason is that the connection is still active and it is our responsibility to parse the result in the onActivityResult method to determine the new state. But this is not very convenient. Since it is a rare case we will just go for the easiest solution, which is to wrap it in a try/catch block and make the user sign in if he or she taps on leaderboards or achievements while not logged in. This is the code we have to handle the click on the achievements button, but the one for leaderboards is equivalent: else if (v.getId() == R.id.btn_achievements) { try { GoogleApiClient apiClient = getYassActivity().getGameHelper().getApiClient(); Intent achievementsIntent = Games.Achievements.getAchievementsIntent(apiClient); startActivityForResult(achievementsIntent, REQUEST_ACHIEVEMENTS); } catch (Exception e) { GameHelper gameHelper = getYassActivity().getGameHelper(); gameHelper.disconnect(); gameHelper.beginUserInitiatedSignIn(); } } Basically, we have the old code to open the achievements activity, but we wrap it in a try/catch block. If an exception is raised, we disconnect the game helper and begin a new login using the beginUserInitiatedSignIn method. It is very important to disconnect the gameHelper before we try to log in again. Otherwise, the login will not work. We must disconnect from GPGS before we can log in using the method from the GameHelper. Finally, there is the case when the user clicks on the login button, which just triggers the login using the beginUserInitiatedSignIn method from the GameHelper: if (v.getId() == R.id.btn_sign_in) { getYassActivity().getGameHelper().beginUserInitiatedSignIn(); } Once you have published your game and the game services, achievements and leaderboards will not appear in the game description on Google Play straight away. It is required that "a fair amount of users" have used them. You have done nothing wrong, you just have to wait. Other features of Google Play services Google Play Game Services provides more features for game developers than achievements and leaderboards. None of them really fit the game we are building, but it is useful to know they exist just in case your game needs them. You can save yourself lots of time and effort by using them and not reinventing the wheel. The other features of Google Play Games Services are: Events and quests: these allow you to monitor game usage and progression. Also, they add the possibility of creating time-limited events with rewards for the players. Gifts: as simple as it sounds, you can send a gift to other players or request one to be sent to you. Yes, this is seen in the very mechanical Facebook games popularized a while ago. Saved games: the standard concept of a saved game. If your game has progression or can unlock content based on user actions, you may want to use this feature. Since it is saved in the cloud, saved games can be accessed across multiple devices. Turn-based and real-time multiplayer: Google Play Game Services provides an API to implement turn-based and real-time multiplayer features without you needing to write any server code. If your game is multiplayer and has an online economy, it may be worth making your own server and granting virtual currency only on the server to prevent cheating. Otherwise, it is fairly easy to crack the gifts/reward system and a single person can ruin the complete game economy. However, if there is no online game economy, the benefits of gifts and quests may be more important than the fact that someone can hack them. Let's take a look at each of these features. Events The event's APIs provides us with a way to define and collect gameplay metrics and upload them to Google Play Game Services. This is very similar to the GameEvents we are already using in our game. Events should be a subset of the game events of our game. Many of the game events we have are used internally as a signal between objects or as a synchronization mechanism. These events are not really relevant outside the engine, but others could be. Those are the events we should send to GPGS. To be able to send an event from the game to GPGS, we have to create it in the developer console first. To create an event, we have to go to the Events tab in the developer console, click on Add new event, and fill in the following fields: Name: a short name of the event. The name can be up to 100 characters. This value can be localized. Description: a longer description of the event. The description can be up to 500 characters. This value can also be localized. Icon: the icon for the event of the standard 512x512 px size. Visibility: as for achievements, this can be revealed or hidden. Format: as for leaderboards, this can be Numeric, Currency, or Time. Event type: this is used to mark events that create or spend premium currency. This can be Premium currency sink, Premium currency source, or None. While in the game, events work pretty much as incremental achievements. You can increment the event counter using the following line of code: Games.Events.increment(mGoogleApiClient, myEventId, 1); You can delete events that are in the draft state or that have been published as long as the event is not in use by a quest. You can also reset the player progress data for the testers of your events as you can do for achievements. While the events can be used as an analytics system, their real usefulness appears when they are combined with quests. Quests A quest is a challenge that asks players to complete an event a number of times during a specific time frame to receive a reward. Because a quest is linked to an event, to use quests you need to have created at least one event. You can create a quest from the quests tab in the developer console. A quest has the following fields to be filled: Name: the short name of the quest. This can be up to 100 characters and can be localized. Description: a longer description of the quest. Your quest description should let players know what they need to do to complete the quest. The description can be up to 500 characters. The first 150 characters will be visible to players on cards such as those shown in the Google Play Games app. Icon: a square icon that will be associated with the quest. Banner: a rectangular image that will be used to promote the quest. Completion Criteria: this is the configuration of the quest itself. It consists of an event and the number of times the event must occur. Schedule: the start and end date and time for the quest. GPGS uses your local time zone, but stores the values as UTC. Players will see these values appear in their local time zone. You can mark a checkbox to notify users when the quest is about to end. Reward Data: this is specific to each game. It can be a JSON object, specifying the reward. This is sent to the client when the quest is completed. Once configured in the developer console, you can do two things with the quests: Display the list of quests Process a quest completion To get the list of quests, we start an activity with an intent that is provided to us via a static method as usual: Intent questsIntent = Games.Quests.getQuestsIntent(mGoogleApiClient, Quests.SELECT_ALL_QUESTS); startActivityForResult(questsIntent, QUESTS_INTENT); To be notified when a quest is completed, all we have to do is register a listener: Games.Quests.registerQuestUpdateListener(mGoogleApiClient, this); Once we have set the listener, the onQuestCompleted method will be called once the quest is completed. After completing the processing of the reward, the game should call claim to inform Play Game services that the player has claimed the reward. The following code snippet shows how you might override the onQuestCompleted callback: @Override public void onQuestCompleted(Quest quest) { // Claim the quest reward. Games.Quests.claim(mGoogleApiClient, quest.getQuestId(), quest.getCurrentMilestone().getMilestoneId()); // Process the RewardData to provision a specific reward. String reward = new String(quest.getCurrentMilestone().getCompletionRewardData(), Charset.forName("UTF-8")); } The rewards themselves are defined by the client. As we mentioned before, this will make the game quite easy to crack and get rewards. But usually, avoiding the hassle of writing your own server is worth it. Gifts The gifts feature of GPGS allows us to send gifts to other players and to request them to send us one as well. This is intended to make the gameplay more collaborative and to improve the social aspect of the game. As for other GPGS features, we have a built-in UI provided by the library that can be used. In this case, to send and request gifts for in-game items and resources to and from friends in their Google+ circles. The request system can make use of notifications. There are two types of requests that players can send using the game gifts feature in Google Play Game Services: A wish request to ask for in-game items or some other form of assistance from their friends A gift request to send in-game items or some other form of assistance to their friends A player can specify one or more target request recipients from the default request-sending UI. A gift or wish can be consumed (accepted) or dismissed by a recipient. To see the gifts API in detail, you can visit https://developers.google.com/games/services/android/giftRequests. Again, as for quest rewards, this is done entirely by the client, which makes the game susceptible to piracy. Saved games The saved games service offers cloud game saving slots. Your game can retrieve the saved game data to allow returning players to continue a game at their last save point from any device. This service makes it possible to synchronize a player's game data across multiple devices. For example, if you have a game that runs on Android, you can use the saved games service to allow a player to start a game on their Android phone and then continue playing the game on a tablet without losing any of their progress. This service can also be used to ensure that a player's game play continues from where it was left off even if their device is lost, destroyed, or traded in for a newer model or if the game was reinstalled The saved games service does not know about the game internals, so it provides a field that is an unstructured binary blob where you can read and write the game data. A game can write an arbitrary number of saved games for a single player subjected to user quota, so there is no hard requirement to restrict players to a single save file. Saved games are done in an unstructured binary blob. The API for saved games also receives some metadata that is used by Google Play Games to populate the UI and to present useful information in the Google Play Game app (for example, last updated timestamp). Saved games has several entry points and actions, including how to deal with conflicts in the saved games. To know more about these check out the official documentation at https://developers.google.com/games/services/android/savedgames. Multiplayer games If you are going to implement multiplayer, GPGS can save you a lot of work. You may or may not use it for the final product, but it will remove the need to think about the server-side until the game concept is validated. You can use GPGS for turn-based and real-time multiplayer games. Although each one is completely different and uses a different API, there is always an initial step where the game is set up and the opponents are selected or invited. In a turn-based multiplayer game, a single shared state is passed among the players and only the player that owns the turn has permission to modify it. Players take turns asynchronously according to an order of play determined by the game. A turn is finished explicitly by the player using an API call. Then the game state is passed to the other players, together with the turn. There are many cases: selecting opponents, creating a match, leaving a match, canceling, and so on. The official documentation at https://developers.google.com/games/services/android/turnbasedMultiplayer is quite exhaustive and you should read through it if you plan to use this feature. In a real-time multiplayer there is no concept of turn. Instead, the server uses the concept of room: a virtual construct that enables network communication between multiple players in the same game session and lets players send data directly to one another, a common concept for game servers. Real-time multiplayer service is based on the concept of Room. The API of real-time multiplayer allows us to easily: Manage network connections to create and maintain a real-time multiplayer room Provide a player-selection user interface to invite players to join a room, look for random players for auto-matching, or a combination of both Store participant and room-state information on the Play Game services' servers while the game is running Send room invitations and updates to players To check the complete documentation for real-time games, please visit the official web at https://developers.google.com/games/services/android/realtimeMultiplayer. Summary We have added Google Play services to YASS, including setting up the game in the developer console and adding the required libraries to the project. Then, we defined a set of achievements and added the code to unlock them. We have used normal, incremental, and hidden achievement types to showcase the different options available. We have also configured a leaderboard and submitted the scores, both when the game is finished and when it is exited via the pause dialog. Finally, we have added links to the native UI for leaderboards and achievements to the main menu. We have also introduced the concepts of events, quests, and gifts and the features of saved games and multiplayer that Google Play Game services offers. The game is ready to publish now. Resources for Article: Further resources on this subject: SceneKit [article] Creating Games with Cocos2d-x is Easy and 100 percent Free [article] SpriteKit Framework and Physics Simulation [article]

0
0
3837

Packt

08 Jul 2015

14 min read

File Sharing

Packt

08 Jul 2015

14 min read

0
0
2062

How-To Tutorials

article-image-working-large-data-sources

Packt

08 Jul 2015

20 min read

Working with large data sources

Packt

08 Jul 2015

20 min read

In this article, by Duncan M. McGreggor, author of the book Mastering matplotlib, we come across the use of NumPy in the world of matplotlib and big data, problems with large data sources, and the possible solutions to these problems. (For more resources related to this topic, see here.) Most of the data that users feed into matplotlib when generating plots is from NumPy. NumPy is one of the fastest ways of processing numerical and array-based data in Python (if not the fastest), so this makes sense. However by default, NumPy works on in-memory database. If the dataset that you want to plot is larger than the total RAM available on your system, performance is going to plummet. In the following section, we're going to take a look at an example that illustrates this limitation. But first, let's get our notebook set up, as follows: In [1]: import matplotlib matplotlib.use('nbagg') %matplotlib inline Here are the modules that we are going to use: In [2]: import glob, io, math, os import psutil import numpy as np import pandas as pd import tables as tb from scipy import interpolate from scipy.stats import burr, norm import matplotlib as mpl import matplotlib.pyplot as plt from IPython.display import Image We'll use the custom style sheet that we created earlier, as follows: In [3]: plt.style.use("../styles/superheroine-2.mplstyle") An example problem To keep things manageable for an in-memory example, we're going to limit our generated dataset to 100 million points by using one of SciPy's many statistical distributions, as follows: In [4]: (c, d) = (10.8, 4.2) (mean, var, skew, kurt) = burr.stats(c, d, moments='mvsk') The Burr distribution, also known as the Singh–Maddala distribution, is commonly used to model household income. Next, we'll use the burr object's method to generate a random population with our desired count, as follows: In [5]: r = burr.rvs(c, d, size=100000000) Creating 100 million data points in the last call took about 10 seconds on a moderately recent workstation, with the RAM usage peaking at about 2.25 GB (before the garbage collection kicked in). Let's make sure that it's the size we expect, as follows: In [6]: len(r) Out[6]: 100000000 If we save this to a file, it weighs in at about three-fourths of a gigabyte: In [7]: r.tofile("../data/points.bin") In [8]: ls -alh ../data/points.bin -rw-r--r-- 1 oubiwann staff 763M Mar 20 11:35 points.bin This actually does fit in the memory on a machine with a RAM of 8 GB, but generating much larger files tends to be problematic. We can reuse it multiple times though, to reach a size that is larger than what can fit in the system RAM. Before we do this, let's take a look at what we've got by generating a smooth curve for the probability distribution, as follows: In [9]: x = np.linspace(burr.ppf(0.0001, c, d), burr.ppf(0.9999, c, d), 100) y = burr.pdf(x, c, d) In [10]: (figure, axes) = plt.subplots(figsize=(20, 10)) axes.plot(x, y, linewidth=5, alpha=0.7) axes.hist(r, bins=100, normed=True) plt.show() The following plot is the result of the preceding code: Our plot of the Burr probability distribution function, along with the 100-bin histogram with a sample size of 100 million points, took about 7 seconds to render. This is due to the fact that NumPy handles most of the work, and we only displayed a limited number of visual elements. What would happen if we did try to plot all the 100 million points? This can be checked by the following code: In [11]: (figure, axes) = plt.subplots() axes.plot(r) plt.show() formatters.py:239: FormatterWarning: Exception in image/png formatter: Allocated too many blocks After about 30 seconds of crunching, the preceding error was thrown—the Agg backend (a shared library) simply couldn't handle the number of artists required to render all the points. But for now, this case clarifies the point that we stated a while back—our first plot rendered relatively quickly because we were selective about the data we chose to present, given the large number of points with which we are working. However, let's say we have data from the files that are too large to fit into the memory. What do we do about this? Possible ways to address this include the following: Moving the data out of the memory and into the filesystem Moving the data off the filesystem and into the databases We will explore examples of these in the following section. Big data on the filesystem The first of the two proposed solutions for large datasets involves not burdening the system memory with data, but rather leaving it on the filesystem. There are several ways to accomplish this, but the following two methods in particular are the most common in the world of NumPy and matplotlib: NumPy's memmap function: This function creates memory-mapped files that are useful if you wish to access small segments of large files on the disk without having to read the whole file into the memory. PyTables: This is a package that is used to manage hierarchical datasets. It is built on the top of the HDF5 and NumPy libraries and is designed to efficiently and easily cope with extremely large amounts of data. We will examine each in turn. NumPy's memmap function Let's restart the IPython kernel by going to the IPython menu at the top of notebook page, selecting Kernel, and then clicking on Restart. When the dialog box pops up, click on Restart. Then, re-execute the first few lines of the notebook by importing the required libraries and getting our style sheet set up. Once the kernel is restarted, take a look at the RAM utilization on your system for a fresh Python process for the notebook: In [4]: Image("memory-before.png") Out[4]: The following screenshot shows the RAM utilization for a fresh Python process: Now, let's load the array data that we previously saved to disk and recheck the memory utilization, as follows: In [5]: data = np.fromfile("../data/points.bin") data_shape = data.shape data_len = len(data) data_len Out[5]: 100000000 In [6]: Image("memory-after.png") Out[6]: The following screenshot shows the memory utilization after loading the array data: This took about five seconds to load, with the memory consumption equivalent to the file size of the data. This means that if we wanted to build some sample data that was too large to fit in the memory, we'd need about 11 of those files concatenated, as follows: In [7]: 8 * 1024 Out[7]: 8192 In [8]: filesize = 763 8192 / filesize Out[8]: 10.73656618610747 However, this is only if the entire memory was available. Let's see how much memory is available right now, as follows: In [9]: del data In [10]: psutil.virtual_memory().available / 1024**2 Out[10]: 2449.1796875 That's 2.5 GB. So, to overrun our RAM, we'll just need a fraction of the total. This is done in the following way: In [11]: 2449 / filesize Out[11]: 3.2096985583224114 The preceding output means that we only need four of our original files to create a file that won't fit in memory. However, in the following section, we will still use 11 files to ensure that data, if loaded into the memory, will be much larger than the memory. How do we create this large file for demonstration purposes (knowing that in a real-life situation, the data would already be created and potentially quite large)? We can try to use numpy.tile to create a file of the desired size (larger than memory), but this can make our system unusable for a significant period of time. Instead, let's use numpy.memmap, which will treat a file on the disk as an array, thus letting us work with data that is too large to fit into the memory. Let's load the data file again, but this time as a memory-mapped array, as follows: In [12]: data = np.memmap( "../data/points.bin", mode="r", shape=data_shape) The loading of the array to a memmap object was very quick (compared to the process of bringing the contents of the file into the memory), taking less than a second to complete. Now, let's create a new file to write the data to. This file must be larger in size as compared to our total system memory (if held on in-memory database, it will be smaller on the disk): In [13]: big_data_shape = (data_len * 11,) big_data = np.memmap( "../data/many-points.bin", dtype="uint8", mode="w+", shape=big_data_shape) The preceding code creates a 1 GB file, which is mapped to an array that has the shape we requested and just contains zeros: In [14]: ls -alh ../data/many-points.bin -rw-r--r-- 1 oubiwann staff 1.0G Apr 2 11:35 many-points.bin In [15]: big_data.shape Out[15]: (1100000000,) In [16]: big_data Out[16]: memmap([0, 0, 0, ..., 0, 0, 0], dtype=uint8) Now, let's fill the empty data structure with copies of the data we saved to the 763 MB file, as follows: In [17]: for x in range(11): start = x * data_len end = (x * data_len) + data_len big_data[start:end] = data big_data Out[17]: memmap([ 90, 71, 15, ..., 33, 244, 63], dtype=uint8) If you check your system memory before and after, you will only see minimal changes, which confirms that we are not creating an 8 GB data structure on in-memory. Furthermore, checking your system only takes a few seconds. Now, we can do some sanity checks on the resulting data and ensure that we have what we were trying to get, as follows: In [18]: big_data_len = len(big_data) big_data_len Out[18]: 1100000000 In [19]: data[100000000 – 1] Out[19]: 63 In [20]: big_data[100000000 – 1] Out[20]: 63 Attempting to get the next index from our original dataset will throw an error (as shown in the following code), since it didn't have that index: In [21]: data[100000000] ----------------------------------------------------------- IndexError Traceback (most recent call last) ... IndexError: index 100000000 is out of bounds ... But our new data does have an index, as shown in the following code: In [22]: big_data[100000000 Out[22]: 90 And then some: In [23]: big_data[1100000000 – 1] Out[23]: 63 We can also plot data from a memmaped array without having a significant lag time. However, note that in the following code, we will create a histogram from 1.1 million points of data, so the plotting won't be instantaneous: In [24]: (figure, axes) = plt.subplots(figsize=(20, 10)) axes.hist(big_data, bins=100) plt.show() The following plot is the result of the preceding code: The plotting took about 40 seconds to generate. The odd shape of the histogram is due to the fact that, with our data file-hacking, we have radically changed the nature of our data since we've increased the sample size linearly without regard for the distribution. The purpose of this demonstration wasn't to preserve a sample distribution, but rather to show how one can work with large datasets. What we have seen is not too shabby. Thanks to NumPy, matplotlib can work with data that is too large for memory, even if it is a bit slow iterating over hundreds of millions of data points from the disk. Can matplotlib do better? HDF5 and PyTables A commonly used file format in the scientific computing community is Hierarchical Data Format (HDF). HDF is a set of file formats (namely HDF4 and HDF5) that were originally developed at the National Center for Supercomputing Applications (NCSA), a unit of the University of Illinois at Urbana-Champaign, to store and organize large amounts of numerical data. The NCSA is a great source of technical innovation in the computing industry—a Telnet client, the first graphical web browser, a web server that evolved into the Apache HTTP server, and HDF, which is of particular interest to us, were all developed here. It is a little known fact that NCSA's web browser code was the ancestor to both the Netscape web browser as well as a prototype of Internet Explorer that was provided to Microsoft by a third party. HDF is supported by Python, R, Julia, Java, Octave, IDL, and MATLAB, to name a few. HDF5 offers significant improvements and useful simplifications over HDF4. It uses B-trees to index table objects and, as such, works well for write-once/read-many time series data. Common use cases span fields such as meteorological studies, biosciences, finance, and aviation. The HDF5 files of multiterabyte sizes are common in these applications. Its typically constructed from the analyses of multiple HDF5 source files, thus providing a single (and often extensive) source of grouped data for a particular application. The PyTables library is built on the top of the Python HDF5 library and NumPy. As such, it not only provides access to one of the most widely used large data file formats in the scientific computing community, but also links data extracted from these files with the data types and objects provided by the fast Python numerical processing library. PyTables is also used in other projects. Pandas wraps PyTables, thus extending its convenient in-memory data structures, functions, and objects to large on-disk files. To use HDF data with Pandas, you'll want to create pandas.HDFStore, read from the HDF data sources with pandas.read_hdf, or write to one with pandas.to_hdf. Files that are too large to fit in the memory may be read and written by utilizing chunking techniques. Pandas does support the disk-based DataFrame operations, but these are not very efficient due to the required assembly on columns of data upon reading back into the memory. One project to keep an eye on under the PyData umbrella of projects is Blaze. It's an open wrapper and a utility framework that can be used when you wish to work with large datasets and generalize actions such as the creation, access, updates, and migration. Blaze supports not only HDF, but also SQL, CSV, and JSON. The API usage between Pandas and Blaze is very similar, and it offers a nice tool for developers who need to support multiple backends. In the following example, we will use PyTables directly to create an HDF5 file that is too large to fit in the memory (for an 8GB RAM machine). We will follow the following steps: Create a series of CSV source data files that take up approximately 14 GB of disk space Create an empty HDF5 file Create a table in the HDF5 file and provide the schema metadata and compression options Load the CSV source data into the HDF5 table Query the new data source once the data has been migrated Remember the temperature precipitation data for St. Francis, in Kansas, USA, from a previous notebook? We are going to generate random data with similar columns for the purposes of the HDF5 example. This data will be generated from a normal distribution, which will be used in the guise of the temperature and precipitation data for hundreds of thousands of fictitious towns across the globe for the last century, as follows: In [25]: head = "country,town,year,month,precip,tempn" row = "{},{},{},{},{},{}n" filename = "../data/{}.csv" town_count = 1000 (start_year, end_year) = (1894, 2014) (start_month, end_month) = (1, 13) sample_size = (1 + 2 * town_count * (end_year – start_year) * (end_month - start_month)) countries = range(200) towns = range(town_count) years = range(start_year, end_year) months = range(start_month, end_month) for country in countries: with open(filename.format(country), "w") as csvfile: csvfile.write(head) csvdata = "" weather_data = norm.rvs(size=sample_size) weather_index = 0 for town in towns: for year in years: for month in months: csvdata += row.format( country, town, year, month, weather_data[weather_index], weather_data[weather_index + 1]) weather_index += 2 csvfile.write(csvdata) Note that we generated a sample data population that was twice as large as the expected size in order to pull both the simulated temperature and precipitation data at the same time (from the same set). This will take about 30 minutes to run. When complete, you will see the following files: In [26]: ls -rtm ../data/*.csv ../data/0.csv, ../data/1.csv, ../data/2.csv, ../data/3.csv, ../data/4.csv, ../data/5.csv, ... ../data/194.csv, ../data/195.csv, ../data/196.csv, ../data/197.csv, ../data/198.csv, ../data/199.csv A quick look at just one of the files reveals the size of each, as follows: In [27]: ls -lh ../data/0.csv -rw-r--r-- 1 oubiwann staff 72M Mar 21 19:02 ../data/0.csv With each file that is 72 MB in size, we have data that takes up 14 GB of disk space, which exceeds the size of the RAM of the system in question. Furthermore, running queries against so much data in the .csv files isn't going to be very efficient. It's going to take a long time. So what are our options? Well, to read this data, HDF5 is a very good fit. In fact, it is designed for jobs like this. We will use PyTables to convert the .csv files to a single HDF5. We'll start by creating an empty table file, as follows: In [28]: tb_name = "../data/weather.h5t" h5 = tb.open_file(tb_name, "w") h5 Out[28]: File(filename=../data/weather.h5t, title='', mode='w', root_uep='/', filters=Filters( complevel=0, shuffle=False, fletcher32=False, least_significant_digit=None)) / (RootGroup) '' Next, we'll provide some assistance to PyTables by indicating the data types of each column in our table, as follows: In [29]: data_types = np.dtype( [("country", "<i8"), ("town", "<i8"), ("year", "<i8"), ("month", "<i8"), ("precip", "<f8"), ("temp", "<f8")]) Also, let's define a compression filter that can be used by PyTables when saving our data, as follows: In [30]: filters = tb.Filters(complevel=5, complib='blosc') Now, we can create a table inside our new HDF5 file, as follows: In [31]: tab = h5.create_table( "/", "weather", description=data_types, filters=filters) With that done, let's load each CSV file, read it in chunks so that we don't overload the memory, and then append it to our new HDF5 table, as follows: In [32]: for filename in glob.glob("../data/*.csv"): it = pd.read_csv(filename, iterator=True, chunksize=10000) for chunk in it: tab.append(chunk.to_records(index=False)) tab.flush() Depending on your machine, the entire process of loading the CSV file, reading it in chunks, and appending to a new HDF5 table can take anywhere from 5 to 10 minutes. However, what started out as a collection of the .csv files that weigh in at 14 GB is now a single compressed 4.8 GB HDF5 file, as shown in the following code: In [33]: h5.get_filesize() Out[33]: 4758762819 Here's the metadata for the PyTables-wrapped HDF5 table after the data insertion: In [34]: tab Out[34]: /weather (Table(288000000,), shuffle, blosc(5)) '' description := { "country": Int64Col(shape=(), dflt=0, pos=0), "town": Int64Col(shape=(), dflt=0, pos=1), "year": Int64Col(shape=(), dflt=0, pos=2), "month": Int64Col(shape=(), dflt=0, pos=3), "precip": Float64Col(shape=(), dflt=0.0, pos=4), "temp": Float64Col(shape=(), dflt=0.0, pos=5)} byteorder := 'little' chunkshape := (1365,) Now that we've created our file, let's use it. Let's excerpt a few lines with an array slice, as follows: In [35]: tab[100000:100010] Out[35]: array([(0, 69, 1947, 5, -0.2328834718674, 0.06810312195695), (0, 69, 1947, 6, 0.4724989007889, 1.9529216219569), (0, 69, 1947, 7, -1.0757216683235, 1.0415374480545), (0, 69, 1947, 8, -1.3700249968748, 3.0971874991576), (0, 69, 1947, 9, 0.27279758311253, 0.8263207523831), (0, 69, 1947, 10, -0.0475253104621, 1.4530808932953), (0, 69, 1947, 11, -0.7555493935762, -1.2665440609117), (0, 69, 1947, 12, 1.540049376928, 1.2338186532516), (0, 69, 1948, 1, 0.829743501445, -0.1562732708511), (0, 69, 1948, 2, 0.06924900463163, 1.187193711598)], dtype=[('country', '<i8'), ('town', '<i8'), ('year', '<i8'), ('month', '<i8'), ('precip', '<f8'), ('temp', '<f8')]) In [36]: tab[100000:100010]["precip"] Out[36]: array([-0.23288347, 0.4724989 , -1.07572167, -1.370025 , 0.27279758, -0.04752531, -0.75554939, 1.54004938, 0.8297435 , 0.069249 ]) When we're done with the file, we do the same thing that we would do with any other file-like object: In [37]: h5.close() If we want to work with it again, simply load it, as follows: In [38]: h5 = tb.open_file(tb_name, "r") tab = h5.root.weather Let's try plotting the data from our HDF5 file: In [39]: (figure, axes) = plt.subplots(figsize=(20, 10)) axes.hist(tab[:1000000]["temp"], bins=100) plt.show() Here's a plot for the first million data points: This histogram was rendered quickly, with a much better response time than what we've seen before. Hence, the process of accessing the HDF5 data is very fast. The next question might be "What about executing calculations against this data?" Unfortunately, running the following will consume an enormous amount of RAM: tab[:]["temp"].mean() We've just asked for all of the data—all of its 288 million rows. We are going to end up loading everything into the RAM, grinding the average workstation to a halt. Ideally though, when you iterate through the source data and create the HDF5 file, you also crunch the numbers that you will need, adding supplemental columns or groups to the HDF5 file that can be used later by you and your peers. If we have data that we will mostly be selecting (extracting portions) and which has already been crunched and grouped as needed, HDF5 is a very good fit. This is why one of the most common use cases that you see for HDF5 is the sharing and distribution of the processed data. However, if we have data that we need to process repeatedly, then we will either need to use another method besides the one that will cause all the data to be loaded into the memory, or find a better match for our data processing needs. We saw in the previous section that the selection of data is very fast in HDF5. What about calculating the mean for a small section of data? If we've got a total of 288 million rows, let's select a divisor of the number that gives us several hundred thousand rows at a time—2,81,250 rows, to be more precise. Let's get the mean for the first slice, as follows: In [40]: tab[0:281250]["temp"].mean() Out[40]: 0.0030696632864265312 This took about 1 second to calculate. What about iterating through the records in a similar fashion? Let's break up the 288 million records into chunks of the same size; this will result in 1024 chunks. We'll start by getting the ranges needed for an increment of 281,250 and then, we'll examine the first and the last row as a sanity check, as follows: In [41]: limit = 281250 ranges = [(x * limit, x * limit + limit) for x in range(2 ** 10)] (ranges[0], ranges[-1]) Out[41]: ((0, 281250), (287718750, 288000000)) Now, we can use these ranges to generate the mean for each chunk of 281,250 rows of temperature data and print the total number of means that we generated to make sure that we're getting our counts right, as follows: In [42]: means = [tab[x * limit:x * limit + limit]["temp"].mean() for x in range(2 ** 10)] len(means) Out[42]: 1024 Depending on your machine, that should take between 30 and 60 seconds. With this work done, it's now easy to calculate the mean for all of the 288 million points of temperature data: In [43]: sum(means) / len(means) Out[43]: -5.3051780413782918e-05 Through HDF5's efficient file format and implementation, combined with the splitting of our operations into tasks that would not copy the HDF5 data into memory, we were able to perform calculations across a significant fraction of a billion records in less than a minute. HDF5 even supports parallelization. So, this can be improved upon with a little more time and effort. However, there are many cases where HDF5 is not a practical choice. You may have some free-form data, and preprocessing it will be too expensive. Alternatively, the datasets may be actually too large to fit on a single machine. This is when you may consider using matplotlib with distributed data. Summary In this article, we covered the role of NumPy in the world of big data and matplotlib as well as the process and problems in working with large data sources. Also, we discussed the possible solutions to these problems using NumPy's memmap function and HDF5 and PyTables. Resources for Article: Further resources on this subject: First Steps [article] Introducing Interactive Plotting [article] The plot function [article]

0
0
5127

Packt

08 Jul 2015

7 min read

What's BitBake All About?

Packt

08 Jul 2015

7 min read

In this article by H M Irfan Sadiq, the author of the book Using Yocto Project with BeagleBone Black, we will move one step ahead by detailing different aspects of the basic engine behind Yocto Project, and other similar projects. This engine is BitBake. Covering all the various aspects of BitBake in one article is not possible; it will require a complete book. We will familiarize you as much as possible with this tool. We will cover the following topics in this article: Legacy tools and BitBake Execution of BitBake (For more resources related to this topic, see here.) Legacy tools and BitBake This discussion does not intend to invoke any religious row between other alternatives and BitBake. Every step in the evolution has its own importance, which cannot be denied, and so do other available tools. BitBake was developed keeping in mind the Embedded Linux Development domain. So, it tried to solve the problems faced in this core area, and in my opinion, it addresses these in the best way till date. You might get the same output using other tools, such as Buildroot, but the flexibility and ease provided by BitBake in this domain is second to none. The major difference is in addressing the problem. Legacy tools are developed considering packages in mind, but BitBake evolved to solve the problems faced during the creation of BSPs, or embedded distributions. Let's go through the challenges faced in this specific domain and understand how BitBake helps us face them. Cross-compilation BitBake takes care of cross compilation. You do not have to worry about it for each package you are building. You can use the same set of packages and build for different platforms seamlessly. Resolving inter-package dependencies This is the real pain of resolving dependencies of packages on each other and fulfilling them. In this case, we need to specify the different dependency types available, and BitBake takes care of them for us. We can handle both build and runtime dependencies. Variety of target distribution BitBake supports a variety of target distribution creations. We can define a full new distribution of our own, by choosing package management, image types, and other artifacts to fulfill our requirements. Coupling to build system BitBake is not very dependent on the build system we use to build our target images. We don't use libraries and tools installed on the system; we build their native versions and use them instead. This way, we are not dependent on the build system's root filesystem. Variety of build systems distros Since BitBake is very loosely coupled to the build system's distribution type, it's very easy to use on various distributions. Variety of architecture We have to support different architectures. We don't have to modify our recipes for each package. We can write our recipes so that features, parameters, and flags are picked up conditionally. Exploit parallelism For the simplest projects, we have to build images and do more than a thousand tasks. These tasks require us to use the full power available to us, whether they are computational or related to memory. BitBake's architecture supports us in this regard, using its scheduler to run as many tasks in parallel as it can, or as we configure. Also, when we say task, it should not be confused with package, but it is a part of package. A package can contain many tasks, (fetch, compile, configure, package, populate_sysroot, and so on), and all these can run in parallel. Easy to use, extend, and collaborate Keeping and relying on metadata keeps things simple and configurable. Almost nothing is hard coded. Thus, we can configure things according to our requirements. Also, BitBake provides us with a mechanism to reuse things that are already developed. We can keep our metadata structured, so that it gets applied/extended conditionally. You will learn these tricks when we will explore layers. BitBake execution To get us to a successful package or image, BitBake performs some steps that we need to go through to get an understanding of the workflow. In certain cases, some of these steps can be avoided; but we are not discussing such cases, considering them as corner cases. For details on these, we should refer to the BitBake user manual. Parsing metadata When we invoke the BitBake command to build our image, the first thing it does is parse our base configuration metadata. This metadata consists of build_bb/conf/bblayers.conf, multiple layer/conf/layer.conf, and poky/meta/conf/bitbake.conf. This data can be of the following types: Configuration data Class data Recipes Key variables BBFILES and BBPATH, which are constructed from the layer.conf file. Thus, the constructed BBPATH variable is used to locate configuration files under conf/ and class files under class/ directories. The BBFILES variable is used to find recipe files (.bb and .bbappend). bblayers.conf is used to set these variables. Next, the bitbake.conf file is parsed. If there is no bblayers.conf file, it is assumed that the user has set BBFILES and BBPATH directly in the environment. After having dealt with configuration files, class files inclusion and parsing are taken care of. These class files are specified using the INHERIT variable. Next, BitBake will use the BBFILES variable to construct a list of recipes to parse, along with any append files. Thus, after parsing, recipe values for various variables are stored into datastore. After the completion of a recipe parsing BitBake has: A list of tasks that the recipe has defined A set of data consisting of keys and values Dependency information of the tasks Preparing tasklist BitBake starts looking through the PROVIDES set in recipe files. The PROVIDES set defaults to the recipe name, and we can define multiple values to it. We can have multiple recipes providing a similar package. This task is accomplished by setting PROVIDES in the recipes. While actually making such recipes part of the build, we have to define PRREFERED_PROVIDER_foo so that our specific recipe foo can be used. We can do this in multiple locations. In the case of kernels, we use it in the manchin.conf file. BitBake iterates through the list of targets it has to build and resolves them, along with their dependencies. If PRREFERED_PROVIDER is not set and multiple versions of a package exist, BitBake will choose the highest version. Each target/recipe has multiple tasks, such as fetch, unpack, configure, and compile. BitBake considers each of these tasks as independent units to exploit parallelism in a multicore environment. Although these tasks are executed sequentially for a single package/recipe, for multiple packages, they are run in parallel. We may be compiling one recipe, configuring the second, and unpacking the third in parallel. Or, may be at the start, eight packages are all fetching their sources. For now, we should know the dependencies between tasks that are defined using DEPENDS and RDEPENDS. In DEPENDS, we provide the dependencies that our package needs to build successfully. So, BitBake takes care of building these dependencies before our package is built. RDEPENDS are the dependencies that are required for our package to execute/run successfully on the target system. So, BitBake takes care of providing these dependencies on the target's root filesystem. Executing tasks Tasks can be defined using the shell syntax or Python. In the case of shell tasks, a shell script is created under a temporary directory as run.do_taskname.pid and then, it is executed. The generated shell script contains all the exported variables and the shell functions, with all the variables expanded. Output from the task is saved in the same directory with log.do_taskname.pid. In the case of errors, BitBake shows the full path to this logfile. This is helpful for debugging. Summary In this article, you learned the goals and problem areas that BitBake has considered, thus making itself a unique option for Embedded Linux Development. You also learned how BitBake actually works. Resources for Article: Further resources on this subject: Learning BeagleBone [article] Baking Bits with Yocto Project [article] The BSP Layer [article]

0
0
9604

How-To Tutorials

Packt

08 Jul 2015

23 min read

Why Meteor Rocks!

Packt

08 Jul 2015

23 min read

In this article by Isaac Strack, the author of the book, Getting Started with Meteor.js JavaScript Framework - Second Edition, has discussed some really amazing features of Meteor that has contributed a lot to the success of Meteor. Meteor is a disruptive (in a good way!) technology. It enables a new type of web application that is faster, easier to build, and takes advantage of modern techniques, such as Full Stack Reactivity, Latency Compensation, and Data On The Wire. (For more resources related to this topic, see here.) This article explains how web applications have changed over time, why that matters, and how Meteor specifically enables modern web apps through the above-mentioned techniques. By the end of this article, you will have learned: What a modern web application is What Data On The Wire means and how it's different How Latency Compensation can improve your app experience Templates and Reactivity—programming the reactive way! Modern web applications Our world is changing. With continual advancements in displays, computing, and storage capacities, things that weren't even possible a few years ago are now not only possible but are critical to the success of a good application. The Web in particular has undergone significant change. The origin of the web app (client/server) From the beginning, web servers and clients have mimicked the dumb terminal approach to computing where a server with significantly more processing power than a client will perform operations on data (writing records to a database, math calculations, text searches, and so on), transform the data and render it (turn a database record into HTML and so on), and then serve the result to the client, where it is displayed for the user. In other words, the server does all the work, and the client acts as more of a display, or a dumb terminal. This design pattern for this is called…wait for it…the client/server design pattern. The diagrammatic representation of the client-server architecture is shown in the following diagram: This design pattern, borrowed from the dumb terminals and mainframes of the 60s and 70s, was the beginning of the Web as we know it and has continued to be the design pattern that we think of when we think of the Internet. The rise of the machines (MVC) Before the Web (and ever since), desktops were able to run a program such as a spreadsheet or a word processor without needing to talk to a server. This type of application could do everything it needed to, right there on the big and beefy desktop machine. During the early 90s, desktop computers got even more beefy. At the same time, the Web was coming alive, and people started having the idea that a hybrid between the beefy desktop application (a fat app) and the connected client/server application (a thin app) would produce the best of both worlds. This kind of hybrid app—quite the opposite of a dumb terminal—was called a smart app. Many business-oriented smart apps were created, but the easiest examples can be found in computer games. Massively Multiplayer Online games (MMOs), first-person shooters, and real-time strategies are smart apps where information (the data model) is passed between machines through a server. The client in this case does a lot more than just display the information. It performs most of the processing (or acts as a controller) and transforms the data into something to be displayed (the view). This design pattern is simple but very effective. It's called the Model View Controller (MVC) pattern. The model is essentially the data for an application. In the context of a smart app, the model is provided by a server. The client makes requests to the server for data and stores that data as the model. Once the client has a model, it performs actions/logic on that data and then prepares it to be displayed on the screen. This part of the application (talking to the server, modifying the data model, and preparing data for display) is called the controller. The controller sends commands to the view, which displays the information. The view also reports back to the controller when something happens on the screen (a button click, for example). The controller receives the feedback, performs the logic, and updates the model. Lather, rinse, repeat! Since web browsers were built to be "dumb clients", the idea of using a browser as a smart app back then was out of question. Instead, smart apps were built on frameworks such as Microsoft .NET, Java, or Macromedia (now Adobe) Flash. As long as you had the framework installed, you could visit a web page to download/run a smart app. Sometimes, you could run the app inside the browser, and sometimes, you would download it first, but either way, you were running a new type of web app where the client application could talk to the server and share the processing workload. The browser grows up Beginning in the early 2000s, a new twist on the MVC pattern started to emerge. Developers started to realize that, for connected/enterprise "smart apps", there was actually a nested MVC pattern. The server code (controller) was performing business logic against the database (model) through the use of business objects and then sending processed/rendered data to the client application (a "view"). The client was receiving this data from the server and treating it as its own personal "model". The client would then act as a proper controller, perform logic, and send the information to the view to be displayed on the screen. So, the "view" for the server MVC was the "model" for the client MVC. As browser technologies (HTML and JavaScript) matured, it became possible to create smart apps that used the Nested MVC design pattern directly inside an HTML web page. This pattern makes it possible to run a full-sized application using only JavaScript. There is no longer any need to download multiple frameworks or separate apps. You can now get the same functionality from visiting a URL as you could previously by buying a packaged product. A giant Meteor appears! Meteor takes modern web apps to the next level. It enhances and builds upon the nested MVC design pattern by implementing three key features: Data On The Wire through the Distributed Data Protocol (DDP) Latency Compensation with Mini Databases Full Stack Reactivity with Blaze and Tracker Let's walk through these concepts to see why they're valuable, and then, we'll apply them to our Lending Library application. Data On The Wire The concept of Data On The Wire is very simple and in tune with the nested MVC pattern; instead of having a server process everything, render content, and then send HTML across the wire, why not just send the data across the wire and let the client decide what to do with it? This concept is implemented in Meteor using the Distributed Data Protocol, or DDP. DDP has a JSON-based syntax and sends messages similar to the REST protocol. Additions, deletions, and changes are all sent across the wire and handled by the receiving service/client/device. Since DDP uses WebSockets rather than HTTP, the data can be pushed whenever changes occur. But the true beauty of DDP lies in the generic nature of the communication. It doesn't matter what kind of system sends or receives data over DDP—it can be a server, a web service, or a client app—they all use the same protocol to communicate. This means that none of the systems know (or care) whether the other systems are clients or servers. With the exception of the browser, any system can be a server, and without exception, any server can act as a client. All the traffic looks the same and can be treated in a similar manner. In other words, the traditional concept of having a single server for a single client goes away. You can hook multiple servers together, each serving a discreet purpose, or you can have a client connect to multiple servers, interacting with each one differently. Think about what you can do with a system like that: Imagine multiple systems all coming together to create, for example, a health monitoring system. Some systems are built with C++, some with Arduino, some with…well, we don't really care. They all speak DDP. They send and receive data on the wire and decide individually what to do with that data. Suddenly, very difficult and complex problems become much easier to solve. DDP has been implemented in pretty much every major programming language, allowing you true freedom to architect an enterprise application. Latency Compensation Meteor employs a very clever technique called Mini Databases. A mini database is a "lite" version of a normal database that lives in the memory on the client side. Instead of the client sending requests to a server, it can make changes directly to the mini database on the client. This mini database then automatically syncs with the server (using DDP of course), which has the actual database. Out of the box, Meteor uses MongoDB and Minimongo: When the client notices a change, it first executes that change against the client-side Minimongo instance. The client then goes on its merry way and lets the Minimongo handlers communicate with the server over DDP. If the server accepts the change, it then sends out a "changed" message to all connected clients, including the one that made the change. If the server rejects the change, or if a newer change has come in from a different client, the Minimongo instance on the client is corrected, and any affected UI elements are updated as a result. All of this doesn't seem very groundbreaking, but here's the thing—it's all asynchronous, and it's done using DDP. This means that the client doesn't have to wait until it gets a response back from the server. It can immediately update the UI based on what is in the Minimongo instance. What if the change was illegal or other changes have come in from the server? This is not a problem as the client is updated as soon as it gets word from the server. Now, what if you have a slow internet connection or your connection goes down temporarily? In a normal client/server environment, you couldn't make any changes, or the screen would take a while to refresh while the client waits for permission from the server. However, Meteor compensates for this. Since the changes are immediately sent to Minimongo, the UI gets updated immediately. So, if your connection is down, it won't cause a problem: All the changes you make are reflected in your UI, based on the data in Minimongo. When your connection comes back, all the queued changes are sent to the server, and the server will send authorized changes to the client. Basically, Meteor lets the client take things on faith. If there's a problem, the data coming in from the server will fix it, but for the most part, the changes you make will be ratified and broadcast by the server immediately. Coding this type of behavior in Meteor is crazy easy (although you can make it more complex and therefore more controlled if you like): lists = new Mongo.Collection("lists"); This one line declares that there is a lists data model. Both the client and server will have a version of it, but they treat their versions differently. The client will subscribe to changes announced by the server and update its model accordingly. The server will publish changes, listen to change requests from the client, and update its model (its master copy) based on these change requests. Wow, one line of code that does all that! Of course, there is more to it, but that's beyond the scope of this article, so we'll move on. To better understand Meteor data synchronization, see the Publish and subscribe section of the meteor documentation at http://docs.meteor.com/#/full/meteor_publish. Full Stack Reactivity Reactivity is integral to every part of Meteor. On the client side, Meteor has the Blaze library, which uses HTML templates and JavaScript helpers to detect changes and render the data in your UI. Whenever there is a change, the helpers re-run themselves and add, delete, and change UI elements, as appropriate, based on the structure found in the templates. These functions that re-run themselves are called reactive computations. On both the client and the server, Meteor also offers reactive computations without having to use a UI. Called the Tracker library, these helpers also detect any data changes and rerun themselves accordingly. Because both the client and the server are JavaScript-based, you can use the Tracker library anywhere. This is defined as isomorphic or full stack reactivity because you're using the same language (and in some cases the same code!) on both the client and the server. Re-running functions on data changes has a really amazing benefit for you, the programmer: you get to write code declaratively, and Meteor takes care of the reactive part automatically. Just tell Meteor how you want the data displayed, and Meteor will manage any and all data changes. This declarative style is usually accomplished through the use of templates. Templates work their magic through the use of view data bindings. Without getting too deep, a view data binding is a shared piece of data that will be displayed differently if the data changes. Let's look at a very simple data binding—one for which you don't technically need Meteor—to illustrate the point. Let's perform the following set of steps to understand the concept in detail: In LendLib.html, you will see an HTML-based template expression: <div id="categories-container"> {{> categories}} </div> This expression is a placeholder for an HTML template that is found just below it: <template name="categories"> <h2 class="title">my stuff</h2>.. So, {{> categories}} is basically saying, "put whatever is in the template categories right here." And the HTML template with the matching name is providing that. If you want to see how data changes will affect the display, change the h2 tag to an h4 tag and save the change: <template name="categories"> <h4 class="title">my stuff</h4> You'll see the effect in your browser. (my stuff will become itsy bitsy.) That's view data binding at work. Change the h4 tag back to an h2 tag and save the change, unless you like the change. No judgment here...okay, maybe a little bit of judgment. It's ugly, and tiny, and hard to read. Seriously, you should change it back before someone sees it and makes fun of you! Alright, now that we know what a view data binding is, let's see how Meteor uses it. Inside the categories template in LendLib.html, you'll find even more templates: <template name="categories"> <h4 class="title">my stuff</h4> <div id="categories" class="btn-group"> {{#each lists}} <div class="category btn btn-primary"> {{Category}} </div> {{/each}} </div> </template> Meteor uses a template language called Spacebars to provide instructions inside templates. These instructions are called expressions, and they let us do things like add HTML for every record in a collection, insert the values of properties, and control layouts with conditional statements. The first Spacebars expression is part of a pair and is a for-each statement. {{#each lists}} tells the interpreter to perform the action below it (in this case, it tells it to make a new div element) for each item in the lists collection. lists is the piece of data, and {{#each lists}} is the placeholder. Now, inside the {{#each lists}} expression, there is one more Spacebars expression: {{Category}} Since the expression is found inside the #each expression, it is considered a property. That is to say that {{Category}} is the same as saying this.Category, where this is the current item in the for-each loop. So, the placeholder is saying, "add the value of the Category property for the current record." Now, if we look in LendLib.js, we will see the reactive values (called reactive contexts) behind the templates: lists : function () { return lists.find(... Here, Meteor is declaring a template helper named lists. The helper, lists, is found inside the template helpers belonging to categories. The lists helper happens to be a function that returns all the data in the lists collection, which we defined previously. Remember this line? lists = new Mongo.Collection("lists"); This lists collection is returned by the above-mentioned helper. When there is a change to the lists collection, the helper gets updated and the template's placeholder is changed as well. Let's see this in action. On your web page pointing to http://localhost:3000, open the browser console and enter the following line: > lists.insert({Category:"Games"}); This will update the lists data collection. The template will see this change and update the HTML code/placeholder. Each of the placeholders will run one additional time for the new entry in lists, and you'll see the following screen: When the lists collection was updated, the Template.categories.lists helper detected the change and reran itself (recomputed). This changed the contents of the code meant to be displayed in the {{> categories}} placeholder. Since the contents were changed, the affected part of the template was re-run. Now, take a minute here and think about how little we had to do to get this reactive computation to run: we simply created a template, instructing Blaze how we want the lists data collection to be displayed, and we put in a placeholder. This is simple, declarative programming at its finest! Let's create some templates We'll now see a real-life example of reactive computations and work on our Lending Library at the same time. Adding categories through the console has been a fun exercise, but it's not a long-term solution. Let's make it so that we can do that on the page instead as follows: Open LendLib.html and add a new button just before the {{#each lists}} expression: <div id="categories" class="btn-group"> <div class="category btn btn-primary" id="btnNewCat"> <span class="glyphicon glyphicon-plus"></span> </div> {{#each lists}} This will add a plus button on the page, as follows: Now, we want to change the button into a text field when we click on it. So let's build that functionality by using the reactive pattern. We will make it based on the value of a variable in the template. Add the following {{#if…else}} conditionals around our new button: <div id="categories" class="btn-group"> {{#if new_cat}} {{else}} <div class="category btn btn-primary" id="btnNewCat"> <span class="glyphicon glyphicon-plus"></span> </div> {{/if}} {{#each lists}} The first line, {{#if new_cat}}, checks to see whether new_cat is true or false. If it's false, the {{else}} section is triggered, and it means that we haven't yet indicated that we want to add a new category, so we should be displaying the button with the plus sign. In this case, since we haven't defined it yet, new_cat will always be false, and so the display won't change. Now, let's add the HTML code to display when we want to add a new category: {{#if new_cat}} <div class="category form-group" id="newCat"> <input type="text" id="add-category" class="form-control" value="" /> </div> {{else}} ... {{/if}} There's the smallest bit of CSS we need to take care of as well. Open ~/Documents/Meteor/LendLib/LendLib.css and add the following declaration: #newCat { max-width: 250px; } Okay, so now we've added an input field, which will show up when new_cat is true. The input field won't show up unless it is set to true; so, for now, it's hidden. So, how do we make new_cat equal to true? Save your changes if you haven't already done so, and open LendLib.js. First, we'll declare a Session variable, just below our Meteor.isClient check function, at the top of the file: if (Meteor.isClient) { // We are declaring the 'adding_category' flag Session.set('adding_category', false); Now, we'll declare the new template helper new_cat, which will be a function returning the value of adding_category. We need to place the new helper in the Template.categories.helpers() method, just below the declaration for lists: Template.categories.helpers({ lists: function () { ... }, new_cat: function(){ //returns true if adding_category has been assigned //a value of true return Session.equals('adding_category',true); } }); Note the comma (,) on the line above new_cat. It's important that you add that comma, or your code will not execute. Save these changes, and you'll see that nothing has changed. Ta-da! In reality, this is exactly as it should be because we haven't done anything to change the value of adding_category yet. Let's do this now: First, we'll declare our click event handler, which will change the value in our Session variable. To do this, add the following highlighted code just below the Template.categories.helpers() block: Template.categories.helpers({ ... }); Template.categories.events({ 'click #btnNewCat': function (e, t) { Session.set('adding_category', true); Tracker.flush(); focusText(t.find("#add-category")); } }); Now, let's take a look at the following line of code: Template.categories.events({ This line declares that events will be found in the category template. Now, let's take a look at the next line: 'click #btnNewCat': function (e, t) { This tells us that we're looking for a click event on the HTML element with an id="btnNewCat" statement (which we already created in LendLib.html). Session.set('adding_category', true); Tracker.flush(); focusText(t.find("#add-category")); Next, we set the Session variable, adding_category = true, flush the DOM (to clear up anything wonky), and then set the focus onto the input box with the id="add-category" expression. There is one last thing to do, and that is to quickly add the focusText(). helper function. To do this, just before the closing tag for the if (Meteor.isClient) function, add the following code: /////Generic Helper Functions///// //this function puts our cursor where it needs to be. function focusText(i) { i.focus(); i.select(); }; } //<------closing bracket for if(Meteor.isClient){} Now, when you save the changes and click on the plus button, you will see the input box: Fancy! However, it's still not useful, and we want to pause for a second and reflect on what just happened; we created a conditional template in the HTML page that will either show an input box or a plus button, depending on the value of a variable. This variable is a reactive variable, called a reactive context. This means that if we change the value of the variable (like we do with the click event handler), then the view automatically updates because the new_cat helpers function (a reactive computation) will rerun. Congratulations, you've just used Meteor's reactive programming model! To really bring this home, let's add a change to the lists collection (which is also a reactive context, remember?) and figure out a way to hide the input field when we're done. First, we need to add a listener for the keyup event. Or, to put it another way, we want to listen when the user types something in the box and hits Enter. When this happens, we want to add a category based on what the user typed. To do this, let's first declare the event handler. Just after the click handler for #btnNewCat, let's add another event handler: 'click #btnNewCat': function (e, t) { ... }, 'keyup #add-category': function (e,t){ if (e.which === 13) { var catVal = String(e.target.value || ""); if (catVal) { lists.insert({Category:catVal}); Session.set('adding_category', false); } } } We add a "," character at the end of the first click handler, and then add the keyup event handler. Now, let's check each of the lines in the preceding code: This line checks to see whether we hit the Enter/Return key. if (e.which === 13) This line of code checks to see whether the input field has any value in it: var catVal = String(e.target.value || ""); if (catVal) If it does, we want to add an entry to the lists collection: lists.insert({Category:catVal}); Then, we want to hide the input box, which we can do by simply modifying the value of adding_category: Session.set('adding_category', false); There is one more thing to add and then we'll be done. When we click away from the input box, we want to hide it and bring back the plus button. We already know how to do this reactively, so let's add a quick function that changes the value of adding_category. To do this, add one more comma after the keyup event handler and insert the following event handler: 'keyup #add-category': function (e,t){ ... }, 'focusout #add-category': function(e,t){ Session.set('adding_category',false); } Save your changes, and let's see this in action! In your web browser on http://localhost:3000, click on the plus sign, add the word Clothes, and hit Enter. Your screen should now resemble the following screenshot: Feel free to add more categories if you like. Also, experiment by clicking on the plus button, typing something in, and then clicking away from the input field. Summary In this article, you learned about the history of web applications and saw how we've moved from a traditional client/server model to a nested MVC design pattern. You learned what smart apps are, and you also saw how Meteor has taken smart apps to the next level with Data On The Wire, Latency Compensation, and Full Stack Reactivity. You saw how Meteor uses templates and helpers to automatically update content, using reactive variables and reactive computations. Lastly, you added more functionality to the Lending Library. You made a button and an input field to add categories, and you did it all using reactive programming rather than directly editing the HTML code. Resources for Article: Further resources on this subject: Building the next generation Web with Meteor [article] Quick start - creating your first application [article] Meteor.js JavaScript Framework: Why Meteor Rocks! [article]

0
0
4916

How-To Tutorials

Packt

08 Jul 2015

8 min read

Zabbix and I – Almost Heroes

Packt

08 Jul 2015

8 min read

In this article written by Luciano Alves, author of the book Zabbix Performance Tuning, the author explains that ever since he started working with IT infrastructure, he's been noticing that almost every company, when they start thinking about a monitoring tool, think of trying to know in some way when the system or service will go down before it actually happens. They expect the monitoring tool to create some kind of alert when something is broken. But by this approach, the system administrator will know about an error or system outage only after the error occurs (and maybe, at the same time, users are trying to use those systems). We need a monitoring solution to help us predict system outages and any other situation that our services can be affected by. Our approach with monitoring tools should cover not only our system monitoring but also our business monitoring. Nowadays, any company (small, medium, or large) has some dependency on technologies, from servers and network assets to IP equipment with a lower environmental impact. Maybe you need security cameras, thermometers, UPS, access control devices, or any other IP device by which you can gather some useful data. What about applications and services? What about data integration or transactions? What about user experience? What about a supplier website or system that you depend on? We should realize that monitoring things is not restricted to IT infrastructure, and it can be extended to other areas and business levels as well. (For more resources related to this topic, see here.) After starting Zabbix – the initial steps Suppose you already have your Zabbix server up and running. In a few weeks, Zabbix has helped you save a lot of time while restoring systems. It has also helped you notice some hidden things in your environment—maybe a flapping port in a network switch, or lack of CPU in a router. In a few months, Zabbix and you (of course) are like superstars. During lunch, people are talking about you. Some are happy because you've dealt with a recurring error. Maybe, a manager asks you to find a way to monitor a printer because it's very important to their team, another manager asks you to monitor an application, and so on. The other teams and areas also need some kind of monitoring. They have other things to monitor, not only IT things. But are these people familiar with technical things? Technical words, expressions, flows, and lines of thoughts are not so easy for people with nontechnical backgrounds to understand. Of course, in small and medium enterprises (SME), things will go ahead faster and paths will be shorter, but the scenario is not too different in most cases. You can work alone or in a huge team, but now you have another important partner—Zabbix. An immutable fact is that monitoring things comes with more and more responsibility and reliability. At this point, we have some new issues to solve: How do we create and authenticate a user? When Zabbix's visibility starts growing in your environment, you will need to think how to manage and handle these users. Do you have an LDAP or Microsoft Active Directory that you can use for centralized authentication? Of course, depending on the users you have, you will have more requests. Will you permit any user to access the Zabbix interface? Only a few? And which ones? Is it necessary to create a custom monitor? We know that Zabbix has a lot of built-in keys for gathering data. These keys are available for a good number of operating systems. We also have built-in functions used to gather data using the Intelligent Platform Management Interface (IPMI), Simple Network Management Protocol (SNMP), Open Database Connectivity (ODBC), Java Management Extensions (JMX), user parameters in the Zabbix agent, and so on. However, we need to think about a wide scenario where we need to gather data from somewhere Zabbix hasn't reached yet. Our experience shows us that most of the time, it is necessary to create custom monitors (not one, but a lot of them). Zabbix is a very flexible and easy-to-customize platform. It is possible to make Zabbix do anything you want. However, to learn every new function or to monitor Zabbix, you'll need to think about what kind of extension you'll use. More functions, more data, more load, and more TCP connections! This means that when other teams or areas start putting light on Zabbix, you will need to think about the number of new functions or monitors you will need to get. Then, which language to choose to develop these new things? Maybe you know the C language and you are thinking of using Zabbix modules. Will you use bulk operations to avoid network traffic? The natural growth In most scenarios, natural growth will occur without control. I mean, people are not used to planning this growth. It is very important to keep it under control. When some guys start their Zabbix deployment, they probably do not intend to cater to all company teams, areas, or businesses. They think about their needs and their team only. So, they don't think a lot about user rights, mainly because they are technicians and know mostly about hosts, items, triggers, maps, graphs, screens, and so on. What about users who are not technicians? Will they understand the Zabbix interface easily? Do you know that in Zabbix, we have a lot of paths that reach the same point? The Zabbix interface isn't object-based, which means that users need a lot of clicks to reach (read or write) the information related to an object (hosts, items, graphs, triggers, events, and so on). If you need to see the most recent data gathered from a specific item, you'll need to use the Monitoring menu, then use the Latest data menu, choose the group that the host belongs to, choose your host, and finally search for your item in the table. If you need to see a specific custom graph, use the Graphs menu, which is under Monitoring. Choose the group that the hosts belong to, choose your host, and then search for your graph in a combobox. If you need to know about an active trigger in your host, you'll need to use the Triggers menu, which is under Monitoring. Choose the group that your host belongs to and choose your host. Then, you can see the triggers from that specific host. If you want to include a new item in an existing custom graph, you'll need to access the Hosts menu, which is under Configuration. Choose the group that the hosts belong to, search for your host, and click on the Graphs link. Then you can choose which graph you want to change. There are a lot of clicks required to do simple things. Of course, the steps you just saw are something familiar for guys who have deployed Zabbix, but is this true for other teams too? Maybe, you are thinking right now that it doesn't matter to those guys. But actually, it matters, and it's directly related to Zabbix's growth in your environment. Okay, I think the next two questions will be: are you sure it matters? And why? Let's agree that the actual Zabbix interface isn't very user friendly for nontechnical guys. But according to the path of natural growth, you started gathering data from a lot of things that are not just IT related. Also, you can develop custom charts and any data from Zabbix via API functions. Now you'll have a lot of nontechnical guys trying to use Zabbix data. I'm sure that it will be necessary to create some maps and screens to help these users get the required information quickly and smoothly. The following screenshots show how we can transform the viewing layer of Zabbix into something more attractive: Tactical dashboard Here is what a strategic dashboard may look like: Strategic dashboard The point here is whether your Zabbix deployment is prepared to cater to these types of requirements. Summary We've noticed how Zabbix has evolved in terms of performance issues with each version. Also, you realized the importance of the need to be aware of its new features. Another significant point was to realize that the importance of Zabbix is growing, as the other teams and areas of the company are now aware of the potential of this tool. This movement will take Zabbix to all the corners of a company, which often requires a more open approach as far as monitoring tasks is concerned. Monitoring only servers and network assets will not suffice. Resources for Article: Further resources on this subject: Going beyond Zabbix agents [article] Understanding Self-tuning Thresholds [article] Query Performance Tuning [article]

0
0
6006

article-image-understanding-mesos-internals

Packt

08 Jul 2015

26 min read

Understanding Mesos Internals

Packt

08 Jul 2015

26 min read

0
0
5733

article-image-building-biztalk-server-2013-applications

Packt

08 Jul 2015

15 min read

Building BizTalk Server 2013 Applications

Packt

08 Jul 2015

15 min read

Creativity is the power to connect the seemingly unconnected. – William Plomer Let's begin our journey by investigating what BizTalk Server actually is, why one would use it, and how to craft a running application. This article will be a refresher on BizTalk Server for those of you who have some familiarity with the product. In this article by Mark Brimble coauthor of the book SOA Patterns with BizTalk Server 2013 and Microsoft Azure - Second Edition, you will learn: How to articulate BizTalk Server, when to use it, and how it works How to outline the role of BizTalk schemas, maps, and orchestrations BizTalk messaging configurations What is BizTalk Server? So what exactly is BizTalk Server, and why should you care about it? In a nutshell, Microsoft BizTalk Server 2013 uses adapter technology to connect disparate entities and enable the integration of data, events, processes, and services. An entity may be an application, department, or a different organization altogether that you need to be able to share information with. A software adapter is typically used when we need to establish communication between two components that do not natively collaborate. BizTalk Server adapters are built with a common framework; which results in system integration implemented via configuration, not coding. Traditionally, BizTalk Server has solved problems in the following three areas: Enterprise Application Integration Business-to-Business Business Process Automation First, BizTalk Server acts as an Enterprise Application Integration (EAI) server that connects applications that are natively incapable of talking to each other. The applications may have incompatible platforms, data structure formats, or security models. For example, when a new employee is hired, the employee data from the human resources application needs to be sent to the payroll application so that the new employee receives his/her paycheck on time. Nothing prevents you from writing the code necessary to connect these disparate applications with a point-to-point solution. However, using such a strategy often leads to an application landscape that looks like this: Many organizations choose to insert a communication broker between these applications, as shown in the following figure: Some of the benefits that you would realize from such an architectural choice include: Loose coupling of applications where one does not have a physical dependency on the other Durable infrastructure that can guarantee delivery and queue messages during destination system downtime Centralized management of system integration endpoints Message flow control such as in-order delivery Encouragement for the reuse of core components Insight into cross-functional business processes through business activity monitoring BizTalk Server solves a second problem by filling the role of a Business-to-Business (B2B) broker that facilitates communication across different organizations. BizTalk supports B2B scenarios by offering Internet-friendly adapters, industry standard EDI message schemas, and robust support for both channel- and message-based security. The third broad area that BizTalk Server excels in is Business Process Automation (BPA). BPA is all about taking historically manual workflow procedures and turning them into executable processes. For example, consider an organization that typically receives a new order via e-mail, and the sales agent manually checks inventory levels prior to inserting the order into the fulfillment system. If the inventory is too low, then the sales agent has to initiate an order with their supplier and watch out for the response so that the inventory system can be updated. The inevitable problems of this scenario are as follows: Poor scalability when the number of orders increases Lack of visibility into the status of orders and supplier requests Multiple instances of redundant data entry, ripe for mistakes in one system and not the other Unreliable resources when a sales agent is sick By deciding to automate this scenario, the company can reduce human error while streamlining communications between applications and organizations. The beginning of the second decade of the 21st century saw the disruption of the traditional ways in which EAI and B2B problems were solved because of the rise of Software as Service (SaaS). SaaS is a software that is hosted external to your business, and is paid for on a subscription basis; its best known example is Salesforce.com. Many organizations have chosen to modify their EAI and B2B solutions with BizTalk Server to access SaaS applications using hybrid solutions, as shown in the following figure: Four new adapters, the WCF-BasicHTTPRelay, WCF-WebHTTP, WCF-NetTCPRelay, and SB-Messaging adapter, have been added to BizTalk 2013 to support hybrid solutions, and are nicknamed the "cloud adapters". New chapters on RESTful services and the Azure Service Bus have been added to this edition of the book to describe how these cloud adapters enhance the BizTalk Server story. Microsoft Azure BizTalk Service (MABS) has been created as a SaaS offering that can abstract B2B problems to Azure. We have added a chapter that shows how to use BizTalk Server 2013 with this new SaaS model. Examples of how to use all these new components to add new SOA capabilities to BizTalk Server have been added to this book. Azure App Services is Microsoft's next generation SaaS offering that will supersede MABS. While the platform is still very fresh, we have outlined the underlying concepts for you in the final chapter of this book to help you get a head start on usage of this platform. What's the one thing that all of these BizTalk Server cases have in common? They all depend on the real-time interchange and processing of discrete messages in an event-driven fashion. This partially explains why BizTalk Server is such a strong tool within a service-oriented architecture. We'll investigate many of BizTalk's service-oriented capabilities in later chapters, but it's important to note that the functionality that exists to support the three top-level scenarios mentioned earlier (EAI, B2B, and BPM) fits well into a service-oriented mindset. Concepts such as contract-first design, loose coupling, and reusability are soaked into the fabric of BizTalk Server. BizTalk Server should be targeted for solutions that exchange real-time messages as opposed to Extract Transform Load (ETL) products that excel at bulky, batch-oriented exchanges between data stores. BizTalk Server 2013 is the eighth release of the product, the first release being BizTalk Server 2000. Back in those days, developers had access to four native adapters (filesystem, MSMQ, HTTP, and SMTP). Development was done using a series of different tools, and the underlying engine had some fairly tight coupling between components. Since then, the entire product has been rebuilt and reengineered for .NET and a myriad of new services and features have become part of the BizTalk Server suite. The application continues to evolve and take greater advantage of the features of the Microsoft product stack, while still being the most interoperable and platform-neutral offering that Microsoft has ever produced. BizTalk architecture So how does BizTalk Server actually work? At its core, BizTalk Server is an event-processing engine based on a conventional publish-subscribe pattern. Wikipedia defines the publish-subscribe pattern as: "An asynchronous messaging paradigm where senders (publishers) of messages are not programmed to send their messages to specific receivers (subscribers). Rather, published messages are characterized into classes, without knowledge of what (if any) subscribers there may be. Subscribers express interest in one or more classes, and only receive messages that are of interest, without knowledge of what (if any) publishers there are." This pattern enforces a natural loose coupling and provides more scalability than an engine that requires a tight connection between receivers and senders. In the first release of BizTalk Server, the product did have tightly coupled messaging components, but thankfully, the engine was completely redesigned for BizTalk Server 2004. Once a message is received by a BizTalk adapter, it runs through any necessary preprocessing (such as decoding and validations) in BizTalk pipelines before being subjected to data transformation via BizTalk maps, and finally being published to a central database called the MessageBox. Then, the parties that have a corresponding subscription for that message can consume it as they see fit. While introducing a bit of unavoidable latency, the MessageBox database makes up for that by providing us with durability, reliability, and scalability. For instance, if one of our subscriber systems is offline for maintenance, outbound messages are not lost, but rather the MessageBox ensures that the messages are queued until the subscriber is ready to receive them. Worried about a large flood of inbound messages that steal processing threads away from other BizTalk activities? No problem! The MessageBox ensures that each and every message finds its way to its targeted subscriber, even if it must wait until the flood of inbound messages subsides. There are really two ways to look at the way BizTalk is structured. The first is the traditional EAI view, which sees BizTalk receiving messages and routes them to the next system for consumption. The flow is very linear and BizTalk is seen as a broker between two applications, shown as follows: However, the other way to consider BizTalk, and the focus of this book, is as a Service Bus, with numerous input/output channels that process messages in a very dynamic way. That is, instead of visualizing the data flow as a straight path through BizTalk to a destination system, consider BizTalk exposing services as on-ramps to a variety of destinations. Messages published to BizTalk Server may fan out to dozens of subscribers, who have no interest in what the publishing application actually was. Instead of thinking about BizTalk as a simple connector of systems, think of it as a message bus that coordinates a symphony of events between endpoints. This concept is an exciting way to exploit BizTalk's engine in this modern world of service orientation. In the following figure, I've shown how the central BizTalk bus has receiver services hanging from it, and has a multitude of distinct subscriber services that are activated by relevant messages reaching the bus: If the on-ramp concept is a bit abstract to understand, consider a simple analogy. In designing the transportation for a city, it would be foolish to create distinct roads between each and every destination. The design and maintenance of such a project would be lunacy. It would be smart to design a shared highway with on and off ramps, which enable people to use a common route to get to the numerous locations around town. As new destinations in the city emerge, the entire highway (or road system) doesn't need to undergo changes, but rather, only a new entrance/exit point needs to be appended to the existing shared infrastructure. What exactly is a message anyway? A message is data processed through BizTalk Server's messaging engine, whether that data is transported as an XML document, a delimited flat file, or a Microsoft Word document. The message content may contain a command (for example, InsertCustomer), a document (for example, Invoice), or an event (for example, VendorAdded). A message has a set of properties associated with it. First and foremost, a message may have a type associated with it, which uniquely defines it within the messaging bus. The type is typically comprised of the XML namespace and the root node name (for example, http://CompanyA.Purchasing#PurchaseOrder). The message type is much like the class object in an object-oriented programming language; it uniquely identifies entities by their properties. The other critical attribute of a message in BizTalk Server is the property bag called the message context, as shown in the following screenshot: The message context is a set of name/value properties that stay attached to the message as long as it remains within BizTalk Server. These context values include metadata about the transport used to publish the message and attributes of the message itself. Properties in the message context that are visible to the BizTalk engine, and therefore available for routing decisions, are called promoted properties. How does a message actually get into BizTalk Server? A receive location is configured for the actual endpoint that receives messages. The receive location uses a particular adapter that knows how to absorb the inbound message. For instance, a receive location may be configured to use the FILE adapter, which polls a particular directory for XML messages. The receive location stores the file path to monitor, while the adapter provides transport connectivity. Upon receipt of a message, the adapter stamps a set of values into the message context. For the FILE adapter, values such as ReceivedFileName are added to that message's context property bag. Note that BizTalk has both application adapters, such as SQL Server, Oracle, and SAP, as well as transport-level adapters, such as HTTP, MSMQ, and FILE. The key point is that the adapter configuration user experience is virtually identical regardless of the type of adapter chosen. Some of the adapters available are shown in the following figure: Receive locations have a particular receive pipeline associated with them. A pipeline is a sequential set of optional operations that is performed on the message in preparation of being parsed and sent to the message box database by the BizTalk adapter. For instance, I would need a pipeline in order to decrypt, unzip, or validate the XML structure of my inbound message. One of the most critical roles of the pipeline is to identify the type of the inbound message and put the type into the message context as a promoted property. Custom pipelines can serve as preprocessing stages to make the message useful for processing. As discussed earlier, a message type is the unique characterization of a message. Think of a receive pipeline as performing all the preprocessing steps necessary for putting the message in to its most usable format. A receive port contains one or more receive locations. Receive ports have XSLT maps associated with them that are applied to messages prior to publishing them to the MessageBox database. What value does a receive port offer? It acts as a grouping of receive locations where capabilities such as mapping and data tracking can be applied to all of the associated receive locations. It may also act as a container that allows us to publish a single entity to BizTalk Server regardless of how it came in, or what it looked like upon receipt. Let's say that my receive port contains three receive locations, which all receive slightly different "invoice" messages from three different external vendors. At the receive port level, I have three maps that take each unrelated message and maps it to a single, common format, before publishing it to BizTalk. Now that we have a message cleaned up (by the pipeline) and in the final structure (via an XSLT map), it's published to the BizTalk Server MessageBox where message routing can begin. For our purposes, there are two types of subscribers that we care about. The first type of subscriber is a send port. A send port is conceptually the inverse of the receive location and is responsible for transporting messages out of the BizTalk "bus". It has not only the adapter reference, adapter configuration settings, and pipeline (much like the receive location), but also the ability to apply XSLT maps to outbound messages. If a send port subscribes to a message, it first applies any XSLT map to the message, then processes it through a send pipeline, and finally uses the adapter to transmit the message out of BizTalk. The other type of subscriber for a published message is a BizTalk orchestration. An orchestration is an executable business process that uses messages to complete operations in a workflow. We'll spend plenty of time working with orchestration subscribers throughout this book. Summary In this article, we looked at what BizTalk is, its core use cases, and how it works. In my experience, one of the biggest competitors to BizTalk Server is not another product, but custom-built solutions. Many organizations engage a "build versus buy" debate prior to committing to a commercial product. In this article, I highlighted just a few aspects of BizTalk that make it a compelling choice for usage. With BizTalk Server, you get a well-designed scalable messaging engine with a durable persistence tier, which guarantees that your mission-critical messages are not lost in transit. The engine also provides native support for message tracking, recoverability, and straightforward scalability. BizTalk provides you with more than 20 native application adapters that save weeks of custom development time and testing. We also got a glimpse of BizTalk's integrated workflow toolset, which enables us to quickly build executable business processes that run in a load-balanced environment. These features alone often tip the scales in BizTalk Server's favor, not to mention the multitude of features that we are yet to discuss, such as Enterprise Single Sign On, the Business Rules Engine, Business Activity Monitoring, and so on. I hope that this article also planted some seeds in your mind with regards to thinking about BizTalk solutions in a service-oriented fashion. There are best practices for designing reusable, maintainable solutions that we will investigate throughout the rest of this book. In the next chapter, we'll explore one of the most important technologies for building robust service interfaces in BizTalk Server, which is Windows Communication Foundation. Resources for Article: Further resources on this subject: Implementation Case Study [article] Oracle B2B Overview [article] SOA Application Design [article]

0
0
1579

How-To Tutorials

Packt

08 Jul 2015

10 min read

Editors and IDEs

Packt

08 Jul 2015

10 min read

In this article by Daniel Blair, the author of the book Learning Banana Pi, you are going to learn about some editors and the programming languages that are available on the Pi and Linux. These tools will help you write the code that will interact with the hardware through GPIO and on the Pi as a server. (For more resources related to this topic, see here.) Choosing your editor There are many different integrated development environments (generally abbreviated as IDEs) to choose from on Linux. When working on the Banana Pi, you're limited to the software that will run on an ARM-based CPU. Hence, options such as Sublime Text are not available. Some options that you may be familiar with are available for general purpose code editing. Some tools are available for the command line, while others are GUI tools. So, depending on whether you have a monitor or not, you will want to choose an appropriate tool. The following screenshot shows some JavaScript being edited via nano on the command line: Command-line editors The command line is a powerful tool. If you master it, you will rarely need to leave it. There are several editors available for the command line. There has been an ongoing war between the users of two editors: GNU Emacs and Vim. There are many editors like nano (which is my preference), but the war tends to be between the two aforementioned editors. The Emacs editor This is my least favorite command-line editor (just my preference). Emacs is a GNU-flavored editor for the command line. It is often installed by default, but you can easily install it if it is missing by running a quick command, as follows: sudo apt-get install emacs Now, you can edit a file via the CLI by using the following code: emacs <command-line arguments> <your file> The preceding code will open the file in Emacs for you to edit. You can also use this to create new files. You can save and close the editor with a couple of key combinations: Ctrl + X and Ctrl + S Ctrl + X and Ctrl + C Thus, your document will be saved and closed. The Vim editor Vim is actually an extension of Vi, and it is functionally the same thing. Vim is a fine editor. Many won't personally go out of their way to not use it. However, people do find it a bit difficult to remember all the commands. If you do get good at it, though, you can code very quickly. You can install Vim with the command line: sudo apt-get install vim Also, there is a GUI version available that allows interaction with the mouse; this is functionally the same program as the Vim command line. You don't have to be confined to the terminal window. You can install it with an identical command: sudo apt-get install vim-gnome You can edit files easily with Vim via the command line for both Vim and Vim-Gnome, as follows: vim <your file> gvim <your file> The gnome version will open the file in a window There is a handy tutorial that you can use to learn the commands of Vim. You can run the tutorial with the help of the following command: vimtutor This tutorial will teach you how to run this editor, which is awesome because the commands can be a bit complicated at first. The following screenshot shows Vim editing the file that we used earlier: The nano editor The nano editor is my favorite editor for the command line. This is probably because it was the first editor that I was exposed to when I started to learn Linux and experiment with the servers and eventually, the Raspberry Pi and Banana Pi. The nano editor is generally considered the easiest to use and is installed by default on the Banana Pi images. If, for some reason, you need to install it, you can get it quickly with the help of the following command: sudo apt-get install nano The editor is easy to use. It comes with several commands that you will use frequently. To save and close the editor, use the following key combinations: Ctrl + O Ctrl + X You can get help at any time by pressing Ctrl + G. Graphic editors With the exception of gVim, all the editors we just talked about live on the command line. If you are more accustomed to graphical tools, you may be more comfortable with a full-featured IDE. There are a couple of choices in this regard that you may be familiar with. These tools are a little heavier than the command-line tools because you will need to not only run the software, but also render the window. This is not as much of a big deal on the Banana Pi as it is on the Raspberry Pi, because we have more RAM to play with. However, if you have a lot of programs running already, it might cause some performance issues. Eclipse Eclipse is a very popular IDE that is available for everything. You can use it to develop all kinds of systems and use all kinds of programming languages. This is a tool that can be used to do professional development. There are a lot of plugins available in this IDE. It is also used to develop apps for Android (although Android Studio is also available now). Eclipse is written in Java. Hence, in order to make it work, you will require a Java Runtime Environment. The Banana Pi should come equipped with the Java development and runtime environments. If this is not the case, they are not difficult to install. In order to grab the proper version of Eclipse and avoid browsing all the specific versions on the website, you can just install it via the command line by entering the following code: sudo apt-get install eclipse Once Eclipse is installed, you will find it in the application menu under programming tools. The following screenshot shows the Eclipse IDE running on the Banana Pi: The Geany IDE Geany is a lighter weight IDE than Eclipse although the former is not quite fully featured. It is a clean UI that can be customized and used to write a lot of different programming languages. Geany was one of the first IDEs I ever used when first exploring Linux when I was a kid. Geany does not come preinstalled on the Banana Pi images, but it is easy to get via the command line: sudo apt-get install geany Depending on what you plan to do code-wise on the Banana Pi, Geany may be your best bet. It is GUI-based and offers quite a bit of functionality. However, it is a lot faster to load than Eclipse. It may seem familiar for Windows users, and they might find it easier to operate since it resembles Windows software. The following screenshot shows Geany on Linux: Both of these editors, Geany and Eclipse, are not specific to a particular programming language, but they both are slightly better for certain languages. Geany tends to be better for web languages such as HTML, PHP, JavaScript, and CSS, while Eclipse tends to be better for compiled languages such as C++, Go, and Java as well as PHP and Ruby with plugins. If you plan to write scripts or languages that are intended to be run from the command line such as Bash, Ruby, or Python, you may want to stick to the command line and use an editor such as Vim or nano. It is worth your time to play around with the editors and find your preferences. Web IDEs In addition to the command line and GUI editors, there are a couple of web-based IDEs. These essentially turn your Pi into a code server, which allows you to run and even execute certain types of code on an IDE written in web languages. These IDEs are great for learning code, but they are not really replacements for the solutions that were listed previously. Google Coder Google Coder is an educational web IDE that was released as an open source project by Google for the Raspberry Pi. Although there is a readily available image for the Raspberry Pi, we can manually install it for the Banana Pi. The following screenshot shows the Google Coder's interface: The setup is fairly straightforward. We will clone the Git repo and install it with Node.js. If you don't have Git and Node.js installed, you can install them with a quick command in the terminal, as follows: sudo apt-get install nodejs npm git Once it is installed, we can clone the coder repo by using the following code: git clone https://github.com/googlecreativelab/coder After it is cloned, we will move into the directory and install it with the help of the following code: cd ~/coder/coder-base/ npm install It may take several minutes to install, even on the Banana Pi. Next, we will edit the config.js file, which will be used to configure the ports and IP addresses. nano config.js The preceding code will reveal the contents of the file. Change the top values to match the following: exports.listenIP = '127.0.0.1'; exports.listenPort = '8081'; exports.httpListenPort = '8080'; exports.cacheApps = true; exports.httpVisiblePort = '8080'; exports.httpsVisiblePort = '8081'; After you change the settings you need, run a server by using Node.js: nodejs server.js You should now be able to connect to the Pi in a browser either on it or on another computer and use Coder. Coder is an educational tool with a lot of different built-in tutorials. You can use Coder to learn JavaScript, CSS, HTML, and jQuery. Adafruit WebIDE Adafruit has developed its own Web IDE, which is designed to run on the Raspberry Pi and BeagleBone. Since we are using the Banana Pi, it will only run better. This IDE is designed to work with Ruby, Python, and JavaScript, to name a few. It includes a terminal via which you can send commands to the Pi from the browser. It is an interesting tool if you wish to learn how to code. The following screenshot shows the interface of the WebIDE: The installation of WebIDE is very simple compared to that of Google Coder, which took several steps. We will just run one command: curl https://raw.githubusercontent.com/adafruit/Adafruit-WebIDE/alpha/scripts/install.sh | sudo sh After a few minutes, you will see an output that indicates that the server is starting. You will be able to access the IDE just like Google Coder—through a browser from another computer or from itself. It should be noted that you will be required to create a free Bit Bucket account to use this software. Summary In this article, we explored several different programming languages, command-line tools, graphical editors, and even some web IDEs. These tools are valuable for all kinds of projects that you may be working on. Resources for Article: Further resources on this subject: Prototyping Arduino Projects using Python [article] Raspberry Pi and 1-Wire [article] The Raspberry Pi and Raspbian [article]

0
0
8186

How-To Tutorials

Process of designing XenDesktop® deployments

Architecting and coding high performance .NET applications

The Essentials of Working with Python Collections

Responsive Web Design with WordPress

Clustering and Other Unsupervised Learning Methods

What is Apache Camel?

Integrating Google Play Services

File Sharing

Working with large data sources

What's BitBake All About?

Trending Topics

Why Meteor Rocks!

Zabbix and I – Almost Heroes

Understanding Mesos Internals

Building BizTalk Server 2013 Applications

Editors and IDEs

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access