Data | 0 articles | Tech News, Tutorials & Expert Insights

article-image-extending-your-structure-and-search

15 Mar 2013

6 min read

Extending Your Structure and Search

15 Mar 2013

0
0
1973

Packt

08 Mar 2013

6 min read

Working with Apps in Splunk

Packt

08 Mar 2013

6 min read

(For more resources related to this topic, see here.) Defining an app In the strictest sense, an app is a directory of configurations and, sometimes, code. The directories and files inside have a particular naming convention and structure. All configurations are in plain text, and can be edited using your choice of text editor. Apps generally serve one or more of the following purposes: A container for searches, dashboards, and related configurations: This is what most users will do with apps. This is not only useful for logical grouping, but also for limiting what configurations are applied and at what time. This kind of app usually does not affect other apps. Providing extra functionality: Many objects can be provided in an app for use by other apps. These include field extractions, lookups, external commands, saved searches, workflow actions, and even dashboards. These apps often have no user interface at all; instead they add functionality to other apps. Configuring a Splunk installation for a specific purpose: In a distributed deployment, there are several different purposes that are served by the multiple installations of Splunk. The behavior of each installation is controlled by its configuration, and it is convenient to wrap those configurations into one or more apps. These apps completely change the behavior of a particular installation. Included apps Without apps, Splunk has no user interface, rendering it essentially useless. Luckily, Splunk comes with a few apps to get us started. Let's look at a few of these apps: gettingstarted: This app provides the help screens that you can access from the launcher. There are no searches, only a single dashboard that simply includes an HTML page. search: This is the app where users spend most of their time. It contains the main search dashboard that can be used from any app, external search commands that can be used from any app, admin dashboards, custom navigation, custom css, a custom app icon, a custom app logo, and many other useful elements. splunk_datapreview: This app provides the data preview functionality in the admin interface. It is built entirely using JavaScript and custom REST endpoints. SplunkDeploymentMonitor: This app provides searches and dashboards to help you keep track of your data usage and the health of your Splunk deployment. It also defines indexes, saved searches, and summary indexes. It is a good source for more advanced search examples. SplunkForwarder and SplunkLightForwarder: These apps, which are disabled by default, simply disable portions of a Splunk installation so that the installation is lighter in weight. If you never create or install another app, and instead simply create saved searches and dashboards in the app search, you can still be quite successful with Splunk. Installing and creating more apps, however, allows you to take advantage of others' work, organize your own work, and ultimately share your work with others. Installing apps Apps can either be installed from Splunkbase or uploaded through the admin interface. To get started, let's navigate to Manager | Apps, or choose Manage apps... from the App menu as shown in the following screenshot: Installing apps from Splunkbase If your Splunk server has direct access to the Internet, you can install apps from Splunkbase with just a few clicks. Navigate to Manager | Apps and click on Find more apps online. The most popular apps will be listed as follows: Let's install a pair of apps and have a little fun. First, install Geo Location Lookup Script (powered by MAXMIND) by clicking on the Install free button. You will be prompted for your splunk.com login. This is the same login that you created when you downloaded Splunk. If you don't have an account, you will need to create one. Next, install the Google Maps app. This app was built by a Splunk customer and contributed back to the Splunk community. This app will prompt you to restart Splunk. Once you have restarted and logged back in, check the App menu. Google Maps is now visible, but where is Geo Location Lookup Script? Remember that not all apps have dashboards; nor do they necessarily have any visible components at all. Using Geo Location Lookup Script Geo Location Lookup Script provides a lookup script to provide geolocation information for IP addresses. Looking at the documentation, we see this example: eventtype=firewall_event | lookup geoip clientip as src_ip You can find the documentation for any Splunkbase app by searching for it at splunkbase.com, or by clicking on Read more next to any installed app by navigating to Manager | Apps | Browse more apps. Let's read through the arguments of the lookup command: geoip: This is the name of the lookup provided by Geo Location Lookup Script. You can see the available lookups by going to Manager | Lookups | Lookup definitions. clientip: This is the name of the field in the lookup that we are matching against. as src_ip: This says to use the value of src_ip to populate the field before it; in this case, clientip. I personally find this wording confusing. In my mind, I read this as "using" instead of "as". Included in the ImplementingSplunkDataGenerator app (available at http://packtpub.com/) is a sourcetype instance named impl_splunk_ips, which looks like this: 2012-05-26T18:23:44 ip=64.134.155.137 The IP addresses in this fictitious log are from one of my websites. Let's see some information about these addresses: sourcetype="impl_splunk_ips" | lookup geoip clientip AS ip | top client_country This gives us a table similar to the one shown in the following screenshot: That's interesting. I wonder who is visiting my site from Slovenia! Using Google Maps Now let's do a similar search in the Google Maps app. Choose Google Maps from the App menu. The interface looks like the standard search interface, but with a map instead of an event listing. Let's try this remarkably similar (but not identical) query using a lookup provided in the Google Maps app: sourcetype="impl_splunk_ips" | lookup geo ip The map generated looks like this: Unsurprisingly, most of the traffic to this little site came from my house in Austin, Texas. Installing apps from a file It is not uncommon for Splunk servers to not have access to the Internet, particularly in a datacenter. In this case, follow these steps: Download the app from splunkbase.com. The file will have a .spl or .tgz extension. Navigate to Manager | Apps. Click on Install app from file. Upload the downloaded file using the form provided. Restart if the app requires it. Configure the app if required. That's it. Some apps have a configuration form. If this is the case, you will see a Set up link next to the app when you go to Manager | Apps. If something goes wrong, contact the author of the app. If you have a distributed environment, in most cases the app only needs to be installed on your search head. The components that your indexers need will be distributed automatically by the search head. Check the documentation for the app.

0
0
4316

article-image-article-constructing-and-evaluating-your-design-solution

Packt

25 Feb 2013

9 min read

Constructing and Evaluating Your Design Solution

Packt

25 Feb 2013

9 min read

(For more resources related to this topic, see here.) For constructing visualizations, technology matters The importance of being able to rationalize options has been a central theme of this book. As we reach the final stage of this journey and we are faced with the challenge of building our visualization solution, the keyword is, once again, choice. The intention of this book has been to focus on offering a handy strategy to help you work through the many design issues and decisions you're faced with. Up to now discussions about issues relating to technology and technical capability have been kept to a minimum in order to elevate the importance of the preparatory and conceptual stages. You have to work through these challenges regardless of what tools or skills you have. However, it is fair to say that to truly master data visualization design, it is inevitable that you will need to achieve technical literacy across a number of different applications and environments. All advanced designers need to be able to rely on a symphony of different tools and capabilities for gathering data, handling, and analyzing it before presenting, and launching the visual design. While we may have great concepts and impressively creative ideas, without the means to convert these into built solutions they will ultimately remain unrealized. The following example, tracking 61 years of tornado activity in the US, demonstrates a project that would have involved a great amount of different analytical and design-based technical skills and would not have been possible without these: Image In contrast to most of the steps that we have covered this far, the choices we make when it comes to producing the final data visualization design are more heavily influenced by capability and access to resources than necessarily the suitability of a given tool. This is something we covered earlier when identifying the key factors that shape what may or may not be possible to achieve. To many, the technology side of data visualization can be quite an overwhelming prospect—trying to harness and master the many different options available, knowing each one's relative strengths and weaknesses, identifying specific function and purpose, keeping on top of the latest developments and trends, and so on. Acquiring a broad technical skillset is clearly not easily accomplished. We touched on the different capability requirements of data visualization in article 2, Setting the Purpose and Identifying Key Factors, in the The "eight hats" of data visualization design section. This highlighted the importance of recognizing your strengths and weaknesses and where your skillset marries up with the varied and numerous demands of visualization design. In order to accommodate the absence of technical skills, in particular, you may need to find a way to collaborate with others or possibly scale down the level of your ambition. Visualization software, applications, and programs The scope of this book does not lend itself to provide a detailed dissection and evaluation of the many different possible tools and resources available for data visualization design. There are so many to choose from and it is a constantly evolving landscape—it feels like each new month sees an additional resource entering the fray. To help, you can find an up-to-date, curated list of the many technology options in this field by visiting http://www.visualisingdata.com/ index.php/resources/. Unlike other design disciplines, there is no single killer tool that does everything. To accommodate the agility of different technical solutions required in this field we have to be prepared to develop a portfolio of capabilities. What follows is a selection of just some of the most common, most useful, and most accessible options for you to consider utilizing and developing experience with. The tools presented have been classified to help you understand their primary purpose or function. Charting and statistical analysis tools This category covers some of the main charting productivity tools and the more effective visual analytics or Business Intelligence (BI) applications that offer powerful visualization capabilities. Microsoft Excel (http://office.microsoft.com/en-gb/excel/) is ubiquitous and has been a staple diet for many of us number crunchers for most of our working lives. Within the data visualization world, Excel's charting capabilities are somewhat derided largely down to the terrible default settings and the range of bad-practice charting functions it enables. (3D cone charts, anyone? No, thank you.) However, Excel does allow you to do much more than you would expect and, when fully exploited, it can prove to be quite a valuable ally. With experience and know-how, you can control and refine many chart properties and you will find that most of your basic charting requirements are met, certainly those that you might associate more with a pragmatic or analytical tone. Image Excel can also be used to serve up chart images for exporting to other applications (such as Illustrator, see later). Search online for the work of Jorge Camoes ( http://www.excelcharts.com/blog/), Jon Peltier (http://peltiertech.com/), and Chandoo (http://chandoo.org/) and you'll find some excellent visualization examples produced in Excel. Tableau (http://www.tableausoftware.com/) is a very powerful and rapid visual analytics application that allows you to potentially connect up millions of records from a range of origins and formats. From there you can quickly construct good practice charts and dashboards to visually explore and present your data. It is available as a licensed desktop or server version as well as a free-to-use public version. Tableau is particularly valuable when it comes to the important stage of data familiarization. When you want to quickly discover the properties, the shapes and quality of your data, Tableau is a great solution. It also enables you to create embeddable interactive visualizations and, like Excel, lets you export charts as images for use in other applications. Image There are many excellent Tableau practitioners out there whose work you should check out, such as Craig Bloodworth (http://www.theinformationlab.co.uk/ blog/), Jérôme Cukier (http://www.jeromecukier.net/), and Ben Jones (http://dataremixed.com/), among many others. While the overall landscape of BI is patchy in terms of its visualization quality, you will find some good additional solutions such as QlikView (http://www.qlikview.com/uk), TIBCO Spotfire (http://spotfire.tibco.com/), Grapheur (http://grapheur. com/), and Panopticon (http://www.panopticon.com/). You will also find that there are many chart production tools available online. Google has created a number of different ways to create visualizations through its Chart Tools (https://developers.google.com/chart/) and Visualization API (https://developers.google.com/chart/interactive/docs/reference) environments. While you can exploit these tools without the need for programming skills, the API platforms do enable developers to enhance the functional and design options themselves. Additionally, Google Fusion Tables (http://www.google.com/drive/start/ apps.html) offers a convenient method for publishing simple choropleth maps, timelines, and a variety of reasonably interactive charts. Image Other notable browser-based tools for analyzing data and creating embeddable or exportable data visualizations include DataWrapper (http://datawrapper.de/) and Polychart (http://polychart.com/). One of the first online offerings was Many Eyes, created by the IBM Visual Communications Lab in 2007, though ongoing support and development has lapsed. Many Eyes introduced many to Wordle (http://www.wordle.net/) a popular tool for visualizing the frequency of words used in text via "word clouds". Note, however, the novelty of this type of display has long since worn off for many people (especially please stop using it as your final PowerPoint slide in presentations!). Programming environments The ultimate capability in visualization design is to have complete control over the characteristics and behavior of every mark, property, and user-driven event on a chart or graph. The only way to fundamentally achieve this level of creative control is through the command of one or a range of programming languages. Until recent times one of the most important and popular options was Adobe Flash (http://www.adobe.com/uk/products/flash.html), a powerful and creative environment for animated and multimedia designs. Flash was behind many prominent interactive visualization designs in the field. However, Apple's decision to not support Flash on its mobile platforms effectively signaled the beginning of the end. Subsequently, most contemporary visualization programmers are focusing their developments on a range of powerful JavaScript environments and libraries. D3.js (http://d3js.org/) is the newest and coolest kid in town. Launched in 2011 from the Stanford Visualization Group that previously brought us Protovis (no longer in active development) this is a JavaScript library that has rapidly evolved into to the major player in interactive visualization terms. D3 enables you to take full creative control over your entire visualization design (all data representation and presentation features) to create incredibly smooth, expressive, and immersive interactive visualizations. Mike Bostock, the key creative force behind D3 and who now works at the New York Times, has an incredible portfolio of examples (http://bost.ocks.org/mike/) and you should also take a look at the work and tutorials provided by another D3 "hero", Scott Murray (http://alignedleft.com/). Image D3 and Flash are particularly popular (or have been popular, in the latter's case) because they are suitable for creating interactive projects to work in the browser. Over the past decade, Processing (http://processing.org/) has reigned as one of the most important solutions for creating powerful, generative, and animated visualizations that sit outside the browser, either as video, a separate application, or an installation. As an open source language it has built a huge following of creative programmers, designers, and artists look to optimize the potential of data representation and expression. There is a large and dynamic community of experts, authors, and tutorial writers that provide wonderful resources for anyone interested in picking up capabilities in this environment. There are countless additional JavaScript libraries and plugins that offer specialist capability, such as Paper.js (http://paperjs.org/) and Raphaël (http:// raphaeljs.com/), to really maximize your programming opportunities.

0
0
2629

Packt

19 Feb 2013

9 min read

Getting Started with InnoDB

Packt

19 Feb 2013

9 min read

(For more resources related to this topic, see here.) Basic features of InnoDB InnoDB is more than a fast disk-based relational database engine. It offers, at its core, the following features that separate it from other disk-based engines: MVCC ACID compliance Transaction support Row-level locking These features are responsible for providing what is known as Referential integrity; a core requirement for enterprise database applications. Referential integrity Referential integrity can be best thought of as the ability for the database application to store relational data in multiple tables with consistency. If a database lacks consistency between relational data, the data cannot be relied upon for applications. If, for example, an application stores financial transactions where monetary data is processed, referential integrity and consistency of transactional data is a key component. Financial data is not the only case where this is an important feature, as many applications store and process sensitive data that must be consistent Multiversion concurrency control A vital component is Multiversion concurrency control (MVCC), which is a control process used by databases to ensure that multiple concurrent connections can see and access consistent states of data over time. A common scenario relying on MVCC can be thought of as follows: data exists in a table and an application connection accesses that data, then a second connection accesses the same original data set while the first connection is making changes to it; since the first connection has not finalized its changes and committed its information we don't want the second connection to see the nonfinalized data. Thus two versions of the data exist at the same time—multiple versions—to allow the database to control the concurrent state of the data. MVCC also provides for the existence of point-in-time consistent views, where multiple versions of data are kept and are available for access based on their point-in-time existence. Transaction isolation Transaction support at the database level refers to the ability for units of work to be processed in separate units of execution from others. This isolation of data execution allows each database connection to manipulate, read, and write information at the same time without conflicting with each other. Transactions allow connections to operate on data on an all-or-nothing operation, so that if the transaction completes successfully it will be written to disk and recorded for upcoming transactions to then operate on. However, if the sequence of changes to the data in the transaction process do not complete then they can be rolled back, and no changes will be recorded to disk. This allows sequences of execution that contain multiple steps to fully succeed only if all of the changes complete, and to roll back any changed data to its original state if one or more of the sequence of changes in the transaction fail. This feature guarantees that the data remains consistent and referentially safe. ACID compliance An integral part of InnoDB is its ability to ensure that data is atomic, consistent, isolated, and durable; these features make up components of ACID compliance. Simply put, atomicity requires that if a transaction fails then the changes are rolled back and not committed. Consistency requires that each successfully executed transaction will move the database ahead in time from one state to the next in a consistent manner without errors or data integrity issues. Isolation defines that each transaction will see separate sets of data in time and not conflict with other transactional data access. Finally, the durability clause ensures that any data that has been committed in a successful transaction will be written to disk in its final state, without the risk of data loss from errors or system failure, and will then be available to transactions that come in the future. Locking characteristics Finally, InnoDB differs from other on-disk storage engines in that it offers row-level locking. This primarily differs, in the MySQL world, with the MyISAM storage engine which features table-level locking. Locking refers to an internal operation of the database that prohibits reading or writing of table data by connections if another is currently using that data. This prevents concurrent connections from causing data corruption or forcing data invalidation when data is in use. The primary difference between table- and row-level locking is that when a connection requests data from a table it can either lock the row of data being accessed or the whole table of data being accessed. For performance and concurrency benefits, row-level locking excels. System requirements and supported platforms InnoDB can be used on all platforms on which MySQL can be installed. These include: Linux: RPM, Deb, Tar BSDs: FreeBSD, OpenBSD, NetBSD Solaris and OpenSolaris / Illumos: SPARC + Intel IBM AIX HP-UX Mac OSX Windows 32 bit and 64 bit There are also custom ports of MySQL from the open source community for running MySQL on various embedded platforms and non-standard operating systems. Hardware-wise, MySQL and correspondingly InnoDB, will run on a wide variety of hardware, which at the time of this writing includes: Intel x86 32 bit AMD/Intel x 86_64 Intel Itanium IA-64 IBM Power architecture Apple's PPC PA-RISC 1.0 + 2.0 SPARC 32 + 64 bit Keep in mind when installing and configuring InnoDB, depending on the architecture in which it is installed, it will have certain options available and enabled that are not available on all platforms. In addition to the underlying hardware, the operating system will also determine whether certain configuration options are available and the range to which some variables can be set. One of the more decisively important differences to be considered while choosing an operating system for your database server is the manner in which the operating system and underlying filesystem handles write caching and write flushes to the disk storage subsystem. These operating system abilities can cause a dramatic difference in the performance of InnoDB, often to the order of 10 times the concurrency ability. When reading the MySQL documentation you may find that InnoDB has over fifty-eight configuration settings, more or less depending on the version, for tuning the performance and operational defaults. The majority of these default settings can be left alone for development and production server environments. However, there are several core settings that can affect great change, in either positive or negative directions depending on the application workload and hardware resource limits, with which every MySQL database administrator should be familiar and proficient. Keep in mind when setting values that some variables are considered dynamic while others are static; dynamic variables can be changed during runtime and do not require a process restart while static variables can only be changed prior to process start, so any changes made to static variables during runtime will only take effect upon the next restart of the database server process. Dynamic variables can be changed on the MySQL command line via the following command: mysql> SET GLOBAL [variable]=[value]; If a value is changed on the command line, it should also be updated in the global my.cnf configuration file so that the change is applied during each restart. MySQL memory allocation equations Before tuning any InnoDB configuration settings—memory buffers in particular—we need to understand how MySQL allocates memory to various areas of the application that handles RAM. There are two simple equations for referencing total memory usage that allocate memory based on incoming client connections: Per-thread buffers: Per-thread buffers, also called per-connection buffers since MySQL uses a separate thread for each connection, operate in contrast to global buffers in that per-thread buffers only allocate memory when a connection is made and in some cases will only allocate as much memory as the connection's workload requires, thus not necessarily utilizing the entire size of the allowable buffer. This memory utilization method is described in the MySQL manual as follows: Each client thread is associated with a connection buffer and a result buffer. Both begin with a size given by net_buffer_length but are dynamically enlarged up to max_allowed_packet bytes as needed. The result buffer shrinks to net_buffer_length after each SQL statement. Global buffers: Global buffers are allocated memory resources regardless of the number of connections being handled. These buffers request their memory requirements during the startup process and retain this reservation of resources until the server process has ended. When allocating memory to MySQL buffers we need to ensure that there is also enough RAM available for the operating system to perform its tasks and processes; in general it is a best practice to limit MySQL between 85 to 90 percent allocation of total system RAM. The memory utilization equations for each of the buffers is given as follows: Per-thread Buffer memory utilization equation: (read_buffer_size + read_rnd_buffer_size + sort_buffer_size + thread_stack + join_buffer_size + binlog_cache_size) * max_connections = total memory allocation for all connections, or MySQL Thread Buffers (MTB) Global Buffer memory utilization equation: innodb_buffer_pool_size + innodb_additional_mem_pool_size + innodb_ log_buffer_size + key_buffer_size + query_cache_size = total memory used by MySQL Global Buffers (MGB) Total memory allocation equation: MTB + MGB = Total Memory Used by MySQL If the total memory used by the combination of MTB and MGB is greater than 85 to 90 percent of the total system RAM then you may experience resource contention, a resource bottleneck, or in the worst case you will see memory pages swapping to on-disk resources (virtual memory) which results in performance degradation and, in some cases, process failure or connection timeouts. Therefore it is wise to check memory allocation via the equations mentioned previously before making changes to the memory buffers or increasing the value of max_connections to the database. More information about how MySQL manages memory and threads can be read about in the following pages of the MySQL documentation: http://dev.mysql.com/doc/refman/5.5/en/connection-threads.html http://dev.mysql.com/doc/refman/5.5/en/memory-use.html Summary This article provided a quick overview of the core terminology and basic features, system requirements, and a few memory allocation equations. Resources for Article : Further resources on this subject: Configuring MySQL [Article] Optimizing your MySQL Servers' performance using Indexes [Article] Indexing in MySQL Admin [Article]

0
0
1891

article-image-marker-based-augmented-reality-iphone-or-ipad

Packt

01 Feb 2013

23 min read

Marker-based Augmented Reality on iPhone or iPad

Packt

01 Feb 2013

23 min read

(For more resources related to this topic, see here.) Creating an iOS project that uses OpenCV In this section we will create a demo application for iPhone/iPad devices that will use the OpenCV ( Open Source Computer Vision ) library to detect markers in the camera frame and render 3D objects on it. This example will show you how to get access to the raw video data stream from the device camera, perform image processing using the OpenCV library, find a marker in an image, and render an AR overlay. We will start by first creating a new XCode project by choosing the iOS Single View Application template, as shown in the following screenshot: Now we have to add OpenCV to our project. This step is necessary because in this application we will use a lot of functions from this library to detect markers and estimate position position. OpenCV is a library of programming functions for real-time computer vision. It was originally developed by Intel and is now supported by Willow Garage and Itseez. This library is written in C and C++ languages. It also has an official Python binding and unofficial bindings to Java and .NET languages. Adding OpenCV framework Fortunately the library is cross-platform, so it can be used on iOS devices. Starting from version 2.4.2, OpenCV library is officially supported on the iOS platform and you can download the distribution package from the library website at http://opencv.org/. The OpenCV for iOS link points to the compressed OpenCV framework. Don't worry if you are new to iOS development; a framework is like a bundle of files. Usually each framework package contains a list of header files and list of statically linked libraries. Application frameworks provide an easy way to distribute precompiled libraries to developers. Of course, you can build your own libraries from scratch. OpenCV documentation explains this process in detail. For simplicity, we follow the recommended way and use the framework for this article. After downloading the file we extract its content to the project folder, as shown in the following screenshot: To inform the XCode IDE to use any framework during the build stage, click on Project options and locate the Build phases tab. From there we can add or remove the list of frameworks involved in the build process. Click on the plus sign to add a new framework, as shown in the following screenshot: From here we can choose from a list of standard frameworks. But to add a custom framework we should click on the Add other button. The open file dialog box will appear. Point it to opencv2.framework in the project folder as shown in the following screenshot: Including OpenCV headers Now that we have added the OpenCV framework to the project, everything is almost done. One last thing—let's add OpenCV headers to the project's precompiled headers. The precompiled headers are a great feature to speed up compilation time. By adding OpenCV headers to them, all your sources automatically include OpenCV headers as well. Find a .pch file in the project source tree and modify it in the following way. The following code shows how to modify the .pch file in the project source tree: // // Prefix header for all source files of the 'Example_MarkerBasedAR' // #import <Availability.h> #ifndef __IPHONE_5_0 #warning "This project uses features only available in iOS SDK 5.0 and later." #endif #ifdef __cplusplus #include <opencv2/opencv.hpp> #endif #ifdef __OBJC__ #import <UIKit/UIKit.h> #import <Foundation/Foundation.h> #endif Now you can call any OpenCV function from any place in your project. That's all. Our project template is configured and we are ready to move further. Free advice: make a copy of this project; this will save you time when you are creating your next one! Application architecture Each iOS application contains at least one instance of the UIViewController interface that handles all view events and manages the application's business logic. This class provides the fundamental view-management model for all iOS apps. A view controller manages a set of views that make up a portion of your app's user interface. As part of the controller layer of your app, a view controller coordinates its efforts with model objects and other controller objects—including other view controllers—so your app presents a single coherent user interface. The application that we are going to write will have only one view; that's why we choose a Single-View Application template to create one. This view will be used to present the rendered picture. Our ViewController class will contain three major components that each AR application should have (see the next diagram): Video source Processing pipeline Visualization engine The video source is responsible for providing new frames taken from the built-in camera to the user code. This means that the video source should be capable of choosing a camera device (front- or back-facing camera), adjusting its parameters (such as resolution of the captured video, white balance, and shutter speed), and grabbing frames without freezing the main UI. The image processing routine will be encapsulated in the MarkerDetector class. This class provides a very thin interface to user code. Usually it's a set of functions like processFrame and getResult. Actually that's all that ViewController should know about. We must not expose low-level data structures and algorithms to the view layer without strong necessity. VisualizationController contains all logic concerned with visualization of the Augmented Reality on our view. VisualizationController is also a facade that hides a particular implementation of the rendering engine. Low code coherence gives us freedom to change these components without the need to rewrite the rest of your code. Such an approach gives you the freedom to use independent modules on other platforms and compilers as well. For example, you can use the MarkerDetector class easily to develop desktop applications on Mac, Windows, and Linux systems without any changes to the code. Likewise, you can decide to port VisualizationController on the Windows platform and use Direct3D for rendering. In this case you should write only new VisualizationController implementation; other code parts will remain the same. The main processing routine starts from receiving a new frame from the video source. This triggers video source to inform the user code about this event with a callback. ViewController handles this callback and performs the following operations: Sends a new frame to the visualization controller. Performs processing of the new frame using our pipeline. Sends the detected markers to the visualization stage. Renders a scene. Let's examine this routine in detail. The rendering of an AR scene includes the drawing of a background image that has a content of the last received frame; artificial 3D objects are drawn later on. When we send a new frame for visualization, we are copying image data to internal buffers of the rendering engine. This is not actual rendering yet; we are just updating the text with a new bitmap. The second step is the processing of new frame and marker detection. We pass our image as input and as a result receive a list of the markers detected. on it. These markers are passed to the visualization controller, which knows how to deal with them. Let's take a look at the following sequence diagram where this routine is shown: We start development by writing a video capture component. This class will be responsible for all frame grabbing and for sending notifications of captured frames via user callback. Later on we will write a marker detection algorithm. This detection routine is the core of your application. In this part of our program we will use a lot of OpenCV functions to process images, detect contours on them, find marker rectangles, and estimate their position. After that we will concentrate on visualization of our results using Augmented Reality. After bringing all these things together we will complete our first AR application. So let's move on! Accessing the camera The Augmented Reality application is impossible to create without two major things: video capturing and AR visualization. The video capture stage consists of receiving frames from the device camera, performing necessary color conversion, and sending it to the processing pipeline. As the single frame processing time is so critical to AR applications, the capture process should be as efficient as possible. The best way to achieve maximum performance is to have direct access to the frames received from the camera. This became possible starting from iOS Version 4. Existing APIs from the AVFoundation framework provide the necessary functionality to read directly from image buffers in memory. You can find a lot of examples that use the AVCaptureVideoPreviewLayer class and the UIGetScreenImage function to capture videos from the camera. This technique was used for iOS Version 3 and earlier. It has now become outdated and has two major disadvantages: Lack of direct access to frame data. To get a bitmap, you have to create an intermediate instance of UIImage, copy an image to it, and get it back. For AR applications this price is too high, because each millisecond matters. Losing a few frames per second (FPS) significantly decreases overall user experience. To draw an AR, you have to add a transparent overlay view that will present the AR. Referring to Apple guidelines, you should avoid non-opaque layers because their blending is hard for mobile processors. Classes AVCaptureDevice and AVCaptureVideoDataOutput allow you to configure, capture, and specify unprocessed video frames in 32 bpp BGRA format. Also you can set up the desired resolution of output frames. However, it does affect overall performance since the larger the frame the more processing time and memory is required. There is a good alternative for high-performance video capture. The AVFoundation API offers a much faster and more elegant way to grab frames directly from the camera. But first, let's take a look at the following figure where the capturing process for iOS is shown: AVCaptureSession is a root capture object that we should create. Capture session requires two components—an input and an output. The input device can either be a physical device (camera) or a video file (not shown in diagram). In our case it's a built-in camera (front or back). The output device can be presented by one of the following interfaces: AVCaptureMovieFileOutput AVCaptureStillImageOutput AVCaptureVideoPreviewLayer AVCaptureVideoDataOutput The AVCaptureMovieFileOutput interface is used to record video to the file, the AVCaptureStillImageOutput interface is used to to make still images, and the AVCaptureVideoPreviewLayer interface is used to play a video preview on the screen. We are interested in the AVCaptureVideoDataOutput interface because it gives you direct access to video data. The iOS platform is built on top of the Objective-C programming language. So to work with AVFoundation framework, our class also has to be written in Objective-C. In this section all code listings are in the Objective-C++ language. To encapsulate the video capturing process, we create the VideoSource interface as shown by the following code: @protocol VideoSourceDelegate<NSObject> -(void)frameReady:(BGRAVideoFrame) frame; @end @interface VideoSource : NSObject<AVCaptureVideoDataOutputSampleBuffe rDelegate> { } @property (nonatomic, retain) AVCaptureSession *captureSession; @property (nonatomic, retain) AVCaptureDeviceInput *deviceInput; @property (nonatomic, retain) id<VideoSourceDelegate> delegate; - (bool) startWithDevicePosition:(AVCaptureDevicePosition) devicePosition; - (CameraCalibration) getCalibration; - (CGSize) getFrameSize; @end In this callback we lock the image buffer to prevent modifications by any new frames, obtain a pointer to the image data and frame dimensions. Then we construct temporary BGRAVideoFrame object that is passed to outside via special delegate. This delegate has following prototype: @protocol VideoSourceDelegate<NSObject> -(void)frameReady:(BGRAVideoFrame) frame; @end Within VideoSourceDelegate, the VideoSource interface informs the user code that a new frame is available. The step-by-step guide for the initialization of video capture is listed as follows: Create an instance of AVCaptureSession and set the capture session quality preset. Choose and create AVCaptureDevice. You can choose the front- or backfacing camera or use the default one. Initialize AVCaptureDeviceInput using the created capture device and add it to the capture session. Create an instance of AVCaptureVideoDataOutput and initialize it with format of video frame, callback delegate, and dispatch the queue. Add the capture output to the capture session object. Start the capture session. Let's explain some of these steps in more detail. After creating the capture session, we can specify the desired quality preset to ensure that we will obtain optimal performance. We don't need to process HD-quality video, so 640 x 480 or an even lesser frame resolution is a good choice: - (id)init { if ((self = [super init])) { AVCaptureSession * capSession = [[AVCaptureSession alloc] init]; if ([capSession canSetSessionPreset:AVCaptureSessionPreset64 0x480]) { [capSession setSessionPreset:AVCaptureSessionPreset640x480]; NSLog(@"Set capture session preset AVCaptureSessionPreset640x480"); } else if ([capSession canSetSessionPreset:AVCaptureSessionPresetL ow]) { [capSession setSessionPreset:AVCaptureSessionPresetLow]; NSLog(@"Set capture session preset AVCaptureSessionPresetLow"); } self.captureSession = capSession; } return self; } Always check hardware capabilities using the appropriate API; there is no guarantee that every camera will be capable of setting a particular session preset. After creating the capture session, we should add the capture input—the instance of AVCaptureDeviceInput will represent a physical camera device. The cameraWithPosition function is a helper function that returns the camera device for the requested position (front, back, or default): - (bool) startWithDevicePosition:(AVCaptureDevicePosition) devicePosition { AVCaptureDevice *videoDevice = [self cameraWithPosition:devicePosit ion]; if (!videoDevice) return FALSE; { NSError *error; AVCaptureDeviceInput *videoIn = [AVCaptureDeviceInput deviceInputWithDevice:videoDevice error:&error]; self.deviceInput = videoIn; if (!error) { if ([[self captureSession] canAddInput:videoIn]) { [[self captureSession] addInput:videoIn]; } else { NSLog(@"Couldn't add video input"); return FALSE; } } else { NSLog(@"Couldn't create video input"); return FALSE; } } [self addRawViewOutput]; [captureSession startRunning]; return TRUE; } Please notice the error handling code. Take care of return values for such an important thing as working with hardware setup is a good practice. Without this, your code can crash in unexpected cases without informing the user what has happened. We created a capture session and added a source of the video frames. Now it's time to add a receiver—an object that will receive actual frame data. The AVCaptureVideoDataOutput class is used to process uncompressed frames from the video stream. The camera can provide frames in BGRA, CMYK, or simple grayscale color models. For our purposes the BGRA color model fits best of all, as we will use this frame for visualization and image processing. The following code shows the addRawViewOutput function: - (void) addRawViewOutput { /*We setupt the output*/ AVCaptureVideoDataOutput *captureOutput = [[AVCaptureVideoDataOutput alloc] init]; /*While a frame is processes in -captureOutput:didOutputSampleBuff er:fromConnection: delegate methods no other frames are added in the queue. If you don't want this behaviour set the property to NO */ captureOutput.alwaysDiscardsLateVideoFrames = YES; /*We create a serial queue to handle the processing of our frames*/ dispatch_queue_t queue; queue = dispatch_queue_create("com.Example_MarkerBasedAR. cameraQueue", NULL); [captureOutput setSampleBufferDelegate:self queue:queue]; dispatch_release(queue); // Set the video output to store frame in BGRA (It is supposed to be faster) NSString* key = (NSString*)kCVPixelBufferPixelFormatTypeKey; NSNumber* value = [NSNumber numberWithUnsignedInt:kCVPixelFormatType_32BGRA]; NSDictionary* videoSettings = [NSDictionary dictionaryWithObject:value forKey:key]; [captureOutput setVideoSettings:videoSettings]; // Register an output [self.captureSession addOutput:captureOutput]; } Now the capture session is finally configured. When started, it will capture frames from the camera and send it to user code. When the new frame is available, an AVCaptureSession object performs a captureOutput: didOutputSampleBuffer:fromConnection callback. In this function, we will perform a minor data conversion operation to get the image data in a more usable format and pass it to user code: - (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection { // Get a image buffer holding video frame CVImageBufferRef imageBuffer = CMSampleBufferGetImageBuffer (sampleB uffer); // Lock the image buffer CVPixelBufferLockBaseAddress(imageBuffer,0); // Get information about the image uint8_t *baseAddress = (uint8_t *)CVPixelBufferGetBaseAddress(image Buffer); size_t width = CVPixelBufferGetWidth(imageBuffer); size_t height = CVPixelBufferGetHeight(imageBuffer); size_t stride = CVPixelBufferGetBytesPerRow(imageBuffer); BGRAVideoFrame frame = {width, height, stride, baseAddress}; [delegate frameReady:frame]; /*We unlock the image buffer*/ CVPixelBufferUnlockBaseAddress(imageBuffer,0); } We obtain a reference to the image buffer that stores our frame data. Then we lock it to prevent modifications by new frames. Now we have exclusive access to the frame data. With help of the CoreVideo API, we get the image dimensions, stride (number of pixels per row), and the pointer to the beginning of the image data. I draw your attention to the CVPixelBufferLockBaseAddress/ CVPixelBufferUnlockBaseAddress function call in the callback code. Until we hold a lock on the pixel buffer, it guarantees consistency and correctness of its data. Reading of pixels is available only after you have obtained a lock. When you're done, don't forget to unlock it to allow the OS to fill it with new data. Marker detection A marker is usually designed as a rectangle image holding black and white areas inside it. Due to known limitations, the marker detection procedure is a simple one. First of all we need to find closed contours on the input image and unwarp the image inside it to a rectangle and then check this against our marker model. In this sample the 5 x 5 marker will be used. Here is what it looks like: In the sample project that you will find in this book, the marker detection routine is encapsulated in the MarkerDetector class: /** * A top-level class that encapsulate marker detector algorithm */ class MarkerDetector { public: /** * Initialize a new instance of marker detector object * @calibration[in] - Camera calibration necessary for pose estimation. */ MarkerDetector(CameraCalibration calibration); void processFrame(const BGRAVideoFrame& frame); const std::vector<Transformation>& getTransformations() const; protected: bool findMarkers(const BGRAVideoFrame& frame, std::vector<Marker>& detectedMarkers); void prepareImage(const cv::Mat& bgraMat, cv::Mat& grayscale); void performThreshold(const cv::Mat& grayscale, cv::Mat& thresholdImg); void findContours(const cv::Mat& thresholdImg, std::vector<std::vector<cv::Point> >& contours, int minContourPointsAllowed); void findMarkerCandidates(const std::vector<std::vector<cv::Point> >& contours, std::vector<Marker>& detectedMarkers); void detectMarkers(const cv::Mat& grayscale, std::vector<Marker>& detectedMarkers); void estimatePosition(std::vector<Marker>& detectedMarkers); private: }; To help you better understand the marker detection routine, a step-by-step processing on one frame from a video will be shown. A source image taken from an iPad camera will be used as an example: Marker identification Here is the workflow of the marker detection routine: Convert the input image to grayscale. Perform binary threshold operation. Detect contours. Search for possible markers. Detect and decode markers. Estimate marker 3D pose. Grayscale conversion The conversion to grayscale is necessary because markers usually contain only black and white blocks and it's much easier to operate with them on grayscale images. Fortunately, OpenCV color conversion is simple enough. Please take a look at the following code listing in C++: void MarkerDetector::prepareImage(const cv::Mat& bgraMat, cv::Mat& grayscale) { // Convert to grayscale cv::cvtColor(bgraMat, grayscale, CV_BGRA2GRAY); } This function will convert the input BGRA image to grayscale (it will allocate image buffers if necessary) and place the result into the second argument. All further steps will be performed with the grayscale image. Image binarization The binarization operation will transform each pixel of our image to black (zero intensity) or white (full intensity). This step is required to find contours. There are several threshold methods; each has strong and weak sides. The easiest and fastest method is absolute threshold. In this method the resulting value depends on current pixel intensity and some threshold value. If pixel intensity is greater than the threshold value, the result will be white (255); otherwise it will be black (0). This method has a huge disadvantage—it depends on lighting conditions and soft intensity changes. The more preferable method is the adaptive threshold. The major difference of this method is the use of all pixels in given radius around the examined pixel. Using average intensity gives good results and secures more robust corner detection. The following code snippet shows the MarkerDetector function: void MarkerDetector::performThreshold(const cv::Mat& grayscale, cv::Mat& thresholdImg) { cv::adaptiveThreshold(grayscale, // Input image thresholdImg,// Result binary image 255, // cv::ADAPTIVE_THRESH_GAUSSIAN_C, // cv::THRESH_BINARY_INV, // 7, // 7 // ); } After applying adaptive threshold to the input image, the resulting image looks similar to the following one: Each marker usually looks like a square figure with black and white areas inside it. So the best way to locate a marker is to find closed contours and approximate them with polygons of 4 vertices. Contours detection The cv::findCountours function will detect contours on the input binary image: void MarkerDetector::findContours(const cv::Mat& thresholdImg, std::vector<std::vector<cv::Point> >& contours, int minContourPointsAllowed) { std::vector< std::vector<cv::Point> > allContours; cv::findContours(thresholdImg, allContours, CV_RETR_LIST, CV_ CHAIN_APPROX_NONE); contours.clear(); for (size_t i=0; i<allContours.size(); i++) { int contourSize = allContours[i].size(); if (contourSize > minContourPointsAllowed) { contours.push_back(allContours[i]); } } } The return value of this function is a list of polygons where each polygon represents a single contour. The function skips contours that have their perimeter in pixels value set to be less than the value of the minContourPointsAllowed variable. This is because we are not interested in small contours. (They will probably contain no marker, or the contour won't be able to be detected due to a small marker size.) The following figure shows the visualization of detected contours: Candidates search After finding contours, the polygon approximation stage is performed. This is done to decrease the number of points that describe the contour shape. It's a good quality check to filter out areas without markers because they can always be represented with a polygon that contains four vertices. If the approximated polygon has more than or fewer than 4 vertices, it's definitely not what we are looking for. The following code implements this idea: void MarkerDetector::findCandidates ( const ContoursVector& contours, std::vector<Marker>& detectedMarkers ) { std::vector<cv::Point> approxCurve; std::vector<Marker> possibleMarkers; // For each contour, analyze if it is a parallelepiped likely to be the marker for (size_t i=0; i<contours.size(); i++) { // Approximate to a polygon double eps = contours[i].size() * 0.05; cv::approxPolyDP(contours[i], approxCurve, eps, true); // We interested only in polygons that contains only four points if (approxCurve.size() != 4) continue; // And they have to be convex if (!cv::isContourConvex(approxCurve)) continue; // Ensure that the distance between consecutive points is large enough float minDist = std::numeric_limits<float>::max(); for (int i = 0; i < 4; i++) { cv::Point side = approxCurve[i] - approxCurve[(i+1)%4]; float squaredSideLength = side.dot(side); minDist = std::min(minDist, squaredSideLength); } // Check that distance is not very small if (minDist < m_minContourLengthAllowed) continue; // All tests are passed. Save marker candidate: Marker m; for (int i = 0; i<4; i++) m.points.push_back( cv::Point2f(approxCurve[i].x,approxCu rve[i].y) ); // Sort the points in anti-clockwise order // Trace a line between the first and second point. // If the third point is at the right side, then the points are anticlockwise cv::Point v1 = m.points[1] - m.points[0]; cv::Point v2 = m.points[2] - m.points[0]; double o = (v1.x * v2.y) - (v1.y * v2.x); if (o < 0.0) //if the third point is in the left side, then sort in anti-clockwise order std::swap(m.points[1], m.points[3]); possibleMarkers.push_back(m); } // Remove these elements which corners are too close to each other. // First detect candidates for removal: std::vector< std::pair<int,int> > tooNearCandidates; for (size_t i=0;i<possibleMarkers.size();i++) { const Marker& m1 = possibleMarkers[i]; //calculate the average distance of each corner to the nearest corner of the other marker candidate for (size_t j=i+1;j<possibleMarkers.size();j++) { const Marker& m2 = possibleMarkers[j]; float distSquared = 0; for (int c = 0; c < 4; c++) { cv::Point v = m1.points[c] - m2.points[c]; distSquared += v.dot(v); } distSquared /= 4; if (distSquared < 100) { tooNearCandidates.push_back(std::pair<int,int>(i,j)); } } } // Mark for removal the element of the pair with smaller perimeter std::vector<bool> removalMask (possibleMarkers.size(), false); for (size_t i=0; i<tooNearCandidates.size(); i++) { float p1 = perimeter(possibleMarkers[tooNearCandidates[i]. first ].points); float p2 = perimeter(possibleMarkers[tooNearCandidates[i].second]. points); size_t removalIndex; if (p1 > p2) removalIndex = tooNearCandidates[i].second; else removalIndex = tooNearCandidates[i].first; removalMask[removalIndex] = true; } // Return candidates detectedMarkers.clear(); for (size_t i=0;i<possibleMarkers.size();i++) { if (!removalMask[i]) detectedMarkers.push_back(possibleMarkers[i]); } } Now we have obtained a list of parallelepipeds that are likely to be the markers. To verify whether they are markers or not, we need to perform three steps: First, we should remove the perspective projection so as to obtain a frontal view of the rectangle area. Then we perform thresholding of the image using the Otsu algorithm. This algorithm assumes a bimodal distribution and finds the threshold value that maximizes the extra-class variance while keeping a low intra-class variance. Finally we perform identification of the marker code. If it is a marker, it has an internal code. The marker is divided into a 7 x 7 grid, of which the internal 5 x 5 cells contain ID information. The rest correspond to the external black border. Here, we first check whether the external black border is present. Then we read the internal 5 x 5 cells and check if they provide a valid code. (It might be required to rotate the code to get the valid one.) To get the rectangle marker image, we have to unwarp the input image using perspective transformation. This matrix can be calculated with the help of the cv::getPerspectiveTransform function. It finds the perspective transformation from four pairs of corresponding points. The first argument is the marker coordinates in image space and the second point corresponds to the coordinates of the square marker image. Estimated transformation will transform the marker to square form and let us analyze it: cv::Mat canonicalMarker; Marker& marker = detectedMarkers[i]; // Find the perspective transfomation that brings current marker to rectangular form cv::Mat M = cv::getPerspectiveTransform(marker.points, m_ markerCorners2d); // Transform image to get a canonical marker image cv::warpPerspective(grayscale, canonicalMarker, M, markerSize); Image warping transforms our image to a rectangle form using perspective transformation: Now we can test the image to verify if it is a valid marker image. Then we try to extract the bit mask with the marker code. As we expect our marker to contain only black and white colors, we can perform Otsu thresholding to remove gray pixels and leave only black and white pixels: //threshold image cv::threshold(markerImage, markerImage, 125, 255, cv::THRESH_BINARY | cv::THRESH_OTSU);

0
0
10994

article-image-oracle-using-metadata-service-share-xml-artifacts

Packt

23 Jan 2013

11 min read

Oracle: Using the Metadata Service to Share XML Artifacts

Packt

23 Jan 2013

11 min read

(For more resources related to this topic, see here.) The WSDL of a web service is made up of the following XML artifacts: WSDL Definition: It defines the various operations that constitute a service, their input and output parameters, and the protocols (bindings) they support. XML Schema Definition (XSD): It is either embedded within the WSDL definition or referenced as a standalone component; this defines the XML elements and types that constitute the input and output parameters. To better facilitate the exchange of data between services, as well as achieve better interoperability and re-usability, it is good practice to de?ne a common set of XML Schemas, often referred to as the canonical data model, which can be referenced by multiple services (or WSDL De?nitions). This means, we will need to share the same XML schema across multiple composites. While typically a service (or WSDL) will only be implemented by a single composite, it will often be invoked by multiple composites; so the corresponding WSDL will be shared across multiple composites. Within JDeveloper, the default behavior, when referencing a predefined schema or WSDL, is for it to add a copy of the file to our SOA project. However, if we have several composites, each referencing their own local copy of the same WSDL or XML schema, then every time that we need to change either the schema or WSDL, we will be required to update every copy. This can be a time-consuming and error-prone approach; a better approach is to have a single copy of each WSDL and schema that is referenced by all composites. The SOA infrastructure incorporates a Metadata Service (MDS), which allows us to create a library of XML artifacts that we can share across SOA composites. MDS supports two types of repositories: File-based repository: This is quicker and easier to set up, and so is typically used as the design-time MDS by JDeveloper. Database repository: It is installed as part of the SOA infrastructure. This is used at runtime by the SOA infrastructure. As you move projects from one environment to another (for example, from test to production), you must typically modify several environment-specific values embedded within your composites, such as the location of a schema or the endpoint of a referenced web service. By placing all this information within the XML artifacts deployed to MDS, you can make your composites completely agnostic of the environment they are to be deployed to. The other advantage of placing all your referenced artifacts in MDS is that it removes any direct dependencies between composites, which means that they can be deployed and started in any order (once you have deployed the artifacts to MDS). In addition, an SOA composite leverages many other XML artifacts, such as fault policies, XSLT Transformations, EDLs for event EDN event definitions, and Schematrons for validation, each of which may need to be shared across multiple composites. These can also be shared between composites by placing them in MDS. Defining a project structure Before placing all our XML artifacts into MDS, we need to define a standard file structure for our XML library. This allows us to ensure that if any XML artifact within our XML library needs to reference another XML artifact (for example a WSDL importing a schema), it can do so via a relative reference; in other words, the XML artifact doesn't include any reference to MDS and is portable. This has a number of benefits, including: OSB compatibility; the same schemas and WSDLs can be deployed to the Oracle Service Bus without modification Third-party tool compatibility; often we will use a variety of tools that have no knowledge of MDS to create/edit XML schemas, WSDLs, and so on (for example XML Spy, Oxygen) In this article, we will assume that we have defined the following directory structure under our <src> directory. Under the xmllib folder, we have defined multiple <solution> directories, where a solution (or project) is made up of one or more related composite applications. This allows each solution to maintain its XML artifacts independently. However, it is also likely that there will be a number of XML artifacts that need to be shared between different solutions (for example, the canonical data model for the organization), which in this example would go under <core>. Where we have XML artifacts shared between multiple solutions, appropriate governance is required to manage the changes to these artifacts. For the purpose of this article, the directory structure is over simpli?ed. In reality, a more comprehensive structure should be de?ned as part of the naming and deployment standards for your SOA Reference Architecture. The other consideration here is versioning; over time it is likely that multiple versions of the same schema, WSDL and so on, will require to be deployed side by side. To support this, we typically recommend appending the version number to the filename. We would also recommend that you place this under some form of version control, as it makes it far simpler to ensure that everyone is using an up-to-date version of the XML library. For the purpose of this article, we will assume that you are using Subversion. Creating a file-based MDS repository for JDeveloper Before we can reference this with JDeveloper, we need to define a connection to the file-based MDS. Getting ready By default, a file-based repository is installed with JDeveloper and sits under the directory structure: <JDeveloper Home>/jdeveloper/integration/seed This already contains the subdirectory soa, which is reserved for, and contains, artifacts used by the SOA infrastructure For artifacts that we wish to share across our applications in JDeveloper, we should create the subdirectory apps (under the seed directory); this is critical, as when we deploy the artifacts to the SOA infrastructure, they will be placed in the apps namespace We need to ensure that the content of the apps directory always contains the latest version of our XML library; as these are stored under Subversion, we simply need to check out the right portion of the Subversion project structure. How to do it... First, we need to create and populate our file-based repository. Navigate to the seed directory, and right-click and select SVN Checkout..., this will launch the Subversion Checkout window. For URL of repository, ensure that you specify the path to the apps subdirectory. For Checkout directory, specify the full pathname of the seed directory and append /apps at the end. Leave the other default values, as shown in the following screenshot, and then click on OK: Subversion will check out a working copy of the apps subfolder within Subversion into the seed directory. Before we can reference our XML library with JDeveloper, we need to define a connection to the file-based MDS. Within JDeveloper, from the File menu select New to launch the Gallery, and under Categories select General | Connections | SOA-MDS Connection from the Items list. This will launch the MDS Connection Wizard. Enter File Based MDS for Connection Name and select a Connection Type of File Based MDS. We then need to specify the MDS root folder on our local filesystem; this will be the directory that contains the apps directory, namely: <JDeveloper Home>jdeveloperintegrationseed Click on Test Connection; the Status box should be updated to Success!. Click on OK. This will create a file-based MDS connection in JDeveloper. Browse the File Based MDS connection in JDeveloper. Within JDeveloper, open the Resource Palette and expand SOA-MDS. This should contain the File Based MDS connection that we just created. Expand all the nodes down to the xsd directory, as shown in the following screenshot: If you double-click on one of the schema files, it will open in JDeveloper (in read-only mode). There's more... Once the apps directory has been checked out, it will contain a snapshot of the MDS artifacts at the point in time that you created the checkpoint. Over time, the artifacts in MDS will be modified or new ones will be created. It is important that you ensure that your local version of MDS is updated with the current version. To do this, navigate to the seed directory, right-click on apps, and select SVN Update. Creating Mediator using a WSDL in MDS In this recipe, we will show how we can create Mediator using an interface definition from a WSDL held in MDS. This approach enables us to separate the implementation of a service (a composite) from the definition of its contract (WSDL). Getting ready Make sure you have created a file-based MDS repository for JDeveloper, as described in the first recipe. Create an SOA application with a project containing an empty composite. How to do it... Drag Mediator from SOA Component Palette onto your composite. This will launch the Create Mediator wizard; specify an appropriate name (EmployeeOnBoarding in the following example), and for the Template select Interface Definition from WSDL Click on the Find Existing WSDLs icon (circled in the previous screenshot); this will launch the SOA Resource Browser. Select Resource Palette from the drop-down list (circled in the following screenshot). Select the WSDL that you wish to import and click on OK. This will return you to the Create Mediator wizard window; ensure that the Port Type is populated and click on OK. This will create Mediator based on the specified WSDL within our composite. How it works... When we import the WSDL in this fashion, JDeveloper doesn't actually make a copy of the schema; rather within the componentType file, it sets the wsdlLocation attribute to reference the location of the WSDL in MDS (as highlighted in the following screenshot). For WSDLs in MDS, the wsdlLocation attribute uses the following format: oramds:/apps/<wsdl name> Where oramds indicates that it is located in MDS, apps indicates that it is in the application namespace and <wsdl name> is the full pathname of the WSDL in MDS. The wsdlLocation doesn't specify the physical location of the WSDL; rather it is relative to MDS, which is specific to the environment in which the composite is deployed. This means that when the composite is open in JDeveloper, it will reference the WSDL in the file-based MDS, and when deployed to the SOA infrastructure, it will reference the WSDL deployed to the MDS database repository, which is installed as part of the SOA infrastructure. There's more... This method can be used equally well to create a BPEL process based on the WSDL from within the Create BPEL Process wizard; for Template select Base on a WSDL and follow the same steps. This approach works well with Contract First Design as it enables the contract for a composite to be designed first, and when ready for implementation, be checked into Subversion. The SOA developer can then perform a Subversion update on their file-based MDS repository, and then use the WSDL to implement the composite Creating Mediator that subscribes to EDL in MDS In this recipe, we will show how we can create Mediator that subscribes to an EDN event whose EDL is defined in MDS. This approach enables us to separate the definition of an event from the implementation of a composite that either subscribes to, or publishes, the event. Getting ready Make sure you have created a file-based MDS repository for JDeveloper, as described in the initial recipe. Create an SOA application with a project containing an empty composite. How to do it... Drag Mediator from SOA Component Palette onto your composite. This will launch the Create Mediator wizard; specify an appropriate name for it (UserRegistration in the following example), and for the Template select Subscribe to Events. Click on the Subscribe to new event icon (circled in the previous screenshot); this will launch the Event Chooser window. Click on the Browse for Event Definition (edl) files icon (circled in the previous screenshot); this will launch SOA Resource Browser. Select Resource Palette from the drop-down list. Select the EDL that you wish to import and click on OK. This will return you to the Event Chooser window; ensure that the required event is selected and click on OK. This will return you to the Create Mediator window; ensure that the required event is configured as needed, and click on OK. This will create an event subscription based on the EDL specified within our composite. How it works... When we reference an EDL in MDS, JDeveloper doesn't actually make a copy of the EDL; rather within the composite.xml file, it creates an import statement to reference the location of the EDL in MDS. There's more... This approach can be used equally well to subscribe to an event within a BPEL process or publish an event using either Mediator or BPEL.

0
0
2433

article-image-creating-cartesian-based-graphs

Packt

17 Jan 2013

19 min read

Creating Cartesian-based Graphs

Packt

17 Jan 2013

19 min read

0
0
2612

article-image-sap-hana-integration-microsoft-excel

Packt

03 Jan 2013

4 min read

SAP HANA integration with Microsoft Excel

Packt

03 Jan 2013

4 min read

(For more resources related to this topic, see here.) Once your application is finished inside SAP HANA, and you can see that it performs as expected inside the Studio, you need to be able to deploy it to your users. Asking them to use the Studio is not really practical, and you don’t necessarily want to put the modeling software in the hands of all your users. Reporting on SAP HANA can be done in most of SAP’s Business Objects suite of applications, or in tools which can create and consume MDX queries and data. The simplest of these tools to start with is probably Microsoft Excel. Excel can connect to SAP HANA using the MDX language (a kind of multidimensional SQL) in the form of pivot tables. These in turn allow users to “slice and dice” data as they require, to extract the metrics they need. There are (at time of writing) limitations to the integration with SAP HANA and external reporting tools. These limitations are due to the relative youth of the HANA product, and are being addressed with each successive update to the software. Those listed here are valid for SAP HANA SP04, they may or may not be valid for your version: Hierarchies can only be visualized in Microsoft Excel, not in BusinessObjects Prompts can only be used in Business Objects BI4. Views which use variables can be used in other tools, but only if the variable has a default value (if you don’t have a default value on the variable, then Excel, notably, will complain that the view has been “changed on the server”) In order to make MDX connections to SAP HANA, the SAP HANA Client software is needed. This is separate to the Studio, and must be installed on the client workstation. Like the Studio itself, it can be found on the SAP HANA DVD set, or in the SWDC. Additionally, like the studio, SAP provides a developer download of the client software on SDN, at the following link: http://www.sdn.sap.com/irj/scn/go/portal/prtroot/docs/webcontent/uuid/402aa158-6a7a-2f10-0195-f43595f6fe5f Just download the appropriate version for your Microsoft Office installation. Even if your PC has a 64-bit installation of Windows, you most likely have a 32-bit installation of Office, and you’ll need the 32-bit version of the SAP HANA Client software. If you’re not sure, you can find the information in the Help | About dialog box. In Excel 2010, for example, click on the File tab, then the Help menu entry. The version is specified on the right of the page: Just install the client software like you installed the studio, usually to the default location. Once the software is installed, there is no shortcut created on your desktop, and no entry will be created in your “Start” menu, so don’t be surprised to not see anything to run. We’re going to incorporate our sales simulator in Microsoft Excel, so launch Excel now. Go to the Data tab, and click on From Other Sources, then From Data Connection Wizard, as shown: Next, select Other/Advanced, then SAP HANA MDX provider, and then click Next. The SAP HANA Logon dialog will appear, so enter your Host, Instance, and login information (the same information you use to connect to SAP HANA with the Studio). Click on Test Connection to validate the connection. If the test succeeds, click on OK to choose the CUBE to which you want to connect. In Excel, all your Analytic and Calculation Views are considered to the cubes. Choose your Analytic or Calculation view and click Next. On this screen there’s a checkbox Save password in file – this will avoid having to type in the SAP HANA password every time the Excel file is opened – but the password is stored in the Excel file, which is a little less secure. Click on the Finish button to create the connection to SAP HANA, and your View. On the next screen you’ll be asked where you want to insert the pivot table, just click on OK, to see the results: Congratulations! You now have your reporting application available in Microsoft Excel, showing the same information you could see using the Data Preview feature of the SAP HANA Studio. Resources for Article : Further resources on SAP HANA Starter: SAP NetWeaver: MDM Scenarios and Fundamentals [Article] SAP BusinessObjects: Customizing the Dashboard [Article] SQL Query Basics in SAP Business One [Article]

0
0
4336

article-image-creating-interactive-graphics-and-animation

Packt

02 Jan 2013

15 min read

Creating Interactive Graphics and Animation

Packt

02 Jan 2013

15 min read

(For more resources related to this topic, see here.) Interactive graphics and animations This article showcases MATLAB's capabilities for creating interactive graphics and animations. A static graphic is essentially two dimensional. The ability to rotate the axes and change the view, add annotations in real time, delete data, and zoom in or zoom out adds significantly to the user experience, as the brain is able to process and see more from that interaction. MATLAB supports interactivity with the standard zoom, pan features, a powerful set of camera tools to change the data view, data brushing, and axes linking. The set of functionalities accessible from the figure and camera toolbars are outlined briefly as follows: The steps of interactive exploration can also be recorded and presented as an animation. This is very useful to demonstrate the evolution of the data in time or space or along any dimension where sequence has meaning. Note that some recipes in this article may require you to run the code from the source code files as a whole unit because they were developed as functions. As functions, they are not independently interpretable using the separate code blocks corresponding to each step. Callback functions A mouse drag movement from the top-left corner to bottom-right corner is commonly used for zooming in or selecting a group of objects. You can also program a custom behavior to such an interaction event, by using a callback function. When a specific event occurs (for example, you click on a push button or double-click with your mouse), the corresponding callback function executes. Many event properties of graphics handle objects can be used to define callback functions. In this recipe, you will write callback functions which are essential to implement a slider element to get input from the user on where to create the slice or an isosurface for 3D exploration. You will also see options available to share data between the calling and callback functions. Getting started Load the dataset. Split the data into two main sets—userdataA is a structure with variables related to the demographics and userdataB is a structure with variables related to the Income Groups. Now create a nested structure with these two data structures as shown in the following code snippet: load customCountyData userdataA.demgraphics = demgraphics; userdataA.lege = lege; userdataB.incomeGroups = incomeGroups; userdataB.crimeRateLI = crimeRateLI; userdataB.crimeRateHI = crimeRateHI; userdataB.crimeRateMI = crimeRateMI; userdataB.AverageSATScoresLI = AverageSATScoresLI; userdataB.AverageSATScoresMI = AverageSATScoresMI; userdataB.AverageSATScoresHI = AverageSATScoresHI; userdataB.icleg = icleg; userdataAB.years = years; userdataAB.userdataA = userdataA; userdataAB.userdataB = userdataB; How to do it... Perform the following steps: Run this as a function at the console: c3165_07_01_callback_functions A figure is brought up with a non-standard menu item as highlighted in the following screenshot. Select the By Population item: Here is the resultant figure: Continue to explore the other options to fully exercise the interactivity built into this graphic. How it works... The function c3165_07_01_callback_functions works as follows: A custom menu item Data Groups is created, with additional submenu items—By population, By Income Groups, or Show all. % add main menu item f = uimenu('Label','Data Groups'); % add sub menu items with additional parameters uimenu(f,'Label','By Population','Callback','showData',... 'tag','demographics','userdata',userdataAB); uimenu(f,'Label','By IncomeGroups',... 'Callback','showData','tag','IncomeGroups',... 'userdata',userdataAB); uimenu(f,'Label','ShowAll','Callback','showData',... 'tag','together','userdata',userdataAB); You defined the tag name and the callback function for each submenu item above. Having a tag name makes it easier to use the same callback function with multiple objects because you can query the tag name to find out which object initiated the call to the callback function (if you need that information). In this example, the callback function behavior is dependent upon which submenu item was selected. So the tag property allowed you to use the single function showData as callback for all three submenu items and still implement submenu item specific behavior. Alternately, you could also register three different callback functions and use no tag names. You can specify the value of a callback property in three ways. Here, you gave it a function handle. Alternately, you can supply a string that is a MATLAB command that executes when the callback is invoked. Or, a cell array with the function handle and additional arguments as you will see in the next section. For passing data between the calling and callback function, you also have three options. Here, you set the userdata property to the variable name that has the data needed by the callback function. Note that the userdata is just one variable and you passed a complicated data structure as userdata to effectively pass multiple values. The user data can be extracted from within the callback function of the object or menu item whose callback is executing as follows: userdata = get(gcbo,'userdata'); The second alternative to pass data to callback functions is by means of the application data. This does not require you to build a complicated data structure. Depending on how much data you need to pass, this later option may be the faster mechanism. It also has the advantage that the userdata space cannot inadvertently get overwritten by some other function. Use the setappdata function to pass multiple variables. In this recipe, you maintained the main drawing area axis handles and the custom legend axis handles as application data. setappdata(gcf,'mainAxes',[]); setappdata(gcf,'labelAxes',[]); This was retrieved each time within the executing callback functions, to clear the graphic as new choices are selected by the user from the custom menu. mainAxesHandle = getappdata(gcf,'mainAxes'); labelAxesHandles = getappdata(gcf,'labelAxes'); if ~isempty(mainAxesHandle), cla(mainAxesHandle); [mainAxesHandle, x, y, ci, cd] = ... redrawGrid(userdata.years, mainAxesHandle); else [mainAxesHandle, x, y, ci, cd] = ... redrawGrid(userdata.years); end if ~isempty(labelAxesHandles) for ij = 1:length(labelAxesHandles) cla(labelAxesHandles(ij)); end end The third option to pass data to callback functions is at the time of defining the callback property, where you can supply a cell array with the function handle and additional arguments as you will see in the next section. These are local copies of data passed onto the function and will not affect the global values of the variables. The callback function showData is given below. Functions that you want to use as function handle callbacks must define at least two input arguments in the function definition: the handle of the object generating the callback (the source of the event), the event data structure (can be empty for some callbacks). function showData(src, evt) userdata = get(gcbo,'userdata'); if strcmp(get(gcbo,'tag'),'demographics') % Call grid f drawing code block % Call showDemographics with relevant inputs elseif strcmp(get(gcbo,'tag'),'IncomeGroups') % Call grid drawing code block % Call showIncomeGroups with relevant inputs else % Call grid drawing code block % Call showDemographics with relevant inputs % Call showIncomeGroups with relevant inputs end function labelAxesHandle = ... showDemographics(userdata, mainAxesHandle, x, y, cd) % Function specific code end function labelAxesHandle = ... showIncomeGroups(userdata, mainAxesHandle, x, y, ci) % Function specific code end function [mainAxesHandle x y ci cd] = ... redrawGrid(years, mainAxesHandle) % Grid drawing function specific code end end There's more... This section demonstrates the third option to pass data to callback functions by supplying a cell array with the function handle and additional arguments at the time of defining the callback property. Add a fourth submenu item as follows (uncomment line 45 of the source code): uimenu(f,'Label',... 'Alternative way to pass data to callback',... 'Callback',{@showData1,userdataAB},'tag','blah'); Define the showData1 function as follows (uncomment lines 49 to 51 of the source code): function showData1(src, evt, arg1) disp(arg1.years); end Execute the function and see that the value of the years variable are displayed at the MATLAB console when you select the last submenu Alternative way to pass data to callback option. Takeaways from this recipe: Use callback functions to define custom responses for each user interaction with your graphic Use one of the three options for sharing data between calling and callback functions—pass data as arguments with the callback definition, or via the user data space, or via the application data space, as appropriate See also Look up MATLAB help on the setappdata, getappdata, userdata property, callback property, and uimenu commands. Obtaining user input from the graph User input may be desired for annotating data in terms of adding a label to one or more data points, or allowing user settable boundary definitions on the graphic. This recipe illustrates how to use MATLAB to support these needs. Getting started The recipe shows a two-dimensional dataset of intensity values obtained from two different dye fluorescence readings. There are some clearly identifiable clusters of points in this 2D space. The user is allowed to draw boundaries to group points and identify these clusters. Load the data: load clusterInteractivData The imellipse function from the MATLAB image processing toolboxTM is used in this recipe. Trial downloads are available from their website. How to do it... The function constitutes the following steps: Set up the user data variables to share the data between the callback functions of the push button elements in this graph: userdata.symbChoice = {'+','x','o','s','^'}; userdata.boundDef = []; userdata.X = X; userdata.Y = Y; userdata.Calls = ones(size(X)); set(gcf,'userdata',userdata); Make the initial plot of the data: plot(userdata.X,userdata.Y,'k.','Markersize',18); hold on; Add the push button elements to the graphic: uicontrol('style','pushbutton',... 'string','Add cluster boundaries?', ... 'Callback',@addBound, ... 'Position', [10 21 250 20],'fontsize',12); uicontrol('style','pushbutton', ... 'string','Classify', ... 'Callback',@classifyPts, ... 'Position', [270 21 100 20],'fontsize',12); uicontrol('style','pushbutton', ... 'string','Clear Boundaries', ... 'Callback',@clearBounds, ... 'Position', [380 21 150 20],'fontsize',12); Define callback for each of the pushbutton elements. The addBound function is for defining the cluster boundaries. The steps are as follows: % Retrieve the userdata data userdata = get(gcf,'userdata'); % Allow a maximum of four cluster boundary definitions if length(userdata.boundDef)>4 msgbox('A maximum of four clusters allowed!'); return; end % Allow user to define a bounding curve h=imellipse(gca); % The boundary definition is added to a cell array with % each element of the array storing the boundary def. userdata.boundDef{length(userdata.boundDef)+1} = ... h.getPosition; set(gcf,'userdata',userdata); The classifyPts function draws points enclosed in a given boundary with a unique symbol per boundary definition. The logic used in this classification function is simple and will run into difficulties with complex boundary definitions. However, that is ignored as that is not the focus of this recipe. Here, first find points whose coordinates lie in the range defined by the coordinates of the boundary definition. Then, assign a unique symbol to all points within that boundary: for i = 1:length(userdata.boundDef) pts = ... find( (userdata.X>(userdata.boundDef{i}(:,1)))& ... (userdata.X<(userdata.boundDef{i}(:,1)+ ... userdata.boundDef{i}(:,3))) &... (userdata.Y>(userdata.boundDef{i}(:,2)))& ... (userdata.Y<(userdata.boundDef{i}(:,2)+ ... userdata.boundDef{i}(:,4)))); userdata.Calls(pts) = i; plot(userdata.X(pts),userdata.Y(pts), ... [userdata.colorChoice{i} '.'], ... 'Markersize',18); hold on; end The clearBounds function clears the drawn boundaries and removes the clustering based upon those boundary definitions. function clearBounds(src, evt) cla; userdata = get(gcf,'userdata'); userdata.boundDef = []; set(gcf,'userdata',userdata); plot(userdata.X,userdata.Y,'k.','Markersize',18); hold on; end Run the code and define cluster boundaries using the mouse. Note that until you click the on the Classify button, classification does not occur. Here is a snapshot of how it looks (the arrow and dashed boundary is used to depict the cursor movement from user interaction): Initiate a classification by clicking on Classify. The graph will respond by re-drawing all points inside the constructed boundary with a specific symbol: How it works... This recipe illustrates how user input is obtained from the graphical display in order to impact the results produced. The image processing toolbox has several such functions that allow user to provide input by mouse clicks on the graphical display—such as imellipse for drawing elliptical boundaries, and imrect for drawing rectangular boundaries. You can refer to the product pages for more information. Takeaways from this recipe: Obtain user input directly via the graph in terms of data point level annotations and/or user settable boundary definitions See also Look up MATLAB help on the imlineimpoly, imfreehandimrect, and imelli pseginput commands. Linked axes and data brushing MATLAB allows creation of programmatic links between the plot and the data sources and linking different plots together. This feature is augmented by support for data brushing, which is a way to select data and mark it up to distinguish from others. Linking plots to their data source allows you to manipulate the values in the variables and have the plot automatically get updated to reflect the changes. Linking between axes enables actions such as zoom or pan to simultaneously affect the view in all linked axes. Data brushing allows you to directly manipulate the data on the plot and have the linked views reflect the effect of that manipulation and/or selection. These features can provide a live and synchronized view of different aspects of your data. Getting ready You will use the same cluster data as the previous recipe. Each point is denoted by an x and y value pair. The angle of each point can be computed as the inverse tangent of the ratio of the y value to the x value. The amplitude of each point can be computed as the square root of the sum of squares of the x and y values. The main panel in row 1 show the data in a scatter plot. The two plots in the second row have the angle and amplitude values of each point respectively. The fourth and fifth panels in the third row are histograms of the x and y values respectively. Load the data and calculate the angle and amplitude data as described earlier: load clusterInteractivData data(:,1) = X; data(:,2) = Y; data(:,3) = atan(Y./X); data(:,4) = sqrt(X.^2 + Y.^2); clear X Y How to do it... Perform the following steps: Plot the raw data: axes('position',[.3196 .6191 .3537 .3211], ... 'Fontsize',12); scatter(data(:,1), data(:,2),'ks', ... 'XDataSource','data(:,1)','YDataSource','data(:,2)'); box on; xlabel('Dye 1 Intensity'); ylabel('Dye 1 Intensity');title('Cluster Plot'); Plot the angle data: axes('position',[.0682 .3009 .4051 .2240], ... 'Fontsize',12); scatter(1:length(data),data(:,3),'ks',... 'YDataSource','data(:,3)'); box on; xlabel('Serial Number of Points'); title('Angle made by each point to the x axis'); ylabel('tan^{-1}(Y/X)'); Plot the amplitude data: axes('position',[.5588 .3009 .4051 .2240], ... 'Fontsize',12); scatter(1:length(data),data(:,4),'ks', ... 'YDataSource','data(:,4)'); box on; xlabel('Serial Number of Points'); title('Amplitude of each point'); ylabel('{surd(X^2 + Y^2)}'); Plot the two histograms: axes('position',[.0682 .0407 .4051 .1730], ... 'Fontsize',12); hist(data(:,1)); title('Histogram of Dye 1 Intensities'); axes('position',[.5588 .0407 .4051 .1730], ... 'Fontsize',12); hist(data(:,2)); title('Histogram of Dye 2 Intensities'); The output is as follows: Programmatically, link the data to their source: linkdata; Programmatically, turn brushing on and set the brush color to green: h = brush; set(h,'Color',[0 1 0],'Enable','on'); Use mouse movements to brush a set of points. You could do this on any one of the first three panels and observe the impact on corresponding points in the other graphs by its turning green. (The arrow and dashed boundary is used to depict the cursor movement from user interaction in the following figure): How it works... Because brushing is turned on, when you focus the mouse on any of the graph areas, a cross hair shows up at the cursor. You can drag to select an area of the graph. Points falling within the selected area are brushed to the color green, for the graphs on rows 1 and 2. Note that nothing is highlighted on the histograms at this point. This is because the x and y data source for the histograms is not correctly linked to the data source variables yet. For the other graphs, you programmatically set their x and y data source via the XDataSource and the YDataSource properties. You can also define the source data variables to link to a graphic and turn brushing on by using the icons from the figure toolbar as shown in the following screenshot. The first circle highlights the brush button; the second circle highlights the link data button. You can click on the Edit link pointed by the arrow to exactly define the x and y sources: There's more... To define the source data variables to link to a graphic and turn brushing on by using the icons from the Figure toolbar, do as follows: Clicking on Edit (pointed to in preceding figure) will bring up the following window: Enter data(:,1) in the YDataSource column for row 1 and data(:,2) in the YDataSource column for row 2. Now try brushing again. Observe that bins of the histogram get highlights in a bottom up order as corresponding points get selected (again, the arrow and dashed boundary is used to depict the cursor movement from user interaction): Link axes together to simultaneously investigate multiple aspects of the same data point. For example, in this step you plot the cluster data alongside a random quality value for each point of the data. Link the axes such that zoom and pan functions on either will impact the axes of the other linked axes: axes('position',[.13 .11 .34 .71]); scatter(data(:,1), data(:,2),'ks');box on; axes('position',[.57 .11 .34 .71]); scatter(data(:,1), data(:,2),[],rand(size(data,1),1), ... 'marker','o', 'LineWidth',2);box on; linkaxes; The output is as follows. Experiment with zoom and pan functionalities on this graph. Takeaways from this recipe: Use data brushing and linked axes features to provide a live and synchronized view of different aspects of your data. See also Look up MATLAB help on the linkdata, linkaxes, and brush commands.

0
0
3581

Packt

13 Dec 2012

15 min read

Meet QlikView

Packt

13 Dec 2012

15 min read

(For more resources related to this topic, see here.) What is QlikView? QlikView is developed by QlikTech, a company that was founded in Sweden in 1993, but has since moved its headquarters to the US. QlikView is a tool used for Business Intelligence, often shortened to BI. Business Intelligence is defined by Gartner, a leading industry analyst firm, as: An umbrella term that includes the application, infrastructure and tools, and best practices that enable access to and analysis of information to improve and optimize decisions and performance. Following this definition, QlikView is a tool that enables access to information in order to analyze this information, which in turn improves and optimizes business decisions and performance. Historically, BI has been very much IT-driven. IT departments were responsible for the entire Business Intelligence life cycle, from extracting the data to delivering the final reports, analyses, and dashboards. While this model works very well for delivering predefined static reports, most businesses find that it does not meet the needs of their business users. As IT tightly controls the data and tools, users often experience long lead-times whenever new questions arise that cannot be answered with the standard reports. How does QlikView differ from traditional BI? QlikTech prides itself in taking an approach to Business Intelligence that is different from what companies such as Oracle, SAP, and IBM—described by QlikTech as traditional BI vendors—are delivering. They aim to put the tools in the hands of business users, allowing them to become self-sufficient because they can perform their own analyses. Independent industry analyst firms have noticed this different approach as well. In 2011, Gartner created a subcategory for Data Discovery tools in its yearly market evaluation, the Magic Quadrant Business Intelligence platform. QlikView was named the poster child for this new category of BI tools. QlikTech chooses to describe itself as a Business Discovery enterprise instead of Data Discovery enterprise. It believes that discovering business insights is much more important than discovering data. The following diagram outlines this paradigm: Besides the difference in who uses the tool — IT users versus business users — there are a few other key features that differentiate QlikView from other solutions. Associative user experience The main difference between QlikView and other BI solutions is the associative user experience. Where traditional BI solutions use predefined paths to navigate and explore data, QlikView allows users to take whatever route they want. This is a far more intuitive way to explore data. QlikTech describes this as "working the way your mind works." An example is shown in the following image. While in a typical BI solution, we would need to start by selecting a Region and then drill down step-by-step through the defined drill path, in QlikView we can choose whatever entry point we like — Region, State, Product, or Sales Person. We are then shown only the data related to that selection, and in our next selection we can go wherever we want. It is infinitely flexible. Additionally, the QlikView user interface allows us to see which data is associated with our selection. For example, the following screenshot (from QlikTech's What's New in QlikView 11 demo document) shows a QlikView Dashboard in which two values are selected. In the Quarter field, Q3 is selected, and in the Sales Reps field, Cart Lynch is selected. We can see this because these values are green, which in QlikView means that they have been selected. When a selection is made, the interface automatically updates to not only show which data is associated with that selection, but also which data is not associated with the selection. Associated data has a white background, while non-associated data has a gray background. Sometimes the associations can be pretty obvious; it is no surprise that the third quarter is associated with the months July, August, and September. However, at other times, some not-so-obvious insights surface, such as the information that Cart Lynch has not sold any products in Germany or Spain. This extra information, not featured in traditional BI tools, can be of great value, as it offers a new starting point for investigation. Technology QlikView's core technological differentiator is that it uses an in-memory data model, which stores all of its data in RAM instead of using disk. As RAM is much faster than disk, this allows for very fast response times, resulting in a very smooth user-experience. Adoption path There is also a difference between QlikView and traditional BI solutions in the way it is typically rolled out within a company. Where traditional BI suites are often implemented top-down—by IT selecting a BI tool for the entire company—QlikView often takes a bottom-up adoption path. Business users in a single department adopt it, and its use spreads out from there. QlikView is free of charge for single-user use. This is called the Personal Edition or PE. Documents created in Personal Edition can be opened by fully-licensed users or deployed on a QlikView server. The limitation is that, with the exception of some documents enabled for PE by QlikTech, you cannot open documents created elsewhere, or even your own documents if they have been opened and saved by another user or server instance. Often, a business user will decide to download QlikView to see if he can solve a business problem. When other users within the department see the software, they get enthusiastic about it, so they too download a copy. To be able to share documents, they decide to purchase a few licenses for the department. Then other departments start to take notice too, and QlikView gains traction within the organization. Before long, IT and senior management also take notice, eventually leading to enterprise-wide adoption of QlikView. QlikView facilitates every step in this process, scaling from single laptop deployments to full enterprise-wide deployments with thousands of users. The following graphic demonstrates this growth within an organization: As the popularity and track record of QlikView have grown, it has gotten more and more visibility at the enterprise level. While the adoption path described before is still probably the most common adoption path, it is not uncommon nowadays for a company to do a top-down, company-wide rollout of QlikView. Exploring data with QlikView Now that we know what QlikView is and how it is different from traditional BI offerings, we will learn how we can explore data within QlikView. Getting QlikView Of course, before we can start exploring, we need to install QlikView. You can download QlikView's Personal Edition from http://www.qlikview.com/download. You will be asked to register on the website, or log in if you have registered before. Registering not only gives you access to the QlikView software, but you can also use it to read and post on the QlikCommunity (http://community.qlikview.com) which is the QlikTech's user forum. This forum is very active and many questions can be answered by either a quick search or by posting a question. Installing QlikView is very straightforward, simply double-click on the executable file and accept all default options offered. After you are done installing it, launch the QlikView application. QlikView will open with the start page set to the Getting Started tab, as seen in the following screenshot: The example we will be using is the Movie Database, which is an example document that is supplied with QlikView. Find this document by scrolling down the Examples list (it is around halfway down the list) and click to open it. The opening screen of the document will now be displayed: Navigating the document Most QlikView documents are organized into multiple sheets. These sheets often display different viewpoints on the same data, or display the same information aggregated to suit the needs of different types of users. An example of the first type of grouping might be a customer or marketing view of the data, an example of the second type of grouping might be a KPI dashboard for executives, with a more in-depth sheet for analysts. Navigating the different sheets in a QlikView document is typically done by using the tabs at the top of the sheet, as shown in the following screenshot. More sophisticated designs may opt to hide the tab row and use buttons to switch between the different sheets. The tabs in the Movie Database document also follow a logical order. An introduction is shown on the Intro tab, followed by a demonstration of the key concept of QlikView on the How QlikView works tab. After the contrast with Traditional OLAP is shown, the associative QlikView Model is introduced. The last two tabs show how this can be leveraged by showing a concrete Dashboard and Analysis: Slicing and dicing your data As we saw when we learned about the associative user experience, any selections made in QlikView are automatically applied to the entire data model. As we will see in the next section, slicing and dicing your data really is as easy as clicking and viewing! List-boxes But where should we click? QlikView lets us select data in a number of ways. A common method is to select a value from a list-box. This is done by clicking in the list-box. Let's switch to the How QlikView works tab to see how this works. We can do this by either clicking on the How QlikView works tab on the top of the sheet, or by clicking on the Get Started button. The selected tab shows two list boxes, one containing Fruits and the other containing Colors. When we select Apple in the Fruits list-box, the screen automatically updates to show the associated data in the Colors list-box: Green and Red. The color Yellow is shown with a gray background to indicate that it is not associated, as seen below, since there are no yellow apples. To select multiple values, all we need to do is hold down Ctrl while we are making our selection. Selections in charts Besides selections in list-boxes, we can also directly select data in charts. Let's jump to the Dashboard tab and see how this is done. The Dashboard tab contains a chart labeled Number of Movies, which lists the number of movies by a particular actor. If we wish to select only the top three actors, we can simply drag the pointer to select them in the chart, instead of selecting them from a list-box: Because the selection automatically cascades to the rest of the model, this also results in the Actor list-box being updated to reflect the new selection: Of course, if we want to select only a single value in a chart, we don't necessarily need to lasso it. Instead, we can just click on the data point to select it. For example, clicking on James Stewart leads to only that actor being selected. Search While list-boxes and lassoing are both very convenient ways of selecting data, sometimes we may not want to scroll down a big list looking for a value that may or may not be there. This is where the search option comes in handy. For example, we may want to run a search for the actor Al Pacino. To do this, we first activate the corresponding list-box by clicking on it. Next, we simply start typing and the list-box will automatically be updated to show all values that match the search string. When we've found the actor we're looking for, Al Pacino in this case, we can click on that value to select it: Sometimes, we may want to select data based on associated values. For example, we may want to select all of the actors that starred in the movie Forrest Gump. While we could just use the Title list-box, there is also another option: associated search. To use associated search, we click on the chevron on the right-hand side of the search box. This expands the search box and any search term we enter will not only be checked against the Actor list-box, but also against the contents of the entire data model. When we type in Forrest Gump, the search box will show that there is a movie with that title, as seen in the screenshot below. If we select that movie and click on Return, all actors which star in the movie will be selected. Bookmarking selections Inevitably, when exploring data in QlikView, there comes a point where we want to save our current selections to be able to return to them later. This is facilitated by the bookmark option. Bookmarks are used to store a selection for later retrieval. Creating a new bookmark To create a new bookmark, we need to open the Add Bookmark dialog. This is done by either pressing Ctrl + B or by selecting Bookmark | Add Bookmark from the menu. In the Add Bookmark dialog, seen in the screenshot below, we can add a descriptive name for the bookmark. Other options allow us to change how the selection is applied (as either a new selection or on top of the existing selection) and if the view should switch to the sheet that was open at the time of creating the bookmark. The Info Text allows for a longer description to be entered that can be shown in a pop-up when the bookmark is selected. Retrieving a bookmark We can retrieve a bookmark by selecting it from the Bookmarks menu, seen here: Undoing selections Fortunately, if we end up making a wrong selection, QlikView is very forgiving. Using the Clear, Back, and Forward buttons in the toolbar, we can easily clear the entire selection, go back to what we had in our previous selections, or go forward again. Just like in our Internet browser, the Back button in QlikView can take us back multiple steps: Changing the view Besides filtering data, QlikView also lets us change the information being displayed. We'll see how this is done in the following sections. Cyclic Groups Cyclic Groups are defined by developers as a list of dimensions that can be switched between users. On the frontend, they are indicated with a circular arrow. For an example of how this works, let's look at the Ratio to Total chart, seen in the following image. By default, this chart shows movies grouped by duration. If we click on the little downward arrow next to the circular arrow, we will see a list of alternative groupings. Click on Decade to switch to the view to movies grouped by decade. Drill down Groups Drill down Groups are defined by the developer as a hierarchical list of dimensions which allows users to drill down to more detailed levels of the data. For example, a very common drill down path is Year | Quarter | Month | Day. On the frontend, drill down groups are indicated with an upward arrow. In the Movies Database document, a drill down can be found on the tab labeled Traditional OLAP. Let's go there. This drill down follows the path Director | Title | Actor. Click on the Director A. Edward Sutherland to drill down to all movies that he directed, shown in the following screenshot. Next, click on Every Day's A Holiday to see which actors starred in that movie. When drilling down, we can always go back to the previous level by clicking on the upward arrow, located at the top of the list-box in this example. Containers Containers are used to alternate between the display of different objects in the same screen space. We can select the individual objects by selecting the corresponding tab within the container. Our Movies Database example includes a container on the Analysis sheet. The container contains two objects, a chart showing Average length of Movies over time and a table showing the Movie List, shown in the following screenshot. The chart is shown by default, you can switch to the Movie List by clicking on the corresponding tab at the top of the object. On the time chart, we can switch between Average length of Movies and Movie List by using the tabs at the top of the container object. But wait, there's more! After all of the slicing, dicing, drilling, and view-switching we've done, there is still the question on our minds: how can we export our selected data to Excel? Fortunately, QlikView is very flexible when it comes to this, we can simply right-click on any object and choose Send to Excel, or, if it has been enabled by the developer, we can click on the XL icon in an object's header. Click on the XL icon in the Movie List table to export the list of currently selected movies to Excel. A word of warning when exporting data When viewing tables with a large number of rows, QlikView is very good at only rendering those rows that are presently visible on the screen. When Export values to Excel is selected, all values must be pulled down into an Excel file. For large data sets, this can take a considerable amount of time and may cause QlikView to become unresponsive while it provides the data.

0
0
4802

Packt

05 Dec 2012

16 min read

Managing Files

Packt

05 Dec 2012

16 min read

(For more resources related to this topic, see here.) Managing local files In this section we will look at local file operations. We'll cover common operations that all computer users will be familiar with—copying, deleting, moving, renaming, and archiving files. We'll also look at some not-so-common techniques, such as timestamping files, checking for the existence of a file, and listing the files in a directory. Copying files For our first file job, let's look at a simple file copy process. We will create a job that looks in a specific directory for a file and copies it to another location. Let's do some setup first (we can use this for all of the file examples). In your project directory, create a new folder and name it FileManagement. Within this folder, create two more folders and name them Source and Target. In the Source directory, drop a simple text file and name it original.txt. Now let's create our job: Create a new folder in Repository and name it Chapter6 Create a new job within the Chapter6 directory and name it FileCopy. In the Palette, search for copy. You should be able to locate a tFileCopy component. Drop this onto the Job Designer. Click on its Component tab. Set the File Name field to point to the original.txt file in the Source directory. Set the Destination directory field to direct to the Target directory. For now, let's leave everything else unchanged. Click on the Run tab and then click on the Run button. The job should complete pretty quickly and, because we only have a single component, there are now data fl ows to observe. Check your Target folder and you will see the original.txt file in there, as expected. Note that the file still remains in the Source folder, as we were simply copying the file. Copying and removing files Our next example is a variant of our first file management job. Previously, we copied a file from one folder to another, but often you will want to affect a file move. To use an analogy from desktop operating systems and programs, we want to do a cut and paste rather than a copy and paste. Open the FileCopy job and follow the given steps: Remove the original.txt file from the Target directory, making sure it still exists in the Source directory. In the Basic settings tab of the tFileCopy component, select the checkbox for Remove source file. Now run the job. This time the original.txt file will be copied to the Target directory and then removed from the Source directory. Renaming files We can also use the tFileCopy component to rename files as we copy or move. Again, let's work with the FileCopy job we have created previously. Reset your Source and Target directories so that the original.txt file only exists in Source. In the Basic settings tab, check the Rename checkbox. This will reveal a new parameter, Destination filename. Change the default value of the Destination filename parameter to modified_name.txt. Run the job. The original file will be copied to the Target directory and renamed. The original file will also be removed from the Source directory. Deleting files It is really useful to be able to delete files. For example, once they have been transformed or processed into other systems. Our integration jobs should "clean up afterwards", rather than leaving lots of interim files cluttering up the directories. In this job example we'll delete a file from a directory.This is a single-component job. Create a new job and name it FileDelete. In your workspace directory, FileManagement/Source, create a new text file and name it file-to-delete.txt. From the Palette, search for filedelete and drag a tFileDelete component onto the Job Designer. Click on its Component tab to configure it. Change the File Name parameter to be the path to the file you created earlier in step 2. Run the job. After it is complete, go to your Source directory and the file will no longer be there. Note that the file does not get moved to the recycle bin on your computer, but is deleted immediately. Timestamping a file Sometimes in real life use, integration jobs, like any software, can fail or give an error. Server issues, previously unencountered bugs, or a host of other things can cause a job to behave in an unexpected manner, and when this happens, manual intervention may be needed to investigate the issue or recover the job that failed. A useful trick to try to incorporate into your jobs is to save files once they have been consumed or processed, in case you need to re-process them again at some point or, indeed, just for investigation and debugging purposes should something go wrong. A common way to save files is to rename them using a date/timestamp. By doing this you can easily identify when files were processed by the job. Follow the given steps to achieve this: Create a new job and call it FileTimestamp. Create a file in the Source directory named timestamp.txt. The job is going to move this to the Target directory, adding a time-stamp to the file as it processes. From the Palette, search for filecopy and drop a tFileCopy component onto the Job Designer. Click on its Component tab and change the File Name parameter to point to the timestamp.txt file we created in the Source directory. Change the Destination Directory to direct to your Target directory. Check the Rename checkbox and change the Destination filename parameter to "timestamp"+TalendDate.getDate("yyyyMMddhhmmss")+".txt". The previous code snippet concatenates the fixed file name, "timestamp", with the current date/time as generated by the Studio's getDate function at runtime. The file extension ".txt" is added to the end too. Run the job and you will see a new version of the original file drop into the Target directory, complete with timestamp. Run the job again and you will see another file in Target with a different timestamp applied. Depending on your requirements you can configure different format timestamps. For example, if you are only going to be processing one file a day, you could dispense with the hours, minutes, and second elements of the timestamp and simply set the output format to "yyyyMMdd". Alternatively, to make the timestamp more readable, you could separate its elements with hyphens—"yyyy-MM-dd", for example. You can find more information about Java date formats at http://docs.oracle.com/javase/6/docs/api/java/text/SimpleDateFormat.html.. Listing files in a directory Our next example job will show how to list all of the files (or all the files matching a specific naming pattern) in a directory. Where might we use such a process? Suppose our target system had a data "drop-off" directory, where all integration files from multiple sources were placed before being picked up to be processed. As an example, this drop-off directory might contain four product catalogue XML files, three CSV files containing inventory data, and 50 order XML files detailing what had been ordered by the customers. We might want to build a catalogue import process that picks up the four catalogue files, processes them by mapping to a different format, and then moves them to the catalogue import directory. The nature of the processing means we have to deal with each file individually, but we want a single execution of the process to pick up all available files at that point in time. This is where our file listing process comes in very handy and, as you might expect, the Studio has a component to help us with this task. Follow the given steps: Let's start by preparing the directory and files we want to list. Copy the FileList directory from the resource files to the FileManagement directory we created earlier. The FileList directory contains six XML files. Create a new job and name it FileList. Search for Filelist in the Palette and drop a tFileList component onto the Job Designer. Additionally, search for logrow and drop a tLogRow component onto the designer too. We will use the tFileList component to read all of the filenames in the directory and pass this through to the tLogRow component. In order to do this, we need to connect the tFileList and tLogRow. The tFileList component works in an iterative manner—it reads each filename and passes it onwards before getting the next filename. Its connector type is Iterative, rather than the more common Main connector. However, we cannot connect an iterative component to the tLogRow component, so we need to introduce another component that will act as an intermediary between the two. Search for iteratetoflow in the Palette and drop a tIterateToFlow component onto the Job Designer. This bridges the gap between an iterate component and a fl ow component. Click on the tFileList component and then click on its Component tab. Change the directory value so that it points to the FileList directory we created in step 1. Click on the + button to add a new row to the File section. Change the value to "*.xml". This configures the component to search for any files with an XML extension. Right-click on the tFileList component, select Row | Iterate, and drop the resulting connector onto the tIterateToFlow component. The tIterateToFlow component requires a schema and, as the tFileList component does not have a schema, it cannot propagate this to the iterateto-flow component when we join them. Instead we will have to create the schema directly. Click on the tIterateToFlow component and then on its Component tab. Click on the Edit schema button and, in the pop-up schema editor, click on the + button to add a row and then rename the column value to filename. Click on OK to close the window. A new row will be added to the Mapping table. We need to edit its value, so click in the Value column, delete the setting that exists, and press Ctrl + space bar to access the global variables list. Scroll through the global variable drop-down list and select "tFileList_1_CURRENT_FILE". This will add the required parameter to the Value column. Right-click on the tIterateToFlow component, select Row | Main, and connect this to the tLogRow component. Let's run the job. It may run too quickly to be visible to the human eye, but the tFileList component will read the name of the first file it finds, pass this forward to the tIterateToFlow component, go back and read the second file, and so on. As the iterate-to-flow component receives its data, it will pass this onto tLogRow as row data. You will see the following output in the tLogRow component: Now that we have cracked the basics of the file list component, let's extend the example to a real-life situation. Let's suppose we have a number of text files in our input directory, all conforming to the same schema. In the resources directory, you will find five files named fileconcat1.txt, fileconcat2.txt, and so on. Each of these has a "random" number of rows. Copy these files into the Source directory of your workspace. The aim of our job is to pick up each file in turn and write its output to a new file, thereby concatenating all of the original files. Let's see how we do this: Create a new job and name it FileConcat. For this job we will need a file list component, a delimited file output component, and a delimited file input component. As we will see in a minute, the delimited input component will be a "placeholder" for each of the input files in turn. Find the components in the Palette and drop them onto the Job Designer. Click on the file list component and change its Directory value to point to the Source directory. In the Files box, add a row and change the Filemask value to "*.txt". Right-click on the file list component and select Row | Iterate. Drop the connector onto the delimited input component. Select the delimited input component and edit its schema so that it has a single field rowdata of data type String We need to modify the File name/Stream value, but in this case it is not a fixed file we are looking for but a different file with each iteration of the file list component. TOS gives us an easy way to add such variables into the component definitions. First, though, click on the File name/Stream box and clear the default value. In the bottom-left corner of the Studio you should see a window named Outline. If you cannot see the Outline window, select Window | Show View from the menu bar and type outline into the pop-up search box. You will see the Outline view in the search results—double click on this to open it. Now that we can see the Outline window, expand the tFileList item to see the variables available in it. The variables are different depending upon the component selected. In the case of a file list component, the variables are mostly attributes of the current file being processed. We are interested in the filename for each iteration, so click on the variable Current File Name with path and drag it to the File name/Stream box in the Component tab of the delimited input component. You can see that the Studio completes the parameter value with a globalMap variable—in this case, tFileList_1_CURRENT_FILEPATH, which denotes the current filename and its directory path. Now right-click on the delimited input, select Row | Main, and drop the connector onto the delimited output. Change the File Name of the delimited output component to fileconcatout.txt in our target directory and check the Append checkbox, so that the Studio adds the data from each iteration to the bottom of each file. If Append is not checked, then the Studio will overwrite the data on each iteration and all that will be left will be the data from the final iteration. Run the job and check the output file in the target directory. You will see a single file with the contents of the five original files in it. Note that the Studio shows the number of iterations of the file list component that have been executed, but does not show the number of lines written to the output file, as we are used to seeing in non-iterative jobs. Checking for files Let's look at how we can check for the existence of a file before we undertake an operation on it. Perhaps the first question is "Why do we need to check if a file exists?" To illustrate why, open the FileDelete job that we created earlier. If you look at its component configuration, you will see that it will delete a file named file-todelete. txt in the Source directory. Go to this directory using your computer's file explorer and delete this file manually. Now try to run the FileDelete job. You will get an error when the job executes: The assumption behind a delete component (or a copy, rename, or other file operation process) is that the file does, in fact, exist and so the component can do its work. When the Studio finds that the file does not exist, an error is produced. Obviously, such an error is not desirable. In this particular case nothing too untoward happens—the job simply errors and exits—but it is better if we can avoid unnecessary errors. What we should really do here is check if the file exists and, if it does, then delete it. If it does not exist, then the delete command should not be invoked. Let's see how we can put this logic together Create a new job and name it FileExist. Search for fileexist in the Palette and drop a tFileExist component onto the Job Designer. Then search for filedelete and place a tFileDelete component onto the designer too. In our Source directory, create a file named file-exist.txt and configure File Name of the tFileDelete component to point to this. Now click on the tFileExist component and set its File name/Stream parameter to be the same file in the Source directory. Right-click on the tFileExist component, select Trigger | Run if, and drop the connector onto the tFileDelete component. The connecting line between the two components is labeled If. When our job runs the first component will execute, but the second component, tFileDelete, will only run if some conditions are satisfied. We need to configure the if conditions. Click on If and, in the Component tab, a Condition box will appear. In the Outline window (in the bottom-left corner of the Studio), expand the tFileExist component. You will see three attributes there. The Exists attribute is highlighted in red in the following screenshot: Click on the Exists attribute and drag it into the Conditions box of the Component tab. As before, a global-map variable is written to the configuration. The logic of our job is as follows: i. Run the tFileExist component. ii. If the file named in tFileExist actually exists, run the tFileDelete component. Note that if the file does not exist, the job will exit. We can check if the job works as expected by running it twice. The file we want to delete is in the Source directory, so we would expect both components to run on the first execution (and for the file to be deleted). When the if condition is evaluated, the result will show in the Job Designer view. In this case, the if condition was true—the file did exist. Now try to run the job again. We know that the file we are checking for does not exist, as it was deleted on the last execution. This time, the if condition evaluates to false, and the delete component does not get invoked. You can also see in the console window that the Studio did not log any errors. Much better! Sometimes we may want to verify that a file does not exist before we invoke another component. We can achieve this in a similar way to checking for the existence of a file, as shown earlier. Drag the Exists variable into the Conditions box and prefix the statement with !—the Java operator for "not": !((Boolean)globalMap.get("tFileExist_1_EXISTS"))

0
0
2827

article-image-securing-data-rest-oracle-11g

Packt

23 Oct 2012

11 min read

Securing Data at Rest in Oracle 11g

Packt

23 Oct 2012

11 min read

0
0
3919

article-image-piwik-tracking-user-interactions

Packt

11 Oct 2012

15 min read

Piwik: Tracking User Interactions

Packt

11 Oct 2012

15 min read

Tracking events with Piwik Many of you may be familiar with event tracking with Google Analytics; and many of you may not. In Google Analytics, event tracking is pretty structured. When you track an event with Google, you get five parameters: Category: The name for the group of objects you want to track Action: A string that is used to define the user in action for the category of object Label: This is optional and is for additional data Value: This is optional and is used for providing addition numerical data Non-interaction: This is optional and will not add the event hit into bounce rate calculation if set to true We are going over a few details on event tracking with Google Analytics because the custom variable feature we will be using for event tracking in Piwik is a little less structured. And a little structure will help you drill down the details of your data more effectively from the start. You won't have to restructure and change your naming conventions later on and lose all of your historical data in the process. We don't need to look over the code for Google Analytics. Just know that it may help to set up your event tracking with a similar structure. If you had videos on your site, enough to track, you would most likely make a category of events Videos. You can create as many as you need for the various objects you want to track on your site: Videos Maps Games Ads Blog posts (social media actions) Products As for the actions that can be performed on those Videos, I can think of a few: Play Pause Stop Tweet Like Download There are probably more than you can think of, but now that we have these actions we can connect them with elements of the site. As for the label parameter, you would probably want to use that to store the title of the movie the visitor is interacting with or the page it is located on. We will skip the value parameter which is for numeric data because with Piwik, you won't have a similar value. But non-interaction is interesting; it means that by default an action on a page counts to lower the bounce rate from that page since the user is doing something. Unfortunately, this is not a feature that we have using Piwik currently, although that could change in the future. Okay, now that we have learned one of the ways to structure our events, let's look at the way we can track events in Piwik. There is really nothing called event tracking in Piwik, but Piwik does have custom variables which will do the same job. But, since it is not really event tracking in the truest sense of the word, the bounce rate will be unaffected by any of the custom variables collected. In other words, unlike Google Analytics, you don't get the non-interaction parameter you can set. But let's see what you can do with Piwik. Custom variables are name-value pairs that you can set in Piwik. You can assign up to five custom variables for each visitor and/or each page view. The function for setting a custom variable is setCustomVariable . You must call it for each custom variable you set up to the limit of five. piwikTracker.setCustomVariable (index, name, value, scope); And here are what you set the parameters to: index: This is a number from 1 to 5 where your custom variables are stored. It should stay the same as the name of custom variable. Changing the index of a name later will reset all old data. name: This is the custom variable name or key. value: This is the value for name. scope: This parameter sets the scope of the custom variable, whether it is being tracked per visit or per page. And what scope we set depends upon what we are tracking and how complex our site is. So how do these custom variables fit our model of event tracking? Well we have to do things a little bit differently. For most of our event tracking, we will have to set our variable scope per page. There is not enough room to store much data at the visit level. That is good for other custom tracking you may need but for event tracking you will need more space for data. So with page level custom variables, you get five name-value sets per page. So, we would set up our variables similar to something like this for a video on the page: Index = 1 Name = "Video" Value = "Play","Pause","Stop","Tweet","Like", and so on Scope = "page" And this set of variables in using Piwik's custom variable function would look like one of the following: piwikTracker.setCustomVariable(1,"Video","Play","page"); piwikTracker.setCustomVariable(1,"Video","Pause","page"); piwikTracker.setCustomVariable(1,"Video","Tweet","page"); Which one you would use would depend on what action you are tracking. You would use JavaScript in the page to trigger these variables to be set, most likely by using an onClick event on the button. We will go into the details of various event tracking scenarios later in this chapter. You will notice in the previous snippets of code that the index value of each call is 1. We have set the index of the "Video" name to 1 and must stick to this now on the page or data could be overwritten. This also leaves us the two to five indexes still available for use on the same page. That means if we have banner ads on the page, we could use one of the spare indexes to track the ads. piwikTracker.setCustomVariable(2,"SidebarBanner","Click","page"); You will notice that Google event tracking has the label variable. As we are using page leveling custom variables with Piwik and the variables will be attached to the page itself, there is no need to have this extra variable in most cases. If we do need to add extra data other than an action value, we will have to concatenate our data to the action and use the combined value in Piwik's custom tracking's value variable. Most likely, if we have one banner on our video page, we will have more and to track those click events per banner, we may have to get a little creative using the following: piwikTracker.setCustomVariable(2,"SidebarBanner", "AddSlot1Click","page"); piwikTracker.setCustomVariable(2,"SidebarBanner", "AddSlot2Click","page"); piwikTracker.setCustomVariable(2,"SidebarBanner", "AddSlot3Click","page"); Of course, it is up to you whether you join your data together by using CamelCase, which means joining each piece of data together after capitalizing each. This is what I did previously. You can also use spaces or underscores as long as it is understandable to you and you stick to it. Since the name and value are in quotation marks, you can use any suitable string. And again, since these are custom variables, if you come up with a better system of setting up your event tracking that works better with your website and business model, then by all means try it. Whatever works best for you and your site is better in the long run. So now that we have a general idea of how we will be tracking events with Piwik, let's look at some specific examples and more in depth at what events are, compared to goals or page views. Tracking social engagement You know that you have a Facebook "Like" button on your page, a Twitter "tweet" button, and possibly lots more buttons that do various things at other sites that you yourself have no control over and can add no tracking code to. But you can track clicks on the button itself. You use event tracking for what you could call micro-conversions. But there is really nothing micro about them. That Facebook "Like" could end up in many more sales or conversions than a standard conversion. They could be the route on the way to one or multiple conversions. There may be a blurry line between engagement goals and micro-conversions. And really, it is up to you what weight you give to visitor actions on your site, but use events for something smaller than you would consider a goal. If your goal is sales on your website, that Facebook "Like" should cause a spike in your sales and you will be able to correlate that to your event tracking, but the "Like" is not the end of the road, or the goal. It is a stop on the way. If your website is a blog and your goal is to promote advertising or your services with your content, then tracking social engagement can tell you which topics have the highest social interest so that you can create similar content in the future. So what are some other events we can track? Of course, you would want to track anything having to do with liking, tweeting, bookmarking, or somehow spreading your site on a social network. That includes Facebook , Twitter , Digg , StumbleUpon , Pinterest , and any other social network whose button you put on your site. If you spent enough time to put the buttons on your pages, you can at least track these events. And if you don't have buttons, you have to remember that each generation is using the Internet more often; smartphones make it available everywhere, and everyone is on a social network. Get with it. And don't forget to add event tracking to any sort of Follow Me or Subscribe button. That too is an event worth tracking. We will also look at blog comments since we can consider them to be in the social realm of our tracking. Tracking content sharing So let's look at a set of social sharing buttons on our website. We aren't going to blow things out of proportion by using buttons for every social network out there, just two: Twitter and Facebook. Your site may have less and should have more, but the same methods we will explore next can be used for any amount of content sharing buttons. We are event tracking, so let's begin by defining what our custom variable data will be. We need to figure out how we are going to set up our categories of events and the actions. In this example, we will be using buttons on our Holy Hand Grenade site: You will see our Twitter button and our Facebook button right underneath the image of our Holy Hand Grenade. We are going to act as if our site has many more pages and events to track on it and use a naming convention that will leave room for growth. So we are going to use the category of product shares. That way we have room for video shares when we finally get that cinematographer and film our Holy Hand Grenade in action. Now we need to define our actions. We will be adding more buttons later after we test the effectiveness of our Facebook and Twitter buttons. This means we need a separate action to distinguish each social channel. Share on Facebook Share on Twitter And then we add more buttons: Share on Google+ Share on Digg Share on Reddit Share on StumbleUpon So let's look at the buttons in the source of the page for a minute to see what we are working with: <li class="span1"> <script type="text/javascript" src="http://platform.twitter.com/widgets.js"></script> <a href="https://twitter.com/intent/tweet?url=http%3A%2F%2Fdir23.com&text=Holy%20Hand%20Grenades"class="twitter-share-button">Tweet</a> </li> <li class="span1"></li> <li class="span1"> <script src="http://connect.facebook.net/en_US/all.js#xfbml=1"></script> <fb:like href="http://dir23.com/" show_faces="false"width="50" font=""></fb:like> </li> </ul> <p><a class="btn btn-primary btn-large">Buy Now >></a></p> You see that the buttons are not really buttons yet, they are only HTML anchors in the code and JavaScript includes. Before we start looking at the code to track clicks on these buttons, we need to go over some details about the way Piwik's JavaScript works. Setting a custom variable in Piwik using an onclick event is a very tricky procedure. To start with, you must call more than just setCustomVariable because that will not work after the Piwik tracking JavaScript has loaded and trackPageView has been called. But there is a way around this. First, you call setCustomVariable and then, in that same onclick event , you call trackLink , as in the next example: <p><a href="buynow.html" class="btn btn-primary btn-large" onclick="javascript:piwikTracker.setCustomVariable(2,'Product Pricing','ViewProduct Price','page');piwikTracker.trackLink();">Buy Now >></a></p> If you forget to add the piwikTracker.trackLink() call, nothing will happen and no custom variables will be set. Now with the sharing buttons, we have another issue when it comes to tracking clicks. Most of these buttons, including Facebook, Twitter, and Google+ use JavaScript to create an iframe that has the button. This is a problem, because the iframe is on another domain and there is not an easy way to track clicks. For this reason, I suggest using your social network's API functionality to create the button so that you can create a callback that will fire when someone likes or tweets your page. Another advantage to this method is that you will be sure that each tracked tweet or like will be logged accurately. Using an on_click event will cause a custom variable to be created with every click. If the person is not logged in to their social account at the time, the actual tweet or like will not happen until after they log in, even if they decide to do so. Facebook, Twitter, and Google+ all have APIs with this functionality. But if you decide to try to track the click on the iframe, you can take a look at the code at http://www.bennadel.com/blog/1752-Tracking-Google-AdSense- Clicks-With-jQuery-And-ColdFusion.htm to see how complicated it can get. The click is not really tracked. The blur on the page is tracked, because blur usually happens if a link in the iframe is clicked and a new page is about to load. We already have our standard Piwik tracking code on the page. This does not have to be modified in any way for event tracking. Instead we will be latching into Twitters and Facebook's APIs which we loaded in the page by including their JavaScript. <script> twttr.events.bind('tweet', function(event) { piwikTracker.setCustomVariable(1,'Product Shares','Share on Twitter','page'); piwikTracker.trackLink(); }); </script> <script type="text/javascript"> FB.Event.subscribe('edge.create', function(response) { piwikTracker.setCustomVariable(1,'Product Shares','Share onFacebook','page'); piwikTracker.trackLink(); }); </script> We add these two simple scripts to the bottom of the page. I put them right before the Piwik tracking code. The first script binds to the tweet event in Twitter's API and once that event fires, our Piwik code executes and sets our custom variable. Notice that here too we have to call trackLink right afterwards. The second script does the same thing when someone likes the page on Facebook. It is beyond the scope of this book to go into more details about social APIs, but this code will get you started and you can do more research on your chosen social network's API on your own to see what type of event tracking will be possible. For example, with the Twitter API you can bind a function to each one of these actions: click, tweet, retweet, favorite, or follow. There are definitely more possibilities with this than there is with a simple onclick event. Using event tracking on your social sharing buttons will let you know where people share your line of Holy Hand Grenades. This will help you figure out just which social networks you should have a presence on. If people on Twitter like the grenades, then you should make sure to keep your Twitter account active, and if you don't have a Twitter account and your product is going viral there, you need to get one quick and participate in the conversation about your product. Or you may want to invest in the development of a Facebook app and you are not quite sure that it is worth the investment. Well, a little bit of event tracking will tell you if you have enough people interested in your website or products to fork over the money for an app. Or maybe a person goes down deep into the pages of your site, digs out a gem, and it gets passed around StumbleUpon like crazy. This might indicate a page that you should feature on the home page of your website. And if it's a product page that's been hidden from light for years, maybe throw some advertising its way too.

0
0
3271

Packt

03 Sep 2012

18 min read

Overview of FIM 2010 R2

Packt

03 Sep 2012

18 min read

0
0
3966

article-image-infinispan-data-grid-infinispan-and-jboss-7

Packt

23 Aug 2012

2 min read

Infinispan Data Grid: Infinispan and JBoss AS 7

Packt

23 Aug 2012

2 min read

The new modular application server JBoss AS has changed a lot with the latest distribution. The new application server has improved in many areas, including lower memory footprint, lightning fast startup, true classloading isolation (between built-in modules and modules delivered by developers), and excellent management of resources with the addition of domain controllers. How Infinispan platform fits into this new picture will be illustrated shortly. Should you need to know all the core details of the AS 7 architecture, you might consider looking for a copy of JBoss AS 7 Configuration, Deployment and Administration, which has been authored by Francesco Marchioni and was published in December, 2011. In a nutshell, JBoss AS 7 is composed of a set of modules that provide the basic server functionalities. The configuration of modules is not spread in a set of single XML files anymore, but it is centralized into a single file. Thus, every configuration file holds a single server configuration. A server configuration can be in turn based on a set of standalone servers or domain servers. The main difference between standalone servers and domain servers encompasses the management area; as a matter of fact, domain-based servers can be managed from a centralized point (the domain controller) while, on the other hand, standalone servers are independent server units, each one managing its own configuration.

0
0
2233

How-To Tutorials - Data

Extending Your Structure and Search

Working with Apps in Splunk

Constructing and Evaluating Your Design Solution

Getting Started with InnoDB

Marker-based Augmented Reality on iPhone or iPad

Oracle: Using the Metadata Service to Share XML Artifacts

Creating Cartesian-based Graphs

SAP HANA integration with Microsoft Excel

Creating Interactive Graphics and Animation

Meet QlikView

Trending Topics

Managing Files

Securing Data at Rest in Oracle 11g

Piwik: Tracking User Interactions

Overview of FIM 2010 R2

Infinispan Data Grid: Infinispan and JBoss AS 7

Create a Free Account To Continue Reading

Sign in to activate your 7-day free access