Search icon CANCEL
Subscription
0
Cart icon
Your Cart (0 item)
Close icon
You have no products in your basket yet
Save more on your purchases! discount-offer-chevron-icon
Savings automatically calculated. No voucher code required.
Arrow left icon
Explore Products
Best Sellers
New Releases
Books
Events
Videos
Audiobooks
Packt Hub
Free Learning
Arrow right icon
timer SALE ENDS IN
0 Days
:
00 Hours
:
00 Minutes
:
00 Seconds

How-To Tutorials

7018 Articles
article-image-fast-array-operations-numpy
Packt
19 Dec 2013
10 min read
Save for later

Fast Array Operations with NumPy

Packt
19 Dec 2013
10 min read
(For more resources related to this topic, see here.) Getting started with NumPy NumPy is founded around its multidimensional array object, numpy.ndarray. NumPy arrays are a collection of elements of the same data type; this fundamental restriction allows NumPy to pack the data in an efficient way. By storing the data in this way NumPy can handle arithmetic and mathematical operations at high speed. Creating arrays You can create NumPy arrays using the numpy.array function. It takes list-like object (or another array) as input and, optionally, a string expressing its data type. You can interactively test array creation using an IPython shell as follows: In [1]: import numpy as np In [2]: a = np.array([0, 1, 2]) Every NumPy array has a data type that can be accessed by the dtype attribute, as shown in the following code. In the following code example, dtype is a 64-bit integer. In [3]: a.dtype Out[3]: dtype('int64') If we want those numbers to be treated as a float type of variable, we can either pass the dtype argument in the np.array function or cast the array to another data type using the astype method as shown in the following code: In [4]: a = np.array([1, 2, 3], dtype='float32') In [5]: a.astype('float32') Out[5]: array([ 0.,  1.,  2.], dtype=float32) To create an array with two dimensions (an array of arrays) we can initialize the array using a nested sequence shown as follows: In [6]: a = np.array([[0, 1, 2], [3, 4, 5]]) In [7]: print(a) Out[7]: [[0 1 2]         [3 4 5]] The array created in this way has two dimensions—axes in NumPy's jargon. Such an array is like a table that contains two rows and three columns. We can access the axes structure using the ndarray.shape attribute: In [7]: a.shape Out[7]: (2, 3) Arrays can also be reshaped only as long as the product of the shape dimensions is equal to the total number of elements in the array. For example, we can reshape an array containing 16 elements in the following ways: (2, 8), (4, 4), or (2, 2, 4). To reshape an array we can either use the ndarray.reshape method or directly change the ndarray.shape attribute. The following code illustrates the use of the ndarray.reshape method: In [7]: a = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8,                       9, 10, 11, 12, 13, 14, 15]) In [7]: a.shape Out[7]: (16,) In [8]: a.reshape(4, 4) # Equivalent: a.shape = (4, 4) Out[8]: array([[ 0,  1,  2,  3],        [ 4,  5,  6,  7],        [ 8,  9, 10, 11],        [12, 13, 14, 15]]) Thanks to this property you are also free to add dimensions of size one. You can reshape an array with 16 elements to (16, 1), (1, 16), (16, 1, 1), and so on. NumPy provides convenience functions, shown in the following code, to create arrays filled with zeros, filled with ones, or without an initialization value (empty—their actual value is meaningless and depends on the memory state). Those functions take the array shape as a tuple and optionally its dtype. In [8]: np.zeros((3, 3)) In [9]: np.empty((3, 3)) In [10]: np.ones((3, 3), dtype='float32') In our examples we will use the numpy.random module to generate random floating point numbers in the (0, 1) interval. The numpy.random module is shown as follows: In [11]: np.random.rand(3, 3) Sometimes it is convenient to initialize arrays that have a similar shape to other arrays. Again, NumPy provides some handy functions for that purpose such as zeros_like, empty_like, and ones_like. These functions are as follows: In [12]: np.zeros_like(a) In [13]: np.empty_like(a) In [14]: np.ones_like(a) Accessing arrays NumPy array interface is, on a shallow level, similar to Python lists. They can be indexed using integers, and can also be iterated using a for loop. In [15]: A = np.array([0, 1, 2, 3, 4, 5, 6, 7, 8]) In [16]: A[0] Out[16]: 0 In [17]: [a for a in A] Out[17]: [0, 1, 2, 3, 4, 5, 6, 7, 8] It is also possible to index an array in multiple dimensions. If we take a (3,3) array (an array containing 3 triplets) and we index the first element, we obtain the first triplet shown as follows: In [18]: A = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]]) In [19]: A[0] Out[19]: array([0, 1, 2]) We can index the triplet again by adding the other index separated by a comma. To get the second element of the first triplet we can index using [0, 1] as shown in the following code: In [20]: A[0, 1] Out[20]: 1 NumPy allows you to slice arrays in single and multiple dimensions. If we index on the first dimension we will get a collection of triplets shown as follows: In [21]: A[0:2] Out[21]: array([[0, 1, 2],                [3, 4, 5]]) If we slice the array with [0:2]. for every selected triplet we extract the first two elements, resulting in a (2, 2) array shown in the following code: In [22]: A[0:2, 0:2] Out[22]: array([[0, 1],                 [3, 4]]) Intuitively, you can update values in the array by using both numerical indexes and slices. The syntax is as follows: In [23]: A[0, 1] = 8 In [24]: A[0:2, 0:2] = [[1, 1], [1, 1]] Indexing with the slicing syntax is fast because it doesn't make copies of the array. In NumPy terminology it returns a view over the same memory area. If we take a slice of the original array and then changes one of its value; the original array will be updated as well. The following code illustrates an example of the same: In [25]: a = np.array([1, 1, 1, 1]) In [26]: a_view = A[0:2] In [27]: a_view[0] = 2 In [28]: print(A) Out[28]: [2 1 1 1] We can take a look at another example that shows how the slicing syntax can be used in a real-world scenario. We define an array r_i, shown in the following line of code, which contains a set of 10 coordinates (x, y); its shape will be (10, 2): In [29]: r_i = np.random.rand(10, 2) A typical operation is extracting the x component of each coordinate. In other words you want to extract the items [0, 0], [1, 0], [2, 0], and so on. resulting in an array with shape (10,). It is helpful to think that the first index is moving while the second one is fixed (at 0). With this in mind, we will slice every index on the first axis (the moving one) and take the first element (the fixed one) on the second axis as shown in the following line of code: In [30]: x_i = r_i[:, 0] On the other hand, the following expression of code will keep the first index fixed and the second index moving, giving the first (x, y) coordinate: In [31]: r_0 = r_i[0, :] Slicing all the indexes over the last axis is optional; using r_i[0] has the same effect as r_i[0, :]. NumPy allows to index an array by using another NumPy array made of either integer or Boolean values—a feature called fancy indexing. If you index with an array of integers, NumPy will interpret the integers as indexes and will return an array containing their corresponding values. If we index an array containing 10 elements with [0, 2, 3], we obtain an array of size 3 containing the elements at positions 0, 2 and 3. The following code gives us an illustration of this concept: In [32]: a = np.array([9, 8, 7, 6, 5, 4, 3, 2, 1, 0]) In [33]: idx = np.array([0, 2, 3]) In [34]: a[idx] Out[34]: array([9, 7, 6]) You can use fancy indexing on multiple dimensions by passing an array for each dimension. If we want to extract the elements [0, 2] and [1, 3] we have to pack all the indexes acting on the first axis in one array, and the ones acting on the second axis in another. This can be seen in the following code: In [35]: a = np.array([[0, 1, 2], [3, 4, 5],                        [6, 7, 8], [9, 10, 11]]) In [36]: idx1 = np.array([0, 1]) In [37]: idx2 = np.array([2, 3]) In [38]: a[idx1, idx2] You can also use normal lists as index arrays, but not tuples. For example the following two statements are equivalent: >>> a[np.array([0, 1])] # is equivalent to >>> a[[0, 1]] However, if you use a tuple, NumPy will interpret the following statement as an index on multiple dimensions: >>> a[(0, 1)] # is equivalent to >>> a[0, 1] The index arrays are not required to be one-dimensional; we can extract elements from the original array in any shape. For example we can select elements from the original array to form a (2,2) array shown as follows: In [39]: idx1 = [[0, 1], [3, 2]] In [40]: idx2 = [[0, 2], [1, 1]] In [41]: a[idx1, idx2] Out[41]: array([[ 0,  5],                 [10,  7]]) The array slicing and fancy indexing features can be combined. For example, this is useful if we want to swap the x and y columns in a coordinate array. In the following code, the first index will be running over all the elements (a slice), and for each of those we extract the element in position 1 (the y) first and then the one in position 0 (the x): In [42]: r_i = np.random(10, 2) In [43]: r_i[:, [0, 1]] = r_i[:, [1, 0]] When the index array is a Boolean there are slightly different rules. The Boolean array will act like a mask; every element corresponding to True will be extracted and put in the output array. This procedure is shown as follows: In [44]: a = np.array([0, 1, 2, 3, 4, 5]) In [45]: mask = np.array([True, False, True, False, False, False]) In [46]: a[mask] Out[46]: array([0, 2]) The same rules apply when dealing with multiple dimensions. Furthermore, if the index array has the same shape as the original array, the elements corresponding to True will be selected and put in the resulting array. Indexing in NumPy is a reasonably fast operation. Anyway, when speed is critical, you can use the, slightly faster, numpy.take and numpy.compress functions to squeeze out a little more speed. The first argument of numpy.take is the array we want to operate on, and the second is the list of indexes we want to extract. The last argument is axis; if not provided, the indexes will act on the flattened array, otherwise they will act along the specified axis. In [47]: r_i = np.random(100, 2) In [48]: idx = np.arange(50) # integers 0 to 50 In [49]: %timeit np.take(r_i, idx, axis=0) 1000000 loops, best of 3: 962 ns per loop In [50]: %timeit r_i[idx] 100000 loops, best of 3: 3.09 us per loop The similar, but faster version for Boolean arrays is numpy.compress which works in the same way. The use of numpy.compress is shown as follows: In [51]: idx = np.ones(100, dtype='bool') # all True values In [52]: %timeit np.compress(idx, r_i, axis=0) 1000000 loops, best of 3: 1.65 us per loop In [53]: %timeit r_i[idx] 100000 loops, best of 3: 5.47 us per loop Summary The article thus covers the basics of NumPy arrays, talking about the creating of arrays and how we can access them. Resources for Article: Further resources on this subject: Getting Started with Spring Python [Article] Python Testing: Installing the Robot Framework [Article] Python Multimedia: Fun with Animations using Pyglet [Article]
Read more
  • 0
  • 0
  • 107489

article-image-applied-modeling
Packt
19 Dec 2013
15 min read
Save for later

Applied Modeling

Packt
19 Dec 2013
15 min read
(For more resources related to this topic, see here.) This article examines the foundations of tabular modeling (at least from the modelers point of view). In order to provide a familiar model which can be used later, this article will progressively build the model. Each recipe is intended to demonstrate a particular technique; however, they need to be followed in order so that the final model is completed. Grouping by binning and sorting with ranks Often, we want to provide descriptive information in our data, based on values that are derived from downstream activities. For example, this could arise if we wish to include a field in the Customer table that shows the value of the previous sales for that customer or a bin grouping that defines the customer into banded sales. This value can then be used to rank each customer according to their relative importance in relation to other customers. This type of value adding activity is usually troublesome and time intensive in a traditional data mart design as the customer data can be fully updated only once the sales data has been loaded. While this process may seem relatively straightforward, it is a recursive problem as the customer data must be loaded before the sales data (since customers must exist before the sales can be assigned to them), but the full view of the customer is reliant on loading the sales data. In a standard dimensional (star schema) modeling approach, including this type of information for dimensions requires a three-step process: The dimension (reseller customer data) is updated for known fields and attributes. This load excludes information that is derived (such as sales). Then, the sales data (referred to as fact data) is loaded in data warehouse. This ensures that the data mart is in a current state and all the sales transaction data is up-to-date. Information relating to any new and changed stores can be loaded correctly. The dimension data which relies on other fact data is updated based on the current state of the data mart. Since the tabular model is less confined by the traditional star schema requirements of the fact and dimension tables (in fact, the tabular model does not explicitly identify facts or dimensions), the inclusion and processing of these descriptive attributes can be built directly into the model. The calculation of a simple measure such as a historic sales value may be included in OLAP modeling through calculated columns in the data source view. However, this is restrictive and limited to simple calculations (such as total sales or n period sales). Other manipulations (such as ranking and binning) are a lot more flexible in tabular modeling (as we will see). This recipe examines how to manipulate a dimensional table in order to provide a richer end user experience. Specifically, we will do the following: Introduce a calculated field to calculate the historical sales for the customer Determine the rank of the customer based on that field Create a discretization bin for the customer based on their sales Create an ordered hierarchy based on their discretization bins Getting ready Continuing with the scenario that was discussed in the Introduction section of the article, the purpose of this article is to identify each reseller's (customer's) historic sales and then rank them accordingly. We then discretize the Resellers table (customer) based on this. This problem is further complicated by the consideration that a sale occurs in the country of origin (the sales data in the Reseller Sales table will appear in any currency). In order to provide a concise recipe, we break the entire process into two distinct steps: Conversion of sales (which manages the ranking of Resellers based on a unified sales value) Classification of sales (which manages the manipulation of sales values based on discretized bins to format those bins) How to do it… Firstly, we need to provide a common measure to compare the sales value of Resellers. Convert sales to a uniform currency using the following steps: Open a new workbook and launch the PowerPivot Window. Import the text files Reseller Sales.txt, Currency Conversion.txt, and Resellers.txt. The source data folder for this article includes the base schema.ini file that correctly transforms all data upon import. When importing the data, you should be prompted that the schema.ini file exists and will override the import settings. If this does not occur, ensure that the schema.ini file exists in the same directory as your source data. The prompt should look like the following screenshot: Although it is not mandatory, it is recommended that connection managers are labeled according to a standard. In this model, I have used the convention type_table_name where type refers to the connection type (.txt) and table_name refers to the name of the table. Connections can be edited using the Existing Connections button in the Design tab. Create a relationship between the Customer ID field in the Resellers table and the Customer ID field in the Reseller Sales table. Add a new field (also called a calculated column) in the Resellers Sales table to show the gross value of the sale amount in USD. Add usd_gross_sales and hide it from client tools using the following code: = [Quantity Ordered] *[Price Per Unit] *LOOKUPVALUE ( 'Currency Conversion'[AVG Rate] ,'Currency Conversion'[Date] ,[Order dt] ,'Currency Conversion'[Currency ID] ,[Currency ID] ) Add a new measure in the Resellers Sales table to show sales (in USD). Add USD Gross Sales as: USD Gross Sales := SUM ( [usd_gross_sales] ) Add a new calculated column to the Resellers table to show the USD Sales Total value. The formula for the field should be: = 'Reseller Sales' [USD Gross Sales] Add a Sales Rank field in the Resellers table to show the order for each resellers USD Sales Total. The formula for Sales Rank is: =RANKX(Resellers, [USD Sales Total]) Hide the USD Sales Total and Sales Rank fields from client tools. Now that all the entries in the Resellers table show their sales value in a uniform currency, we can determine a grouping for the Reseller table. In this case, we are going to group them into 100,000 dollar bands. Add a new field to show each value in the USD Sales Total column of Resellers rounded down to the nearest 100,000 dollars. Hide it from client tools. Now, add Round Down Amount as: =ROUNDDOWN([USD Sales Total],-5) Add a new field to show the rank of Round Down Amount in descending order and hide it from client tools. Add Round Down Order as: =RANKX(Resellers,[Round Down Amount],,FALSE(), DENSE) Add a new field in the Resellers table to show the 100,000 dollars group that the reseller belongs to. Since we know what the lower bound of the sales bin is, we can also infer that the upper bin is the rounded up 100,000 dollars sales group. Add Sales Group as follows: =IF([Round Down Amount]=0 || ISBLANK([Round Down Amount]) , "Sales Under 100K" , FORMAT([Round Down Amount], "$#,K") & " - " & FORMAT(ROUNDUP([USD Sales Total],-5),"$#,K") ) Set the Sort By Column of the Sales Group field to the Round Down Order column. Note that the Round Down Order column should display in a descending order (that is, entries in the Resellers table with high sales values should appear first). Create a hierarchy on the Resellers table which shows the Sales Group field and the Customer Name column as levels. Title the hierarchy as Customer By Sales Group. Add a new measure titled Number of Resellers to the Resellers table: Number of Resellers:=COUNTROWS(Resellers) Create a pivot table that shows the Customer By Sales Group hierarchy on the rows and Number of Resellers as values. If you created the usd_gross_sales field in the Reseller Sales table, it can also be added as an implicit measure to verify values. Expand the first bin of Sales Group. The pivot should look like the following screenshot: How it works… This recipe has included various steps, which add descriptive information to the Resellers table. This includes obtaining data from a separate table (the USD sales) and then manipulating that field within the Resellers table. In order to provide a clearer definition of how this process works, we will break the explanation into several subsections. This includes the sales data retrieval, the use of rank functions, and finally, the discretization of sales. The next section deals with the process of converting sales data to a single currency. The starting point for this recipe is to determine a common currency sales value for each reseller (or customer). While the inclusion of the calculated column USD Sales Total in the Resellers table should be relatively straightforward, it is complicated by the consideration that the sales data is stored in multiple currencies. Therefore, the first step needs to include a currency conversion to determine the USD sales value for each line. This is simply the local value multiplied by the daily exchange rate. The LOOKUPVALUE function is used to return the row-by-row exchange rate. Now that we have the usd_gross_sales value for each sales line, we define a measure that calculates its sum in whatever filter context it is applied in. Including it in the Reseller Sales table makes sense (since, it relates to sales data), but what is interesting is how the filter context is applied when it is used as a field in the Resellers table. Here, the row filter context that exists in the Resellers table (after all, each row refers to a reseller) applies a restriction to the sales data. This shows the sales value for each reseller. For this recipe to work correctly, it is not necessary to include the calculated field usd_gross_sales in Reseller Sales. We simply need to define a calculation, which shows the gross sales value in USD and then use the row filter context in the Resellers table to restrict sales to the reseller in question (that is, the reseller in the row). It is obvious that the exchange rate should be applied on a daily basis because the value can change every day. We could use an X function in the USD Gross Sales measure to achieve exactly the same outcome. Our formula will be: SUMX ( 'Reseller Sales' , 'Reseller Sales'[Quantity Ordered] * 'Reseller Sales'[Price Per Unit] * LOOKUPVALUE ( 'Currency Conversion'[AVG Rate] , 'Currency Conversion'[Date] ,'Reseller Sales'[Order dt] ,'Currency Conversion'[Currency ID] ,'Reseller Sales'[Currency ID] ) ) Furthermore, if we wanted to, we could completely eliminate the USD Gross Sales measure from the model. To do this, we could wrap the entire formula (the previous definition on USD Gross Sales) into the CALCULATE statement in the Resellers table's field definition of USD Gross Sales. This forces the calculation to occur at the current row context. Why have we included the additional fields and measures in Reseller Sales? This is a modeling choice. It makes the model easier to understand because it is more modular. This would otherwise require two calculations (one into a default currency and the second into a selected currency) and the field usd_gross_sales is used in that calculation. Now that sales are converted to a uniform currency, we can determine the importance by rank. RANKX is used to rank the rows in the Resellers table based on the USD Gross Sales field. The simplest implementation of RANKX is demonstrated within the Sales Rank field. Here, the function simply returns a rank based on the value according to the supplied measure (which is of course USD Gross Sales). However, the RANKX function provides a lot of versatility and follows the syntax: RANKX(<table> , <expression>[, <value>[, <order>[, <ties>]]] ) After the initial implementation of RANKX in its simplest form, the arguments of particular interest are the <order> and <ties> arguments. These can be used to specify the sort order (whether the rank is to be applied from highest to lowest or lowest to highest) and the function behavior when duplicate values are encountered. This may be best demonstrated with an example. To do this, we will examine the operation of rank in relation to Round Down Amount. When a simple RANKX function is applied, the function sorts the columns in an ascending order and returns the position of a row based on the sorted order of the value and the number of prior rows within the table. This includes rows attributable to duplicate values. This is shown in the following screenshot where the Simple column is defined as RANKX(Resellers,[Round Down Amount]). Note, the data is sorted by Round Down Amount and the first four tied values have a RANKX value of 1. This is the behavior we expect since all rows have the same value. For the next value (700000), RANKX returns 5 because this is the fifth element in the sequence. When the DENSE argument is specified, the value returned after a tie is the next sequential number in the list. In the preceding screenshot, this is shown through the DENSE column. The formula for the field DENSE is: RANKX(Resellers, [Round Down Amount],,,DENSE)) Finally, we can specify the sort order that is used by the function (the default is ascending) with the help of <order> argument of the function. If we wish to sort (and rank) from lowest to highest, we could use the formula as shown in the INVERSE DENSE column. The INVERSE DENSE column uses the following calculation: RANKX(Resellers, [Round Down Amount],,TRUE,DENSE) After having specified the Sales Group field sort by column as Round Down Order, we may ask why we did not also sort the Customer Name column by their respective values in the Sales Rank column? Trying to define a sort by column in this way would cause an error as it is not a one-to-one relationship between these two fields. That is, each customer does not have a unique value for the sales rank. Let's have a look at this in more detail. If we filter the Resellers table to show the blank USD Sales Total rows (click on the drop-down arrow in the USD Sales Total column and check the BLANKS checkbox), we see that the values of the Sales Rank column for all the rows is the same. In the following screenshot, we can see the value 636 repeated for all the rows: Allowing the client tool visibility to the USD Sales Total and Sales Rank fields will not provide an intuitive browsing attribute for most client tools. For this reason, it is not recommended to expose these attributes to users. Hiding them will still allow the data to be queried directly Discretizing sales By discretizing the Resellers table, we firstly make a decision to group each reseller into bands of 100,000 intervals. Since, we have already calculated the USD Gross Sales value for each customer, our problem is reduced by determining which bin each customer belongs to. This is very easily achieved as we can derive the lower and upper bound for the Resellers table. That is, the lower bound will be a rounded down amount of their sales and the upper bound will be the rounded up value (that is rounded nearest to the 100,000 interval). Finally, we must ensure that the ordering of the bins is correct so that the bins appear from the highest value resellers to the lowest. For convenience, these steps are broken down through the creation of additional columns but they need not be—we could incorporate the steps into a single formula (mind you, it would be hard to read). Additionally, we have provided a unique name for the first bin by testing for 0 sales. This may not be required. The rounding is done with the ROUNDDOWN and ROUNDUP functions. These functions simply return the number moved by the number of digits offset. The following is the syntax for ROUNDDOWN: ROUNDDOWN(<number>, <num_digits>) Since we are interested only in the INTEGER values (that is, values to the left of the decimal place), we must specify <num_digits> as -5. The display value of the bin is controlled through the FORMAT function, which returns the text equivalent of a value according to the provided format string. The syntax for FORMAT is: FORMAT(<value>, <format_string>) There's more... In presenting a USD Gross Sales value for the Resellers table, we may not be interested in all the historic data. A typical variation on this theme is to determine the current worth by showing the recent history (or summing recent sales). This requirement can be easily implemented into the preceding method by swapping USD Gross Sales with recent sales. To determine this amount, we need to filter the data used in the SUM function. For example, to determine the last 30 days' sales for a reseller, we will use the following code: SUMX( FILTER('Reseller Sales' , 'Reseller Sales'[Order dt]> (MAX('Reseller Sales'[Order dt])-30) ) , USD SALES EXPRESSION )
Read more
  • 0
  • 0
  • 2362

article-image-going-isometric
Packt
19 Dec 2013
12 min read
Save for later

Going Isometric

Packt
19 Dec 2013
12 min read
(For more resources related to this topic, see here.) Cartesian to isometric equations A very important thing to understand here is that the level data still remains the same 2D array, and we will be altering only the rendering process. Later on, we will need to update the level data to accommodate large tiles, which will contain items that are bigger than the current tile size. Our two-dimensional top-down coordinates for a tile can be called Cartesian coordinates. The relationship between Cartesian and isometric coordinates is shown in the following code: //Cartesian to isometric: x_Iso = x_Cart - y_Cart; y_Iso = ( x_Cart + y_Cart ) / 2; //Isometric to Cartesian: x_Cart = ( 2 * y_Iso + x_Iso ) / 2; y_Cart = ( 2 * y_Iso – x_Iso ) / 2; Now that is very simple isn't it? We will use an IsoHelper class for this conversion where we can pass through a point and get back to the converted point. An isometric view via a matrix transformation Although the equations are simple and straightforward, the art needed for an isometric tile is a bit complicated. The artist needs to create the rhombus-shaped tile art with pixel precision and mostly tileable in all four directions. An alternative approach is to use the square tile itself and skew them dynamically using the corresponding code. Let us try to create the isometric view for the level data with the same tiles using this approach. The transformation matrix for isometric transformation is as follows, which is essentially a rotation of 45 degrees and scaling by half in Y axis: var m:Matrix = new Matrix(1,0.5,-1,0.5,0,0); The code for the IsometricLevel class, is shared as follows. You should initialize this class from the Starling document class using new Starling (IsometricLevel, stage). The following approach just applies the isometric transformation matrix to the RenderTexture image. Minor changes in the init function are shown in the following code: var m:Matrix = new Matrix(1,0.5,-1,0.5,0,0); for(var i_int=0;i<levelData.length;i++){ for(var j_int=0;j<levelData[0].length;j++){ img=new Image(texAtlas.getTexture(paddedName(levelData[i][j]))); img.x=j*tileWidth+borderX; img.y=i*tileWidth+borderY; rTex.draw(img); } } m.translate( 300, 0 ); rTexImage.transformationMatrix = m; We apply the transformation matrix to the RenderTexture image and translate it by 300 pixels so that the whole of it is visible. Skewing will make a part of the image to be out of the visible area of the screen. We will get the following result: An alternate approach is to apply the transformation matrix to each individual tile image, find the corresponding isometric coordinates, and move and place individual tiles accordingly as shown in the following code: var m:Matrix = new Matrix(1,0.5,-1,0.5,0,0); var pt_Point=new Point(); for(var i_int=0;i<levelData.length;i++){ for(var j_int=0;j<levelData[0].length;j++){ img = new Image(texAtlas.getTexture(paddedName(levelData[i][j]))); img.transformationMatrix = m; pt.x=j*tileWidth+borderX; pt.y=i*tileWidth+borderY; pt=IsoHelper.cartToIso(pt); img.x=pt.x+300; img.y=pt.y; rTex.draw(img); } } Here, we use the convenient cartToIso(pt) conversion function of our IsoHelper class to find the corresponding isometric coordinates to our Cartesian coordinates. We are offsetting the drawing by 300 pixels to handle the skewing offset for the image. This approach will work in some cases, but not all top-down tiles can be simply skewed and made into an isometric tile. For example, consider a tree in the top-down view, it will simply look like a skewed tree graphic after we apply the isometric transformation. So, the right approach is to create an isometric tile art specifically and use isometric equations to place them correctly. Let us use the isometric tiles provided in the assets pack to create a sample level. Implementing the isometric view via isometric art Please refer to the SampleIsometricDemo source folder, which implements a sample level of our game using isometric art and the previously mentioned equations. There are some differences in the approach that I will be explaining in the following sections. Most of it has to do with the change in level data, altering the registration point of larger tiles, and handling depth. We also need to offset the image drawing so that it fits in the screen area. We use a variable called screenOffset for this purpose. The render code is as follows: var pt_Point=new Point(); for(var i_int=0;i<groundArray.length;i++){ for(var j_int=0;j<groundArray[0].length;j++){ //draw the ground img=new Image(texAtlas.getTexture(String(groundArray[i][j]).split(".")[0])); pt.x=j*tileWidth; pt.y=i*tileWidth; pt=IsoHelper.cartToIso(pt); img.x=pt.x+screenOffset.x; img.y=pt.y+screenOffset.y; rTex.draw(img); //draw overlay if(overlayArray[i][j]!="*"){ img=new Image(texAtlas.getTexture(String(overlayArray[i][j]).split(".")[0]));])); img.x=pt.x+screenOffset.x; img.y=pt.y+screenOffset.y; if(regPoints[overlayArray[i][j]]!=null){ img.x+=regPoints[overlayArray[i][j]].x; img.y-=regPoints[overlayArray[i][j]].y; } rTex.draw(img); } } } The result is shown in the following screenshot: Level data structure The level data for our isometric level is not just a simple 2D array with index numbers any more, but a combination of multiple data structures. We have a 2D array for the ground tiles, another 2D array for overlay tiles, and a dictionary to store altered registration points of the overlay tiles. Ground tiles are those tiles which exactly fit the isometric tile dimensions, which in this case is 80 x 40, and makes up the bottom-most layer of the isometric level. These tiles won't take part in any depth sorting as they are always rendered below all other items that populate the level. Overlay tiles are items which may not fit into the isometric tile dimensions and have height, for instance, buildings, trees, bushes, rocks, and so on. Some of these can be fit into tile dimensions, but are kept as such that we have various advantages using the following approach: We are free to place an overlay tile over any ground tile, which adds to flexibility We would need a lot of tiles if we try to fit overlay tiles and ground tiles together for all permutations and combinations Effects such as tinting can be applied independently to the overlay tiles Depth handling becomes much easier Overlay tiles which are smaller than the tile size reduce the game size Altering registration points Starling considers all images as rectangular blocks with their registration point at the top-left corner. The registration point is the point which can be considered as the (0,0) of that image. Traditional Flash had given us the capability to alter the registration points by embedding images inside Sprite or MovieClip. We can still do the same, but it will require unnecessary creation of a lot of Sprites. Alternately, we can use the pivotX and pivotY properties of Starling objects for the same result too. In our isometric level, we will need to precisely place overlay tiles inside the isometric grid space. An overlay tile does not have any standard size as it can be any item— a tree, building, character, and so on. So, placing them correctly is a tricky thing and very specific to the tile concerned. This leads us to have independent registration points for each overlay tile. We use a dictionary structure to save these values and use those values as offsets while placing overlay tiles. For example, we need to place a bush image, nonwalk0009.png, exactly at the middle of an isometric grid, which means moving it 12 pixels to the left and 19 pixels to the top for proper alignment. We save (12,19) as a new point inside our dictionary for ID nonwalk0009.png, as follows: regPoints["nonwalk0009.png"]=new Point(12,19); Finding a tile's precise placement point needs to involve visual interaction; hence, we will build a level editor, which makes this easier. Depth sorting An isometric view needs us to handle the depth of items manually. For ground tiles, there is no depth issue as they always form the lowest layer over which all the overlay items and characters are drawn. But overlay tiles and characters need to be drawn at specific depths for it to look appropriate. By depth, I mean the order at which the images are drawn. An image drawn later will overlap the one drawn earlier, thereby making it seem in front of the latter. For a level which does not change or without any moving items, we need to find the depth only once for the initial render. But for a level with moving characters or vehicles, we need to find every frame in the game loop and render. The current sample level does not change over time, so we can simply render the level by looping through the array. Any overlay item placed at a higher I or J value will be rendered later, and hence will be shown in front, where I and J are array indices. Thus, items placed at higher indices appear closer to the camera, that is, for the same I, a higher J is closer to the camera and vice versa. When we have a moving item, we need to find the corresponding array position it occupies based on its current screen position. By using these new found array indices, we can compare with the overlay tile's indices and decide on the drawing sequence. The code to find array indices from the screen position is as follows: //capture screen position var screenPos_Point=new Point(hero.x,hero.y); //convert to cartesian coordinates var cartPos_Point=IsoHelper.isoToCart(screenPos); //find tile indices from cartesian values var tilePos_Point=IsoHelper.getTileIndices(screenPos,tileWidth); Understanding isometric movement Isometric movement is very straightforward to implement. All we need to do is move the item in top-down Cartesian coordinates and draw it on the screen after converting into isometric coordinates. For example, if our character is at a point, heroCart in the Cartesian system, then the following code moves him/her to the right: heroCart.x+=heroSpeed; //convert to isometric coordinates heroIso=IsoHelper.cartToIso(heroCart); heroImage.x=heroIso.x; heroImage.y=heroIso.y; rTex.draw(heroImage); Detecting isometric collision Collision detection for any tile-based game is done based on tiles. When designing, we will make sure that certain tiles are walkable while certain tiles are nonwalkable, which means that the characters can move over some tiles but not over others. So, when we calculate the movements of any character, we first make sure that the character won't end up on a nonwalkable tile. Thus, after each movement, we check if the resulting position falls in a nonwalkable tile by finding array indices as mentioned previously. If the result is true, we will ignore the movement, else we will proceed with the movement and update the on-screen character position. heroCart.x+=heroSpeed; //find new tile point var tilePos_Point=IsoHelper.getTileIndices(heroCart,tileWidth); //this checks if new tile position is occupied, else proceeds if(checkWalkable(tilePos)){ //convert to isometric coordinates heroIso=IsoHelper.cartToIso(heroCart); heroImage.x=heroIso.x; heroImage.y=heroIso.y; rTex.draw(heroImage); } You may be wondering that the hero character should need some special considerations to be drawn correctly as the right depth, but by the way we draw things, it gets handled automatically. We do not allow the hero to move onto a nonwalkable tile, that is, bushes, trees, and so on. So, any tile remains walkable or nonwalkable. The character gets drawn on top of a walkable tile, which does not contain any overlay items, and hence it will occupy the right depth. In this method, a full tile has to be made either walkable or nonwalkable, but this may not be the case for all games. We may need to have tiles, which block entry from a specific direction or block exit in a particular direction as a fence along one border of a tile. In such cases, the tile is still walkable, but valid movement is also checked by tracking the direction in which the character is moving. For our game, the first method will be used along with the four-way freedom of movement. In an isometric view, movement can be either in four directions or eight directions, which in turn is called a four-way movement or an eight-way movement respectively. A four-way movement is when we move along the X or Y axis alone on the Cartesian space. An eight-way movement happens when, in addition to four - way, we also move the item diagonally. Logic still remains the same. Summary In this article, we learned about the isometric projection and the equations that help us to implement it based on the simpler Cartesian system. We implemented a sample isometric level using isometric art as well as learned about matrix-based fake isometric rendering. We analyzed the IsoHelper class, which facilitates easy conversion between Cartesian and isometric coordinates and also helps in finding array indices. We learned why altering the registration points is essential for perfectly placing the overlay tiles and we found that our level data needs to track these registration points as well. We also learned how depth sorting, collision detection, and isometric movement are done based on our tile-based approach. Resources for Article: Further resources on this subject: Introduction to Game Development Using Unity 3D [Article] Flash Game Development: Making of Astro-PANIC! [Article] Collision Detection and Physics in Panda3D Game Development [Article]
Read more
  • 0
  • 0
  • 18817

article-image-platform-service-and-cloudbees
Packt
19 Dec 2013
10 min read
Save for later

Platform as a Service and CloudBees

Packt
19 Dec 2013
10 min read
(For more resources related to this topic, see here.) Platform as a Service (PaaS) is a crossover between IaaS and SaaS. This is a fuzzy definition, but it defines well the existing actors in this industry well and possible confusions. A general presentation of PaaS uses a pyramid. Depending on what the graphics try to demonstrate, the pyramid can be drawn upside down, as shown in the following diagram: Cloud pyramids The pyramid on the left-hand side shows XaaS platforms based on the target users' profiles. It demonstrates that IaaS is the basis for all Cloud services. It provides the required flexibility for PaaS to support applications that are exposed as SaaS to the end users. Some SaaS actually don't use a PaaS and directly rely on IaaS, but that doesn't really matter here. The pyramid on the right-hand side represents the providers and the three levels suggests the number of providers in each category. IaaS only makes sense for highly concentrated, large-scale providers. PaaS can have more actors, probably focused on some ecosystem, but the need is to have a neutral and standard platform that is actually attractive for developers. SaaS is about all the possible applications running in Cloud. The top-level shape should then be far larger than what the graphic shows. So, which platform? With the previous definition of platform, you just have a faint idea; your understanding about PaaS is more than IaaS and less than SaaS. The missing definition is to know what the platform is about. A platform is a standardization of the runtime for which a developer is waiting to do his/her job. This depends on the software ecosystem you're considering. For a Java EE developer, a platform means having at least a servlet container, managing DataSource to access the database, and having few comparable resources wrapped as standard Java EE APIs. A Play! framework developer will consider this as overweight and only ask for a JVM with web socket's support. A PHP developer will expect a Linux/Apache/MySQL/PHP (LAMP) stack, similar to the one he/she has been using for years, with a traditional server hosting service. So, depending on the development ecosystem you're considering, platforms don't have the exact same meaning, but they all share a common principle. A platform is the common denominator for a software language ecosystem, where the application is all that a specific developer will write or choose on their own. Java EE developers will ask for a container, and Ruby developers will ask for an RVM environment. What they run on top is their own choice. With this definition, you understand that a platform is about the standardization of runtime for a software ecosystem. Maybe some of you have patched OpenJDK to enable some magic features in the JVM (really?), but most of us just use the standard Oracle Java distribution. Such a standardization makes it possible to share resources and engineering skills on a large scale, to reduce cost, and provide a reliable runtime. Cloud and clustering Another consideration for a platform is clustering. Cloud is based on slicing resources into small virtual elements and letting the users select as many as they need. In most cases, this requires the application to support a clustering mode, as using more resources will require you to scale out on multiple hosts. Clustering has never been a trivial thing, and many developers aren't familiar with the related constraints. The platform can help them by providing specialized services to distribute the load around the cluster's nodes. Some PaaS such as CloudBees or Google App Engine provide such features, while some don't. This is the major difference between PaaS offers. Some are IaaS-like preinstalled middleware services, while some offer a highly integrated platform. A typical issue faced is that of state management. Java EE developers rely on HttpSession to store user's data and retrieve them on subsequent interaction. Modern frameworks tend to be stateless, but the state needs to be managed anyway. PaaS has to provide options to developers, so that they can choose the best strategy to match their own business requirements. This is a typical clustering issue that is well addressed by PaaS because the technical solutions (sticky session, session replication, distributed storage engines, and so on) have been implemented once with all the required skills to do it right, and can be used by all platform users. Thanks to a PaaS, you don't need to be a clustering guru. This doesn't mean that it will magically let your legacy application scale out, but it gives you adequate tools to design the application for scalability. Private versus public Clouds Many companies are interested in Cloud, thanks to the press for publishing all product announcements as the new revolution, and would like to benefit from them but as a private resource. If you go back to the comparison in the Preface with an electricity production, this may make sense if you're well established. Amazon or Google should have private power plants to supply giant data centers can make sense—anyway it doesn't seems that they do but as backends. For most of companies, this would be a surprising company choice. The main reason is that the principle of the Cloud relies on the last letter of XaaS (S) that stands for Service. You can install an OpenStack or VMware farm on your data center, but then you won't have an IaaS. You will have some virtualization and flexibility that probably is far better than traditional dedicated hardware, but you miss the major change. You still will have to hire operators to administer the servers and software stack. You will even have a more complex software stack (search for an OpenStack administrator and you'll understand). Using Cloud makes sense because there are thousands of users all around the world sharing the same lower-level resources, and a centralized, highly specialized team to manage them all. Building your own, private PaaS is yet another challenge. This is not a simple middleware stack. This is not about providing virtual machine images with a preinstalled Tomcat server. What about maintenance, application scalability, deployment APIs, clustering, backup, data replication, high availability,monitoring, and support? Support is a major added value of cloud services—I'm not just saying this because I'm a support engineer—but because when something fails, you need someone to help. You can't just wait with the promise for a patch provided by the community. The guy who's running your application needs to have significant knowledge of the platform. That's one reason that CloudBees is focusing on Java first, as this is the ecosystem and environment we know best (even we have some Erlang and Ruby engineers whose preferred game is to troll on this displeasing language). With a private Cloud, you probably can have level-one support with an internal support team, but you can't handle all the issues. As for resource concentration, to build an impressive knowledge base. All those topics are ignored in most cases as people only focus on the app:deploy automation, as opposed to the old-style deployments to dedicated hardware. If this is what you're looking for, you should know that Maven was able to do this for years on all the Java EE containers using cargo. You can check the same at http://cargo.codehaus.org. Cloud isn't just about abstracting the runtime behind an API; it's about changing the way in which developers manage and access runtime so that it becomes a service they can consume without any need to worry about what's happening behind the scene. Security The reason that companies claim to prefer a private cloud solution is security. Amazon datacenters are far more secure than any private datacenter, due to both strong security policy and anonymous user data. Security is not about exploiting encryption algorithms, like in Hollywood movies, but about social attacks that are far more fragile. Few companies take care of administrative, financial, familial, or personal safety. Thanks to the combination of VPN, HTTPS, fixed IPs, and firewall filters, you can safely deploy an application on Amazon Cloud as an extension to your own network, to access data from your legacy Oracle or SAP mainframe hosted in your datacenter. As a mobile application demonstrates, your data is already going out from your private network. There's no concrete reason why your backend application can't be hosted outside your walls. CloudBees – embrace the development stack CloudBees PaaS has something special in its DNA that you won't find in other PaaS; focusing on the Java ecosystem first, even with polyglot support, CloudBees understands well the Java ecosystem's complexity and its underlying practices. Heroku was one of the first successful PaaS, focusing on Ruby runtime. Deployment of a Ruby application is just about sending source code to the platform using the following command: git push heroku master Ruby is a pleasant ecosystem because there are no such long debates on building and provisioning tools that we know of, unlike in JavaWorld, GemFile, and Rake, period. In the Java ecosystem, there is a need to generate, compile the source code, and then sometime post the process classes, hence a large set of build tools are required. There's also a need to provision runtime with dozens of dependencies, so a set of dependency management tools, inter-project relations, and so on are required. With Agile development practices, automated testing has introduced a huge set of test frameworks that developers want to integrate into the deployment process. The Java platform is not just about hosting a JVM or a servlet container, it's about managing Ant, Maven, SBT, or Gradle builds, as well as Grails-, Play-, Clojure-, and Scala-specific tooling. It's about hosting dependency repositories. It's about handling complex build processes to include multiple levels of testing and code analysis. The CloudBees platform has two major components: RUN@cloud is a PaaS, as described earlier, to host applications and provide high-level runtime services DEV@cloud is a continuous integration and deployment SaaS based on Jenkins Jenkins is not the subject of this article, but it is the de facto standard for but not limited to continuous integration in the Java ecosystem. With a large set of plugins, it can be extended to support a large set of tools, processes, and views about your project. The CloudBees team includes major Jenkins committers (including myself #selfpromotion), and so it has a deep knowledge on Jenkins ecosystem and is best placed to offer it as a Cloud service. We also can help you to diagnose your project workflow by applying the best continuous integration and deployment practices. This also helps you to get more efficient and focused results on your actual business development. The following screenshot displays the continuous Cloud delivery concept in CloudBees: With some CloudBees-specific plugins to help, DEV@cloud Jenkins creates a smooth code-build-deploy pipeline, comparable to Heroku's Git push, but with full control over the intermediary process to convert your source code to a runnable application. This is such a significant component to build a full stack for Java developers that CloudBees is the official provider for the continuous integration service for Google App Engine (http://googleappengine.blogspot.fr/2012/10/jenkins-meet-google-app-engine.html), Cloud Foundry (http://blog.cloudfoundry.com/2013/02/28/continuous-integration-to-cloud-foundry-com-using-jenkins-in-the-cloud/), and Amazon. Summary This article introduced the Cloud principles and benefits, and compared CloudBees to its competitors. Resources for Article: Further resources on this subject: Framework Comparison: Backbase AJAX framework Vs Other Similar Framework (Part 2) [Article] Integrating Spring Framework with Hibernate ORM Framework: Part 2 [Article] Working with Zend Framework 2.0 [Article]
Read more
  • 0
  • 0
  • 2872

article-image-jboss-eap6-overview
Packt
19 Dec 2013
5 min read
Save for later

JBoss EAP6 Overview

Packt
19 Dec 2013
5 min read
(For more resources related to this topic, see here.) Understanding high availability To understand the term high availability, here is its definition from Wikipedia: "High availability is a system design approach and associated service implementation that ensures that a prearranged level of operational performance will be met during a contractual measurement period. Users want their systems, for example, hospitals, production computers, and the electrical grid to be ready to serve them at all times. If a user cannot access the system, it is said to be unavailable." In the IT field, when we mention the words "high availability", we usually think of the uptime of the server, and technologies such as clustering and load balancing can be used to achieve this. Clustering means to use multiple servers to form a group. From their perspective, users see the cluster as a single entity and access it as if it's just a single point. The following figure shows the structure of a cluster: To achieve the previously mentioned goal, we usually use a controller of the cluster, called load balancer, to sit in front of the cluster. Its job is to receive and dispatch user requests to a node inside the cluster, and the node will do the real work of processing the user requests. After the node processes the user request, the response will be sent to the load balancer, and the load balancer will send it back to the users. The following figure shows the workflow:     Besides load balancing user requests, the clustering system can also do failover inside itself. Failover means when a node has crashed, the load balancer can switch to other running nodes to process user requests. In a cluster, some nodes may fail during runtime. If this happens, the requests to the failed nodes should be redirected to the healthy nodes. The process is shown in the following figure: To make failover possible, the node in a cluster should be able to replicate user data from one to another. In JBoss EAP6, the Infinispan module, which is a data-grid solution provided by the JBoss community, does the web session replication. If one node fails, the user request could be redirected to another node; however, the session with the user won't be lost. The following figure illustrates failover: To achieve the previously mentioned goals, the JBoss community has provided us a powerful set of tools. In the next section we'll have an overview on it. JBoss EAP6 high availability As a Java EE application server, JBoss EAP6 uses modules coming from different open source projects: Web server (JBossWeb) EJB (JBoss EJB3) Web service (JBossWS/RESTEasy) Messaging (HornetQ) JPA and transaction management (Hibernate/Narayana) As we can see, JBoss EAP6 uses many more open source projects, and each part may have its own consideration to achieve the goal of high availability. Now let's have a brief on these parts with respect to high availability: JBoss Web, Apache httpd, mod_jk, and mod_cluster The clustering for a web server may be the most popular topic and is well understood by the majority. There are a lot of good solutions in the market. For JBoss EAP6, the solution it adopted is to use Apache httpd as the load balancer. httpd will dispatch the user requests to the EAP server. Red Hat has led two open source projects to work with httpd, which are called mod_jk and mod_cluster. In this article we'll learn how to use these two projects. EJB session bean JBoss EAP6 has provided the @org.jboss.ejb3.annotation.Clustered annotation that we can use on both the @Stateless and @Stateful session beans. The clustered annotation is JBoss EAP6/WildFly specific implementation. When using @Clustered with @Stateless, the session bean can be load balanced; and when @Clustered is used with the @Stateful bean, the state of the bean will be replicated in the cluster. JBossWS and RESTEasy JBoss EAP6 provides two web service solutions out of the box. One is JBossWS and the other is RESTEasy. JBossWS is a web service framework that implements the JAX-WS specification. RESTEasy is an implementation of the JAX-RS specification to help you to build RESTFul web services. HornetQ HornetQ is a high-performance messaging system provided by the JBoss community. The messaging system is designed to be asynchronous and has its own consideration on load balancing and failover. Hibernate and Narayana In the database and transaction management field, high availability is a huge topic. For example, each database vendor may have their own solutions on load balancing the database queries. For example, PostgreSQL has some open source solutions, for example, Slony and pgpool, which can let us replicate the database from master to slave and which distributes the user queries to different database nodes in a cluster. In the ORM layer, Hibernate also has projects such as Hibernate Shards that can deploy a data base in a distributed way. JGroups and JBoss Remoting JGroups and JBoss Remoting are the cornerstone of JBoss EAP6 clustering features, which enable it to support high availability. JGroups is a reliable communication system based on IP multicasting. JGroups is not limited to multicast and can use TCP too. JBoss Remoting is the underlying communication framework for multiple parts in JBoss EAP6. Summary In this article we learned the basic concepts about high availability and also had an overview of the basic functions of JBoss EAP6. This will help you in understanding JBoss EAP6 in a better way. Resources for Article: Further resources on this subject: Introduction to JBoss Clustering [Article] JBoss RichFaces 3.3 Supplemental Installation [Article] JBoss AS plug-in and the Eclipse Web Tools Platform [Article]
Read more
  • 0
  • 0
  • 2163

article-image-reporting
Packt
19 Dec 2013
4 min read
Save for later

Reporting

Packt
19 Dec 2013
4 min read
(For more resources related to this topic, see here.) Creating a pie chart First, we made the component test CT for display purposes, but now let's create the CT to make it run. We will use the Direct function, so let's prepare that as well. In reality we've done this already. Duplicate a different app.html and change the JavaScript file like we have done before. Please see the source file for the code: 03_making_a_pie_chart/ct/dashboard/pie_app.html. Implementing the Direct function Next, prepare the Direct function to read the data. First, it's the config.php file that defines the API. Let's gather them together and implement the four graphs (source file: 04_implement_direct_function/php/config.php). .... 'MyAppDashBoard'=>array( 'methods'=>array( 'getPieData'=>array( 'len'=>0 ), 'getBarData'=>array( 'len'=>0 ), 'getLineData'=>array( 'len'=>0 ), 'getRadarData'=>array( 'len'=>0 ) ) .... Next, let's create the following methods to acquire data for the various charts: getPieData getBarData getLineData getRadarData First, implement the getPieData method for the pie chart. We'll implement the Direct method to get the data for the pie chart. Please see the actual content for the source code (source file: 04_implement_direct_function/php/classes/ MyAppDashBoard.php ). This is acquiring valid quotation and bill data items. With the data to be sent back to the client, set the array in items and set up the various names and data in a key array. You will now combine the definitions in the next model. Preparing the store for the pie chart Charts need a store, so let's define the store and model (source file: 05_prepare_the_store_for_the_pie_chart/app/model/ Pie.js). We'll create the MyApp.model.Pie class that has the name and data fields. Connect this with the data you set with the return value of the Direct function. If you increased the number of fields inside the model you just defined, make sure to amend the return field values, otherwise it won't be applied to the chart, so be careful. We'll use the model we made in the previous step and implement the store (source file: 05_prepare_the_store_for_the_pie_chart/app/model/ Pie.js). Ext.define('MyApp.store.Pie', { extend: 'Ext.data.Store', storeId: 'DashboardPie', model: 'MyApp.model.Pie', proxy: { type: 'direct', directFn: 'MyAppDashboard.getPieData', reader: { type: 'json', root: 'items' } } }) Then, define the store using the model we made and set up the Direct function we made earlier in the proxy. Creating the View We have now prepared the presentation data. Now, let's quickly create the view to display it (source file: 06_making_the_view/app/view/dashboard/Pie.js). Ext.define('MyApp.view.dashboard.Pie', { extend: 'Ext.panel.Panel', alias : 'widget.myapp-dashboard-pie', title: 'Pie Chart', layout: 'fit', requires: [ 'Ext.chart.Chart', 'MyApp.store.Pie' ], initComponent: function() { var me = this, store; store = Ext.create('MyApp.store.Pie'); Ext.apply(me, { items: [{ xtype: 'chart', store: store, series: [{ type: 'pie', field: 'data', showInLegend: true, label: { field: 'name', display: 'rotate', contrast: true, font: '18px Arial' } }] }] }); me.callParent(arguments); } }); Implementing the controller With the previous code, data is not being read by the store and nothing is being displayed. In the same way that reading was performed with onShow, let's implement the controller (source file: 06_making_the_view/app/controller/DashBoard.js): Ext.define('MyApp.controller.dashboard.DashBoard', { extend: 'MyApp.controller.Abstract', screenName: 'dashboard', init: function() { var me = this; me.control({ 'myapp-dashboard': { 'myapp-show': me.onShow, 'myapp-hide': me.onHide } }); }, onShow: function(p) { p.down('myapp-dashboard-pie chart').store.load(); }, onHide: function() { } }); With the charts we create from now on, as we create them it would be good to add the reading process to onShow. Let's take a look at our pie chart which appears as follows: Summary You must agree this is starting to look like an application! The dashboard is the first screen you see right after logging in. Charts are extremely effective in order to visually check a large and complicated amount of data. If you keep adding panels as and when you feel it's needed, you'll increase its practicability. This sample will become a customizable base for you to use in future projects. Resources for Article: Further resources on this subject: So, what is Ext JS? [Article] Buttons, Menus, and Toolbars in Ext JS [Article] Displaying Data with Grids in Ext JS [Article]
Read more
  • 0
  • 0
  • 4550
Unlock access to the largest independent learning library in Tech for FREE!
Get unlimited access to 7500+ expert-authored eBooks and video courses covering every tech area you can think of.
Renews at €18.99/month. Cancel anytime
article-image-sharing-your-bi-reports-and-dashboards
Packt
19 Dec 2013
4 min read
Save for later

Sharing Your BI Reports and Dashboards

Packt
19 Dec 2013
4 min read
(For more resources related to this topic, see here.) The final objective of the information in the BI reports and dashboards is to detect the cause-effect business behavior and trends, and trigger actions to solve them. These actions supported by visual information, via scorecards and dashboards. This process requires an interaction with several people. MicroStrategy includes the functionality to share our reports, scorecards, and dashboards, regardless of the location of the people. Reaching your audience MicroStrategy offers the option to share our reports via different channels that leverage the latest social technologies that are already present in the marketplace, that is, MicroStrategy integrates with Twitter and Facebook. The sharing is like avoiding any related costs and maintaining the design premise of the do-it-yourself approach without any help from specialized IT personnel. Main menu The main menu of MicroStrategy shows a column named Status. When we click on that column, as shown in the following screenshot, the Share option appears: The Share button The other option is the Share button within our reports, that is, the view that we want to share. Select the Share button located at the bottom of the screen, as shown in the following screenshot: The share options are the same, regardless of the location where you activate the option; the various alternate menus are shown in the following screenshot: E-mail sharing While selecting the e-mail option from the Scorecards-Dashboards model, the system will ask you for the e-mail programs that you want to use in order to send an e-mail; in our case, we select Outlook. MicroStrategy automatically prepares an e-mail with a link to share it. You can modify the text, and select the recipients of the e-mail, as shown in the following screenshot: The recipients of the e-mail will click on the URL that is included in the e-mail, send it by this schema, and the user will be able to analyze the report in a read-only mode with only the Filters panel enabled. The following screenshot shows how the user will review the report. Also, the user is not allowed to make any modifications. This option does not require a MicroStrategy platform user account. When a user clicks on the link, he is able to edit the filters and perform their analyses, as well as switch to any available layout, in our case, scorecards and dashboards. As a result, any visualization object can be maximized and minimized for better analysis, as shown in the following screenshot: In this option, the report can be visualized in a fullscreen mode by clicking on the fullscreen button [] located at the top-right corner of the screen. In this sharing mode, the user is able to download the information in Excel and PDF formats for each visualization object. For instance, if you need all the data included in the grid for the stores in region 1 opened in the year 2000. Perform the following steps: In the browser, open the URL that is generated when you select the e-mail share option. Select the ScoreCard tab. In the Open Year filter, type 2012 and in the Region filter, type 1. Now, maximize the grid. Two icons will appear in the top-left corner of the screen: one for exporting the data to Excel and the other for exporting it to PDF for each visualization object, as shown in the following screenshot: Please keep in mind that these two export options only apply to a specific visualization object; it is not possible to export the complete report from this functionality that is offered to the consumer. Summary In this article, we learned how to share our scorecards and dashboards via several channels, such as e-mails, social networks (Twitter and Facebook), and blogs or corporate intranet sites. Resources for Article: Further resources on this subject: Participating in a business process (Intermediate) [Article] Self-service Business Intelligence, Creating Value from Data [Article] Exploring Financial Reporting and Analysis [Article]
Read more
  • 0
  • 0
  • 1644

article-image-background-animation
Packt
19 Dec 2013
4 min read
Save for later

Background Animation

Packt
19 Dec 2013
4 min read
(For more resources related to this topic, see here.) Background-color animation Animating the background color of an element is a great way to draw our user's eyes to the object we want them to see. Another use for animating the background color of an element is to show that something has happened to the element. It's typically used in this way if the state of the object changes (added, moved, deleted, and so on), or if it requires attention to fix a problem. Due to the lack of support in jQuery 2.0 for animating background-color, we'll be using jQuery UI to give us the functionality we need to create this effect. Introducing the animate method The animate() method is one of the most useful methods jQuery has to offer in its bag of tricks in the animation realm. With it, we’re able to do things like, move an element across the page or alter and animating the properties of colors, backgrounds, text, fonts, the box model, position, display, lists, tables, generated content, and so on. Time for action – animating the body background-color Following the steps below, we're going to start by creating an example that changes the body background color. Start by creating a new file (using our template) called background-color.html and save it in our jquery-animation folder. Next, we'll need to include the jQuery UI library by adding this line directly under our jQuery library by adding this line: <script src = "js/jquery-ui.min.js"></script> A custom or stable build of jQuery UI can be downloaded from http://jqueryui.com, or you can link to the library using one of the three Content Delivery Networks (CDN) below. For fastest access to the library, go to http://jqueryui.com, scroll to the very bottom and look for the Quick Access section. Using the jQuery UI library JS file there will work just fine for our needs for the examples in this article. Media Template: http://code.jquery.com Google: http://developers.google.com/speed/libraries/devguide#jquery-ui Microsoft: http://asp.net/ajaxlibrary/cdn.ashx#jQuery_Releases_on_the_CDN_0 CDNJS: http://cdnjs.com/libraries/jquery Then, we'll add the following jQuery code to the anonymous function: var speed = 1500; $( "body").animate({ backgroundColor: "#D68A85" },speed); $( "body").animate({ backgroundColor: "#E7912D" },speed); $( "body").animate({ backgroundColor: "#CECC33" },speed); $( "body").animate({ backgroundColor: "#6FCD94" },speed); $( "body").animate({ backgroundColor: "#3AB6F1" },speed); $( "body").animate({ backgroundColor: "#8684D8" },speed); $( "body").animate({ backgroundColor: "#DD67AE" },speed); What just happened? First we added in the jQuery UI library to our page. This was needed because of the lack of support for animating the background color in the current version of jQuery. Next, we added in the code that will animate our background. We then set the speed variable to 1500 (milliseconds) so that we can control the duration of our animation. Lastly, using the animate() method, we set the background color of the body element and set duration to the variable we set above named speed. We duplicated the same line several times, changing only the hexadecimal value of the background color. The following screenshot is an illustration of colors the entire body background color animates through: Chaining together jQuery methods It's important to note that jQuery methods (animate() in this case) can be chained together. Our code mentioned previously would look like the following if we chained the animate() methods together: $("body")   .animate({ backgroundColor: "#D68A85"}, speed)  //red   .animate({ backgroundColor: "#E7912D"}, speed)  //orange   .animate({ backgroundColor: "#CECC33"}, speed)  //yellow   .animate({ backgroundColor: "#6FCD94"}, speed)  //green   .animate({ backgroundColor: "#3AB6F1"}, speed)  //blue   .animate({ backgroundColor: "#8684D8"}, speed)  //purple   .animate({ backgroundColor: "#DD67AE"}, speed); //pink Here's another example of chaining methods together: (selector).animate(properties).animate(properties).animate(properties) Have a go hero – extending our script with a loop In this example we used the animate() method and with some help from jQuery UI, we were able to animate the body background color of our page. Have a go at extending the script to use a loop, so that the colors continually animate without stopping once the script gets to the end of the function. Pop quiz – chaining with the animate() method Q1. Which code will properly animate our body background color from red to blue using chaining? $("body")   .animate({ background: "red"}, "fast")   .animate({ background: "blue"}, "fast"); $("body")   .animate({ background-color: "red"}, "slow")   .animate({ background-color: "blue"}, "slow"); $("body")   .animate({ backgroundColor:"red" })   .animate({ backgroundColor:"blue" }); $("body")   .animate({ backgroundColor,"red" }, "slow")   .animate({ backgroundColor,"blue" }, "slow");
Read more
  • 0
  • 0
  • 2520

article-image-code-editing
Packt
18 Dec 2013
9 min read
Save for later

Code Editing

Packt
18 Dec 2013
9 min read
(For more resources related to this topic, see here.) Discovering Search and Replace Search and Replace is one of the common actions we use in every editor, sublime text has two main search features: Single file Multiple files Before covering these topics, let's talk about the best tool available for searching text and especially, patterns, namely, Regular Expressions. Regular Expressions Regular Expressions can find complex patterns in text. To take full advantage of the Search and Replace features of Sublime, you should at least know the basics of Regular Expressions, also known as regex or regexp. Regular Expressions can be really annoying, painful, and joyful at the same time! We won't cover Regular Expressions in this article because it's an endless topic. We will only note that Sublime Text uses the Boost's Perl Syntax for Regular Expressions; this can be found at http://www.boost.org/doc/libs/1_47_0/libs/regex/doc/html/boost_regex/syntax/perl_syntax.html I recommend going to http://www.regular-expressions.info/quickstart.html if you are not familiar with Regular Expressions. Search and Replace – a single file Let's open the Search panel by pressing Ctrl + F on Windows and Linux or command + F on OS X. The search panel options can be controlled using keyboard shortcuts: Search panel Options Windows/Linux OS X Toggle Regular Expressions Alt + R command + Option + R Toggle Case Sensitivity Alt + C command + Option + C Toggle Exact Match Alt + W command + Option + W Find Next Enter Enter Find Previous Shift + Enter Shift + Enter Find All Alt + Enter Option + Enter As we can see in the following screenshot, we have the Regular Expression option turned on: Let's try Search and Replace now by pressing Ctrl + H on Windows and Linux or Option + command + F on OS X and examining the following screenshot: We can see that this time, both, the Regular Expression option and the Case Sensitivity option are turned on. Because of the Case Sensitivity option being on, line 8 isn't selected, the pattern messages/(d) doesn't match line 2 because d only matches numbers, and the 1 on the Replace with field will replace match group number 1, indicated by the parentheses around d. We can also refer to the group by using $1 instead of 1. Let's see what happens after we press Ctrl + Alt + Enter for Replace All: We can see that lines 2 and 8 still say messages and not message; that's exactly what we expected! The incremental search Incremental search is another cool feature that is here to save us keyboard clicks. We can bring up the incremental search panel by pressing Ctrl + I on Windows and Linux or command + I on OS X. The only difference between the incremental search and a regular search is the behavior of the Enter key; in incremental searches, the Enter key will select the next match and dismiss the search panel. This saves us from pressing Esc to dismiss the regular search panel. Search and Replace – multiple files Sublime Text also allows a multiple file search by pressing Ctrl + Shift + F or command + Shift + F on OS X. The same shortcuts from the single file search also apply here; the difference is that we have the Where field and a … button near it. The Where field determines where the files can be searched for; we can define the scope of the search in several ways: Adding individual directories (Unix-style paths, even on Windows( Adding/excluding files based on the wildcard pattern Adding Sublime-symbolic locations such as <open folders>, <open files> We can also combine all the filters by separating them with commas. We can do it in the following manner: /C/Users/Dan/Cool Project,*.rb,<open files> This will look in all files in C:UsersDanCool Project that ends with .rb and are currently open by Sublime. Results will be opened in a new tab called Find Results containing all found results separated by file paths, double clicking on a result will get you to the exact location of the result in the original file. Mastering Column and Multiple Selection Multiple Selections is one of Sublime's coolest features; TextMate users might be familiar with it. So how can we select multiple lines? We select one line like we usually do and selecting the second line while holding Ctrl or command on OS X. We can also subtract a line by holding the Alt key or command + Shift keys on OS X. This feature is really useful so it is recommended to play with it, the following are some shortcuts that can help us feel more comfortable with multiple selections: Multiple Selection action Windows/Linux OS X Return to Single Selection Mode Esc Esc Undo last selection motion Ctrl + U command + U Add next occurrence of selected text to selection Ctrl + D command + D Add all occurrences of selected text to selection Alt + F3 Control + command + G Turn Single Linear Selection into Block Selection Ctrl + Shift + L Shift + command + L Column Selection The Column Selection feature is one of my favorites! We can select multiple lines by pressing Shift and dragging the right mouse button on Windows or Linux and pressing Option and dragging the left mouse button on OS X. Here we want to remove the letter s from messages, as shown in the following screenshot: We have selected all s using Column selection; now we just need to hit backspace to delete them. Navigating through everything Sublime is known for its ability to quickly move between and around files and lines. Here, we are going to master how to navigate our code quickly and easily. Going To Anything We already learned how to use the Go To Anything feature, but it can do more than just searching for filenames. We can conduct a fuzzy search inside a "fuzzily found" file. Really? Yeah, we can. For example, we can type the following inside the Go To Anything window: isl#wld This will make Sublime perform a fuzzy search for wld inside the file that we found by fuzzy searching isl; it can thus find the word world inside a file named island. We can also perform a fuzzy search in the current file by pressing Ctrl + ; in Windows or Linux and command + P, # in OS X. It is very common to use fuzzy search inside HTML files because it will immediately show all the elements and classes in order to accelerate navigation. Symbol search Sometimes we want to search for a specific function or specific class inside the current file. With Sublime we can do it simply by pressing Ctrl + R on Windows or Linux and command + R on OS X. Projects Project is a group of files and folders. To save a project we just need to add folders and files to the sidebar, and then from the menu, we navigate to Project | Save Project As… The saved file is our projects data, and it is stored in a JSON formatted file with a .sublime-project extension. The following is a sample project file: {     "folders":     [         {             "path":"src",             "follow_symlinks":true         },         {             "path":"docs",             "name":"Documentation",       "file_exclude_patterns":["*.xml"]         }     ],     "settings":     {         "tab_size":6     },     "build_systems":     [         {             "name":"List",             "shell_cmd":"ls -l"         }     ] } As we can see in the preceding code, there are three elements written as JSON arrays. Folders Each folder must have a valid folder path that can be absolute or relative to the project directory, which is where the project file is. A folder can also include the following keys: name: This is the name that will be shown on the sidebar file_execlude_pattern: This is the folder that will exclude all the files matching the given Regular Expression file_include_pattern: This is the folder that will include only files matching the given Regular Expression folder_execlude_pattern: This is the folder that will exclude all subfolders matching the given Regular Expression folder_include_pattern: This is the folder that will include only subfolders matching the given Regular Expression follow_symlinks: This will include symlinks if set to true Settings The project-specific settings array will contain all the settings that we want to apply only on this project. These settings will override our user settings. Build systems In an array of build system definitions, we must specify a name for each definition; these build systems will then be specified in Tools | Build Systems. For more information about build systems, please visit http://sublimetext.info/docs/en/reference/build_systems.html. Navigating between projects To switch between projects quickly, we can press Ctrl + Alt + P in Windows or Linux and Control + command + P in OS X. Summary By now, we have learned some of Sublime's basic features to the most advanced features and techniques that need to be used while editing code. Resources for Article: Further resources on this subject: Top features you need to know about [Article] Setting up environment for Cucumber BDD Rails [Article] Implementation of SASS [Article]
Read more
  • 0
  • 0
  • 1468

article-image-working-amqp
Packt
18 Dec 2013
5 min read
Save for later

Working with AMQP

Packt
18 Dec 2013
5 min read
(for more resources related to this topic, see here.) Broadcasting messages In this example we are seeing how to send the same message to a possibly large number of consumers. This is a typical messaging application, broadcasting to a huge number of clients. For example, when updating the scoreboard in a massive multiplayer game, or when publishing news in a social network application. In this article we are discussing both the producer and consumer implementation. Since it is very typical to have consumers using different technologies and programming languages, we are using Java, Python, and Ruby to show interoperability with AMQP. We are going to appreciate the benefits of having separated exchanges and queues in AMQP. Getting ready To use this article you will need to set up Java, Python and Ruby environments as described. How to do it… To cook this article we are preparing four different codes: The Java publisher The Java consumer The Python consumer The Ruby consumer To prepare a Java publisher: Declare a fanout exchange: channel.exchangeDeclare(myExchange, "fanout"); Send one message to the exchange: channel.basicPublish(myExchange, "", null, jsonmessage.getBytes()); Then to prepare a Java consumer: Declare the same fanout exchange declared by the producer: channel.exchangeDeclare(myExchange, "fanout"); Autocreate a new temporary queue: String queueName = channel.queueDeclare().getQueue(); Bind the queue to the exchange: channel.queueBind(queueName, myExchange, ""); Define a custom, non-blocking consumer. Consume messages invoking channel.basicConsume() The source code of the Python consumer is very similar to the Java consumer, so there is no need to repeat the needed steps. In the Ruby consumer you need to use require "bunny" and then use the URI connection. We are now ready to mix all together, to see the article in action: Start one instance of the Java producer; messages start getting published immediately. Start one or more instances of the Java/Python/Ruby consumer; the consumers receive only the messages sent while they are running. see that the consumer has lost the messages sent while it was down. How it works… Both the producer and the consumers are connected to RabbitMQ with a single connection, but the logical path of the messages is depicted in the following figure: In step 1 we have declared the exchange that we are using. The logic is the same as in the queue declaration: if the specified exchange doesn't exist, create it; otherwise, do nothing. The second argument of exchangeDeclare() is a string, specifying the type of the exchange, fanout in this case. In step 2 the producer sends one message to the exchange. You can just view it along with the other defined exchanges issuing the following command on the RabbitMQ command shell: rabbitmqctl list_exchanges The second argument in the call to channel.basicPublish() is the routing key, which is always ignored when used with a fanout exchange. The third argument, set to null, is the optional message property. The fourth argument is just the message itself. When we started one consumer, it created its own temporary queue (step 9). Using the channel.queueDeclare() empty overload, we are creating a nondurable, exclusive, autodelete queue with an autogenerated name. Launching a couple of consumers and issuing rabbitmqctl list_queues, we can see two queues, one per consumer, with their odd names, along with the persistent myFirstQueue as shown in the following screenshot: In step 5 we have bound the queues to myExchange. It is possible to monitor these bindings too, issuing the following command: rabbitmqctl list_bindings The monitoring is a very important aspect of AMQP; messages are routed by exchanges to the bound queues, and buffered in the queues. Exchanges do not buffer messages; they are just logical elements. The fanout exchange routes messages by just placing a copy of them in each bound queue. So, no bound queues and all the messages are just received by no one consumer. As soon as we close one consumer, we implicitly destroy its private temporary queue (that's why the queues are autodelete; otherwise, these queues would be left behind unused, and the number of queues on the broker would increase indefinitely), and messages are not buffered to it anymore. When we restart the consumer, it will create a new, independent queue and as soon as we bind it to myExchange, messages sent by the publisher will be buffered into this queue and pulled by the consumer itself. There's more… When RabbitMQ is started for the first time, it creates some predefined exchanges. Issuing rabbitmqctl list_exchanges we can observe many existing exchanges, in addition to the one that we have defined in this article: All the amq.* exchanges listed here are already defined by all the AMQP-compliant brokers and can be used instead of defining your own exchanges; they do not need to be declared at all. We could have used amq.fanout in place of myLastnews.fanout_6, and this is a good choice for very simple applications. However, applications generally declare and use their own exchanges. See also With the overload used in the article, the exchange is non-autodelete (won't be deleted as soon as the last client detaches it) and non-durable (won't survive server restarts). You can find more available options and overloads at http://www.rabbitmq.com/releases/rabbitmq-java-client/current-javadoc/. Summary In this article, we are mainly using Java since this language is widely used in enterprise software development, integration, and distribution. RabbitMQ is a perfect fit in this environment. resources for article: further resources on this subject: RabbitMQ Acknowledgements [article] Getting Started with ZeroMQ [article] Setting up GlassFish for JMS and Working with Message Queues [article]
Read more
  • 0
  • 0
  • 2775
article-image-component-based-approach-unity
Packt
18 Dec 2013
4 min read
Save for later

Component-based approach of Unity

Packt
18 Dec 2013
4 min read
(For more resources related to this topic, see here.) First of all, you have a project, which is essentially a folder that contains all of the files and information about your game. Some of the files are called scenes (think of them as levels). A scene contains a number of game objects that you have added to it. The contents of your scenes are determined by you, and you can have as many of them as you want. You can also make your game switch between different scenes, thus making different sets of game objects active. On a smaller scale, you have game objects and components. A game object by itself is simply an invisible container that does not do anything. Without adding appropriate components to it, it cannot, for instance, appear in the scene, receive input from the player, or move and interact with other objects. Using components, you can easily assemble powerful game objects while reusing several small parts, each responsible for a simple task or behavior—rendering the game object, handling the input, taking damage, playing an audio effect, and so on—making your game much simpler to develop and manage. Unity relies heavily on this approach, so the better you grasp it, the faster you will get good at it. The only component that each and every game object in Unity has attached to it by default is Transform. It lets you define the game object's position, rotation, and scale. Normally, you can attach, detach, and destroy components in any given game object at will, but you cannot remove Transform. Each component has a number of properties that you can access and change: these can be integer or floating point numbers, strings of text, textures, scripts, references to game objects or other components. They are used to change the way a certain component behaves, to influence its appearance or interaction. Some of the properties include the position, rotation, and scale properties of the Transform component. The following screenshot shows the Wall game object with the Transform, Mesh Filter, Box Collider, Mesh Renderer, and Script components attached to it. the properties of Transform are displayed. In order to reveal or hide a component's properties you need to left-click on its name or on the small arrow on the left of its icon. Unity has a number of predefined game objects that already have components attached to them, such as cameras, lights, and primitives. You can access them by choosing GameObject | Create from the main menu. Alternatively, you can create empty game objects by pressing command + Shift + N (Ctrl + Shift + N in Windows) and attach components to them using the Components submenu. The following figure shows the project structure that we have discussed. Note that there can be any number of scenes within a single project, any number of game objects within a single scene, any number of components attached to a single game object, and finally, any number of properties within a single component. One final thing that you need to know about components right now is that you can copy them by right-clicking on the name of the component in the Inspector panel and selecting Copy Component from the contextual menu shown in the following screenshot. You can also reset the properties of the components to their default values, remove components, and move them up or down for your convenience. Summary This article has covered the basic concept of the component-based approach of Unity and the figures/screenshots demonstrate the various aspect of the same. Resources for Article: Further resources on this subject: Mobile Game Design [Article] Unity Game Development: Welcome to the 3D world [Article] Interface Designing for Games in iOS [Article]
Read more
  • 0
  • 0
  • 7196

article-image-getting-started-apache-nutch
Packt
18 Dec 2013
13 min read
Save for later

Getting Started with Apache Nutch

Packt
18 Dec 2013
13 min read
(For more resources related to this topic, see here.) Introduction of Apache Nutch Apache Nutch is a very robust and scalable tool for webcrawling and it can be integrated with scripting language i.e Python for web crawling. You can use it whenever your application contains huge data and you want to apply crawling on your data. Apache Nutch is an Open Source WebCrawler Software which is used for crawling websites. You can create your own search engine like google if you understand Apache Nutch clearly. It will provide you your own search engine using which you can increase your application page rank in searching and also customize your application searching according to your needs. It is extensible and scalable. It facilitates for parsing, indexing, creating your own search engine, customize search according to needs, scalability, robustness and ScoringFilter for custom implementations. ScoringFilter is a Java class which is used while creating Apache Nutch plugin. It is used for manipulating scoring variables. We can run Apache Nutch on a single machine as well as distributed environment like Apache Hadoop. It is written in Java. We can find broken links using Apache Nutch, create a copy of all the visited pages for searching over for example: Build indexes. We can find Web page hyperlinks in an automated manner. Apache Nutch can be integrated with Apache Solr easily and we can index all the webpages which are crawled by Apache Nutch to Apache Solr. We can then use Apache Solr for searching the webpages which are indexed by Apache Nutch. Apache Solr is a search platform which is built on top of Apache Lucene. It can be used for searching any type of data for example webpages. Crawling your first website Crawling is driven by Apache Nutch crawling tool and certain related tools for building and maintaining several data structures. It includes web database, the index and a set of segments. Once Apache Nutch has indexed the webpages to Apache Solr, you can search for the required webpage(s) in Apache Solr. Apache Solr Installation Apache Solr is a search platform which is built on top of Apache Lucene. It can be used for searching any type of data for example webpages. It’s a very powerful searching mechanism and provides full-text search, dynamic clustering, database integration, rich document handling and many more. Apache SOLR will be used for indexing urls which are crawled by Apache Nutch and then one can search the details in Apache SOLR crawled by Apache Nutch. Crawling your website using the crawl script Apache Nutch 2.2.1 comes with the facility of crawl script which does crawling by just executing one single script. In earlier version, we have to manually do each step like generating data, fetching data, parsing data and so on for perfrom crawling. Crawling the web, the CrawlDb, and URL filters When user invokes crawling command in Apache Nutch 1.x, crawlDB is generated by Apache Nutch which is nothing but a directory which contains details about crawling. In Apache 2.x, crawlDB is not present. Instead Apache Nutch keeps all the crawling data directly into the database. InjectorJob The injector will add the necessary urls to the crawldb. Crawldb is the directory which is created by Apache Nutch for storing data related to crawling. You need to provide urls to InjectorJob either by downloading urls from internet or writing your own file which contains urls. Let’s say you have created one directory called urls which contains all the urls that needs to be injected in cralwdb. Following command will be used for perform the InjectorJob: #bin/nutch inject crawl/crawldb urls Urls will be directory which contains all the urls which needs to be injected in crawldb. Crawl/crawldb is the directory in which injected urls will be placed. After performing this job, you have number of unfetched urls inside your database i.e crawldb. GeneratorJob Once we have done with the InjectorJob, now it’s time to fetch the injected urls from crawldb. So for fetching the urls, you need to perform GeneratorJob before. Follwing command will be used for GeneratorJob: #bin/nutch generate crawl/crawldb crawl/segments Crawldb is the directory from where urls are generated. Segments is the directory which is used by GeneratorJob to fetch the necessary information required for crawling. FetcherJob The job of the fetch is to fetch the urls which are generated by GeneratorJob. It will use the input provided by GeneratorJob. Follwing command will be used for FetcherJob: #bin/nutch fetch –all Here I have provided input parameters –all which means this job will fetch all the urls which are generated by GeneratorJob. You can use different input parameters according to your needs. ParserJob After FetcherJob, ParserJob is to parse the urls which are fetched by FetcherJob. Follwing command will be used for ParserJob: # bin/nutch parse –all I have used input parameters –all which will parse all the urls which are fetched by FetcherJob. You can use different input parameter according to your needs. DbUpdaterJob Once the ParserJob has been completed, we need to update the database by providing results of the FetcherJob. This will update the respected databases with the last fetched urls. Following command will be used for DbUpdaterJob: # bin/nutch updatedb crawl/crawldb –all After performing this job, database will contain both updated entries of all the initial pages and also contains the new entities which are correspond to the newly discovered pages which are linked from the initial set. Invertlinks Before applying indexing, we need to first invert all the links. After this we will be able to index incoming anchor text with the pages. Following command will be used for Invertlinks: # bin/nutch invertlinks crawl/linkdb -dir crawl/segments Apache Hadoop Apache Hadoop is designed for running your application on servers where there will be lot of computers in which one will be master computer and rest will be the slave computers. So it’s huge data warehouse. Master computers are the computers which will direct slave computers for data processing. So processing is done by slave computers. This is the reason why Apache Hadoop is used for processing huge amount of data as process is divided into the number of slave computers and that’s why Apache Hadoop gives highest throughput for any processing. So as data will increase, you need to increase number of slave computers. That’s how Apache Hadoop functionality runs. Integration of Apache Nutch with Apache Hadoop Apache Nutch can be easily integrated with Apache Hadoop and we can make our process much faster than running Apache Nutch on single machine. After integrating Apache Nutch with Apache Hadoop, we can perform crawling on Apache Hadoop cluster environment. So the process will be much faster and we will get highest amount of throughput. Apache Hadoop Setup with Cluster This setup is not required a huge hardware to purchase and running Apache Nutch and Apache Hadoop. It is designed in such a way to make the use of hardware maximum. Formatting the HDFS filesystem using the NameNode HDFS stands for Hadoop Distributed File system is a directory which is used by Apache Hadoop for storage purpose. So it’s the directory which stroes all the data related to Apache Hadoop. It has two components as NameNode and DataNode in which NameNode manages the filesystem metadata and DataNodes actually stores the data. It’s highly configurable and suited well for many installations. When there are very large clusters, at that time configuration needs to be tuned. The first step for getting start your Apache Hadoop is the formatting Hadoop filesystem which is implemented on top of the local filesystem of your cluster(which will include only your local machine if you have followed). Setting up the deployment architecture of Apache Nutch We have to setup Apache Nutch on each of the machine which we are using. In this case, we are using six machines cluster. So we have to setup Apache Nutch on each machine. For the less number of machines in our cluster configuration, we can setup manually on each machine. But when the machines are more, let’s say we have 100 machines in our cluster environment. So we can’t setup on each machine manually. For that we require some deployment tool such as Chef or ateleast distributed ssh. You can refer to http://www.opscode.com/chef/ for getting familiar with Chef. You can refer http://www.ibm.com/developerworks/aix/library/au-satdistadmin/for getting familiar with distributed ssh.I will just demonstrate about running Apache Hadoop on Ubuntu for Single-Node Cluster. If you want to go for running Apache Hadoop on Ubuntu for Multi-Node cluster then I have already provided reference link above. You can follow that and configure the same. Once we have done with the deployment of Apache Nutch to single machine, we will run this script start-all.sh that will start the services on the master node and data nodes. It means the script will begin the hadoop daemons on the master node and so we are able to login into all the slave nodes using ssh command as explained above and will begin daemons on the slave nodes. The start-all.sh script expects that Apache Nutch should be put on the same location on each machine. It is also expecting that Apache Hadoop is storing the data at the same filepath on each machine. The start-all.sh script which starts the daemons on the master and slave nodes are going to use password-less login using ssh. Introduction of Apache Nutch configuration with Eclipse Apache Nutch can be easily configured with Eclipse. After that we can perform crawling easily using Eclipse. So need to perform crawling from command line. We can use eclipse for all the operations of crawling which we are doing from command line.Instructions are provided for fixing a development environment for Apache Nutch with Eclipse IDE. It's supposed to give a comprehensive starting resource for configuring, building, crawling and debugging of Apache Nutch within the above of context. Following are the prerequisites for Apache Nutch integration with Eclipse: Get the latest version of Eclipse from http://www.eclipse.org/downloads/packages/release/juno/r All the required subsequent are available from the Eclipse Marketplace. But if they are not, you can download eclipse market place as follows http://marketplace.eclipse.org/marketplace-client-intro Once you've configuired Eclipse, Download as per here http://subclipse.tigris.org/. If you have faced a problem with the 1.8.x release, try 1.6.x. This may resolve compatability issues. Download IvyDE plugin for Eclipse as here http://ant.apache.org/ivy/ivyde/download.cgi Download m2e plugin for Eclipse here http://marketplace.eclipse.org/content/maven-integration-eclipse Introduction of Apache Accumulo Accumulo is basically used as the datastore for storing data. So same way as we are using different databases like MySQL, Oracle, etc. So same way Apache Accumulo can be used. The key point of Apache Accumulo is, it is running on Apache Hadoop Cluster environment. So that's a very good feature with Accumulo.Accumulo sorted, distributed key/value store could be a strong, scalable, high performance information storage and retrieval system. Apache Accumulo depends on Google's BigTable design and is built ontop of Apache Hadoop, ,Thrift and Zookeeper. Apache Accumulo features a some novel improvement on the BigTable design within a form of cell-based access management and the server-side programming mechanism which will do modificationication in key/value pairs at varied points within the data management process Introduction of Apache Gora Apache Gora open source framework providesin-memory data model and persistence for large data. Apache Gora supports persisting to column stores, key and value stores, document stores and RDBMSs and analyzing the data with extensive Apache Hadoop MapReduce support. Supported Datastores Apache Gora presently supports the subsequent datastores: AccumuloProphetess PApache Hbase Amazon DynamoDB Use of Apache Gora Although there are many excellent ORM frameworks for relational databases and data modeling in NoSQL data stores different profoundly from their relative cousins. DataD-model agnostic frameworks like JDO aren't comfortable to be used cases, wherever one has to use the complete power of data models in column stores. Gora fills the thegap giving user an easy-to-use in-memory data model plus persistence for large data frameworkproviding data store specific mappings and also in built Apache Hadoop support. Integration of Apache Nutch with Apache Accumulo In this section, we are going to cover the integration process for integrating Apache Nutch with Apache Accumulo. Apache Accumulo is basically used for a huge data storeage. It is built on the top of Apache Hadoop, Zookeeper and Thrift. So a potential use of integrating Apache Nutch with Apache Accumulo is when our application has huge data to process and we want to run our application in cluste environment. At that time we can use Apache Accumulo as data storage purpose. As Apache Accumulo only running with Apache Hadoop, maximum use of Apache Accumulo would be in cluster based environment. So first we will start with the configuration of Apache GORA with Apache Nutch. Then we will setup Apache Hadoop and Zookeeper. Then we will do installation and configuration of Apache Accumulo. Then we will test Apache Accumulo and at the end we will see Crawling with Apache Nutch on Apache Accumulo. Setup Apache Hadoop and Apache Zookeeper for Apache Nutch Apache Zookeeper is a centralized service which is used for maintaining configuration information, provideses distributed synchronization, naming and also provideses group services. All these services are used by distributed applications in one or another manner. So all these services are provided by zookeeper so you don’t have to write these services from scratch. You can use these services for implementing consensus, management, group, leader election and presence protocols and you can also build it for your own requirements. Apache Accumulo is built on the top of Apache Hadoop, Zookeeper. So we must configure Apache Accumulo within Apache Hadoop and Apache Zookeeper. You can referrer to http://www.covert.io/post/18414889381/accumulo-nutch-and-gora for any queries related to setup. Integration of Apache Nutch with MySQL In this section, we are going to integrate Apache Nutch with MySQL. So after that you can crawled webpages in Apache Nutch that will be stored in MYSQL. So you can go to MySQL and check your crawled webpages and also perform necessary operations. We will start with the introduction of MySQL then we will cover what is the need of integrating MySQL with Apache Nutch. After that we will see configuration of MySQL with Apache Nutch and at the end we will do crawling with Apache Nutch on MySQL. So let’s just start with the introduction of MYSQL. Summary We covered the following: Downloading Apache Hadoop and Apache Nutch Perform Crawling on Apache Hadoop Cluster in Apache Nutch Apache Nutch configuration with eclipse Installation steps of building Apache Nutch with Eclipse Crawling in Eclipse Configuration of Apache GORA with Apache Nutch Installation and Configuration of Apache Accumulo Crawling with Apache Nutch on Apache Accumulo Need of integrating MySQL with Apache Nutch   Resources for Article: Further resources on this subject: Getting Started with the Alfresco Records Management Module [Article] Making Big Data Work for Hadoop and Solr [Article] Apache Solr PHP Integration [Article]
Read more
  • 0
  • 0
  • 3017

article-image-hello-opencl
Packt
18 Dec 2013
16 min read
Save for later

Hello OpenCL

Packt
18 Dec 2013
16 min read
(For more resources related to this topic, see here.) The Wikipedia definition says that, Parallel Computing is a form of computation in which many calculations are carried out simultaneously, operating on the principle that large problems can often be divided into smaller ones, which are then solved concurrently (in parallel). There are many Parallel Computing programming standards or API specifications, such as OpenMP, OpenMPI, Pthreads, and so on. This book is all about OpenCL Parallel Programming. In this article, we will start with a discussion on different types of parallel programming. We will first introduce you to OpenCL with different OpenCL components. We will also take a look at the various hardware and software vendors of OpenCL and their OpenCL installation steps. Finally, at the end of the article we will see an OpenCL program example SAXPY in detail and its implementation. Advances in computer architecture All over the 20th century computer architectures have advanced by multiple folds. The trend is continuing in the 21st century and will remain for a long time to come. Some of these trends in architecture follow Moore's Law. "Moore's law is the observation that, over the history of computing hardware, the number of transistors on integrated circuits doubles approximately every two years". Many devices in the computer industry are linked to Moore's law, whether they are DSPs, memory devices, or digital cameras. All the hardware advances would be of no use if there weren't any software advances. Algorithms and software applications grow in complexity, as more and more user interaction comes into play. An algorithm can be highly sequential or it may be parallelized, by using any parallel computing framework. Amdahl's Law is used to predict the speedup for an algorithm, which can be obtained given n threads. This speedup is dependent on the value of the amount of strictly serial or non-parallelizable code (B). The time T(n) an algorithm takes to finish when being executed on n thread(s) of execution corresponds to: T(n) = T(1) (B + (1-B)/n) Therefore the theoretical speedup which can be obtained for a given algorithm is given by : Speedup(n) = 1/(B + (1-B)/n) Amdahl's Law has a limitation, that it does not fully exploit the computing power that becomes available as the number of processing core increase. Gustafson's Law takes into account the scaling of the platform by adding more processing elements in the platform. This law assumes that the total amount of work that can be done in parallel, varies linearly with the increase in number of processing elements. Let an algorithm be decomposed into (a+b). The variable a is the serial execution time and variable b is the parallel execution time. Then the corresponding speedup for P parallel elements is given by: (a + P*b) Speedup = (a + P*b) / (a + b) Now defining α as a/(a+b), the sequential execution component, as follows, gives the speedup for P processing elements: Speedup(P) = P – α *(P - 1) Given a problem which can be solved using OpenCL, the same problem can also be solved on a different hardware with different capabilities. Gustafson's law suggests that with more number of computing units, the data set should also increase that is, "fixed work per processor". Whereas Amdahl's law suggests the speedup which can be obtained for the existing data set if more computing units are added, that is, "Fixed work for all processors". Let's take the following example: Let the serial component and parallel component of execution be of one unit each. In Amdahl's Law the strictly serial component of code is B (equals 0.5). For two processors, the speedup T(2) is given by: T(2) = 1 / (0.5 + (1 – 0.5) / 2) = 1.33 Similarly for four and eight processors, the speedup is given by: T(4) = 1.6 and T(8) = 1.77 Adding more processors, for example when n tends to infinity, the speedup obtained at max is only 2. On the other hand in Gustafson's law, Alpha = 1(1+1) = 0.5 (which is also the serial component of code). The speedup for two processors is given by: Speedup(2) = 2 – 0.5(2 - 1) = 1.5 Similarly for four and eight processors, the speedup is given by: Speedup(4) = 2.5 and Speedup(8) = 4.5 The following figure shows the work load scaling factor of Gustafson's law, when compared to Amdahl's law with a constant workload: Comparison of Amdahl's and Gustafson's Law OpenCL is all about parallel programming, and Gustafson's law very well fits into this book as we will be dealing with OpenCL for data parallel applications. Workloads which are data parallel in nature can easily increase the data set and take advantage of the scalable platforms by adding more compute units. For example, more pixels can be computed as more compute units are added. Different parallel programming techniques There are several different forms of parallel computing such as bit-level, instruction level, data, and task parallelism. This book will largely focus on data and task parallelism using heterogeneous devices. We just now coined a term, heterogeneous devices. How do we tackle complex tasks "in parallel" using different types of computer architecture? Why do we need OpenCL when there are many (already defined) open standards for Parallel Computing? To answer this question, let us discuss the pros and cons of different Parallel computing Framework. OpenMP OpenMP is an API that supports multi-platform shared memory multiprocessing programming in C, C++, and Fortran. It is prevalent only on a multi-core computer platform with a shared memory subsystem. A basic OpenMP example implementation of the OpenMP Parallel directive is as follows: #pragma omp parallel { body; } When you build the preceding code using the OpenMP shared library, libgomp would expand to something similar to the following code: void subfunction (void *data) { use data; body; } setup data; GOMP_parallel_start (subfunction, &data, num_threads); subfunction (&data); GOMP_parallel_end (); void GOMP_parallel_start (void (*fn)(void *), void *data, unsigned num_threads) The OpenMP directives make things easy for the developer to modify the existing code to exploit the multicore architecture. OpenMP, though being a great parallel programming tool, does not support parallel execution on heterogeneous devices, and the use of a multicore architecture with shared memory subsystem does not make it cost effective. MPI Message Passing Interface (MPI) has an advantage over OpenMP, that it can run on either the shared or distributed memory architecture. Distributed memory computers are less expensive than large shared memory computers. But it has its own drawback with inherent programming and debugging challenges. One major disadvantage of MPI parallel framework is that the performance is limited by the communication network between the nodes. Supercomputers have a massive number of processors which are interconnected using a high speed network connection or are in computer clusters, where computer processors are in close proximity to each other. In clusters, there is an expensive and dedicated data bus for data transfers across the computers. MPI is extensively used in most of these compute monsters called supercomputers. OpenACC The OpenACC Application Program Interface (API) describes a collection of compiler directives to specify loops and regions of code in standard C, C++, and Fortran to be offloaded from a host CPU to an attached accelerator, providing portability across operating systems, host CPUs, and accelerators. OpenACC is similar to OpenMP in terms of program annotation, but unlike OpenMP which can only be accelerated on CPUs, OpenACC programs can be accelerated on a GPU or on other accelerators also. OpenACC aims to overcome the drawbacks of OpenMP by making parallel programming possible across heterogeneous devices. OpenACC standard describes directives and APIs to accelerate the applications. The ease of programming and the ability to scale the existing codes to use the heterogeneous processor, warrantees a great future for OpenACC programming. CUDA Compute Unified Device Architecture (CUDA) is a parallel computing architecture developed by NVIDIA for graphics processing and GPU (General Purpose GPU) programming. There is a fairly good developer community following for the CUDA software framework. Unlike OpenCL, which is supported on GPUs by many vendors and even on many other devices such as IBM's Cell B.E. processor or TI's DSP processor and so on, CUDA is supported only for NVIDIA GPUs. Due to this lack of generalization, and focus on a very specific hardware platform from a single vendor, OpenCL is gaining traction. CUDA or OpenCL? CUDA is more proprietary and vendor specific but has its own advantages. It is easier to learn and start writing code in CUDA than in OpenCL, due to its simplicity. Optimization of CUDA is more deterministic across a platform, since less number of platforms are supported from a single vendor only. It has simplified few programming constructs and mechanisms. So for a quick start and if you are sure that you can stick to one device (GPU) from a single vendor that is NVIDIA, CUDA can be a good choice. OpenCL on the other hand is supported for many hardware from several vendors and those hardware vary extensively even in their basic architecture, which created the requirement of understanding a little complicated concepts before starting OpenCL programming. Also, due to the support of a huge range of hardware, although an OpenCL program is portable, it may lose optimization when ported from one platform to another. The kernel development where most of the effort goes, is practically identical between the two languages. So, one should not worry about which one to choose. Choose the language which is convenient. But remember your OpenCL application will be vendor agnostic. This book aims at attracting more developers to OpenCL. There are many libraries which use OpenCL programming for acceleration. Some of them are MAGMA, clAMDBLAS, clAMDFFT, BOLT C++ Template library, and JACKET which accelerate MATLAB on GPUs. Besides this, there are C++ and Java bindings available for OpenCL also. Once you've figured out how to write your important "kernels" it's trivial to port to either OpenCL or CUDA. A kernel is a computation code which is executed by an array of threads. CUDA also has a vast set of CUDA accelerated libraries, that is, CUBLAS, CUFFT, CUSPARSE, Thrust and so on. But it may not take a long time to port these libraries to OpenCL. Renderscripts Renderscripts is also an API specification which is targeted for 3D rendering and general purpose compute operations in an Android platform. Android apps can accelerate the performance by using these APIs. It is also a cross-platform solution. When an app is run, the scripts are compiled into a machine code of the device. This device can be a CPU, a GPU, or a DSP. The choice of which device to run it on is made at runtime. If a platform does not have a GPU, the code may fall back to the CPU. Only Android supports this API specification as of now. The execution model in Renderscripts is similar to that of OpenCL. Hybrid parallel computing model Parallel programming models have their own advantages and disadvantages. With the advent of many different types of computer architectures, there is a need to use multiple programming models to achieve high performance. For example, one may want to use MPI as the message passing framework, and then at each node level one might want to use, OpenCL, CUDA, OpenMP, or OpenACC. Besides all the above programming models many compilers such as Intel ICC, GCC, and Open64 provide auto parallelization options, which makes the programmers job easy and exploit the underlying hardware architecture without the need of knowing any parallel computing framework. Compilers are known to be good at providing instruction-level parallelism. But tackling data level or task level auto parallelism has its own limitations and complexities. Introduction to OpenCL OpenCL standard was first introduced by Apple, and later on became part of the open standards organization "Khronos Group". It is a non-profit industry consortium, creating open standards for the authoring, and acceleration of parallel computing, graphics, dynamic media, computer vision and sensor processing on a wide variety of platforms and devices. The goal of OpenCL is to make certain types of parallel programming easier, and to provide vendor agnostic hardware-accelerated parallel execution of code. OpenCL (Open Computing Language) is the first open, royalty-free standard for general-purpose parallel programming of heterogeneous systems. It provides a uniform programming environment for software developers to write efficient, portable code for high-performance compute servers, desktop computer systems, and handheld devices using a diverse mix of multi-core CPUs, GPUs, and DSPs. OpenCL gives developers a common set of easy-to-use tools to take advantage of any device with an OpenCL driver (processors, graphics cards, and so on) for the processing of parallel code. By creating an efficient, close-to-the-metal programming interface, OpenCL will form the foundation layer of a parallel computing ecosystem of platform-independent tools, middleware, and applications. We mentioned vendor agnostic, yes that is what OpenCL is about. The different vendors here can be AMD, Intel, NVIDIA, ARM, TI, and so on. The following diagram shows the different vendors and hardware architectures which use the OpenCL specification to leverage the hardware capabilities: The heterogeneous system The OpenCL framework defines a language to write "kernels". These kernels are functions which are capable of running on different compute devices. OpenCL defines an extended C language for writing compute kernels, and a set of APIs for creating and managing these kernels. The compute kernels are compiled with a runtime compiler, which compiles them on-the-fly during host application execution for the targeted device. This enables the host application to take advantage of all the compute devices in the system with a single set of portable compute kernels. Based on your interest and hardware availability, you might want to do OpenCL programming with a "host and device" combination of "CPU and CPU" or "CPU and GPU". Both have their own programming strategy. In CPUs you can run very large kernels as the CPU architecture supports out-of-order instruction level parallelism and have large caches. For the GPU you will be better off writing small kernels for better performance. Hardware and software vendors There are various hardware vendors who support OpenCL. Every OpenCL vendor provides OpenCL runtime libraries. These runtimes are capable of running only on their specific hardware architectures. Not only across different vendors, but within a vendor there may be different types of architectures which might need a different approach towards OpenCL programming. Now let's discuss the various hardware vendors who provide an implementation of OpenCL, to exploit their underlying hardware. Advanced Micro Devices, Inc. (AMD) With the launch of AMD A Series APU, one of industry's first Accelerated Processing Unit (APU), AMD is leading the efforts of integrating both the x86_64 CPU and GPU dies in one chip. It has four cores of CPU processing power, and also a four or five graphics SIMD engine, depending on the silicon part which you wish to buy. The following figure shows the block diagram of AMD APU architecture: AMD architecture diagram—© 2011, Advanced Micro Devices, Inc. An AMD GPU consist of a number of Compute Engines (CU) and each CU has 16 ALUs. Further, each ALU is a VLIW4 SIMD processor and it could execute a bundle of four or five independent instructions. Each CU could be issued a group of 64 work items which form the work group (wavefront). AMD Radeon ™ HD 6XXX graphics processors uses this design. The following figure shows the HD 6XXX series Compute unit, which has 16 SIMD engines, each of which has four processing elements: AMD Radeon HD 6xxx Series SIMD Engine—© 2011, Advanced Micro Devices, Inc. Starting with the AMD Radeon HD 7XXX series of graphics processors from AMD, there were significant architectural changes. AMD introduced the new Graphics Core Next (GCN) architecture. The following figure shows an GCN compute unit which has 4 SIMD engines and each engine is 16 lanes wide: GCN Compute Unit—© 2011, Advanced Micro Devices, Inc. A group of these Compute Units forms an AMD HD 7xxx Graphics Processor. In GCN, each CU includes four separate SIMD units for vector processing. Each of these SIMD units simultaneously execute a single operation across 16 work items, but each can be working on a separate wavefront. Apart from the APUs, AMD also provides discrete graphics cards. The latest family of graphics card, HD 7XXX, and beyond uses the GCN architecture. NVIDIA® One of NVIDIA GPU architectures is codenamed "Kepler". GeForce® GTX 680 is one Kepler architectural silicon part. Each Kepler GPU consists of different configurations of Graphics Processing Clusters (GPC) and streaming multiprocessors. The GTX 680 consists of four GPCs and eight SMXs as shown in the following figure: NVIDIA Kepler architecture—GTX 680, © NVIDIA® Kepler architecture is part of the GTX 6XX and GTX 7XX family of NVIDIA discrete cards. Prior to Kepler, NVIDIA had Fermi architecture which was part of the GTX 5XX family of discrete and mobile graphic processing units. Intel® Intel's OpenCL implementation is supported in the Sandy Bridge and Ivy Bridge processor families. Sandy Bridge family architecture is also synonymous with the AMD's APU. These processor architectures also integrated a GPU into the same silicon as the CPU by Intel. Intel changed the design of the L3 cache, and allowed the graphic cores to get access to the L3, which is also called as the last level cache. It is because of this L3 sharing that the graphics performance is good in Intel. Each of the CPUs including the graphics execution unit is connected via Ring Bus. Also each execution unit is a true parallel scalar processor. Sandy Bridge provides the graphics engine HD 2000, with six Execution Units (EU), and HD 3000 (12 EU), and Ivy Bridge provides HD 2500(six EU) and HD 4000 (16 EU). The following figure shows the Sandy bridge architecture with a ring bus, which acts as an interconnect between the cores and the HD graphics: Intel Sandy Bridge architecture—© Intel® ARM Mali™ GPUs ARM also provides GPUs by the name of Mali Graphics processors. The Mali T6XX series of processors come with two, four, or eight graphics cores. These graphic engines deliver graphics compute capability to entry level smartphones, tablets, and Smart TVs. The below diagram shows the Mali T628 graphics processor. ARM Mali—T628 graphics processor, © ARM Mali T628 has eight shader cores or graphic cores. These cores also support Renderscripts APIs besides supporting OpenCL. Besides the four key competitors, companies such as TI (DSP), Altera (FPGA), and Oracle are providing OpenCL implementations for their respective hardware. We suggest you to get hold of the benchmark performance numbers of the different processor architectures we discussed, and try to compare the performance numbers of each of them. This is an important first step towards comparing different architectures, and in the future you might want to select a particular OpenCL platform based on your application workload.
Read more
  • 0
  • 0
  • 2763
article-image-enabling-your-new-theme-magento
Packt
18 Dec 2013
3 min read
Save for later

Enabling your new theme in Magento

Packt
18 Dec 2013
3 min read
(For more resources related to this topic, see here.) After your new theme is in place, you can enable it in Magento. Log in to your Magento store's administration panel. Once you have logged in, navigate to System | Configuration, as shown in the following screenshot: From there, select the global configuration scope (labeled Default Config in the following screenshot) you want to apply your new theme to, from the Current Configuration Scope dropdown in the top left of your screen: Once this has loaded, navigate to the Design tab under GENERAL in the left-hand column and expand the Themes block in the right-hand column, as shown in the following screenshot: From here, you can tell Magento to use your new theme. The values given here correspond to the name you gave to the directories when creating your theme. The example uses responsive as the value here, as shown in the following screenshot: Click on the Save Config button at the top right of your screen to save the changes. Next, check that your new theme has been activated. Remember the styles.css file you added in the skin/frontend/default/responsive/css directory? The presence of that file is telling Magento to load your new theme's CSS file instead of the default styles.css file for Magento from the default package, so your store now has none of the original CSS styling it. As such, you should see the following screenshot when you attempt to view the frontend of your Magento store: Overwriting the default Magento templates Noticed the name of your Magento theme appearing next to the logo in the header of your store? You can overwrite the default header.phtml that's causing it by copying the contents of app/design/frontend/base/default/template/page/html/header.phtml into app/design/frontend/default/responsive/template/ page/html/header.phtml. Open the file and find the following lines: <?php if ($this->getIsHomePage()):?> <h1 class="logo"><strong><?php echo $this->getLogoAlt() ?></strong><a href="<?php echo $this->getUrl('') ?>" title= "<?php echo $this->getLogoAlt() ?>" class="logo"><img src = "<?php echo $this->getLogoSrc() ?>" alt="<?php echo $this->getLogoAlt() ?>" /></a></h1> <?php else:?> <a href="<?php echo $this->getUrl('') ?>" title="<?php echo $this->getLogoAlt() ?>" class="logo"><strong><?php echo $this->getLogoAlt() ?></strong><img src = "<?php echo $this->getLogoSrc() ?>" alt="<?php echo $this->getLogoAlt() ?>" /></a> <?php endif?> Replace them with these lines: <a href="<?php echo $this->getUrl('') ?>" title="<?php echo $this- >getLogoAlt() ?>" class="logo"><img src = "<?php echo $this-> getLogoSrc() ?>" alt="<?php echo $this->getLogoAlt() ?>" /></a> Now if you save that file (and upload it to your server, if needed), you can see that the logo now looks tidier, as shown in the following screenshot: That's it! Your basic responsive Magento theme is up and running. Summary Hopefully after reading this article you will get a better understanding of how to enable your new theme in Magento. Resources for Article: Further resources on this subject: Magento : Payment and shipping method [Article] Categories and Attributes in Magento: Part 2 [Article] Magento: Exploring Themes [Article]
Read more
  • 0
  • 0
  • 12541

article-image-settings-goals
Packt
18 Dec 2013
8 min read
Save for later

Settings goals

Packt
18 Dec 2013
8 min read
(for more resources related to this topic, see here.) You can use the following code to track the event: [Flurry logEvent:@"EVENT_NAME"]; The logEvent: method logs your event every time it's triggered during the application session. This method helps you to track how often that event is triggered. You can track up to 300 different event IDs. However, the length of each event ID should be less than 255 characters. After the event is triggered, you can track that event from your Flurry dashboard. As is explained in the following screenshot, your events will be listed in the Events section. After clicking on Event Summary, you can see a list of the events you have created along with the statistics of the generated data as shown in the following screenshot: You can fetch the detailed data by clicking on the event name (for example, USER_VIEWED). This section will provide you with a chart-based analysis of the data as shown in the following screenshot: The Events Per Session chart will provide you with details about how frequently a particular event is triggered in a session. Other than this, you are provided with the following data charts as well: Unique Users Performing Event: This chart will explain the frequency of unique users triggering the event. Total Events: This chart holds the event-generation frequency over a time period. You can access the frequency of the event being triggered over any particular time slot. Average Events Per Session: This charts holds the average frequency of the events that happen per session. There is another variation of this method, as shown in the following code, which allows you to track the events along with the specific user data provided: [Flurry logEvent:@"EVENT_NAME" withParameters:YOUR_NSDictionary]; This version of the logEvent: method counts the frequency of the event and records dynamic parameters in the form of dictionaries. External parameters should be in the NSDictionary format, whereas both the key and the value should be in the NSString object format. Let's say you want to track how frequently your comment section is used and see the comments, then you can use this method to track such events along with the parameters. You can track up to 300 different events with an event ID length less than 255 characters. You can provide a maximum of 10 event parameters per event. The following example illustrates the use of the logEvent: method along with optional parameters in the dictionary format: NSDictionary *dictionary = [NSDictionary dictionaryWithObjectsAndKeys:@"your dynamic parameter value", @"your dynamic parameter name",nil]; [Flurry logEvent:@"EVENT_NAME" withParameters:dictionary]; In case you want Flurry to log all your application sections/screens automatically, then you should pass navigationController or as a parameter to count all your pages automatically using one of the following code: [Flurry logAllPageViews:navigationController]; [Flurry logAllPageViews:tabBarController]; The Flurry SDK will create a delegate on your navigationControlleror tabBarController object, whichever is provided to detect the page's navigation. Each navigation detected will be tracked by the Flurry SDK automatically as a page view. You only need to pass each object to the Flurry SDK once. However you can pass multiple instances of different navigation and tab bar controllers. There can be some cases where you can have a view controller that is not associated with any navigation or tab bar controller. Then you can use the following code: [Flurry logPageView]; The preceding code will track the event independently of navigation and tab bar controllers. For each user interaction you can manually log events. Tracking time spent Flurry allows you to track events based on the duration factor as well. You can use the [Flurry logEvent: timed:] method to log your event in time as shown in the following code: [Flurry logEvent:@"EVENT_NAME" timed:YES]; In case you want to pass additional parameters along with the event name, you can use the following type of the logEvent: method to start a timed event for event Parameters as shown in the following code: [Flurry logEvent:@"EVENT_NAME" withParameters:YOUR_NSDictionarytimed:YES]; The aforementioned method can help you to track your timed event along with the dynamic data provided in the dictionary format. You can end all your timed events before the application exits. This can even be accomplished by updating the event with event Parameters. If you want to end your events without updating the parameters, you can pass nil as the parameters. If you do not end your events, they will automatically end when the application exits as shown in the following code: [Flurry endTimedEvent:@"EVENT_NAME" withParameters:YOUR_NSDictionary]; Let's take the following example in which you want to log an event whenever a user comments on any article in your application: NSDictionary *commentParams = [NSDictionary dictionaryWithObjectsAndKeys: @"User_Comment", @"Comment", // Capture comment info @"Registered", @"User_Status", // Capture user status nil]; [Flurry logEvent:@"User_Comment" withParameters:commentParams timed:YES]; // In a function that captures when a user post the comment [Flurry endTimedEvent:@"Article_Read" withParameters:nil]; //You can pass in additional //params or update existing ones here as well The aforementioned piece of code will help you to log a timed event every time a user comments on a picture in your application. While tracking the event, you are also tracking the comment and the user registered by specifying them in the dictionary. Tracking errors Flurry provides you with a method to track errors as well. You can use the following methods to track errors on Flurry: [Flurry logError:@"ERROR_NAME" message:@"ERROR_MESSAGE" exception:e]; You can track exceptions and errors that occurred in the application by providing the name of the error (ERROR_NAME) along with the messages, such as ERROR_MESSAGE, with an exception object. Flurry reports the first ten errors in each session. You can fetch all the application exceptions and specifically uncaught exceptions on Flurry. You can use the logError:message:exception: class method to catch all the uncaught exceptions. These exceptions will be logged in Flurry in the Error section, which is accessible on the Flurry dashboard: // Uncaught Exception Handler - sent through Flurry. void uncaughtExceptionHandler(NSException *exception) { [Flurry logError:@"Uncaught" message:@"Crash" exception:exception]; } - (void)applicationDidFinishLaunching:(UIApplication *)application { NSSetUncaughtExceptionHandler(&uncaughtExceptionHandler); [Flurry startSession:@"YOUR_API_KEY"]; // .... } Flurry also helps you to catch all the uncaught exceptions generated by the application. All the exceptions will be caught by using the NSSetUncaughtExceptionHandler method in which you can pass a method that will catch all the exceptions raised during the application session. All the errors reported can also be tracked using the logError:message:error: method. You can pass the error name, message, and object to log the NSError error on Flurry as shown in the following code: - (void) webView:(UIWebView *)webView didFailLoadWithError:(NSError *)error { [Flurry logError:@"WebView No Load" message:[error localizedDescription] error:error]; } Tracking versions When you develop applications for mobile devices, it's obvious that you will evolve your application at every stage, pushing the latest updates for the application, which creates a new version of the application on the application store. To track the application based on these versions, you need to set up the Flurry to track your application versions as well. This can be done using the following code: [Flurry setAppVersion:App_Version_Number]; So by using the aforementioned method, you can track your application based on its version. For example, if you have released an application and unfortunately it's having a critical bug, then you can track your application based on the current version and the errors that are tracked by Flurry from the application. You can access data generated from Flurry's Dashboards by navigating to Flurry Classic. This will, by default, load a time-based graph of the application session for all versions. However, you can access the user session graph by selecting a version from the drop-down list as shown in the following screenshot: This is how the drop-down list will appear. Select a version and click on Update as shown in the following screenshot: The previous action will generate a version-based graph for a user's session with the as number of times users have opened the app in the given time frame shown in the following screenshot: Along with that, Flurry also provides user retention graphs to gauge the number of users and the usage of application over a period of time. Summary In this article we explored the ways to track the application on Flurry and to gather meaningful data on Flurry by setting goals to track the application. Then we learned how to track the time spent by users on the application along using user data tracking and retention graphs to gauge the number of users. resources for article: further resources on this subject: Data Analytics [article] Learning Data Analytics with R and Hadoop [article] Limits of Game Data Analysis [article]
Read more
  • 0
  • 0
  • 1481
Modal Close icon
Modal Close icon