How-To Tutorials - Data

1210 Articles
Using Oracle GoldenGate

Packt
02 Aug 2013
15 min read
Creating one-way replication (Simple)

Here we'll use the demo scripts included in the OGG software distribution to implement a basic homogeneous (Oracle-to-Oracle) replication.

Getting ready

You need to ensure your Oracle database is in archivelog mode. If your database is not in archivelog mode, you won't be able to recover it after media corruption or user errors.

How to do it...

The steps for creating one-way replication are as follows:

Check whether supplemental logging is enabled on your source database using the following command:

SQL> select supplemental_log_data_min from v$database;

The output of the preceding command will be as follows:

SUPPLEME
--------
NO

Enable supplemental logging using the following commands:

SQL> alter database add supplemental log data;
SQL> select supplemental_log_data_min from v$database;

The output of the preceding command will be as follows:

SUPPLEME
--------
YES

Let's run the demo script to create a couple of tables in the scott schema. You need to know the scott schema password, which is tiger by default. We do it using the following commands:

$ cd /u01/app/oracle/gg
$ ./ggsci
$ sqlplus scott
Enter password:
SQL> @demo_ora_create.sql

The output of the preceding command will be as follows:

DROP TABLE tcustmer
*
ERROR at line 1:
ORA-00942: table or view does not exist
Table created.
DROP TABLE tcustord
*
ERROR at line 1:
ORA-00942: table or view does not exist
Table created.

You must add the checkpoint table; do it as follows:

$ cd /u01/app/oracle/gg
$ vi GLOBALS

Add the following entry to the file:

CheckPointTable ogg.chkpt

Save the file and exit. Next, create the checkpoint table using the following commands:

$ ./ggsci
GGSCI> add checkpointtable
GGSCI> info checkpointtable

The output of the preceding command will be as follows:

No checkpoint table specified, using GLOBALS specification (ogg.chkpt)...
Checkpoint table ogg.chkpt created 2012-10-31 12:39:38.

Set up the MANAGER parameter file using the following commands:

$ cd /u01/app/oracle/gg/dirprm
$ vi mgr.prm

Add the following lines to the file:

PORT 7809
DYNAMICPORTLIST 7810-7849
AUTORESTART er *, RETRIES 6, WAITMINUTES 1, RESETMINUTES 10
PURGEOLDEXTRACTS /u01/app/oracle/gg/dirdat/*, USECHECKPOINTS, MINKEEPDAYS 2

Save the file and exit. Start the manager using the following commands:

$ cd /u01/app/oracle/gg
$ ggsci
GGSCI> start mgr
GGSCI> info mgr

The output of the preceding command will be as follows:

GGSCI> info all
Program     Status      Group       Lag at Chkpt    Time Since Chkpt
MANAGER     RUNNING

Create a TNS entry in the database home so that the extract can connect to the Automatic Storage Management (ASM) instance, using the following commands:

$ cd $ORACLE_HOME/network/admin
$ vi tnsnames.ora

Add the following TNS entry:

ASMGG =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = IPC)(key=EXTPROC1521))
    (CONNECT_DATA = (SID=+ASM)))

Save the file and exit. Create a user asmgg with the sysdba role in the ASM instance. Connect to the ASM instance as the sys user using the following command:

$ sqlplus sys/<password>@asmgg as sysasm

The output of the preceding command will be as follows:

SQL*Plus: Release 11.2.0.3.0 Production on Thu Nov 15 14:24:20 2012
Copyright (c) 1982, 2011, Oracle. All rights reserved.
Connected to:
Oracle Database 11g Enterprise Edition Release 11.2.0.3.0 - 64bit Production
With the Automatic Storage Management option

The user is created using the following command:

SQL> create user asmgg identified by asmgg ;

We will get the following output message:

User created.
Provide the sysdba role to the user asmgg using the following command:

SQL> grant sysdba to asmgg ;

We will get the following output message:

Grant succeeded.

Let's add supplemental logging to the source tables using the following commands:

$ cd /u01/app/oracle/gg
$ ./ggsci
GGSCI> add trandata scott.tcustmer

The output will be as follows:

Logging of supplemental redo data enabled for table SCOTT.TCUSTMER.

Then type the following command:

GGSCI> add trandata scott.tcustord

The output message will be as follows:

Logging of supplemental redo data enabled for table SCOTT.TCUSTORD.

The next command to be executed is:

GGSCI> info trandata scott.tcustmer

The output message will be as follows:

Logging of supplemental redo log data is disabled for table OGG.TCUSTMER.

The next command to be used is:

GGSCI> info trandata scott.tcustord

The output will be as follows:

Logging of supplemental redo log data is disabled for table OGG.TCUSTORD.

Create the extract parameter file for data capture using the following commands:

$ cd /u01/app/oracle/gg/dirprm
$ vi ex01sand.prm

Add the following lines to the file:

EXTRACT ex01sand
SETENV (ORACLE_SID="SRC100")
SETENV (ORACLE_HOME="/u01/app/oracle/product/11.2.0/db_1")
SETENV (NLS_LANG="AMERICAN_AMERICA.AL32UTF8")
USERID ogg, PASSWORD ogg
TRANLOGOPTIONS EXCLUDEUSER ogg
TRANLOGOPTIONS ASMUSER asmgg@ASMGG ASMPASSWORD asmgg
-- Trail File location locally
EXTTRAIL /u01/app/oracle/gg/dirdat/pr
DISCARDFILE /u01/app/oracle/gg/dirrpt/ex01sand.dsc, PURGE
DISCARDROLLOVER AT 01:00 ON SUNDAY
TABLE SCOTT.TCUSTMER ;
TABLE SCOTT.TCUSTORD ;

Save the file and exit. Let's add the Extract process and start it. We do it by using the following commands:

$ cd /u01/app/oracle/gg
$ ./ggsci
GGSCI> add extract ex01sand tranlog begin now

The output of the preceding command will be as follows:

EXTRACT added.

The following command adds the location of the trail files and the size for each trail file created:

GGSCI> add exttrail /u01/app/oracle/gg/dirdat/pr extract ex01sand megabytes 2

The output of the preceding command will be as follows:

EXTTRAIL added.

GGSCI> start ex01sand
Sending START request to MANAGER ...
EXTRACT EX01SAND starting

GGSCI> info all
Program     Status      Group       Lag at Chkpt    Time Since Chkpt
MANAGER     RUNNING
EXTRACT     RUNNING     EX01SAND    00:00:00        00:00:06

Next we'll create the data pump parameter file using the following commands:

$ cd /u01/app/oracle/gg/dirprm
$ vi pp01sand.prm

Add the following lines to the file:

EXTRACT pp01sand
PASSTHRU
RMTHOST hostb MGRPORT 7820
RMTTRAIL /u01/app/oracle/goldengate/dirdat/rp
DISCARDFILE /u01/app/oracle/gg/dirrpt/pp01sand.dsc, PURGE
-- Tables for transport
TABLE SCOTT.TCUSTMER ;
TABLE SCOTT.TCUSTORD ;

Save the file and exit. Add the data pump process and the final configuration on the source side as follows:

GGSCI> add extract pp01sand exttrailsource /u01/app/oracle/gg/dirdat/pr

The output of the preceding command will be as follows:

EXTRACT added.

The following command points the pump to drop the trail files at the remote location:

GGSCI> add rmttrail /u01/app/oracle/goldengate/dirdat/rp extract pp01sand megabytes 2

The output of the preceding command will be as follows:

RMTTRAIL added

Then we execute the following command:

GGSCI> info all

The output of the preceding command will be as follows:

Program     Status      Group       Lag at Chkpt    Time Since Chkpt
MANAGER     RUNNING
EXTRACT     RUNNING     EXPR610     00:00:00        00:00:05
EXTRACT     STOPPED     PP01SAND    00:00:00        00:00:55

We're not going to start the data pump (pump) at this point, since the manager does not yet exist at the target site.
We've now completed most of our steps on the source system. We'll have to come back to the source server to start the pump a little later. Now we'll move on to our target server, where we'll set up the Replicat process in order to receive and apply the changes from the source database. Perform the following actions on the target database:

Create tables on the target host using the following commands:

$ cd /u01/app/oracle/goldengate
$ sqlplus scott/tiger
SQL> @demo_ora_create.sql

The output of the preceding command will be as follows:

DROP TABLE tcustmer
*
ERROR at line 1:
ORA-00942: table or view does not exist
Table created.
DROP TABLE tcustord
*
ERROR at line 1:
ORA-00942: table or view does not exist
Table created.

Let's add the checkpoint table as a global parameter using the following commands:

$ cd /u01/app/oracle/goldengate
$ vi GLOBALS

Add the following line to the file:

CheckPointTable ogg.chkpt

Save the file and exit. Create the checkpoint table using the following commands:

$ cd ..
$ ./ggsci
GGSCI> dblogin userid ogg password ogg
GGSCI> add checkpointtable

Then execute the following commands:

$ cd /u01/app/oracle/goldengate/dirprm
$ vi mgr.prm

Add the following lines to the file:

PORT 7820
DYNAMICPORTLIST 7821-7849
AUTORESTART er *, RETRIES 6, WAITMINUTES 1, RESETMINUTES 10
PURGEOLDEXTRACTS /u01/app/oracle/goldengate/dirdat/*, USECHECKPOINTS, MINKEEPFILES 2

Save the file and exit. Start the manager using the following commands:

$ cd /u01/app/oracle/goldengate
$ ./ggsci
GGSCI> start mgr
GGSCI> info mgr
GGSCI> info all

We will get the following output:

Program     Status      Group       Lag at Chkpt    Time Since Chkpt
MANAGER     RUNNING

Now we're ready to create the Replicat parameter file. Edit the parameter file using the following commands:

$ cd /u01/app/oracle/goldengate/dirprm
$ vi re01sand.prm

Add the following lines to the file:

REPLICAT re01sand
SETENV (ORACLE_SID="TRG101")
SETENV (ORACLE_HOME="/u01/app/oracle/product/11.1.0/db_1")
SETENV (NLS_LANG = "AMERICAN_AMERICA.AL32UTF8")
USERID ogg PASSWORD ogg
DISCARDFILE /u01/app/oracle/goldengate/dirrpt/re01sand.dsc, APPEND
DISCARDROLLOVER at 01:00
ReportCount Every 30 Minutes, Rate
REPORTROLLOVER at 01:30
DBOPTIONS DEFERREFCONST
ASSUMETARGETDEFS
MAP SCOTT.TCUSTMER , TARGET SCOTT.TCUSTMER ;
MAP SCOTT.TCUSTORD , TARGET SCOTT.TCUSTORD ;

Save the file and exit. We now add and start the Replicat process using the following commands:

$ cd ..

The following exttrail location must match exactly the pump's rmttrail location on the source server:

$ ./ggsci
GGSCI> add replicat re01sand exttrail /u01/app/oracle/goldengate/dirdat/rp checkpointtable ogg.chkpt
GGSCI> start re01sand

The output of the preceding command will be as follows:

Sending START request to MANAGER ...
REPLICAT RE01SAND starting

Then we execute the following command:

GGSCI> info all

The output of the preceding command will be as follows:

Program     Status      Group       Lag at Chkpt    Time Since Chkpt
MANAGER     RUNNING
REPLICAT    RUNNING     RE01SAND    00:00:00        00:00:01

Let's go back to the source host and start the pump using the following commands:

$ cd /u01/app/oracle/gg
$ ./ggsci
GGSCI> start pp01sand

The output of the preceding command will be as follows:

Sending START request to MANAGER ...
EXTRACT PP01SAND starting

Next we use the demo insert script to add rows to the source tables, which should replicate to the target tables.
We can do it using the following commands:

$ cd /u01/app/oracle/gg
$ sqlplus scott/tiger
SQL> @demo_ora_insert

The output of the preceding command will be as follows:

1 row created.
1 row created.
1 row created.
1 row created.
Commit complete.

To verify that the 4 rows just created have been captured at the source, use the following commands:

$ ./ggsci
GGSCI> stats ex01sand totalsonly scott.*

The output of the preceding command will be as follows:

Sending STATS request to EXTRACT EX01SAND ...
Start of Statistics at 2012-11-30 20:22:37.
Output to /u01/app/oracle/gg/dirdat/pr:
... truncated for brevity
*** Latest statistics since 2012-11-30 20:17:38 ***
Total inserts        4.00
Total updates        0.00
Total deletes        0.00
Total discards       0.00
Total operations     4.00

To verify whether the pump has shipped the changes to the target server, use the following command:

GGSCI> stats pp01sand totalsonly scott.*

The output of the preceding command will be as follows:

Sending STATS request to EXTRACT PP01SAND ...
Start of Statistics at 2012-11-30 20:24:56.
Output to /u01/app/oracle/goldengate/dirdat/rp:
Cumulative totals for specified table(s):
... cut for brevity
*** Latest statistics since 2012-11-30 20:18:14 ***
Total inserts        4.00
Total updates        0.00
Total deletes        0.00
Total discards       0.00
Total operations     4.00
End of Statistics.

Finally, to check whether the changes have been applied at the target, the next command is run on the target server as follows:

$ ./ggsci
GGSCI> stats re01sand totalsonly scott.*

The output of the preceding command will be as follows:

Sending STATS request to REPLICAT RE01SAND ...
Start of Statistics at 2012-11-30 20:28:01.
Cumulative totals for specified table(s):
...
*** Latest statistics since 2012-11-30 20:18:20 ***
Total inserts        4.00
Total updates        0.00
Total deletes        0.00
Total discards       0.00
Total operations     4.00
End of Statistics.

How it works...

Supplemental logging must be turned on at the database level and subsequently at the table level as well, for those tables you would like to replicate. For a one-way replication, this is done on the source tables. There is no need to turn on supplemental logging at the target site if the target site is not in turn a source to other targets or to itself.

A database user ogg is created in order to administer the OGG schema. This user is used solely for the purpose of administering OGG in the database.

Checkpoints are needed by both the source and target servers; these are structures persisted to disk that record a known position in the trail file. You would restart from these after an expected or unexpected shutdown of an OGG process.

The PORT parameter in the mgr.prm file specifies the port to which the MGR process should bind and start listening for connection requests. If the manager is down, then connections can't be established and you'll receive TCP connection errors. The only required parameter is the port number itself. Also, the PURGEOLDEXTRACTS parameter is a nice way to keep your trail files to a minimum size, so that they aren't stored indefinitely and eventually run you out of space in your filesystem. In this example, we're asking the manager to purge trail files and keep the files from the last two days on disk.

If your Oracle database is using an ASM instance, then OGG needs to establish a connection to the ASM instance in order to read the online-redo logs. You must ensure that you either use the sys schema or create a user (such as asmgg) with SYSDBA privileges for authentication.
Since we need supplemental logging at the table level as well, add trandata does precisely this.

Now we'll focus on some of the EXTRACT (ex01sand) data capture parameters. For one thing, you'll notice that we need to supply the extract with credentials to the database and the ASM instance in order to scan the online-redo logs for committed transactions. The following lines tell OGG to exclude the user ogg from capture. The second TRANLOGOPTIONS line is how the extract authenticates to the ASM instance.

USERID ogg, PASSWORD ogg
TRANLOGOPTIONS EXCLUDEUSER ogg
TRANLOGOPTIONS ASMUSER asmgg@ASMGG ASMPASSWORD asmgg

If you're using Oracle 10gR2 (on its later patchsets) or Oracle 11.2.0.2 and later, you could use the newer ASM API by specifying TRANLOGOPTIONS DBLOGREADER rather than ASMUSER. The API uses the database connection rather than a connection to the ASM instance to read the online-redo logs (a sketch of this variant appears at the end of this section).

The following two lines in the extract tell the extract where to place the trail files, with a prefix of pr followed by six digits that increment each time a file rolls over to the next file generation. The DISCARDFILE by convention has the same name as the extract, but with a .dsc extension for discard. If, for any reason, OGG can't capture a transaction, it will write the text and SQL to this file for later investigation.

EXTTRAIL /u01/app/oracle/gg/dirdat/pr
DISCARDFILE /u01/app/oracle/gg/dirrpt/ex01sand.dsc, PURGE

Tables or schemas are captured with the following syntax in the extract file:

TABLE SCOTT.TCUSTMER ;
TABLE SCOTT.TCUSTORD ;

The specification can vary and use wildcards as well. Say you want to capture the entire schema; you could specify this as TABLE SCOTT.* ;.

In the following code, the first command adds the extract with the option tranlog begin now, telling OGG to start capturing changes from the online-redo logs as of now. The second command tells the extract where to store the trail files, with a size not exceeding 2 MB per file.

GGSCI> add extract ex01sand tranlog begin now
GGSCI> add exttrail /u01/app/oracle/gg/dirdat/pr extract ex01sand megabytes 2

Now, the PUMP (data pump; pp01sand) is an optional, but highly recommended, extract whose sole purpose is to perform all of the TCP/IP activity; for example, transporting the trail files to the target site. This is beneficial because we relieve the capture process from performing any of the TCP/IP activity. The parameters in the following snippet tell the pump to send the data as is, using the PASSTHRU parameter. This is the optimal and preferred method if there isn't any data transformation along the way. The RMTHOST parameter specifies the destination host and the port on which the remote manager is listening, for example, port 7820. If the manager is not running at the target, the destination host will refuse the connection; that is why we did not start the pump earlier during our work on the source host.

PASSTHRU
RMTHOST hostb MGRPORT 7820
RMTTRAIL /u01/app/oracle/goldengate/dirdat/rp

The RMTTRAIL parameter specifies where the trail files will be stored at the remote host, with a prefix of rp followed by a six-digit number that increases sequentially as the files roll over after a specified size has been reached. Finally, at the destination host, hostb, the Replicat process (re01sand) is the applier, where the SQL is replayed in the target database.
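As a hedged illustration of the DBLOGREADER alternative mentioned above (this is a sketch, not the recipe's listing; it reuses the names and paths from this recipe and assumes a supported Oracle version), the ex01sand parameter file could drop the ASM credentials in favour of the database connection:

EXTRACT ex01sand
SETENV (ORACLE_SID="SRC100")
SETENV (ORACLE_HOME="/u01/app/oracle/product/11.2.0/db_1")
SETENV (NLS_LANG="AMERICAN_AMERICA.AL32UTF8")
USERID ogg, PASSWORD ogg
TRANLOGOPTIONS EXCLUDEUSER ogg
-- Read the online-redo logs through the database connection (no ASM login needed)
TRANLOGOPTIONS DBLOGREADER
EXTTRAIL /u01/app/oracle/gg/dirdat/pr
DISCARDFILE /u01/app/oracle/gg/dirrpt/ex01sand.dsc, PURGE
TABLE SCOTT.TCUSTMER ;
TABLE SCOTT.TCUSTORD ;

With this variant, the ASMGG TNS entry and the asmgg SYSDBA user would no longer be required for capture.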
The following two lines in the Replicat parameter file specify how the Replicat maps source to target data as it arrives by way of the trail files:

MAP SCOTT.TCUSTMER , TARGET SCOTT.TCUSTMER ;
MAP SCOTT.TCUSTORD , TARGET SCOTT.TCUSTORD ;

The target tables don't necessarily have to use the same schema names as in the preceding example; the data could have been applied to a different schema altogether if that were the requirement.

Summary

In this article we learned about the creation of one-way replication using Oracle GoldenGate.

Resources for Article:

Further resources on this subject:
Oracle GoldenGate 11g: Configuration for High Availability [Article]
Getting Started with Oracle GoldenGate [Article]
Oracle GoldenGate: Considerations for Designing a Solution [Article]

Making a simple cURL request (Simple)

Packt
01 Aug 2013
5 min read
Getting ready

In this article we will use cURL to request and download a web page from a server.

How to do it...

Enter the following code into a new PHP project:

<?php
// Function to make GET request using cURL
function curlGet($url) {
    $ch = curl_init(); // Initialising cURL session
    // Setting cURL options
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);
    curl_setopt($ch, CURLOPT_URL, $url);
    $results = curl_exec($ch); // Executing cURL session
    curl_close($ch); // Closing cURL session
    return $results; // Return the results
}

$packtPage = curlGet('http://www.packtpub.com/oop-php-5/book');
echo $packtPage;
?>

Save the project as 2-curl-request.php (ensure you use the .php extension!). Execute the script. Once our script has completed, we will see the source code of http://www.packtpub.com/oop-php-5/book displayed on the screen.

How it works...

Let's look at how we performed the previously defined steps:

The first line, <?php, and the last line, ?>, indicate where our PHP code block will begin and end. All the PHP code should appear between these two tags.

Next, we create a function called curlGet(), which accepts a single parameter $url, the URL of the resource to be requested.

Running through the code inside the curlGet() function, we start off by initializing a new cURL session as follows:

$ch = curl_init();

We then set our options for cURL as follows:

curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE); // Tells cURL to return the results of the request (the source code of the target page) as a string.
curl_setopt($ch, CURLOPT_URL, $url); // Here we tell cURL the URL we wish to request; notice that it is the $url variable that we passed into the function as a parameter.

We execute our cURL request, storing the returned string in the $results variable as follows:

$results = curl_exec($ch);

Now that the cURL request has been made and we have the results, we close the cURL session by using the following code:

curl_close($ch);

At the end of the function, we return the $results variable containing our requested page out of the function, for use in the rest of our script.

return $results;

After the function is closed we are able to use it throughout the rest of our script. Later, having decided on the URL we wish to request, http://www.packtpub.com/oop-php-5/book, we execute the function, passing the URL as a parameter and storing the returned data from the function in the $packtPage variable as follows:

$packtPage = curlGet('http://www.packtpub.com/oop-php-5/book');

Finally, we echo the contents of the $packtPage variable (the page we requested) to the screen by using the following code:

echo $packtPage;

There's more...

There are a number of different HTTP request methods, which indicate to the server the desired response or the action to be performed. The request method being used in this article is cURL's default GET request. This tells the server that we would like to retrieve a resource. Depending on the resource we are requesting, a number of parameters may be passed in the URL. For example, when we perform a search on the Packt Publishing website for a query, say php, we notice that the URL is http://www.packtpub.com/books?keys=php. This is requesting the resource books (the page that displays search results) and passing a value of php to the keys parameter, indicating that the dynamically generated page should show results for the search query php.

More cURL options

Of the many cURL options available, only two have been used in our preceding code.
They are CURLOPT_RETURNTRANSFER and CURLOPT_URL. Though we will cover many more throughout the course of this article, some other options to be aware of, and that you may wish to try out, are listed below:

CURLOPT_FAILONERROR
Value: TRUE or FALSE
Purpose: If a response code greater than 400 is returned, cURL will fail silently.

CURLOPT_FOLLOWLOCATION
Value: TRUE or FALSE
Purpose: If Location: headers are sent by the server, follow the location.

CURLOPT_USERAGENT
Value: A user agent string, for example: 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:15.0) Gecko/20100101 Firefox/15.0.1'
Purpose: Sending the user agent string in your request informs the target server which client is requesting the resource. Since many servers will only respond to 'legitimate' requests, it is advisable to include one.

CURLOPT_HTTPHEADER
Value: An array containing header information, for example: array('Cache-Control: max-age=0', 'Connection: keep-alive', 'Keep-Alive: 300', 'Accept-Language: en-us,en;q=0.5')
Purpose: This option is used to send header information with the request; we will come across use cases for this in later recipes.

A full listing of cURL options can be found on the PHP website at http://php.net/manual/en/function.curl-setopt.php.

The HTTP response code

An HTTP response code is the number that is returned, which corresponds with the result of an HTTP request. Some common response code values are as follows:

200: OK
301: Moved Permanently
400: Bad Request
401: Unauthorized
403: Forbidden
404: Not Found
500: Internal Server Error

It is often useful to have our scrapers respond to different response code values in different ways, for example, letting us know if a web page has moved, is no longer accessible, or we are unauthorized to access a particular page. In this case, we can access the response code of a request using cURL by adding the following line to our function, which will store the response code in the $httpResponse variable:

$httpResponse = curl_getinfo($ch, CURLINFO_HTTP_CODE);

A combined sketch that uses this check together with some of the options above appears after the resource list below.

Summary

This article covered techniques for making a simple cURL request.

Resources for Article:

Further resources on this subject:
A look into the high-level programming operations for the PHP language [Article]
Installing PHP-Nuke [Article]
Creating Your Own Theme—A Wordpress Tutorial [Article]
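Pulling the preceding pieces together, the following is a minimal sketch (not the article's original listing) of how curlGet() might be extended with a couple of the options from the table above and the response-code check just mentioned; the function name and target URL are placeholders chosen for the example:

<?php
// Extended GET helper: returns the page body, or FALSE on an HTTP error.
function curlGetWithStatus($url)
{
    $ch = curl_init();                                  // Initialise the cURL session
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, TRUE);     // Return the response as a string
    curl_setopt($ch, CURLOPT_URL, $url);                // Target URL
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, TRUE);     // Follow Location: redirects
    curl_setopt($ch, CURLOPT_USERAGENT,
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10.5; rv:15.0) Gecko/20100101 Firefox/15.0.1');

    $results      = curl_exec($ch);                        // Execute the request
    $httpResponse = curl_getinfo($ch, CURLINFO_HTTP_CODE); // HTTP response code
    curl_close($ch);

    if ($httpResponse >= 400) {                          // 4xx/5xx: treat as failure
        return FALSE;
    }
    return $results;
}

$page = curlGetWithStatus('http://www.packtpub.com/books?keys=php'); // placeholder URL
if ($page !== FALSE) {
    echo $page;
}
?>

Checking the response code before using the body is what lets a scraper react differently to a moved, missing, or forbidden page.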

Data sources for the Charts

Packt
31 Jul 2013
12 min read
Spreadsheets

In Spreadsheets, two preparation steps must be addressed in order to use a Spreadsheet as a data source with the Visualization API. The first is to identify the URL location of the Spreadsheet file for the API code. The second step is to set appropriate access to the data held in the Spreadsheet file.

Preparation

The primary method of access for a Spreadsheet behaving as a data source is through a JavaScript-based URL query. The query itself is constructed with the Google Query Language. If the URL request does not include a query, all data source columns and rows are returned in their default order. Querying a Spreadsheet also requires that the Spreadsheet file and the API application security settings are configured appropriately. Proper preparation of a Spreadsheet as a data source involves both setting the appropriate access as well as locating the file's query URL.

Permissions

In order for a Spreadsheet to return data to the Visualization API properly, access settings on the Spreadsheet file itself must allow view access to users. For a Spreadsheet that allows for edits, including form-based additions, permissions must be set to Edit. To set permissions on the Spreadsheet, select the Share button to open up the Sharing settings dialog. To be sure the data is accessible to the Visualization API, access levels for both the Visualization application and the Spreadsheet must be the same. For instance, if a user has access to the Visualization application but does not have view access to the Spreadsheet, the user will not be able to run the visualization, as the data is more restrictive to that user than the application. The opposite scenario is true as well, but less likely to cause confusion, as a user unable to access the API application is a fairly self-described problem. All Google applications handle access and permissions similarly. More information on this topic can be found on the Google Apps Support pages.

Google Permissions overview is available at http://support.google.com/drive/bin/answer.py?hl=en&answer=2494886&rd=1.

Get the URL path

At present, acquiring a query-capable URL for a Spreadsheet is not as straightforward a task as one might think. There are several methods by which a URL is generated for sharing purposes, but the URL format needed for a data source query can only be found by creating a gadget in the Spreadsheet. A Google Gadget is simply dynamic, HTML- or JavaScript-based web content that can be embedded in a web page. Google Gadgets also have their own API, and have capabilities beyond Spreadsheets applications.

Information on the Google Gadget API is available at https://developers.google.com/gadgets/.

Initiate gadget creation by selecting the Gadget... option from the Insert item on the menu bar. When the Gadget Settings window appears, select Apply & close from the Gadget Settings dialog. Choose any gadget from the selection window. The purpose of this procedure is simply to retrieve the correct URL for querying. In fact, deleting the gadget as soon as the URL is copied is completely acceptable. In other words, the specific gadget chosen is of no consequence. Once the gadget has been created, select Get query data source url… from the newly created gadget's drop-down menu. Next, determine and select the range of the Spreadsheet to query.
Either the previously selected range when the gadget was created, or the entire sheet, is acceptable, depending on the needs of the Visualization application being built. The URL listed under Paste this as a gadget data source url in the Table query data source window is the correct URL to use with API code requiring query capabilities. Be sure to select the desired cell range, as the URL will change with the various options.

Important note
Google Gadgets are to be retired in 2013, but the query URL is still part of the gadget object at the time of publication. Look for the method of finding the query URL to change as Gadgets are retired.

Query

Use the URL retrieved from the Spreadsheet Gadget to build the query. The following query statement is set to query the entire Spreadsheet of the key indicated:

var query = new google.visualization.Query('https://docs.google.com/spreadsheet/tq?key=0AhnmGz1SteeGdEVsNlNWWkoxU3ZRQjlmbDdTTjF2dHc&headers=-1');

Once the query is built, it can then be sent. Since an external data source is by definition not always under the explicit control of the developer, a valid response to a query is not necessarily guaranteed. In order to prevent hard-to-detect data-related issues, it is best to include a method of handling erroneous returns from the data source. The following query.send function also informs the application how to handle information returned from the data source, regardless of quality.

query.send(handleQueryResponse);

The handleQueryResponse function sent along with the query acts as a filter, catching and handling errors from the data source. If an error is detected, the handleQueryResponse function displays an alert message. If the response from the data source is valid, the function proceeds and draws the visualization.

function handleQueryResponse(response) {
  if (response.isError()) {
    alert('Error in query: ' + response.getMessage() + ' ' + response.getDetailedMessage());
    return;
  }
  var data = response.getDataTable();
  visualization = new google.visualization.Table(document.getElementById('visualization'));
  visualization.draw(data, null);
}

Best practice
Be prepared for potential errors by planning for how to handle them.

For reference, the previous example is given in its complete HTML form:

<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8"/>
<title>Google Visualization API Sample</title>
<script type="text/javascript" src="http://www.google.com/jsapi"></script>
<script type="text/javascript">
  google.load('visualization', '1', {packages: ['table']});
</script>
<script type="text/javascript">
var visualization;
function drawVisualization() {
  // To see the data that this visualization uses, browse to
  // https://docs.google.com/spreadsheet/ccc?key=0AhnmGz1SteeGdEVsNlNWWkoxU3ZRQjlmbDdTTjF2dHc&usp=sharing
  var query = new google.visualization.Query('https://docs.google.com/spreadsheet/tq?key=0AhnmGz1SteeGdEVsNlNWWkoxU3ZRQjlmbDdTTjF2dHc&headers=-1');
  // Send the query with a callback function.
  query.send(handleQueryResponse);
}

function handleQueryResponse(response) {
  if (response.isError()) {
    alert('Error in query: ' + response.getMessage() + ' ' + response.getDetailedMessage());
    return;
  }
  var data = response.getDataTable();
  visualization = new google.visualization.Table(document.getElementById('visualization'));
  visualization.draw(data, null);
}
google.setOnLoadCallback(drawVisualization);
</script>
</head>
<body style="font-family: Arial;border: 0 none;">
<div id="visualization" style="height: 400px; width: 400px;"></div>
</body>
</html>

View live examples for Spreadsheets at http://gvisapi-packt.appspot.com/ch6-examples/ch6-datasource.html.

Apps Script method

Just as the Visualization API can be used from within an Apps Script, external data sources can also be requested from the script. In the Apps Script Spreadsheet example presented earlier in this article, the DataTable() creation was performed within the script. In the following example, the data table creation has been removed and a .setDataSourceUrl option has been added to Charts.newAreaChart(). The script otherwise remains the same.

function doGet() {
  var chart = Charts.newAreaChart()
      .setDataSourceUrl("https://docs.google.com/spreadsheet/tq?key=0AhnmGz1SteeGdEVsNlNWWkoxU3ZRQjlmbDdTTjF2dHc&headers=-1")
      .setDimensions(600, 400)
      .setXAxisTitle("Age Groups")
      .setYAxisTitle("Population")
      .setTitle("Chicago Population by Age and Gender - 2010 Census")
      .build();
  var ui = UiApp.createApplication();
  ui.add(chart);
  return ui;
}

View live examples in Apps Script at https://script.google.com/d/1Q2R72rGBnqPsgtOxUUME5zZy5Kul53r_lHIM2qaE45vZcTlFNXhTDqrr/edit.

Fusion Tables

Fusion Tables are another viable data source ready for use by the Visualization API. Fusion Tables offer benefits over Spreadsheets beyond just the Google Map functionality. The Tables API also allows for easier data source modification than is available in Spreadsheets.

Preparation

Preparing a Fusion Table to be used as a source is similar in procedure to preparing a Spreadsheet as a data source. The Fusion Table must be shared with the intended audience, and a unique identifier must be gathered from the Fusion Tables application.

Permissions

Just as with Spreadsheets, Fusion Tables must allow a user a minimum of view permissions in order for an application using the Visualization API to work properly. From the Sharing settings window in Fusion Tables, give the appropriate users view access as a minimum.

Get the URL path

Referencing a Fusion Table is very similar in method to Spreadsheets. Luckily, the appropriate URL ID information is slightly easier to find in Fusion Tables than in Spreadsheets. With the Sharing settings window open, there is a field at the top of the page containing the Link to share. At the end portion of the link, following the characters dcid=, is the Table's ID. The ID will look something like the following:

1Olo92KwNin8wB4PK_dBDS9eghe80_4kjMzOTSu0

This ID is the unique identifier for the table.

Query

The Google Fusion Tables API includes SQL-like queries for the modification of Fusion Tables data from outside the GUI interface. Queries take the form of HTTP POST and GET requests and are constructed using the Fusion Tables API query capabilities. Data manipulation using the Fusion Tables API is beyond the scope of this article, but a simple example is offered here as a basic illustration of functionality.
A Fusion Table query uses the API SELECT option, formatted as:

SELECT Column_name FROM Table_ID

Here Column_name is the name of the Fusion Table column and Table_ID is the table's ID extracted from the Sharing settings window. If the SELECT call is successful, the requested information is returned to the application in JSON format. The Visualization API drawChart() is able to take the SELECT statement and the corresponding data source URL as options for the chart rendering. The male and female data from the Fusion Tables 2010 Chicago Census file have been visualized using the drawChart() technique.

function drawVisualization() {
  google.visualization.drawChart({
    containerId: 'visualization',
    dataSourceUrl: 'http://www.google.com/fusiontables/gvizdata?tq=',
    query: 'SELECT Age, Male, Female FROM 1Olo92KwNin8wB4PK_dBDS9eghe80_4kjMzOTSu0',
    chartType: 'AreaChart',
    options: {
      title: 'Chicago Population by Age and Sex - 2010 Census',
      vAxis: { title: 'Population' },
      hAxis: { title: 'Age Groups' }
    }
  });
}

The preceding code renders the census data as an area chart. Live examples are available at http://gvisapi-packt.appspot.com/ch6-examples/ch6-queryfusion.html.

Important note
Fusion Table query responses are limited to 500 rows. See the Fusion Tables API documentation for other resource parameters.

API Explorer

With so many APIs available to developers using the Google platform, testing individual API functionality can be time consuming. The same issue arises for GUI applications used as a data source. Fortunately, Google provides API methods for its graphical applications as well. The ability to test API requests against Google's infrastructure is a desirable practice for all API programming efforts. To support this need, Google maintains the APIs Explorer service. This service is a console-based web application that allows queries to be submitted to APIs directly, without an application to frame them. This is helpful functionality when attempting to verify whether a data source is properly configured. To check whether the Fusion Tables 2010 U.S. Census data instance is configured properly, a query can be sent to list all columns, which shows which columns are actually exposed to the Visualization API application.

Best practice
Use the Google API Explorer service to test whether API queries work as intended.

To use the API Explorer for Fusion Tables, select Fusion Tables API from the list of API services. API functions available for testing are listed on the Fusion Tables API page. Troubleshooting a Chart with a Fusion Tables data source usually involves first verifying that all columns are available to the visualization code. If a column is not available, or is not formatted as expected, a visualization issue related to data problems may be difficult to troubleshoot from inside the Visualization API environment. The API call that best performs a simple check on column information is the fusiontables.column.list item. Selecting fusiontables.column.list opens up a form-based interface. The only required information is the Table ID (collected from the Share settings window in the Fusion Tables file). Click on the Execute button to run the query. The API Explorer tool will then show the GET query sent to the Fusion Table in addition to the results it returned. For the fusiontables.column.list query, columns are returned in bracketed sections. Each section contains attributes of that column.
The queried attributes should look familiar, as they are the fusiontables.column.list result of a query against the 2010 Chicago Census data Fusion Table.

Best practice
The Column List tool is helpful when troubleshooting Fusion Table to API code connectivity. If the table is able to return coherent values through the tool, it can generally be assumed that access settings are appropriate and the code itself may be the source of connection issues.

Fusion Tables—row and query reference is available at https://developers.google.com/fusiontables/docs/v1/sqlreference.
Information on API Explorer—column list is available at https://developers.google.com/fusiontables/docs/v1/reference/column/list#try-it.
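As a final, hedged sketch (not code from the article), the Query-plus-error-handler pattern shown earlier for Spreadsheets can also be pointed at the Fusion Table used in this article; it assumes the Visualization API and the table package are loaded exactly as in the earlier HTML listing, and it reuses the table ID and column names quoted above:

function drawVisualization() {
  // Point the query at the Fusion Tables gviz endpoint and select the census columns.
  var query = new google.visualization.Query(
      'http://www.google.com/fusiontables/gvizdata?tq=' +
      encodeURIComponent('SELECT Age, Male, Female FROM 1Olo92KwNin8wB4PK_dBDS9eghe80_4kjMzOTSu0'));
  query.send(handleQueryResponse);
}

function handleQueryResponse(response) {
  // Catch data source errors before attempting to draw anything.
  if (response.isError()) {
    alert('Error in query: ' + response.getMessage() + ' ' + response.getDetailedMessage());
    return;
  }
  var data = response.getDataTable();
  var visualization = new google.visualization.Table(document.getElementById('visualization'));
  visualization.draw(data, null);
}

Using the same error handler for both Spreadsheet and Fusion Table sources keeps data-related failures visible regardless of which backend is serving the chart.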

Participating in a business process (Intermediate)

Packt
31 Jul 2013
5 min read
The hurdles and bottlenecks for financial services, from an IT point of view, are:

Silos of data
Outdated IT systems and many applications running on legacy and non-standards-based systems
Business process and reporting systems not in sync with each other
Lack of real-time data visibility
Automated decision making
Ability to change and manage business processes in accordance with changes in business dynamics
Partner management
Customer satisfaction

This is where BPM plays a key role in bridging the gap between key business requirements and technology or business hurdles. In a real-life scenario, a typical home loan use case would be tied to the Know Your Customer (KYC) regulatory requirement. In India, for example, the Reserve Bank of India (RBI) has passed guidelines that make it mandatory for banks to properly know their customers. The RBI mandates that banks collect their customers' proof of identity, recent photographs, and Income Tax PAN. Proof of residence can be a voter card, a driving license, or a passport copy.

Getting ready

We start with the source code from the previous recipe. We will add a re-usable e-mail or SMS notification process. It is always a best practice to add a new process if it is called multiple times in the same process. This can be a subprocess within the main process itself, or it can be a part of the same composite outside the main process. We will add a new regulatory requirement that allows the customer to add KYC items such as a photo, proof of address, and an Income Tax PAN copy as attachments that will be checked into the WebCenter Content repository. These checks become a part of the customer verification stage before finance approval. We will make KYC a subprocess, with scope for expansion under a different scenario. We will also save the process data into a filesystem or a JMS messaging queue at the end of the loan process. In a banking scenario, this can also be the integration stage for other applications, such as a CRM application or any other application.

How to do it…

Let's perform the following steps:

Launch JDeveloper and open the composite.xml of LoanApplicationProcess in the Design view.
Drag-and-drop a new BPMN Process component from the Component Palette.
Create the Send Notifications process next to the existing LoanApplicationProcess, and edit the new process. The Send Notifications process will take To e-mail ID, From e-mail ID, Subject, and CC as input parameters, and send an e-mail to the given e-mail ID.
Similarly, we will drag-and-drop a File Adapter component from the Component Palette that saves the customer data into a file. We place this component at the end of the LoanApplication process, just before the End activity.
We will use this notification service to notify Verification Officers about the arrival of a new eligible application that needs to be verified.
In the Application Verification Officer stage, we will add a subprocess, KYC, that will be assigned to the loan initiator—James Cooper in our case. This will be preceded by sending an e-mail notification to the applicant asking for KYC details such as PAN number, scanned photograph, and voter ID, as requested by the Verification Officers.
Now, let us implement Save Loan Application by invoking the File Adapter service. The Email notification services are also available out of the box.
How it works…

The outputs of this recipe are re-usable services that can be used across multiple service calls, such as the notification services. This recipe also demonstrates how to use subprocesses and change the process to meet regulatory requirements. Let's understand the output by walking through our use case scenario:

When the process is initiated, the e-mail notification gets triggered at the appropriate stages of the process. Conan Doyle and John Steinbeck will get the e-mail requesting them to process the application, with the required information about the applicant, along with the link to BPM Workspace. The KYC task also sends an e-mail to James Cooper, requesting the documents required for the KYC check. James Cooper logs in to the James Bank WebCenter Portal and sees that there is a task assigned to him to upload his KYC details. James Cooper clicks on the task link, submits the required soft copy documents, and gets them checked into the content repository once the form is submitted. The start-to-end process flow is then complete.

Summary

BPM Process Spaces, which is an extension template of BPM, allows process and task views to be exposed to WebCenter Portal. The advantage of having Process Spaces available within the Portal is that users can collaborate with others using out-of-the-box Portal features such as wikis, discussion forums, blogs, and content management. This improves productivity, as the user need not log in to different applications for different purposes; all the required data and information are available within the Portal environment. It is also possible to expose some of the WSRP-supported application portlets (for example, HR portlets from PeopleSoft) in a corporate portal environment. All of this adds up to provide higher visibility of the entire business process, and a way of working and collaborating together in an enterprise business environment.

Resources for Article:

Further resources on this subject:
Managing Oracle Business Intelligence [Article]
Oracle E-Business Suite: Creating Bank Accounts and Cash Forecasts [Article]
Getting Started with Oracle Information Integration [Article]

Model Design Accelerator

Packt
30 Jul 2013
6 min read
By the end of this article you will be able to use Model Design Accelerator to design a new Framework Manager model. To introduce Model Design Accelerator, we will use a fairly simple schema based on a rental star schema, derived from the MySQL Sakila sample database. This database can be downloaded from http://dev.mysql.com/doc/sakila/en/. It is just one example of a number of possible dimensional models based on this sample database.

The Model Design Accelerator user interface

The user interface of Model Design Accelerator is very simple, consisting of only two panels:

Explorer Tree: This contains details of the database tables and views from the data source.
Model Accelerator: This contains a single fact table surrounded by four dimension tables, and is the main work area for the model being designed.

By clicking on the labels (Explorer Tree and Model Accelerator) at the top of the window, it is possible to hide either of these panels, but having both panels always visible is beneficial.

Starting Model Design Accelerator

Model Design Accelerator is started from the Framework Manager initial screen:

Select Create a new project using Model Design Accelerator…. This will start the new project creation wizard, which is exactly the same as if you were starting any new project.
Select the data source to import the database tables into the new model. After importing the database tables, the project creation wizard will display the Model Design Accelerator Introduction screen.
After reading the instructions, click on the Close button to continue. This will then show the Model Design Accelerator workspace.

Adding tables to your workspace

The first step in creating your model with Model Design Accelerator is to add the dimension and fact tables to your model:

From the Explorer panel, drag-and-drop dim_date, dim_film, dim_customer, and dim_store to the four New Query Subject boxes in the Model Accelerator panel.
After adding your queries, right-click on the boxes to rename the queries to Rental Date Dim, Film Dim, Customer Dim, and Store Dim respectively. If not all query columns are required, it is also possible to expand the dimension tables and drag-and-drop individual columns to the query boxes.
In the Explorer Tree panel, expand the fact_rental table by clicking on the (+) sign beside the name, and from the expanded tree drag-and-drop the count_returns, count_rentals, and rental_duration columns to the Fact Query Subject box. Rename the Fact Query Subject to Rental Fact.
Additional dimension queries can be added to the model by clicking on the top-left icon in the Model Accelerator panel, and then by dragging and dropping the required query onto the workspace window. Since we have a start_date and an end_date for the rental period, add a second copy of the date_dim table by clicking on the icon and dragging the table from the Explorer view into the workspace. Also rename this query as Return Date Dim.

Adding joins to your workspace

After we have added our database table columns to the workspace, we now need to add the relationship joins between the dimension and fact tables. To do this:

Double-click on the Rental Date Dim table; this will expand the date_dim and the fact_rental tables in the workspace window.
Click on the Enter relationship creation mode link.
Select the date_key column in the dim_date table, and the rental_date_key column in the fact_rental table.
Click on the Create relationship icon.
Click on OK to create this join.
Close the Query Subject Diagram by clicking on the (X) symbol in the top-right corner.
Repeat this procedure for each of the other four tables, so that every dimension query is joined to the Rental Fact query.

Generating the Framework Manager model

Once we have completed our model in Model Design Accelerator, we need to create a Framework Manager model:

Click on the Generate Model button.
Click on Yes to generate your model. The Framework Manager model will be generated and will open.

When you generate your model, all of the Model Advisor tests are automatically applied to the resulting model. You should review any issues that have been identified in the Verify Results tab, and decide whether you need to fix them. When you generate the model, only those query items that are required will be used to create the Framework Manager model:

The Physical View tab will contain only those tables required by your star schema model.
The Business View tab will contain model query subjects containing only the columns used in your star schema model.
The Presentation View tab will only contain shortcuts to the query subjects that exist in the Business View tab.

After generating your model, you can use Framework Manager to improve the model by adding calculations, filters, dimensions, measures, and so on. Each time you generate a Framework Manager model from your Model Design Accelerator model, a new namespace is created in the current Framework Manager model, and any improvements you want to use will also need to be applied to these new namespaces. From Framework Manager you can return to Model Design Accelerator at any time to continue making changes to your star schema. To return to Model Design Accelerator from within Framework Manager:

From the Tools menu, select Run Model Design Accelerator. You may choose to continue with the same model or create a new model.

To make your star schema model available to the report authors, you must first create a package and then publish the package to your Cognos Reporting Server.

Summary

In this article, we have looked at Model Design Accelerator. This is a tool that allows a novice modeler, or even an experienced modeler, to create a new Framework Manager model quickly and easily.

Resources for Article:

Further resources on this subject:
Integrating IBM Cognos TM1 with IBM Cognos 8 BI [Article]
How to Set Up IBM Lotus Domino Server [Article]
IBM Cognos 10 BI dashboarding components [Article]

First steps with R

Packt
30 Jul 2013
6 min read
Obtaining and installing R

The way to obtain R is to download it from the CRAN website (http://www.r-project.org/). The Comprehensive R Archive Network (CRAN) is a network of FTP and web servers around the world that store identical, up-to-date versions of the code and documentation for R. CRAN is directly accessible from the R website, and on that website it is also possible to find information about R, some technical manuals, the R Journal, and details about the packages developed for R and stored in the CRAN repositories.

The functionality of the R environment can be expanded through software libraries, which can be installed and loaded when needed. These libraries, or packages, are a collection of source code and other additional files that, when installed in R, allow the user to load them into the workspace via a call to the library() function. An example of code to load the package lattice is as follows:

> library(lattice)

An R installation contains one or more libraries of packages. Some of these packages are part of the basic installation and are loaded automatically as soon as the session is started. Others can be installed from CRAN, the official R repository, or downloaded and installed manually.

Interacting with the console

As soon as you start R, you will see that a workspace is open in the R Console window. The workspace is the environment in which you are working, where you will load your data and create your variables. The screen prompt > is the R prompt that waits for commands.

On the starting screen, you can type any function or command, or you can use R to perform basic calculations. R uses the usual symbols for addition (+), subtraction (-), multiplication (*), division (/), and exponentiation (^). Parentheses ( ) can be used to specify the order of operations. R also provides %% for taking the modulus and %/% for integer division. Comments in R are introduced by the character #, so everything after that character up to the end of the line will be ignored by R.

R has a number of built-in functions, for example, sin(x), cos(x), tan(x) (all in radians), exp(x), log(x), and sqrt(x). Some special constants such as pi are also pre-defined. You can see an example of the use of such a function in the following code:

> exp(2.5)
[1] 12.18249

Understanding R objects

In every computer language, variables provide a means of accessing the data stored in memory. R does not provide direct access to the computer’s memory but rather provides a number of specialized data structures called objects. These objects are referred to through symbols or variables.

Vectors

The basic object in R is the vector; even scalars are vectors of length one. Vectors can be thought of as a series of data of the same class. There are six basic vector types (called atomic vectors): logical, integer, real, complex, string (or character), and raw. Integer and real vectors represent numeric objects; logicals are a Boolean data type with possible values TRUE or FALSE. Among these atomic vectors, the more common ones are logical, string, and numeric (integer and real).

There are several ways to create vectors. For instance, the operator : (colon) is a sequence-generating operator; it creates sequences by incrementing or decrementing by one.

> 1:10
 [1]  1  2  3  4  5  6  7  8  9 10
> 5:-6
 [1]  5  4  3  2  1  0 -1 -2 -3 -4 -5 -6

If the interval between the numbers is not one, you can use the seq() function.
Here is an example:

> seq(from=2, to=2.5, by=0.1)
[1] 2.0 2.1 2.2 2.3 2.4 2.5

One of the more important features of R is the possibility of using an entire vector as an argument to a function, thus avoiding the use of explicit loops. Most of the functions in R allow the use of vectors as arguments; as an example, the use of some of these functions is shown as follows:

> x <- c(12,10,4,6,9)
> max(x)
[1] 12
> min(x)
[1] 4
> mean(x)
[1] 8.2

Matrices and arrays

In R, the matrix notation is extended to elements of any kind, so, for example, it is possible to have a matrix of character strings. Matrices and arrays are basically vectors with a dimension attribute. The function matrix() may be used to create matrices. By default, this function creates the matrix by column; alternatively, it is possible to tell the function to build the matrix by row:

> matrix(1:9,nrow=3,byrow=TRUE)
     [,1] [,2] [,3]
[1,]    1    2    3
[2,]    4    5    6
[3,]    7    8    9

Lists

A list in R is a collection of different objects. One of the main advantages of lists is that the objects contained within a list may be of different types, for example, numeric and character values. In order to define a list, you simply need to provide the objects that you want to include as arguments to the list() function.

Data frame

A data frame corresponds to a data set; it is basically a special list in which the elements have the same length. Elements may be of different types in different columns, but within the same column all the elements are of the same type. You can easily create data frames using the function data.frame(), and a specific column can be recalled using the operator $. (A short sketch showing lists and data frames together follows the resource list below.)

Top features you'll want to know about

In addition to basic object creation and manipulation, many more complex tasks can be performed with R, spanning data manipulation, programming, statistical analysis, and the production of very high quality graphs. Some of the most useful features are:

Data input and output
Flow control (for, if…else, while)
Creating your own functions
Debugging functions and handling exceptions
Plotting data

Summary

In this article we saw what R is, how to obtain and install R, and how to interact with the console. We also looked at a few R objects and at the top features you will want to know about.

Resources for Article:

Further resources on this subject:
Organizing, Clarifying and Communicating the R Data Analyses [Article]
Customizing Graphics and Creating a Bar Chart and Scatterplot in R [Article]
Graphical Capabilities of R [Article]
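As a small, hedged illustration of the list and data frame ideas above (the variable names are made up for the example, and the numeric values reuse the vector from the earlier snippet), the following shows list(), data.frame(), and the $ operator together:

# A list can mix objects of different types
info <- list(name = "Chicago", year = 2010, measured = TRUE)
info$name                      # access a list element by name: "Chicago"

# A data frame: columns of equal length, possibly of different types
ages   <- c(12, 10, 4, 6, 9)
groups <- c("a", "b", "c", "d", "e")
df     <- data.frame(group = groups, age = ages)

mean(df$age)                   # recall a column with $ and pass it to a function: 8.2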

Avro Source Sink

Packt
19 Jul 2013
3 min read
(For more resources related to this topic, see here.) A typical configuration might look something like the following. To use the Avro Source, you specify the type property with a value of avro. You need to provide a bind address and port number to listen on:

collector.sources=av1
collector.sources.av1.type=avro
collector.sources.av1.bind=0.0.0.0
collector.sources.av1.port=42424
collector.sources.av1.channels=ch1
collector.channels=ch1
collector.channels.ch1.type=memory
collector.sinks=k1
collector.sinks.k1.type=hdfs
collector.sinks.k1.channel=ch1
collector.sinks.k1.hdfs.path=/path/in/hdfs

Here we have configured the agent on the right that listens on port 42424, uses a memory channel, and writes to HDFS. I've used the memory channel for brevity in this example configuration. Also, note that I've given this agent a different name, collector, just to avoid confusion.

The agents on the left—feeding the collector tier—might have a configuration similar to this. I have left the sources off this configuration for brevity (a hedged sketch of one possible source definition appears after this article):

client.channels=ch1
client.channels.ch1.type=memory
client.sinks=k1
client.sinks.k1.type=avro
client.sinks.k1.channel=ch1
client.sinks.k1.hostname=collector.example.com
client.sinks.k1.port=42424

The hostname value, collector.example.com, has nothing to do with the agent name on that machine; it is the host name (or you can use an IP) of the target machine with the receiving Avro Source. This configuration, named client, would be applied to both agents on the left, assuming both had similar source configurations.

Since I don't like single points of failure, I would configure two collector agents with the preceding configuration and instead set each client agent to round robin between the two using a sink group. Again, I've left off the sources for brevity:

client.channels=ch1
client.channels.ch1.type=memory
client.sinks=k1 k2
client.sinks.k1.type=avro
client.sinks.k1.channel=ch1
client.sinks.k1.hostname=collectorA.example.com
client.sinks.k1.port=42424
client.sinks.k2.type=avro
client.sinks.k2.channel=ch1
client.sinks.k2.hostname=collectorB.example.com
client.sinks.k2.port=42424
client.sinkgroups=g1
client.sinkgroups.g1=k1 k2
client.sinkgroups.g1.processor.type=load_balance
client.sinkgroups.g1.processor.selector=round_robin
client.sinkgroups.g1.processor.backoff=true

Summary

In this article, we covered tiering data flows using the Avro Source and Sink. More information on this topic can be found in the book Apache Flume: Distributed Log Collection for Hadoop.

Resources for Article:

Further resources on this subject:

Supporting hypervisors by OpenNebula [Article]
Integration with System Center Operations Manager 2012 SP1 [Article]
VMware View 5 Desktop Virtualization [Article]
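The client configurations above intentionally leave out the source definitions. Purely as an illustration (not taken from the book), a minimal source block for the client agent could look like the following, using Flume's exec source to tail a log file; the command and file path are placeholders, while the agent and channel names match the examples above.

client.sources=s1
client.sources.s1.type=exec
client.sources.s1.command=tail -F /var/log/app/app.log
client.sources.s1.channels=ch1

Any other source type, such as a syslog or spooling directory source, would be wired to the ch1 channel in the same way.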

DPM Non-aware Windows Workload Protection

Packt
16 Jul 2013
18 min read
(For more resources related to this topic, see here.)

Protecting DFS with DPM

DFS stands for Distributed File System. It was introduced in Windows Server 2003 and is a set of services, available as a role on Windows Server operating systems, that allows you to group file shares held in different locations (different servers) under one folder known as the DFS root. The actual locations of the file shares are transparent to the end user. DFS is also often used for redundancy of file shares. For more information on DFS, see:

Windows Server 2008: http://technet.microsoft.com/en-us/library/cc753479%28v=ws.10%29.aspx
Windows Server 2008 R2 and Windows Server 2012: http://technet.microsoft.com/en-us/library/cc732006.aspx

Before DFS can be protected, it is important to know how it is structured. DFS consists of both data and configuration information:

The configuration for DFS is stored in the registry of each server, and in either the DFS tree for standalone DFS deployments, or in Active Directory when domain-based DFS is deployed.
DFS data is stored on each server in the DFS tree. The data consists of the multiple shares that make up the DFS root.

Protecting DFS with DPM is fairly straightforward. It is recommended to protect the actual file shares directly on each of the servers in the DFS root. If you have a standalone DFS deployment you should protect the system state of the servers in the DFS root, and if you have a domain-based DFS deployment you should protect the Active Directory of the domain controller that hosts the DFS root. If you are using DFS replication, it is also recommended to protect the shadow copy components on the servers that host the replication data, in addition to the previously mentioned items. These methods allow you to restore DFS by restoring the data and either the system state or Active Directory, depending on your deployment type.

Another option is to use the DfsUtil tool to export/import your DFS configuration. This is a command-line utility that comes with Windows Server and can export the namespace configuration to a file. The configuration can then be imported back into a DFS server to restore a DFS namespace. DPM can be set up to protect the DFS export; you would still need to protect the actual data directly. An example of using the DfsUtil tool: run DfsUtil root export \\domainname\rootname dfsrootname.xml to export the DFS configuration to an XML file, then run DfsUtil root import to import the DFS configuration back in (a short sketch of scheduling this export appears just before the CRM component lists below). For more information on the DfsUtil tool, visit the following URL:

http://blogs.technet.com/b/josebda/archive/2009/05/01/using-the-windows-server-2008-dfsutil-exe-command-line-to-manage-dfs-namespaces.aspx

That covers the backing up of DFS with DPM.

Protecting Dynamics CRM with DPM

Microsoft Dynamics CRM is Microsoft's customer relationship management (CRM) software. Version 1.0 was released in 2003; it then progressed to Version 4.0, and the latest version is 2011. CRM is part of the Microsoft Dynamics product family. In this section we will cover protecting Versions 4.0 and 2011.

Note that when protecting Microsoft Dynamics CRM, whether Version 4.0 or 2011, you should keep a note of your update-rollup level somewhere safe, so that you can reinstall CRM to that level in the event of a restore. You will need to restore the CRM database, and this could lead to an error if CRM is not at the correct update level.
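Tying the DfsUtil idea above together: the export is only useful if it runs regularly and lands in a folder that DPM already protects. The following is a hedged sketch of one way to do that with a small batch file and the built-in schtasks command; the namespace name, folder, and schedule values are placeholders and are not taken from the original text.

rem export-dfs.cmd - export the DFS namespace configuration into a DPM-protected folder
dfsutil root export \\example.com\dfsroot C:\DFSExports\dfsroot.xml

rem register a daily run of the batch file (run once from an elevated prompt)
schtasks /Create /TN "DFS Config Export" /TR "C:\Scripts\export-dfs.cmd" /SC DAILY /ST 01:00 /RU SYSTEM

With C:\DFSExports included in a DPM protection group, each recovery point then carries a recent copy of the namespace configuration alongside the share data.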
To protect Microsoft Dynamics CRM 4.0, back up the following components:

Microsoft CRM Server database
This is straightforward; you simply need to protect the SQL CRM databases. The two databases you want to protect are the following:
The configuration database: MSCRM_CONFIG
The organization database: OrganizationName_MSCRM

Microsoft CRM Server program files
By default, these files will be located at C:\Program Files\Microsoft CRM.

Microsoft CRM website
By default, the CRM website files are located in the C:\Inetpub\wwwroot directory. The web.config file can be protected; it only needs protecting if it has been changed from the default settings.

Microsoft CRM registry subkey
Back up the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSCRM key.

Microsoft CRM customizations
To protect customizations or any third-party add-ons, you will need to understand the specific components to back up and protect.

Other components to back up for protecting Microsoft CRM include the following:
System state of your domain controller.
Exchange server, if the CRM's e-mail router is used.

To protect Microsoft Dynamics CRM 2011, back up the following components:

Microsoft CRM 2011 databases
This is straightforward; you simply need to protect the SQL CRM databases. The two databases you want to protect are:
The configuration database: MSCRM_CONFIG
The organization database: OrganizationName_MSCRM

Microsoft CRM 2011 program files
By default, these files will be located at C:\Program Files\Microsoft CRM.

Microsoft CRM 2011 website
By default, the CRM website files are located in the C:\Program Files\Microsoft CRM\CRMWeb directory. The web.config file can be protected; it only needs protecting if it has been changed from the default settings.

Microsoft CRM 2011 registry subkey
Back up the HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSCRM subkey.

Microsoft CRM 2011 customizations
To protect customizations or any third-party add-ons, you will need to understand the specific components to back up and protect.

Other components to back up for protecting Microsoft CRM 2011 include:
System state of your domain controller.
Exchange server, if the CRM's e-mail router is used.
SharePoint, if CRM and SharePoint integration is in use.

Note that for both CRM 4.0 and CRM 2011, you could have more than one OrganizationName_MSCRM database if you have more than one organization in CRM. Be sure to protect all of the OrganizationName_MSCRM databases that may exist. That wraps up the Microsoft Dynamics CRM protection for both 4.0 and 2011; you simply need to configure protection of the mentioned components with DPM. Now let's look at what it will take to protect another product from the Dynamics family.

Protecting Dynamics GP with DPM

Dynamics GP is Microsoft's ERP and accounting software package for mid-market businesses. GP has standard accounting functions but it can do more, such as Sales Order Processing, Order Management, Inventory Management, and Demand Planner for forecasting, making it usable as a full-blown ERP. GP was once known as Great Plains Software before its acquisition by Microsoft. The most recent versions are Microsoft Dynamics GP 10.0 and Dynamics GP 2010 R2.

GP holds your organization's financial data. If you use it as an ERP solution, it holds even more critical data, and losing this data could be devastating to an organization. Yes, there is a built-in backup utility in GP, but it does not cover all the bases in protecting your GP deployment.
In fact, the built-in backup process only backs up the SQL database, and does not cover items like:

Customized forms
Reports
Financial statement formats
The sysdata folder

These are the GP components you should protect with DPM:

SQL administrative databases: Master, TempDB, and Model
Microsoft Dynamics GP system database (DYNAMICS)
Each of your company databases
If you use SQL Server Agent to schedule automatic tasks, back up the msdb database
forms.dic (for customized forms), which can be found in %systemdrive%\Program Files (x86)\Microsoft Dynamics\GP2010
reports.dic (for reports), which can be found in %systemdrive%\Program Files (x86)\Microsoft Dynamics\GP2010

Backing up these components with DPM should be sufficient protection in the event a restore is needed.

Protecting TMG 2010 with DPM

Threat Management Gateway (TMG) is a part of the Forefront product family. The predecessor to TMG is Internet Security and Acceleration Server (ISA Server). TMG is fundamentally a firewall, but a very powerful one, with features such as VPN, web caching, reverse proxy, advanced stateful packet inspection, WAN failover, malware protection, routing, load balancing, and much more. There have been several forum threads on the Microsoft DPM TechNet forums asking about DPM protecting TMG, which sparked the inclusion of this section in the book.

TMG is a critical part of many networks and should have a high priority with regard to backup, right up there with your other critical business applications. In many environments, if TMG is down, a good number of users cannot access certain business applications, which causes downtime. Let's take a look at how and what to protect with regard to TMG.

The first step is to allow DPM traffic on TMG so that the agent can communicate with DPM. You will need to install the DPM agent on TMG and then start protecting it from there. Follow the ensuing steps to protect your TMG server:

On the TMG server, go to Start | All Programs | Microsoft TMG Server. Open the TMG Server Management MMC.
Expand Arrays and then the TMG Server computer, then click on Firewall Policy.
On the View menu, click on Show System Policy Rules.
Right-click on the Allow remote management from selected computers using MMC system policy rule. Select Edit System Policy.
In the System Policy Editor dialog box, click to clear the Enable this configuration group checkbox, and then click on OK.
Click on Apply to update the firewall configuration, and then click on OK.
Right-click on the Allow RPC from TMG server to trusted servers system policy rule. Select Edit System Policy.
In the System Policy Editor dialog box, click to clear the Enforce strict RPC compliance checkbox, and then click on OK.
Click on Apply to update the firewall configuration, and then click on OK.
On the View menu, click on Hide System Policy Rules.
Right-click on Firewall Policy. Select New and then Access Rule.
In the New Access Rule Wizard window, type a name in the Access rule name box. Click on Next.
Check the Allow checkbox and then click on Next.
In the This rule applies to list, select All outbound traffic from the drop-down menu and click on Next.
On the Access Rule Sources page, click on Add.
In the Add Network Entities dialog window, click on New and select Computer from the drop-down list. Now type the name of your DPM server and type the DPM server's IP address in the Computer IP Address field. Click on OK when you are done.
You will then see your DPM server listed under the Computers folder in the Add Network Entities window.
Select it and click on Add. This will bring the DPM computer into your access rule wizard. Click on Next.
In the Add Rule Destinations window, click on Add. The Add Network Entities window will come up again. In this window expand the Networks folder, then select Local Host and click on Add. Now click on Next.
Your rule should have both the DPM server and Local Host listed for both incoming and outgoing. Click on Next, leave the default All Users entry in the This rule applies to requests from the following user sets box, and click on Next again. Click on Finish.
Right-click on the new rule (DPM2010 in this example), and then click on Move Up.
Right-click on the new rule, and select Properties.
In the rule name properties dialog box (DPM2010 Properties), click on the Protocols tab, then click on Filtering. Now select Configure RPC Protocol.
In the Configure RPC protocol policy dialog box, check the Enforce strict RPC compliance checkbox, and then click on OK twice.
Click on Apply to update the firewall policy, and then click on OK.

Now you will need to attach the DPM agent for the TMG server. Follow the ensuing steps to complete this task:

Open the DPM Administrator Console.
Click on the Management tab on the navigation bar. Now click on the Agents tab.
On the Actions pane, click on Install. The Protection Agent Install Wizard window should now pop up.
Choose the Attach agents checkbox. Choose Computer on trusted domain, and click on Next.
Select the TMG server from the list, click on Add, and then click on Next.
Enter credentials for the domain account. The account that is used here needs to have administrative rights on the computer you are going to protect. Click on Next to continue.
You will receive a warning that DPM cannot tell whether the TMG server is clustered or not. Click on OK for this.
On the next screen, click on Attach to continue.

Next, you have to install the agent on the TMG firewall and point it to the correct DPM server. Follow the ensuing steps to complete this task:

From the TMG server that you will be protecting, access the DPM server over the network and copy the folder with the agent installer in it down to the local machine. Use this path: \\DPMSERVERNAME\%systemdrive%\program files\Microsoft DPM\DPM\ProtectionAgents\RA\3.0\3.0.7696.0\i386. Then, from the local folder on the protected computer, run dpmra.msi to install the agent.
Open a command prompt (make sure you have elevated privileges), change directory to C:\Program Files\Microsoft Data Protection Manager\DPM\bin, then run the following:

SetDpmServer.exe -dpmServerName <serverName> userName <userName>

Following is an example of the previous command:

SetDpmServer.exe -dpmServerName buchdpm

Now restart the TMG server.
Once your TMG server comes back, check the Windows services to make sure that the DPMRA service is set to automatic, and then start it.

That is it for configuring DPM to start protecting TMG, but there are a few more things that we still need to cover on this topic. With TMG backup, you can choose to back up certain components of TMG, depending on your recovery needs. With DPM you can back up the TMG hard drive, the TMG logs that are stored in SQL, TMG's system state, or a BMR of TMG.
Following is the list of components you should back up, depending on your circumstances. What can be included in a TMG server backup:

TMG configuration settings (exported through TMG)
TMG firewall settings (exported through TMG)
TMG logfiles (stored in SQL databases)
TMG install directory (only needed if you have custom forms for things such as an Outlook Web Access login screen)
TMG server system state
TMG BMR

None of the previous components are required for protection of TMG. In fact, protecting the SQL logfiles tends to cause more issues than it helps, as they change so often. These SQL log databases change so often that DPM will raise an error when the old SQL databases are no longer shown under protection. The logfiles are not required to restore your TMG. For a standard TMG restore, you will need to reinstall TMG, reconfigure NIC settings, import any certificates, and restore the TMG configuration and firewall settings. For more information on backing up TMG 2010, visit the following page: http://technet.microsoft.com/en-us/library/cc984454.aspx.

DPM cannot back up the TMG configuration and firewall settings natively. This needs to be scripted and scheduled through Windows Task Scheduler, with the export placed on the local hard drive; DPM can then back up the exported .XML settings from there. You can find the TMG server's export script at http://msdn.microsoft.com/en-us/library/ms812627.aspx. Place this script into a .VBS file, and then set up a scheduled task to call this file. This automates the export of your TMG server settings.

There is another way to back up the entire TMG server. This is a new type of protection, specific to TMG 2010. This protection is BMR and is available because TMG is now installed on top of Windows Server 2008 and Windows Server 2008 R2. Protecting the BMR of your TMG gives you the ability to restore your entire TMG in the event that it fails, configuration and firewall settings included. BMR will also bring back certificates and NIC card settings. Note that a BMR of TMG restored on a virtual machine can't use its NIC card settings; those can only be restored on the same hardware. Well, that covers how to protect TMG with DPM. As you can see, there are some improvements through BMR, and if you do not employ BMR protection you can still automate the process of protecting TMG.

How to protect IIS

Internet Information Services (IIS) is Microsoft's web server platform. It is included for free with Windows Server operating systems. Its modular nature makes it scalable to the web server needs of different organizations. The latest version is IIS 8. It can be used for more than standard web hosting, for example as an FTP server or for media delivery. Knowing what to protect when it comes to IIS will come in handy in almost any environment you may work in. Backing up IIS is one thing, but you also need to make sure you understand the websites or web applications you are running, so that you know how to back them up too. In this section, we are going to look at the protection of IIS (a small command-line sketch of snapshotting the IIS configuration appears at the end of this article).

To protect IIS, you should back up the following components:

IIS configuration files
Website or web application data
SSL certificates
Registry (only needed if the website or web application required modifications of the registry)
Metabase

The IIS configuration files are located in the %systemdrive%\Windows\System32\inetsrv\config directory (and subdirectories). The website or web application files are typically found in C:\inetpub\wwwroot.
Now, this is the default location, but the website or web application files can be located anywhere on an IIS server. To export SSL certificates directly from IIS, follow the ensuing steps:

Open the Microsoft IIS 7 console.
In the left-hand pane, select the server name.
In the center pane, click on the server certificates icon.
Right-click on the certificate you wish to export and select export.
Enter a file path, name the certificate file, and give it a password.
Click on OK and your certificate will be exported as a .pfx file in the path you specified.

The metabase is an internal database that holds IIS configuration data. It is made up of two files: MBSchema.xml and MetaBase.xml. These can be found in %SystemRoot%\system32\inetsrv. A good thing to know is that if you protect the system state of a server, the IIS configuration will be included in this backup. This does not include the website or web application files, so you will still need to protect these in addition to a system state backup. That covers the items you will need to protect IIS with DPM backup.

Protecting Lync 2010 with DPM

Lync 2010 is Microsoft's Unified Communications platform, complete with IM, presence, conferencing, enterprise video and voice, and more. Lync was formerly known as Office Communicator. Lync is quickly becoming an integral part of business communications. With Lync being a critical application to organizations, it is important to ensure this platform is backed up. Lync is a massive product with many moving parts. We are not going to cover all of Lync's architecture, as this would need its own book. We are going to focus on what should be backed up to ensure protection of your Lync deployment. Overall, we want to protect Lync's settings and configuration data. The majority of this data is stored in the Lync Central Management store. The following are the components that need to be protected in order to back up Lync:

Settings and configuration data
Topology configuration (Xds.mdf)
Location information (Lis.mdf)
Response group configuration (RgsConfig.mdf)

Data stored in databases
User data (Rtc.mdf)
Archiving data (LcsLog.mdf)
Monitoring data (csCDR.mdf and QoeMetrics.mdf)

File stores
Lync server file store
Archiving file store

These stores will be file shares on the Lync server, named in the format \\lyncservername\sharename. To track down these file shares if you don't know where they are, go to the Lync Topology Builder and look in the File stores node. Note that files named Meeting.Active should not be backed up; these files are in use and locked while a meeting takes place.

Other components are as follows:

Active Directory (user SIP data, a pointer to the Central Management store, and objects for Response Group and Conferencing Attendant)
Certification authority (CA) and certificates (if you use an internal CA)
Microsoft Exchange and Exchange Unified Messaging (UM), if you are using UM with your Exchange
Domain Name System (DNS) records and IP addresses
IIS on the Lync Server
DHCP configuration
Group Chat (if used)
XMPP gateways, if you are using an XMPP gateway
Public switched telephone network (PSTN) gateway configuration, if your Lync is connected to one
Firewall and load balancer (if used) configurations

Summary

Now that we have had a chance to look at several Microsoft workloads that are used in organizations today and how to protect them with DPM, you should have a good understanding of what it takes to back them up. These workloads included Lync 2010, IIS, CRM, GP, DFS, and TMG.
Note there are many more Microsoft workloads that DPM cannot protect natively, which we were unable to cover in this article.

Resources for Article:

Further resources on this subject:

Overview of Microsoft Dynamics CRM 2011 [Article]
Deploying .NET-based Applications on to Microsoft Windows CE Enabled Smart Devices [Article]
Working with Dashboards in Dynamics CRM [Article]
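As a complement to the IIS file locations listed in the article above, IIS 7 and later also ship with the appcmd utility, which can snapshot the server configuration into a backup folder that DPM can then protect. This is an illustrative sketch rather than part of the original text; the backup name is a placeholder.

rem create a named snapshot of the IIS configuration
%windir%\system32\inetsrv\appcmd.exe add backup "PreDPM-Config"

rem list the configuration backups that exist on this server
%windir%\system32\inetsrv\appcmd.exe list backups

The resulting backups are stored under the inetsrv directory (in a backup subfolder), so including that path in the protection group covers them alongside the website files.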

Measuring Performance with Key Performance Indicators

Packt
10 Jul 2013
4 min read
(For more resources related to this topic, see here.)

Creating the KPIs and the KPI watchlists

We're going to create Key Performance Indicators and watchlists in the first recipe. There should be comparable measure columns in the repository in order to create KPI objects. The following columns will be used in the sample scenario:

Shipped Quantity
Requested Quantity

How to do it

Click on the KPI link in the Performance Management section and select a subject area. The KPI creation wizard has five different steps.
The first step is the General Properties section, where we're going to write a description for the KPI object. The Actual Value and the Target Value attributes display the columns that we'll use in this scenario. The columns should be selected manually.
The Enable Trending checkbox is not selected by default. When you select the checkbox, trending options will appear on the screen. We're going to select the Day level from the Time hierarchy for trending in the Compare to Prior textbox and define a value for the Tolerance attribute. We're going to use 1 and % Change in this scenario.
Clicking on the Next button will display the second step, named Dimensionality. Click on the Add button to select dimension attributes. Select the Region column in the Add New Dimension window. After adding the Region column, repeat the step for the YEAR column. You shouldn't select any value to pin; both column values will be left unpinned.
Clicking on the Next button will display the third step, named States. You can easily configure the state values in this step. Select the High Values are Desirable value from the Goal drop-down list. By default, there are three states:

OK
Warning
Critical

Then click on the Next button and you'll see the Related Documents step. This is a list of supporting documents and links regarding the Key Performance Indicator. Click on the Add button to select one of the options. If you want to use another analysis as a supporting document, select the Catalog option and choose the analysis that contains some valuable information about the report. We're going to add a link, and you can easily define the address of the link. We'll use the http://www.abc.com/portal link.
Click on the Next button to display the Custom Attributes column values. To add a custom attribute that will be displayed in the KPI object, click on the Add button and define the values specified as follows:

Number: 1
Label: Dollars
Formula: "Fact_Sales"."Dollars"

Save the KPI object by clicking on the Save button. Right after saving the KPI object, you'll see the KPI content.
KPI objects cannot be published in the dashboards directly; we need KPI watchlists to publish them in the dashboards. Click on the KPI Watchlist link in the Performance Management section to create one. The New KPI Watchlist page will be displayed without any KPI objects.
Drag-and-drop the KPI object that was previously created from the Catalog pane onto the KPI watchlist. When you drop the KPI object, the Add KPI window will pop up automatically. You can select one of the available values for the dimensions. We're going to select the Use Point-of-View option. Enter a Label value, A Sample KPI, for this example.
You'll see the dimension attributes in the Point-of-View bar. You can easily select the values from the drop-down lists to have different perspectives. Save the KPI watchlist object.

How it works

KPI watchlists can contain multiple KPI objects based on business requirements.
These container objects can be published in the dashboards so that end users can access the content of the KPI objects through the watchlists. When you want to publish these watchlists, you'll need to select a value for the dimension attributes.

There's more

The Drill Down feature is also enabled in the KPI objects. If you want to access finer levels, you can just click on the hyperlink of the value you are interested in, and a more detailed level is displayed automatically.

Summary

In this article, we learnt how to create KPIs and KPI watchlists. Key Performance Indicators are the building blocks of strategy management. In order to implement the balanced scorecard management technique in an organization, you'll first need to create the KPI objects.

Resources for Article:

Further resources on this subject:

Oracle Integration and Consolidation Products [Article]
Managing Oracle Business Intelligence [Article]
Oracle Tools and Products [Article]

Getting Started with Oracle Data Guard

Packt
02 Jul 2013
13 min read
(For more resources related to this topic, see here.)

What is Data Guard?

Data Guard, which was introduced as the standby database in Oracle database Version 7.3 and took the name Data Guard with Version 9i, is a data protection and availability solution for Oracle databases. The basic function of Oracle Data Guard is to keep a synchronized copy of a database as a standby, so that it can take over in case the primary database becomes inaccessible to end users; such cases include hardware errors, natural disasters, and so on. Each new Oracle release added new functionality to Data Guard, and the product became more and more popular with offerings such as data protection, high availability, and disaster recovery for Oracle databases.

Using Oracle Data Guard, it's possible to direct user connections to a Data Guard standby database automatically, with no data loss, in case of an outage in the primary database. Data Guard also makes it possible to take advantage of the standby database for reporting, testing, and backup offloading. Corruptions on the primary database may be fixed automatically by using the non-corrupted data blocks on the standby database. There will be minimal outages (seconds to minutes) on the primary database during planned maintenance such as patching and hardware changes by using the switchover feature of Data Guard, which changes the roles of the primary and standby databases. All of these features are available with Data Guard, which doesn't require an installation, but rather a cloning and configuration of the Oracle database.

A Data Guard configuration consists of two main components: the primary database and the standby database. The primary database is the database that we want to protect against inaccessibility. Fundamentally, changes to the data of the primary database are passed on to the standby database, and these changes are applied to the standby database in order to keep it synchronized. The following figure shows the general structure of Data Guard. Let's look at the standby database and its properties more closely.

Standby database

It is possible to configure a standby database simply by copying, cloning, or restoring a primary database to a different server. Then the Data Guard configurations are made on the databases in order to start the transfer of redo information from primary to standby, and also to start the apply process on the standby database. Primary and standby databases may exist on the same server; however, this kind of configuration should only be used for testing. In a production environment, the primary and standby database servers are generally preferred to be in separate data centers.

Data Guard keeps the primary and standby databases synchronized by using redo information. As you may know, transactions on an Oracle database produce redo records. This redo information keeps all of the changes made to the database. The Oracle database first creates redo information in memory (in the redo log buffers). It is then written into the online redo logfiles and, when an online redo logfile is full, its content is written into an archived redo log. An Oracle database can run in the ARCHIVELOG mode or the NOARCHIVELOG mode. In the ARCHIVELOG mode, online redo logfiles are written into archived redo logs; in the NOARCHIVELOG mode, redo logfiles are overwritten without being archived as they become full. In a Data Guard environment, the primary database must be in the ARCHIVELOG mode.
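As a quick, illustrative aside (not part of the original text), the current logging mode can be checked, and switched if necessary, with the standard commands below; switching requires a clean restart into the MOUNT state.

SQL> select log_mode from v$database;

SQL> shutdown immediate
SQL> startup mount
SQL> alter database archivelog;
SQL> alter database open;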
In Data Guard, the transfer of the changed data from the primary to the standby database is achieved with redo, with no alternative. However, the method of applying the redo content to the standby database may vary, and these different apply methods give rise to different types of standby databases. There were two kinds of standby databases before Oracle database Version 11g: the physical standby database and the logical standby database. With Version 11g we should mention a third type of standby database, which is the snapshot standby. Let's look at the properties of these standby database types.

Physical standby database

The physical standby database is a block-based copy of the primary database. In a physical standby environment, in addition to containing the same database objects and the same data, the primary and standby databases are identical on a block-for-block basis. Physical standby databases use the Redo Apply method to apply changes. Redo Apply uses the managed recovery process (MRP) to manage the application of the changes carried in the redo. In Version 11g, a physical standby database can be accessible in read-only mode while Redo Apply is working, which is called Active Data Guard. Using the Active Data Guard feature, we can offload report jobs from the primary to the physical standby database. The physical standby database is the only option that has no limitation on storage vendor or data types when keeping a synchronized copy of the primary database.

Logical standby database

The logical standby database is a feature introduced in Version 9i R2. In this configuration, redo data is first converted into SQL statements and then applied to the standby database. This process is called SQL Apply. This method makes it possible to access the standby database permanently and allows read/write while the replication of data is active. Thus, you're also able to create database objects on the standby database that don't exist on the primary database. So a logical standby database can be used for many other purposes along with high availability and disaster recovery. Due to the nature of SQL Apply, a logical standby database will contain the same data as the primary database, but in a different structure on the disks.

One discouraging aspect of the logical standby database is the unsupported data types, objects, and DDLs. The following data types are not supported for replication in a logical standby environment:

BFILE
Collections (including VARRAYS and nested tables)
Multimedia data types (including Spatial, Image, and Oracle Text)
ROWID and UROWID
User-defined types

The logical standby database is not guaranteed to contain all primary data, because of the unsupported data types, objects, and DDLs. Also, SQL Apply consumes more hardware resources. Therefore, it certainly brings more performance issues and administrative complexity than Redo Apply.

Snapshot standby database

Principally, a snapshot standby database is a special state of a physical standby database. Snapshot standby is a feature that is available with Oracle Database Version 11g. When you convert a physical standby database into a snapshot standby database, it becomes accessible for read/write. You can run tests on this database and change the data. When you're finished with the snapshot standby database, it's possible to reverse all the changes made to the database and turn it back into a physical standby again. An important point here is that a snapshot standby database can't run Redo Apply.
Redo transfer continues, but the standby is not able to apply redo.

Oracle Data Guard evolution

Oracle Data Guard technology has been part of the database administrator's life for a long time, and it has clearly evolved from its beginnings up to 11g R2. Let's look at this evolution closely through the different database versions.

Version 7.3 – stone age

The functionality of keeping a duplicate database on a separate server, which can be synchronized with the primary database, came with Oracle database Version 7.3 under the name of the standby database. This standby database was constantly in recovery mode, waiting for the archived redo logs to be applied. However, this feature was not able to automate the transfer of archived redo logs. Database administrators had to find a way to transfer archived redo logs and apply them to the standby server continuously. This was generally accomplished by a script running in the background.

The only aim of the Version 7.3 standby database was disaster recovery. It was not possible to query the standby database or to open it for any purpose other than activating it in the event of failure of the primary database. Once the standby database was activated, it couldn't be returned to the standby recovery mode again.

Version 8i – first age

Oracle database Version 8i brought the much-awaited features to the standby database and made the archived log shipping and apply processes automatic, which are now called the managed standby environment and managed recovery, respectively. However, some users chose to apply the archived logs manually, because it was not possible to set a delay in the managed recovery mode; this mode brought the risk of accidental operations being reflected on the standby database quickly. Along with the "managed" modes, 8i made it possible to open a standby database with the read-only option, which allowed it to be used as a reporting database.

Even though there were new features that made the tool more manageable and practical, there were still serious deficiencies. For example, when we added a datafile or created a tablespace on the primary database, these changes were not replicated to the standby database; database administrators had to take care of this maintenance on the standby database. Also, when we opened the primary database with resetlogs or restored a backup control file, we had to re-create the standby database.

Version 9i – middle age

First of all, with this version the Oracle8i standby database was renamed to Oracle9i Data Guard. 9i Data Guard includes very important new features, which make the product much more reliable and functional. The following features were included:

The Oracle Data Guard Broker management framework, which is used to centralize and automate the configuration, monitoring, and management of Oracle Data Guard installations, was introduced with this version.
Zero data loss on failover was guaranteed as a configuration option.
Switchover was introduced, which made it possible to change the roles of the primary and standby. This made it possible to accomplish planned maintenance on the primary database with very little service outage.
Standby database administration became simpler, because new datafiles on the primary database are created automatically on the standby, and if there are missing archived logs on the standby (which is called a gap), Data Guard detects and transmits the missing logs to the standby automatically.
A delay option was added, which made it possible to configure a standby database that always lags behind the primary by a specified time delay.
Parallel recovery increased recovery performance on the standby database.

In Version 9i Release 2, which was introduced in May 2002, one year after Release 1, there were again very important features announced. They are as follows:

The logical standby database was introduced, which we've mentioned earlier in this article.
Three data protection modes were ready to use: Maximum Protection, Maximum Availability, and Maximum Performance, which offered more flexibility in configuration.
The cascade standby database feature made it possible to configure a second standby database, which receives its redo data from the first standby database.

Version 10g – new age

The 10g version again introduced important features to Data Guard, but we can say that it perhaps fell behind expectations after the revolutionary changes in release 9i. The following new features were introduced in Version 10g:

One of the most important features of 10g was Real-Time Apply. When running in Real-Time Apply mode, the standby database applies changes from the redo immediately after receiving it; the standby does not wait for the standby redo logfile to be archived. This provides faster switchover and failover.
Flashback database support was introduced, which made it unnecessary to configure a delay in the Data Guard configuration. Using flashback technology, it was possible to flash back a standby database to a point in time.
With 10g Data Guard, if we open a primary database with resetlogs, it is not required to re-create the standby database. The standby is able to recover through resetlogs.
Version 10g made it possible to use logical standby databases in rolling upgrades of the primary database software. This method made it possible to lessen the service outage time by performing a switchover to the logical standby database.

10g Release 2 also introduced new features to Data Guard, but these features again were not compelling enough to make people jump to the Data Guard technology. The two most important features were Fast-Start Failover and the use of the Guaranteed restore point:

Fast-Start Failover automated and accelerated the failover operation when the primary database was lost. This option strengthened the disaster recovery role of Oracle Data Guard.
The Guaranteed restore point was not actually a Data Guard feature. It was a database feature, which made it possible to revert a database to the moment that the Guaranteed restore point was created, as long as there is sufficient disk space for the flashback logs. Using this feature, the following scenario became possible: activate a physical standby database after stopping Redo Apply, use it for testing with read/write operations, then revert the changes, make it a standby again, and synchronize it with the primary. Using a standby database read/write offered great flexibility to users, but archived log shipping was not able to continue while the standby was read/write, and this meant potential data loss in the event of a primary database failure.
Snapshot standby is a feature that lets you use a physical standby database read/write for test purposes. As we mentioned, this was possible with the 10g R2 Guaranteed restore point feature, but 11g provides continuous archived log shipping during the period that the standby is open read/write as a snapshot standby.
It became possible to compress redo traffic in a Data Guard configuration, which is useful with excessive redo generation rates and when resolving gaps. Compression of redo when resolving gaps was introduced in 11g R1, and compression of all redo data was introduced in 11g R2.
Use of physical standby databases for rolling upgrades of the database software was enabled, also known as Transient Logical Standby.
It became possible to include different operating systems, such as Windows and Linux, in a single Data Guard configuration.
Lost-write, which is a serious type of data corruption arising from the storage subsystem wrongly reporting that a block write has completed, can be detected in an 11g Data Guard configuration. Recovery is automatically stopped in such a case.
The RMAN fast incremental backup feature, Block Change Tracking, can be run on an Active Data Guard enabled standby database.
Another very important enhancement in 11g was the Automatic Block Corruption Repair feature, which was introduced with 11g R2. With this feature, a corrupted data block in the primary database can be automatically replaced with an uncorrupted copy from a physical standby database in Active Data Guard mode, and vice versa.

We've gone through the evolution of Oracle Data Guard from its beginning until today. As you may notice, Data Guard started its life as a very simple database feature intended to keep a synchronized database copy, with a lot of manual work, and it is now a sophisticated tool with advanced automation, precaution, and monitoring features. Now let's move on to the architecture and components of Oracle Data Guard 11g R2.
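To make the Active Data Guard capability mentioned above a little more concrete, here is a hedged sketch of the commands typically run on a physical standby to open it read-only while redo apply continues; it is illustrative only, and it assumes the standby is already mounted, receiving redo, and equipped with standby redo logfiles for real-time apply.

SQL> alter database recover managed standby database cancel;
SQL> alter database open read only;
SQL> alter database recover managed standby database using current logfile disconnect from session;

After the last command, reports can be run against the open standby while the managed recovery process keeps applying incoming redo in the background.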

Creating your first collection (Simple)

Packt
26 Jun 2013
7 min read
(For more resources related to this topic, see here.)

Getting ready

Assuming that you have walked through the tutorial, you should be nearly ready with the setup. Still, it does not hurt to go through the checklist:

Make sure that you know how to start your operating system's shell (cmd.exe on Windows, Terminal/iTerm on Mac, and sh/bash/tcsh/zsh on Unix).
Ensure that running the java -version command at the shell's prompt returns at least Version 1.6. You may need to upgrade if you have an older version.
Ensure that you know where you unpacked the Solr distribution and the full path to the example directory within that. You needed that directory for the tutorial, but that's also where we are going to start our own Solr instance. That allows us to easily run an embedded Jetty web server and to also find all the additional JAR files that Solr needs to operate properly.
Now, create a directory where we will store our indexes and experiments. It can be anywhere on your drive. As Solr can run on any operating system where Java can run, we will use SOLR-INDEXING as a name whenever we refer to that directory. Make sure to use absolute path names when substituting with your real path for the directory.

How to do it...

As our first example, we will create an index that stores and allows for the searching of simplified e-mail information. For now, we will just look at the addr_from and addr_to e-mail addresses and the subject line. You will see that it takes only two simple configuration files to get the basic Solr index working.

Under the SOLR-INDEXING directory, create a collection1 directory and inside that create a conf directory.
In the conf directory, create two files: schema.xml and solrconfig.xml.
The schema.xml file should have the following content:

<?xml version="1.0" encoding="UTF-8" ?>
<schema version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_from" type="string" indexed="true" stored="true" required="true"/>
    <field name="addr_to" type="string" indexed="true" stored="true" required="true"/>
    <field name="subject" type="string" indexed="true" stored="true" required="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <types>
    <fieldType name="string" class="solr.StrField" />
  </types>
</schema>

The solrconfig.xml file should have the following content:

<?xml version="1.0" encoding="UTF-8" ?>
<config>
  <luceneMatchVersion>LUCENE_43</luceneMatchVersion>
  <requestDispatcher handleSelect="false">
    <httpCaching never304="true" />
  </requestDispatcher>
  <requestHandler name="/select" class="solr.SearchHandler" />
  <requestHandler name="/update" class="solr.UpdateRequestHandler" />
  <requestHandler name="/admin" class="solr.admin.AdminHandlers" />
  <requestHandler name="/analysis/field" class="solr.FieldAnalysisRequestHandler" startup="lazy" />
</config>

That is it. Now, let's start our just-created Solr instance. Open a new shell (we'll need the current one later). At that shell's command prompt, change the directory to the example directory of the Solr distribution and run the following command:

java -Dsolr.solr.home=SOLR-INDEXING -jar start.jar

Notice that solr.solr.home is not a typo; you do need the solr part twice. And, as always, if you have spaces in your paths (now or later), you may need to escape them in platform-specific ways, such as with backslashes on Unix/Linux or by quoting the whole value.

In the window of your shell, you should see a long list of messages that you can safely ignore (at least for now).
You can verify that everything is working fine by checking for the following three elements:

The long list of messages should finish with a message like Started SocketConnector@0.0.0.0:8983. This means that Solr is now running on port 8983 successfully.
You should now have a directory called data, right next to the directory called conf that we created earlier.
If you open a web browser and go to http://localhost:8983/solr/, you should see a web-based admin interface that makes testing and troubleshooting your Solr instance much easier. We will be using this interface later, so do spend a couple of minutes clicking around now.

Now, let's load some actual content into our collection:

Copy post.jar from the Solr distribution's example/exampledocs directory to our root SOLR-INDEXING directory.
Create a file called input1.csv in the collection1 directory, next to the conf and data directories, with the following three-line content:

id,addr_from,addr_to,subject
email1,fulan@acme.example.com,kari@acme.example.com,"Kari,we need more Junior Java engineers"
email2,kari@acme.example.com,maija@acme.example.com,"Updating vacancy description"

Run the import command from the command line in the SOLR-INDEXING directory (one long command; do not split it across lines):

java -Dauto -Durl=http://localhost:8983/solr/collection1/update -jar post.jar collection1/input1.csv

You should see the following in one of the message lines: "1 files indexed".
If you now open a web browser and go to http://localhost:8983/solr/collection1/select?q=*%3A*&wt=ruby&indent=true, you should see Solr output with all three documents displayed on the screen in a somewhat readable format.
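Before moving on to how it works, it can be instructive to try a more selective query than the match-all request above. This example is not part of the original recipe; it filters on the id field and, because every field in our schema uses the exact-match solr.StrField type, only complete values will match.

http://localhost:8983/solr/collection1/select?q=id%3Aemail1&wt=json&indent=true

This should return just the first document; switching wt back to ruby gives the same result in the Ruby format used earlier.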
Resources for Article : Further resources on this subject: Integrating Solr: Ruby on Rails Integration [Article] Indexing Data in Solr 1.4 Enterprise Search Server: Part2 [Article] Text Search, your Database or Solr [Article]

Creating your first heat map in R

Packt
26 Jun 2013
10 min read
(For more resources related to this topic, see here.)

The following image shows one of the heat maps that we are going to create in this recipe from the total count of air passengers. [Image not included in this excerpt.]

Getting ready

Download the script 5644OS_01_01.r from your account at http://www.packtpub.com and save it to your hard disk. The first section of the script, below the comment line starting with ### loading packages, will automatically check for the availability of the R packages gplots and lattice, which are required for this recipe. If those packages are not already installed, you will be prompted to select an official server from the Comprehensive R Archive Network (CRAN) to allow the automatic download and installation of the required packages. If you have already installed those two packages prior to executing the script, I recommend updating them to the most recent version by calling the following function in the R command line: [code not included in this excerpt]

Use the source() function in the R command line to execute an external script from any location on your hard drive. If you start a new R session from the same directory as the location of the script, simply provide the name of the script as an argument in the function call as follows: [code not included in this excerpt]

You have to provide the absolute or relative path to the script on your hard drive if you started your R session from a different directory to the location of the script. Refer to the following example: [code not included in this excerpt]

You can view the current working directory of your R session by executing the following command in the R command line: [code not included in this excerpt]

How to do it...

Run the 5644OS_01_01.r script in R to execute the recipe's code, and take a look at the output printed on the screen as well as the PDF file, first_heatmaps.pdf, that will be created by this script: [code not included in this excerpt]

How it works...

There are different functions for drawing heat maps in R, and each has its own advantages and disadvantages. In this recipe, we will take a look at the levelplot() function from the lattice package to draw our first heat map. Furthermore, we will use the more advanced heatmap.2() function from gplots to apply a clustering algorithm to our data and add the resulting dendrograms to our heat maps. The following image shows an overview of the different plotting functions that we are using throughout this book. [Image not included in this excerpt.]

Now let us take a look at how we read in and process data from different data files and formats step by step:

Loading packages: The first eight lines preceding the ### loading data section will make sure that R loads the lattice and gplots packages, which we need for the two heat map functions in this recipe: levelplot() and heatmap.2(). Each time we start a new session in R, we have to load the required packages in order to use the levelplot() and heatmap.2() functions. To do so, enter the following function calls directly into the R command line or include them at the beginning of your script:

library(lattice)
library(gplots)

Loading the data set: R includes a package called datasets, which contains a variety of different data sets for testing and exploration purposes. More information on the different data sets contained in the datasets package can be found at http://stat.ethz.ch/R-manual/R-patched/library/datasets/. For this recipe, we are loading the AirPassengers data set, which is a collection of the total count of air passengers (in thousands) for international airlines from 1949-1960 in a time-series format.
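Since the code listings did not survive in this excerpt, the following is a reconstruction sketch, not the book's verbatim code, of the steps described here and in the next step: loading the packages, loading the data, and turning the time series into a 12-column matrix, followed by the levelplot() call discussed below. The variable names (rowColNames, air_data) follow the text; note that the data set is registered in R under the name AirPassengers.

library(lattice)
library(gplots)

data(AirPassengers)                         # monthly passenger totals, 1949-1960, in thousands
rowColNames <- list(as.character(1949:1960), month.abb)
air_data <- matrix(AirPassengers, ncol = 12, byrow = TRUE,
                   dimnames = rowColNames)  # one row per year, one column per month

# levelplot() returns an object, so wrap it in print() when used inside a script
print(levelplot(air_data, col.regions = heat.colors(100),
                xlab = "Year", ylab = "Month",
                main = "Air passengers (in thousands)"))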
[code not included in this excerpt]

Converting the data set into a numeric matrix: Before we can use the heat map functions, we need to convert the AirPassengers time-series data into a numeric matrix. Numeric matrices in R can have characters as row and column labels, but the content itself must consist of one single mode: numeric. We use the matrix() function to create a numeric matrix consisting of 12 columns, to which we pass the AirPassengers time-series data row by row. Using the argument dimnames = rowColNames, we provide row and column names that we assigned previously to the variable rowColNames, which is a list of two vectors: a series of 12 strings representing the years 1949 to 1960, and a series of strings for the 12 three-letter abbreviations of the months from January to December, respectively. [code not included in this excerpt]

A simple heat map using levelplot(): Now that we have converted the AirPassengers data into a numeric matrix format and assigned it to the variable air_data, we can go ahead and construct our first heat map using the levelplot() function from the lattice package: [code not included in this excerpt]

The levelplot() function creates a simple heat map with a color key to the right-hand side of the map. We can use the argument col.regions = heat.colors to change the default color transition to yellow and red. The x and y axis labels are specified by the xlab and ylab parameters, respectively, and the main parameter gives our heat map its caption. In contrast to most of the other plotting functions in R, the lattice package returns objects, so we have to use the print() function in our script if we want to save the plot to a data file. In an interactive R session, the print() call can be omitted: typing the name of the variable will automatically display the corresponding object on the screen.

Creating enhanced heat maps with heatmap.2(): Next, we will use the heatmap.2() function to apply a clustering algorithm to the AirPassengers data and to add row and column dendrograms to our heat map (a hedged example call is sketched at the end of this article): [code not included in this excerpt]

Hierarchical clustering is especially popular in gene expression analyses. It is a very powerful method for grouping data to reveal interesting trends and patterns in the data matrix. Another neat feature of heatmap.2() is that you can display a histogram of the counts of the individual values inside the color key by including the argument density.info = NULL in the function call. Alternatively, you can set density.info = "density" to display a density plot inside the color key. By adding the argument keysize = 1.8, we slightly increase the size of the color key (the default value of keysize is 1.5): [code not included in this excerpt]

Did you notice the missing row dendrogram in the resulting heat map? This is due to the argument dendrogram = "column" that we passed to the heat map function. Similarly, you can use row instead of column to suppress the column dendrogram, or use neither to draw no dendrogram at all.

There's more...

By default, levelplot() places the color key on the right-hand side of the heat map, but it can easily be moved to the top, bottom, or left-hand side of the map by modifying the space parameter of colorkey: [code not included in this excerpt]

Replacing top by left or bottom will place the color key on the left-hand side or at the bottom of the heat map, respectively. Moving the color key around for heatmap.2() can be a little more of a hassle. In this case we have to modify the parameters of the layout() function.
There's more...

By default, levelplot() places the color key on the right-hand side of the heat map, but it can easily be moved to the top, bottom, or left-hand side of the map by modifying the space parameter of colorkey, for example, colorkey = list(space = "top"). Replacing top by left or bottom will place the color key on the left-hand side or at the bottom of the heat map, respectively.

Moving around the color key for heatmap.2() can be a little bit more of a hassle. In this case we have to modify the parameters of the layout() function. By default, heatmap.2() passes a matrix, lmat, to layout(), which is equivalent to rbind(c(4, 3), c(2, 1)). The numbers in this matrix specify the locations of the different visual elements on the plot (1 implies heat map, 2 implies row dendrogram, 3 implies column dendrogram, and 4 implies key). If we want to change the position of the key, we have to modify and rearrange those values of lmat that heatmap.2() passes to layout(). For example, if we want to place the color key at the bottom left-hand corner of the heat map, we need to create a new matrix for lmat, which we can construct using the rbind() function. Furthermore, we have to pass an argument for the row height parameter lhei to heatmap.2(), which will allow us to use our modified lmat matrix for rearranging the color key; an example of this rearrangement is sketched at the end of this section.

If you don't need a color key for your heat map, you can turn it off by using the argument key = FALSE for heatmap.2() and colorkey = FALSE for levelplot(), respectively. R also has a base function for creating heat maps, heatmap(), that does not require you to install external packages and is most advantageous if you can go without a color key. Its syntax is very similar to that of heatmap.2(), and most of the basic options that we have seen in this recipe also apply to heatmap().

More information on dendrograms and clustering

By default, the dendrograms of heatmap.2() are created by a hierarchical agglomerative clustering method, also known as bottom-up clustering. In this approach, all objects start as individual clusters and are successively merged until only one single cluster remains. The distance between a pair of clusters is calculated by the farthest neighbor method, also called the complete linkage method, which is based by default on the Euclidean distance of the two points from both clusters that are farthest apart from each other. The computed dendrograms are then reordered based on the row and column means.

By modifying the default parameters of the dist() function, we can use a distance measure other than the Euclidean distance. For example, if we want to use the Manhattan distance measure (based on a grid-like path rather than a direct connection between two objects), we would set the method parameter of the dist() function and assign the result to a variable, distance, first. Other options for the method parameter are: euclidean (default), maximum, canberra, binary, or minkowski. To use agglomeration methods other than the complete linkage method, we modify the method parameter in the hclust() function and assign the result to another variable, cluster. Note that the first argument we pass to the hclust() function is the distance object from our previous assignment. By setting the method parameter to ward, R will use Joe H. Ward's minimum variance method for hierarchical clustering. Other options for the method parameter that we can pass as arguments to hclust() are: complete (default), single, average, mcquitty, median, or centroid. To use our modified clustering parameters, we simply call the as.dendrogram() function within heatmap.2() using the cluster variable that we assigned previously. We can also draw the cluster dendrogram without the heat map by using the plot() function. To turn off row and column reordering, we need to turn off the dendrograms and set the parameters Colv and Rowv to NA. Example calls for these customizations are sketched below.
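The corresponding listings are not reproduced here, so the following sketch only illustrates the calls described above. The exact lmat arrangement, the lhei and lwid values, and the decision to apply the custom dendrogram to the rows are assumptions; the function and argument names (layout(), lhei, heatmap(), dist(), hclust(), as.dendrogram(), Rowv, Colv) are the ones named in the text.

# move the color key to the bottom left (assumed layout values)
lmat <- rbind(c(0, 3),   # empty cell | column dendrogram
              c(2, 1),   # row dendrogram | heat map
              c(4, 0))   # color key | empty cell
heatmap.2(air_data, lmat = lmat, lhei = c(1.5, 4, 1.5), lwid = c(1.5, 4),
          trace = "none")

# base R alternative, most useful when no color key is needed
heatmap(air_data)

# custom distance measure and agglomeration method for the dendrograms
distance <- dist(air_data, method = "manhattan")
cluster  <- hclust(distance, method = "ward.D")   # the text's "ward"; renamed in R >= 3.1.0

heatmap.2(air_data,
          Rowv = as.dendrogram(cluster),   # assumption: apply the custom clustering to the rows
          trace = "none")

# the cluster dendrogram on its own, without the heat map
plot(cluster)

# no reordering and no dendrograms at all
heatmap.2(air_data, dendrogram = "none", Rowv = NA, Colv = NA, trace = "none")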
Summary

This article has helped us create our first heat maps from a small data set provided in R. We have used different heat map functions in R to get a first impression of their functionalities.

Resources for Article : Further resources on this subject: Getting started with Leaflet [Article] Moodle 1.9: Working with Mind Maps [Article] Joomla! with Flash: Showing maps using YOS amMap [Article]


Linking Section Access to multiple dimensions

Packt
25 Jun 2013
3 min read
(For more resources related to this topic, see here.)

Getting ready

Load the following script:

Product:
LOAD * INLINE [
ProductID, ProductGroup, ProductName
1, GroupA, Great As
2, GroupC, Super Cs
3, GroupC, Mega Cs
4, GroupB, Good Bs
5, GroupB, Busy Bs
];
Customer:
LOAD * INLINE [
CustomerID, CustomerName, Country
1, Gatsby Gang, USA
2, Charly Choc, USA
3, Donnie Drake, USA
4, London Lamps, UK
5, Shylock Homes, UK
];
Sales:
LOAD * INLINE [
CustomerID, ProductID, Sales
1, 2, 3536
1, 3, 4333
1, 5, 2123
2, 2, 4556
2, 4, 1223
2, 5, 6789
3, 2, 1323
3, 3, 3245
3, 4, 6789
4, 2, 2311
4, 3, 1333
5, 1, 7654
5, 2, 3455
5, 3, 6547
5, 4, 2854
5, 5, 9877
];
CountryLink:
Load Distinct Country, Upper(Country) As COUNTRY_LINK
Resident Customer;
Load Distinct Country, 'ALL' As COUNTRY_LINK
Resident Customer;
ProductLink:
Load Distinct ProductGroup, Upper(ProductGroup) As PRODUCT_LINK
Resident Product;
Load Distinct ProductGroup, 'ALL' As PRODUCT_LINK
Resident Product;
//Section Access;
Access:
LOAD * INLINE [
ACCESS, USERID, PRODUCT_LINK, COUNTRY_LINK
ADMIN, ADMIN, *, *
USER, GM, ALL, ALL
USER, CM1, ALL, USA
USER, CM2, ALL, UK
USER, PM1, GROUPA, ALL
USER, PM2, GROUPB, ALL
USER, PM3, GROUPC, ALL
USER, SM1, GROUPB, UK
USER, SM2, GROUPA, USA
];
Section Application;

Note that there is a loop error generated on reload because there is a loop in the data structure.

How to do it…

Follow these steps to link Section Access to multiple dimensions:

Add list boxes to the layout for ProductGroup and Country.
Add a statistics box for Sales.
Remove // to uncomment the Section Access statement.
From the Settings menu, open Document Properties and select the Opening tab. Turn on the Initial Data Reduction Based on Section Access option.
Reload and save the document. Close QlikView.
Re-open QlikView and open the document. Log in as the Country Manager, CM1, user. Note that USA is the only country. Also, the product group, GroupA, is missing; there are no sales of this product group in USA.
Close QlikView and then re-open again. This time, log in as the Sales Manager, SM2. You will not be allowed access to the document.
Log into the document as the ADMIN user. Edit the script. Add a second entry for the SM2 user in the Access table as follows:
USER, SM2, GROUPA, USA
USER, SM2, GROUPB, UK
Reload, save, and close the document and QlikView. Re-open and log in as SM2. Note the selections.

How it works…

Section Access is really quite simple. The user is connected to the data and the data is reduced accordingly. QlikView allows Section Access tables to be connected to multiple dimensions in the main data structure without causing issues with loops. Each associated field acts in the same way as a selection in the layout. The initial setting for the SM2 user contained values that were mutually exclusive. Because of the default Strict Exclusion setting, the SM2 user cannot log in. We changed the script and included multiple rows for the SM2 user. Intuitively, we might expect that, as the first row did not connect to the data, only the second row would connect to the data. However, each field value is treated as an individual selection and all of the values are included.

There's more…

If we wanted to include solely the composite association of Country and ProductGroup, we would need to derive a composite key in the data set and connect the user to that; a sketch of this approach follows below. In this example, we used the USERID field to test using QlikView logins. However, we would normally use NTNAME to link the user to either a Windows login or a custom login.
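The recipe does not include a listing for this composite-key approach, so the following load-script fragment is only a sketch of one way it could be done. The mapping-table names, the COUNTRY_PRODUCT_LINK field name, and the join back onto the Sales table are illustrative assumptions rather than part of the original recipe.

// Build lookup maps from the dimension tables
CountryMap:
Mapping Load CustomerID, Upper(Country)
Resident Customer;

ProductGroupMap:
Mapping Load ProductID, Upper(ProductGroup)
Resident Product;

// Add a single composite reduction field to every sales row
Left Join (Sales)
Load CustomerID, ProductID,
     ApplyMap('CountryMap', CustomerID) & '|' &
     ApplyMap('ProductGroupMap', ProductID) As COUNTRY_PRODUCT_LINK
Resident Sales;

// The Section Access table would then contain COUNTRY_PRODUCT_LINK values
// such as USA|GROUPA, plus additional 'ALL'-style rows built in the same way
// as the CountryLink and ProductLink tables above.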
Resources for Article : Further resources on this subject: Pentaho Reporting: Building Interactive Reports in Swing [Article] Visual ETL Development With IBM DataStage [Article] A Python Multimedia Application: Thumbnail Maker [Article]

IBM Cognos Workspace Advanced

Packt
14 Jun 2013
5 min read
(For more resources related to this topic, see here.)

Who should use Cognos Workspace Advanced?

With Cognos Workspace Advanced, business users have one tool for creating advanced analyses and reports. The tool, like Query Studio and Analysis Studio, is designed for ease of use and is built on the same platform as the other report development tools in Cognos. Business Insight Advanced/Cognos Workspace Advanced is actually so powerful that it is being positioned more as a light Cognos Report Studio than as a powerful Cognos Query Studio and Cognos Analysis Studio.

Comparing to Cognos Query Studio and Cognos Analysis Studio

With so many options for business users, how do we know which tool to use? The best approach for making this decision is to consider the similarities and differences between the options available. In order to help us do so, we can use the following table:

Feature | Query Studio | Analysis Studio | Cognos Workspace Advanced
Ad hoc reporting | X |  | X
Ad hoc analysis |  | X | X
Basic charting | X | X | X
Advanced charting |  |  | X
Basic filtering | X | X | X
Advanced filtering |  |  | X
Basic calculations | X | X | X
Advanced calculations |  |  | X
Properties pane |  |  | X
External data |  |  | X
Freeform design |  |  | X

As you can see from the table, all three products have basic charting, basic filtering, and basic calculation features. Also, we can see that Cognos Query Studio and Cognos Workspace Advanced both have ad hoc reporting capabilities, while Cognos Analysis Studio and Cognos Workspace Advanced both have ad hoc analysis capabilities. In addition to those shared capabilities, Cognos Workspace Advanced also has advanced charting, filtering, and calculation features. Cognos Workspace Advanced also has a limited properties pane (similar to what you would see in Cognos Report Studio). Furthermore, Cognos Workspace Advanced allows end users to bring in external data from a flat file and merge it with the data from Cognos Connection. Finally, Cognos Workspace Advanced has free-form design capabilities; in other words, you are not limited to the standard templates that Cognos Query Studio and Cognos Analysis Studio impose on where you can add charts or crosstabs.

The simple conclusion after performing this comparison is that you should always use Cognos Workspace Advanced. While that will be true for some users, it is not true for all. With the additional capabilities come additional complexities. For your most basic business users, you may want to keep them using Cognos Query Studio or Cognos Analysis Studio for their ad hoc reporting and ad hoc analysis simply because they are easier tools to understand and use. However, for those business users with basic technical acumen, Cognos Workspace Advanced is clearly the superior option.

Accessing Cognos Workspace Advanced

I would assume now that, after reviewing the capabilities Cognos Workspace Advanced brings to the table, you are anxious to start using it. We will start off by looking at how to access the product. The first way to access Cognos Workspace Advanced is through the welcome page. On the welcome page, you can get to Cognos Workspace Advanced by clicking on the option Author business reports. This will bring you to a screen where you can select your package. In Cognos Query Studio or Cognos Analysis Studio, you will only be able to select non-dimensional and dimensional packages based on the tool you are using.
With Cognos Workspace Advanced, because the tool can use both dimensional and non-dimensional packages, you will be prompted with packages for both. The next way to access Cognos Workspace Advanced is through the Launch menu in Cognos Connection. Within the menu, you can simply choose Cognos Workspace Advanced to be taken to the same options for choosing a package. Note, however, that if you have already navigated into a package, it will automatically launch Cognos Workspace Advanced using the very same package.

The third way to access Cognos Workspace Advanced is by far the most functional way. You can actually access Cognos Workspace Advanced from within Cognos Workspace by clicking on the Do More... option on a component of the dashboard. When you select this option, the object will expand out and open for editing inside Cognos Workspace Advanced. Then, once you are done editing, you can simply choose the Done button in the upper right-hand corner to return to Cognos Workspace with your newly updated object.

For the sake of showing as many features as possible in this chapter, we will launch Cognos Workspace Advanced from the welcome page or from the Launch menu and select a package that has an OLAP data source. For the purpose of following along, we will be using the Cognos BI sample package great_outdoors_8 (or Great Outdoors). When we first access it, we are prompted to choose a package. For these examples, we will choose great_outdoors_8.

We are then brought to a splash screen where we can choose Create new or Open existing. We will choose Create new. We are then prompted to pick the type of chart we want to create. Our options are as follows:

Blank: It starts us off with a completely blank slate
List: It starts us off with a list report
Crosstab: It starts us off with a crosstab
Chart: It starts us off with a chart and loads the chart wizard
Financial: It starts us off with a crosstab formatted like a financial report
Existing...: It allows us to open an existing report

We will choose Blank because we can still add as many of the other objects as we want to later on.

A quick start – OpenCV fundamentals

Packt
12 Jun 2013
8 min read
(For more resources related to this topic, see here.) The OpenCV library has a modular structure, and a brief description of all the modules is as follows:

Module | Feature
Core | A compact module defining basic data structures, including the dense multidimensional array Mat and basic functions used by all other modules.
Imgproc | An image processing module that includes linear and non-linear image filtering, geometrical image transformations (resize, affine and perspective warping, generic table-based remapping), color space conversion, histograms, and so on.
Video | A video analysis module that includes motion estimation, background subtraction, and object tracking algorithms.
Calib3d | Basic multiple-view geometry algorithms, single and stereo camera calibration, object pose estimation, stereo correspondence algorithms, and elements of 3D reconstruction.
Features2d | Salient feature detectors, descriptors, and descriptor matchers.
Objdetect | Detection of objects and instances of the predefined classes; for example, faces, eyes, mugs, people, cars, and so on.
Highgui | An easy-to-use interface to video capturing, image and video codecs, as well as simple UI capabilities.
Gpu | GPU-accelerated algorithms from different OpenCV modules.

Task 1 – image basics

When trying to recreate the physical world around us in digital format via a camera, for example, the computer sees the image only as code made up of the numbers 1 and 0. A digital image is nothing but a collection of pixels (picture elements), which are then stored in matrices in OpenCV for further manipulation. In the matrices, each element contains information about a particular pixel in the image. The pixel value decides how bright or what color that pixel should be. Based on this, we can classify images as:

Greyscale
Color/RGB

Greyscale

Here the pixel value can range from 0 to 255 and hence we can see the various shades of gray, where 0 represents black and 255 represents white. A special case of greyscale is the binary image, or black and white image, in which every pixel is either black or white.

Color/RGB

Red, Blue, and Green are the primary colors, and upon mixing them in various different proportions we can get new colors. A pixel in a color image has three separate channels, one each for Red, Blue, and Green. The value ranges from 0 to 255 for each channel.

Task 2 – reading and displaying an image

We are now going to write a very simple and basic program using the OpenCV library to read and display an image. This will help you understand the basics.

Code

A simple program to read and display an image is as follows:

// opencv header files
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/core/core.hpp"

// namespaces declaration
using namespace cv;
using namespace std;

// create a variable to store the image
Mat image;

int main( int argc, char** argv )
{
    // open the image and store it in the 'image' variable
    // Replace the path with where you have downloaded the image
    image = imread("<path to image>/lena.jpg");

    // create a window to display the image
    namedWindow( "Display window", CV_WINDOW_AUTOSIZE );

    // display the image in the window created
    imshow( "Display window", image );

    // wait for a keystroke
    waitKey(0);
    return 0;
}

Code explanation

Now let us understand how the code works.
Short comments have also been included in the code itself to increase the readability.

#include "opencv2/highgui/highgui.hpp"
#include "opencv2/core/core.hpp"

The preceding two header files will be a part of almost every program we write using the OpenCV library. As explained earlier, the highgui header is used for window creation, management, and so on, while the core header is used to access the Mat data structure in OpenCV.

using namespace cv;
using namespace std;

The preceding two lines declare the required namespaces for this code so that we don't have to use the :: (scope resolution) operator every time for accessing the functions.

Mat image;

With the preceding command, we have created a variable image of the datatype Mat, which is frequently used in OpenCV to store images.

image = imread("<path to image>/lena.jpg");

In the preceding command, we open the image lena.jpg and store it in the image variable. Replace <path to image> in the command with the location of that picture on your PC.

namedWindow( "Display window", CV_WINDOW_AUTOSIZE );

We now need a window to display our image, so we use the preceding function to create one. This function takes two parameters, of which the first one is the name of the window. In our case, we name our window Display window. The second parameter is optional; CV_WINDOW_AUTOSIZE sizes the window to fit the image so that it is shown at its original size without being cropped.

imshow( "Display window", image );

Finally, we are ready to display our image in the window we just created by using the preceding function. This function takes two parameters, of which the first one is the name of the window in which the image has to be displayed; in our case, that is Display window. The second parameter is the variable containing the image that we want to display; in our case, it's the image variable.

waitKey(0);

Last but not least, it is advised that you use the preceding function in most of the code that you write using the OpenCV library. If we don't write this line, the image will be displayed for a fraction of a second and the program will be immediately terminated; it happens so fast that you will not be able to see the image. What this function essentially does is wait for a keystroke from the user and hence delay the termination of the program. The argument is the delay in milliseconds, and a value of 0 means that it waits indefinitely for a keystroke.

Output

Running the program displays lena.jpg in the Display window.

Task 3 – resizing and saving an image

We are now going to write a very simple and basic program using the OpenCV library to resize and save an image.
Code

The following code helps you to resize a given image, save the result, and read it back for display:

// opencv header files
#include "opencv2/highgui/highgui.hpp"
#include "opencv2/imgproc/imgproc.hpp"
#include "opencv2/core/core.hpp"

// namespaces declaration
using namespace std;
using namespace cv;

int main(int argc, char** argv)
{
    // create variables to store the images
    Mat org, resized, saved;

    // open the image and store it in the 'org' variable
    // Replace the path with where you have downloaded the image
    org = imread("<path to image>/lena.png");

    // Create a window to display the image
    namedWindow("Original Image", CV_WINDOW_AUTOSIZE);

    // display the image
    imshow("Original Image", org);

    // resize the image
    resize(org, resized, Size(), 0.5, 0.5, INTER_LINEAR);
    namedWindow("Resized Image", CV_WINDOW_AUTOSIZE);
    imshow("Resized Image", resized);

    // save the image
    // Replace <path> with your desired location
    imwrite("<path>/saved.png", resized);

    // read the saved image back and display it
    namedWindow("Image saved", CV_WINDOW_AUTOSIZE);
    saved = imread("<path>/saved.png");
    imshow("Image saved", saved);

    // wait for a keystroke
    waitKey(0);
    return 0;
}

Code explanation

Only the new functions/concepts will be explained in this case.

#include "opencv2/imgproc/imgproc.hpp"

Imgproc is another useful header that gives us access to various transformations, color conversions, filters, histograms, and so on.

Mat org, resized, saved;

We have now created three variables, org, resized, and saved, to store the original image, the resized image, and the copy read back from disk, respectively.

resize(org, resized, Size(), 0.5, 0.5, INTER_LINEAR);

We have used the preceding function to resize the image. It takes six parameters, of which the first one is the variable containing the source image to be modified. The second one is the variable to store the resized image. The third parameter is the output image size; in this case we have not specified it but have instead passed an empty Size(), so it is calculated automatically from the values of the fourth and fifth parameters. The fourth and fifth parameters are the scale factors along the horizontal and vertical axes, respectively. The sixth parameter chooses the type of interpolation method; we have used bilinear interpolation, which is the default method.

imwrite("<path>/saved.png", resized);

Finally, using the preceding function, you can save an image to a particular location on your PC. The function takes two parameters, of which the first one is the location where you want to store the image and the second is the variable in which the image is stored. This function is very useful when you want to perform multiple operations on an image and save the result on your PC for future reference. Replace <path> in the preceding function with your desired location.

Output

Running the program opens three windows showing the original image, the resized image, and the saved copy read back from disk.

Summary

This section showed you how to perform a few of the basic tasks in OpenCV as well as how to write your first OpenCV program.

Resources for Article : Further resources on this subject: OpenCV: Segmenting Images [Article] Tracking Faces with Haar Cascades [Article] OpenCV: Image Processing using Morphological Filters [Article]