
How-To Tutorials - Data

1229 Articles
Packt
07 May 2010
9 min read

Configuration, Release and Change Management with Oracle

One of the largest changes to Oracle is the recent acquisition of several other software lines and technologies. Oracle has combined all of these technologies and customers under a single support site called My Oracle Support (MOS) at http://support.oracle.com, effective from Fall 2009. Along the way, Oracle also completely redesigned the interface, making it Flash-based in order to provide a personalized GUI. To take full advantage of the personalization features, you will need to install a free utility on each node and each ORACLE_HOME you would like to monitor. The following paragraphs outline several reasons for use and suggestions for getting started.

Configuration management

Are you the only Oracle DBA in your company? How do you provide disaster recovery and redundancy for personnel in that situation? MOS has a tool that provides an Automatic Document Repository (my words) called Oracle Configuration Manager (OCM). The real purpose of this tool is to manage all of your configurations (different systems, servers, databases, application servers) when dealing with Oracle support. It is automatic in the sense that if you are out of the office, temporarily or permanently, the system configurations are available for viewing by anyone with the same Oracle Customer Support Identifier (CSI) number. The information is also available to Oracle support personnel. The repository is located on My Oracle Support. You choose which systems to include: production, non-production, or both.

What information does OCM collect and upload? It contains extensive hardware details, software installs (not just Oracle products), databases, and Oracle application servers. There is enough information to help in recreating your site if there is a complete disaster. The GUI allows managers and other IT personnel to see how nodes and applications are related and how they fit into your architectural framework.
The information can only be updated by the upload process.

Using OCM in disconnected mode with masking

The OCM tool collects sensitive information. If you are employed by an organization that doesn't allow you to reveal such information, or doesn't allow the servers direct access to the Internet, there are steps to improve the security of this upload process. Review this section before enabling OCM: you must know what types of information are collected and how that information is used before enabling uploads to a support website.

To disable the collection of IP and MAC addresses, add the following entries to the $ORACLE_HOME/ccr/config/collector.properties file.

To disable the collection of network addresses, add the following entry:

    ccr.metric.host.ecm_hw_nic.inet_address=false

To disable the collection of the MAC address, add the following entry:

    ccr.metric.host.ecm_hw_nic.mac_address=false

The OCM collector collects the schema usernames for databases configured for configuration collections. This information is filtered or masked when ccr.metric.oracle_database.db_users.username is assigned the value of 'mask' in the $ORACLE_HOME/ccr/config/collector.properties file. The default behavior of the collector is to not mask this data.

MOS customers may request deletion of their configuration information by logging a Service Request (SR) indicating the specific configuration information and scope of the deletion request.

Disconnected mode is carried out with something called Oracle Support Hub, which is installed at your site. This hub is configured as a local secure site for direct uploads from your nodes, which the hub can then upload to MOS through the Internet. This protects each of your nodes from any type of direct Internet access.
Finally, there is a way to do a manual upload of a single node using the method outlined in the MOS document 763142.1: How to upload the collection file ocmconfig.jar to My Oracle Support for Oracle Configuration Manager (OCM) running in Disconnected Mode. This is probably the safest method to use for OCM. Run it for a specific purpose with appropriate masking built in, and then request that the information be deleted by entering an SR.

These tips came from these locations as well as the OCM licensing agreement found on MOS:
  • http://www.oracle.com/support/collateral/customersupport-security-practices.pdf
  • http://download.oracle.com/docs/html/E12881_01/toc.htm

The Oracle Support Hub can be found on the OCM Companion Distribution Disk at: http://www.oracle.com/technology/documentation/ocm.html.

Each node with an installed OCM collector can be automated to upload any changes on a daily basis or at an interval of your choice. OCM is now an optional part of any of the 10.2.0.4+ Oracle Product GUI installs. The OCM collector is also found by logging into MOS and selecting the collector tab. It is recommended to use at least the 3.2 version for ease of installation across the enterprise.

Be aware! The collector install actually creates the Unix cron entry to automatically schedule the uploads.

Mass deployment utility

The OCM collector utility has been out for over a year, but a recent enhancement makes installation easier with a mass deployment utility. On the MOS collector tab, find Configuration Manager Repeater & Mass Deployment Tools and the OCM Companion Distribution Guide. The template file required to install the collector on multiple servers is in CSV format, which you may find difficult to edit using vi or vim. The template doesn't have an initial entry, and its lines are wider than the average session window. Once the first entry is filled out (try using desktop spreadsheet software), editing this file with a command-line tool is easier.
It has a secure password feature so that no password is stored in clear text. You can enter a password at the prompt or allow the password utility to encrypt the open text passwords in the template file during the install run. The utility runs very quickly from a single node that has SSH access to all entries in the template. It auto-detects whether OCM is already installed and bypasses any of those entries. You may encounter an issue where the required Java version is higher than what is installed. Other prerequisites include SSH on Linux or Cygwin for Windows.

A downside is that all configuration information is available to everyone with the same CSI number. In a small IT shop, this isn't a problem as long as MOS access is maintained properly when personnel change. Providing granular group access within a CSI number to your uploaded configurations is a highly anticipated feature.

Release management

As a DBA you must be consistent in the different aspects of administration. It takes dedication to keep all of your installed Oracle products up-to-date on critical patches. Most DBAs keep up-to-date with production down issues that require a patch install. But what about the quarterly security fixes? The operating systems that your system admin is in charge of will probably be patched more regularly than Oracle. Why is that the case? It seems to take an inordinate amount of effort to accomplish what appears to be a small task.

Newer versions of Oracle are associated with major enhancements, as shown by the differences between versions 11.1 and 11.2. Patch sets contain at least all the cumulative bug fixes for a particular version of Oracle and an occasional enhancement, as shown in the version difference between 11.1.0.6 and 11.1.0.7. Oracle will stop supporting certain versions, indicating which is the most stable version (labeling it as the terminal release).
For example, the terminal release of Oracle 10.1.x is 10.1.0.5, as that was the last patch set released. See the following document on MOS for further information on releases: Oracle Server (RDBMS) Releases Support Status Summary [Doc ID: 161818.1].

In addition to applying patch sets on a regular basis (usually an annual event) to keep current with bug fixes, there are other types of patches released on a regular basis. Consider these to be post-patch set patches. There is some confusing information from MOS, with two different methods of patching on a quarterly basis (Jan, April, July, Oct.): Patch Set Updates (PSUs) and Critical Patch Updates (CPUs). CPUs only contain security bug fixes. The newer method of patching, PSU, includes not only the security fixes but other major bugs. These are tested as a single unit and contain bug fixes that have been applied in customers' production environments.

See the following for help in identifying a database version in relationship to PSUs (MOS Doc ID 850471.1):
  • 1st digit: Major release number
  • 2nd digit: Maintenance release
  • 3rd digit: Application server release
  • 4th digit: Release component specific
  • 5th digit: Platform specific release

First PSU for Oracle Database Version: 10.2.0.4.1
Second PSU for Oracle Database Version: 10.2.0.4.2

While either PSUs or CPUs can be applied to a new or existing system, Oracle recommends that you stick to one type. If you have applied CPUs in the past and want to continue, that is one path. If you have applied CPUs in the past and now want to apply a PSU, you must apply only PSUs from this point on to prevent conflicts. Switching back and forth will cause problems and ongoing issues with further installs, and it requires significant effort to start down this path. You may need a merge patch when migrating from a current CPU environment, called a Merge Request on MOS.

Important information on differences between CPUs and PSUs can be found in the following locations.
If there is a document number, then that is found on the MOS support site:
  • http://blogs.oracle.com/gridautomation/
  • http://www.oracle.com/technology/deploy/security/alerts.htm
  • Doc 864316.1: Application of PSU can be automated through Deployment Procedures
  • Doc 854428.1: Intro to Patch Set Updates
  • Doc 756388.1: Recommended Patches
  • Upgrade Companions: 466181.1, 601807.1
  • Error Correction Policy: 209768.1

Now, to make things even more complicated for someone new to Oracle, let's discuss recommended patches. These are released between the quarterly PSUs and CPUs and address common issues for targeted configurations. The following are targeted configurations:
  • Generic: General database use
  • Real Application Clusters and CRS: For running multiple instances on a single database with accompanying Oracle Clusterware software
  • DataGuard (and/or Streams): Oracle Redo Apply technology for moving data to a standby database or another read/write database
  • Exadata: Vendor-specific HP hardware storage solution for Oracle
  • Ebusiness Suite Certification: Oracle's version of Business Applications, which runs on an Oracle Database

Recommended patches are tested as a single combined unit, reducing some of the risk involved with multiple patches. They are meant to stabilize production environments, hopefully saving time and cost with known issues, starting with Oracle Database Release 10.2.0.3 (see Doc ID: 756671.1).
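The five-part version numbering listed earlier in this article (MOS Doc ID 850471.1) can be illustrated with a short Java sketch. This is not an Oracle utility: the class name OracleVersion, its fields, and its methods are hypothetical and exist only for illustration. It assumes the version is written as five dot-separated integers, and it treats the fifth position as the PSU number, which is an interpretation based only on the two examples given above (10.2.0.4.1 and 10.2.0.4.2).

```java
// Hypothetical helper for illustration only; not part of any Oracle API.
class OracleVersion {
    final int major;        // 1st digit: major release number
    final int maintenance;  // 2nd digit: maintenance release
    final int appServer;    // 3rd digit: application server release
    final int component;    // 4th digit: release component specific
    final int fifth;        // 5th digit: the PSU number in the article's examples

    OracleVersion(int major, int maintenance, int appServer, int component, int fifth) {
        this.major = major;
        this.maintenance = maintenance;
        this.appServer = appServer;
        this.component = component;
        this.fifth = fifth;
    }

    // Splits a string such as "10.2.0.4.1" into its five numeric parts.
    static OracleVersion parse(String text) {
        String[] parts = text.trim().split("\\.");
        if (parts.length != 5) {
            throw new IllegalArgumentException("Expected five dot-separated numbers: " + text);
        }
        return new OracleVersion(
                Integer.parseInt(parts[0]),
                Integer.parseInt(parts[1]),
                Integer.parseInt(parts[2]),
                Integer.parseInt(parts[3]),
                Integer.parseInt(parts[4]));
    }

    // The first four digits identify the patch set the PSU was built on.
    String patchSet() {
        return major + "." + maintenance + "." + appServer + "." + component;
    }

    public static void main(String[] args) {
        OracleVersion v = OracleVersion.parse("10.2.0.4.2");
        System.out.println("Patch set " + v.patchSet() + ", PSU " + v.fifth);
    }
}
```

Comparing the patchSet() value of two servers is one simple way to check that they sit on the same patch set before comparing their PSU levels.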

Packt
07 May 2010
6 min read

Oracle: When to use Log Miner

Log Miner has both a GUI in OEM and a database package, DBMS_LOGMNR. When this utility is used by the DBA, its primary focus is to mine data from the online and archived redo logs. Internally, Oracle uses the Log Miner technology for several other features, such as Flashback Transaction Backout, Streams, and Logical Standby Databases. This section is not on how to run Log Miner, but looks at the task of identifying the information to restore.

The Log Miner utility comes into play when you need to retrieve an older version of selected pieces of data without completely recovering the entire database. A complete recovery is usually a drastic measure that means downtime for all users and the possibility of lost transactions. Most often Log Miner is used for recovery purposes when the data consists of just a few tables or a single code change.

Make sure supplemental logging is turned on (see the Add Supplemental Logging section). In this case, you discover that one or more of the following conditions apply when trying to recover a small amount of data that was recently changed:
  • Flashback is not enabled
  • Flashback logs that are needed are no longer available
  • Data that is needed is not available in the online redo logs
  • Data that is needed has been overwritten in the undo segments

Go to the last place available: archived redo logs. This requires the database to be in archivelog mode and all of the needed archive logs to still be available or recoverable.

Identifying the data needed to restore

One of the hardest parts of restoring data is determining what to restore, the basic question being when did the bad data become part of the collective? Think the Borg from Star Trek! When you need to execute Log Miner to retrieve data from a production database, you will need to act fast. The older the transactions, the longer it will take to recover and traverse with Log Miner. The newest (committed) transactions are processed first, proceeding backwards.
The first question to ask is when do you think the bad event happened? Searching for data can be done in several different ways:
  • SCN, timestamp, or log sequence number
  • Pseudo column ORA_ROWSCN

SCN, timestamp, or log sequence number

If you are lucky, the application also writes a timestamp of when the data was last changed. If that is the case, then you determine the archive log to mine by using the following queries. It is important to set the session NLS_DATE_FORMAT so that the time element is displayed along with the date; otherwise you will just get the default date format of DD-MON-RR. The date format comes from the database startup parameters (the NLS_TERRITORY setting). Find the time when a log was archived and match that to the archive log needed.

Pseudo column ORA_ROWSCN

While this method seems very elegant, it does not work perfectly, meaning it won't always return the correct answer. As it may not work every time or accurately, it is generally not recommended for Flashback Transaction Queries. It is definitely worth trying in order to narrow the window that you will have to search. It uses the SCN information that was stored for the associated transaction in the Interested Transaction List. You know that delayed block cleanout is involved. The pseudo column ORA_ROWSCN contains information for the approximate time this table was updated for each row. In the following example the table has three rows, with the last row being the one that was most recently updated. It gives me the time window to search the archive logs with Log Miner.

Log Miner is the basic technology behind several of the database Maximum Availability Architecture capabilities: Logical Standby, Streams, and the following Flashback Transaction Backout exercise.

Flashback Transaction Query and Backout

Flashback technology was first introduced in Oracle9i Database.
This feature allows you to view data at different points in time and with more recent timestamps (versions), and thus provides the capability to recover previous versions of data. In this article, we are dealing with Flashback Transaction Query (FTQ) and Flashback Transaction Backout (FTB), because they both deal with transaction IDs and integrate with the Log Miner utility. See the MOS document: "What Do All 10g Flashback Features Rely on and what are their Limitations?" (Doc ID 435998.1).

Flashback Transaction Query uses the transaction ID (Xid) that is stored with each row version in a Flashback Versions Query to display every transaction that changed the row. Currently, the only Flashback technology that can be used when the object(s) in question have been changed by DDL is Flashback Data Archive. There are other restrictions to using FTB with certain data types (VARRAYs, BFILEs), which match the data type restrictions for Log Miner. This basically means that if data types aren't supported, then you can't use Log Miner to find the undo and redo log entries.

When would you use FTQ or FTB instead of the previously described methods? The answer is when the data involves several tables with multiple constraints or extensive amounts of information. Similar to Log Miner, the database can be up and running while people are working online in other schemas of the database to accomplish this restore task.

An example of using FTB or FTQ would be to reverse a payroll batch job that was run with the wrong parameters. Most often a batch job is compiled code (like C or Cobol) run against the database, with parameters built in by the application vendor. A wrong parameter could be the wrong payroll period, wrong set of employees, wrong tax calculations, or payroll deductions.

Enabling flashback logs

First of all, flashback needs to be enabled in the database.
Oracle Flashback is the database technology intended for a point-in-time recovery (PITR) by saving transactions in flashback logs. A flashback log is a temporary Oracle file and is required to be stored in the FRA, as it cannot be backed up to any other media. Extensive information on all of the ramifications of enabling flashback is found in the documentation labeled: Oracle Database Backup and Recovery User's Guide. See the following section for an example of how to enable flashback:

    --the size must be set before the destination; this is sized for a small test database
    SYS@NEWDB> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST_SIZE=100M SCOPE=BOTH;
    SYS@NEWDB> ALTER SYSTEM SET DB_RECOVERY_FILE_DEST='/backup/flash_recovery_area/NEWDB' SCOPE=BOTH;
    SYS@NEWDB> SHUTDOWN IMMEDIATE;
    SYS@NEWDB> STARTUP MOUNT EXCLUSIVE;
    SYS@NEWDB> ALTER DATABASE FLASHBACK ON;
    SYS@NEWDB> ALTER DATABASE OPEN;
    SYS@NEWDB> SHOW PARAMETER RECOVERY;

The following query would then verify that FLASHBACK had been turned on:

    SYS@NEWDB> SELECT FLASHBACK_ON FROM V$DATABASE;

Packt
04 May 2010
5 min read

Installing and Managing Multi Master Replication Manager(MMM) for MySQL High Availability

Multi Master Replication Manager (MMM): initial installation

This setup is asynchronous, and a small number of transactions can be lost in the event of the failure of the master. If this is not acceptable, no asynchronous replication-based high availability technique is suitable.

Over the next few recipes, we shall configure a two-node cluster with MMM. It is possible to configure additional slaves and more complicated topologies. As the focus of this article is high availability, and in order to keep this recipe concise, we shall not mention these techniques (although they are all documented in the manual available at http://mysql-mmm.org/).

MMM consists of several separate Perl scripts, with two main ones:
  • mmmd_mon: Runs on one node, monitors all nodes, and takes decisions.
  • mmmd_agent: Runs on each node, monitors the node, and receives instructions from mmmd_mon.

In a group of MMM-managed machines, each node has a node IP, which is the normal server IP address. In addition, each node has a "read" IP and a "write" IP. Read and write IPs are moved around depending on the status of each node as detected and decided by mmmd_mon, which migrates these IP addresses around to ensure that the write IP address is always on an active and working master, and that all read IPs are connected to another master that is in sync (which does not have out-of-date data).

mmmd_mon should not run on the same server as any of the databases, to ensure good availability. Thus, the best practice would be to keep a minimum of three nodes. In the examples of this article, we will configure two MySQL servers, node 5 and node 6 (10.0.0.5 and 10.0.0.6), with a virtual writable IP of 10.0.0.10 and two read-only IPs of 10.0.0.11 and 10.0.0.12, using a monitoring node, node 4 (10.0.0.4). We will use RedHat / CentOS provided software where possible.
If you are using the same nodes to try out any of the other recipes discussed in this article, be sure to remove MySQL Cluster RPMs and /etc/my.cnf before attempting to follow this recipe.

There are several phases to set up MMM. Firstly, the MySQL and monitoring nodes must have MMM installed, and each node must be configured to join the cluster. Secondly, the MySQL server nodes must have MySQL installed and must be configured in a master-master replication agreement. Thirdly, a monitoring node (which will monitor the cluster and take actions based on what it sees) must be configured. Finally, the MMM monitoring node must be allowed to take control of the cluster. In this article, each of the previous four steps is a recipe. The first recipe covers the initial installation of MMM on the nodes.

How to do it...

The MMM documentation provides a list of required Perl modules. With one exception, all Perl modules currently required for both monitoring agents and server nodes can be found in either the base CentOS / RHEL repositories or the EPEL repository (see the Appendices for instructions on configuration of this repository), and will be installed with the following yum command:

    [root@node6 ~]# yum -y install perl-Algorithm-Diff perl-Class-Singleton perl-DBD-MySQL perl-Log-Log4perl perl-Log-Dispatch perl-Proc-Daemon perl-MailTools

Not all of the package names are obvious for each module; fortunately, the actual Perl module name is stored in the Other field in the RPM spec file, which can be searched using this syntax:

    [root@node5 mysql-mmm-2.0.9]# yum whatprovides "*File::stat*"
    Loaded plugins: fastestmirror
    ...
    4:perl-5.8.8-18.el5.x86_64 : The Perl programming language
    Matched from:
    Other : perl(File::stat) = 1.00
    Filename : /usr/share/man/man3/File::stat.3pm.gz
    ...

This shows that the Perl File::stat module is included in the base perl package (this command prints one match per relevant file; in this case, the first file that matches is in fact the manual page).
The first step is to download the MMM source code onto all nodes:

    [root@node4 ~]# mkdir mmm
    [root@node4 ~]# cd mmm
    [root@node4 mmm]# wget http://mysql-mmm.org/_media/:mmm2:mysql-mmm-2.0.9.tar.gz
    --13:44:45-- http://mysql-mmm.org/_media/:mmm2:mysql-mmm-2.0.9.tar.gz
    ...
    13:44:45 (383 KB/s) - `mysql-mmm-2.0.9.tar.gz' saved [50104/50104]

Then we extract it using the tar command:

    [root@node4 mmm]# tar zxvf mysql-mmm-2.0.9.tar.gz
    mysql-mmm-2.0.9/
    mysql-mmm-2.0.9/lib/
    ...
    mysql-mmm-2.0.9/VERSION
    mysql-mmm-2.0.9/LICENSE
    [root@node4 mmm]# cd mysql-mmm-2.0.9

Now, we need to install the software, which is simply done with the Makefile provided:

    [root@node4 mysql-mmm-2.0.9]# make install
    mkdir -p /usr/lib/perl5/vendor_perl/5.8.8/MMM /usr/bin/mysql-mmm /usr/sbin /var/log/mysql-mmm /etc /etc/mysql-mmm /usr/bin/mysql-mmm/agent/ /usr/bin/mysql-mmm/monitor/
    ...
    [ -f /etc/mysql-mmm/mmm_tools.conf ] || cp etc/mysql-mmm/mmm_tools.conf /etc/mysql-mmm/

Ensure that the exit code is 0 and that there are no errors:

    [root@node4 mysql-mmm-2.0.9]# echo $?
    0

Any errors are most likely caused by missing dependencies. Ensure that you have a working yum configuration (refer to Appendices) and have run the correct yum install command.

Packt
04 May 2010
6 min read

Setting up MySQL Replication for High Availability

MySQL Replication is a feature of the MySQL server that allows you to replicate data from one MySQL database server (called the master) to one or more MySQL database servers (slaves). MySQL Replication has been supported in MySQL for a very long time and is an extremely flexible and powerful technology. Depending on the configuration, you can replicate all databases, selected databases, or even selected tables within a database.

In this article, by Alex Davies, author of High Availability MySQL Cookbook, we will cover:
  • Designing a replication setup
  • Configuring a replication master
  • Configuring a replication slave without synchronizing data
  • Configuring a replication slave and migrating data with a simple SQL dump
  • Using LVM to reduce downtime on master when bringing a slave online
  • Replication safety tricks

Installing and Managing Multi Master Replication Manager (MMM) for MySQL High Availability is covered separately.

Replication is asynchronous, that is, the process of replication is not immediate and there is no guarantee that slaves have the same contents as the master (this is in contrast to MySQL Cluster).

Designing a replication setup

There are many ways to architect a MySQL Replication setup, with the number of options increasing enormously with the number of machines. In this recipe, we will look at the most common topologies and discuss the advantages and disadvantages of each, in order to show you how to select the appropriate design for each individual setup.

Getting ready

MySQL replication is simple.
A server involved in a replication setup has one of the following two roles:
  • Master: Master MySQL servers write all transactions that change data to a binary log
  • Slave: Slave MySQL servers connect to a master (on start), download the transactions from the master's binary log, and apply them to the local server

Slaves can themselves act as masters; the transactions that they apply from their master can be added in turn to their log as if they were made directly against the slave.

Binary logs are binary files that contain details of every transaction that the MySQL server has executed. Running the server with the binary log enabled makes performance about 1 percent slower. The MySQL master creates binary logs in the forms name.000001, name.000002, and so on. Once a binary log reaches a defined size, it starts a new one. After a certain period of time, MySQL removes old logs.

The exact steps for setting up both slaves and masters are covered in later recipes, but for the rest of this recipe it is important to understand that slaves contact masters to retrieve newer bits of the binary log, and to apply these changes to their local database.

How to do it...

There are several common architectures that MySQL replication can be used with. We will briefly mention and discuss benefits and problems with the most common designs, although we will explore in detail only designs that achieve high availability.

Master and slave

A single master with one or more slaves is the simplest possible setup. A master with one slave connected from the local network, and one slave connected via a VPN over the Internet, is shown in the following diagram:

A setup such as this, with vastly different network connections from the different slaves to the master, will result in the two slaves having slightly different data.
It is likely that the locally attached slave may be more up to date, because the latency involved in data transfers over the Internet (and any possible restriction on bandwidth) may slow down the replication process.

This Master-Slave setup has the following common uses and advantages:
  • A local slave for backups, ensuring that there is no massive increase in load during a backup period.
  • A remote location: due to the asynchronous nature of MySQL replication, there is no great problem if the link between the master and the slave goes down (the slave will catch up when reconnected), and there is no significant performance hit at the master because of the slave.
  • It is possible to run slightly different structures (such as different indexes) and focus a small number of extremely expensive queries at a dedicated slave in order to avoid slowing down the master.
  • This is an extremely simple setup to configure and manage.

A Master-Slave setup unfortunately has the following disadvantages:
  • No automatic redundancy. It is common in setups such as this to use lower specification hardware for the slaves, which means that it may be impossible to "promote" a slave to a master in the case of a master failure.
  • Write queries cannot be committed on the slave node. This means write transactions will have to be sent over the VPN to the master (with associated latency, bandwidth, and availability problems).
  • Replication is equivalent to a RAID 1 setup, which is not an enormously efficient use of disk space (in the previous example diagram, each piece of data is written three times).
  • Each slave does put a slight load on the master as it downloads its binary log. The number of slaves thus can't increase infinitely.

Multi-master (active / active)

Multi-master replication involves two MySQL servers, both configured as replication masters and slaves.
This means that a transaction executed on one is picked up by the other, and vice versa, as shown in the following diagram:

A SQL client connecting to the master on the left will execute a query, which will end up in that master's binary log. The master on the right will pick this query up and execute it. The same process, in reverse, occurs when a query is executed on the master on the right.

While this looks like a fantastic solution, there are problems with this design:
  • It is very easy for the data on the servers to become inconsistent due to the non-deterministic nature of some queries and "race conditions" where conflicting queries are executed at the same time on each node. Recent versions of MySQL include various tricks to minimize the likelihood of these problems, but they are still almost inevitable in most real-world setups.
  • It is extremely difficult to discover if this inconsistency exists, until it gets so bad that the replication breaks (because a replicated query can't be executed on the other node).

This design is only mentioned here for completeness; it is often strongly recommended not to use it. Either use the next design, or if more than one "active" node is required, use one of the other high-availability techniques that are available but not covered in this article.
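To make the race condition described above concrete, here is a minimal Java sketch. It does not use MySQL at all: each master is reduced to a single integer column, and the class and method names (MultiMasterRace, applyWrites, diverges) are hypothetical. The sketch assumes that each master applies its own client's write first and the replicated write from its peer second, which is the ordering that arises when two conflicting writes are committed at the same moment on each node.

```java
// Toy model of active/active replication; not MySQL code.
class MultiMasterRace {

    // Applies a sequence of "UPDATE item SET stock = <value>" style writes
    // in the given order and returns the value the row ends up with.
    static int applyWrites(int... writesInOrder) {
        int stock = 0; // starting value of the row on this master
        for (int write : writesInOrder) {
            stock = write; // the last write applied wins
        }
        return stock;
    }

    // Each master sees its local write first, then the peer's replicated write.
    static boolean diverges(int leftClientWrite, int rightClientWrite) {
        int left = applyWrites(leftClientWrite, rightClientWrite);
        int right = applyWrites(rightClientWrite, leftClientWrite);
        return left != right;
    }

    public static void main(String[] args) {
        int left = applyWrites(10, 20);   // left master: local 10, then replicated 20
        int right = applyWrites(20, 10);  // right master: local 20, then replicated 10
        System.out.println("left master:  stock=" + left);
        System.out.println("right master: stock=" + right);
        System.out.println("diverged: " + diverges(10, 20));
    }
}
```

Both replicated statements execute without error, so neither server reports a problem even though the two copies of the row now disagree. This is why the inconsistency tends to go unnoticed until a later statement fails on one side.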

Packt
27 Apr 2010
5 min read

Working with Aggregators in Oracle Coherence 3.5

For example, you might want to retrieve the total amount of all orders for a particular customer. One possible solution is to retrieve all the orders for the customer using a filter and to iterate over them on the client in order to calculate the total. While this will work, you need to consider the implications:
  • You might end up moving a lot of data across the network in order to calculate a result that is only a few bytes long
  • You will be calculating the result in a single-threaded fashion, which might introduce a performance bottleneck into your application

The better approach would be to calculate partial results on each cache node for the data it manages, and to aggregate those partial results into a single answer before returning it to the client. Fortunately, we can use Coherence aggregators to achieve exactly that.

By using an aggregator, we limit the amount of data that needs to be moved across the wire to the aggregator instance itself, the partial results returned by each Coherence node the aggregator is evaluated on, and the final result. This reduces the network traffic significantly and ensures that we use the network as efficiently as possible. It also allows us to perform the aggregation in parallel, using the full processing power of the Coherence cluster.

At its most basic, an aggregator is an instance of a class that implements the com.tangosol.util.InvocableMap.EntryAggregator interface:

    interface EntryAggregator extends Serializable {
        Object aggregate(Set set);
    }

However, you will rarely have the need to implement this interface directly. Instead, you should extend the com.tangosol.util.aggregator.AbstractAggregator class that also implements the com.tangosol.util.InvocableMap.ParallelAwareAggregator interface, which is required to ensure that the aggregation is performed in parallel across the cluster.
The AbstractAggregator class has a constructor that accepts a value extractor to use and defines the three abstract methods you need to override:

    public abstract class AbstractAggregator
            implements InvocableMap.ParallelAwareAggregator {
        public AbstractAggregator(ValueExtractor valueExtractor) {...}
        protected abstract void init(boolean isFinal);
        protected abstract void process(Object value, boolean isFinal);
        protected abstract Object finalizeResult(boolean isFinal);
    }

The init method is used to initialize the result of aggregation, the process method is used to process a single aggregation value and include it in the result, and the finalizeResult method is used to create the final result of the aggregation.

Because aggregators can be executed in parallel, the init and finalizeResult methods accept a flag specifying whether the result to initialize or finalize is the final result that should be returned by the aggregator or a partial result, returned by one of the parallel aggregators.

The process method also accepts an isFinal flag, but in its case the semantics are somewhat different: if the isFinal flag is true, the object to process is the result of a single parallel aggregator execution that needs to be incorporated into the final result. Otherwise, it is the value extracted from a target object using the value extractor that was specified as a constructor argument. This will all be much clearer when we look at an example.
Let's write a simple aggregator that returns an average value of a numeric attribute:

    public class AverageAggregator
            extends AbstractAggregator {
        private transient double sum;
        private transient int count;

        public AverageAggregator() {
            // deserialization constructor
        }

        public AverageAggregator(ValueExtractor valueExtractor) {
            super(valueExtractor);
        }

        public AverageAggregator(String propertyName) {
            super(propertyName);
        }

        protected void init(boolean isFinal) {
            sum = 0;
            count = 0;
        }

        protected void process(Object value, boolean isFinal) {
            if (value != null) {
                if (isFinal) {
                    // value is a partial result from one parallel aggregator
                    PartialResult pr = (PartialResult) value;
                    sum += pr.getSum();
                    count += pr.getCount();
                }
                else {
                    // value is the attribute extracted from a cache entry
                    sum += ((Number) value).doubleValue();
                    count++;
                }
            }
        }

        protected Object finalizeResult(boolean isFinal) {
            if (isFinal) {
                return count == 0 ? null : sum / count;
            }
            else {
                return new PartialResult(sum, count);
            }
        }

        static class PartialResult implements Serializable {
            private double sum;
            private int count;

            PartialResult(double sum, int count) {
                this.sum = sum;
                this.count = count;
            }

            public double getSum() {
                return sum;
            }

            public int getCount() {
                return count;
            }
        }
    }

As you can see, the init method simply sets both the sum and the count fields to zero, completely ignoring the value of the isFinal flag. This is OK, as we want those values to start from zero whether we are initializing our main aggregator or one of the parallel aggregators.

The finalizeResult method, on the other hand, depends on the isFinal flag to decide which value to return. If it is true, it divides the sum by the count in order to calculate the average and returns it. The only exception is if the count is zero, in which case the result is undefined and the null value is returned. However, if the isFinal flag is false, finalizeResult simply returns an instance of the PartialResult inner class, which is nothing more than a holder for the partial sum and related count on a single node.

Finally, the process method also uses the isFinal flag to determine its correct behavior.
If it's true, that means that the value to be processed is a PartialResult instance, so it reads partial sum and count from it and adds them to the main aggregator's sum and count fields. Otherwise, it simply adds the value to the sum field and increments the count field by one. We have implemented AverageAggregator in order to demonstrate with a simple example how the isFinal flag should be used to control the aggregation, as well as to show that the partial and the final result do not have to be of the same type. However, this particular aggregator is pretty much a throw-away piece of code, as we'll see in the next section.
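The two-phase flow described above (a partial result per node, then a final merge) can be sketched in plain Java without a Coherence dependency. This is only an illustration under stated assumptions: the names AverageSketch, Partial, partialFor, and combine are hypothetical and are not part of the Coherence API, and each list passed to partialFor plays the role of the values held by one cache node.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java sketch of two-phase averaging; no Coherence classes are used.
// All names here are hypothetical and exist only for illustration.
final class AverageSketch {

    // Holder for one node's partial sum and count (mirrors PartialResult).
    static final class Partial {
        final double sum;
        final int count;

        Partial(double sum, int count) {
            this.sum = sum;
            this.count = count;
        }
    }

    // Phase 1: what a single node computes over the values it manages.
    static Partial partialFor(List<? extends Number> values) {
        double sum = 0;
        int count = 0;
        for (Number n : values) {
            if (n != null) {
                sum += n.doubleValue();
                count++;
            }
        }
        return new Partial(sum, count);
    }

    // Phase 2: merge the partial results; null when there was no data.
    static Double combine(List<Partial> partials) {
        double sum = 0;
        int count = 0;
        for (Partial p : partials) {
            sum += p.sum;
            count += p.count;
        }
        return count == 0 ? null : sum / count;
    }

    public static void main(String[] args) {
        List<Partial> partials = Arrays.asList(
                partialFor(Arrays.asList(10, 20)),  // "node 1"
                partialFor(Arrays.asList(30)));     // "node 2"
        System.out.println(combine(partials));
    }
}
```

Note that the final step merges sums and counts rather than averaging the per-node averages, which would give a wrong answer whenever nodes hold different numbers of entries. That is the reason the partial and final results have different types.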

Packt
27 Apr 2010
6 min read

Querying the Data Grid in Coherence 3.5: Obtaining Query Results and Using Indexes

The easiest way to obtain query results is to invoke one of the QueryMap.entrySet methods:

    Filter filter = ...;
    Set<Map.Entry> results = cache.entrySet(filter);

This will return a set of Map.Entry instances representing both the key and the value of a cache entry, which is likely not what you want. More often than not you need only values, so you will need to iterate over the results and extract the value from each Map.Entry instance:

    List values = new ArrayList(results.size());
    for (Map.Entry entry : results) {
        values.add(entry.getValue());
    }

After doing this a couple of times you will probably want to create a utility method for this task. Because all the queries should be encapsulated within various repository implementations, we can simply add the following utility methods to our AbstractCoherenceRepository class:

    public abstract class AbstractCoherenceRepository<K, V extends Entity<K>> {
        ...
        protected Collection<V> queryForValues(Filter filter) {
            Set<Map.Entry<K, V>> entries = getCache().entrySet(filter);
            return extractValues(entries);
        }

        protected Collection<V> queryForValues(Filter filter,
                                               Comparator comparator) {
            Set<Map.Entry<K, V>> entries =
                    getCache().entrySet(filter, comparator);
            return extractValues(entries);
        }

        private Collection<V> extractValues(Set<Map.Entry<K, V>> entries) {
            List<V> values = new ArrayList<V>(entries.size());
            for (Map.Entry<K, V> entry : entries) {
                values.add(entry.getValue());
            }
            return values;
        }
    }

What happened to the QueryMap.values() method?

Obviously, things would be a bit simpler if the QueryMap interface also had an overloaded version of the values method that accepts a filter and, optionally, a comparator as arguments. I'm not sure why this functionality is missing from the API, but I hope it will be added in one of the future releases. In the meantime, a simple utility method is all it takes to provide the missing functionality, so I am not going to complain too much.
Controlling query scope using data affinity

Data affinity can provide a significant performance boost because it allows Coherence to optimize the query for related objects. Instead of executing the query in parallel across all the nodes and aggregating the results, Coherence can simply execute it on a single node, because data affinity guarantees that all the results will be on that particular node. This effectively reduces the number of objects searched to approximately C/N, where C is the total number of objects in the cache the query is executed against, and N is the number of partitions in the cluster.

However, this optimization is not automatic. You have to target the partition to search explicitly, using KeyAssociatedFilter:

    Filter query = ...;
    Filter filter = new KeyAssociatedFilter(query, key);

In the previous example, we create a KeyAssociatedFilter that wraps the query we want to execute. The second argument to its constructor is the cache key that determines the partition to search.

To make all of this more concrete, let's look at the final implementation of the code for our sample application that returns account transactions for a specific period. First, we need to add the getTransactions method to our Account class:

    public Collection<Transaction> getTransactions(Date from, Date to) {
        return getTransactionRepository().findTransactions(m_id, from, to);
    }

Finally, we need to implement the findTransactions method within the CoherenceTransactionRepository:

    public Collection<Transaction> findTransactions(
            Long accountId, Date from, Date to) {
        Filter filter = new FilterBuilder()
                .equals("id.accountId", accountId)
                .between("time", from, to)
                .build();
        return queryForValues(
                new KeyAssociatedFilter(filter, accountId),
                new DefaultTransactionComparator());
    }

As you can see, we target the query using the account identifier and ensure that the results are sorted by transaction number by passing DefaultTransactionComparator to the queryForValues helper method we implemented earlier.
This ensures that Coherence looks for transactions only within the partition that the account with the specified id belongs to.

Querying near cache

One situation where a direct query using the entrySet method might not be appropriate is when you need to query a near cache. Because there is no way for Coherence to determine if all the results are already in the front cache, it will always execute the query against the back cache and return all the results over the network, even if some or all of them are already present in the front cache. Obviously, this is a waste of network bandwidth.

What you can do in order to optimize the query is to obtain the keys first and then retrieve the entries by calling the CacheMap.getAll method:

    Filter filter = ...;
    Set keys = cache.keySet(filter);
    Map results = cache.getAll(keys);

The getAll method will try to satisfy as many results as possible from the front cache and delegate to the back cache to retrieve only the missing ones. This will ensure that we move the bare minimum of data across the wire when executing queries, which will improve the throughput.

However, keep in mind that this approach might increase latency, as you are making two network roundtrips instead of one, unless all results are already in the front cache. In general, if the expected result set is relatively small, it might make more sense to move all the results over the network using a single entrySet call.

Another potential problem with the idiom used for near cache queries is that it could return invalid results. There is a possibility that some of the entries might change between the calls to keySet and getAll. If that happens, getAll might return entries that do not satisfy the filter anymore, so you should only use this approach if you know that this cannot happen (for example, if objects in the cache you are querying, or at least the attributes that the query is based on, are immutable).
Sorting the results

We have already seen that the entrySet method allows you to pass a Comparator as a second argument, which will be used to sort the results. If your objects implement the Comparable interface you can also specify null as a second argument and the results will be sorted based on their natural ordering. For example, if we defined the natural sort order for transactions by implementing Comparable within our Transaction class, we could have simply passed null instead of a DefaultTransactionComparator instance within the findTransactions implementation shown earlier.

On the other hand, if you use the near cache query idiom, you will have to sort the results yourself. This is again an opportunity to add utility methods to our base repository class that allow you to query a near cache and optionally sort the results. However, there is a lot more to cover in this article, so I will leave this as an exercise for the reader.
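One possible shape for the near cache utility method mentioned above is sketched here in plain Java, so that it runs without Coherence on the classpath. The assumptions are significant: a java.util.Map stands in for the near cache, a java.util.function.Predicate stands in for a Coherence Filter, and NearCacheQuerySketch and queryAndSort are hypothetical names. The sketch re-applies the predicate to the values fetched in the second step, which is one way to guard against entries that changed between the two calls, and then sorts on the client.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical helper: a plain Map stands in for the near cache and a
// Predicate stands in for a Coherence Filter.
final class NearCacheQuerySketch {

    static <K, V> List<V> queryAndSort(Map<K, V> cache,
                                       Predicate<V> filter,
                                       Comparator<? super V> comparator) {
        // Step 1: collect matching keys (the keySet(filter) call in the text).
        List<K> keys = new ArrayList<>();
        for (Map.Entry<K, V> e : cache.entrySet()) {
            if (filter.test(e.getValue())) {
                keys.add(e.getKey());
            }
        }
        // Step 2: fetch values by key (the getAll(keys) call in the text),
        // re-checking the filter in case a value changed after step 1.
        List<V> values = new ArrayList<>();
        for (K key : keys) {
            V value = cache.get(key);
            if (value != null && filter.test(value)) {
                values.add(value);
            }
        }
        // Step 3: sort on the client, because this idiom does not sort.
        values.sort(comparator);
        return values;
    }

    public static void main(String[] args) {
        Map<String, Integer> cache = new LinkedHashMap<>();
        cache.put("a", 30);
        cache.put("b", 5);
        cache.put("c", 20);
        System.out.println(queryAndSort(cache, v -> v >= 10,
                Comparator.naturalOrder()));
    }
}
```

The re-check in step 2 costs one extra predicate evaluation per result on the client. With immutable cached objects it can be dropped, as the text notes.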
Packt
27 Apr 2010
5 min read

Working with Value Extractors and Simplifying Queries in Oracle Coherence 3.5

Coherence allows you to look up one or more objects based on attributes other than the identity by specifying a filter for set-based operations defined by the QueryMap interface.

    public interface QueryMap extends Map {
        Set keySet(Filter filter);
        Set entrySet(Filter filter);
        Set entrySet(Filter filter, Comparator comparator);
        ...
    }

As you can see from the previous interface definition, all three methods accept a filter as the first argument, which is an instance of a class implementing a very simple com.tangosol.util.Filter interface:

    public interface Filter {
        boolean evaluate(Object o);
    }

Basically, the Filter interface defines a single method, evaluate, which takes an object to evaluate as an argument and returns true if the specified object satisfies the criteria defined by the filter, or false if it doesn't.

This mechanism is very flexible, as it allows you to filter your cached objects any way you want. For example, it would be quite simple to implement a filter that can be used to retrieve all the account transactions in a specific period:

    public class TransactionFilter implements Filter {
        private Long m_accountId;
        private Date m_from;
        private Date m_to;

        public TransactionFilter(Long accountId, Date from, Date to) {
            m_accountId = accountId;
            m_from = from;
            m_to = to;
        }

        public boolean evaluate(Object o) {
            Transaction tx = (Transaction) o;
            // the period is inclusive at both ends
            return tx.getId().getAccountId().equals(m_accountId)
                    && tx.getTime().compareTo(m_from) >= 0
                    && tx.getTime().compareTo(m_to) <= 0;
        }
    }

While the previous sample filter implementation is perfectly valid and will return correct results if executed against the transactions cache, it would be very cumbersome if you had to define every single query criterion in the application by implementing a custom filter class as we did previously. Fortunately, Coherence provides a number of built-in filters that make custom filter implementation unnecessary in the vast majority of cases.
Built-in filters

Most queries can be expressed in terms of object attributes and standard logical and relational operators, such as AND, OR, equals, less than, greater than, and so on. For example, if we wanted to find all the transactions for an account, it would be much easier if we could just execute the query analogous to the select * from Transactions where account_id = 123 SQL statement than to write a custom filter that checks if the accountId attribute is equal to 123.

The good news is that Coherence has a number of built-in filters that allow us to do exactly that. The following table lists all the filters from the com.tangosol.util.filter package that you can use to construct custom queries:

As you can see, pretty much all of the standard Java logical operators and SQL predicates are covered. This will allow us to construct query expressions as complex as the ones we can define in Java code or the SQL where clause.

The bad news is that there is no query language in Coherence that allows you to specify a query as a string. Instead, you need to create the expression tree for the query programmatically, which can make things a bit tedious. For example, the where clause of the SQL statement we specified earlier, select * from Transactions where account_id = 123, can be represented by the following Coherence filter definition:

    Filter filter = new EqualsFilter("getId.getAccountId", 123);

In this case it is not too bad: we simply create an instance of an EqualsFilter that will extract the value of an accountId attribute from a Transaction.Id instance and compare it with 123.
However, if we modify the query to filter transactions by date as well, the filter expression that we need to create becomes slightly more complex:

    Filter filter = new AndFilter(
            new EqualsFilter("getId.getAccountId", accountId),
            new BetweenFilter("getTime", from, to));

If you need to combine several logical expressions, this can quickly get out of hand, so we will look for a way to simplify filter creation shortly. But first, let's talk about something we used in the examples without paying much attention to it: value extractors.

Value extractors

As you can see from the previous examples, a query is typically expressed in terms of object attributes, such as accountId or time, while the evaluate method defined by the Filter interface accepts a whole object that the attributes belong to, such as a Transaction instance.

That implies that we need a generic way to extract attribute values from an object instance. Otherwise, there would be no way to define reusable filters, such as the ones in the table earlier that ship with Coherence, and we would be forced to implement a custom filter for each query we need to execute. In order to solve this problem and enable extraction of attribute values from an object, Coherence introduces value extractors.

A value extractor is an object that implements the com.tangosol.util.ValueExtractor interface:

    public interface ValueExtractor {
        Object extract(Object target);
    }

The sole purpose of a value extractor is to extract a derived value from the target object that is passed as an argument to the extract method. The result could be a single attribute value, a combination of multiple attributes (concatenation of first and last name, for example), or in general, a result of some transformation of a target object.
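To make the relationship between filters and value extractors concrete, here is a minimal, self-contained sketch. It deliberately does not use the Coherence classes: SketchExtractor, SketchFilter, SketchEqualsFilter, and ExtractorDemo are hypothetical stand-ins declared locally, shaped like the interfaces quoted above, so that the example runs on a plain JDK. It illustrates how a single generic filter can be reused for any attribute, including a derived one such as a full name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Objects;

// Stand-in interfaces shaped like the ones in the text; they are declared
// locally so the sketch runs without Coherence. All names are hypothetical.
interface SketchExtractor {
    Object extract(Object target);
}

interface SketchFilter {
    boolean evaluate(Object o);
}

// A reusable filter: it knows nothing about the target type and relies on
// the extractor to pull out the attribute it compares.
final class SketchEqualsFilter implements SketchFilter {
    private final SketchExtractor extractor;
    private final Object expected;

    SketchEqualsFilter(SketchExtractor extractor, Object expected) {
        this.extractor = extractor;
        this.expected = expected;
    }

    @Override
    public boolean evaluate(Object o) {
        return Objects.equals(extractor.extract(o), expected);
    }
}

final class ExtractorDemo {
    // Minimal value class used only by this example.
    static final class Person {
        final String first;
        final String last;

        Person(String first, String last) {
            this.first = first;
            this.last = last;
        }
    }

    // Returns the items accepted by the filter, in their original order.
    static List<Object> select(List<?> items, SketchFilter filter) {
        List<Object> out = new ArrayList<>();
        for (Object item : items) {
            if (filter.evaluate(item)) {
                out.add(item);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Derived value: concatenation of first and last name.
        SketchExtractor fullName =
                t -> ((Person) t).first + " " + ((Person) t).last;
        List<Person> people = Arrays.asList(
                new Person("Ada", "Lovelace"), new Person("Alan", "Turing"));
        SketchFilter filter = new SketchEqualsFilter(fullName, "Alan Turing");
        System.out.println(select(people, filter).size());
    }
}
```

Swapping in a different extractor changes what is compared without touching the filter class, which is the reuse the text is describing.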

Packt
09 Apr 2010
1 min read

Author Podcast - Bob Griesemer on Oracle Warehouse Builder 11g

Click here to download the interview, or hit play in the media player below.    

Packt
09 Apr 2010
8 min read

Oracle's RDBMS SQL Command Dump Block

Do not do this in a production database.

Before continuing with this article, you should read Oracle Database Concepts 11g Release 2 (11.2) from the documentation set, the book every DBA should start with. Our examination of data blocks starts in Section 12-6 of the Concepts Manual.

Data block format: "Every Oracle data block has a format or internal structure that enables the database to track the data and free space in the block. This format is similar whether the data block contains table, index, or table cluster data."

A block is the smallest unit of logical storage that the Relational Database Management System (RDBMS) can manipulate. Block size is determined by the database parameter DB_BLOCK_SIZE. The logical storage units of data blocks, extents, segments, and tablespaces (from smallest to largest) map to the data files, which are stored in operating system blocks.

An undo block will store the undo transaction that is the actual SQL command needed to reverse the original SQL transaction statement. This undo is needed for read consistency for all read-only queries until you commit or roll back that transaction.

Read consistency within a changed block (transaction) is maintained for any of the following commands: insert, update, delete, merge, select for update, or lock table. Any of the previous changes are tracked until the command is issued to either commit or roll back a particular transaction. This consistency keeps the data view to each user the same, whether they are just doing queries or actually changing any other data.

A point in time, called the System Change Number (SCN), identifies each transaction, and transaction flags show the state of the transaction. The only end user that can see any changed data is the one making the changes, no matter which application is used, until they commit that change.

The SCN advances for every change to the database as a sequential counter, which identifies a certain point in time.
The SCN tracks more than just single transactions by end users. These transactions will be in Data Definition Language (DDL) or Data Manipulation Language (DML). DDL statements are associated with creating objects (create table) or what is also called metadata. DML are the other commands mentioned earlier (insert, update, delete, among others) that manipulate the data in some way. The RDBMS advances the SCN if another person logs in, reconnects, or alters their session as well as when Oracle background processes (which constantly check the state of activity inside of the database) take place.

It is undo that gives everyone a point-in-time consistent view of the data, which is called Read Consistency. There are controls created from business rules within the application called triggers and integrity constraints that validate the data entered by the user. Database locks control access to data during changes for exclusive access by the end user changing it.

During a delete or update statement:

- The data block is read, loading it into a memory structure called a buffer cache
- The redo log buffer will contain the corresponding delete or update statement
- An entry in the undo segment header block is created for this transaction
- It also copies the delete or update row into an undo block
- For a delete, the row is removed from the data block and that block is marked as dirty
- Locks keep exclusive use of that block until a commit or rollback occurs

Dirty is an internal designation where the block is identified as having changed data that has not been written to disk. The RDBMS needs to track this information for transactional integrity and consistency.
The underlying dynamic performance view v$bh indicates when a particular block is dirty, as seen by the following query:

    SYS@ORCL11>select file#, block# from v$bh where dirty='Y';

When a transaction is committed by the end user:

- The transaction SCN is updated in the data block and the undo segment header marks that statement as committed in the header section of the undo block.
- The logwriter process (LGWR) will flush the log buffer to the appropriate online redo log file.
- SCN is changed on the data block if it is still in the buffer cache (fast commit).

Delayed block cleanout can happen when all of the changed blocks don't have the updated SCN indicating the commit has occurred. This can cause problems with a transaction that is updating large numbers of rows if a rollback needs to occur. Symptoms include hanging onto an exclusive lock until that rollback is finished, and causing end users to wait.

The delayed block cleanout process does occasionally cause problems that would require opening an Oracle Support Request. Delayed block cleanout was implemented to save time by reducing the number of disk reads to update the SCN until the RDBMS needs to access data from that same block again. If the changed block has already been written to the physical disk and the Oracle background process encounters this same block (for any other query, DML, or DDL), it will also record the committed change at the same time. It does this by checking the transaction entry by SCN in the undo header, which indicates the changes that have been committed. That transaction entry is located in the transaction table, which keeps track of all active transactions for that undo segment.

Each transaction is uniquely identified by the assignment of a transaction ID (XID), which is found in the v$transaction view.
This XID is written in the undo header block along with the Undo Byte Address (Uba), which consists of the data file number and the data block number, found in the UBAFIL and UBABLK columns of the v$transaction view, respectively.

Please take the time to go through the following demonstration; it will solidify the complex concepts in this article.

Demonstration of data travel path

Dumping a block is one of the methods to show how data is stored. It will show the actual contents of the block, whether it is a Table or Index Block, and an actual address that includes the data file number and block number. Remember from the concepts manual that several blocks together make up an extent, and extents then make up segments. A single segment maps to a particular table or index. It is easy to see from the following simplified diagram how different extents can be stored in different physical locations in different data files but the same logical tablespace:

The data in the test case comes from creating a small table (segment) with minimum data in a tablespace with a single data file created just for this demonstration. Automatic Segment Space Management (ASSM) is the default in 11g. If you create a tablespace in 11g with none of the optional storage parameters, the RDBMS by default creates an ASSM segment with locally managed autoallocated extents.

It is possible to define the size of the extents at tablespace creation time, depending on the type of data to be stored. If all of the data is uniform and you need to maintain strict control over the amount of space used, then uniform extents are desirable. Allowing the RDBMS to autoallocate extents is typical in situations where the data is not the same size for each extent, reducing the amount of time spent in allocating and maintaining space for database segments. Discussing the details, options, and differences for all of the ways to manage segment space in Oracle Database 11g is beyond the scope of this article.
For this example, we will be using race car track information as the sample data. For this demonstration, you will create a specific user with the minimum amount of privileges needed to complete this exercise; SQL is provided for that step in the script.

There are several key files in the zipped code for this article that you will need for this exercise, listed as follows:

- dumpblock_sys.sql
- dumpblock_ttracker.sql
- dumpblocksys.lst
- dumpblockttracker.lst
- NEWDB_ora_8582_SYSDUMP1.rtf
- NEWDB_ora_8582_SYSDUMP1.txt
- NEWDB_ora_8621_SYSDUMP2.rtf
- NEWDB_ora_8621_SYSDUMP2.txt
- NEWDB_ora_8628_SYSDUMP3.rtf
- NEWDB_ora_8628_SYSDUMP3.txt
- NEWDB_ora_8635_SYSDUMP4.rtf
- NEWDB_ora_8635_SYSDUMP4.txt

You will also need access to a conversion calculator to translate hexadecimal values to decimal numbers; that is the first link below (use hexadecimal input and decimal output). The second link will allow you to look up hexadecimal (Hex) equivalents for characters.

http://calculators.mathwarehouse.com/binary-hexadecimal-calculator.php#hexadecimalBinaryCalculator
http://www.asciitable.com/

Location of trace files

The dump block statement will create a trace file in the user dump (udump) directory on any version prior to 11gR1, which can be viewed with a text editor. Using 11gR1 and above, you will find it in the diag directory location. This example will demonstrate how to use the adrci command-line utility to view trace files. First we set the home path where the utility will find the files, then search with the most recent listed first; in this case, it is the NEWDB_ora_9980.trc file.

Now that you know the location for the trace files, how do you determine which trace file was produced? The naming convention for trace files includes the actual process number associated with that session. Use the following command to produce trace files with a specific name, making it easier to identify a separate task:

    SYS@NEWDB>ALTER SESSION SET TRACEFILE_IDENTIFIER = SYSDUMP_SESSION;
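For the hexadecimal lookups mentioned earlier, a few lines of standard Java can do the same job as the two web pages when reading a block dump by hand. This is only a sketch: the class name HexSketch and the sample values in main are made up for illustration and are not taken from the article's trace files.

```java
// Hypothetical helper for reading values out of a block dump by hand.
final class HexSketch {

    // Hexadecimal text (with or without a 0x prefix) to a decimal number.
    static long hexToDecimal(String hex) {
        String digits = hex.startsWith("0x") || hex.startsWith("0X")
                ? hex.substring(2) : hex;
        return Long.parseLong(digits, 16);
    }

    // Space-separated hex bytes to the ASCII characters they encode.
    static String hexBytesToAscii(String hexBytes) {
        StringBuilder sb = new StringBuilder();
        for (String b : hexBytes.trim().split("\\s+")) {
            sb.append((char) Integer.parseInt(b, 16));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(hexToDecimal("0x1f"));           // prints 31
        System.out.println(hexBytesToAscii("54 65 73 74")); // prints Test
    }
}
```

The second method assumes single-byte ASCII data; column values stored in a multi-byte character set would need a proper charset decoder instead.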

Packt
09 Apr 2010
8 min read

Installing Pentaho Data Integration with MySQL

In order to work with Pentaho 3.2 Data Integration (PDI) you need to install the software. It's a simple task; let's do it.

Time for action – installing PDI

These are the instructions to install Kettle, whatever your operating system. The only prerequisite to install PDI is to have JRE 5.0 or higher installed. If you don't have it, please download it from http://www.javasoft.com/ and install it before proceeding. Once you have checked the prerequisite, follow these steps:

1. From http://community.pentaho.com/sourceforge/ follow the link to Pentaho Data Integration (Kettle). Alternatively, go directly to the download page http://sourceforge.net/projects/pentaho/files/Data Integration.
2. Choose the newest stable release. At this time, it is 3.2.0.
3. Download the file that matches your platform. The preceding screenshot should help you.
4. Unzip the downloaded file in a folder of your choice, for example C:/Kettle or /home/your_dir/kettle.
5. If your system is Windows, you're done. Under UNIX-like environments, it's recommended that you make the scripts executable. Assuming that you chose Kettle as the installation folder, execute the following commands:

    cd Kettle
    chmod +x *.sh

What just happened?

You have installed the tool in just a few minutes. Now you have all you need to start working.

Launching the PDI graphical designer: Spoon

Now that you've installed PDI, you must be eager to do some stuff with data. That will be possible only inside a graphical environment. PDI has a desktop designer tool named Spoon. Let's see how it feels to work with it.

Time for action – starting and customizing Spoon

In this tutorial you're going to launch the PDI graphical designer and get familiarized with its main features.

1. Start Spoon. If your system is Windows, type the following command:

    Spoon.bat

   In other platforms such as Unix, Linux, and so on, type:

    Spoon.sh

   If you didn't make Spoon.sh executable, you may type:

    sh Spoon.sh

2. As soon as Spoon starts, a dialog window appears asking for the repository connection data. Click the No Repository button. The main window appears. You will see a small window with the tip of the day. After reading it, close that window.
3. A Welcome! window appears with some useful links for you to see. Close the welcome window. You can open that window later from the main menu.
4. Click Options... from the Edit menu. A window appears where you can change various general and visual characteristics. Uncheck the circled checkboxes:
5. Select the tab window Look & Feel.
6. Change the Grid size and Preferred Language settings as follows:
7. Click the OK button.
8. Restart Spoon in order to apply the changes. You should see neither the repository dialog nor the welcome window. You should see the following screen instead:

What just happened?

You ran Spoon, the graphical designer of PDI, for the first time and applied some custom configuration. From the Look & Feel configuration window, you changed the size of the dotted grid that appears in the canvas area while you are working. You also changed the preferred language. In the Options window, you chose not to show either the repository dialog or the welcome window at startup. These changes were applied as you restarted the tool, not before.

The second time you launched the tool, the repository dialog didn't show up. When the main window appeared, all the visible texts were shown in French, which was the selected language, and instead of the welcome window, there was a blank screen.

Spoon

The tool that you're exploring in this section is PDI's desktop design tool. With Spoon you design, preview, and test all your work, that is, transformations and jobs. When you see PDI screenshots, what you are really seeing are Spoon screenshots.
The other PDI components that you will meet in the following chapters are executed from terminal windows.

Setting preferences in the Options window

In the tutorial you changed some preferences in the Options window. There are several look and feel characteristics you can change beyond those you changed. Feel free to experiment with these settings. Remember to restart Spoon in order to see the changes applied.

If you choose any language as preferred language other than English, you should select a different language as alternative. If you do so, every name or description not translated to your preferred language will be shown in the alternative language.

Just for the curious people: Italian and French are the overall winners of the list of languages to which the tool has been translated from English. Below them follow Korean, Argentinean Spanish, Japanese, and Chinese.

One of the settings you changed was the appearance of the welcome window at start up. The welcome window has many useful links, all related to the tool: wiki pages, news, forum access, and more. It's worth exploring them. You don't have to change the settings again to see the welcome window. You can open it from the menu Help | Show the Welcome Screen.

Storing transformations and jobs in a repository

The first time you launched Spoon, you chose No Repository. After that, you configured Spoon to stop asking you for the Repository option. You must be curious about what the repository is and why not to use it. Let's explain it.

As said, the results of working with PDI are Transformations and Jobs. In order to save the Transformations and Jobs, PDI offers two methods:

- Repository: When you use the repository method you save jobs and transformations in a repository. A repository is a relational database specially designed for this purpose.
- Files: The files method consists of saving jobs and transformations as regular XML files in the filesystem, with extension kjb and ktr respectively.
The following diagram summarizes this:

You cannot mix the two methods (files and repository) in the same project. Therefore, you must choose the method when you start the tool. Why did we choose not to work with a repository, or in other words, to work with files? This is mainly for the following two reasons:

Working with files is more natural and practical for most users.
Working with a repository requires at least minimal database knowledge and access to a database engine from your computer. Meeting both preconditions would let you learn both methods. However, you probably don't meet them yet.

Creating your first transformation

Until now, you've seen the very basic elements of Spoon. Surely you are waiting to do some interesting task beyond looking around. It's time to create your first transformation.

Time for action – creating a hello world transformation

How about starting by saying Hello to the World? Not original, but enough for a very first practical exercise. Here is how you do it:

Create a folder named pdi_labs under the folder of your choice.
Open Spoon.
From the main menu select File | New Transformation.
At the left-hand side of the screen, you'll see a tree of Steps. Expand the Input branch by double-clicking it.
Left-click the Generate Rows icon.
Without releasing the button, drag-and-drop the selected icon to the main canvas. The screen will look like this:
Double-click the Generate Rows step that you just put in the canvas and fill the text boxes and grid as follows:
From the Steps tree, double-click the Flow branch.
Click the Dummy icon and drag-and-drop it to the main canvas.
Click the Generate Rows step and, holding the Shift key down, drag the cursor towards the Dummy step. Release the button. The screen should look like this:
Right-click somewhere on the canvas to bring up a contextual menu.
Select New note. A note editor appears.
Type some description such as Hello World! and click OK.
From the main menu, select Transformation | Configuration. A window appears to specify transformation properties. Fill the Transformation name with a simple name such as hello_world. Fill the Description field with a short description such as My first transformation. Finally, provide a clearer explanation in the Extended description text box and click OK.
From the main menu, select File | Save.
Save the transformation in the folder pdi_labs with the name hello_world.
Select the Dummy step by left-clicking it.
Click on the Preview button in the menu above the main canvas. A debug window appears.
Click the Quick Launch button. The following window appears to preview the data generated by the transformation:
Close the preview window and click the Run button. A window appears.
Click Launch. The execution results are shown at the bottom of the screen. The Logging tab should look as follows:
Packt
08 Apr 2010
4 min read

Unveil the Power of Your Business Data with Oracle Discoverer

A quick guide to Oracle Discoverer packaging Before proceeding to get the Oracle Discoverer software, it’s important to realize what you actually need and what the Oracle Discoverer packaging provides. At the moment, there are two options when it comes to the current Oracle Discoverer software: Oracle Business Intelligence suite, part of Oracle Application Server 10g Release 2 Portal, Forms, Reports and Discoverer suite, part of Oracle Fusion Middleware 11g Release 1 The first option – the Oracle Business Intelligence suite, part of Oracle Application Server 10g Release 2 – includes the following components: Oracle Business Intelligence Discoverer Oracle HTTP Server Oracle Application Server Containers for J2EE (OC4J) Oracle Enterprise Manager 10g Application Server Control Oracle Application Server Web Cache Oracle Application Server Reports Services The first component, Oracle Business Intelligence Discoverer, in the above list represents actually a group of components whose name starts with Discoverer. The package includes: Discoverer Plus Discoverer Viewer Discoverer Services Discoverer Portlet Provider Note, however, that the above list does not include all the Discoverer components. For example, you won’t find the following Discoverer components there: Discoverer Administrator Discoverer Desktop The above components are included in a complementary package called Oracle Business Intelligence Tools. As mentioned at the beginning of this section, another option to take advantage of the Discoverer components is to install the Portal, Forms, Reports and Discoverer suite, which is part of Oracle Fusion Middleware 11g Release 1. 
This package includes the following components:

HTTP Server
WebCache
Portal
Forms Services
Forms Builder
Reports Services
Report Builder/Compiler
Discoverer Administrator
Discoverer Plus
Discoverer Viewer
Discoverer Services
Discoverer Desktop
Enterprise Manager Fusion Middleware Control

As you can see, the Portal, Forms, Reports and Discoverer suite, unlike the Oracle Business Intelligence suite, does include Discoverer Administrator and Discoverer Desktop, so you won’t need to install another package to obtain these components. A major downside to choosing the Portal, Forms, Reports and Discoverer suite, though, is that it requires some additional software to be installed on your system. Here is the list of the required additional software components:

WebLogic Server
Repository Creation Utility
Identity Management
SSO Metadata Repository Creation Assistant
Patch Scripts
Identity Management 10gR3
Oracle Database

For this reason, to save you the trouble of installing a lot of software, the "Installation process" section later in this article will cover the installation of the Oracle Business Intelligence suite, part of Oracle Application Server 10g Release 2, rather than the Portal, Forms, Reports and Discoverer suite of Oracle Fusion Middleware 11g Release 1.

Getting the software

Once you have decided on the package you want to install, you can get it from the OTN’s Software Downloads page at http://www.oracle.com/technology/software/index.html. It’s important to remember that each software component available from this page comes with a Development License, which allows for free download and unlimited evaluation time. You can look at the license at http://www.oracle.com/technology/software/popup-license/standard-license.html. Later, if you so desire, you can always buy products with full-use licenses.
So, on the OTN’s Software Downloads page, go to the Middleware section and, assuming you want to download the Oracle Business Intelligence suite, click the Business Intelligence SE link to proceed to the Oracle Application Server 10g Release 2 (10.1.2.0.2) page at http://www.oracle.com/technology/software/products/ias/htdocs/101202.html. On this page, go down to the Business Intelligence section and find the links to the packages provided for your operating system. Each package is meant to be copied to a separate CD. The number of CDs and the size of the packages to be copied to them may vary depending on the operating system. What you need to do is download the installation packages and then copy each to a CD.

Looking through the links to the installation packages, you may notice that Tools CD, the link to the package containing the Oracle Business Intelligence Tools suite, is available only for the Microsoft Windows operating system. This is because the components included in the Oracle Business Intelligence Tools suite are Windows-only applications.

If, instead of the Oracle Business Intelligence suite, you decided on the Portal, Forms, Reports and Discoverer suite, you have to follow the Oracle Fusion Middleware 11g R1 link in the Middleware section on the OTN’s Software Downloads page. Following that link, you’ll be directed to the Oracle Fusion Middleware 11gR1 Software Downloads page at http://www.oracle.com/technology/software/products/middleware/htdocs/fmw_11_download.html. On this page, go down to the Portal, Forms, Reports and Discoverer section and pick up the distribution, which is divided into several packages. Again, the number of packages within a distribution and their size may vary depending on the operating system.
Packt
31 Mar 2010
10 min read

Installing Coherence 3.5 and Accessing the Data Grid: Part 2

Using the Coherence API One of the great things about Coherence is that it has a very simple and intuitive API that hides most of the complexity that is happening behind the scenes to distribute your objects. If you know how to use a standard Map interface in Java, you already know how to perform basic tasks with Coherence. In this section, we will first cover the basics by looking at some of the foundational interfaces and classes in Coherence. We will then proceed to do something more interesting by implementing a simple tool that allows us to load data into Coherence from CSV files, which will become very useful during testing. The basics: NamedCache and CacheFactory As I have briefly mentioned earlier, Coherence revolves around the concept of named caches. Each named cache can be configured differently, and it will typically be used to store objects of a particular type. For example, if you need to store employees, trade orders, portfolio positions, or shopping carts in the grid, each of those types will likely map to a separate named cache. The first thing you need to do in your code when working with Coherence is to obtain a reference to a named cache you want to work with. In order to do this, you need to use the CacheFactory class, which exposes the getCache method as one of its public members. For example, if you wanted to get a reference to the countries cache that we created and used in the console example, you would do the following: NamedCache countries = CacheFactory.getCache("countries"); Once you have a reference to a named cache, you can use it to put data into that cache or to retrieve data from it. 
Doing so is as simple as doing gets and puts on a standard Java Map:

    countries.put("SRB", "Serbia");
    String countryName = (String) countries.get("SRB");

As a matter of fact, NamedCache is an interface that extends Java's Map interface, so you will be immediately familiar not only with the get and put methods, but also with other methods from the Map interface, such as clear, remove, putAll, size, and so on.

The nicest thing about the Coherence API is that it works in exactly the same way, regardless of the cache topology you use. For now let's just say that you can configure Coherence to replicate or partition your data across the grid. The difference between the two is that in the former case all of your data exists on each node in the grid, while in the latter only 1/n of the data exists on each individual node, where n is the number of nodes in the grid.

Regardless of how your data is stored physically within the grid, the NamedCache interface provides a standard API that allows you to access it. This makes it very simple to change cache topology during development if you realize that a different topology would be a better fit, without having to modify a single line in your code.

In addition to the Map interface, NamedCache extends a number of lower-level Coherence interfaces. The following table provides a quick overview of these interfaces and the functionality they provide:

The "Hello World" example

In this section we will implement a complete example that achieves programmatically what we have done earlier using the Coherence console: we'll put a few countries in the cache, list cache contents, remove items, and so on. To make things more interesting, instead of using country names as cache values, we will use proper objects this time.
That means that we need a class to represent a country, so let's start there:

    public class Country implements Serializable, Comparable {
        private String code;
        private String name;
        private String capital;
        private String currencySymbol;
        private String currencyName;

        public Country() {
        }

        public Country(String code, String name, String capital,
                       String currencySymbol, String currencyName) {
            this.code = code;
            this.name = name;
            this.capital = capital;
            this.currencySymbol = currencySymbol;
            this.currencyName = currencyName;
        }

        public String getCode() { return code; }
        public void setCode(String code) { this.code = code; }

        public String getName() { return name; }
        public void setName(String name) { this.name = name; }

        public String getCapital() { return capital; }
        public void setCapital(String capital) { this.capital = capital; }

        public String getCurrencySymbol() { return currencySymbol; }
        public void setCurrencySymbol(String currencySymbol) {
            this.currencySymbol = currencySymbol;
        }

        public String getCurrencyName() { return currencyName; }
        public void setCurrencyName(String currencyName) {
            this.currencyName = currencyName;
        }

        public String toString() {
            return "Country(" +
                   "Code = " + code + ", " +
                   "Name = " + name + ", " +
                   "Capital = " + capital + ", " +
                   "CurrencySymbol = " + currencySymbol + ", " +
                   "CurrencyName = " + currencyName + ")";
        }

        public int compareTo(Object o) {
            Country other = (Country) o;
            return name.compareTo(other.name);
        }
    }

There are several things to note about the Country class, which also apply to other classes that you want to store in Coherence:

Because the objects need to be moved across the network, classes that are stored within the data grid need to be serializable. In this case we have opted for the simplest solution and made the class implement the java.io.Serializable interface. This is not optimal from either a performance or a memory utilization perspective, and Coherence provides several more suitable approaches to serialization.
We have implemented the toString method that prints out an object's state in a friendly format. While this is not a Coherence requirement, implementing toString properly for both keys and values that you put into the cache will help a lot when debugging, so you should get into the habit of implementing it for your own classes.
Finally, we have also implemented the Comparable interface. This is also not a requirement, but it will come in handy in a moment to allow us to print out a list of countries sorted by name.

Now that we have the class that represents the values we want to cache, it is time to write an example that uses it:

    import com.tangosol.net.NamedCache;
    import com.tangosol.net.CacheFactory;
    import ch02.Country;
    import java.util.Set;
    import java.util.Map;

    public class CoherenceHelloWorld {
        public static void main(String[] args) {
            NamedCache countries = CacheFactory.getCache("countries");

            // first, we need to put some countries into the cache
            countries.put("USA", new Country("USA", "United States",
                                             "Washington", "USD", "Dollar"));
            countries.put("GBR", new Country("GBR", "United Kingdom",
                                             "London", "GBP", "Pound"));
            countries.put("RUS", new Country("RUS", "Russia", "Moscow",
                                             "RUB", "Ruble"));
            countries.put("CHN", new Country("CHN", "China", "Beijing",
                                             "CNY", "Yuan"));
            countries.put("JPN", new Country("JPN", "Japan", "Tokyo",
                                             "JPY", "Yen"));
            countries.put("DEU", new Country("DEU", "Germany", "Berlin",
                                             "EUR", "Euro"));
            countries.put("FRA", new Country("FRA", "France", "Paris",
                                             "EUR", "Euro"));
            countries.put("ITA", new Country("ITA", "Italy", "Rome",
                                             "EUR", "Euro"));
            countries.put("SRB", new Country("SRB", "Serbia", "Belgrade",
                                             "RSD", "Dinar"));
            assert countries.containsKey("JPN") : "Japan is not in the cache";

            // get and print a single country
            System.out.println("get(SRB) = " + countries.get("SRB"));

            // remove Italy from the cache
            int size = countries.size();
            System.out.println("remove(ITA) = " + countries.remove("ITA"));
            assert countries.size() == size - 1 : "Italy was not removed";

            // list all cache entries
            Set<Map.Entry> entries = countries.entrySet(null, null);
            for (Map.Entry entry : entries) {
                System.out.println(entry.getKey() + " = " + entry.getValue());
            }
        }
    }

Let's go through this code section by section. At the very top, you can see import statements for NamedCache and CacheFactory, which are the only Coherence classes we need for this simple example. We have also imported our Country class, as well as Java's standard Map and Set interfaces.

The first thing we need to do within the main method is to obtain a reference to the countries cache using the CacheFactory.getCache method. Once we have the cache reference, we can add some countries to it using the same old Map.put method you are familiar with. We then proceed to get a single object from the cache using the Map.get method, and to remove one using Map.remove. Notice that the NamedCache implementation fully complies with the Map.remove contract and returns the removed object.

Finally, we list all the countries by iterating over the set returned by the entrySet method. Notice that Coherence cache entries implement the standard Map.Entry interface.

Overall, if it wasn't for a few minor differences, it would be impossible to tell whether the preceding code uses Coherence or any of the standard Map implementations. The first telltale sign is the call to CacheFactory.getCache at the very beginning, and the second one is the call to the entrySet method with two null arguments. We have already discussed the former, but where did the latter come from?

The answer is that the Coherence QueryMap interface extends Java's Map by adding methods that allow you to filter and sort the entry set. The first argument in our example is an instance of the Coherence Filter interface. In this case, we want all the entries, so we simply pass null as a filter. The second argument, however, is more interesting in this particular example. It represents the java.util.Comparator that should be used to sort the results.
If the values stored in the cache implement the Comparable interface, you can pass null instead of the actual Comparator instance as this argument, in which case the results will be sorted using their natural ordering (as defined by the Comparable.compareTo implementation). That means that when you run the previous example, you should see the following output:

    get(SRB) = Country(Code = SRB, Name = Serbia, Capital = Belgrade, CurrencySymbol = RSD, CurrencyName = Dinar)
    remove(ITA) = Country(Code = ITA, Name = Italy, Capital = Rome, CurrencySymbol = EUR, CurrencyName = Euro)
    CHN = Country(Code = CHN, Name = China, Capital = Beijing, CurrencySymbol = CNY, CurrencyName = Yuan)
    FRA = Country(Code = FRA, Name = France, Capital = Paris, CurrencySymbol = EUR, CurrencyName = Euro)
    DEU = Country(Code = DEU, Name = Germany, Capital = Berlin, CurrencySymbol = EUR, CurrencyName = Euro)
    JPN = Country(Code = JPN, Name = Japan, Capital = Tokyo, CurrencySymbol = JPY, CurrencyName = Yen)
    RUS = Country(Code = RUS, Name = Russia, Capital = Moscow, CurrencySymbol = RUB, CurrencyName = Ruble)
    SRB = Country(Code = SRB, Name = Serbia, Capital = Belgrade, CurrencySymbol = RSD, CurrencyName = Dinar)
    GBR = Country(Code = GBR, Name = United Kingdom, Capital = London, CurrencySymbol = GBP, CurrencyName = Pound)
    USA = Country(Code = USA, Name = United States, Capital = Washington, CurrencySymbol = USD, CurrencyName = Dollar)

As you can see, the countries in the list are sorted by name, as defined by our Country.compareTo implementation. Feel free to experiment by passing a custom Comparator as the second argument to the entrySet method, or by removing both arguments, and see how that affects result ordering.
If you are feeling really adventurous and can't wait to learn about Coherence queries, take a sneak peek by changing the line that returns the entry set to:

    Set<Map.Entry> entries = countries.entrySet(new LikeFilter("getName", "United%"), null);

As a final note, you might have also noticed that I used Java assertions in the previous example to check that reality matches my expectations (well, more to demonstrate a few other methods in the API, but that's beside the point). Make sure that you specify the -ea JVM argument when running the example if you want the assertions to be enabled, or use the run-helloworld target in the included Ant build file, which configures everything properly for you.

That concludes the implementation of our first Coherence application. One thing you might notice is that the CoherenceHelloWorld application will run just fine even if you don't have any Coherence nodes started, and you might be wondering how that is possible. The truth is that there is one Coherence node: the CoherenceHelloWorld application itself. As soon as the CacheFactory.getCache method gets invoked, Coherence services will start within the application's JVM, and it will either join the existing cluster or create a new one if there are no other nodes on the network. If you don't believe me, look at the log messages printed by the application and you will see that this is indeed the case.

Now that you know the basics, let's move on and build something slightly more exciting, and much more useful.
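The effect of passing null versus a custom Comparator can be tried without a running cluster. The following is only a sketch that imitates the sorting behavior with plain java.util collections; it does not use the Coherence API, and the class name SortedEntriesSketch and the method sortedEntries are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SortedEntriesSketch {

    // Returns the map's entries sorted by value. A null comparator falls back
    // to the values' natural ordering (Comparable.compareTo), mirroring the
    // behavior described in the text. This is a stand-in, not Coherence code.
    static <K, V extends Comparable<V>> List<Map.Entry<K, V>> sortedEntries(
            Map<K, V> map, Comparator<V> comparator) {
        List<Map.Entry<K, V>> entries = new ArrayList<>(map.entrySet());
        Comparator<V> effective =
                (comparator != null) ? comparator : Comparator.naturalOrder();
        entries.sort(Map.Entry.comparingByValue(effective));
        return entries;
    }

    public static void main(String[] args) {
        // Country names stand in for the Country objects used in the article.
        Map<String, String> countries = new HashMap<>();
        countries.put("USA", "United States");
        countries.put("GBR", "United Kingdom");
        countries.put("DEU", "Germany");

        // Natural ordering of the values (alphabetical by name)
        for (Map.Entry<String, String> e : sortedEntries(countries, null)) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }

        // Custom comparator: reverse alphabetical order
        for (Map.Entry<String, String> e
                : sortedEntries(countries, Comparator.<String>reverseOrder())) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

A plain HashMap makes no ordering promise, which is why the sort is done explicitly here; in the article's example that work is requested through the two-argument entrySet call instead.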
Packt
31 Mar 2010
10 min read

Installing Coherence 3.5 and Accessing the Data Grid: Part 1

When I first started evaluating Coherence, one of my biggest concerns was how easy it would be to set up and use, especially in a development environment. The whole idea of having to set up a cluster scared me quite a bit, as any other solution I had encountered up to that point that had the word "cluster" in it was extremely difficult and time consuming to configure. My fear was completely unfounded—getting the Coherence cluster up and running is as easy as starting Tomcat. You can start multiple Coherence nodes on a single physical machine, and they will seamlessly form a cluster. Actually, it is easier than starting Tomcat. Installing Coherence In order to install Coherence you need to download the latest release from the Oracle Technology Network (OTN) website. The easiest way to do so is by following the link from the main Coherence page on OTN. At the time of this writing, this page was located at http://www.oracle.com/technology/products/coherence/index.html, but that might change. If it does, you can find its new location by searching for 'Oracle Coherence' using your favorite search engine. In order to download Coherence for evaluation, you will need to have an Oracle Technology Network (OTN) account. If you don't have one, registration is easy and completely free. Once you are logged in, you will be able to access the Coherence download page, where you will find the download links for all available Coherence releases: one for Java, one for .NET, and one for each of the supported C++ platforms. You can download any of the Coherence releases you are interested in while you are there, but for the remainder of this article you will only need the first one. The latter two (.NET and C++) are client libraries that allow .NET and C++ applications to access the Coherence data grid. Coherence ships as a single ZIP archive. 
Once you unpack it you should see the README.txt file containing the full product name and version number, and a single directory named coherence. Copy the contents of the coherence directory to a location of your choice on your hard drive. The common location on Windows is c:\coherence and on Unix/Linux /opt/coherence, but you are free to put it wherever you want. The last thing you need to do is to configure the environment variable COHERENCE_HOME to point to the top-level Coherence directory created in the previous step, and you are done.

Coherence is a Java application, so you also need to ensure that you have the Java SDK 1.4.2 or later installed and that the JAVA_HOME environment variable is properly set to point to the Java SDK installation directory. If you are using a JVM other than Sun's, you might need to edit the scripts used in the following section. For example, not all JVMs support the -server option that is used while starting the Coherence nodes, so you might need to remove it.

What's in the box?

The first thing you should do after installing Coherence is become familiar with the structure of the Coherence installation directory.
There are four subdirectories within the Coherence home directory:

bin: This contains a number of useful batch files for Windows and shell scripts for Unix/Linux that can be used to start Coherence nodes or to perform various network tests
doc: This contains the Coherence API documentation, as well as links to online copies of the Release Notes, User Guide, and Frequently Asked Questions documents
examples: This contains several basic examples of Coherence functionality
lib: This contains JAR files that implement Coherence functionality

Shell scripts on Unix
If you are on a Unix-based system, you will need to add execute permission to the shell scripts in the bin directory by executing the following command:

    $ chmod u+x *.sh

Starting up the Coherence cluster

In order to get the Coherence cluster up and running, you need to start one or more Coherence nodes. The Coherence nodes can run on a single physical machine, or on many physical machines that are on the same network. The latter will definitely be the case for a production deployment, but for development purposes you will likely want to limit the cluster to a single desktop or laptop.

The easiest way to start a Coherence node is to run the cache-server.cmd batch file on Windows or the cache-server.sh shell script on Unix. The end result in either case should be similar to the following screenshot:

There is quite a bit of information on this screen, and over time you will become familiar with each section. For now, notice two things:

At the very top of the screen, you can see the information about the Coherence version that you are using, as well as the specific edition and the mode that the node is running in. Notice that by default you are using the most powerful edition, Grid Edition, in development mode.
The MasterMemberSet section towards the bottom lists all members of the cluster and provides some useful information about the current and the oldest member of the cluster.
Now that we have a single Coherence node running, let's start another one by running the cache-server script in a different terminal window. For the most part, the output should be very similar to the previous screen, but if everything has gone according to plan, the MasterMemberSet section should reflect the fact that the second node has joined the cluster:

    MasterMemberSet
      (
      ThisMember=Member(Id=2, ...)
      OldestMember=Member(Id=1, ...)
      ActualMemberSet=MemberSet(Size=2, BitSetCount=2
        Member(Id=1, ...)
        Member(Id=2, ...)
        )
      RecycleMillis=120000
      RecycleSet=MemberSet(Size=0, BitSetCount=0)
      )

You should also see several log messages on the first node's console, letting you know that another node has joined the cluster and that some of the distributed cache partitions were transferred to it. If you can see these log messages on the first node, as well as two members within the ActualMemberSet on the second node, congratulations: you have a working Coherence cluster.

Troubleshooting cluster start-up

In some cases, a Coherence node will not be able to start or to join the cluster. In general, the reason for this could be all kinds of networking-related issues, but in practice a few issues are responsible for the vast majority of problems.

Multicast issues

By far the most common issue is that multicast is disabled on the machine. By default, Coherence uses multicast for its cluster join protocol, and it will not be able to form the cluster unless multicast is enabled. You can easily check if multicast is enabled and working properly by running the multicast-test shell script within the bin directory. If you are unable to start the cluster on a single machine, you can execute the following command from your Coherence home directory:

    $ . bin/multicast-test.sh -ttl 0

This will limit the time-to-live of multicast packets to the local machine and allow you to test multicast in isolation.
If everything is working properly, you should see a result similar to the following:

    Starting test on ip=Aleks-Mac-Pro.home/192.168.1.7, group=/237.0.0.1:9000, ttl=0
    Configuring multicast socket...
    Starting listener...
    Fri Aug 07 13:44:44 EDT 2009: Sent packet 1.
    Fri Aug 07 13:44:44 EDT 2009: Received test packet 1 from self
    Fri Aug 07 13:44:46 EDT 2009: Sent packet 2.
    Fri Aug 07 13:44:46 EDT 2009: Received test packet 2 from self
    Fri Aug 07 13:44:48 EDT 2009: Sent packet 3.
    Fri Aug 07 13:44:48 EDT 2009: Received test packet 3 from self

If the output is different from the above, it is likely that multicast is not working properly or is disabled on your machine. This is frequently the result of a firewall or VPN software running, so the first troubleshooting step would be to disable such software and retry. If you determine that this was indeed the cause of the problem, you have two options. The first, and obvious one, is to turn the offending software off while using Coherence.

However, for various reasons that might not be an acceptable solution, in which case you will need to change the default Coherence behavior and tell it to use the Well-Known Addresses (WKA) feature instead of multicast for the cluster join protocol. Doing so on a development machine is very simple: all you need to do is add the following argument to the JAVA_OPTS variable within the cache-server shell script:

    -Dtangosol.coherence.wka=localhost

With that in place, you should be able to start Coherence nodes even if multicast is disabled.

Localhost and loopback address
On some systems, localhost maps to a loopback address, 127.0.0.1. If that's the case, you will have to specify the actual IP address or host name for the tangosol.coherence.wka configuration parameter. The host name should be preferred, as the IP address can change as you move from network to network, or if your machine leases an IP address from a DHCP server.
As a side note, you can tell whether the WKA or multicast is being used for the cluster join protocol by looking at the section above the MasterMemberSet section when the Coherence node starts. If multicast is used, you will see something similar to the following: Group{Address=224.3.5.1, Port=35461, TTL=4} The actual multicast group address and port depend on the Coherence version being used. As a matter of fact, you can even tell the exact version and the build number from the preceding information. In this particular case, I am using Coherence 3.5.1 release, build 461. This is done in order to prevent accidental joins of cluster members into an existing cluster. For example, you wouldn't want a node in the development environment using newer version of Coherence that you are evaluating to join the existing production cluster, which could easily happen if the multicast group address remained the same. On the other hand, if you are using WKA, you should see output similar to the following instead: WellKnownAddressList(Size=1, WKA{Address=192.168.1.7, Port=8088} ) Using the WKA feature completely disables multicast in a Coherence cluster, and is recommended for most production deployments, primarily due to the fact that many production environments prohibit multicast traffic altogether, and that some network switches do not route multicast traffic properly. That said, configuring WKA for production clusters is out of the scope of this article, and you should refer to Coherence product manuals for details. Binding issues Another issue that sometimes comes up is that one of the ports that Coherence attempts to bind to is already in use and you see a bind exception when attempting to start the node. By default, Coherence starts the first node on port 8088, and increments port number by one for each subsequent node on the same machine. 
If for some reason that doesn't work for you, you need to identify a range of available ports for as many nodes as you are planning to start (both UDP and TCP ports with the same numbers must be available), and tell Coherence which port to use for the first node by specifying the tangosol.coherence.localport system property. For example, if you want Coherence to use port 9100 for the first node, you will need to add the following argument to the JAVA_OPTS variable in the cache-server shell script: -Dtangosol.coherence.localport=9100
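Finding a usable starting port can be automated before you edit the script. The following is a minimal sketch that uses only the standard java.net classes, not any Coherence API; the class name PortRangeCheck, its method names, and the 9100 to 9200 search window are assumptions made for illustration.

```java
import java.io.IOException;
import java.net.DatagramSocket;
import java.net.ServerSocket;

public class PortRangeCheck {

    // Returns true only if both the TCP and the UDP port with this number
    // can be bound right now, as the text requires for each node.
    static boolean isPortFree(int port) {
        try (ServerSocket tcp = new ServerSocket(port);
             DatagramSocket udp = new DatagramSocket(port)) {
            return true;
        } catch (IOException e) {
            // A bind failure means that at least one of the two ports is taken
            return false;
        }
    }

    // Returns the first port in [from, to] that starts a run of 'count'
    // consecutive free ports, or -1 if no such run exists in the window.
    static int findFreeRange(int from, int to, int count) {
        for (int start = from; start + count - 1 <= to; start++) {
            boolean allFree = true;
            for (int p = start; p < start + count; p++) {
                if (!isPortFree(p)) {
                    allFree = false;
                    break;
                }
            }
            if (allFree) {
                return start;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Hypothetical plan: three nodes on this machine, searched from 9100
        int first = findFreeRange(9100, 9200, 3);
        System.out.println(first < 0
                ? "No free range found"
                : "-Dtangosol.coherence.localport=" + first);
    }
}
```

The check is only a snapshot: another process can still grab a port between the check and the moment the node starts, so a bind exception remains possible.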
Packt
30 Mar 2010
6 min read

Slowly Changing Dimension (SCD) Type 6

The Example

We will apply SCDs to maintain the history of the Product dimension, specifically the history of changes to a product’s product group. The PRODUCT_SK column is the surrogate key of the Product dimension table.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME
1           11            PENCIL        1                   WRITING SUPPLY
2           22            PEN           1                   WRITING SUPPLY
3           33            TONER         2                   PRINTING SUPPLY
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY

SCD Type 1

We will apply SCD Type 1 to the PENCIL product in the Product dimension table. Let’s say PENCIL changes its product group to 4. Effecting this change by applying SCD Type 1 just updates the product group on the existing PENCIL row. We keep no record of its previous product group; in other words, we do not maintain its product group history. The PENCIL row (PRODUCT_SK 1) now carries the updated product group.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME
1           11            PENCIL        4                   NON ELECTRONIC SUPPLY
2           22            PEN           1                   WRITING SUPPLY
3           33            TONER         2                   PRINTING SUPPLY
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY

SCD Type 2

SCD Type 2 is essentially the opposite of Type 1. When we apply SCD Type 2, we never update or delete any existing product group. To apply SCD Type 2 we need an effective date and an expiry date. An expiry date of 31-Dec-99 means the row is not expired; it is the most current version of the product.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME     EFFECTIVE_DATE  EXPIRY_DATE
1           11            PENCIL        1                   WRITING SUPPLY         1-Jan-09        31-Dec-99
2           22            PEN           1                   WRITING SUPPLY         1-Jan-09        31-Dec-99
3           33            TONER         2                   PRINTING SUPPLY        1-Jan-09        31-Dec-99
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY  1-Jan-09        31-Dec-99

Assuming the product group change of PENCIL is effective 1 April 2010, we update the expiry date of its existing row to 31 March 2010, one day before the effective date of the change, and insert a new row that represents its new, current version.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME     EFFECTIVE_DATE  EXPIRY_DATE
1           11            PENCIL        1                   WRITING SUPPLY         1-Jan-09        31-Mar-10
2           22            PEN           1                   WRITING SUPPLY         1-Jan-09        31-Dec-99
3           33            TONER         2                   PRINTING SUPPLY        1-Jan-09        31-Dec-99
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY  1-Jan-09        31-Dec-99
5           11            PENCIL        4                   NON ELECTRONIC SUPPLY  1-Apr-10        31-Dec-99

SCD Type 3

With SCD Type 3 we maintain history, but in one record only. We have one column for each version of the product group, so you need as many columns as the number of versions you want to keep. One of the most common SCD Type 3 applications is to maintain two versions of the product group: the original version and the current version. When there is no product group change yet, the current product group is the same as the original product group.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME     EFFECTIVE_DATE  EXPIRY_DATE  CURRENT_PRODUCT_GROUP_CODE  CURRENT_PRODUCT_GROUP_NAME
1           11            PENCIL        1                   WRITING SUPPLY         1-Jan-09        31-Dec-99    1                           WRITING SUPPLY
2           22            PEN           1                   WRITING SUPPLY         1-Jan-09        31-Dec-99    1                           WRITING SUPPLY
3           33            TONER         2                   PRINTING SUPPLY        1-Jan-09        31-Dec-99    2                           PRINTING SUPPLY
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY  1-Jan-09        31-Dec-99    4                           NON ELECTRONIC SUPPLY

When the pencil’s product group changes, let’s say on 1 April 2010, we expire its original product group by changing the expiry date to a day earlier (31 March 2010), and replace its current product group with the new product group.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME     EFFECTIVE_DATE  EXPIRY_DATE  CURRENT_PRODUCT_GROUP_CODE  CURRENT_PRODUCT_GROUP_NAME
1           11            PENCIL        1                   WRITING SUPPLY         1-Jan-09        31-Mar-10    4                           NON ELECTRONIC SUPPLY
2           22            PEN           1                   WRITING SUPPLY         1-Jan-09        31-Dec-99    1                           WRITING SUPPLY
3           33            TONER         2                   PRINTING SUPPLY        1-Jan-09        31-Dec-99    2                           PRINTING SUPPLY
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY  1-Jan-09        31-Dec-99    4                           NON ELECTRONIC SUPPLY

When its product group changes again in the future, we will replace just the current product group with the new product group. The expiry date does not change; it is updated only once, the first time the product group changes.

SCD Type 6

SCD Type 6 combines the three basic types. Type 6 is particularly applicable if you want to maintain complete history and would also like an easy way to report on the current version. Let’s apply Type 6 instead of Type 3 only. We have applied Type 3 by having two versions of the product group. When the pencil’s product group changes, we update its existing current product group (that is a Type 1 update). We also apply Type 2 by adding a new row.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME     EFFECTIVE_DATE  EXPIRY_DATE  CURRENT_PRODUCT_GROUP_CODE  CURRENT_PRODUCT_GROUP_NAME
1           11            PENCIL        1                   WRITING SUPPLY         1-Jan-09        31-Mar-10    4                           NON ELECTRONIC SUPPLY
2           22            PEN           1                   WRITING SUPPLY         1-Jan-09        31-Dec-99    1                           WRITING SUPPLY
3           33            TONER         2                   PRINTING SUPPLY        1-Jan-09        31-Dec-99    2                           PRINTING SUPPLY
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY  1-Jan-09        31-Dec-99    4                           NON ELECTRONIC SUPPLY
5           11            PENCIL        4                   NON ELECTRONIC SUPPLY  1-Apr-10        31-Dec-99    4                           NON ELECTRONIC SUPPLY

On the pencil’s next product group change (1 July 2010), we will again apply all three SCD types.

PRODUCT_SK  PRODUCT_CODE  PRODUCT_NAME  PRODUCT_GROUP_CODE  PRODUCT_GROUP_NAME     EFFECTIVE_DATE  EXPIRY_DATE  CURRENT_PRODUCT_GROUP_CODE  CURRENT_PRODUCT_GROUP_NAME
1           11            PENCIL        1                   WRITING SUPPLY         1-Jan-09        31-Mar-10    5                           LEGACY SUPPLY
2           22            PEN           1                   WRITING SUPPLY         1-Jan-09        31-Dec-99    1                           WRITING SUPPLY
3           33            TONER         2                   PRINTING SUPPLY        1-Jan-09        31-Dec-99    2                           PRINTING SUPPLY
4           44            NOTEBOOK      4                   NON ELECTRONIC SUPPLY  1-Jan-09        31-Dec-99    4                           NON ELECTRONIC SUPPLY
5           11            PENCIL        4                   NON ELECTRONIC SUPPLY  1-Apr-10        30-Jun-10    5                           LEGACY SUPPLY
6           11            PENCIL        5                   LEGACY SUPPLY          1-Jul-10        31-Dec-99    5                           LEGACY SUPPLY

QUERY

Let’s next see how Type 6 in the Product dimension works against a sales fact. (Real sales fact data will involve other dimensions as well, so the fact table will have more surrogate key columns than just the product surrogate key.)

If our interest is in the current version, our SQL query will use the current product group column. An example SQL query will look like:

    SELECT current_product_group_name, SUM(sales_amt)
    FROM sales_fact s, product_dim p
    WHERE s.product_sk = p.product_sk
    AND product_name = 'PENCIL'
    GROUP BY current_product_group_name

With the dimension rows above, the output is a single row: every PENCIL sale is reported under its current product group, LEGACY SUPPLY.

The reason for applying SCD Type 2 is to have a complete history that tracks changes. SQL queries that take dimension history into account use the product group column:

    SELECT product_group_name, SUM(sales_amt)
    FROM sales_fact s, product_dim p
    WHERE s.product_sk = p.product_sk
    AND product_name = 'PENCIL'
    GROUP BY product_group_name

The output has one row for each product group PENCIL has belonged to (WRITING SUPPLY, NON ELECTRONIC SUPPLY, and LEGACY SUPPLY), provided sales were recorded in each period; each row totals the sales made while that group was in effect.

SUMMARY

This article discusses what SCD Type 6 is, when to apply it, and how it works. The name Type 6 comes from the ‘sum’ of the three basic SCD types (6 = 1 + 2 + 3).
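
The Type 6 steps walked through above can be sketched end to end with Python's built-in sqlite3 module. This is an illustration under stated assumptions, not the article's own implementation: dates are stored as ISO strings, 9999-12-31 stands in for the 31-Dec-99 "not expired" marker, the function names are hypothetical, and the sales amounts are invented purely so the two queries have something to sum.

```python
import sqlite3
from datetime import date, timedelta

OPEN_END = "9999-12-31"  # assumption: ISO stand-in for the 31-Dec-99 marker

COLUMNS = ("product_code, product_name, product_group_code, product_group_name, "
           "effective_date, expiry_date, "
           "current_product_group_code, current_product_group_name")
INSERT_SQL = f"INSERT INTO product_dim ({COLUMNS}) VALUES (?, ?, ?, ?, ?, ?, ?, ?)"


def create_product_dim(conn):
    """Create the dimension and load the article's four starting products."""
    # SQLite accepts untyped columns; product_sk is assigned automatically.
    conn.execute(f"CREATE TABLE product_dim (product_sk INTEGER PRIMARY KEY, {COLUMNS})")
    starting = [(11, "PENCIL", 1, "WRITING SUPPLY"), (22, "PEN", 1, "WRITING SUPPLY"),
                (33, "TONER", 2, "PRINTING SUPPLY"),
                (44, "NOTEBOOK", 4, "NON ELECTRONIC SUPPLY")]
    conn.executemany(INSERT_SQL, [
        (code, name, grp, grp_name, "2009-01-01", OPEN_END, grp, grp_name)
        for code, name, grp, grp_name in starting])


def apply_type6_change(conn, product_code, new_code, new_name, effective):
    """Apply one product group change; effective is an ISO date string."""
    last_day = (date.fromisoformat(effective) - timedelta(days=1)).isoformat()
    (product_name,) = conn.execute(
        "SELECT product_name FROM product_dim WHERE product_code = ? LIMIT 1",
        (product_code,)).fetchone()
    # Type 2, part 1: expire the row that is currently open.
    conn.execute(
        "UPDATE product_dim SET expiry_date = ? "
        "WHERE product_code = ? AND expiry_date = ?",
        (last_day, product_code, OPEN_END))
    # Type 1 on the Type 3 columns: overwrite the current group on every row.
    conn.execute(
        "UPDATE product_dim SET current_product_group_code = ?, "
        "current_product_group_name = ? WHERE product_code = ?",
        (new_code, new_name, product_code))
    # Type 2, part 2: insert the new current row.
    conn.execute(INSERT_SQL, (product_code, product_name, new_code, new_name,
                              effective, OPEN_END, new_code, new_name))


def sales_by_group(conn, group_column, product_name):
    """Total sales per group, by current or by historical product group."""
    if group_column not in ("current_product_group_name", "product_group_name"):
        raise ValueError(f"unexpected column: {group_column}")
    sql = (f"SELECT p.{group_column}, SUM(s.sales_amt) FROM sales_fact s "
           "JOIN product_dim p ON s.product_sk = p.product_sk "
           f"WHERE p.product_name = ? GROUP BY p.{group_column} "
           f"ORDER BY p.{group_column}")
    return conn.execute(sql, (product_name,)).fetchall()


def build_demo():
    """Replay the article's two PENCIL changes and add made-up sales rows."""
    conn = sqlite3.connect(":memory:")
    create_product_dim(conn)
    apply_type6_change(conn, 11, 4, "NON ELECTRONIC SUPPLY", "2010-04-01")
    apply_type6_change(conn, 11, 5, "LEGACY SUPPLY", "2010-07-01")
    conn.execute("CREATE TABLE sales_fact (product_sk INTEGER, sales_amt INTEGER)")
    # Hypothetical amounts, keyed by the surrogate key valid at sale time.
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?)",
                     [(1, 100), (5, 50), (6, 25), (2, 70)])
    return conn


if __name__ == "__main__":
    demo = build_demo()
    print(sales_by_group(demo, "current_product_group_name", "PENCIL"))
    print(sales_by_group(demo, "product_group_name", "PENCIL"))
```

Because the fact rows reference the surrogate key rather than the product code, the same fact table answers both questions: grouping by the current column rolls all pencil sales into one group, while grouping by the historical column splits them by the group in effect at the time of sale.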

Packt
29 Mar 2010
2 min read

Setting Up the iReport Pages

Configuring the page format

We can follow the listed steps for setting up report pages:

1. Open the report List of Products.
2. Go to menu Window | Report Inspector. The Report Inspector window will appear on the left side of the report designer.
3. Select the report List of Products, right-click on it, and choose Page Format….
4. The Page format… dialog box will appear. Select A4 from the Format drop-down list, and select Portrait from the Page orientation section.
5. You can modify the page margins if you need to, or leave them as they are to keep the default margins. For our report, you need not change the margins.
6. Press OK.

Page size

You have seen that there are many preset sizes/formats for the report, such as Custom, Letter, Note, Legal, A0 to A10, B0 to B5, and so on. Choose the appropriate one based on your requirements. We have chosen A4. If the number of columns is too high to fit in Portrait, then choose the Landscape orientation.

If you change the page size, the report elements (title, column headings, fields, or other elements) will not be repositioned automatically according to the new page size. You have to position each element manually, so be careful if you decide to change the page size.

Configuring properties

We can modify the default settings of report properties in the following way:

1. Right-click on List of Products and choose Properties.
2. We can configure many important report properties from the Properties window. You can see that there are many options here. You can change the Report name, Page size, Margins, Columns, and more.

We have already learnt about setting up pages, so now our concern is to learn about some of the other (More…) options.