Comparing Cursor and Set Approaches in Processing Relational Data

Exclusive offer: get 50% off this eBook here
Creating your MySQL Database: Practical Design Tips and Techniques

Creating your MySQL Database: Practical Design Tips and Techniques — Save 50%

A short guide for everyone on how to structure your data and set-up your MySQL database tables efficiently and easily.

$11.99    $6.00
by Djoni Darmawikarta | October 2008 | MySQL PHP

MySQL introduced cursor in its release 5. You create a cursor to define a set of rows from one or more table that you want to access row-by-row sequentially. If your need is really to access the rows one at a time then use cursor. But a lot of data processing is actually set processing, and set processing is what relational database is best for.

This article by Djoni Darmawikarta compares cursor approach and set approach using various examples.

To give you an idea about cursor, the following DECLARE statement creates a cursor (named zero_bal_crs), which gives you access to the rows of a payment table, rows that have 100 or less balances.

DECLARE zero_bal_crs CURSOR FOR
SELECT * FROM payment WHERE payment_bal < 100

You then fetch and process each of the rows sequentially; the process can be, for example, summing up rows by product group and loading the sums into a summary table. You develop the process outside of the cursor, for example, in a stored procedure. (You’ll see in the examples that the cursor-based process is procedural).

Instead of processing row-by-row sequentially, you can process relational data set-by-set, without a cursor. For example, to sum all payment rows with 100 or less balances and load it into a table (named payment_100orless_bal in the following SQL statement), you can use the following SQL statement. This single SQL statement completely processes the rows that meet the condition, all at once, as a set.

INSERT INTO payment_100orless_bal
SELECT SUM(payment_bal) FROM payment WHERE payment_bal < 100

The following three examples, which are among those I most frequently encountered, further compare cursor-based processing with set processing.

Example 1: Sequential Loop cf. One SQL Statement

Our process in the first example is to summarize sales transactions by product. Listing 1 shows the DDLs for creating the sales transaction table and sales product table that stores the summary.

Listing 1.1: DDLs of the Example 1 Tables

CREATE TABLE sales_transactions
( product_code INT
, sales_amount DEC(8,2)
, discount INT);

CREATE TABLE sales_product
(product_code INT
, sales_amount DEC(8,2));

Listing 1.2 shows a cursor-based stored procedure that implements the process. In this example, what the cursor (sales_crs) does is simply putting all rows in ascending order by their product codes. As you might have expected, this stored procedure applies loop with if-then-else programming construct to process the data row-by-row.

Listing 1.2 Cursor Procedural solution

DELIMITER $$

DROP PROCEDURE IF EXISTS ex1_cursor $$
USE sales $$
CREATE PROCEDURE ex1_cursor()
BEGIN

DECLARE p INT DEFAULT 0;
DECLARE s DECIMAL(8,2) DEFAULT 0;
DECLARE d INT DEFAULT 0;
DECLARE done INT DEFAULT 0;
DECLARE first_row INT DEFAULT 0;
DECLARE px INT DEFAULT 0;
DECLARE sx DECIMAL(8,2) DEFAULT 0;

DECLARE sales_crs CURSOR FOR
SELECT * FROM sales_transactions ORDER BY product_code;

DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

OPEN sales_crs;

REPEAT

FETCH sales_crs INTO p, s, d;

IF first_row = 0 THEN
SET first_row = 1;
SET px = p;
SET sx = s * (100 - d)/100;

ELSEIF NOT done THEN

IF p <> px THEN
INSERT INTO sales_product VALUES(px, sx);
SET sx = s * (100 - d)/100;
SET px = p;

ELSE
SET sx = sx + s * (100 - d)/100;

END IF;

ELSE INSERT INTO sales_product VALUES(px, sx);

END IF;

UNTIL done END REPEAT;

CLOSE sales_crs;

END $$

DELIMITER ;

The stored procedure in Listing 1.3 implements a set, processing to accomplish the same purpose as its foregoing cursor-based processing. You can see that this stored procedure is simpler than its cursor-based counterpart.

Listing 1.3 Set Operation solution

DELIMITER //

USE sales //

DROP PROCEDURE IF EXISTS ex1_sql //
CREATE PROCEDURE ex1_sql ()
BEGIN

INSERT INTO sales_product
SELECT product_code
, SUM(sales_amount * (100 - discount)/100)
FROM sales_transactions
GROUP BY product_code;

END //

DELIMITER ;

Example 2: Nested Cursor cf. Join

Our 2nd example is to consolidate order by customer. While example 1 has one table, example 2 has two: order and item. The DDLs of the two tables are shown in Listing 2.

To process two tables, our cursor implementation uses a nested loop (Listing 2.2) which makes it even more complex than example 1.

Listing 2.1: DDL’s of the Example 2 Tables

CREATE TABLE  order (
, order_number INT
, order_date DATE
, customer_number INT);

CREATE TABLE item (
, order_number INT
, product_code INT
, quantity INT
, unit_price DEC(8,2));

Listing 2.2: Nested Cursor

DELIMITER $$

DROP PROCEDURE IF EXISTS ex2_cursor $$
CREATE PROCEDURE ex2_cursor()
BEGIN

DECLARE so INT;
DECLARE sc INT;
DECLARE io INT;
DECLARE iq INT;
DECLARE iu DEC(10,2);

DECLARE done1 VARCHAR(5) DEFAULT 'START' ;
DECLARE done2 VARCHAR(5) DEFAULT 'START' ;

DECLARE sales_crs CURSOR FOR
SELECT customer_number, order_number
FROM sales_order
ORDER BY customer_number, order_number;

DECLARE order_crs CURSOR FOR
SELECT order_number, quantity, unit_price
FROM item
WHERE order_number = so
ORDER BY order_number;

DECLARE CONTINUE HANDLER FOR NOT FOUND SET done1 = 'END';

OPEN sales_crs;

WHILE done1 <> 'END' DO
FETCH sales_crs INTO sc, so;
IF done1 <> 'END' THEN

/* inner cursor */
OPEN oorder_crs;
SET done2 = done1;

WHILE done1 <> 'END' DO
FETCH order_crs INTO io, iq, iu;
IF done1 <> 'END' THEN
INSERT INTO customer_order VALUES (sc, iq * iu);

END IF;
END WHILE;

CLOSE order_crs;
SET done1 = done2;
END IF;
END WHILE;
CLOSE sales_crs;
END $$
DELIMITER ;

Listing 2.3 is the set solution. We apply join in this 2nd example which makes it just a single SQL statement. You can see that this set stored procedure is again simpler than its cursor-based equivalent.

 

 

Creating your MySQL Database: Practical Design Tips and Techniques A short guide for everyone on how to structure your data and set-up your MySQL database tables efficiently and easily.
Published: November 2006
eBook Price: $11.99
Book Price: $19.99
See more
Select your format and quantity:

Listing 2.3 Join Set Processing

DELIMITER $$

DROP PROCEDURE IF EXISTS ex2_sql$$
CREATE PROCEDURE ex2_sql()
BEGIN

INSERT INTO customer_order

SELECT customer_number, SUM(quantity * unit_price)
FROM sales_order o, item i
WHERE i.order_number = o.order_number
GROUP BY customer_number;

END $$

DELIMITER ;

Example 3: Join Cursor cf. Multi Sets

In the 3rd example we’d like to keep track of order payments. We update the payment once a month (30 days), and remove an order when its balance is fully paid (zero balance). We apply late charges (5%) to outstanding balances older than 90 days; and, to those that have outstanding balance older than 1 year, we suspend the customer (move them to a suspense table).

Listing 3.1: DDL’s of the Example 3 Tables

CREATE TABLE  order_payment (
order_number INT
, payment_amt DEC(10,2);

CREATE TABLE payment_balance (
order_number int(10)
, payment_balance DEC(10,2)
, order_amt DEC(10,2)
, balance_age INT;

The cursor in the stored procedure (Listing 3.2) uses join, but we process the rows sequentially, not as a set at once.

Listing 3.2: Join Cursor

DELIMITER $$

DROP PROCEDURE IF EXISTS ex3_cursor $$
CREATE PROCEDURE ex3_cursor()
BEGIN

DECLARE b DECIMAL(8,2) DEFAULT 0;
DECLARE a DECIMAL(8,2) DEFAULT 0;
DECLARE o INT DEFAULT 0;
DECLARE oa DEC(8,2);
DECLARE ba INT DEFAULT 0;
DECLARE done INT DEFAULT 0;
DECLARE first_row INT DEFAULT 0;
DECLARE bx DECIMAL(8,2) DEFAULT 0;

DECLARE pb_crs CURSOR FOR
SELECT o.order_number, payment_amt, payment_balance, order_amt,
balance_age

FROM order_payment o, payment_balance p
WHERE o.order_number = p.order_number;

DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = 1;

OPEN pb_crs;

WHILE done <> 1 DO

FETCH pb_crs INTO o, a, b, oa, ba;

IF done <> 1 THEN
SET bx = (b - a);
END IF;

IF done <> 1 THEN
IF bx = 0 THEN
DELETE FROM payment_balance
WHERE order_number = o;
ELSE
UPDATE payment_balance SET payment_balance = bx,
balance_age = ba + 30

WHERE payment_balance.order_number = o;

IF ((ba + 30) > 90 AND (ba + 30) <= 365) THEN
UPDATE payment_balance SET payment_balance =
payment_balance + payment_balance * 0.05
WHERE payment_balance.order_number = o;

ELSEIF (ba + 30) > 365 THEN
INSERT INTO suspense VALUES(o, bx, oa, (ba + 30));
DELETE FROM payment_balance
WHERE payment_balance.order_number = o;


END IF;
END IF;
END IF;

END WHILE;

CLOSE pb_crs;

END $$

DELIMITER ;

While the cursor defines only one set, our set processing requires four sets (a set defines the rows that qualify for the conditions of an SQL operation):

  1. Orders that have payments for the month – to apply the payments and update their ages
  2. Zero balance – to be removed from the payment balance table
  3. balance age > 90 days but <= 365 days – to be charged for late payment
  4. balance age > 365 – to suspend

Note that we need to apply two to SQL operations to the fourth set: add the orders to the suspense table and remove them from the payment balance table.

Listing 3.3 Multi Operations Set Processing

DELIMITER $$

DROP PROCEDURE IF EXISTS ex3_sql $$

BEGIN

/* Update */
UPDATE payment_balance p, order_payment o
SET p.payment_balance = p.payment_balance - o.payment_amt,
balance_age = balance_age + 30
WHERE o.order_number = p.order_number;

/* Delete */
DELETE payment_balance
FROM payment_balance WHERE payment_balance = 0;

UPDATE payment_balance
SET payment_balance = payment_balance + payment_balance * 0.05
WHERE balance_age > 90 AND balance_age <= 365;

INSERT INTO suspense
SELECT order_number, payment_balance, order_amt, balance_age
FROM payment_balance
WHERE balance_age > 365;

DELETE FROM payment_balance WHERE balance_age > 365;

END $$

DELIMITER ;

Speed Test

I tested both cursor-based and set stored procedures of all three examples with data ranges from a thousand to 100 thousand rows, and found out (as expected) that the cursor processing is 50 to 100 times slower than the set processing. You can likely be able to further speed up the set processing by applying relational database performance facilities, such as index.

Summary

Both cursor and set processing define set(s) of rows of the data to process; but, while cursor processes the rows sequentially, set processing takes its sets at once.

Set operation is generally simpler and faster than cursor-based processing. The challenge, particularly to those developers accustomed to procedural process, is the need to design their solutions in terms of set and set operation instead of the more familiar sequential procedural algorithm.

Creating your MySQL Database: Practical Design Tips and Techniques A short guide for everyone on how to structure your data and set-up your MySQL database tables efficiently and easily.
Published: November 2006
eBook Price: $11.99
Book Price: $19.99
See more
Select your format and quantity:

About the Author :


Djoni Darmawikarta built his career in IBM Asia Pacific and Canada as a software engineer, international consultant, instructor and project manager, for a total of 17 years. He's currently a technical specialist in the Data Warehousing and Business Intelligence team of a Toronto-based insurance company. Outside of his office works, Djoni writes IT articles and books.

Books From Packt

Mastering phpMyAdmin 2.11 for Effective MySQL Management
Mastering phpMyAdmin 2.11 for Effective MySQL Management

BSelling Online with Drupal e-Commerce
Selling Online with Drupal e-Commerce

Drupal 6 JavaScript and jQuery
Drupal 6 JavaScript and jQuery

Building Websites with Joomla! 1.5
Building Websites with Joomla! 1.5

WordPress Complete
WordPress Complete

Learning jQuery 1.3
Learning jQuery 1.3

Spring 2.5 Aspect Oriented Programming
Spring 2.5 Aspect Oriented Programming

Drupal 6 Themes
Drupal 6 Themes

 

Code Download and Errata
Packt Anytime, Anywhere
Register Books
Print Upgrades
eBook Downloads
Video Support
Contact Us
Awards Voting Nominations Previous Winners
Judges Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software
Resources
Open Source CMS Hall Of Fame CMS Most Promising Open Source Project Open Source E-Commerce Applications Open Source JavaScript Library Open Source Graphics Software