It's All About ORACLE

Oracle - the number one database management system. I hope this blog will teach you a lot about Oracle.

Exploring Oracle - ROWNUM

For each row returned by a query, the ROWNUM pseudocolumn returns a number indicating the order in which Oracle selects the row from a table or set of joined rows. The first row selected has a ROWNUM of 1, the second has 2, and so on.
You can use ROWNUM to limit the number of rows returned by a query, as in this example:

SELECT * FROM employees WHERE ROWNUM < 10;

If an ORDER BY clause follows ROWNUM in the same query, the rows are reordered by the ORDER BY clause after the ROWNUM values have been assigned. The results can vary depending on the way the rows are accessed. For example, if the ORDER BY clause causes Oracle to use an index to access the data, Oracle may retrieve the rows in a different order than without the index. Therefore, the following statement does not necessarily return the same rows as the preceding example:

SELECT * FROM employees WHERE ROWNUM < 11 ORDER BY last_name;

If you embed the ORDER BY clause in a subquery and place the ROWNUM condition in the top-level query, then you can force the ROWNUM condition to be applied after the ordering of the rows. For example, the following query returns the employees with the 10 smallest employee numbers. This is sometimes referred to as top-N reporting:
SELECT * FROM
   (SELECT * FROM employees ORDER BY employee_id)
   WHERE ROWNUM < 11;

In the preceding example, the ROWNUM values are those of the top-level SELECT statement, so they are generated after the rows have already been ordered by employee_id in the subquery.

Conditions testing for ROWNUM values greater than a positive integer are always false. For example, this query returns no rows:
SELECT * FROM employees
    WHERE ROWNUM > 1;

The first row fetched is assigned a ROWNUM of 1 and makes the condition false. The second row to be fetched is now the first row and is also assigned a ROWNUM of 1 and makes the condition false. All rows subsequently fail to satisfy the condition, so no rows are returned.
You can also use ROWNUM to assign unique values to each row of a table, as in this example:
UPDATE my_table
    SET column1 = ROWNUM;

Suppose I have a table with 5 rows:
SELECT * FROM EMP;
EMPID POSITION_NAME
5 SALES REP
4 SUPERVISOR
3 AREA MANAGER
2 SALES MANAGER
1 REGIONAL DIRECTOR

SELECT * FROM EMP WHERE ROWNUM < 4;
EMPID POSITION_NAME
5 SALES REP
4 SUPERVISOR
3 AREA MANAGER

SELECT * FROM EMP WHERE ROWNUM=1;
EMPID POSITION_NAME
5 SALES REP

SELECT * FROM EMP WHERE ROWNUM=2;
No Rows Returned
-- Testing ROWNUM for equality with any value other than 1 is always false.

SELECT * FROM EMP WHERE ROWNUM>1;
No Rows Returned

SELECT * FROM EMP WHERE ROWNUM>2;
No Rows Returned

-- Testing ROWNUM for any value greater than 1 is always false, as explained above.

ROWNUM conditions return rows only for ROWNUM = 1 or for less-than (or less-than-or-equal) comparisons.

When used in subqueries, ROWNUM lets us view a range of rows. It is like defining a window over the data.

SELECT * FROM (SELECT ROWNUM r1, t.* FROM emp t)
WHERE r1 > 1
AND r1 < 4;

R1  EMPID  POSITION_NAME
2   4      SUPERVISOR
3   3      AREA MANAGER
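Combining this window with an ordered subquery gives the usual Oracle pagination pattern. The sketch below reuses the EMP table from the earlier examples and orders by EMPID before numbering the rows, so the ROWNUM values are assigned to the sorted result:

```sql
-- A sketch: return "page" rows 2 and 3 of EMP, ordered by EMPID
SELECT empid, position_name
FROM  (SELECT ROWNUM r1, t.*
       FROM  (SELECT * FROM emp ORDER BY empid) t)
WHERE  r1 BETWEEN 2 AND 3;
```

The inner ORDER BY must sit in its own subquery; only then does the ROWNUM numbering reflect the sorted order.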

Subqueries and Their Use in Oracle 10g

Scalar Subqueries:

A scalar subquery returns exactly one column value from one row. Multiple-column subqueries, pairwise or non-pairwise, are not scalar. If a scalar subquery returns zero rows, the value of the subquery expression is NULL; if it returns more than one row, Oracle returns an error.

Valid Use of Scalar Subqueries:
  • The condition and expression parts of a DECODE function or CASE expression.
  • All clauses of a SELECT statement except GROUP BY.
  • The SET clause and the WHERE clause of an UPDATE statement.


e.g. 1
SELECT Employee_id, Last_name,
( CASE 
 WHEN department_id = 
( SELECT department_id FROM Departments
 WHERE location_id = 1800)
      THEN 'Canada' ELSE 'USA' END) LOCATION 
FROM employees;

e.g. 2
Sorting by a column that is not itself selected: DEPARTMENT_NAME on the DEPARTMENTS table.
SELECT department_id, last_name
FROM Employees e
ORDER BY 
( SELECT department_name
FROM departments d
WHERE e.DEPARTMENT_ID = d.Department_id);
 
e.g.3
A query that displays the EMPLOYEE_ID, DEPARTMENT_ID, and SALARY columns and also repeats the company-wide average salary in a column called COMPANY_AVG_SALARY:
SELECT employee_id, department_id, salary,
(SELECT AVG(salary) FROM employees) COMPANY_AVG_SALARY
FROM employees
ORDER BY department_id, salary, employee_id;

Correlated subquery:
A correlated subquery is one that references a column from a table referred to in the parent statement. Typically, a subquery executes before the main query; a correlated subquery, however, is evaluated once for each row processed by the parent statement. In other words, the inner query is driven by the outer query.

The parent statement can be a SELECT, UPDATE, or DELETE statement.

With correlated subqueries:
  • The outer query retrieves a candidate row.
  • The inner query executes using one or more values from the candidate row.
  • The output of the inner query qualifies or disqualifies the candidate row.
  • The process repeats until no rows are left.

e.g. 1: Find details of employees who earn more than the average salary for their department.
SELECT last_name, salary, department_id 
FROM employees outer 
WHERE salary >
 ( SELECT AVG(salary) 
   FROM employees
   WHERE department_id = outer.department_id);
e.g. 2: Display details for employees who have changed job at least twice. 
SELECT e.employee_id, e.last_name, e.job_id
FROM employees e
WHERE 2 <= 
( SELECT COUNT(*) 
FROM job_history
WHERE employee_id = e.Employee_id);

EXISTS:
In Subqueries we can also write EXISTS and NOT EXISTS operations. 
The EXISTS operator checks whether a value returned by the main query is present in the set of values returned by the subquery. It evaluates to TRUE if the subquery returns one or more rows; otherwise it evaluates to FALSE.

NOT EXISTS works in the opposite way. 

e.g.1 - Find employees who have at least one person reporting to them. 
SELECT employee_id, last_name, department_id
FROM employees outer
WHERE EXISTS 
(SELECT 'X' 
FROM Employees
WHERE manager_id = outer.employee_id);

Because the subquery is evaluated on whether it returns records, the X constant can be selected instead of an actual column.

e.g. 2 - Find Departments with no Employees:
SELECT department_id, department_name
FROM departments d
WHERE NOT EXISTS 
( SELECT 'X'
FROM employees 
WHERE department_id = d.department_id);

You have to be very careful when using the NOT IN construct as an alternative to the NOT EXISTS operator. That's because NOT IN evaluates to UNKNOWN (treated as FALSE) for every row if any value returned by the subquery is NULL. So the query returns no rows even if there are rows in the DEPARTMENTS table that satisfy the WHERE condition.

e.g.
SELECT department_id, department_name
FROM departments
WHERE department_id NOT IN ( SELECT department_id FROM EMPLOYEES);

No rows returned (the EMPLOYEES table contains rows in which DEPARTMENT_ID is NULL).
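If you still want to use NOT IN, one common workaround is to filter the NULLs out of the subquery so the comparison can never involve a NULL:

```sql
-- NOT IN is now safe: the subquery cannot return a NULL
SELECT department_id, department_name
FROM   departments d
WHERE  department_id NOT IN
       (SELECT department_id
        FROM   employees
        WHERE  department_id IS NOT NULL);
```

With the NULLs excluded, this behaves the same as the NOT EXISTS version above.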

WITH Clause:
The WITH clause, whose formal name is the subquery factoring clause, allows you to define a query block before putting it to use in a query.
The block's results are stored in your temporary tablespace. They are reusable, so you don't have to invoke the same code repeatedly to retrieve the same information.


Advantages of WITH clause:

  •  The WITH query_name clause lets you assign a name to a subquery block. You can then reference the subquery block in multiple places in the query by specifying the query name. Oracle optimizes the query by treating the query name as either an inline view or as a temporary table.
  •  You can specify this clause in any top-level SELECT statement and in most types of subqueries. The query name is visible to the main query and to all subsequent subqueries except the subquery that defines the query name itself.
  •  The WITH clause is best used when the result of the WITH query is required more than once in the body of the query, such as when one averaged value needs to be compared against two or three other values. It is available in Oracle 9i Release 2 and later.
  •  Using the WITH clause to create a block can improve performance. It is helpful in complex queries that reference the same block several times or that use joins and aggregations. Code containing a WITH clause is also easier to read.

Example: Provides detail of departments that have a higher total salary bill than the average department has.
WITH dept_costs AS (
SELECT d.department_name, SUM(e.salary) AS DEPT_TOT
FROM employees e, departments d
WHERE e.department_id = d.department_id
GROUP BY d.department_name),

  In this first WITH clause, you create a block called dept_costs. It stores the department name and total salary spend for each department. You get the department_name value from the DEPARTMENTS table.

AVG_COST as (
SELECT AVG(dept_tot) AS DEPT_AVG
FROM dept_costs)
  In the second WITH clause, you create a block called AVG_COST. This stores one value: the average of the total salaries across all departments.
  The average of the DEPT_TOT values in the DEPT_COSTS block (created in the previous WITH clause) is given the alias DEPT_AVG.

SELECT * 
FROM dept_costs
WHERE DEPT_TOT >
( SELECT DEPT_AVG
   FROM AVG_COST)
ORDER BY department_name;
 The main query returns a row for every department with a higher-than-average departmental total.
 It compares the total salary spend for each department, stored in the DEPT_TOT alias of the DEPT_COSTS block, with the average of all the totals, stored in the DEPT_AVG alias of the AVG_COST block.

When we execute the query, a row is returned for the two departments with a higher-than-average total salary spend.
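For reference, the two factored blocks and the main query above combine into a single statement:

```sql
WITH dept_costs AS (
  SELECT d.department_name, SUM(e.salary) AS dept_tot
  FROM   employees e, departments d
  WHERE  e.department_id = d.department_id
  GROUP  BY d.department_name),
avg_cost AS (
  SELECT AVG(dept_tot) AS dept_avg
  FROM   dept_costs)
SELECT *
FROM   dept_costs
WHERE  dept_tot > (SELECT dept_avg FROM avg_cost)
ORDER  BY department_name;
```

Note that dept_costs is defined once but referenced twice: directly in the main query and again inside the avg_cost block.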

Hierarchical Query in Oracle 10g database

A hierarchical query is a method of reporting the branches of a tree in a specific order. You use hierarchical queries to retrieve data based on a natural hierarchical relationship between rows in a table. 

Data is not stored in a hierarchical way; however, a process called tree walking allows you to construct the hierarchy.
A hierarchical query is possible when a relationship exists between rows in a single table.
e.g. MANAGER_ID column of employees table: Employee_id, Last_Name, Job_Id, Manager_ID

Syntax of hierarchical query:
SELECT [LEVEL], column
FROM table
[WHERE condition]
[START WITH condition]
[CONNECT BY PRIOR condition]

SELECT: the statement cannot contain a join, or query a view that contains a join.

LEVEL: for each row returned by a hierarchical query, the LEVEL pseudocolumn returns 1 for a root row, 2 for a child of a root, and so on.
When used with the SELECT and CONNECT BY clauses, the LEVEL pseudocolumn organizes rows from a database table into a tree structure.

FROM: specifies the table, view, or snapshot containing the columns. You can select from only one table.

START WITH: specifies the root rows of the hierarchy, where the query starts. START WITH can specify any valid condition and can contain subqueries. If a START WITH condition returns multiple rows, Oracle generates multiple hierarchies.

CONNECT BY: specifies the relationship between parent and child rows, with the parent side marked by the PRIOR operator. Using the CONNECT BY clause, you can create queries that report the branches of a tree from top to bottom or from bottom to top.
Once Oracle has identified the root rows, it uses the condition in the CONNECT BY clause to select the immediate child rows of each parent.

Oracle then selects the child rows of those child rows, returning successive generations of child rows until the query is complete.

In a hierarchical query, one expression in condition must be qualified with the PRIOR operator to refer to the parent rows.

The direction of the query depends on where you place the PRIOR operator, which refers to the parent row. These two PRIOR conditions are equivalent:
PRIOR expr1 = expr2 and expr2 = PRIOR expr1

e.g.1 - Create a top-down query that displays the management hierarchy of a company. 
SELECT last_name, job_id, employee_id, manager_id
FROM employees
START WITH employee_id = 100
CONNECT BY PRIOR employee_id = manager_id;

last_name  job_id    employee_id  manager_id
King       AD_PRES   100
Herry      MK_MAN    201          100
John       MK_REP    202          201

e.g. 2- Now create a bottom up query that displays a reverse management hierarchy. 
SELECT last_name, job_id, employee_id, manager_id
FROM employees
START WITH employee_id = 105
CONNECT BY PRIOR manager_id = employee_id;

last_name  job_id    employee_id  manager_id
Austin     IT_PROG   105          103
Hunold     IT_PROG   103          102
De Haan    AD_VP     102          100
King       AD_PRES   100

Making a readable hierarchy:
Suppose you want to make the hierarchy of the tree easy to read. You can use the LPAD function with LEVEL to indent child rows with a padding character.
SELECT LPAD(last_name, LENGTH(last_name) + (LEVEL*2)-2,'-') AS employee, job_id
FROM employees 
START WITH employee_id = 100
CONNECT BY PRIOR employee_id = manager_id;

The result is that the lower employees sit in the hierarchy, the further to the right their names display.

e.g. 3: Create a query that excludes all rows with FI_ACCOUNT in the JOB_ID column and prunes the entire branch below the employee with Popp in the LAST_NAME column. The WHERE clause removes individual rows from the result, whereas a condition in the CONNECT BY clause cuts off a whole branch of the tree.
SELECT last_name, job_id, employee_id, manager_id
FROM employees
WHERE job_id != 'FI_ACCOUNT'
START WITH manager_id IS NULL
CONNECT BY PRIOR employee_id = manager_id
AND  last_name != 'Popp';

Uses of hierarchical queries: typical applications include reporting organizational charts (as in the examples above), bill-of-materials or parts-explosion listings, and any other data with a natural parent-child relationship stored in a single table.


DML Error Logging

When you need to load millions of rows of data into a table, the most efficient way is usually to use an INSERT, UPDATE, or MERGE statement to process your data in bulk. Similarly, if you want to delete thousands of rows, using a DELETE statement is usually faster than using procedural code. But what if the data you intend to load contains values that might cause an integrity or check constraint to be violated, or what if some values are too big for the column they are to be loaded into?

You may well have loaded 999,999 rows into your table, but that last row, which violates a check constraint, causes the whole statement to fail and roll back. In situations such as this, you have to use an alternative approach to loading your data.

For example, if your data is held in a file, you can use SQL*Loader to automatically handle data that raises an error, but then you have to put together a control file, run SQL*Loader from the command line, and check the output file and the bad datafile to detect any errors.
If, however, your data is held in a table or another object, you can write a procedure or an anonymous block to process your data row by row, loading the valid rows and using exception handling to process those rows that raise an error. You might even use BULK COLLECT and FORALL to handle data in your PL/SQL routine more efficiently, but even with these improvements, handling your data in this manner is still much slower than performing a bulk load by using a direct-path INSERT DML statement.

Until now, you could take advantage of the set-based performance of INSERT, UPDATE, MERGE, and DELETE statements only if you knew that your data was free from errors; in all other circumstances, you needed to resort to slower alternatives. All of this changes with the release of Oracle Database 10g Release 2, which introduces a new SQL feature called DML error logging.

Efficient Error Handling

DML error logging enables you to write INSERT, UPDATE, MERGE, or DELETE statements that automatically deal with certain constraint violations. With this new feature, you use the new LOG ERRORS clause in your DML statement and Oracle Database automatically handles exceptions, writing erroneous data and details of the error message to an error logging table you've created.

Before you can use the LOG ERRORS clause, you need to create an error logging table, either manually with DDL or automatically with the CREATE_ERROR_LOG procedure in the DBMS_ERRLOG package, whose specification is shown in Listing 1.

Code Listing 1: DBMS_ERRLOG.CREATE_ERROR_LOG parameters 
DBMS_ERRLOG.CREATE_ERROR_LOG (
        dml_table_name                  IN VARCHAR2,
        err_log_table_name              IN VARCHAR2 := NULL,
        err_log_table_owner             IN VARCHAR2 := NULL,
        err_log_table_space             IN VARCHAR2 := NULL,
        skip_unsupported                IN BOOLEAN  := FALSE);


All the parameters except DML_TABLE_NAME are optional, and if the optional details are omitted, the name of the error logging table will be ERR$_ together with the first 25 characters of the DML_TABLE_NAME. The SKIP_UNSUPPORTED parameter, if set to TRUE, instructs the error logging clause to skip over LONG, LOB, and object type columns that are not supported and omit them from the error logging table.

Syntax:

With the error logging table created, you can add the error logging clause to most DML statements, using the following syntax: 
LOG ERRORS [INTO [schema.]table][ (simple_expression) ] [ REJECT LIMIT {integer|UNLIMITED} ]
The INTO clause is optional; if you omit it, the error logging clause will put errors into a table with the same name format used by the CREATE_ERROR_LOG procedure. SIMPLE_EXPRESSION is any expression that would evaluate to a character string and is used for tagging rows in the error table to indicate the process that caused the error, the time of the data load, and so on. REJECT LIMIT can be set to any integer or UNLIMITED and specifies the number of errors that can occur before the statement fails. This value is optional, but if it is omitted, the default value is 0, which effectively disables the error logging feature.
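As a sketch of all three optional parts together (the tag string and the limit of 50 here are made up for illustration; the table names follow the example later in this article):

```sql
-- Sketch only: 'NIGHTLY_LOAD' is an arbitrary tag; choose REJECT LIMIT to suit the job
INSERT INTO sales_target
SELECT *
FROM   sales_src
LOG ERRORS INTO err$_sales_target ('NIGHTLY_LOAD')
REJECT LIMIT 50;
```

Each rejected row lands in ERR$_SALES_TARGET with 'NIGHTLY_LOAD' in its ORA_ERR_TAG$ column, making it easy to query the errors for one particular load; if more than 50 rows fail, the whole statement errors out.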

The following types of errors are handled by the error logging clause: 
  • Column values that are too large
  • Constraint violations (NOT NULL, unique, referential, and check constraints), except in certain circumstances detailed below
  • Errors raised during trigger execution
  • Errors resulting from type conversion between a column in a subquery and the corresponding column of the table
  • Partition mapping errors
The following conditions cause the statement to fail and roll back without invoking the error logging capability: 
  • Violated deferred constraints
  • Out-of-space errors
  • Any direct-path INSERT operation (INSERT or MERGE) that raises a unique constraint or index violation
  • Any UPDATE operation (UPDATE or MERGE) that raises a unique constraint or index violation
To show how the error logging clause works in practice, consider the following scenario, in which data needs to be loaded in batch from one table to another:
You have heard of the new error logging feature in Oracle Database 10g Release 2 and want to compare this new approach with your previous method of writing a PL/SQL package. To do this, you will use data held in the SH sample schema to try out each approach.

Using DML Error Logging

In this example, you will use the data in the SALES table in the SH sample schema, together with values from a sequence, to create a source table for the error logging test. This example assumes that the test schema is called ERRLOG_TEST and that it has the SELECT object privilege for the SH.SALES table. Create the source data and a target table called SALES_TARGET, based on the definition of the SALES_SRC table, and add a check constraint to the AMOUNT_SOLD column to allow only values greater than 0. Listing 2 shows the DDL for creating the source and target tables.

Code Listing 2: Creating the SALES_SRC and SALES_TARGET tables 
SQL> CREATE SEQUENCE sales_id_seq;
Sequence created.

SQL> CREATE TABLE sales_src
  2    AS
  3    SELECT sales_id_seq.nextval AS "SALES_ID"
  4    ,         cust_id
  5    ,         prod_id
  6    ,         channel_id
  7    ,         time_id
  8    ,         promo_id
  9    ,         amount_sold
 10    ,        quantity_sold
 11   FROM   sh.sales
 12   ;
Table created.

SQL> SELECT count(*)
  2    ,         min(sales_id)
  3    ,         max(sales_id)
  4    FROM   sales_src
  5    ;

        COUNT(*)        MIN(SALES_ID)   MAX(SALES_ID)
        ------          --------        --------
        918843             1            918843

SQL> CREATE TABLE sales_target
  2    AS
  3    SELECT *
  4    FROM   sales_src
  5    WHERE 1=0
  6    ;
Table created.

SQL> ALTER TABLE sales_target
  2    ADD CONSTRAINT amount_sold_chk
  3    CHECK (amount_sold > 0)
  4    ENABLE
  5    VALIDATE
  6    ;
Table altered.

Note from the descriptions of the tables in Listing 2 that the SALES_TARGET and SALES_SRC tables have automatically inherited the NOT NULL constraints that were present on the SH.SALES table because you created these tables by using a CREATE TABLE ... AS SELECT statement that copies across these column properties when you are creating a table.
You now introduce some errors into your source data, so that you can subsequently test the error logging feature. Note that because one of the errors you want to test for is a NOT NULL constraint violation on the PROMO_ID column, you need to remove this constraint from the SALES_SRC table before adding null values. The following shows the SQL used to create the data errors. 
SQL> ALTER TABLE sales_src
  2    MODIFY promo_id NULL
  3    ;
Table altered.

SQL> UPDATE sales_src
  2    SET      promo_id = null
  3    WHERE  sales_id BETWEEN 5000 and 5005
  4    ;
6 rows updated.

SQL> UPDATE sales_src
  2    SET      amount_sold = 0
  3    WHERE  sales_id IN (1000,2000,3000)
  4    ;
3 rows updated.

SQL>  COMMIT;
Commit complete.

Now that your source and target tables are prepared, you can use the DBMS_ERRLOG.CREATE_ERROR_LOG procedure to create the error logging table. Supply the name of the table on which the error logging table is based; the procedure will use default values for the rest of the parameters. Listing 3 shows the creation and description of the error logging table.

Code Listing 3: Creating the err$_sales_target error logging table 

SQL> BEGIN
  2       DBMS_ERRLOG.CREATE_ERROR_LOG('SALES_TARGET');
  3    END;
  4    /
PL/SQL procedure successfully completed.

SQL> DESCRIBE err$_sales_target;
 Name                    Null?   Type
 -------------------     ----    ------------- 
 ORA_ERR_NUMBER$                 NUMBER
 ORA_ERR_MESG$                   VARCHAR2(2000)
 ORA_ERR_ROWID$                  ROWID
 ORA_ERR_OPTYP$                  VARCHAR2(2)
 ORA_ERR_TAG$                    VARCHAR2(2000)
 SALES_ID                        VARCHAR2(4000)
 CUST_ID                         VARCHAR2(4000)
 PROD_ID                         VARCHAR2(4000)
 CHANNEL_ID                      VARCHAR2(4000)
 TIME_ID                         VARCHAR2(4000)
 PROMO_ID                        VARCHAR2(4000)
 AMOUNT_SOLD                     VARCHAR2(4000)
 QUANTITY_SOLD                   VARCHAR2(4000)

Note that the CREATE_ERROR_LOG procedure creates five ORA_ERR_% columns, to hold the error number, error message, ROWID, operation type, and tag you will supply when using the error logging clause. Datatypes have been automatically chosen for the table columns that will allow you to store numbers and characters.

The first approach is to load data into the SALES_TARGET table by using a direct-path INSERT statement. This is normally the most efficient way to load data into a table while still making the DML recoverable, but in the past, this INSERT would have failed, because the check constraints on the SALES_TARGET table would have been violated. Listing 4 shows this INSERT and the check constraint violation.

Code Listing 4: Violating the check constraint with direct-path INSERT 
SQL> SET SERVEROUTPUT ON
SQL> SET LINESIZE 150
SQL> SET TIMING ON
SQL> ALTER SESSION SET SQL_TRACE = TRUE;

Session altered.
Elapsed: 00:00:00.04

SQL> INSERT  /*+ APPEND */
  2    INTO     sales_target
  3    SELECT  *
  4    FROM    sales_src
  5    ;
INSERT /*+ APPEND */
*
ERROR at line 1:
ORA-02290: check constraint (ERRLOG_TEST.AMOUNT_SOLD_CHK) violated

Elapsed: 00:00:00.15
If you add the new LOG ERRORS clause to the INSERT statement, however, the statement will complete successfully and save any rows that violate the table constraints to the error logging table, as shown in Listing 5.
Code Listing 5: Violating the constraints and logging the errors with LOG ERRORS 
SQL> INSERT  /*+ APPEND */
  2    INTO     sales_target
  3    SELECT  *
  4    FROM    sales_src
  5    LOG ERRORS
  6    REJECT LIMIT UNLIMITED
  7    ;
918834 rows created.
Elapsed: 00:00:05.75

SQL> SELECT count(*)
  2    FROM   err$_sales_target
  3    ;

COUNT(*)
---------   
 9

Elapsed: 00:00:00.06

SQL> COLUMN ora_err_mesg$ FORMAT A50
SQL> SELECT   ora_err_number$
  2    ,           ora_err_mesg$
  3    FROM     err$_sales_target
  4    ;

ORA_ERR_NUMBER$         ORA_ERR_MESG$
---------------         ------------------------------

        2290            ORA-02290: check constraint (ERRLOG_TEST.AMOUNT_
                        SOLD_CHK) violated

        2290            ORA-02290: check constraint (ERRLOG_TEST.AMOUNT_
                        SOLD_CHK) violated

        2290            ORA-02290: check constraint (ERRLOG_TEST.AMOUNT_
                        SOLD_CHK) violated

        1400            ORA-01400: cannot insert NULL into ("ERRLOG_TEST".
                        "SALES_TARGET"."PROMO_ID")

        1400            ORA-01400: cannot insert NULL into ("ERRLOG_TEST".
                               "SALES_TARGET"."PROMO_ID")

        1400            ORA-01400: cannot insert NULL into ("ERRLOG_TEST".
                        "SALES_TARGET"."PROMO_ID")

        1400            ORA-01400: cannot insert NULL into ("ERRLOG_TEST".
                        "SALES_TARGET"."PROMO_ID")

        1400            ORA-01400: cannot insert NULL into ("ERRLOG_TEST".
                        "SALES_TARGET"."PROMO_ID")

        1400            ORA-01400: cannot insert NULL into ("ERRLOG_TEST".
                        "SALES_TARGET"."PROMO_ID")

9 rows selected.

Elapsed: 00:00:00.28
Listing 5 shows that when this INSERT statement uses direct path to insert rows above the table high-water mark, the process takes 5.75 seconds and adds nine rows to the error logging table. Try the same statement again, this time with a conventional-path INSERT, as shown in Listing 6.
Code Listing 6: Violating the check and NOT NULL constraints with conventional-path INSERT 
SQL> TRUNCATE TABLE sales_target;

Table truncated.

Elapsed: 00:00:06.07

SQL> TRUNCATE TABLE err$_sales_target;

Table truncated.

Elapsed: 00:00:00.25

SQL> INSERT INTO sales_target
  2  SELECT *
  3  FROM   sales_src
  4  LOG ERRORS
  5  REJECT LIMIT UNLIMITED
  6  ;

918834 rows created.

Elapsed: 00:00:30.65
As you might expect, the results in Listing 6 show that the direct-path load is much faster than the conventional-path load, because the former writes directly to disk whereas the latter writes to the buffer cache. The LOG ERRORS clause also causes kernel device table (KDT) buffering to be disabled when you're performing a conventional-path INSERT. One reason you might want to nevertheless use a conventional-path INSERT with error logging is that direct-path loads will fail when a unique constraint or index violation occurs, whereas a conventional-path load will log these errors to the error logging table and then continue. Oracle Database will also ignore the /*+ APPEND */ hint when the table you are inserting into contains foreign key constraints, because you cannot have these enabled when working in direct-path mode.
Now compare these direct- and conventional-path loading timings with the timing for using a PL/SQL anonymous block. You know that the traditional way of declaring a cursor against the source table—reading it row by row, inserting the contents into the target table, and dealing with exceptions as they occur—will be slow, but the column by Tom Kyte in the September/October 2003 issue of Oracle Magazine ("On HTML DB, Bulking Up, and Speeding") shows how BULK COLLECT, FORALL, and SAVE EXCEPTIONS could be used to process dirty data in a more efficient manner. How does Kyte's 2003 approach compare with using DML error logging? A version of Kyte's approach that, like the LOG ERRORS clause, writes error messages to an error logging table is shown in Listing 7.
Code Listing 7: PL/SQL anonymous block doing row-by-row INSERT  
SQL> CREATE TABLE sales_target_errors
  2  (sql_err_mesg varchar2(4000))
  3  /

Table created.
Elapsed: 00:00:00.28
SQL>  DECLARE
  2        TYPE array IS TABLE OF sales_target%ROWTYPE
  3           INDEX BY BINARY_INTEGER;
  4        sales_src_arr   ARRAY;
  5        errors          NUMBER;
  6        error_mesg     VARCHAR2(255);
  7        bulk_error      EXCEPTION;
  8        l_cnt           NUMBER := 0;
  9        PRAGMA exception_init
 10              (bulk_error, -24381);
 11        CURSOR c IS 
 12           SELECT * 
 13           FROM   sales_src;
 14        BEGIN
 15        OPEN c;
 16        LOOP
 17          FETCH c 
 18             BULK COLLECT 
 19             INTO sales_src_arr 
 20             LIMIT 100;
 21          BEGIN
 22             FORALL i IN 1 .. sales_src_arr.count 
 23                      SAVE EXCEPTIONS
 24               INSERT INTO sales_target VALUES sales_src_arr(i);
 25          EXCEPTION
 26          WHEN bulk_error THEN
 27            errors := 
 28               SQL%BULK_EXCEPTIONS.COUNT;
 29            l_cnt := l_cnt + errors;
 30            FOR i IN 1..errors LOOP
 31              error_mesg := SQLERRM(-SQL%BULK_EXCEPTIONS(i).ERROR_CODE);
 32              INSERT INTO sales_target_errors 
 33              VALUES     (error_mesg);
 34       END LOOP;
 35          END;
 36          EXIT WHEN c%NOTFOUND;
 37      
 38       END LOOP;
 39       CLOSE c;
 40       DBMS_OUTPUT.PUT_LINE
 41        ( l_cnt || ' total errors' );
 42       END;
 43  /
9 total errors

PL/SQL procedure successfully completed.

Elapsed: 00:00:10.46
SQL> alter session set sql_trace = false;

Session altered.

Elapsed: 00:00:00.03
SQL> select * from sales_target_errors;

SQL_ERR_MESG

---------------------------------
ORA-02290: check constraint (.) violated
ORA-02290: check constraint (.) violated
ORA-02290: check constraint (.) violated
ORA-01400: cannot insert NULL into ()
ORA-01400: cannot insert NULL into ()
ORA-01400: cannot insert NULL into ()
ORA-01400: cannot insert NULL into ()
ORA-01400: cannot insert NULL into ()
ORA-01400: cannot insert NULL into ()

9 rows selected.

Elapsed: 00:00:00.21
Processing your data with this method takes 10.46 seconds, longer than the 5.75 seconds when using DML error logging and a direct-path INSERT but quicker than using a conventional-path INSERT. The results are conclusive: If you use DML error logging and you can insert your data with direct path, your batches can load an order of magnitude faster than if you processed your data row by row, using PL/SQL, even if you take advantage of features such as BULK COLLECT, FORALL, and SAVE EXCEPTIONS.
Finally, use TKPROF to format the SQL trace file you generated during your testing and check the explain plan and statistics for the direct-path insertion, shown in Listing 8. Note that the insertions into the error logging table are carried out after the INSERT has taken place and that these rows will stay in the error logging table even if the main statement fails and rolls back.
Code Listing 8: Using TKPROF to look at direct-path INSERT statistics 
INSERT /*+ APPEND */
INTO   sales_target
SELECT *
FROM   sales_src
LOG ERRORS
REJECT LIMIT UNLIMITED

call      count    cpu   elapsed   disk   query   current     rows
-------   -----   ----   -------   ----   -----   -------   ------
Parse         1   0.01      0.10      0       0         0        0
Execute       1   2.84      5.52   3460    5226      6659   918834
Fetch         0   0.00      0.00      0       0         0        0
-------   -----   ----   -------   ----   -----   -------   ------
total         2   2.85      5.62   3460    5226      6659   918834

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 99  
Rows     Row Source Operation
-------  ---------------------------------------------------
      1  LOAD AS SELECT  (cr=5907 pr=3462 pw=5066 time=5539104 us)
 918843   ERROR LOGGING  (cr=5094 pr=3460 pw=0 time=92811603 us)
 918843   TABLE ACCESS FULL SALES_SRC (cr=5075 pr=3458 pw=0 time=16547710 us)

***************************************************************************
INSERT INTO ERR$_SALES_TARGET (ORA_ERR_NUMBER$, ORA_ERR_MESG$, 
ORA_ERR_ROWID$,   ORA_ERR_OPTYP$, ORA_ERR_TAG$, SALES_ID, PROD_ID, 
CUST_ID, CHANNEL_ID, TIME_ID, PROMO_ID, AMOUNT_SOLD, QUANTITY_SOLD) 
VALUES
 (:1, :2, :3, :4, :5, :6, :7, :8, :9, :10, :11, :12, :13)
call     count  cpu    elapsed  disk   query  current   rows
-------  -----  -----  -------  -----  -----  -------  -----
Parse        1   0.00     0.00      0      0        0      0
Execute      9   0.00     0.01      2      4       39      9
Fetch        0   0.00     0.00      0      0        0      0
-------  -----  -----  -------  -----  -----  -------  -----
total       10   0.00     0.01      2      4       39      9
Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 99     (recursive depth: 1)
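For reference, a trace like the one formatted above can be captured and fed to TKPROF along these lines. This is a sketch: the tracefile identifier and the trace file name shown in the comment are hypothetical, and the actual file lands in the directory pointed to by USER_DUMP_DEST (or the diagnostic trace directory), with a name that varies by instance and version.

```sql
-- In the test session, before running the loads:
ALTER SESSION SET tracefile_identifier = 'err_logging_test';
ALTER SESSION SET sql_trace = TRUE;

-- ... run the direct-path INSERT and the PL/SQL load here ...

ALTER SESSION SET sql_trace = FALSE;

-- Then, at the operating system prompt, format the trace file
-- (the file name below is made up; find yours in USER_DUMP_DEST):
--
--   tkprof orcl_ora_12345_err_logging_test.trc listing8.txt
```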

Next, locate the part of the formatted trace file that represents the PL/SQL approach and note how the execution of the anonymous block is split into four parts: (1) the anonymous block is parsed, (2) the source data is bulk-collected into an array, (3) the array is unloaded into the target table, and (4) the exceptions are written to the error logging table. Listing 9 shows that, together, these steps take more than twice as long to execute as a direct-path INSERT statement with DML error logging yet involve more coding and store less information about the rows that returned errors.
Code Listing 9: Using TKPROF to look at PL/SQL INSERT statistics 
DECLARE
   TYPE array IS TABLE OF sales_target%ROWTYPE
      INDEX BY BINARY_INTEGER;
   sales_src_arr  array;
   errors         NUMBER;
   error_mesg     VARCHAR2(255);
   bulk_error     EXCEPTION;
   l_cnt          NUMBER := 0;
   PRAGMA exception_init (bulk_error, -24381);
   CURSOR c IS SELECT * FROM sales_src;
BEGIN
   OPEN c;
   LOOP
      FETCH c BULK COLLECT INTO sales_src_arr LIMIT 100;
      BEGIN
         FORALL i IN 1 .. sales_src_arr.count SAVE EXCEPTIONS
            INSERT INTO sales_target VALUES sales_src_arr(i);
      EXCEPTION
         WHEN bulk_error THEN
            errors := SQL%BULK_EXCEPTIONS.COUNT;
            l_cnt  := l_cnt + errors;
            FOR i IN 1 .. errors LOOP
               error_mesg := SQLERRM(-SQL%BULK_EXCEPTIONS(i).ERROR_CODE);
               INSERT INTO sales_target_errors VALUES (error_mesg);
            END LOOP;
      END;
      EXIT WHEN c%NOTFOUND;
   END LOOP;
   CLOSE c;
   DBMS_OUTPUT.PUT_LINE ( l_cnt || ' total errors' );
END;

call     count  cpu    elapsed  disk   query  current   rows
-------  -----  -----  -------  -----  -----  -------  -----
Parse        1   0.03     0.02      0      0        0      0
Execute      1   1.14     2.71      0      0        0      1
Fetch        0   0.00     0.00      0      0        0      0
-------  -----  -----  -------  -----  -----  -------  -----
total        2   1.17     2.73      0      0        0      1

Misses in library cache during parse: 1
Optimizer mode: ALL_ROWS
Parsing user id: 99

********************************************************************
SELECT * FROM SALES_SRC

call     count  cpu    elapsed  disk   query  current    rows
-------  -----  -----  -------  -----  -----  -------  ------
Parse        1   0.00     0.00      0      0        0       0
Execute      1   0.00     0.00      0      0        0       0
Fetch     9189   3.60     3.23      0  14219        0  918843
-------  -----  -----  -------  -----  -----  -------  ------
total     9191   3.60     3.23      0  14219        0  918843

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 99     (recursive depth: 1)

Rows     Row Source Operation
-------  ---------------------------------------------------
 918843  TABLE ACCESS FULL SALES_SRC (cr=14219 pr=0 pw=0 time=33083496 us)

**************************************************************************
INSERT INTO SALES_TARGET VALUES (:B1 ,:B2 ,:B3 ,:B4 ,:B5 ,:B6 ,:B7 ,:B8 )

call     count  cpu    elapsed  disk   query  current    rows
-------  -----  -----  -------  -----  -----  -------  ------
Parse        1   0.00     0.00      0      0        0       0
Execute   9189   4.39     4.30      2   6886    54411  918834
Fetch        0   0.00     0.00      0      0        0       0
-------  -----  -----  -------  -----  -----  -------  ------
total     9190   4.39     4.30      2   6886    54411  918834

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 99     (recursive depth: 1)

************************************************************************
INSERT INTO SALES_TARGET_ERRORS VALUES (:B1 )

call     count  cpu    elapsed  disk   query  current   rows
-------  -----  -----  -------  -----  -----  -------  -----
Parse        1   0.00     0.00      0      0        0      0
Execute      9   0.00     0.01      2      4       30      9
Fetch        0   0.00     0.00      0      0        0      0
-------  -----  -----  -------  -----  -----  -------  -----
total       10   0.00     0.01      2      4       30      9

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 99     (recursive depth: 1)

Conclusion

In the past, if you wanted to load data into a table and gracefully deal with constraint violations or other DML errors, you either had to use a utility such as SQL*Loader or write a PL/SQL procedure that processed each row on a row-by-row basis. The DML error logging feature introduced in Oracle Database 10g Release 2 lets you add a LOG ERRORS clause to most DML statements, allowing the operation to continue past failing rows by writing them to an error logging table. By using DML error logging, you can load your batches faster, have errors handled automatically, and do away with custom-written error handling routines in your data loading process.
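Putting the pieces together, the minimal setup looks roughly like this. The table names follow the examples above; the error tag 'SALES_LOAD' is a made-up label, and DBMS_ERRLOG.CREATE_ERROR_LOG is called with its defaults, which create an error logging table named ERR$_SALES_TARGET in the current schema.

```sql
-- Create the error logging table for SALES_TARGET
-- (defaults produce ERR$_SALES_TARGET in the current schema):
BEGIN
   DBMS_ERRLOG.CREATE_ERROR_LOG(dml_table_name => 'SALES_TARGET');
END;
/

-- Direct-path load; rows that violate constraints go to the
-- error logging table instead of aborting the whole statement:
INSERT /*+ APPEND */ INTO sales_target
SELECT *
FROM   sales_src
LOG ERRORS INTO err$_sales_target ('SALES_LOAD')
REJECT LIMIT UNLIMITED;

-- Inspect the failed rows, their error messages, and the tag:
SELECT ora_err_number$, ora_err_mesg$, ora_err_tag$
FROM   err$_sales_target;
```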
