It's All About ORACLE

Oracle - the number one database management system. Hopefully this blog will teach you a lot about Oracle.

Virtual Columns in Oracle Database 11g Release 1

When queried, virtual columns appear to be normal table columns, but their values are derived rather than being stored on disc. The syntax for defining a virtual column is listed below.
column_name [datatype] [GENERATED ALWAYS] AS (expression) [VIRTUAL]
If the datatype is omitted, it is determined based on the result of the expression. The GENERATED ALWAYS and VIRTUAL keywords are provided for clarity only.
The script below creates and populates an employees table with two levels of commission. It includes two virtual columns to display the commission-based salary. The first uses the most abbreviated syntax while the second uses the most verbose form.
CREATE TABLE employees (
  id          NUMBER,
  first_name  VARCHAR2(10),
  last_name   VARCHAR2(10),
  salary      NUMBER(9,2),
  comm1       NUMBER(3),
  comm2       NUMBER(3),
  salary1     AS (ROUND(salary*(1+comm1/100),2)),
  salary2     NUMBER GENERATED ALWAYS AS (ROUND(salary*(1+comm2/100),2)) VIRTUAL,
  CONSTRAINT employees_pk PRIMARY KEY (id)
);

INSERT INTO employees (id, first_name, last_name, salary, comm1, comm2)
VALUES (1, 'JOHN', 'DOE', 100, 5, 10);

INSERT INTO employees (id, first_name, last_name, salary, comm1, comm2)
VALUES (2, 'JAYNE', 'DOE', 200, 10, 20);
COMMIT;
Querying the table shows the inserted data plus the derived commission-based salaries.
SELECT * FROM employees;

        ID FIRST_NAME LAST_NAME      SALARY      COMM1      COMM2    SALARY1    SALARY2
---------- ---------- ---------- ---------- ---------- ---------- ---------- ----------
         1 JOHN       DOE               100          5         10        105        110
         2 JAYNE      DOE               200         10         20        220        240

2 rows selected.
The expression used to generate the virtual column is listed in the DATA_DEFAULT column of the [DBA|ALL|USER]_TAB_COLUMNS views.
COLUMN data_default FORMAT A50
SELECT column_name, data_default
FROM   user_tab_columns
WHERE  table_name = 'EMPLOYEES';

COLUMN_NAME                    DATA_DEFAULT
------------------------------ --------------------------------------------------
ID
FIRST_NAME
LAST_NAME
SALARY
COMM1
COMM2
SALARY1                        ROUND("SALARY"*(1+"COMM1"/100),2)
SALARY2                        ROUND("SALARY"*(1+"COMM2"/100),2)

8 rows selected.

SQL>
Notes and restrictions on virtual columns include:
  • Indexes defined against virtual columns are equivalent to function-based indexes.
  • Virtual columns can be referenced in the WHERE clause of updates and deletes, but they cannot be manipulated by DML.
  • Tables containing virtual columns can still be eligible for result caching.
  • Functions in expressions must be deterministic at the time of table creation, but can subsequently be recompiled and made non-deterministic without invalidating the virtual column. In such cases the following steps must be taken after the function is recompiled:
    • Constraint on the virtual column must be disabled and re-enabled.
    • Indexes on the virtual column must be rebuilt.
    • Materialized views that access the virtual column must be fully refreshed.
    • The result cache must be flushed if cached queries have accessed the virtual column.
    • Table statistics must be regathered.
  • Virtual columns are not supported for index-organized, external, object, cluster, or temporary tables.
  • The expression used in the virtual column definition has the following restrictions:
    • It cannot refer to another virtual column by name.
    • It can only refer to columns defined in the same table.
    • If it refers to a deterministic user-defined function, it cannot be used as a partitioning key column.
    • The output of the expression must be a scalar value. It cannot return an Oracle supplied datatype, a user-defined type, or LOB or LONG RAW.
Working with Virtual Columns:

Creating a Virtual Column:
We will begin by creating a simple table with a single virtual column, as follows.
SQL> CREATE TABLE t
  2  ( n1 INT
  3  , n2 INT
  4  , n3 INT GENERATED ALWAYS AS (n1 + n2) VIRTUAL
  5  );

Table created.

We can see that the virtual column is generated from a simple expression involving the other columns in our table. Note that the VIRTUAL keyword is optional and is included for what Oracle calls "syntactic clarity".

Virtual column values are not stored on disk. They are generated at runtime using their associated expression (in our example, N1 + N2). This has some implications for the way we insert data into tables with virtual columns, as we can see below.
SQL> INSERT INTO t VALUES (10, 20, 30);
INSERT INTO t VALUES (10, 20, 30)
            *
ERROR at line 1:
ORA-54013: INSERT operation disallowed on virtual columns

We cannot explicitly add data to virtual columns, so we will attempt an insert into the physical columns only, as follows.
SQL> INSERT INTO t VALUES (10, 20);
INSERT INTO t VALUES (10, 20)
            *
ERROR at line 1:
ORA-00947: not enough values
Despite the fact that we cannot insert or update virtual columns, they are still considered part of the table's column list. This means, therefore, that we must explicitly reference the physical columns in our insert statements, as follows.
SQL> INSERT INTO t (n1, n2) VALUES (10, 20);

1 row created.
Of course, fully-qualified inserts such as our example above are best practice so this should be a trivial restriction for most developers. Now we have data in our example table, we can query our virtual column, as follows.
SQL> SELECT * FROM t;

        N1         N2         N3
---------- ---------- ----------
        10         20         30

1 row selected.

Our expression is evaluated at runtime and gives the output we see above.

Indexes and Constraints

Virtual columns are valid for indexes and constraints. Indexes on virtual columns are essentially function-based indexes (this is covered in more detail later in this article). The results of the virtual column's expression are stored in the index. In the following example, we will create a primary key constraint on the N3 virtual column.
SQL> CREATE UNIQUE INDEX t_pk ON t(n3);

Index created.

SQL> ALTER TABLE t ADD
  2    CONSTRAINT t_pk
  3    PRIMARY KEY (n3)
  4    USING INDEX;

Table altered.

If we try to insert data that results in a duplicate virtual column value, we should expect a unique constraint violation, as follows.

SQL> INSERT INTO t (n1, n2) VALUES (10, 20);
INSERT INTO t (n1, n2) VALUES (10, 20)
*
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.T_PK) violated

As expected, this generates an ORA-00001 exception. It follows, therefore, that if we can create a primary key on a virtual column, then we can reference it from a foreign key constraint. In the following example, we will create a child table with a foreign key to the T.N3 virtual column.

SQL> CREATE TABLE t_child
  2  ( n3 INT
  3  , CONSTRAINT tc_fk
  4      FOREIGN KEY (n3)
  5      REFERENCES t(n3)
  6  );

Table created.

We will now insert some valid and invalid data, as follows.

SQL> INSERT INTO t_child VALUES (30);

1 row created.

SQL> INSERT INTO t_child VALUES (40);
INSERT INTO t_child VALUES (40)
*
ERROR at line 1:
ORA-02291: integrity constraint (SCOTT.TC_FK) violated - parent key not found
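Because an index on a virtual column is implemented as a function-based index, the data dictionary reports it as such. A quick way to confirm this is to query USER_INDEXES (a small sketch; for an index on a virtual column the INDEX_TYPE column typically reports FUNCTION-BASED NORMAL):

SELECT index_name, index_type
FROM   user_indexes
WHERE  table_name = 'T';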

Adding a Virtual Column

Virtual columns can be added after table creation with an ALTER TABLE statement. In the following example, we will add a new virtual column to our existing table. We will include a check constraint for demonstration purposes.
SQL> ALTER TABLE t ADD
  2     n4 GENERATED ALWAYS AS (n1 * n2)
  3        CHECK (n4 >= 10);

Table altered.
As stated earlier, an index on a virtual column will store the results of the expression. A check constraint, however, evaluates the expression whenever data is added to or modified in the underlying table. This is to be expected, because there is no storage structure (i.e. index) associated with a check constraint.
Our new virtual column, N4, has a check constraint to ensure that the product of the N1 and N2 columns is at least 10. We will test this, as follows.
SQL> INSERT INTO t (n1, n2) VALUES (1, 2);
INSERT INTO t (n1, n2) VALUES (1, 2)
*
ERROR at line 1:
ORA-02290: check constraint (SCOTT.SYS_C0010001) violated
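A row whose N1 * N2 product is at least 10 passes the check (continuing the same example; the values below are arbitrary).

-- N4 evaluates to 3 * 5 = 15, which satisfies the check constraint,
-- and N3 evaluates to 8, which does not clash with the existing primary key value of 30.
INSERT INTO t (n1, n2) VALUES (3, 5);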


Oracle Database - SQL Plus - Substitution Variables

About
You can define variables, called substitution variables, for repeated use in a single script. Note that you can also define substitution variables to use in titles and to save your keystrokes (by defining a long string as the value for a variable with a short name).

A substitution variable is preceded by one or two ampersands (&).

How to Create a Substitution Variable
Temporary substitution variable:
When SQL*Plus finds a substitution variable referenced with only one ampersand (&), it tries to replace it with the value of a previously defined permanent substitution variable; otherwise, it prompts you to enter a value that is used only once.

Permanent substitution variable:
A permanent substitution variable is available for the complete session.

How to set up a substitution variable?
To define a substitution variable, you can use:

-> DEFINE command

-> Two Ampersands

-> ACCEPT command
-> COLUMN NEW_VALUE


DEFINE command:

         DEFINE L_NAME = SMITH
Note that any substitution variable you define explicitly through DEFINE takes only CHAR values (that is, the value you assign to the variable is always treated as a CHAR datatype). You can define a substitution variable of datatype NUMBER implicitly through the ACCEPT command.


Two ampersands:

SQL*Plus automatically DEFINEs any substitution variable preceded by two ampersands, but does not DEFINE those preceded by only one ampersand.
Oradata@orcl>select &&MySubstitutionVariable FROM dual;

Enter value for mysubstitutionvariable: 'Value'
old   1: SELECT &&MySubstitutionVariable FROM dual
new   1: SELECT 'Value' FROM dual

'VALUE'

-------
VALUE

Oradata@orcl>DEFINE MySubstitutionVariable

DEFINE MYSUBSTITUTIONVARIABLE = "'Value'" (CHAR)


NOTE: Any character string value supplied for a substitution variable must be enclosed in single quotes; otherwise the value you enter is treated as an identifier, as the following example shows.
Enter value for counter: dsfs
old   5: v_count := &counter;
new   5: v_count := dsfs;

ERROR at line 5:
ORA-06550: line 5, column 12:
PLS-00201: identifier 'DSFS' must be declared
ORA-06550: line 5, column 1:
PL/SQL: Statement ignored
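If the reference itself is enclosed in single quotes inside the script, whatever you type at the prompt is substituted into a character literal rather than being parsed as an identifier (a minimal sketch; the variable name some_text is illustrative):

SET SERVEROUTPUT ON
DECLARE
  v_char VARCHAR2(30);
BEGIN
  v_char := '&some_text';  -- quoted reference: the input becomes a string literal
  DBMS_OUTPUT.PUT_LINE('You entered: ' || v_char);
END;
/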

ACCEPT command:
ACCEPT pswd CHAR PROMPT 'Password: ' HIDE 

For a number
ACCEPT salary NUMBER FORMAT '999.99' DEFAULT '000.0' -
PROMPT 'Enter weekly salary: '

PROMPT You have entered: &salary

If we do not enter the salary in the defined format, SQL*Plus keeps prompting until a value in the right format is supplied:
SQL> ACCEPT salary NUMBER FORMAT '999.99' DEFAULT '000.0'
4553.223
SP2-0598: "4553.223" does not match input format "999.99"
444.33


COLUMN NEW_VALUE

You can store a column value in a substitution variable using this statement:
COLUMN column_name NEW_VALUE variable_name

Example:

define SpoolFileId=idle
column ReleaseId NEW_VALUE SpoolFileId 

select TO_CHAR(SYSDATE,'YYYYMMDD_HH24MISS') || '_Release.log' ReleaseId from dual; 
PROMPT The value of SpoolFileId is now: &SpoolFileId


then to define a log file, you can use this statement:

SPOOL '&SpoolFileId'


How to delete a substitution variable?
To delete a substitution variable, use the SQL*Plus command UNDEFINE followed by the variable name.
UNDEFINE MySubstitutionVariable

How to list?


All substitution variables:
To list all substitution variable definitions, enter DEFINE by itself.


SQL> DEFINE

DEFINE _DATE = "14-JUN-10" (CHAR)
DEFINE _CONNECT_IDENTIFIER = "bidb" (CHAR)
DEFINE _USER = "DWH" (CHAR)
DEFINE _PRIVILEGE = "" (CHAR)
DEFINE _SQLPLUS_RELEASE = "1002000300" (CHAR)
DEFINE _EDITOR = "Notepad" (CHAR)
DEFINE _O_VERSION = "Oracle Database 10g Enterprise Edition Release 10.2.0.
4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options" (
CHAR)
DEFINE _O_RELEASE = "1002000400" (CHAR)
DEFINE MY_SUBSTITUTION_VARIABLE = "1" (CHAR)


One substitution variable

PROMPT The value of MySubstitutionVariable is: &MySubstitutionVariable
or
DEFINE MySubstitutionVariable

Parameters:


SET DEFINE:
SET DEFINE defines the substitution character (by default the ampersand "&") and turns substitution on and off.


SET DEFINE OFF:

Turns off substitution variable processing (this also works in SQL Developer).
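This is useful, for example, when inserting literals that contain an ampersand, so SQL*Plus does not prompt for a value (a small sketch; the table and value are illustrative):

SET DEFINE OFF
INSERT INTO companies (company_name) VALUES ('Marks & Spencer');  -- the & is kept literally, no prompt
SET DEFINE ON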

SET VERIFY:

After you enter a value at the prompt, SQL*Plus lists the line containing the substitution variable twice: once before substituting the value you enter and once after substitution. You can suppress this listing by setting the SET command variable VERIFY to OFF.

SET VERIFY ON lists each line of the script before and after substitution. This is also useful for debugging, because you can see which value was assigned to which variable.

Like in below example:
SQL>  define pswd=12
SQL> define charctr=123
SQL> DECLARE
  2  v_count NUMBER;
  3  v_char VARCHAR2(30);
  4  BEGIN
  5  v_count := &pswd;
  6  v_char := &charctr;
  7  DBMS_OUTPUT.PUT_LINE('First Value of v_count ' ||v_count);
  8  DBMS_OUTPUT.PUT_LINE('Value of v_char ' ||v_char);
  9  END;
 10  /
First Value of v_count 12
Value of v_char 123

PL/SQL procedure successfully completed.

SQL>
SQL> set verify on
SQL> DECLARE
  2  v_count NUMBER;
  3  v_char VARCHAR2(30);
  4  BEGIN
  5  v_count := &pswd;
  6  v_char := &charctr;
  7  DBMS_OUTPUT.PUT_LINE('First Value of v_count ' ||v_count);
  8  DBMS_OUTPUT.PUT_LINE('Value of v_char ' ||v_char);
  9  END;
 10  /
old   5: v_count := &pswd;
new   5: v_count := 12;
old   6: v_char := &charctr;
new   6: v_char := 123;
First Value of v_count 12
Value of v_char 123

PL/SQL procedure successfully completed.

http://www.oracle-base.com/articles/misc/literals-substitution-variables-and-bind-variables.php

SQL*Loader FAQ

1. What is SQL*Loader and what is it used for?

            SQL*Loader is a bulk loader utility used for moving data from external files into the Oracle database. Its syntax is similar to that of the DB2 Load utility, but comes with more options. SQL*Loader supports various load formats, selective loading, and multi-table loads.

2. How does one use the SQL*Loader utility?

            One can load data into an Oracle database by using the sqlldr (sqlload on some platforms) utility. Invoke the utility without arguments to get a list of available parameters.
Look at the following example:
            sqlldr scott/tiger control=loader.ctl

This sample control file (loader.ctl) will load an external data file containing delimited data:
load data
infile 'c:\data\mydata.csv'
into table emp
fields terminated by "," optionally enclosed by '"'
( empno, empname, sal, deptno )

The mydata.csv file may look like this:
            10001,"Scott Tiger", 1000, 40
            10002,"Frank Naude", 500, 20

Another sample control file, this time with in-line data formatted as fixed-length records. The trick is to specify "*" as the name of the data file, and to use BEGINDATA to start the data section in the control file.
load data
infile *
replace
into table departments
( dept     position (02:05) char(4),
  deptname position (08:27) char(20)
)
begindata
COSC  COMPUTER SCIENCE
ENGL  ENGLISH LITERATURE
MATH  MATHEMATICS
POLY  POLITICAL SCIENCE

3. Is there a SQL*Unloader to download data to a flat file?

            Oracle does not supply any data unload utilities. However, you can use SQL*Plus to select and format your data and then spool it to a file:
set echo off newpage 0 space 0 pagesize 0 feed off head off trimspool on
spool oradata.txt
select col1 || ',' || col2 || ',' || col3 from tab1 where  col2 = 'XYZ';
spool off

Alternatively use the UTL_FILE PL/SQL package:

rem Remember to update initSID.ora, utl_file_dir='c:\oradata' parameter
declare
       fp utl_file.file_type;
begin
       fp := utl_file.fopen('c:\oradata','tab1.txt','w');
       utl_file.putf(fp, '%s, %s\n', 'TextField', 55);
       utl_file.fclose(fp);
end;
/

            You might also want to investigate third party tools like TOAD or ManageIT Fast Unloader from CA to help you unload data from Oracle.

4. Can one load variable and fixed length data records?

            Yes, look at the following control file examples. In the first we will load delimited data (variable length):

LOAD DATA
INFILE *
INTO TABLE load_delimited_data
FIELDS TERMINATED BY "," OPTIONALLY ENCLOSED BY '"'
TRAILING NULLCOLS (data1, data2)
BEGINDATA
            11111,AAAAAAAAAA
            22222,"A,B,C,D,"

            If you need to load positional data (fixed length), look at the following control file example:
LOAD DATA
INFILE *
INTO TABLE load_positional_data (data1 POSITION(1:5), data2 POSITION(6:15) )
BEGINDATA
            11111AAAAAAAAAA
            22222BBBBBBBBBB

5. Can one skip header records while loading?

            Use the "SKIP n" keyword, where n = number of logical rows to skip. Look at this example:
LOAD DATA
INFILE *
INTO TABLE load_positional_data
SKIP 5
(data1 POSITION(1:5), data2 POSITION(6:15))
BEGINDATA
            11111AAAAAAAAAA
            22222BBBBBBBBBB

6. Can one modify data as it loads into the database?

            Data can be modified as it loads into the Oracle Database. Note that this only applies for the conventional load path and not for direct path loads.
LOAD DATA
INFILE *
INTO TABLE modified_data
( rec_no       "my_db_sequence.nextval",
  region       CONSTANT '31',
  time_loaded  "to_char(SYSDATE, 'HH24:MI')",
  data1        POSITION(1:5)   ":data1/100",
  data2        POSITION(6:15)  "upper(:data2)",
  data3        POSITION(16:22) "to_date(:data3, 'YYMMDD')"
)
BEGINDATA
            11111AAAAAAAAAA991201
            22222BBBBBBBBBB990112

The next control file uses the DECODE function to default the mailing address columns to the main address values when they are null:
LOAD DATA
INFILE 'mail_orders.txt'
BADFILE 'bad_orders.txt'
APPEND
INTO TABLE mailing_list
FIELDS TERMINATED BY ","
( addr,
  city,
  state,
  zipcode,
  mailing_addr  "decode(:mailing_addr, null, :addr, :mailing_addr)",
  mailing_city  "decode(:mailing_city, null, :city, :mailing_city)",
  mailing_state
)

7. Can one load data into multiple tables at once?

Look at the following control file:
LOAD DATA
INFILE *
REPLACE
INTO TABLE emp
WHEN empno != ' '
( empno  POSITION(1:4)   INTEGER EXTERNAL,
  ename  POSITION(6:15)  CHAR,
  deptno POSITION(17:18) CHAR,
  mgr    POSITION(20:23) INTEGER EXTERNAL
)
INTO TABLE proj
WHEN projno != ' '
( projno POSITION(25:27) INTEGER EXTERNAL,
  empno  POSITION(1:4)   INTEGER EXTERNAL
)

8. Can one selectively load only the records that one needs?

Look at this example, (01) is the first character, (30:37) are characters 30 to 37:
LOAD DATA
INFILE  'mydata.dat' BADFILE  'mydata.bad' DISCARDFILE 'mydata.dis'
APPEND
INTO TABLE my_selective_table 
WHEN (01) <> 'H' and (01) <> 'T' and (30:37) ='19991217'
(region CONSTANT '31', service_key POSITION(01:11) INTEGER EXTERNAL, call_b_no POSITION(12:29) CHAR )

9. Can one skip certain columns while loading data?

            One cannot use POSITION(x:y) with delimited data. Luckily, from Oracle 8i one can specify FILLER columns. FILLER columns are used to skip columns/fields in the load file, ignoring fields that one does not want. Look at this example:

LOAD DATA
TRUNCATE INTO TABLE T1
FIELDS TERMINATED BY ',' ( field1, field2 FILLER, field3 )

10. How does one load multi-line records?
 
            One can create one logical record from multiple physical records using one of the following two clauses:
 
CONCATENATE - use when SQL*Loader should combine a fixed number of physical records together to form one logical record.

CONTINUEIF - use if a condition indicates that multiple records should be treated as one, e.g. by having a '#' character in column 1.
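As an illustration, a control file along the following lines combines every two physical records into one logical record before the fields are parsed (a sketch; the table name load_multiline_data is illustrative):

LOAD DATA
INFILE *
CONCATENATE 2
INTO TABLE load_multiline_data
FIELDS TERMINATED BY ','
(data1, data2)
BEGINDATA
11111,AAAA
AAAAAA
22222,BBBB
BBBBBB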
 
11. How can one get SQL*Loader to COMMIT only at the end of the load file?
 
            One cannot, but by setting the ROWS= parameter to a large value, committing can be reduced.
 Make sure you have big rollback segments ready when you use a high value for ROWS=.
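For example, to commit roughly every 100000 rows instead of the default (a sketch; the bind array, sized by BINDSIZE, must also be large enough to hold that many rows, and the values shown are illustrative):

sqlldr scott/tiger control=loader.ctl rows=100000 bindsize=20000000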
 
12. Can one improve the performance of SQL*Loader?
 
            A very simple but easily overlooked hint: having indexes and/or constraints (such as a primary key) on your load tables will significantly slow down load times, even with ROWS= set to a high value, so avoid them during the load process where possible.
 
            Add the following option in the command line: DIRECT=TRUE. This will effectively bypass most of the RDBMS processing. However, there are cases when you can't use direct load. 
 
            Turn off database logging by specifying the UNRECOVERABLE option. This option can only be used with direct data loads.
 
            Run multiple load jobs concurrently.
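Combining the DIRECT and UNRECOVERABLE hints might look like this (a sketch; UNRECOVERABLE goes in the control file, DIRECT=TRUE on the command line, and the file and table names are illustrative):

-- loader.ctl (UNRECOVERABLE is only valid for direct path loads)
UNRECOVERABLE
LOAD DATA
INFILE 'mydata.dat'
INTO TABLE emp
FIELDS TERMINATED BY ','
(empno, ename, sal, deptno)

Command line:
sqlldr scott/tiger control=loader.ctl direct=true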


Oracle SQL Loader utility - sqlldr Concepts

SQL*Loader Features

SQL*Loader loads data from external files into tables of an Oracle database. It has a powerful data parsing engine that puts little limitation on the format of the data in the datafile. You can use SQL*Loader to do the following:
  • Load data across a network. This means that you can run the SQL*Loader client on a different system from the one that is running the SQL*Loader server.
  • Load data from multiple datafiles during the same load session.
  • Load data into multiple tables during the same load session.
  • Specify the character set of the data.
  • Selectively load data (you can load records based on the records' values).
  • Manipulate the data before loading it, using SQL functions.
  • Generate unique sequential key values in specified columns.
  • Use the operating system's file system to access the datafiles.
  • Load data from disk, tape, or named pipe.
  • Generate sophisticated error reports, which greatly aid troubleshooting.
  • Load arbitrarily complex object-relational data.
  • Use secondary datafiles for loading LOBs and collections.
  • Use either conventional or direct path loading. While conventional path loading is very flexible, direct path loading provides superior loading performance.
[Figure sut81088.gif: overview of a SQL*Loader session]

The figure shows SQL*Loader receiving input datafiles and a SQL*Loader control file as input. SQL*Loader then outputs a log file, bad files, and discard files. Also, the figure shows that the database into which SQL*Loader loaded the input data now contains tables and indexes.


SQL*Loader Parameters

SQL*Loader is invoked when you specify the sqlldr command and, optionally, parameters that establish session characteristics.
In situations where you always use the same parameters for which the values seldom change, it can be more efficient to specify parameters using the following methods, rather than on the command line:
  • Parameters can be grouped together in a parameter file. You could then specify the name of the parameter file on the command line using the PARFILE parameter.
  • Certain parameters can also be specified within the SQL*Loader control file by using the OPTIONS clause.
Parameters specified on the command line override any parameter values specified in a parameter file or OPTIONS clause.
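For example, a parameter file (say, load_params.par) might contain the following and be invoked with sqlldr parfile=load_params.par (a sketch; the file names and values are illustrative):

userid=scott/tiger
control=loader.ctl
log=loader.log
bad=loader.bad
rows=50000

The same kind of values can be placed in an OPTIONS clause at the top of the control file:

OPTIONS (SKIP=1, ERRORS=100, ROWS=50000)
LOAD DATA
...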

SQL*Loader Control File

The control file is a text file written in a language that SQL*Loader understands. The control file tells SQL*Loader where to find the data, how to parse and interpret the data, where to insert the data, and more.
Although not precisely defined, a control file can be said to have three sections.
The first section contains session-wide information, for example:
  • Global options such as bindsize, rows, records to skip, and so on
  • INFILE clauses to specify where the input data is located
  • Data to be loaded
The second section consists of one or more INTO TABLE blocks. Each of these blocks contains information about the table into which the data is to be loaded, such as the table name and the columns of the table.
The third section is optional and, if present, contains input data.

Input Data and Datafiles

SQL*Loader reads data from one or more files (or operating system equivalents of files) specified in the control file. From SQL*Loader's perspective, the data in the datafile is organized as records. A particular datafile can be in fixed record format, variable record format, or stream record format. The record format can be specified in the control file with the INFILE parameter. If no record format is specified, the default is stream record format.

Note: If data is specified inside the control file (that is, INFILE * was specified in the control file), then the data is interpreted in the stream record format with the default record terminator.

Fixed Record Format

A file is in fixed record format when all records in a datafile are the same byte length. Although this format is the least flexible, it results in better performance than variable or stream format. Fixed format is also simple to specify. For example:
INFILE datafile_name "fix n"

Example 1 shows a control file that specifies a datafile that should be interpreted in the fixed record format. The datafile in the example contains five physical records. Assuming that a period (.) indicates a space, the first physical record is [001,...cd,.], which is exactly eleven bytes (assuming a single-byte character set). The second record is [0002,fghi,] followed by the newline character (\n), which is the eleventh byte, and so on. Note that newline characters are not required with the fixed record format.


Example 1 Loading Data in Fixed Record Format
load data
infile 'example.dat'  "fix 11"
into table example
fields terminated by ',' optionally enclosed by '"'
(col1, col2)

example.dat:
001,   cd, 0002,fghi,
00003,lmn,
1, "pqrs",
0005,uvwx,

Variable Record Format

A file is in variable record format when the length of each record in a character field is included at the beginning of each record in the datafile. This format provides some added flexibility over the fixed record format and a performance advantage over the stream record format. For example, you can specify a datafile that is to be interpreted as being in variable record format as follows:

INFILE "datafile_name" "var n"
In this example, n specifies the number of bytes in the record length field. If n is not specified, SQL*Loader assumes a length of 5 bytes. Specifying n larger than 40 will result in an error.

Example 2 shows a control file specification that tells SQL*Loader to look for data in the datafile example.dat and to expect variable record format where the record length fields are 3 bytes long. The example.dat datafile consists of three physical records. The first is specified to be 009 (that is, 9) bytes long, the second is 010 bytes long (that is, 10, including a 1-byte newline), and the third is 012 bytes long (also including a 1-byte newline). Note that newline characters are not required with the variable record format. This example also assumes a single-byte character set for the datafile.

The lengths are always interpreted in bytes, even if character-length semantics are in effect for the file. This is necessary because the file could contain a mix of fields, some processed with character-length semantics and others processed with byte-length semantics.

Example 2 Loading Data in Variable Record Format
load data
infile 'example.dat'  "var 3"
into table example
fields terminated by ',' optionally enclosed by '"'
(col1 char(5),
 col2 char(7))

example.dat:
009hello,cd,010world,im,
012my,name is,

Stream Record Format

A file is in stream record format when the records are not specified by size; instead SQL*Loader forms records by scanning for the record terminator. Stream record format is the most flexible format, but there can be a negative effect on performance. The specification of a datafile to be interpreted as being in stream record format looks similar to the following:
INFILE datafile_name ["str terminator_string"]

The terminator_string is specified as either 'char_string' or X'hex_string' where:
  • 'char_string' is a string of characters enclosed in single or double quotation marks
  • X'hex_string' is a byte string in hexadecimal format
When the terminator_string contains special (nonprintable) characters, it should be specified as a X'hex_string'. However, some nonprintable characters can be specified as ('char_string') by using a backslash. For example:
  • \n indicates a line feed
  • \t indicates a horizontal tab
  • \f indicates a form feed
  • \v indicates a vertical tab
  • \r indicates a carriage return
On UNIX-based platforms, if no terminator_string is specified, SQL*Loader defaults to the line feed character, \n. On Windows NT, if no terminator_string is specified, then SQL*Loader uses either \n or \r\n as the record terminator, depending on which one it finds first in the datafile.

Example 3 illustrates loading data in stream record format where the terminator string is specified using a character string, '|\n'. The use of the backslash character allows the character string to specify the nonprintable line feed character.

Example 3 Loading Data in Stream Record Format
load data
infile 'example.dat'  "str '|\n'"
into table example
fields terminated by ',' optionally enclosed by '"'
(col1 char(5),
 col2 char(7))

example.dat:
hello,world,|
james,bond,|

Discarded and Rejected Records

Records read from the input file might not be inserted into the database. Such records are placed in either a bad file or a discard file.

The Bad File


The bad file contains records that were rejected, either by SQL*Loader or by the Oracle database. If you do not specify a bad file and there are rejected records, then SQL*Loader automatically creates one. It will have the same name as the data file, with a .bad extension. Some of the possible reasons for rejection are discussed in the next sections.

SQL*Loader Rejects


Datafile records are rejected by SQL*Loader when the input format is invalid. For example, if the second enclosure delimiter is missing, or if a delimited field exceeds its maximum length, SQL*Loader rejects the record. Rejected records are placed in the bad file.

Oracle Database Rejects 


After a datafile record is accepted for processing by SQL*Loader, it is sent to the Oracle database for insertion into a table as a row. If the Oracle database determines that the row is valid, then the row is inserted into the table. If the row is determined to be invalid, then the record is rejected and SQL*Loader puts it in the bad file. The row may be invalid, for example, because a key is not unique, because a required field is null, or because the field contains invalid data for the Oracle datatype.

The Discard File

As SQL*Loader executes, it may create a file called the discard file. This file is created only when it is needed, and only if you have specified that a discard file should be enabled. The discard file contains records that were filtered out of the load because they did not match any record-selection criteria specified in the control file.

The discard file therefore contains records that were not inserted into any table in the database. You can specify the maximum number of such records that the discard file can accept. Data written to any database table is not written to the discard file.
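Both files, as well as the discard limit, can be named when invoking SQL*Loader (a sketch; the file names and limit are illustrative):

sqlldr scott/tiger control=loader.ctl bad=rejects.bad discard=filtered.dsc discardmax=100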

Log File and Logging Information

When SQL*Loader begins execution, it creates a log file. If it cannot create a log file, execution terminates. The log file contains a detailed summary of the load, including a description of any errors that occurred during the load.

Question:  What is the difference between the bad file and the discard file in SQL*Loader?
Answer:  The bad file and discard files both contain rejected rows, but they are rejected for different reasons:
  • Bad file:  The bad file contains rows that were rejected because of errors.  These errors might include bad datatypes or referential integrity constraints.
  • Discard file:  The discard file contains rows that were filtered out of the load by a record-selection statement in the SQL*Loader control file.
Conventional Path Loads, Direct Path Loads, and External Table Loads

SQL*Loader provides the following methods to load data:

  • Conventional Path Loads
  • Direct Path Loads
  • External Table Loads
Conventional Path Loads

During conventional path loads, the input records are parsed according to the field specifications, and each data field is copied to its corresponding bind array. When the bind array is full (or no more data is left to read), an array insert is executed.

Direct Path Loads

A direct path load parses the input records according to the field specifications, converts the input field data to the column datatype, and builds a column array. The column array is passed to a block formatter, which creates data blocks in Oracle database block format. The newly formatted database blocks are written directly to the database, bypassing much of the data processing that normally takes place. Direct path load is much faster than conventional path load, but entails several restrictions.

External Table Loads

An external table load creates an external table for data that is contained in a datafile. The load executes INSERT statements to insert the data from the datafile into the target table.
The advantages of using external table loads over conventional path and direct path loads are as follows:
  • An external table load attempts to load datafiles in parallel. If a single datafile is big enough, that file itself will be loaded in parallel.
  • An external table load allows modification of the data being loaded by using SQL functions and PL/SQL functions as part of the INSERT statement that is used to create the external table.
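Conceptually, an external table load looks something like the following (a sketch; the directory object data_dir, the table names, and the file name are illustrative, and the directory object must already exist and point at the datafile's location):

CREATE TABLE emp_ext (
  empno  NUMBER,
  ename  VARCHAR2(20),
  sal    NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
  )
  LOCATION ('mydata.csv')
);

-- the load itself is then just an INSERT ... SELECT from the external table
INSERT /*+ APPEND */ INTO emp (empno, ename, sal)
SELECT empno, ename, sal FROM emp_ext;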

Choosing External Tables Versus SQL*Loader

The record parsing of external tables and SQL*Loader is very similar, so normally there is not a major performance difference for the same record format. However, due to the different architecture of external tables and SQL*Loader, there are situations in which one method is more appropriate than the other.
In the following situations, use external tables for the best load performance:
  • You want to transform the data as it is being loaded into the database.
  • You want to use transparent parallel processing without having to split the external data first.
However, in the following situations, use SQL*Loader for the best load performance:
  • You want to load data remotely.
  • Transformations are not required on the data, and the data does not need to be loaded in parallel.
