Oracle9i introduces a family of new features to support more scalable
and efficient ETL (Extraction, Transformation, Load) processing for data warehouses and
business intelligence systems.
One of the most significant ETL additions is the Multitable
Insert feature. The multitable insert feature allows the INSERT . . . SELECT
statement to use multiple tables as targets. In addition, it can distribute
data among target tables based on logical attributes of the new rows. Multitable
insert thus enables a single scan and transformation of source data to insert data into
multiple tables, sharply increasing performance.
In the following example we use external
tables and multitable inserts, two of these powerful ETL features. Let's
assume that you have the following revenue data to load into a database in different
ways.
Person Jan Feb Mar Apr Mai Jun Jul
Aug Sep Oct Nov Dez
--------------------------------------------------------
Schnyder,345,223,122,345,324,244,123,123,345,121,345,197
Weber,234,234,123,457,456,287,234,123,678,656,341,567
Keller,596,276,347,134,743,545,216,456,124,753,346,456
Meyer,987,345,645,567,834,567,789,234,678,973,456,125
Holzer,509,154,876,347,146,788,174,986,568,246,324,987
Müller,456,125,678,235,878,237,567,237,788,237,324,778
Binggeli,487,347,458,347,235,864,689,235,764,964,624,347
Stoller,596,237,976,876,346,567,126,879,125,568,124,753
Marty,094,234,235,763,054,567,237,457,325,753,577,346
Studer,784,567,235,753,124,575,864,235,753,864,634,678
New in Oracle9i is the concept of an external table. This is a
table that you define in the database's data dictionary, but for which the data itself is
stored outside of the database. It's possible, for example, to define an external
table that derives its data from the type of text file you would load using SQL*Loader.
This is great, because the revenue data in the above example
you need to load resides in just such a text file. Before you can create an external
table, you need to create an Oracle directory object that points to the operating system
directory in which your text file resides.
CONNECT sys/manager AS SYSDBA;
CREATE OR REPLACE DIRECTORY dat_dir AS 'C:\Oradata\Data';
CREATE OR REPLACE DIRECTORY log_dir AS 'C:\Oradata\Log';
CREATE OR REPLACE DIRECTORY bad_dir AS 'C:\Oradata\Bad';
GRANT READ ON DIRECTORY dat_dir TO scott;
GRANT WRITE ON DIRECTORY log_dir TO scott;
GRANT WRITE ON DIRECTORY bad_dir TO scott;
You can now use a new form of the CREATE TABLE statement that looks
like a cross between a SQL statement and a SQL*Loader control file:
CONNECT scott/tiger;
CREATE TABLE revext (person
VARCHAR2(20),
rev_jan NUMBER(4),
rev_feb NUMBER(4),
rev_mar NUMBER(4),
rev_apr NUMBER(4),
rev_mai NUMBER(4),
rev_jun NUMBER(4),
rev_jul NUMBER(4),
rev_aug NUMBER(4),
rev_sep NUMBER(4),
rev_oct NUMBER(4),
rev_nov NUMBER(4),
rev_dez NUMBER(4))
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY dat_dir
ACCESS PARAMETERS
(
records delimited by newline
badfile bad_dir:'revext%a_%p.bad'
logfile log_dir:'revext%a_%p.log'
fields terminated by ','
missing field values are null
( person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
)
)
LOCATION ('revext.dat')
)
PARALLEL 4
REJECT LIMIT UNLIMITED;
When you create an external table, you're really only creating some
data dictionary entries. Nothing exciting happens until you query the table. Realize
though, that you can query the table as you would any other SQL table.
In a multitable insert, you insert computed rows derived from the rows
returned from the evaluation of a subquery into one or more tables.
Specify ALL followed by multiple insert_into_clauses to perform an
unconditional multitable insert. Oracle executes each insert_into_clause once for each
row returned by the subquery.
Specify the conditional_insert_clause to perform a conditional
multitable insert. Oracle filters each insert_into_clause through the corresponding WHEN
condition, which determines whether that insert_into_clause is executed. A single
multitable insert statement can contain up to 127 WHEN clauses.
If you specify ALL, Oracle evaluates each WHEN clause regardless of the
results of the evaluation of any other WHEN clause. For each WHEN clause whose condition
evaluates to true, Oracle executes the corresponding INTO clause list.
If you specify FIRST, Oracle evaluates each WHEN clause in the order in
which it appears in the statement. For the first WHEN clause that evaluates to true,
Oracle executes the corresponding INTO clause and skips subsequent WHEN clauses for the
given row.
For a given row, if no WHEN clause evaluates to true:
- If you have specified an ELSE clause Oracle executes the INTO clause list
associated with the ELSE clause.
- If you did not specify an else clause, Oracle takes no action for that row.
Specify a subquery that returns rows that are inserted into the table.
The subquery can refer to any table, view, or materialized view, including the target
tables of the INSERT statement. If the subquery selects no rows, Oracle inserts no rows
into the table.
Read from the external table and insert result into the normalized
REVENUE table. You end up with one row for each person/month
combination:
PERSON
MON REVENUE
-------------------- --- ----------
Schnyder
01 345
Weber
01 234
Keller
01 596
Meyer
01 987
Holzer
01 509
Müller
01 456
Binggeli
01 487
Stoller
01 596
Marty
01 94
Studer
01 784
Schnyder
02 223
Weber
02 234
Keller
02 276
Meyer
02 345
Holzer
02 154
Müller
02 125
Binggeli
02 347
Stoller
02 237
........
.. ...
CREATE TABLE revenue (
person VARCHAR2(20),
month VARCHAR2(3),
revenue NUMBER,
CONSTRAINT revenue_pk PRIMARY KEY (person,month));
INSERT ALL
INTO revenue (person, month, revenue)
VALUES (person, '01', rev_jan)
INTO revenue (person, month, revenue)
VALUES (person, '02', rev_feb)
INTO revenue (person, month, revenue)
VALUES (person, '03', rev_mar)
INTO revenue (person, month, revenue)
VALUES (person, '04', rev_apr)
INTO revenue (person, month, revenue)
VALUES (person, '05', rev_mai)
INTO revenue (person, month, revenue)
VALUES (person, '06', rev_jun)
INTO revenue (person, month, revenue)
VALUES (person, '07', rev_jul)
INTO revenue (person, month, revenue)
VALUES (person, '08', rev_aug)
INTO revenue (person, month, revenue)
VALUES (person, '09', rev_sep)
INTO revenue (person, month, revenue)
VALUES (person, '10', rev_oct)
INTO revenue (person, month, revenue)
VALUES (person, '11', rev_nov)
INTO revenue (person, month, revenue)
VALUES (person, '12', rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
In this example you insert each month in one separate table, all in one
transaction.
create table revenue_jan (person VARCHAR2(20) NOT NULL, revenue
NUMBER);
create table revenue_feb (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_mar (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_apr (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_mai (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_jun (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_jul (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_aug (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_sep (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_oct (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_nov (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_dez (person VARCHAR2(20) NOT NULL, revenue NUMBER);
INSERT ALL
INTO revenue_jan (person, revenue)
VALUES (person, rev_jan)
INTO revenue_feb (person, revenue)
VALUES (person, rev_feb)
INTO revenue_mar (person, revenue)
VALUES (person, rev_mar)
INTO revenue_apr (person, revenue)
VALUES (person, rev_apr)
INTO revenue_mai (person, revenue)
VALUES (person, rev_mai)
INTO revenue_jun (person, revenue)
VALUES (person, rev_jun)
INTO revenue_jul (person, revenue)
VALUES (person, rev_jul)
INTO revenue_aug (person, revenue)
VALUES (person, rev_aug)
INTO revenue_sep (person, revenue)
VALUES (person, rev_sep)
INTO revenue_oct (person, revenue)
VALUES (person, rev_oct)
INTO revenue_nov (person, revenue)
VALUES (person, rev_nov)
INTO revenue_dez (person, revenue)
VALUES (person, rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
In this example you filter the input data in three ranges:
- Revenue <= 150
- 150 < Revenue <= 300
- Revenue > 300
and insert the results in three corresponding tables:
CREATE TABLE revenue_low (person VARCHAR2(20) NOT NULL,
month VARCHAR2(3) NOT NULL, revenue NUMBER);
CREATE TABLE revenue_med (person VARCHAR2(20) NOT NULL,
month VARCHAR2(3) NOT NULL, revenue NUMBER);
CREATE TABLE revenue_top (person VARCHAR2(20) NOT NULL,
month VARCHAR2(3) NOT NULL, revenue NUMBER);
INSERT ALL
WHEN (rev_jan <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '01', rev_jan)
WHEN (rev_feb <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '02', rev_feb)
WHEN (rev_mar <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '03', rev_mar)
WHEN (rev_apr <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '04', rev_apr)
WHEN (rev_mai <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '05', rev_mai)
WHEN (rev_jun <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '06', rev_jun)
WHEN (rev_jul <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '07', rev_jul)
WHEN (rev_aug <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '08', rev_aug)
WHEN (rev_sep <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '09', rev_sep)
WHEN (rev_oct <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '10', rev_oct)
WHEN (rev_nov <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '11', rev_nov)
WHEN (rev_dez <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '12', rev_dez)
WHEN (rev_jan > 150 AND rev_jan < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '01', rev_jan)
WHEN (rev_feb > 150 AND rev_feb < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '02', rev_feb)
WHEN (rev_mar > 150 AND rev_mar < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '03', rev_mar)
WHEN (rev_apr > 150 AND rev_apr < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '04', rev_apr)
WHEN (rev_mai > 150 AND rev_mai < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '05', rev_mai)
WHEN (rev_jun > 150 AND rev_jun < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '06', rev_jun)
WHEN (rev_jul > 150 AND rev_jul < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '07', rev_jul)
WHEN (rev_aug > 150 AND rev_aug < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '08', rev_aug)
WHEN (rev_sep > 150 AND rev_sep < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '09', rev_sep)
WHEN (rev_oct > 150 AND rev_oct < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '10', rev_oct)
WHEN (rev_nov > 150 AND rev_nov < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '11', rev_nov)
WHEN (rev_dez > 150 AND rev_dez < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '12', rev_dez)
WHEN (rev_jan > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '01', rev_jan)
WHEN (rev_feb > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '02', rev_feb)
WHEN (rev_mar > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '03', rev_mar)
WHEN (rev_apr > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '04', rev_apr)
WHEN (rev_mai > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '05', rev_mai)
WHEN (rev_jun > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '06', rev_jun)
WHEN (rev_jul > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '07', rev_jul)
WHEN (rev_aug > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '08', rev_aug)
WHEN (rev_sep > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '09', rev_sep)
WHEN (rev_oct > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '10', rev_oct)
WHEN (rev_nov > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '11', rev_nov)
WHEN (rev_dez > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '12', rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
In this example the summary over all months are calculated. If the sum
is less or eqal to 6000 then the sum is inserted in table SUMMARY_LOW, else in table
SUMMARY_TOP, again all in one transaction.
CREATE TABLE summary_low (person VARCHAR2(20) NOT NULL, sum
NUMBER);
CREATE TABLE summary_top (person VARCHAR2(20) NOT NULL, sum NUMBER);
INSERT FIRST
WHEN ((rev_jan + rev_feb + rev_mar + rev_apr + rev_mai + rev_jun + rev_jul
+ rev_aug + rev_sep + rev_oct + rev_nov + rev_dez)
<= 6000) THEN
INTO summary_low (person, sum)
VALUES (person, rev_jan + rev_feb + rev_mar +
rev_apr + rev_mai + rev_jun
+ rev_jul + rev_aug + rev_sep + rev_oct + rev_nov +
rev_dez)
ELSE
INTO summary_top (person, sum)
VALUES (person, rev_jan + rev_feb + rev_mar + rev_apr +
rev_mai + rev_jun
+ rev_jul + rev_aug + rev_sep + rev_oct + rev_nov +
rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
Multitable inserts are just one example of the new ETL features of
Oracle9i. The full set of new ETL functionality creates a powerful framework for handling
all ETL tasks within the Oracle database.
|