Oracle9i introduces a family of new features to support more scalable and efficient
ETL (Extraction, Transformation, Load) processing for data warehouses and business
intelligence systems.
One of the most significant ETL additions is the Multitable Insert feature.
The multitable insert feature allows the INSERT . . . SELECT statement to use
multiple tables as targets. In addition, it can distribute data among
target tables based on logical attributes of the new rows. Multitable insert thus
enables a single scan and transformation of source data to insert data into multiple
tables, sharply increasing performance.
In the following example we use external tables and
multitable inserts, two of these powerful ETL features. Let's assume that you
have the following revenue data to load into a database in different ways.
Person Jan Feb Mar Apr Mai Jun Jul Aug Sep Oct Nov
Dez
--------------------------------------------------------
Schnyder,345,223,122,345,324,244,123,123,345,121,345,197
Weber,234,234,123,457,456,287,234,123,678,656,341,567
Keller,596,276,347,134,743,545,216,456,124,753,346,456
Meyer,987,345,645,567,834,567,789,234,678,973,456,125
Holzer,509,154,876,347,146,788,174,986,568,246,324,987
Müller,456,125,678,235,878,237,567,237,788,237,324,778
Binggeli,487,347,458,347,235,864,689,235,764,964,624,347
Stoller,596,237,976,876,346,567,126,879,125,568,124,753
Marty,094,234,235,763,054,567,237,457,325,753,577,346
Studer,784,567,235,753,124,575,864,235,753,864,634,678
New in Oracle9i is the concept of an external table. This is a table that you
define in the database's data dictionary, but for which the data itself is stored
outside of the database. It's possible, for example, to define an external
table that derives its data from the type of text file you would load using
SQL*Loader. This is great, because the revenue data in the
above example you need to load resides in just such a text file. Before you can
create an external table, you need to create an Oracle directory object that points
to the operating system directory in which your text file resides.
CONNECT sys/manager AS SYSDBA;
CREATE OR REPLACE DIRECTORY dat_dir AS 'C:\Oradata\Data';
CREATE OR REPLACE DIRECTORY log_dir AS 'C:\Oradata\Log';
CREATE OR REPLACE DIRECTORY bad_dir AS 'C:\Oradata\Bad';
GRANT READ ON DIRECTORY dat_dir TO scott;
GRANT WRITE ON DIRECTORY log_dir TO scott;
GRANT WRITE ON DIRECTORY bad_dir TO scott;
You can now use a new form of the CREATE TABLE statement that looks like a cross
between a SQL statement and a SQL*Loader control file:
CONNECT scott/tiger;
CREATE TABLE revext (person VARCHAR2(20),
rev_jan NUMBER(4),
rev_feb NUMBER(4),
rev_mar NUMBER(4),
rev_apr NUMBER(4),
rev_mai NUMBER(4),
rev_jun NUMBER(4),
rev_jul NUMBER(4),
rev_aug NUMBER(4),
rev_sep NUMBER(4),
rev_oct NUMBER(4),
rev_nov NUMBER(4),
rev_dez NUMBER(4))
ORGANIZATION EXTERNAL
(
TYPE ORACLE_LOADER
DEFAULT DIRECTORY dat_dir
ACCESS PARAMETERS
(
records delimited by newline
badfile bad_dir:'revext%a_%p.bad'
logfile log_dir:'revext%a_%p.log'
fields terminated by ','
missing field values are null
( person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
)
)
LOCATION ('revext.dat')
)
PARALLEL 4
REJECT LIMIT UNLIMITED;
When you create an external table, you're really only creating some data
dictionary entries. Nothing exciting happens until you query the table. Realize
though, that you can query the table as you would any other SQL table.
In a multitable insert, you insert computed rows derived from the rows returned from
the evaluation of a subquery into one or more tables.
Specify ALL followed by multiple insert_into_clauses to perform an unconditional
multitable insert. Oracle executes each insert_into_clause once for each row returned
by the subquery.
Specify the conditional_insert_clause to perform a conditional multitable insert.
Oracle filters each insert_into_clause through the corresponding WHEN condition,
which determines whether that insert_into_clause is executed. A single multitable
insert statement can contain up to 127 WHEN clauses.
If you specify ALL, Oracle evaluates each WHEN clause regardless of the results of
the evaluation of any other WHEN clause. For each WHEN clause whose condition
evaluates to true, Oracle executes the corresponding INTO clause list.
If you specify FIRST, Oracle evaluates each WHEN clause in the order in which it
appears in the statement. For the first WHEN clause that evaluates to true, Oracle
executes the corresponding INTO clause and skips subsequent WHEN clauses for the
given row.
For a given row, if no WHEN clause evaluates to true:
-
If you have specified an ELSE clause Oracle executes the INTO clause list
associated with the ELSE clause.
-
If you did not specify an else clause, Oracle takes no action for that row.
Specify a subquery that returns rows that are inserted into the table. The subquery
can refer to any table, view, or materialized view, including the target tables of
the INSERT statement. If the subquery selects no rows, Oracle inserts no rows into
the table.
Read from the external table and insert result into the normalized REVENUE table.
You end up with one row for each person/month combination:
PERSON
MON REVENUE
-------------------- --- ----------
Schnyder
01 345
Weber
01 234
Keller
01 596
Meyer
01 987
Holzer
01 509
Müller
01 456
Binggeli
01 487
Stoller
01 596
Marty
01 94
Studer
01 784
Schnyder
02 223
Weber
02 234
Keller
02 276
Meyer
02 345
Holzer
02 154
Müller
02 125
Binggeli
02 347
Stoller
02 237
........
.. ...
CREATE TABLE revenue (
person VARCHAR2(20),
month VARCHAR2(3),
revenue NUMBER,
CONSTRAINT revenue_pk PRIMARY KEY (person,month));
INSERT ALL
INTO revenue (person, month, revenue)
VALUES (person, '01', rev_jan)
INTO revenue (person, month, revenue)
VALUES (person, '02', rev_feb)
INTO revenue (person, month, revenue)
VALUES (person, '03', rev_mar)
INTO revenue (person, month, revenue)
VALUES (person, '04', rev_apr)
INTO revenue (person, month, revenue)
VALUES (person, '05', rev_mai)
INTO revenue (person, month, revenue)
VALUES (person, '06', rev_jun)
INTO revenue (person, month, revenue)
VALUES (person, '07', rev_jul)
INTO revenue (person, month, revenue)
VALUES (person, '08', rev_aug)
INTO revenue (person, month, revenue)
VALUES (person, '09', rev_sep)
INTO revenue (person, month, revenue)
VALUES (person, '10', rev_oct)
INTO revenue (person, month, revenue)
VALUES (person, '11', rev_nov)
INTO revenue (person, month, revenue)
VALUES (person, '12', rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
In this example you insert each month in one separate table, all in one transaction.
create table revenue_jan (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_feb (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_mar (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_apr (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_mai (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_jun (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_jul (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_aug (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_sep (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_oct (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_nov (person VARCHAR2(20) NOT NULL, revenue NUMBER);
create table revenue_dez (person VARCHAR2(20) NOT NULL, revenue NUMBER);
INSERT ALL
INTO revenue_jan (person, revenue)
VALUES (person, rev_jan)
INTO revenue_feb (person, revenue)
VALUES (person, rev_feb)
INTO revenue_mar (person, revenue)
VALUES (person, rev_mar)
INTO revenue_apr (person, revenue)
VALUES (person, rev_apr)
INTO revenue_mai (person, revenue)
VALUES (person, rev_mai)
INTO revenue_jun (person, revenue)
VALUES (person, rev_jun)
INTO revenue_jul (person, revenue)
VALUES (person, rev_jul)
INTO revenue_aug (person, revenue)
VALUES (person, rev_aug)
INTO revenue_sep (person, revenue)
VALUES (person, rev_sep)
INTO revenue_oct (person, revenue)
VALUES (person, rev_oct)
INTO revenue_nov (person, revenue)
VALUES (person, rev_nov)
INTO revenue_dez (person, revenue)
VALUES (person, rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
In this example you filter the input data in three ranges:
-
Revenue <= 150
-
150 < Revenue <= 300
-
Revenue > 300
and insert the results in three corresponding tables:
CREATE TABLE revenue_low (person VARCHAR2(20) NOT NULL,
month VARCHAR2(3) NOT NULL, revenue NUMBER);
CREATE TABLE revenue_med (person VARCHAR2(20) NOT NULL,
month VARCHAR2(3) NOT NULL, revenue NUMBER);
CREATE TABLE revenue_top (person VARCHAR2(20) NOT NULL,
month VARCHAR2(3) NOT NULL, revenue NUMBER);
INSERT ALL
WHEN (rev_jan <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '01', rev_jan)
WHEN (rev_feb <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '02', rev_feb)
WHEN (rev_mar <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '03', rev_mar)
WHEN (rev_apr <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '04', rev_apr)
WHEN (rev_mai <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '05', rev_mai)
WHEN (rev_jun <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '06', rev_jun)
WHEN (rev_jul <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '07', rev_jul)
WHEN (rev_aug <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '08', rev_aug)
WHEN (rev_sep <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '09', rev_sep)
WHEN (rev_oct <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '10', rev_oct)
WHEN (rev_nov <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '11', rev_nov)
WHEN (rev_dez <= 150) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '12', rev_dez)
WHEN (rev_jan > 150 AND rev_jan < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '01', rev_jan)
WHEN (rev_feb > 150 AND rev_feb < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '02', rev_feb)
WHEN (rev_mar > 150 AND rev_mar < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '03', rev_mar)
WHEN (rev_apr > 150 AND rev_apr < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '04', rev_apr)
WHEN (rev_mai > 150 AND rev_mai < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '05', rev_mai)
WHEN (rev_jun > 150 AND rev_jun < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '06', rev_jun)
WHEN (rev_jul > 150 AND rev_jul < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '07', rev_jul)
WHEN (rev_aug > 150 AND rev_aug < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '08', rev_aug)
WHEN (rev_sep > 150 AND rev_sep < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '09', rev_sep)
WHEN (rev_oct > 150 AND rev_oct < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '10', rev_oct)
WHEN (rev_nov > 150 AND rev_nov < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '11', rev_nov)
WHEN (rev_dez > 150 AND rev_dez < 300) THEN
INTO revenue_med (person, month, revenue)
VALUES (person, '12', rev_dez)
WHEN (rev_jan > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '01', rev_jan)
WHEN (rev_feb > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '02', rev_feb)
WHEN (rev_mar > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '03', rev_mar)
WHEN (rev_apr > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '04', rev_apr)
WHEN (rev_mai > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '05', rev_mai)
WHEN (rev_jun > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '06', rev_jun)
WHEN (rev_jul > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '07', rev_jul)
WHEN (rev_aug > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '08', rev_aug)
WHEN (rev_sep > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '09', rev_sep)
WHEN (rev_oct > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '10', rev_oct)
WHEN (rev_nov > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '11', rev_nov)
WHEN (rev_dez > 300) THEN
INTO revenue_low (person, month, revenue)
VALUES (person, '12', rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
In this example the summary over all months are calculated. If the sum is less or
eqal to 6000 then the sum is inserted in table SUMMARY_LOW, else in table
SUMMARY_TOP, again all in one transaction.
CREATE TABLE summary_low (person VARCHAR2(20) NOT NULL, sum NUMBER);
CREATE TABLE summary_top (person VARCHAR2(20) NOT NULL, sum NUMBER);
INSERT FIRST
WHEN ((rev_jan + rev_feb + rev_mar + rev_apr + rev_mai + rev_jun +
rev_jul
+ rev_aug + rev_sep + rev_oct + rev_nov +
rev_dez) <= 6000) THEN
INTO summary_low (person, sum)
VALUES (person, rev_jan + rev_feb + rev_mar +
rev_apr + rev_mai + rev_jun
+ rev_jul + rev_aug + rev_sep + rev_oct +
rev_nov + rev_dez)
ELSE
INTO summary_top (person, sum)
VALUES (person, rev_jan + rev_feb + rev_mar + rev_apr
+ rev_mai + rev_jun
+ rev_jul + rev_aug + rev_sep + rev_oct + rev_nov +
rev_dez)
SELECT person,
rev_jan,
rev_feb,
rev_mar,
rev_apr,
rev_mai,
rev_jun,
rev_jul,
rev_aug,
rev_sep,
rev_oct,
rev_nov,
rev_dez
FROM revext;
Multitable inserts are just one example of the new ETL features of Oracle9i. The full
set of new ETL functionality creates a powerful framework for handling all ETL tasks
within the Oracle database.
|