This guide explores Oracle indexing:
- Standard B-TREE indexes
- Bitmap indexes
- Function-based indexes
- Index-organized tables (IOTs)
- Bitmap join indexes
- How to measure index selectivity
- Histograms and the DBMS_STATS package
- Monitoring unused indexes
- Reversing indexes
The Standard B-TREE Index
The oldest and most popular type of Oracle index is the standard B-TREE index. It was introduced in the earliest releases of Oracle and remains the most widely used index type.

Oracle offers several options when creating an index using the default B-TREE structure. You can index multiple columns (a concatenated index) to improve access speed, and the individual columns can be sorted in different orders. For example, we could create a B-TREE index with a name column in ascending order followed by a sal column in descending order:
CREATE INDEX idx_name ON emp (
name ASC,
sal DESC
);
While B-TREE indexes are great for simple queries, they are not very good in the following situations:
- Low-cardinality columns. Columns with fewer than about 200 distinct values do not have the selectivity required to benefit from a standard B-TREE index.
- SQL functions. A standard B-TREE index cannot satisfy queries that apply Oracle's built-in functions to the indexed column; that is what function-based indexes, covered below, are for.
Online Index Rebuilds
Online index rebuilds allow DML operations on the base table during index creation. The syntax for an online rebuild is:

ALTER INDEX idx_name REBUILD ONLINE;

When the ONLINE keyword is used as part of the CREATE or ALTER syntax, the current index is left intact while a new copy of the index is built, allowing DML to use the old index. Any alterations to the old index are recorded in an index-organized table known as a «journal table». Once the rebuild is complete, the alterations from the journal table are merged into the new index. The merge process skips any locked rows and commits every 20 rows. Once the merge operation is complete, the data dictionary is updated and the old index is dropped. DML access is blocked only during the data dictionary updates, which complete very quickly.
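The ONLINE keyword works the same way at index creation time. A minimal sketch (the index and column names are illustrative, not taken from the text above):

```sql
-- Build a brand-new index while sessions keep running DML against emp;
-- DML is blocked only briefly during the dictionary update at the end.
CREATE INDEX idx_emp_sal ON emp (sal) ONLINE;
```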
Cases in Which a B-TREE Index Cannot Be Used
NULL values are not stored in the index and can only be found in the table via a full table scan.

... WHERE column IS NULL;       (no index)
... WHERE column IS NOT NULL;   (no index)
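A common workaround - our own suggestion, not part of the original text, with hypothetical names - is to append a constant to the index key, so that rows whose column is NULL still receive an index entry:

```sql
-- The constant 0 makes the composite key non-NULL for every row,
-- so the IS NULL predicate can be answered from the index.
CREATE INDEX idx_t_col_null ON t (column_name, 0);

SELECT * FROM t WHERE column_name IS NULL;
```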
Any function applied directly to the column prevents index access:

... WHERE SUBSTR(ename,1,7) = 'CAPITAL';   (no index)
... WHERE ename LIKE 'CAPITAL%';           (index OK)

... WHERE TRUNC(edate) = TRUNC(sysdate);   (no index)
... WHERE edate BETWEEN TRUNC(sysdate)     (index OK)
              AND TRUNC(sysdate) + .99999;
The inequality operator prevents index access:

... WHERE amount != 0;   (no index)
... WHERE amount > 0;    (index OK)
Calculations on the column prevent index access:

... WHERE amount + 3000 < 5000;   (no index)
... WHERE amount < 2000;          (index OK)
A wildcard % in the first position prevents index access:

... WHERE ename LIKE '%ueller';   (no index)
The same column on both sides of the comparison prevents index access:

... WHERE ename = NVL(var, ename);    (no index)
... WHERE ename LIKE NVL(var, '%');   (index OK)
String concatenation prevents index access:

... WHERE ename || vname = 'XYZ';         (no index)
... WHERE ename = 'XY' AND vname = 'Z';   (index OK)
More than one index per table (here on ename and deptno) can only be used together when all predicates are equality searches on NON-UNIQUE indexes; in that case Oracle performs an index merge.

... WHERE ename LIKE 'XY%' AND deptno = 10;   (one index)
... WHERE ename = 'XY' AND deptno = 10;       (both indexes)
Oracle allows an index to be created over several columns of the same table. Such a concatenated index gives the optimizer ideal access paths that two single-column indexes cannot provide. In the following examples an index exists on (deptno, ename).

SELECT ename, sal, deptno
FROM emp
WHERE ename LIKE 'KIN%'
AND deptno = 10;          (index OK)

SELECT ename, sal, deptno
FROM emp
WHERE deptno = 10;        (index OK)

SELECT ename, sal, deptno
FROM emp
WHERE ename LIKE 'KIN%';  (no index)

In the third example the index cannot be used because the column ename is in the second position of the index.
NOT prevents index access:

SELECT ename, sal, deptno
FROM emp
WHERE NOT (deptno = 0);   (no index)

SELECT ename, sal, deptno
FROM emp
WHERE deptno > 0;         (index OK)
Sometimes you want to suppress index access explicitly. This can be achieved by adding 0 to a numeric column or concatenating an empty string to a character column.

SELECT ename, sal, deptno
FROM emp
WHERE deptno + 0 = 10;      (no index)

SELECT ename, sal, deptno
FROM emp
WHERE ename || '' = 'MU%';  (no index)
Bitmapped Indexes
Oracle bitmap indexes are very different from standard B-TREE indexes. A bitmap index stores a separate bitmap for each distinct value of the indexed column, with one bit for every row in the table; conceptually it is a two-dimensional array whose size is the number of distinct values multiplied by the number of rows. At row retrieval time, Oracle decompresses the relevant bitmaps into the RAM data buffers so they can be rapidly scanned for matching values. Matching bits are converted into a list of ROWIDs, and these ROWID values directly access the required rows.
The real benefit of bitmapped indexing occurs when one table includes multiple bitmapped indexes. Each individual column may have low cardinality, yet the combination of multiple bitmapped indexes provides a very powerful method for rapidly answering difficult SQL queries.
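As a sketch of this combination effect (the table and column names below are hypothetical):

```sql
-- One bitmap index per low-cardinality column
CREATE BITMAP INDEX cust_gender_bix ON customer_facts (gender);
CREATE BITMAP INDEX cust_region_bix ON customer_facts (region);
CREATE BITMAP INDEX cust_status_bix ON customer_facts (status);

-- The optimizer can AND the compressed bitmaps together before
-- touching the table, answering multi-predicate queries quickly.
SELECT COUNT(*)
FROM customer_facts
WHERE gender = 'F' AND region = 'WEST' AND status = 'ACTIVE';
```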
Bitmap Indexes and Deadlocks (Example)
Bitmap indexes are not appropriate for tables with many single-row DML operations (inserts), and especially concurrent single-row DML operations. Deadlocks can result from concurrent inserts, as the following example shows. Open two windows, one for Session 1 and one for Session 2.

Session 1:
create table bitmap_index_demo (
  value varchar2(20)
);

insert into bitmap_index_demo
select decode(mod(rownum,2),0,'M','F')
from all_objects;

create bitmap index bitmap_index_demo_idx
on bitmap_index_demo(value);

insert into bitmap_index_demo values ('M');
1 row created.

Session 2:
insert into bitmap_index_demo values ('F');
1 row created.

Session 1:
insert into bitmap_index_demo values ('F');
...... waiting ......

Session 2:
insert into bitmap_index_demo values ('M');
...... waiting ......

Session 1 now fails with:
ERROR at line 1:
ORA-00060: deadlock detected while waiting for resource
Properties of Bitmapped Indexes
- Only the cost-based optimizer (CBO) can use bitmapped indexes.
- Concatenated bitmapped indexes are possible.
- NULL values are stored in a bitmapped index, so queries on IS NULL can be resolved with the index.
- Bitmapped indexes can also resolve != queries.
- Bitmapped indexes are well suited for SELECT COUNT(*) queries.
- Maintaining bitmapped indexes is expensive.
- A single bit cannot be locked, so Oracle locks an entire index block, which can lead to locking problems. Bitmapped indexes should therefore not be used in OLTP databases with concurrent DML on the same table. In data warehouse applications they offer very large advantages.
Function-based Indexes
One of the most important advances in Oracle indexing is the introduction of function-based indexing. Function-based indexes allow the creation of indexes on expressions, internal functions, and user-written functions in PL/SQL and Java. Function-based indexes ensure that the Oracle designer is able to use an index for queries such as:

SELECT * FROM customer WHERE SUBSTR(cust_name,1,4) = 'BURL';
SELECT * FROM customer WHERE TO_CHAR(order_date,'MM') = '01';
SELECT * FROM customer WHERE UPPER(cust_name) = 'JONES';
SELECT * FROM customer WHERE INITCAP(first_name) = 'Mike';

Oracle always interrogates the WHERE clause of the SQL statement to see if a matching index exists. By using function-based indexes, the Oracle designer can create an index that exactly matches the predicates within the SQL WHERE clause. This ensures that the query executes with a minimal amount of disk I/O and the fastest possible speed.
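For instance, to let the UPPER(cust_name) query above use an index, a matching function-based index could be created like this (the index name is our own):

```sql
CREATE INDEX cust_upper_name_idx ON customer (UPPER(cust_name));

-- The predicate now matches the indexed expression exactly:
SELECT * FROM customer WHERE UPPER(cust_name) = 'JONES';
```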
How to enable Function Based Indexes
The following is a list of what needs to be done to use function based indexes:
- You must have the system privilege QUERY REWRITE to create function based indexes on tables in your own schema.
- You must have the system privilege GLOBAL QUERY REWRITE to create function based indexes on tables in other schemas.
- For the optimizer to use function based indexes, the following session or system parameters must be set:
QUERY_REWRITE_ENABLED=TRUE
QUERY_REWRITE_INTEGRITY=TRUSTED
You may enable these at the session level with ALTER SESSION, at the system level via ALTER SYSTEM, or by setting them in the init.ora parameter file. QUERY_REWRITE_ENABLED allows the optimizer to rewrite the query so it can use the function based index. QUERY_REWRITE_INTEGRITY tells the optimizer to «trust» that code marked DETERMINISTIC by the programmer is in fact deterministic. If the code is in fact not deterministic (that is, it returns different output given the same inputs), the rows returned via the index may be incorrect.
- Use the Cost Based Optimizer. Function based indexes are only visible to the Cost Based Optimizer and will never be used by the Rule Based Optimizer.
- Use SUBSTR() to constrain return values from user-written functions that return VARCHAR2 or RAW types. Optionally hide the SUBSTR in a view (recommended).
Once the above list has been satisfied, it is as easy as «CREATE INDEX» from there on in. The optimizer will find and use your indexes at runtime for you.
Simple Example: UPPER
So why is it easy and of immediate value? It's easy because it's just a CREATE INDEX statement. Consider the following example:
- Begin by creating a copy of the «scott/tiger» demo employee table.
- Then change the data in the employee name column to be in mixed case.
- Then create an index on the UPPER of the ename column -- effectively creating a case insensitive index:

create table emp as select * from scott.emp;

update emp set ename = initcap(ename);
commit;

create index emp_upper_idx on emp(upper(ename));
We now have an index on the «UPPER» of a column. Any application that already issues case insensitive queries of the form:

set autotrace on explain
select ename, empno, sal from emp where upper(ename) = 'KING';

ENAME           EMPNO        SAL
---------- ---------- ----------
King             7839       5000

Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=40)
1 0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=1 Card=1 Bytes=40)
2 1     INDEX (RANGE SCAN) OF 'EMP_UPPER_IDX' (NON-UNIQUE) (Cost=1 Card=1)

will transparently make use of this index -- gaining the performance boost an index can deliver. Before this feature was available, every row in the EMP table would have been scanned, upper-cased, and compared.
This performance boost is most visible when indexing user-written functions on columns:

select my_function(ename)
from emp
where some_other_function(empno) > 10
/

This is great because you can effectively extend the SQL language with application-specific functions. Without an index on the function, however, the performance of the above query is disappointing. Say the EMP table has 1,000 rows: the function «some_other_function» would be executed 1,000 times during the query, once per row. If the function takes 1/100 of a second to execute, this relatively simple query now takes at least 10 seconds.
Simple Example: Calculation

SQL> SELECT sal
     FROM emp
     WHERE sal + comm * (sal - 100) < 1500;

Without a function based index, you will notice that Oracle performs a full table scan.

Create the function based index:

SQL> CREATE INDEX calc_on_emp ON emp (sal + comm * (sal - 100));

Enable the function based index:

SQL> ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;
SQL> ALTER SESSION SET QUERY_REWRITE_INTEGRITY = TRUSTED;

Check that the function based index is used:

SQL> ANALYZE INDEX calc_on_emp COMPUTE STATISTICS;
SQL> set autotrace on explain
SQL> SELECT sal FROM emp
     WHERE sal + comm * (sal - 100) < 1500;

Execution Plan
-------------------------------------------------
0   SELECT STATEMENT Optimizer=CHOOSE (Cost=10 Card=21690 Bytes=108450)
1 0   TABLE ACCESS (BY INDEX ROWID) OF 'EMP' (Cost=10 Card=21690 Bytes=108450)
2 1     INDEX (RANGE SCAN) OF 'CALC_ON_EMP' (NON-UNIQUE) (Cost=2 Card=21690)
Advanced Example: Modified «soundex» routine in PL/SQL
Here is a real example - a modified «soundex» routine in PL/SQL:

alter session set query_rewrite_enabled = true;
alter session set query_rewrite_integrity = trusted;

create or replace package stats
as
    cnt number default 0;
end;
/

create or replace function my_soundex(p_string in varchar2) return varchar2
deterministic
as
    l_return_string varchar2(6) default substr(p_string, 1, 1);
    l_char          varchar2(1);
    l_last_digit    number default 0;
    type vcArray is table of varchar2(10) index by binary_integer;
    l_code_table    vcArray;
begin
    stats.cnt := stats.cnt + 1;
    l_code_table(1) := 'BPFV';
    l_code_table(2) := 'CSKGJQXZ';
    l_code_table(3) := 'DT';
    l_code_table(4) := 'L';
    l_code_table(5) := 'MN';
    l_code_table(6) := 'R';
    for i in 1 .. length(p_string)
    loop
        exit when (length(l_return_string) = 6);
        l_char := substr(p_string, i, 1);
        for j in 1 .. l_code_table.count
        loop
            if (instr(l_code_table(j), l_char) > 0 AND j <> l_last_digit)
            then
                l_return_string := l_return_string || to_char(j,'fm9');
                l_last_digit := j;
            end if;
        end loop;
    end loop;
    return rpad(l_return_string, 6, '0');
end;
/
Notice the usage of the keyword «deterministic» in this function. DETERMINISTIC declares that the function -- when given the same inputs -- will always return the exact same output. This keyword is needed in order to create an index on a user-written function. You must tell Oracle that the function is «deterministic» and will return a consistent result given the same inputs. This implies, for example, that you cannot index using the package «dbms_random», the random number generator: its results are not deterministic; given the same inputs you'll get random output. The built-in SQL function UPPER, on the other hand, is deterministic, so you can create an index on the UPPER of a column.
Now that we have the function «My_Soundex()», let's see how it performs without an index...
create table test_soundex(name varchar2(30));

insert into test_soundex
select object_name
from all_objects
where rownum <= 5000;

exec stats.cnt := 0;
set timing on
set autotrace on explain

select name
from test_soundex
where my_soundex(name) = my_soundex('FILE$')
/

NAME
------------------------------
FILE$

Elapsed: 00:00:03.00

Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=34)
1 0   TABLE ACCESS (FULL) OF 'TEST_SOUNDEX' (Cost=1 Card=1 Bytes=34)

set autotrace off
set timing off
set serveroutput on
exec dbms_output.put_line(stats.cnt)
10000
So, we can see this query took over 3 seconds to execute and had to do a full scan of the table. The function my_soundex was invoked 10,000 times (according to our counter), twice for each row. Let's see how indexing the function can be used to speed things up.
The first thing we will do is create the index as follows:

create index test_soundex_idx
on test_soundex(my_soundex(name));

To index a user-written function that returns a string, you may have to constrain the return type in the CREATE INDEX statement: knowing that my_soundex returns at most six characters, older releases would index substr(my_soundex(name),1,6). Current releases can index the expression directly, as above.
We are now ready to test the performance of the table with the index on it. We would like to monitor the effect of the index on INSERTs as well as the speedup for SELECTs. In the un-indexed test case, our query took over 3 seconds.

delete from test_soundex;
exec stats.cnt := 0;

insert into test_soundex
select object_name
from all_objects
where rownum <= 5000;

exec dbms_output.put_line(stats.cnt)
5000

exec stats.cnt := 0;
set autotrace on explain
set timing on

select name
from test_soundex
where my_soundex(name) = my_soundex('FILE$')
/

NAME
------------------------------
FILE$

Elapsed: 00:00:00.05

Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=ALL_ROWS (Cost=1 Card=1 Bytes=34)
1 0   TABLE ACCESS (BY INDEX ROWID) OF 'TEST_SOUNDEX' (Cost=1 Card=1 Bytes=34)
2 1     INDEX (RANGE SCAN) OF 'TEST_SOUNDEX_IDX' (NON-UNIQUE) (Cost=1 Card=1)

The important things to note here are that:
- The insert of 5,000 records took longer. Indexing a user-written function will necessarily affect the performance of inserts and some updates, because the function must be evaluated for every affected row. Since most applications insert and update singleton entries, the per-row overhead usually goes unnoticed.
- While the insert ran slower, the query ran much faster. And as the table grows, the full-scan query will take longer and longer to execute, while the index-based query will keep nearly the same performance characteristics.
Advanced Example: Extend Constraints using Function Based Indexes
Suppose you have the following rows in a table:

        ID ADATE
---------- ---------
         1
         1 04-JUN-04
         1 04-JUN-04
         1 05-JUN-04
         2
         2 04-JUN-04
         2 05-JUN-04

Now you are looking for a constraint which allows duplicate rows, such as ...

        ID ADATE
---------- ---------
         1 04-JUN-04
         1 04-JUN-04

... but not with NULL values for adate; not allowed is therefore:

        ID ADATE
---------- ---------
         1 NULL
         1 NULL

The first idea might be to write a trigger, since no declarative constraint (CHECK, NOT NULL, UNIQUE) can express this. A trigger will do the job, but if you have 100 million rows, it becomes a performance problem. The answer is to use a Function Based Index. Here is the complete example:
CREATE TABLE demo (id NUMBER, adate DATE);

INSERT INTO DEMO VALUES (1, NULL);
INSERT INTO DEMO VALUES (2, NULL);
INSERT INTO DEMO VALUES (1, SYSDATE);
INSERT INTO DEMO VALUES (1, SYSDATE);
INSERT INTO DEMO VALUES (2, SYSDATE);
INSERT INTO DEMO VALUES (1, SYSDATE+1);
INSERT INTO DEMO VALUES (2, SYSDATE+1);

Now create the Function Based Index which will do the job:

CREATE UNIQUE INDEX demo_idx ON
demo (DECODE(adate, NULL, id, NULL));
This is a unique index on id when adate is NULL; the index only indexes rows whose adate is NULL. If you have 100 rows in the table, 50 of which have a non-null adate, the index will have 50 entries. Entirely NULL entries are not made in B-TREE indexes -- hence the DECODE only indexes the rows with a NULL (active) adate.
DECODE works like an IF-THEN-ELSE, row by row:
IF (adate IS NULL) THEN
    Return the Value of ID
ELSE
    Return NULL
END IF
Make a test:

INSERT INTO DEMO VALUES (1, NULL);
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.DEMO_IDX) violated

UPDATE demo SET adate = NULL WHERE id = 2;
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.DEMO_IDX) violated
Advanced Example: How to enforce a unique condition selectively
When a column in a table has a certain value, how can you make sure that other columns in that table are unique for that subset of rows? Look at the following table:

desc expedition;
Name                                      Null?    Type
----------------------------------------- -------- ---------------
EXPEDITION_ID                             NOT NULL NUMBER
TEAM_ID                                            NUMBER
RESPONSIBILITY                                     VARCHAR2(100)
STATUS                                             VARCHAR2(20)

The column STATUS may have the values ACTIVE, meaning the expedition is currently active, and INACTIVE, meaning it is archived. We need the following constraint: RESPONSIBILITY has to be unique within the same TEAM_ID for the active expeditions. That is, (TEAM_ID, RESPONSIBILITY) must be unique while STATUS = 'ACTIVE'.
The question for this problem went on to suggest that a trigger of some sort or some procedural code might be the solution. What we actually need is a selective unique condition: «This data must be unique when this condition is met.» All you need to do is uniquely index (TEAM_ID, RESPONSIBILITY) when STATUS = 'ACTIVE'. This can be done with a Function Based Index (FBI).

CREATE TABLE expedition (
  expedition_id  NUMBER PRIMARY KEY,
  team_id        NUMBER,
  responsibility VARCHAR2(100),
  status         VARCHAR2(20) CHECK (status IN ('ACTIVE', 'INACTIVE'))
);

Now create the UNIQUE Function Based Index which implements the constraint:

CREATE UNIQUE INDEX resp_unique_in_team_id ON expedition (
  CASE WHEN status = 'ACTIVE' THEN team_id ELSE NULL END,
  CASE WHEN status = 'ACTIVE' THEN responsibility ELSE NULL END
);
Now try to insert a duplicate TEAM_ID / RESPONSIBILITY pair in an active set of expeditions:

INSERT INTO expedition (expedition_id,team_id,responsibility,status)
VALUES (1, 10, 'Oxygen', 'ACTIVE');
1 row created.

INSERT INTO expedition (expedition_id,team_id,responsibility,status)
VALUES (2, 10, 'Oxygen', 'ACTIVE');
ERROR at line 1:
ORA-00001: unique constraint (SCOTT.RESP_UNIQUE_IN_TEAM_ID) violated

Now we make STATUS = 'INACTIVE' for the first row:

UPDATE expedition
SET status = 'INACTIVE'
WHERE expedition_id = 1
AND team_id = 10
AND status = 'ACTIVE';
1 row updated.

Try again:

INSERT INTO expedition (expedition_id,team_id,responsibility,status)
VALUES (2, 10, 'A', 'ACTIVE');
1 row created.
We also get the nice side effect that the index only covers active expeditions, so it consumes as little space as possible.
Function Based Indexes can be used to solve problems such as: «This data must be unique when this condition is met.» If at all possible, uniqueness should be enforced via a UNIQUE constraint: unique constraints can be used by the optimizer during query rewrites, whereas a simple unique index won't be, and they add metadata to the data dictionary that many tools can use. However, because a UNIQUE constraint will not accept an expression like the one in this section, a unique index was our only course of action.
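For comparison, if uniqueness had to hold unconditionally for all rows, the preferred declarative form would be (a sketch, not part of the expedition example, with a constraint name of our own):

```sql
-- Unconditional uniqueness: let Oracle create and maintain the
-- supporting index behind a declarative constraint.
ALTER TABLE expedition
  ADD CONSTRAINT team_resp_uk UNIQUE (team_id, responsibility);
```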
Advanced Example: How to Index only some Rows in a Table
Is it possible to index only some of the rows in a table instead of all rows? The answer is yes - if you are using Function Based Indexes (FBIs).
This question often comes up when creating an index for a Y/N column - a status or processed flag, for example. You might have a table with a flag indicating whether a record has been processed. Most of the records have 'Y' as their value; only a few rows have 'N' to indicate that they are awaiting processing.
What we need is an index that will find the 'N' rows without indexing the 'Y' rows. What immediately pops into most people's heads at this point is: «Ah, this is a bitmap index we are looking for.»
A bitmap index is useful for indexing low-cardinality data, and what could be lower cardinality than just 'Y' and 'N' values? The problem is that bitmap indexes cannot be used in environments where the indexed value is modified frequently or the underlying table is concurrently modified by many sessions.
What we require is an index over the subset of rows that satisfies some criteria, without indexing the remaining rows - this property is also called index selectivity.
Two important facts must be considered to find a solution for such tasks:
- A B-TREE index will never have an entirely NULL entry. If all of the columns in the index key are NULL, there will be no entry in the B-TREE for that row. If you create an index on a million-row table using a column that is NULL in every row, that index will be empty.
- Function Based Indexes can incorporate complex logic.
We will start by creating a table with many rows, all with a STATUS column set to 'Y'.

create table test as select * from all_objects;
Table created.

update test set status = 'Y';
9052 rows updated.

commit;
Commit complete.

Now create the Function Based Index, which will index only rows where STATUS = 'N':

create index fbi_test_idx on
test (decode(status,'N','N',NULL));
Index created.

Now check how many rows are currently in the index (it should be zero ...):

analyze index fbi_test_idx validate structure;
Index analyzed.

select name, del_lf_rows, lf_rows, lf_blks
from index_stats;

NAME                           DEL_LF_ROWS    LF_ROWS    LF_BLKS
------------------------------ ----------- ---------- ----------
FBI_TEST_IDX                             0          0          1
As you can see, the index starts out empty, because all values of the STATUS column are 'Y' and NULL values are not inserted into the B-TREE. We have zero leaf rows (LF_ROWS) and no deleted leaf rows (DEL_LF_ROWS). Now we will update 100 rows with STATUS = 'N' and look at the index again.

update test set status = 'N'
where rownum <= 100;
100 rows updated.

commit;

analyze index fbi_test_idx validate structure;
Index analyzed.

select name, del_lf_rows, lf_rows, lf_blks
from index_stats;

NAME                           DEL_LF_ROWS    LF_ROWS    LF_BLKS
------------------------------ ----------- ---------- ----------
FBI_TEST_IDX                             0        100          1
Now the index has 100 entries - but the table has 9052 rows. Our index is very small and compact.
In the next step we will verify how the Function Based Index can be used. We will create a view with an additional column STATUS_FBI, which hides the complexity of the indexed expression.

create or replace view test_view
as
select test.*, decode(status,'N','N',NULL) status_fbi
from test;
desc test_view
Name                                      Null?    Type
----------------------------------------- -------- --------------
OWNER                                              VARCHAR2(30)
OBJECT_NAME                                        VARCHAR2(30)
SUBOBJECT_NAME                                     VARCHAR2(30)
OBJECT_ID                                          NUMBER
DATA_OBJECT_ID                                     NUMBER
OBJECT_TYPE                                        VARCHAR2(19)
CREATED                                            DATE
LAST_DDL_TIME                                      DATE
TIMESTAMP                                          VARCHAR2(19)
STATUS                                             VARCHAR2(7)
TEMPORARY                                          VARCHAR2(1)
GENERATED                                          VARCHAR2(1)
SECONDARY                                          VARCHAR2(1)
STATUS_FBI                                         VARCHAR2(1)
This approach works well with most Function Based Indexes. Rather than requiring developers or end users to know the exact function to use, they query a column in a view. That column contains the same expression that is used in the Function Based Index itself. If the logic ever changes, you can simply re-create the index and update the view, and all applications are fixed immediately.
Now query the created view after you have updated the optimizer statistics:

analyze table test compute statistics
for table
for all indexes
for all indexed columns
/
Table analyzed.
set autotrace on
select rowid, object_name from test_view
where status_fbi = 'N'
and rownum = 1;

ROWID              OBJECT_NAME
------------------ ------------------------------
AAACaSAAFAAAAoSAAA ICOL$

Execution Plan
----------------------------------------------------------
0   SELECT STATEMENT Optimizer=CHOOSE (Cost=2 Card=1 Bytes=29)
1 0   COUNT (STOPKEY)
2 1     TABLE ACCESS (BY INDEX ROWID) OF 'TEST' (TABLE)
3 2       INDEX (RANGE SCAN) OF 'FBI_TEST_IDX' (INDEX)

Statistics
----------------------------------------------------------
  8  recursive calls
  0  db block gets
  4  consistent gets
  0  physical reads
  0  redo size
471  bytes sent via SQL*Net to client
507  bytes received via SQL*Net from client
  2  SQL*Net roundtrips to/from client
  0  sorts (memory)
  0  sorts (disk)
  1  rows processed

As you can see, the index was used.
In the next step we will simulate some processing. We update the row selected above, set it to processed, and check the index again.

set autotrace off
update test set status = 'Y' where rowid = 'AAACaSAAFAAAAoSAAA';

analyze index fbi_test_idx validate structure;
select name, del_lf_rows, lf_rows, lf_blks
from index_stats;

NAME                           DEL_LF_ROWS    LF_ROWS    LF_BLKS
------------------------------ ----------- ---------- ----------
FBI_TEST_IDX                             1        100          1
There are still 100 entries in the index, but one of them is deleted and ready to be reused. Now we insert some unprocessed data.

insert into test select * from all_objects a where rownum <= 2;
2 rows created.

update test set status = 'N' where status not in ('N','Y');
2 rows updated.

commit;

analyze index fbi_test_idx validate structure;
select name, del_lf_rows, lf_rows, lf_blks
from index_stats;

NAME                           DEL_LF_ROWS    LF_ROWS    LF_BLKS
------------------------------ ----------- ---------- ----------
FBI_TEST_IDX                             0        101          1
Conclusion on Function Based Indexes

Pros:
- It's easy to use and implement, and provides immediate value.
- It can be used to speed up existing applications without changing any of their logic or queries. Many orders of magnitude of query improvement may be observed.
- It can be used to precompute complex values without using a trigger.
- It can be created either as a B-TREE or a bitmap index.
- The index can be built on an arithmetic expression or an expression containing PL/SQL functions, package functions, C callouts, or SQL built-in functions.
- The optimizer can estimate selectivity more accurately when the expressions are materialized in a function-based index, and a range scan can be used for queries whose WHERE clause contains the indexed expression.
- It provides efficient linguistic collation via NLS sort indexes.
- Indexes can be created on object columns and REF columns by using the methods defined for the object.

Cons:
- You cannot direct-path load a table with a function based index if that function was user-written and requires the SQL engine. That means you cannot direct-path load into a table indexed on my_soundex(x), but you could if it were indexed on upper(x).
- It will affect the performance of inserts and updates. (Remember, you insert a row once and query it thousands of times.)
- If the function you are indexing is a user-written function that returns a string, you may have to expose a view for the end users to use.

In general, the pros heavily outweigh the cons. The inability to direct-path load with a PL/SQL-based index can easily be overcome by indexing after the load with the parallel query option. The performance of inserts is only marginally affected; most applications won't even notice the effect.
Index-organized tables
Index-organized tables are tables whose data rows are grouped according to the primary key. The attributes of an index-organized table are stored entirely within the physical data structure of the index. Index-organized tables provide fast key-based access to table data for queries involving exact-match and range searches. Changes to the table data (such as adding, updating, or deleting rows) result only in updates to the index structure, because there is no separate table storage area. Storage requirements are also reduced because key columns are not duplicated in both a table and an index; the remaining non-key columns are stored in the index structure as well.
Index-organized tables are particularly useful for applications that must retrieve data based on a primary key, such as intersection tables.
Note the following important constraints:
- The data in an index-organized table is not duplicated.
- Index-organized tables can be reorganized with the MOVE clause of the ALTER TABLE statement.
- An index-organized table must have a primary key.
- Index-organized tables may not have unique constraints.
- They cannot contain LONGs or LOBs.
- The DBA_TABLES data dictionary view contains two columns, IOT_NAME and IOT_TYPE, that provide information on index-organized tables.
Here is an example:

CREATE TABLE my_intersection (
  id1 NUMBER(15)    NOT NULL,
  id2 NUMBER(15)    NOT NULL,
  job VARCHAR2(500) NULL,
  CONSTRAINT pk_my_intersection
  PRIMARY KEY (id1,id2)
)
ORGANIZATION INDEX
STORAGE (INITIAL 256K NEXT 256K PCTINCREASE 0)
TABLESPACE idx
INCLUDING job
OVERFLOW TABLESPACE tab;
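The IOT_NAME and IOT_TYPE dictionary columns mentioned above can be used to verify the result; a quick check might look like this:

```sql
-- List index-organized tables (and their overflow segments)
-- in the current schema; MY_INTERSECTION should appear here.
SELECT table_name, iot_type
FROM   user_tables
WHERE  iot_type IS NOT NULL;
```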
Bitmap Join Indexes
Overview
A join index is an index structure
which spans multiple tables and improves the performance of joins of those tables.
With materialized views, Bitmap join indexes can be particularly useful for star queries.
Due to their space-efficient compressed storage, bitmap join indexes also take up little
disk space.
Bitmap join indexes represent the join of
columns in two or more tables. With a bitmap join index, the value of a column in one
table (generally a dimension table) is stored with the associated ROWIDs of the like
value in other tables that the index is defined on. This provides fast join access
between the tables when a query uses the columns of the bitmap join index.
Example
Bitmap join indexes are best understood by
examining a simple example. Suppose that a data warehouse contains a star schema with a
fact table named SALES and a dimension table named CUSTOMER which holds each customer's
home location. A bitmap join index can be created which indexes SALES by customer home
locations.

Create the tables first:
DROP TABLE sales;
CREATE TABLE sales (
sales_id NUMBER(4) NOT NULL,
cust_id NUMBER(4) NOT NULL,
amount NUMBER(6) NOT NULL
)
/
DROP TABLE customer;
CREATE TABLE customer (
cust_id NUMBER(4) NOT NULL,
name VARCHAR2(20) NOT NULL,
region VARCHAR2(100) NOT NULL
)
/
ALTER TABLE customer ADD (
CONSTRAINT pk_customer
PRIMARY KEY (cust_id)
)
/
ALTER TABLE sales ADD (
CONSTRAINT pk_sales
PRIMARY KEY (sales_id)
)
/
ALTER TABLE sales ADD (
CONSTRAINT fk_sales_customer
FOREIGN KEY (cust_id)
REFERENCES customer (cust_id)
)
/
INSERT INTO customer VALUES (1,'Müller','Thun');
INSERT INTO customer VALUES (2,'Meier','Bern');
INSERT INTO customer VALUES (3,'Holzer','Münsigen');
INSERT INTO customer VALUES (4,'Ammann','Interlaken');
INSERT INTO customer VALUES (5,'Glaus','Gunten');
INSERT INTO customer VALUES (6,'Keller','Oberhofen');
INSERT INTO customer VALUES (7,'Indermühle','Gwatt');
INSERT INTO customer VALUES (8,'Stoller','Wimmis');
INSERT INTO customer VALUES (9,'Marty','Noflen');
INSERT INTO customer VALUES (10,'Schweizer','Seftigen');
COMMIT;
INSERT INTO sales VALUES (1,1,570);
INSERT INTO sales VALUES (2,10,1300);
COMMIT;
When the dimension table (here the CUSTOMER
table) is large (customer-based dimension tables can reach tens of millions of
records), the bitmap join index can vastly improve performance by not requiring
any access to the CUSTOMER table. In addition, bitmap join indexes can eliminate some
of the key iteration and bitmap merge work that is often present in star queries with
bitmap indexes on the fact table.
CREATE BITMAP INDEX cus_sal
ON sales (customer.region)
FROM sales, customer
WHERE sales.cust_id = customer.cust_id;
Now, let's run the corresponding query
for the bitmap join index we created. In this example we add the hint /*+ INDEX_COMBINE
(sales cus_sal) */ because we have only a few records in the CUSTOMER table.
set autotrace on
SELECT /*+ INDEX_COMBINE (sales cus_sal) */
SUM (sales.amount)
FROM sales, customer
WHERE sales.cust_id = customer.cust_id
AND customer.region = 'Thun';
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 SORT (AGGREGATE)
2 1 TABLE ACCESS (BY INDEX ROWID) OF 'SALES'
3 2 BITMAP CONVERSION (TO ROWIDS)
4 3 BITMAP INDEX (SINGLE VALUE) OF 'CUS_SAL'
The execution plan shows that
no access to the (large) CUSTOMER table
was necessary for this query!
Bitmap Join Index Restrictions
Join results must be stored; therefore,
bitmap join indexes have the following restrictions:
-
Parallel DML is currently only
supported on the fact table. Parallel DML on one of the participating dimension
tables will mark the index as unusable.
-
Only one table can be updated
concurrently by different transactions when using the bitmap join index.
-
No table can appear twice in the
join.
-
You cannot create a bitmap join index
on an index-organized table or a temporary table.
-
The columns in the index must all be
columns of the dimension tables.
-
The dimension table join columns must
be either primary key columns or have unique constraints.
-
If a dimension table has a composite
primary key, each column in the primary key must be part of the join.
How to measure Index Selectivity
Index Selectivity
B-TREE Indexes improve the performance of
queries that select a small percentage of rows from a table. As a general guideline, we
should create indexes on tables that are often queried for less than 15% of the table's rows. This value may be higher in
situations where all data can be retrieved from an index, or where the indexed columns
can be used for joining to other tables.
The ratio of the number of distinct values in the indexed
column (or columns) to the number of records in the table represents the selectivity of an
index. The ideal selectivity is 1. Such a selectivity can be reached only by unique
indexes on NOT NULL columns.
Example with good Selectivity
If a table has 100'000 records and one of
its indexed columns has 88'000 distinct values, then the selectivity of this index is
88'000 / 100'000 = 0.88.
Oracle implicitly creates indexes on the columns of all
unique and primary keys that you define with integrity constraints. These indexes are the
most selective and the most effective in optimizing performance. The selectivity of an
index is the percentage of rows in a table having the same value for the indexed column.
An index's selectivity is good if few rows have the same value.
Example with bad Selectivity
If an index on a table of 100'000 records
has only 500 distinct values, then the index's selectivity is 500 / 100'000 = 0.005, and
in this case a query which uses such an index will return 100'000 / 500
= 200 records for each distinct value. It is evident that a full table scan is
more efficient than using such an index, because much more I/O is needed to scan
the index and the table repeatedly.
How to Measure Index Selectivity
Manually measure index
selectivity
The ratio of the number of distinct values
to the total number of rows is the selectivity of the columns. This method is useful to
estimate the selectivity of an index before creating it.
select count(distinct job) "Distinct Values" from emp;

Distinct Values
---------------
              5

select count(*) "Total Number Rows" from emp;

Total Number Rows
-----------------
               14

Selectivity = Distinct Values / Total Number Rows
            = 5 / 14
            = 0.35
Automatically measure index selectivity
We can determine the selectivity of an
index by dividing the number of distinct indexed values by the number of rows in the
table.
create index idx_emp_job on emp(job);
analyze table emp compute statistics;
select distinct_keys from user_indexes
where table_name = 'EMP'
and index_name = 'IDX_EMP_JOB';
DISTINCT_KEYS
-------------
5
select num_rows from
user_tables
where table_name = 'EMP';
NUM_ROWS
---------
14
Selectivity = DISTINCT_KEYS / NUM_ROWS = 0.35
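The two dictionary lookups above can also be combined into one statement. This is a sketch (not syntax from the source) joining USER_INDEXES to USER_TABLES:

```sql
-- Selectivity of every analyzed index on EMP in a single query
SELECT i.index_name,
       i.distinct_keys / t.num_rows AS selectivity
FROM   user_indexes i
JOIN   user_tables  t ON t.table_name = i.table_name
WHERE  t.table_name = 'EMP'
AND    t.num_rows > 0;
```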
Selectivity of each individual Column
Assuming that the table has been analyzed,
it is also possible to query USER_TAB_COLUMNS to
investigate the selectivity of each column individually.
select column_name, num_distinct
from user_tab_columns
where table_name = 'EMP';

COLUMN_NAME                    NUM_DISTINCT
------------------------------ ------------
EMPNO                                    14
ENAME                                    14
JOB                                       5
MGR                                       2
HIREDATE                                 13
SAL                                      12
COMM                                      4
DEPTNO                                    3
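Dividing NUM_DISTINCT by the table's row count turns these figures into per-column selectivities. A minimal sketch, assuming the table has been analyzed so that NUM_ROWS is populated:

```sql
-- Per-column selectivity for EMP
SELECT c.column_name,
       c.num_distinct / t.num_rows AS selectivity
FROM   user_tab_columns c
JOIN   user_tables      t ON t.table_name = c.table_name
WHERE  c.table_name = 'EMP'
AND    t.num_rows > 0;
```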
Histograms and the DBMS_STATS Package
The ANALYZE statement
The ANALYZE statement should no longer be
used to collect optimizer statistics. The COMPUTE and ESTIMATE clauses of the ANALYZE
statement are supported solely for backward compatibility and may be removed in a future
release. The DBMS_STATS package collects a broader, more accurate set of statistics, and
gathers statistics more efficiently.
You may continue to use the ANALYZE statement
for other purposes not related to optimizer statistics collection.
Gathering Statistics with DBMS_STATS
Procedures
Statistics are gathered using the
DBMS_STATS package. This PL/SQL package is also used to modify, view, export,
import, and delete statistics.
The DBMS_STATS package can gather
statistics on tables and indexes, as well as on individual columns and partitions of
tables.
When you generate statistics for a table,
column, or index, if the data dictionary already contains statistics for the object, then
Oracle updates the existing statistics. The older statistics are saved and can be
restored later if necessary.
When statistics are updated for a database
object, Oracle invalidates any currently parsed SQL statements that access the object.
The next time such a statement executes, the statement is re-parsed and the optimizer
automatically chooses a new execution plan based on the new statistics.
sqlplus scott/tiger
exec dbms_stats.gather_table_stats ( -
  ownname          => 'SCOTT', -
  tabname          => 'EMP', -
  estimate_percent => dbms_stats.auto_sample_size, -
  method_opt       => 'for all columns size auto', -
  cascade          => true, -
  degree           => 5 -
)

PL/SQL procedure successfully completed.
ownname           Schema of table to analyze.
tabname           Name of table.
estimate_percent  Percentage of rows to estimate (NULL means compute). Use the
                  constant DBMS_STATS.AUTO_SAMPLE_SIZE to have Oracle determine
                  the appropriate sample size for good statistics. This is the
                  default.
method_opt        The default is FOR ALL COLUMNS SIZE AUTO.
cascade           Gather statistics on the indexes for this table. Index
                  statistics gathering is not parallelized. Using this option
                  is equivalent to running the GATHER_INDEX_STATS procedure on
                  each of the table's indexes. Use the constant
                  DBMS_STATS.AUTO_CASCADE to have Oracle determine whether
                  index statistics are to be collected or not.
degree            Degree of parallelism. The default for degree is NULL.
All Parameters / Options can be found in
the Oracle Manual
sqlplus scott/tiger
exec dbms_stats.gather_schema_stats ( -
  ownname          => 'SCOTT', -
  options          => 'GATHER', -
  estimate_percent => dbms_stats.auto_sample_size, -
  method_opt       => 'for all columns size auto', -
  cascade          => true, -
  degree           => 5 -
)

PL/SQL procedure successfully completed.
ownname           Schema to analyze (NULL means current schema).
options           gather        Reanalyzes the whole schema.
                  gather empty  Only analyzes tables that have no existing
                                statistics.
                  gather stale  Only reanalyzes tables with more than 10%
                                modifications (inserts, updates, deletes).
                  gather auto   Reanalyzes objects which currently have no
                                statistics and objects with stale statistics.
                                (Using gather auto is like combining gather
                                stale and gather empty.)
                  Note that both gather stale and gather auto require
                  monitoring. If you issue the ALTER TABLE xxx MONITORING
                  command, Oracle tracks changed tables with the
                  dba_tab_modifications view, which allows you to see the
                  exact number of inserts, updates, and deletes tracked since
                  the last analysis of statistics.
estimate_percent  Percentage of rows to estimate (NULL means compute). Use the
                  constant DBMS_STATS.AUTO_SAMPLE_SIZE to have Oracle determine
                  the appropriate sample size for good statistics. This is the
                  default.
method_opt        The default is FOR ALL COLUMNS SIZE AUTO.
cascade           Gather statistics on the indexes for this table. Index
                  statistics gathering is not parallelized. Using this option
                  is equivalent to running the GATHER_INDEX_STATS procedure on
                  each of the table's indexes. Use the constant
                  DBMS_STATS.AUTO_CASCADE to have Oracle determine whether
                  index statistics are to be collected or not.
degree            Degree of parallelism. The default for degree is NULL.
All Parameters / Options can be found in the Oracle
Manual
DBMS_STATS can collect optimizer statistics at the
following levels (see the Oracle Manual):
GATHER_DATABASE_STATS
GATHER_DICTIONARY_STATS
GATHER_FIXED_OBJECTS_STATS
GATHER_INDEX_STATS
GATHER_SCHEMA_STATS
GATHER_SYSTEM_STATS
GATHER_TABLE_STATS
Partitioned schema objects may contain
multiple sets of statistics. They can have statistics which refer to the entire schema
object as a whole (global statistics), they can have statistics which refer to an
individual partition, and they can have statistics which refer to an individual
sub-partition of a composite partitioned object.
Unless the query predicate narrows the
query to a single partition, the optimizer uses the global statistics. Because most
queries are not likely to be this restrictive, it is most important to have accurate
global statistics. Therefore, actually gathering global statistics with the
DBMS_STATS package is highly recommended.
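For a partitioned table, the GRANULARITY parameter of GATHER_TABLE_STATS controls which level of statistics is collected. A hedged sketch, where SALES_PART is a hypothetical partitioned table:

```sql
-- Gather only global statistics for a partitioned table (sketch)
exec dbms_stats.gather_table_stats ( -
  ownname     => 'SCOTT', -
  tabname     => 'SALES_PART', -
  granularity => 'GLOBAL' -
)
```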
Oracle Histogram Statistics
If DBMS_STATS discovers an index whose columns are
unevenly distributed, it will create histograms for that index to aid the cost-based SQL
optimizer in deciding between index and full-table scan access.
First we set up a table with some very skewed data: so
skewed that when we query WHERE ID=1, Oracle Database will want to use an index on ID,
and when we query WHERE ID=99, Oracle Database will not want to use an index.
sqlplus scott/tiger
CREATE TABLE skewed_data
AS
SELECT DECODE(ROWNUM,1,1,99) id,
all_objects.*
FROM all_objects
/
Table created.
CREATE INDEX idx_skewed_data ON skewed_data (id);
Index created.
begin
  dbms_stats.gather_table_stats (
    ownname    => USER,
    tabname    => 'SKEWED_DATA',
    method_opt => 'for all indexed columns size 254',
    cascade    => TRUE
  );
end;
/
PL/SQL procedure successfully completed.
set autotrace traceonly explain
SELECT * FROM skewed_data WHERE id=1;
Execution Plan
----------------------------------------------------------
Plan hash value: 1799207757
-----------------------------------------------------------------------------------------------
| Id | Operation                   | Name            | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                 |    1 |    96 |     2   (0)| 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID| SKEWED_DATA     |    1 |    96 |     2   (0)| 00:00:01 |
|* 2 |   INDEX RANGE SCAN          | IDX_SKEWED_DATA |    1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
SELECT * FROM skewed_data WHERE id=99;
Execution Plan
----------------------------------------------------------
Plan hash value: 746880940
---------------------------------------------------------------------------------
| Id | Operation         | Name        | Rows  | Bytes | Cost (%CPU)| Time     |
---------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |             | 51545 |  4832K|   190   (3)| 00:00:03 |
|* 1 |  TABLE ACCESS FULL| SKEWED_DATA | 51545 |  4832K|   190   (3)| 00:00:03 |
---------------------------------------------------------------------------------
The table skewed_data contains a column ID which
is very much skewed: most of the values are 99, with one record containing a value of 1.
After we index and gather statistics on the table (generating histograms on that indexed
column, so the optimizer knows that the data is skewed), we can see that the optimizer
prefers an index range scan over a full scan when ID=1 is used, and vice versa for
ID=99.
The cost-based optimizer
uses data value histograms to get accurate estimates of the
distribution of column data. Histograms provide improved selectivity estimates in the
presence of data skew, resulting in optimal execution plans with non-uniform data
distributions.
Histograms can affect performance and
should be used only when they substantially improve query plans. They are useful
only when they reflect the current data distribution of a given column. If the data
distribution of a column changes frequently, you must re-compute its histogram
frequently.
The number of histogram buckets to use is
specified with the SIZE parameter in method_opt:
method_opt => 'for all indexed columns size 254'
Histograms are bands of column values, so
that each band contains approximately the same number of rows. The useful information
that the histogram provides is where in the range of values the endpoints
fall.
Consider the following table column with
values between 1 and 100 and a histogram with 10 buckets. If the data in the
column is uniformly distributed, then the histogram looks as follows, where the
numbers are the endpoint values.

Histogram with Uniform Distribution
The number of rows in each bucket is 1/10
of the total number of rows in the table.
If the data is not uniformly distributed,
then the histogram might look as follows:

Histogram with Non-Uniform Distribution
In this case, most of the rows have the
value 5 for the column. Only 1/10 of the rows have values between 60 and
100.
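The bucket endpoints that DBMS_STATS stored can be inspected in the USER_TAB_HISTOGRAMS dictionary view. For the skewed_data example above:

```sql
-- Histogram endpoints recorded for the skewed ID column
SELECT endpoint_number, endpoint_value
FROM   user_tab_histograms
WHERE  table_name  = 'SKEWED_DATA'
AND    column_name = 'ID'
ORDER  BY endpoint_number;
```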
Conclusion
The package DBMS_STATS can be used to
gather global statistics. It is most important to have accurate global statistics for
partitioned schema objects. Histograms can affect performance and should be used only
when they substantially improve query plans.
Monitoring Unused Indexes
Overview
Oracle 8 introduced table-usage monitoring
to help streamline the process of statistics collection. To automatically gather
statistics for a particular table, enable the monitoring attribute using the MONITORING keyword, which is part of the CREATE TABLE and ALTER TABLE
statement syntax.
After it is enabled, Oracle monitors the
table for DML activity. This includes the approximate number of inserts, updates, and
deletes for that table since the last time statistics were gathered. Oracle uses this
data to identify tables with stale statistics.
View the data Oracle obtains from
monitoring these tables by querying the USER_TAB_MODIFICATIONS view.
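A minimal sketch of enabling monitoring and inspecting the tracked DML counts (column names as documented for USER_TAB_MODIFICATIONS):

```sql
ALTER TABLE emp MONITORING;

-- ... after some DML activity has been tracked ...
SELECT table_name, inserts, updates, deletes, timestamp
FROM   user_tab_modifications
WHERE  table_name = 'EMP';
```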
With Oracle 9, you can now also monitor the usage of
indexes. Unused indexes are, of course, a waste of space, and also can cause performance
problems, as Oracle is required to maintain the index each time the table associated with
the index is involved in a DML operation.
Index Monitoring
Use ALTER INDEX
index_name MONITORING USAGE clause to begin the collection of statistics on index
usage. This clause is useful in determining whether an index is being used.
Example
Create an index on the EMP table, issue a
SELECT statement which will use the index (check this with AUTOTRACE) and monitor the
index during this operation.
CONNECT scott/tiger
ALTER TABLE emp ADD CONSTRAINT pk_emp PRIMARY KEY (empno);
ALTER INDEX pk_emp MONITORING USAGE;
SET AUTOTRACE ON;
SELECT ename FROM emp WHERE empno = 7900;
Execution Plan
-------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE
1 0 TABLE ACCESS (BY INDEX ROWID) OF 'EMP'
2 1 INDEX (UNIQUE SCAN) OF 'PK_EMP' (UNIQUE)
COLUMN index_name FORMAT A10
COLUMN table_name FORMAT A10
COLUMN start_monitoring FORMAT A10
COLUMN end_monitoring FORMAT A10
SELECT * FROM v$object_usage;
INDEX_NAME TABLE_NAME MON USE START_MONI END_MONITO
---------- ---------- --- --- ---------- ----------
PK_EMP     EMP        YES YES 03/30/2002
ALTER INDEX pk_emp NOMONITORING USAGE;
While the index is in monitoring mode, you
can use the V$OBJECT_USAGE view to determine whether the
index has been used. Note that the view shows usage information only when queried by
the index owner. Query the V$OBJECT_USAGE column USED: if its value is NO, the index has
not been used; if its value is YES, the index has been used.
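Once a representative workload has run, a short report of monitored-but-unused indexes can be produced from the same view. A sketch, to be run as the index owner:

```sql
-- Indexes that were monitored but never used
SELECT index_name, start_monitoring
FROM   v$object_usage
WHERE  monitoring = 'YES'
AND    used       = 'NO';
```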
Reversing Indexes
Compared to a standard index, a reverse key
index reverses the bytes of each indexed column (except the rowid) while keeping the
column order.
In general, an Oracle reverse key index
relieves data block contention (buffer busy waits) when inserting into any index where the
index key is a monotonically increasing value which must be duplicated in the
higher-level index nodes.
With the index key reversal, only the
lowest-level index node is changed, and not all of the high-order index nodes, resulting
in far faster insert speed. For updates, Oracle updates the index nodes with each
update statement.
Reverse Benefits
- By reversing the keys of the index, the insertions
become distributed across all leaf keys in the index.
- Reverse key indexes are also used to reduce contention
in high concurrent insert situations
- Reverse-key indexes reduce «hot spots» in indexes,
especially primary key indexes, by reversing the bytes of the leaf blocks and thus
eliminating the contention for leaf blocks across instances.
Reverse Costs
The main cost of key reversal is that index range scans are no longer possible,
as the following example shows.
Example: Table with a number based reverse key
index
Create the Test Table
CREATE TABLE reverse_number (id NUMBER, name VARCHAR2(20));
Next, create a reverse key index on the id column
(a non-unique index is used here):
CREATE INDEX reverse_index ON reverse_number(id) REVERSE;
Let's insert a whole bunch of rows and collect
statistics
INSERT INTO reverse_number
SELECT rownum, 'Your Message'
FROM dual CONNECT BY LEVEL <= 1000000;
COMMIT;
EXEC dbms_stats.gather_table_stats (
ownname=>'TEST',
tabname=>'REVERSE_NUMBER',
estimate_percent=>null,
cascade=>TRUE,
method_opt=>'FOR ALL COLUMNS SIZE 1'
);
Let's attempt a very simple, innocent-looking range scan predicate:
set timing on
set autotrace on explain
SELECT * FROM reverse_number WHERE id BETWEEN 42 AND 43;
Execution Plan
------------------------------------------------------------------------------------
| Id | Operation         | Name           | Rows | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |                |    3 |    51 |   643   (2)| 00:00:08 |
|* 1 |  TABLE ACCESS FULL| REVERSE_NUMBER |    3 |    51 |   643   (2)| 00:00:08 |
------------------------------------------------------------------------------------
No good, Oracle performed a Full Table
Scan even though we were only after 2 rows. You can try to hint the thing as much as you
want, but the CBO does not consider Index Range Scans with range predicates.
SELECT /*+ INDEX(rd) */ * FROM reverse_number rd WHERE id BETWEEN 42 AND 43;
Execution Plan
----------------------------------------------------------------------------------------------
| Id | Operation                   | Name           | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                |    3 |    51 |  3027   (1)| 00:00:37 |
|  1 |  TABLE ACCESS BY INDEX ROWID| REVERSE_NUMBER |    3 |    51 |  3027   (1)| 00:00:37 |
|* 2 |   INDEX FULL SCAN           | REVERSE_INDEX  |   43 |       |  2984   (1)| 00:00:36 |
----------------------------------------------------------------------------------------------
Equality conditions are not a problem.
SELECT * FROM reverse_number WHERE id = 42;
Execution Plan
----------------------------------------------------------------------------------------------
| Id | Operation                   | Name           | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT            |                |    1 |    17 |     4   (0)| 00:00:01 |
|  1 |  TABLE ACCESS BY INDEX ROWID| REVERSE_NUMBER |    1 |    17 |     4   (0)| 00:00:01 |
|* 2 |   INDEX RANGE SCAN          | REVERSE_INDEX  |    1 |       |     3   (0)| 00:00:01 |
----------------------------------------------------------------------------------------------
If a range predicate can be rewritten as an IN condition,
Oracle can convert the predicate to separate OR equality conditions and can use the
reverse key index:
SELECT * FROM reverse_number WHERE id IN (42, 43);
Execution Plan
-----------------------------------------------------------------------------------------------
| Id | Operation                    | Name           | Rows | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------
|  0 | SELECT STATEMENT             |                |    2 |    34 |     6   (0)| 00:00:01 |
|  1 |  INLIST ITERATOR             |                |      |       |            |          |
|  2 |   TABLE ACCESS BY INDEX ROWID| REVERSE_NUMBER |    2 |    34 |     6   (0)| 00:00:01 |
|* 3 |    INDEX RANGE SCAN          | REVERSE_INDEX  |    2 |       |     4   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------
Example: Table with a text based reverse key
index
CREATE TABLE reverse_text AS SELECT * FROM dba_objects;
CREATE INDEX idx_reverse_text ON reverse_text(object_name) REVERSE;
EXEC dbms_stats.gather_table_stats (
ownname=>'TEST',
tabname=>'REVERSE_TEXT',
estimate_percent=> null,
cascade=> TRUE,
method_opt=>'FOR ALL COLUMNS SIZE 1'
);
SELECT * FROM reverse_text WHERE object_name LIKE 'ARCOS%';
Execution Plan
----------------------------------------------------------------------------------
| Id | Operation         | Name         | Rows | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------
|  0 | SELECT STATEMENT  |              |    2 |   186 |   146   (1)| 00:00:02 |
|* 1 |  TABLE ACCESS FULL| REVERSE_TEXT |    2 |   186 |   146   (1)| 00:00:02 |
----------------------------------------------------------------------------------
The index access path is not even considered by the CBO.
However, once again Oracle can pick up on when a LIKE is really equivalent to an equality
predicate.
SELECT * FROM reverse_text WHERE object_name LIKE 'ARCOS';
Execution Plan
------------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
------------------------------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 2 | 186 | 2 (0)| 00:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| REVERSE_TEXT | 2 | 186 | 2 (0)| 00:00:01 |
|* 2 | INDEX RANGE SCAN | IDX_REVERSE_TEXT | 2 | | 1 (0)| 00:00:01 |
------------------------------------------------------------------------------------------------
The index is considered in this example.