If you have multiple CPUs available, you may be tempted to enable parallel
execution on every table and let the Oracle optimizer decide on the degree of
parallelism, or whether parallel execution should be used at all. But what is a
great thing on big tables turns out badly on small ones: splitting the work
across more than one CPU can cause an overhead that exceeds the single-CPU
execution time by factors. While the optimizer recognises that index accesses
should not use parallel execution, it does not do the same for full table
scans. This sounds reasonable at first, but consider that on smaller tables a
full table scan is often the fastest option.
To conclude: in our experience you can save a factor of 3 to 4 in execution
time if you do not turn on parallel execution for your smaller tables.
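As a starting point, the data dictionary shows which tables currently have a
parallel degree set. This is only a sketch: the 10000-row cutoff is our
arbitrary notion of "small", and note that DEGREE is stored as a blank-padded
string in USER_TABLES:

```sql
-- Find small, analyzed tables that still have a parallel degree set.
-- The 10000-row cutoff is an assumption; adjust it to your data.
SELECT table_name, TRIM(degree) AS degree, num_rows
FROM   user_tables
WHERE  TRIM(degree) NOT IN ('0', '1')   -- degree set explicitly or DEFAULT
AND    num_rows < 10000                 -- NUM_ROWS is only populated by ANALYZE
ORDER BY num_rows;
```

NUM_ROWS is NULL for tables that have never been analyzed, so such tables will
not appear in this list.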
Let's try some examples! We create a small table, turn on parallel execution
and fill in 100 records. We have tested the example in the following
environment:
- Oracle RDBMS Version 8.1.7.1.0
- SUN Solaris 8
- Number of CPUs = 6
- Tables analyzed with ANALYZE
Create the following table
DROP TABLE sample;
CREATE TABLE sample (
a NUMBER NOT NULL,
b NUMBER,
c VARCHAR2(50));
ALTER TABLE sample ADD
(CONSTRAINT pk_sample PRIMARY KEY (a));
ALTER TABLE sample PARALLEL;
Fill in some values
BEGIN
  FOR i IN 1..100 LOOP
    INSERT INTO sample (a, b, c) VALUES (i, 99, 'xyz');
  END LOOP;
  COMMIT;
END;
/
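The environment above notes that the tables were analyzed with ANALYZE.
Gathering statistics at this point might look as follows; the exact options
(COMPUTE rather than ESTIMATE) are our assumption:

```sql
-- Gather optimizer statistics on the freshly loaded table.
-- COMPUTE is fine here because the table is tiny.
ANALYZE TABLE sample COMPUTE STATISTICS;
```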
We are going to select one single row which results in a primary key access without
parallel execution.
SET AUTOTRACE TRACEONLY EXPLAIN;
SELECT * FROM sample WHERE a = 30;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=53)
   1    0   TABLE ACCESS (BY INDEX ROWID) OF 'SAMPLE' (Cost=1 Card=1 Bytes=53)
   2    1     INDEX (RANGE SCAN) OF 'PK_SAMPLE' (UNIQUE) (Cost=1 Card=1)
That's fine. Let's try a full table scan now.
SELECT * FROM sample;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=41 Bytes=2173)
   1    0   TABLE ACCESS* (FULL) OF 'SAMPLE' (Cost=1 Card=41 Bytes=2173)  :Q3913003
   1    0   PARALLEL_TO_SERIAL  SELECT /*+ Q3913000 NO_EXPAND ROWID(A1) */
              A1."A",A1."B",A1."C" FROM "SAMPLE" PX
Here we get parallel execution across multiple CPUs.
As another test we turn off parallel execution and run the same
statement again.
ALTER TABLE sample NOPARALLEL;
SELECT * FROM sample;
Execution Plan
----------------------------------------------------------
   0      SELECT STATEMENT Optimizer=CHOOSE
   1    0   TABLE ACCESS (FULL) OF 'SAMPLE'
How about execution times?
We gained some experience on a data warehouse where, besides the really big
tables, data evaluations on small tables have to be performed several million
times every day. After turning off parallel execution we got a performance
improvement by a factor of 3 to 4.
What is not easy to prove within a small tip of the week can be
foreseen if we look at the following numbers:
Elapsed: 00:00:00.02
Elapsed: 00:00:00.10
We got them by executing the sample multiple times in both configurations.
Non-parallel execution needed 20 milliseconds, whereas parallel execution
converged to 100 milliseconds.
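The timings can be reproduced in SQL*Plus along these lines (a sketch;
absolute values depend heavily on your machine and load):

```sql
SET AUTOTRACE OFF
SET TIMING ON

ALTER TABLE sample PARALLEL;       -- parallel on: expect the slower timings
SELECT * FROM sample;

ALTER TABLE sample NOPARALLEL;     -- parallel off: expect the faster timings
SELECT * FROM sample;
```

Run each SELECT several times and compare the Elapsed lines printed by
SET TIMING; the first execution of each variant also pays parsing and
caching costs.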
Note that in most cases a full table scan may be the fastest query if only
10-20% of the table content is selected, no matter how big or small the table
is. Unfortunately, the cost-based optimizer does not recognize that
parallelizing our small table cannot pay off, even if it consists of only one
extent!
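If you cannot or do not want to change the table attribute, the same effect
can be achieved per statement with a hint; in the 8i releases it is spelled
NOPARALLEL:

```sql
-- Suppress parallel execution for this statement only; the table
-- keeps its PARALLEL attribute for all other queries.
SELECT /*+ NOPARALLEL(s) */ *
FROM   sample s;
```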