Solution: Navigate many shallow hierachies in parallel

Lets create the indexes we need. There might be better options, but these will do:


create index i1 on tree_emp( empno);
create index i2 on tree_emp( mgr);

Now we define the pipe table package. Of course it is possible to have a more efficient solution, but I want to show the principle.


create or replace package parallel_access
as

TYPE R_REC IS RECORD (empno tree_emp.empno%TYPE,
sum_salaries Number); — result record, change defintion according to your needs
TYPE refcur_t IS REF CURSOR RETURN R_REC;

TYPE result_Tab is TABLE OF R_REC;

FUNCTION passData (p_ref refcur_t) RETURN result_Tab
PIPELINED
PARALLEL_ENABLE(PARTITION p_ref BY ANY); — function will inherit parallelism from ref cursor

END parallel_access;
/

create or replace package body parallel_access
as
FUNCTION passData (p_ref refcur_t) RETURN result_Tab
PIPELINED PARALLEL_ENABLE(PARTITION p_ref BY ANY)
IS
out_rec r_rec;
BEGIN
execute immediate ‘alter session set “_old_connect_by_enabled”=true’;
LOOP — for each parallel process
FETCH p_ref INTO out_rec;
EXIT WHEN p_ref%NOTFOUND;
SELECT sum(sal)
INTO out_rec.sum_salaries
FROM tree_emp
CONNECT BY PRIOR EMPNO = MGR
START WITH mgr = out_rec.empno;

PIPE ROW(out_rec);
END LOOP;
execute immediate ‘alter session set “_old_connect_by_enabled”=false’;
RETURN;
END passData;

END parallel_access;
/

We can use the package as follows. Parallel 4 is just an example, one can choose the parallelism as needed.

SELECT b.* FROM
TABLE(parallel_access.passdata (CURSOR( select /*+ parallel (d 4) */ empno , null from tree_emp where mgr is null))) b;

I tested the example with parallel 16 on my laptop (4 core CPU). The piped table was about twice as fast as the standard plan.

Navigate many shallow hierachies in parallel

Suppose you have a large number of flat hierarchies. As an example we take the emp table from the Scott schema.
Using it is we create a table Tree_emp as shown here:


create table tree_emp as select * from emp;

alter table tree_emp modify (empno number(18));
alter table tree_emp modify (mgr number(18));

INSERT INTO tree_emp
( EMPNO, ENAME, JOB, MGR, HIREDATE, SAL, COMM, DEPTNO
)
;
SELECT EMPNO+(10000*step.n) empno,
ENAME,
JOB,
MGR+(10000*step.n) mgr,
HIREDATE,
TRUNC(DBMS_RANDOM.value(low => 8, high => 50))*100 SAL,
COMM,
DEPTNO
FROM emp,
(SELECT rownum n FROM dual CONNECT BY rownum <= 1000000
) step
;

Show for all presidents (mgr is zero) the sum of the salaries of all subordinate . Write a pipelined table function that can navigate multiple trees simultaneously.
Use Oracle’s connect by Syntax.You can index as you like.

Solution: Using Runtime Statistics

In my practise runtime statistics are the most important optimization tool for non parallel queries.
Runtime statistics can be generated quickly and usually contain already all the necessary information for a optimization.
First I repeat the runtime statistics of the riddle:

 

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |      1 |00:00:15.42 |    3540 |      3262 |
|   1 |  SORT AGGREGATE              |      |      1 |      1 |      1 |00:00:15.42 |    3540 |      3262 |
|*  2 |   TABLE ACCESS BY INDEX ROWID|   T1 |      1 |  52695 |     10 |00:00:15.42 |    3540 |      3262 |
|*  3 |    INDEX RANGE SCAN          |   I1 |      1 |     17M|   4577 |00:00:00.16 |      11 |         9 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("NVL("X1_STATUS",'A')<>'U' AND NVL("X2_STATUS",'A')<>'D' 
               AND NVL("X3_STATUS",'A')<>'D' AND NVL("X4_STATUS",'A')<>'D' 
               AND NVL("X5_STATUS",'A')<>'U'))
   3 - access("T1_ID"=:B2 AND "T2_ID"=:B4 AND "T3_ID"=:B1)

It is initially striking again that the estimated rows in operation 3 are way off from the actual rows. It becomes obvious when we compare the “estimated rows” (E-rows) with the “actual rows” (A-Rows). Since it is a very simple compound condition with a “=” comparision operator we can to assume a functional dependency as a likely reason for the estimation error.

The corresponding countermeasure would be extended statistics.
However, most of the time is lost in the operation 2. Here the “actual rows” of 4577 rows in operation 3 are reduced to merely 10 row s in operation 2.. This is due to the filter condition in operation 2:

2 - filter(("NVL("X1_STATUS",'A')<>'U' AND NVL("X2_STATUS",'A')<>'D' 
               AND NVL("X3_STATUS",'A')<>'D' AND NVL("X4_STATUS",'A')<>'D' 
               AND NVL("X5_STATUS",'A')<>'U'))

Acoording to the  tuning technique Throw-away” invented by Oak Table member Martin Berg we would need to add some extra columns to the index in order to avoid the “throw away” rows in oepration 2.

I have found very clever recommendations in my comments. However, you do not have to be very sofisticated in this case. We have tested the status fields and found which increase the selectivity of the index. There status fields we simply add to the index. The application the developer wrongly assumed the application would not benefit from the extra fields, because in a non-equal condition, no index can be used.

This is not correct. The tree structure can not be used. However, the tree structure is only needed to establish the first leaf block for an index range scan. From there onwards a serial is over the leaf blocks will be done.

In this serial search the datebase can use any condition independend of the comparison operators to filter rows. As we can see the statistics ïn below runtime statistics, that’s enough to ensure a good performance. The improvement factor is run 700.

Not even the improvement of statistics is ultimately required for this case. Although the optimizer estimates wrong it still perceives the additional columns as an advantage.

Of course, the extended statistics would still be worthwhile.

 

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |      1 |        |      1 |00:00:00.02 |   12 |         4 |
|   1 |  SORT AGGREGATE              |                     |      1 |      1 |      1 |00:00:00.02 |   12 |         4 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| MM_SALES_DELIVERIES |      1 |  52695 |     10 |00:00:00.02 |   12 |         4 |
|*  3 |    INDEX RANGE SCAN          | PR_SALE_DEL_03      |      1 |  52695 |     10 |00:00:00.01 |    6 |         0 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter((NVL("MM_DISTRIBUTION_DEL_STATUS",'H')<>'D' AND NVL("MM_OUTBOUND_DEL_STATUS",'H')<>'D' AND
              NVL("MM_OUTBOUND_STATUS",'H')<>'U'))
   3 - access("MM_WAREHOUSE_ID"=:B4 AND "MM_FIRM_ID"=:B1 AND "MM_ITEM_ID"=:B2 AND "MM_FROM_LOCATION_ID"=:B3)
       filter((NVL("MM_DISTRIBUTION_STATUS",'H')<>'U' AND NVL("MM_DIRECT_DEL_STATUS",'H')<>'D'))