Solution: Using Runtime Statistics

In my practise runtime statistics are the most important optimization tool for non parallel queries.
Runtime statistics can be generated quickly and usually contain already all the necessary information for a optimization.
First I repeat the runtime statistics of the riddle:

 

-----------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads     |
-----------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |      |      1 |        |      1 |00:00:15.42 |    3540 |      3262 |
|   1 |  SORT AGGREGATE              |      |      1 |      1 |      1 |00:00:15.42 |    3540 |      3262 |
|*  2 |   TABLE ACCESS BY INDEX ROWID|   T1 |      1 |  52695 |     10 |00:00:15.42 |    3540 |      3262 |
|*  3 |    INDEX RANGE SCAN          |   I1 |      1 |     17M|   4577 |00:00:00.16 |      11 |         9 |
-----------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter(("NVL("X1_STATUS",'A')<>'U' AND NVL("X2_STATUS",'A')<>'D' 
               AND NVL("X3_STATUS",'A')<>'D' AND NVL("X4_STATUS",'A')<>'D' 
               AND NVL("X5_STATUS",'A')<>'U'))
   3 - access("T1_ID"=:B2 AND "T2_ID"=:B4 AND "T3_ID"=:B1)

It is initially striking again that the estimated rows in operation 3 are way off from the actual rows. It becomes obvious when we compare the “estimated rows” (E-rows) with the “actual rows” (A-Rows). Since it is a very simple compound condition with a “=” comparision operator we can to assume a functional dependency as a likely reason for the estimation error.

The corresponding countermeasure would be extended statistics.
However, most of the time is lost in the operation 2. Here the “actual rows” of 4577 rows in operation 3 are reduced to merely 10 row s in operation 2.. This is due to the filter condition in operation 2:

2 - filter(("NVL("X1_STATUS",'A')<>'U' AND NVL("X2_STATUS",'A')<>'D' 
               AND NVL("X3_STATUS",'A')<>'D' AND NVL("X4_STATUS",'A')<>'D' 
               AND NVL("X5_STATUS",'A')<>'U'))

Acoording to the  tuning technique Throw-away” invented by Oak Table member Martin Berg we would need to add some extra columns to the index in order to avoid the “throw away” rows in oepration 2.

I have found very clever recommendations in my comments. However, you do not have to be very sofisticated in this case. We have tested the status fields and found which increase the selectivity of the index. There status fields we simply add to the index. The application the developer wrongly assumed the application would not benefit from the extra fields, because in a non-equal condition, no index can be used.

This is not correct. The tree structure can not be used. However, the tree structure is only needed to establish the first leaf block for an index range scan. From there onwards a serial is over the leaf blocks will be done.

In this serial search the datebase can use any condition independend of the comparison operators to filter rows. As we can see the statistics ïn below runtime statistics, that’s enough to ensure a good performance. The improvement factor is run 700.

Not even the improvement of statistics is ultimately required for this case. Although the optimizer estimates wrong it still perceives the additional columns as an advantage.

Of course, the extended statistics would still be worthwhile.

 

-----------------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-----------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |                     |      1 |        |      1 |00:00:00.02 |   12 |         4 |
|   1 |  SORT AGGREGATE              |                     |      1 |      1 |      1 |00:00:00.02 |   12 |         4 |
|*  2 |   TABLE ACCESS BY INDEX ROWID| MM_SALES_DELIVERIES |      1 |  52695 |     10 |00:00:00.02 |   12 |         4 |
|*  3 |    INDEX RANGE SCAN          | PR_SALE_DEL_03      |      1 |  52695 |     10 |00:00:00.01 |    6 |         0 |
-----------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   2 - filter((NVL("MM_DISTRIBUTION_DEL_STATUS",'H')<>'D' AND NVL("MM_OUTBOUND_DEL_STATUS",'H')<>'D' AND
              NVL("MM_OUTBOUND_STATUS",'H')<>'U'))
   3 - access("MM_WAREHOUSE_ID"=:B4 AND "MM_FIRM_ID"=:B1 AND "MM_ITEM_ID"=:B2 AND "MM_FROM_LOCATION_ID"=:B3)
       filter((NVL("MM_DISTRIBUTION_STATUS",'H')<>'U' AND NVL("MM_DIRECT_DEL_STATUS",'H')<>'D'))

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s