Index Rebuild: Magic or Voodoo?

I actually don’t like to write about today’s topic. However, I hope you will agree this story is worth telling.

Via Freelist, somebody asked whether an index rebuild could occasionally be useful. There was a case at hand where an index rebuild seemed to be useful after a big delete. I sometimes get annoyed by the popular myth that regular index rebuilds are helpful. This is one of the more popular database myths, as branded by Mr. Index, Richard Foote.

Thus, I could not resist the opportunity to trash the myth I don’t like and answered that there is no such case. I ignored the warning whisper in my head telling me that things are never that easy.
Of course, the unavoidable happened: Jonathan Lewis corrected me, stating that there are corner cases where an index rebuild can be useful (e.g. after a big delete).

As if this were not enough, shortly thereafter I came across the little statement below. (I am presenting a test case that makes use of the SH schema.)
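For context, here is a sketch of how such a test case might be set up against the SH sample schema. The column and index names match the ones in the plan below; the setup itself is my assumption, not taken from the original post. Note that a B*tree index does not store entries whose key is entirely null, so a freshly created index on a mostly-null column is very small.

```sql
-- Assumed setup (not shown in the original post): add a mostly-null
-- column to SH.SALES and index it. Since SALES is partitioned, the
-- default CREATE INDEX produces a global nonpartitioned index,
-- matching the GLOBAL INDEX ROWID access in the plan.
alter table sales add (sparse number);

create index sparse_idx on sales (sparse);

-- Gather fresh statistics so the optimizer works with current numbers.
begin
  dbms_stats.gather_table_stats(
    ownname => 'SH',
    tabname => 'SALES',
    cascade => true);
end;
/
```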

select time_id from sales where sparse is not null;

The execution plan with runtime statistics looks like this:

| Id  | Operation                                  | Name       | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
|   0 | SELECT STATEMENT                           |            |      1 |        |      0 |00:00:00.04 |   41770 |
|   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID BATCHED| SALES      |      1 |      1 |      0 |00:00:00.04 |   41770 |
|*  2 |   INDEX FULL SCAN                          | SPARSE_IDX |      1 |      1 |      0 |00:00:00.04 |   41770 |

Predicate Information (identified by operation id):

   2 - filter("SPARSE" IS NOT NULL)

Why did the optimizer choose a full index scan? Actually, a full table scan would be much better. The estimate (E-Rows) is correct. The statistics are current.
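One way to investigate such a mismatch (a sketch of the general diagnostic approach, not taken from the post) is to compare the optimizer’s view of the index with its physical structure:

```sql
-- The optimizer's view: LEAF_BLOCKS as last gathered by DBMS_STATS.
select leaf_blocks
from   user_indexes
where  index_name = 'SPARSE_IDX';

-- The physical reality: VALIDATE STRUCTURE populates INDEX_STATS
-- (one row, visible only to the current session, overwritten by
-- the next VALIDATE STRUCTURE).
analyze index sparse_idx validate structure;

select lf_blks, del_lf_rows, used_space, btree_space
from   index_stats;
```

If LF_BLKS in INDEX_STATS is far larger than LEAF_BLOCKS in USER_INDEXES, the optimizer is costing the index as much smaller than it really is.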

Can you guess what is missing from my test case? Hint: I researched the DML on this table and found an update statement.

4 thoughts on “Index Rebuild: Magic or Voodoo?”

  1. Is this the sort of column that gets set to a not-null value as a flag that some task needs to be run against the row? The task would then set this column back to null to mark it as processed. If you had a big batch to run, or the task hadn’t run in some time (looking at your 41K buffers, that must be quite a lot of rows!), then the index would have grown – setting the column back to null isn’t going to shrink it again on its own, but it does mean the index can shrink if asked to.

    The same sort of thing happens with AQ tables, except the rows get deleted (although that can happen some time after processing).


  2. > Why did the optimizer choose a full index scan? Actually a full table scan would be much better.

    Maybe you updated the “sparse” column to null for all the rows and gathered statistics with DBMS_STATS, which doesn’t count empty index leaf blocks that are still in the index structure? Thus an index (fast) full scan is costed much lower than it really is at runtime, where all the empty leaf blocks are accessed. The cost then also comes out lower than that of a full table scan.
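The scenario this comment describes can be sketched as follows. The exact statements are my assumption; the point is that updating an indexed column to null deletes the index entries but leaves the emptied leaf blocks inside the index structure, where a full scan must still visit them (the 41,770 buffers in the plan above):

```sql
-- Hypothetical history: a batch once flagged many rows ...
update sales set sparse = 1;
commit;

-- ... and later processing set the flag back to null. The index
-- entries are deleted, but the now-empty leaf blocks remain linked
-- into the index structure.
update sales set sparse = null;
commit;
```

This is exactly the corner case where maintenance helps. A sketch of the possible remedies (COALESCE merges adjacent sparse leaf blocks online and is comparatively cheap; REBUILD recreates the structure):

```sql
-- Option 1: merge adjacent, mostly-empty leaf blocks.
alter index sparse_idx coalesce;

-- Option 2: rebuild the index.
alter index sparse_idx rebuild online;

-- Re-gather statistics afterwards so the optimizer's cost model
-- matches the new structure.
begin
  dbms_stats.gather_index_stats(ownname => 'SH', indname => 'SPARSE_IDX');
end;
/
```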

