Friday, October 11, 2013

It’s not about ROWS

And it’s not about COST either.  We’ve all seen it: a plan with a high cost that runs better than a plan with a lower cost.  Maybe we haven’t noticed it, but it happens all the time.  Why?  Well, it’s because what really matters is the number of blocks touched to get the data, not how many rows come back or some mystical cost number.  I’d like you, as a SQL Optimization Specialist, to STOP looking at rows and cost when it comes to optimization.  The BEST plan, more often than not, is the one that returns the rows the business needs while touching the fewest blocks possible.

Here is a simple example to illustrate the point.

Two versions of the same table: one has the data well packed around the key we are looking for (densely packed data), and the other has the data very spread out based on the same key (sparsely packed data).  Both tables are logically the same, and each has an index on the same column, OBJECT_ID.  The following listing shows the relevant statistics for the two tables and their indexes.
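(If you want to set up a test like this yourself, here is a sketch of how such a pair of tables could be built.  The table and column names match the test below, but the exact expressions are illustrative; they aren’t necessarily the script I used.)

-- Densely packed: rows are generated in OBJECT_ID order, so rows with
-- the same OBJECT_ID land in adjacent blocks.
create table aphys1 as
select ceil(rownum / 100)     as object_id  -- values 1..100, 100 rows each
     , rpad(rownum, 100, 'x') as col2       -- unique ~100-byte filler
  from dual
connect by level <= 10000;

-- Sparsely packed: the same values generated round-robin, so each
-- OBJECT_ID value is scattered across nearly every block in the table.
create table aphys2 as
select mod(rownum - 1, 100) + 1 as object_id
     , rpad(rownum, 100, 'x')   as col2
  from dual
connect by level <= 10000;

create index aphys1_n1 on aphys1 (object_id);
create index aphys2_n1 on aphys2 (object_id);

exec dbms_stats.gather_table_stats(user, 'APHYS1', cascade => true)
exec dbms_stats.gather_table_stats(user, 'APHYS2', cascade => true)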

************* First Table APHYS1 ***********
TABLE STATISTICS
Owner         : op
Table name    : aphys1
# Rows        : 10000
# Blocks      : 164

COLUMN STATISTICS
Name        Analyzed              Null?  NDV       Density  # Nulls
---------------------------------------------------------------------
object_id   20-SEP-2013 19:54:10  Y      100       .010000  0
col2        20-SEP-2013 19:54:10  Y      10000     .000100  0

INDEX INFORMATION
Index name    : aphys1_n1
Rows          : 10000
Levels        : 1
Leaf Blocks   : 20
Distinct Keys : 100
Clust. Factor : 152
Table Rows    : 10000
Table Blocks  : 164

INDEX COLUMNS INFORMATION
Index Name                           Pos# Order Column Name
---------------------------------------------------------------------
aphys1_n1                               1 ASC   object_id

************* Second Table APHYS2 ***********
TABLE STATISTICS
Owner         : op
Table name    : aphys2
# Rows        : 10000
# Blocks      : 164

COLUMN STATISTICS
Name        Analyzed              Null?  NDV       Density  # Nulls
---------------------------------------------------------------------
object_id   20-SEP-2013 19:54:11  Y      100       .010000  0
col2        20-SEP-2013 19:54:11  Y      10000     .000100  0

INDEX INFORMATION
Index name    : aphys2_n1
Rows          : 10000
Levels        : 1
Leaf Blocks   : 20
Distinct Keys : 100
Clust. Factor : 10000
Table Rows    : 10000
Table Blocks  : 164

INDEX COLUMNS INFORMATION
Index Name                           Pos# Order Column Name
---------------------------------------------------------------------
aphys2_n1                               1 ASC   object_id

Here is the query used on each table (it’s the same query, just against a different table):

SQL> get aphys1
  1  select *
  2    from aphys1
  3*  where object_id between 1 and 4
SQL> get aphys2
  1  select *
  2    from aphys2
  3*  where object_id between 1 and 4

And here are the run stats of each; these are the STAT lines from a 10046 trace for each query.
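(If you haven’t captured one of these before, a 10046 trace can be turned on for your own session like this; the tracefile identifier is just a label I picked, and level 8 would do if you don’t need bind values.)

alter session set tracefile_identifier = 'aphys_test';
alter session set events '10046 trace name context forever, level 12';

-- run the query here, then turn the trace off:
alter session set events '10046 trace name context off';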

For APHYS1:
STAT #339895352 id=1 cnt=400 pid=0 pos=1 obj=76982
      op='TABLE ACCESS BY INDEX ROWID APHYS1
        (cr=11 pr=0 pw=0 time=859 us cost=11 size=52312 card=503)'
STAT #339895352 id=2 cnt=400 pid=1 pos=1 obj=76983
       op='INDEX RANGE SCAN APHYS1_N1
         (cr=3 pr=0 pw=0 time=352 us cost=3 size=0 card=503)'

For APHYS2:
STAT #339913808 id=1 cnt=400 pid=0 pos=1 obj=76984
    op='TABLE ACCESS FULL APHYS2
      (cr=156 pr=0 pw=0 time=1371 us cost=30 size=52312 card=503)'

So what can we see?  Well, both return exactly 400 rows (this is the “cnt” value on the top STAT line for each query); good, they both return the same data, since this test would be worthless if they didn’t.  The first one, with the densely packed data, uses the index (an index range scan) and has a cost of 11.  The second one does a full table scan and has a cost of 30.
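(By the way, you don’t need a trace just to see these costs; EXPLAIN PLAN with DBMS_XPLAN shows the optimizer’s costed plan without running the query, something like this:)

explain plan for
select * from aphys2 where object_id between 1 and 4;

select * from table(dbms_xplan.display);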

Well, doesn’t that mean the full table scan is a bad idea?  Hey, its cost is higher!  That’s bad, right?  The index must be better, so why did the optimizer pick the full scan?  Well, let’s force the second query to use the index, like this:

SQL> get aphys3
  1  select /*+ INDEX (aphys2 aphys2_n1) */
  2         *
  3    from aphys2
  4*  where object_id between 1 and 4

Here are the stats of its run:

STAT #339651456 id=1 cnt=400 pid=0 pos=1 obj=76984
     op='TABLE ACCESS BY INDEX ROWID APHYS2
       (cr=403 pr=0 pw=0 time=710 us cost=507 size=52312 card=503)'

STAT #339651456 id=2 cnt=400 pid=1 pos=1 obj=76985
     op='INDEX RANGE SCAN APHYS2_N1
       (cr=3 pr=0 pw=0 time=127 us cost=3 size=0 card=503)'

Yes, it did use the index, but this time the cost is 507!  Wow!  No wonder the optimizer chose the full scan.  Notice the number of LIOs (the “cr” value) in this one is 403.  The table only has 164 blocks, so this query is doing the equivalent of about two and a half full table scans (403 / 164 ≈ 2.5).  The plan for APHYS2 that did the full scan had an LIO count of only 156, and the one for APHYS1 had just 11.  (There is a stat not shown in the STAT line, the “buffer pin count”, but that will wait for another day.)
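(A quicker way to see the LIO numbers without digging through a raw trace file is SQL*Plus autotrace, where LIOs show up as “consistent gets”; note it needs the PLUSTRACE role.)

set autotrace traceonly statistics
select /*+ INDEX (aphys2 aphys2_n1) */ *
  from aphys2
 where object_id between 1 and 4;
set autotrace off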

Yes, the index was good for the APHYS1 table because the data we wanted was densely packed.  The same index on what appeared to be the “same” table, APHYS2, was a horrible idea.  So where is the data?  I’ll use a nifty little script from the Hotsos Harness Tool Pack to show the density of the data.  The script goes to the table and shows, at both the BLOCK and ROW level, how well packed the data is (or isn’t).

SQL> @hds
Table Owner  : op
Table Name   : aphys1
Column List  : OBJECT_ID
Where Clause : where object_id between 1 and 4
Page Size[30]:

Table blocks below hwm    Table rows
         (B)                 (R)
---------------------- ----------------
                   164           10,000
More:

                Block selectivity  Block count    Row selectivity     Row count
      OBJECT_ID    (pb = b/B)          (b)          (pr = r/R)           (r)
--------------- ----------------- -------------- ----------------- ----------------
              1             1.83%              3             1.00%              100
              3             1.83%              3             1.00%              100
              2             1.22%              2             1.00%              100
              4             1.22%              2             1.00%              100
SQL>
SQL>
SQL> @hds
Table Owner  : op
Table Name   : aphys2
Column List  : OBJECT_ID
Where Clause : where object_id between 1 and 4
Page Size[30]:

Table blocks below hwm    Table rows
         (B)                 (R)
---------------------- ----------------
                   164           10,000
More:

                Block selectivity  Block count    Row selectivity     Row count
      OBJECT_ID    (pb = b/B)          (b)          (pr = r/R)           (r)
--------------- ----------------- -------------- ----------------- ----------------
              1            60.98%            100             1.00%              100
              2            60.98%            100             1.00%              100
              3            60.98%            100             1.00%              100
              4            60.98%            100             1.00%              100


This does show that the two tables are logically the same (164 blocks and 10,000 rows each).  But it also shows that the data we want from the first table sits in a very small number of blocks, so an index is really a good idea: we only have to visit about 1-2% of the table’s blocks for each value we want.  For the second version of the table, however, each value is spread over more than 60% of the table’s blocks.  Notice that at the row level the data is the same for the two tables: 1% of the rows for each value.
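(If you don’t have the Hotsos toolkit handy, you can get a rough version of the block-level numbers yourself with DBMS_ROWID; this sketch counts how many distinct table blocks hold the rows for each value:)

select object_id
     , count(distinct dbms_rowid.rowid_block_number(rowid)) as blocks_touched
     , count(*)                                             as row_cnt
  from aphys2
 where object_id between 1 and 4
 group by object_id
 order by object_id;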

How does the optimizer know this?  It doesn’t have the info the “hds” script shows, but it does have the clustering factor on the indexes.  For APHYS1 the index has a clustering factor of 152, which is below the number of blocks in the table.  That tells the optimizer the data is very well clustered around the key; this is about the best clustering factor you could have.  Anything close to the number of blocks in the table means the data in the TABLE is very well clustered around the given index value (or values).  Notice that it’s an INDEX stat, but it describes the TABLE.

But for APHYS2 the clustering factor was 10,000, which is equal to the number of rows in the table.  The clustering factor doesn’t get any worse than that.  This tells the optimizer that a range scan will have to hop all over the table to find its set of rows, and lo and behold, it does.
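(You can pull the clustering factor for your own indexes straight out of the data dictionary and compare it to the table’s block and row counts, like this:)

-- Close to BLOCKS   => well clustered, index range scans are cheap.
-- Close to NUM_ROWS => poorly clustered, range scans get expensive.
select i.index_name
     , i.clustering_factor
     , t.blocks
     , t.num_rows
  from user_indexes i
  join user_tables  t on t.table_name = i.table_name
 where i.table_name in ('APHYS1', 'APHYS2');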

As I said at the top, I want you to start thinking about BLOCKS, not rows.  A phrase I tell students in my classes to tattoo on the back of their hand is “Oracle reads blocks, not rows”.  So far I don’t think anyone has done the tattooing part, but I hope everyone takes the slogan to heart.