WITH Clause

Working with OLTP which is modeled using surrogate keys, I’m often handed a simple query, basic equity joins, but with several tables. Usually 3 – 5 of those tables are just used to traverse to the necessary information in the real tables that are of interest. Here’s a recent example:

select c.id c_Id,cb.type,Decode(cl.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
ca.action_on,
count(cl.id) "Number"
from Table_A a,Table_B b,Table_C c,Table_D d,Table_E e,Table_F f,Table_G g,Table_H h
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends
and e.c_id=f.c_id
and d.c_id=cb.c_id
and d.e_id=e.id
and e.f_id=f.id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.clm_id=cl.id
and g.action='Entered'
and g.action_on = (select min(g1.action_on) from table_G g1 where g1.c_id=d.id and g1.action=ca.action)
group by d.id,e.claim_type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on
;

Tables A, B, F and H do not appear in the select clause, but in the where clause. Admittedly this database is I/O bound. Once data is cached we receive a fairly decent response time. Some of the tables listed above aren’t indexed for efficiently for this query. The query ran for over 40 minutes, before I killed it to try other approaches.

This is report and I’m not really interested in adding indices that will slow down the batch processing which is where the majority of the DML takes place. When dealing with reports some times the quickest way to the results is to flatten the data model. In this case I took the tables that don’t participate directly in the select statement and created a working table by using a CTAS (create table as select). This is proves over an over to be a relatively quick way to flatten the model allow the query to be simplified.

create table work as
(select a.id as a_id, f.a_id as f_a_id, f.id as f_id
from Table_A a, Table_B, b, Table_C, Table_F, Table_H
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends);

Then just run the remaining query against the created table:

select tw.a_id a_Id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
g.action_on,
count(d.id) "Number"
from Table_D d,Table_E e,Table_G g,work tw
where
e.a_id=tw.a_id
and d.a_id=e.a_id
and d.e_id=e.id
and d.f_id=tw.f_id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.d_id=d.id
and g.action='Entered'
and g.action_on = (select min(ca1.action_on) from table_g g1 where g1.d_id=d.id and g1.action=g.action)
group by tw.a_id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on;

The query runs in approximately 8 minutes and add 2 minutes for the initial WORK table creation for a total of 10 minutes (will need to add the actual autotrace output. Since this was ran in production I did not capture the autotrace, and I will also not be able to flush the cache on subsequent runs). A definite improvement, but what if we avoid the table creation and work with the data as part of a sub query?

The next query:

select tw.a_id a_Id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
g.action_on,
count(d.id) "Number"
from Table_D d,Table_E e,Table_G g,
(select a.id as a_id, f.a_id as f_a_id, f.id as f_id
from Table_A a, Table_B, b, Table_C, Table_F, Table_H
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends) tw
where e.a_id=tw.a_id
and d.a_id=e.a_id
and d.e_id=e.id
and d.f_id=tw.f_id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.d_id=d.id
and g.action='Entered'
and g.action_on = (select min(ca1.action_on) from table_g g1 where g1.d_id=d.id and g1.action=g.action)
group by tw.a_id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on;

Since I didn’t flush the buffer cache the time that this statement ran although much faster can be attributed to the data already being cached. The time clocked was 1 minute and 20 seconds.

I wondered If I re-wrote this to utilize the subquery factor — the With clause would the performance be any faster. The WITH clause allows the optimizer to select either an inline view or creation of temporary table. Adding the hint MATERIALIZE will ensure that a temporary table is used. This would be the same as what I manually did at the beginning. The WITH clause allows the specifying of query block name that is then referenced in later queries.

The re-write looked like this:

with WORK as
(select a.id as a_id, f.a_id as f_a_id, f.id as f_id
from Table_A a, Table_B, b, Table_C, Table_F, Table_H
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends)
select tw.a_id a_Id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
g.action_on,
count(d.id) "Number"
from Table_D d,Table_E e,Table_G g,work tw
where
e.a_id=tw.a_id
and d.a_id=e.a_id
and d.e_id=e.id
and d.f_id=tw.f_id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.d_id=d.id
and g.action='Entered'
and g.action_on = (select min(ca1.action_on) from table_g g1 where g1.d_id=d.id and g1.action=g.action)
group by tw.a_id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on;

This time the results were returned in under 5 seconds, however, again that can be attributed to the data already cached. I would have to take these queries over to the reporting server to get a better idea of any improvement in performance.

I did find the execution plans to be interesting. The timings included with the plans can be ignored since we did not start with a flushed cache.

Original query:

"----------------------------------------------------------------------------------------------------------------"
"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"----------------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 70544 (100)| |"
"| 1 | HASH GROUP BY | | 1 | 44 | 70544 (1)| 00:00:14 |"
"| 2 | VIEW | VM_NWVW_2 | 1 | 44 | 70543 (1)| 00:00:14 |"
"|* 3 | FILTER | | | | | |"
"| 4 | HASH GROUP BY | | 1 | 288 | 70543 (1)| 00:00:14 |"
"| 5 | NESTED LOOPS | | | | | |"
"| 6 | NESTED LOOPS | | 1 | 288 | 70542 (1)| 00:00:14 |"
"| 7 | NESTED LOOPS | | 1 | 253 | 70538 (1)| 00:00:14 |"
"| 8 | NESTED LOOPS | | 1 | 230 | 70534 (1)| 00:00:14 |"
"| 9 | NESTED LOOPS | | 1 | 207 | 70532 (1)| 00:00:14 |"
"| 10 | NESTED LOOPS | | 1668 | 280K| 67195 (1)| 00:00:13 |"
"| 11 | NESTED LOOPS | | 4856 | 516K| 29 (0)| 00:00:01 |"
"| 12 | NESTED LOOPS | | 1 | 67 | 3 (0)| 00:00:01 |"
"| 13 | NESTED LOOPS | | 1 | 37 | 2 (0)| 00:00:01 |"
"|* 14 | TABLE ACCESS BY INDEX ROWID| Table_B | 1 | 21 | 2 (0)| 00:00:01 |"
"|* 15 | INDEX UNIQUE SCAN | B_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 16 | INDEX UNIQUE SCAN | A_PK | 1 | 16 | 0 (0)| |"
"|* 17 | INDEX RANGE SCAN | H_PK | 1 | 30 | 1 (0)| 00:00:01 |"
"| 18 | TABLE ACCESS BY INDEX ROWID | Table_F | 4113 | 168K| 26 (0)| 00:00:01 |"
"|* 19 | INDEX RANGE SCAN | F_G_FK_I | 80 | | 18 (0)| 00:00:01 |"
"|* 20 | TABLE ACCESS BY INDEX ROWID | Table_D | 1 | 63 | 19 (0)| 00:00:01 |"
"|* 21 | INDEX RANGE SCAN | D_F_FK_I | 24 | | 2 (0)| 00:00:01 |"
"|* 22 | TABLE ACCESS BY INDEX ROWID | Table_E | 1 | 35 | 2 (0)| 00:00:01 |"
"|* 23 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 24 | INDEX RANGE SCAN | C_ID_B_ID_I | 1 | 23 | 2 (0)| 00:00:01 |"
"| 25 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"|* 26 | INDEX RANGE SCAN | G_H_D | 1 | | 3 (0)| 00:00:01 |"
"|* 27 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"| 28 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 35 | 4 (0)| 00:00:01 |"
"----------------------------------------------------------------------------------------------------------------"

Working Table Query:

The full table scan on the claim and text_work are to be expected since we created the text_work with all the data that we needed.

"-----------------------------------------------------------------------------------------------------------"
"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"-----------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 664K(100)| |"
"| 1 | HASH GROUP BY | | 1 | 51 | 664K (1)| 00:02:04 |"
"| 2 | VIEW | VM_NWVW_2 | 1 | 51 | 664K (1)| 00:02:04 |"
"|* 3 | FILTER | | | | | |"
"| 4 | HASH GROUP BY | | 1 | 194 | 664K (1)| 00:02:04 |"
"| 5 | NESTED LOOPS | | | | | |"
"| 6 | NESTED LOOPS | | 1 | 194 | 664K (1)| 00:02:04 |"
"| 7 | NESTED LOOPS | | 1 | 159 | 664K (1)| 00:02:04 |"
"| 8 | NESTED LOOPS | | 1 | 136 | 664K (1)| 00:02:04 |"
"|* 9 | HASH JOIN | | 17608 | 1736K| 629K (1)| 00:01:58 |"
"|* 10 | TABLE ACCESS FULL | Table_D | 17725 | 1090K| 629K (1)| 00:01:58 |"
"| 11 | TABLE ACCESS FULL | TEXT_WORK | 50752 | 1883K| 51 (0)| 00:00:01 |"
"|* 12 | TABLE ACCESS BY INDEX ROWID| Table_E | 1 | 35 | 2 (0)| 00:00:01 |"
"|* 13 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"| 14 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"|* 15 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"|* 16 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"| 17 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 35 | 4 (0)| 00:00:01 |"

Subquery:

Both the subquery and the subquery factor utilized a Fast Full Index scan on the same index that the original query opted for a range scan.

"-----------------------------------------------------------------------------------------------------------------"
"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"-----------------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 17951 (100)| |"
"| 1 | HASH GROUP BY | | 1 | 44 | 17951 (1)| 00:00:04 |"
"| 2 | VIEW | VM_NWVW_2 | 1 | 44 | 17950 (1)| 00:00:04 |"
"|* 3 | FILTER | | | | | |"
"| 4 | HASH GROUP BY | | 1 | 288 | 17950 (1)| 00:00:04 |"
"| 5 | NESTED LOOPS | | | | | |"
"| 6 | NESTED LOOPS | | 1 | 288 | 17949 (1)| 00:00:04 |"
"| 7 | NESTED LOOPS | | 1 | 253 | 17945 (1)| 00:00:04 |"
"| 8 | NESTED LOOPS | | 1 | 230 | 17941 (1)| 00:00:04 |"
"| 9 | NESTED LOOPS | | 1 | 195 | 17939 (1)| 00:00:04 |"
"|* 10 | HASH JOIN | | 4824 | 621K| 1143 (1)| 00:00:01 |"
"| 11 | NESTED LOOPS | | | | | |"
"| 12 | NESTED LOOPS | | 4856 | 516K| 29 (0)| 00:00:01 |"
"| 13 | NESTED LOOPS | | 1 | 67 | 3 (0)| 00:00:01 |"
"| 14 | NESTED LOOPS | | 1 | 37 | 2 (0)| 00:00:01 |"
"|* 15 | TABLE ACCESS BY INDEX ROWID| Table_B | 1 | 21 | 2 (0)| 00:00:01 |"
"|* 16 | INDEX UNIQUE SCAN | B_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 17 | INDEX UNIQUE SCAN | A_PK | 1 | 16 | 0 (0)| |"
"|* 18 | INDEX RANGE SCAN | H_PK | 1 | 30 | 1 (0)| 00:00:01 |"
"|* 19 | INDEX RANGE SCAN | F_H_FK_I | 80 | | 18 (0)| 00:00:01 |"
"| 20 | TABLE ACCESS BY INDEX ROWID | Table_F | 4113 | 168K| 26 (0)| 00:00:01 |"
"|* 21 | INDEX FAST FULL SCAN | C_ID_B_ID_I | 50378 | 1131K| 1113 (1)| 00:00:01 |"
"|* 22 | TABLE ACCESS BY INDEX ROWID | Table_D | 1 | 63 | 4 (0)| 00:00:01 |"
"|* 23 | INDEX RANGE SCAN | D_CID_FID_I | 1 | | 3 (0)| 00:00:01 |"
"|* 24 | TABLE ACCESS BY INDEX ROWID | Table_E | 1 | 35 | 2 (0)| 00:00:01 |"
"|* 25 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"| 26 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"|* 27 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"|* 28 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"| 29 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 35 | 4 (0)| 00:00:01 |"

Subquery Factor (with clause):

"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"--------------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 71191 (100)| |"
"| 1 | HASH GROUP BY | | 1 | 169 | 71191 (1)| 00:00:14 |"
"|* 2 | FILTER | | | | | |"
"| 3 | NESTED LOOPS | | | | | |"
"| 4 | NESTED LOOPS | | 1 | 169 | 71185 (1)| 00:00:14 |"
"| 5 | NESTED LOOPS | | 1 | 146 | 71181 (1)| 00:00:14 |"
"| 6 | NESTED LOOPS | | 1657 | 199K| 67866 (1)| 00:00:13 |"
"|* 7 | HASH JOIN | | 4824 | 339K| 1143 (1)| 00:00:01 |"
"| 8 | NESTED LOOPS | | | | | |"
"| 9 | NESTED LOOPS | | 4856 | 289K| 29 (0)| 00:00:01 |"
"| 10 | NESTED LOOPS | | 1 | 31 | 3 (0)| 00:00:01 |"
"| 11 | NESTED LOOPS | | 1 | 13 | 2 (0)| 00:00:01 |"
"|* 12 | TABLE ACCESS BY INDEX ROWID| Table_B | 1 | 9 | 2 (0)| 00:00:01 |"
"|* 13 | INDEX UNIQUE SCAN | B_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 14 | INDEX UNIQUE SCAN | A_PK | 1 | 4 | 0 (0)| |"
"|* 15 | INDEX RANGE SCAN | H_PK | 1 | 18 | 1 (0)| 00:00:01 |"
"|* 16 | INDEX RANGE SCAN | F_H_FK_I | 80 | | 18 (0)| 00:00:01 |"
"| 17 | TABLE ACCESS BY INDEX ROWID | Table_F | 4113 | 120K| 26 (0)| 00:00:01 |"
"|* 18 | INDEX FAST FULL SCAN | C_ID_B_ID_I | 50378 | 541K| 1113 (1)| 00:00:01 |"
"|* 19 | TABLE ACCESS BY INDEX ROWID | Table_C | 1 | 51 | 19 (0)| 00:00:01 |"
"|* 20 | INDEX RANGE SCAN | C_F_FK_I | 24 | | 2 (0)| 00:00:01 |"
"|* 21 | TABLE ACCESS BY INDEX ROWID | Table_E | 1 | 23 | 2 (0)| 00:00:01 |"
"|* 22 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 23 | INDEX RANGE SCAN | G_D_I | 1 | | 3 (0)| 00:00:01 |"
"| 24 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"| 25 | SORT AGGREGATE | | 1 | 23 | | |"
"| 26 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 5 (0)| 00:00:01 |"
"|* 27 | INDEX RANGE SCAN | G_D_I | 1 | | 4 (0)| 00:00:01 |"
"--------------------------------------------------------------------------------------------------------------"

The WITH clause does make the query more readable in my opinion. The jury is still out on the performance other than it was faster than the original query. Once I’m able to duplicate these queries in non-production I will add the autotrace information for a more thorough comparison.

Advertisements

Tagged: ,

Leave a Reply

Please log in using one of these methods to post your comment:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: