Monthly Archives: March 2013

WITH Clause

Working with OLTP which is modeled using surrogate keys, I’m often handed a simple query, basic equity joins, but with several tables. Usually 3 – 5 of those tables are just used to traverse to the necessary information in the real tables that are of interest. Here’s a recent example:

select c.id c_Id,cb.type,Decode(cl.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
ca.action_on,
count(cl.id) "Number"
from Table_A a,Table_B b,Table_C c,Table_D d,Table_E e,Table_F f,Table_G g,Table_H h
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends
and e.c_id=f.c_id
and d.c_id=cb.c_id
and d.e_id=e.id
and e.f_id=f.id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.clm_id=cl.id
and g.action='Entered'
and g.action_on = (select min(g1.action_on) from table_G g1 where g1.c_id=d.id and g1.action=ca.action)
group by d.id,e.claim_type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on
;

Tables A, B, F and H do not appear in the select clause, but in the where clause. Admittedly this database is I/O bound. Once data is cached we receive a fairly decent response time. Some of the tables listed above aren’t indexed for efficiently for this query. The query ran for over 40 minutes, before I killed it to try other approaches.

This is report and I’m not really interested in adding indices that will slow down the batch processing which is where the majority of the DML takes place. When dealing with reports some times the quickest way to the results is to flatten the data model. In this case I took the tables that don’t participate directly in the select statement and created a working table by using a CTAS (create table as select). This is proves over an over to be a relatively quick way to flatten the model allow the query to be simplified.

create table work as
(select a.id as a_id, f.a_id as f_a_id, f.id as f_id
from Table_A a, Table_B, b, Table_C, Table_F, Table_H
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends);

Then just run the remaining query against the created table:

select tw.a_id a_Id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
g.action_on,
count(d.id) "Number"
from Table_D d,Table_E e,Table_G g,work tw
where
e.a_id=tw.a_id
and d.a_id=e.a_id
and d.e_id=e.id
and d.f_id=tw.f_id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.d_id=d.id
and g.action='Entered'
and g.action_on = (select min(ca1.action_on) from table_g g1 where g1.d_id=d.id and g1.action=g.action)
group by tw.a_id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on;

The query runs in approximately 8 minutes and add 2 minutes for the initial WORK table creation for a total of 10 minutes (will need to add the actual autotrace output. Since this was ran in production I did not capture the autotrace, and I will also not be able to flush the cache on subsequent runs). A definite improvement, but what if we avoid the table creation and work with the data as part of a sub query?

The next query:

select tw.a_id a_Id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
g.action_on,
count(d.id) "Number"
from Table_D d,Table_E e,Table_G g,
(select a.id as a_id, f.a_id as f_a_id, f.id as f_id
from Table_A a, Table_B, b, Table_C, Table_F, Table_H
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends) tw
where e.a_id=tw.a_id
and d.a_id=e.a_id
and d.e_id=e.id
and d.f_id=tw.f_id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.d_id=d.id
and g.action='Entered'
and g.action_on = (select min(ca1.action_on) from table_g g1 where g1.d_id=d.id and g1.action=g.action)
group by tw.a_id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on;

Since I didn’t flush the buffer cache the time that this statement ran although much faster can be attributed to the data already being cached. The time clocked was 1 minute and 20 seconds.

I wondered If I re-wrote this to utilize the subquery factor — the With clause would the performance be any faster. The WITH clause allows the optimizer to select either an inline view or creation of temporary table. Adding the hint MATERIALIZE will ensure that a temporary table is used. This would be the same as what I manually did at the beginning. The WITH clause allows the specifying of query block name that is then referenced in later queries.

The re-write looked like this:

with WORK as
(select a.id as a_id, f.a_id as f_a_id, f.id as f_id
from Table_A a, Table_B, b, Table_C, Table_F, Table_H
where a.id=123
and b.a_id=a.id
and b.id=456789
and h.er_id=b.id
and c.c_b_id=b.id
and f.c_id=c.id
and f.h_b_id=b.id
and f.h_cd=h.cd
and f.ends=h.ends)
select tw.a_id a_Id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N') "Auto-Substantiated",
g.action_on,
count(d.id) "Number"
from Table_D d,Table_E e,Table_G g,work tw
where
e.a_id=tw.a_id
and d.a_id=e.a_id
and d.e_id=e.id
and d.f_id=tw.f_id
and d.service_begins between to_date('01/01/2013','mm/dd/yyyy') and to_Date('01/31/2013','mm/dd/yyyy')
and g.d_id=d.id
and g.action='Entered'
and g.action_on = (select min(ca1.action_on) from table_g g1 where g1.d_id=d.id and g1.action=g.action)
group by tw.a_id,e.type,Decode(d.substantiated,'pre-substantiated','Y','N'),g.action_on;

This time the results were returned in under 5 seconds, however, again that can be attributed to the data already cached. I would have to take these queries over to the reporting server to get a better idea of any improvement in performance.

I did find the execution plans to be interesting. The timings included with the plans can be ignored since we did not start with a flushed cache.

Original query:

"----------------------------------------------------------------------------------------------------------------"
"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"----------------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 70544 (100)| |"
"| 1 | HASH GROUP BY | | 1 | 44 | 70544 (1)| 00:00:14 |"
"| 2 | VIEW | VM_NWVW_2 | 1 | 44 | 70543 (1)| 00:00:14 |"
"|* 3 | FILTER | | | | | |"
"| 4 | HASH GROUP BY | | 1 | 288 | 70543 (1)| 00:00:14 |"
"| 5 | NESTED LOOPS | | | | | |"
"| 6 | NESTED LOOPS | | 1 | 288 | 70542 (1)| 00:00:14 |"
"| 7 | NESTED LOOPS | | 1 | 253 | 70538 (1)| 00:00:14 |"
"| 8 | NESTED LOOPS | | 1 | 230 | 70534 (1)| 00:00:14 |"
"| 9 | NESTED LOOPS | | 1 | 207 | 70532 (1)| 00:00:14 |"
"| 10 | NESTED LOOPS | | 1668 | 280K| 67195 (1)| 00:00:13 |"
"| 11 | NESTED LOOPS | | 4856 | 516K| 29 (0)| 00:00:01 |"
"| 12 | NESTED LOOPS | | 1 | 67 | 3 (0)| 00:00:01 |"
"| 13 | NESTED LOOPS | | 1 | 37 | 2 (0)| 00:00:01 |"
"|* 14 | TABLE ACCESS BY INDEX ROWID| Table_B | 1 | 21 | 2 (0)| 00:00:01 |"
"|* 15 | INDEX UNIQUE SCAN | B_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 16 | INDEX UNIQUE SCAN | A_PK | 1 | 16 | 0 (0)| |"
"|* 17 | INDEX RANGE SCAN | H_PK | 1 | 30 | 1 (0)| 00:00:01 |"
"| 18 | TABLE ACCESS BY INDEX ROWID | Table_F | 4113 | 168K| 26 (0)| 00:00:01 |"
"|* 19 | INDEX RANGE SCAN | F_G_FK_I | 80 | | 18 (0)| 00:00:01 |"
"|* 20 | TABLE ACCESS BY INDEX ROWID | Table_D | 1 | 63 | 19 (0)| 00:00:01 |"
"|* 21 | INDEX RANGE SCAN | D_F_FK_I | 24 | | 2 (0)| 00:00:01 |"
"|* 22 | TABLE ACCESS BY INDEX ROWID | Table_E | 1 | 35 | 2 (0)| 00:00:01 |"
"|* 23 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 24 | INDEX RANGE SCAN | C_ID_B_ID_I | 1 | 23 | 2 (0)| 00:00:01 |"
"| 25 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"|* 26 | INDEX RANGE SCAN | G_H_D | 1 | | 3 (0)| 00:00:01 |"
"|* 27 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"| 28 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 35 | 4 (0)| 00:00:01 |"
"----------------------------------------------------------------------------------------------------------------"

Working Table Query:

The full table scan on the claim and text_work are to be expected since we created the text_work with all the data that we needed.

"-----------------------------------------------------------------------------------------------------------"
"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"-----------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 664K(100)| |"
"| 1 | HASH GROUP BY | | 1 | 51 | 664K (1)| 00:02:04 |"
"| 2 | VIEW | VM_NWVW_2 | 1 | 51 | 664K (1)| 00:02:04 |"
"|* 3 | FILTER | | | | | |"
"| 4 | HASH GROUP BY | | 1 | 194 | 664K (1)| 00:02:04 |"
"| 5 | NESTED LOOPS | | | | | |"
"| 6 | NESTED LOOPS | | 1 | 194 | 664K (1)| 00:02:04 |"
"| 7 | NESTED LOOPS | | 1 | 159 | 664K (1)| 00:02:04 |"
"| 8 | NESTED LOOPS | | 1 | 136 | 664K (1)| 00:02:04 |"
"|* 9 | HASH JOIN | | 17608 | 1736K| 629K (1)| 00:01:58 |"
"|* 10 | TABLE ACCESS FULL | Table_D | 17725 | 1090K| 629K (1)| 00:01:58 |"
"| 11 | TABLE ACCESS FULL | TEXT_WORK | 50752 | 1883K| 51 (0)| 00:00:01 |"
"|* 12 | TABLE ACCESS BY INDEX ROWID| Table_E | 1 | 35 | 2 (0)| 00:00:01 |"
"|* 13 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"| 14 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"|* 15 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"|* 16 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"| 17 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 35 | 4 (0)| 00:00:01 |"

Subquery:

Both the subquery and the subquery factor utilized a Fast Full Index scan on the same index that the original query opted for a range scan.

"-----------------------------------------------------------------------------------------------------------------"
"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"-----------------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 17951 (100)| |"
"| 1 | HASH GROUP BY | | 1 | 44 | 17951 (1)| 00:00:04 |"
"| 2 | VIEW | VM_NWVW_2 | 1 | 44 | 17950 (1)| 00:00:04 |"
"|* 3 | FILTER | | | | | |"
"| 4 | HASH GROUP BY | | 1 | 288 | 17950 (1)| 00:00:04 |"
"| 5 | NESTED LOOPS | | | | | |"
"| 6 | NESTED LOOPS | | 1 | 288 | 17949 (1)| 00:00:04 |"
"| 7 | NESTED LOOPS | | 1 | 253 | 17945 (1)| 00:00:04 |"
"| 8 | NESTED LOOPS | | 1 | 230 | 17941 (1)| 00:00:04 |"
"| 9 | NESTED LOOPS | | 1 | 195 | 17939 (1)| 00:00:04 |"
"|* 10 | HASH JOIN | | 4824 | 621K| 1143 (1)| 00:00:01 |"
"| 11 | NESTED LOOPS | | | | | |"
"| 12 | NESTED LOOPS | | 4856 | 516K| 29 (0)| 00:00:01 |"
"| 13 | NESTED LOOPS | | 1 | 67 | 3 (0)| 00:00:01 |"
"| 14 | NESTED LOOPS | | 1 | 37 | 2 (0)| 00:00:01 |"
"|* 15 | TABLE ACCESS BY INDEX ROWID| Table_B | 1 | 21 | 2 (0)| 00:00:01 |"
"|* 16 | INDEX UNIQUE SCAN | B_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 17 | INDEX UNIQUE SCAN | A_PK | 1 | 16 | 0 (0)| |"
"|* 18 | INDEX RANGE SCAN | H_PK | 1 | 30 | 1 (0)| 00:00:01 |"
"|* 19 | INDEX RANGE SCAN | F_H_FK_I | 80 | | 18 (0)| 00:00:01 |"
"| 20 | TABLE ACCESS BY INDEX ROWID | Table_F | 4113 | 168K| 26 (0)| 00:00:01 |"
"|* 21 | INDEX FAST FULL SCAN | C_ID_B_ID_I | 50378 | 1131K| 1113 (1)| 00:00:01 |"
"|* 22 | TABLE ACCESS BY INDEX ROWID | Table_D | 1 | 63 | 4 (0)| 00:00:01 |"
"|* 23 | INDEX RANGE SCAN | D_CID_FID_I | 1 | | 3 (0)| 00:00:01 |"
"|* 24 | TABLE ACCESS BY INDEX ROWID | Table_E | 1 | 35 | 2 (0)| 00:00:01 |"
"|* 25 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"| 26 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"|* 27 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"|* 28 | INDEX RANGE SCAN | G_H_I | 1 | | 3 (0)| 00:00:01 |"
"| 29 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 35 | 4 (0)| 00:00:01 |"

Subquery Factor (with clause):

"| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |"
"--------------------------------------------------------------------------------------------------------------"
"| 0 | SELECT STATEMENT | | | | 71191 (100)| |"
"| 1 | HASH GROUP BY | | 1 | 169 | 71191 (1)| 00:00:14 |"
"|* 2 | FILTER | | | | | |"
"| 3 | NESTED LOOPS | | | | | |"
"| 4 | NESTED LOOPS | | 1 | 169 | 71185 (1)| 00:00:14 |"
"| 5 | NESTED LOOPS | | 1 | 146 | 71181 (1)| 00:00:14 |"
"| 6 | NESTED LOOPS | | 1657 | 199K| 67866 (1)| 00:00:13 |"
"|* 7 | HASH JOIN | | 4824 | 339K| 1143 (1)| 00:00:01 |"
"| 8 | NESTED LOOPS | | | | | |"
"| 9 | NESTED LOOPS | | 4856 | 289K| 29 (0)| 00:00:01 |"
"| 10 | NESTED LOOPS | | 1 | 31 | 3 (0)| 00:00:01 |"
"| 11 | NESTED LOOPS | | 1 | 13 | 2 (0)| 00:00:01 |"
"|* 12 | TABLE ACCESS BY INDEX ROWID| Table_B | 1 | 9 | 2 (0)| 00:00:01 |"
"|* 13 | INDEX UNIQUE SCAN | B_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 14 | INDEX UNIQUE SCAN | A_PK | 1 | 4 | 0 (0)| |"
"|* 15 | INDEX RANGE SCAN | H_PK | 1 | 18 | 1 (0)| 00:00:01 |"
"|* 16 | INDEX RANGE SCAN | F_H_FK_I | 80 | | 18 (0)| 00:00:01 |"
"| 17 | TABLE ACCESS BY INDEX ROWID | Table_F | 4113 | 120K| 26 (0)| 00:00:01 |"
"|* 18 | INDEX FAST FULL SCAN | C_ID_B_ID_I | 50378 | 541K| 1113 (1)| 00:00:01 |"
"|* 19 | TABLE ACCESS BY INDEX ROWID | Table_C | 1 | 51 | 19 (0)| 00:00:01 |"
"|* 20 | INDEX RANGE SCAN | C_F_FK_I | 24 | | 2 (0)| 00:00:01 |"
"|* 21 | TABLE ACCESS BY INDEX ROWID | Table_E | 1 | 23 | 2 (0)| 00:00:01 |"
"|* 22 | INDEX UNIQUE SCAN | E_PK | 1 | | 1 (0)| 00:00:01 |"
"|* 23 | INDEX RANGE SCAN | G_D_I | 1 | | 3 (0)| 00:00:01 |"
"| 24 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 4 (0)| 00:00:01 |"
"| 25 | SORT AGGREGATE | | 1 | 23 | | |"
"| 26 | TABLE ACCESS BY INDEX ROWID | Table_G | 1 | 23 | 5 (0)| 00:00:01 |"
"|* 27 | INDEX RANGE SCAN | G_D_I | 1 | | 4 (0)| 00:00:01 |"
"--------------------------------------------------------------------------------------------------------------"

The WITH clause does make the query more readable in my opinion. The jury is still out on the performance other than it was faster than the original query. Once I’m able to duplicate these queries in non-production I will add the autotrace information for a more thorough comparison.

Advertisements

Data Guard and the Public User

I always like inheriting databases, every day brings new surprises. A Data Guard production primary database underwent hardware maintenance. No biggie happens frequently should be easy to catch up. Over the course of the next few days I noticed that two particular query always seemed active:

INSERT INTO STATS$USER_LOG SELECT USER, SYS_CONTEXT('USERENV', 'SESSIONID'), SYS_CONTEXT('USERENV', 'HOST'), NULL, NULL, NULL, SYSDATE, TO_CHAR(SYSDATE, 'hh24:mi:ss'), NULL, NULL, NULL FROM SYS.DUAL WHERE USER NOT IN ('USER1', 'USER2', 'USER3', 'USER4', 'USER5', 'USER5', 'USER6', 'USER7')

UPDATE STATS$USER_LOG SET LAST_ACTION = (SELECT ACTION FROM V$SESSION WHERE SYS_CONTEXT('USERENV', 'SESSIONID') = AUDSID), LAST_PROGRAM = (SELECT PROGRAM FROM V$SESSION WHERE SYS_CONTEXT('USERENV', 'SESSIONID') = AUDSID), LAST_MODULE = (SELECT MODULE FROM V$SESSION WHERE SYS_CONTEXT('USERENV', 'SESSIONID') = AUDSID), LOGOFF_DAY = SYSDATE, LOGOFF_TIME = TO_CHAR(SYSDATE, 'hh24:mi:ss'), ELAPSED_MINUTES = ROUND((SYSDATE - LOGON_DAY) * 1440) WHERE SYS_CONTEXT('USERENV', 'SESSIONID') = SESSION_ID

Based on the statements there were logon and off triggers at work. Why were we experiencing an unusual high number of logons? A review of the stats$user_log showed a particular user PUBLIC with a high frequency of activity and no module associated.

USER_ID SESSION_ID HOST LAST_PROGRAM LAST_ACTION LAST_MODULE LOGON_DAY LOGON_TIME LOGOFF_DAY LOGOFF_TIME ELAPSED_MINUTES
—————————— ———————- —————————— ———————————————— ——————————– ——————————– ————————- ———- ————————- ———– ———————-
PUBLIC 4294967295 STDB1 14-MAR-13 10:28:05
PUBLIC 4294967295 STDB1 14-MAR-13 10:28:00
PUBLIC 4294967295 STDB1 14-MAR-13 10:27:44
PUBLIC 4294967295 STDB1 14-MAR-13 10:27:41
PUBLIC 4294967295 STDB1 14-MAR-13 10:26:37
PUBLIC 4294967295 STDB1 14-MAR-13 10:26:33
PUBLIC 4294967295 STDB1 14-MAR-13 10:25:44
PUBLIC 4294967295 STDB1 14-MAR-13 10:25:40
PUBLIC 4294967295 STDB1 14-MAR-13 10:25:37
PUBLIC 4294967295 STDB1 14-MAR-13 10:25:33
PUBLIC 4294967295 STDB1 14-MAR-13 10:25:30
PUBLIC 4294967295 STDB1 14-MAR-13 10:25:22
PUBLIC 4294967295 STDB1 14-MAR-13 10:24:35

the logons are occurring within seconds and interestingly the host is the standby. Gap resolution is currently occuring due to the maintenance a few days prior.

Data Guard makes use of the internal user PUBLIC within the Gap resolution. In previous versions the alert log of the primary showed the connection being made as Public. I don’t find any indication of the use of PUBLIC within the 11.2 alert log.

Since the PUBLIC user cannot be used to login into the database it serves no purpose to continue to audit these logon/offs. They are actually an overhead at a time when we don’t need the overhead… trying to catch up the standby.

I recompiled the trigger to include excluding the user PUBLIC and all is well.