Dynamic Sampling: some questions about the guts 2007-12-01 - By David Aldridge
Back I wouold guess that this might be to do with the optimizer looking at different access paths to the table. For example, it might be not be much more selective to access the table via "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 AND "RCRAPP1"."RCRAPP1_INFC_CODE"=:bind2 AND "RCRAPP1"."RCRAPP1_CURR_REC_IND"=:bind3 AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4
than it is to access it with: "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4
If there is one index on all four columns and one index on just the two columns then accessing via the two column index might be more efficient than using the four-column one. Just speculation though.
In one of our reporting environments (OSEE 10.2.0.2) we have optimizer _dynamic_sampling cranked up to 5. We have a case where dynamic sampling is pushing the CBO to choose a poor plan. Our situation highlights Level 4, specifically "tables that have single-table predicates that reference 2 or more columns". (from the documentation)
Looking at a 10046 trace, I see a OPT_DYN_SAMP query that counts a sample of rows that match variations on the predicate. I have found that a 10046 can always be good grounds for finding more questions than it answers. *grin*
To avoid the bandwidth, I will show just the relevant parts of the sampling query: SELECT /* OPT_DYN_SAMP */ ... FROM (SELECT /* ... */ 1 AS C1, CASE WHEN "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 AND "RCRAPP1"."RCRAPP1_INFC_CODE"=:bind2 AND "RCRAPP1"."RCRAPP1_CURR_REC_IND"=:bind3 AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4 THEN 1 ELSE 0 END AS C2, CASE WHEN "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4 THEN 1 ELSE 0 END AS C3, CASE WHEN "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4 THEN 1 ELSE 0 END AS C4 FROM ... SAMPLE BLOCK (.037704 , 1) SEED (1) "RCRAPP1" ) SAMPLESUB /
First question: Why the predicate variations? I am assuming that sometimes C4 is different than C3, so I am going to ignore C4 for now. 2) How does the Optimizer use these sample counts to influence the costs and subsequent access/join methods?
I believe what is happening in our case is that sampling is "just unlucky" and it happens to come up with a inaccurate picture of how many rows match (an order of magnitude too many). I can prove this by using this query directly with different sample sizes.
I am hoping that by understanding dynamic sampling a little better, I can understand how to resolve this issue. I realize that one can add the dynamic _sampling query hint, and perhaps that is the "best" solution for this. Aside from adjusting dynamic sampling (whether it be at the system level or via a hint), is there any other way to address this situation?
TIA,
-- Charles Schultz
<div>I wouold guess that this might be to do with the optimizer looking at different access paths to the table. For example, it might be not be much more selective to access the table via </div> <div> </div> <div><FONT color= #333399><FONT face="Courier New">"RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 <BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> AND "RCRAPP1". "RCRAPP1_INFC_CODE"=:bind2 </SPAN></FONT></FONT><BR style="COLOR: rgb(51,51 ,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace">AND "RCRAPP1"."RCRAPP1_CURR_REC_IND"=:bind3 </SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new ,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new ,monospace"> AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4 </SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"></div> <div>than it is to access it with:</div> <div> </div> <div><FONT color= #333399><FONT face="Courier New">"RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1<BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace">AND "RCRAPP1"."RCRAPP1_VPDI _CODE"=:bind4</SPAN></FONT></FONT><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><BR>If there is one index on all four columns and one index on just the two columns then accessing via the two column index might be more efficient than using the four-column one.</div> <div> </div> <div >Just speculation though.<BR></div><B><I></I></B> <BLOCKQUOTE class=replbq style="PADDING-LEFT: 5px; MARGIN-LEFT: 5px; BORDER-LEFT: #1010ff 2px solid">In one of our reporting environments (OSEE <A href="http://10.2.0.2/">10.2.0.2</A> ) we have optimizer_dynamic_sampling cranked up to 5. We have a case where dynamic sampling is pushing the CBO to choose a poor plan. Our situation highlights Level 4, specifically "tables that have single-table predicates that reference 2 or more columns". (from the <A href= "http://download.oracle.com/docs/cd/B19306_01/server.102/b14211/stats.htm #sthref1491">documentation</A>)<BR><BR>Looking at a 10046 trace, I see a OPT_DYN _SAMP query that counts a sample of rows that match variations on the predicate. I have found that a 10046 can always be good grounds for finding more questions than it answers. *grin* <BR><BR>To avoid the bandwidth, I will show just the relevant parts of the sampling query:<BR><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace">SELECT /* OPT_DYN_SAMP */</SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> ...</SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace">FROM</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> (SELECT /* ... */</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT -FAMILY: courier new,monospace"> 1 AS C1,</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51 ,51,153); FONT-FAMILY: courier new,monospace"> CASE</SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> WHEN "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1 </SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> AND "RCRAPP1"."RCRAPP1_INFC_CODE"=:bind2 </SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT -FAMILY: courier new,monospace"> AND "RCRAPP1"."RCRAPP1_CURR_REC_IND"=:bind3 </SPAN><BR style="COLOR: rgb(51,51 ,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4 </SPAN><BR style="COLOR: rgb(51 ,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153) ; FONT-FAMILY: courier new,monospace"> THEN 1</SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> ELSE 0< /SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> <SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> END AS C2,</SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> CASE</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> WHEN "RCRAPP1"."RCRAPP1_AIDY_CODE"=:bind1</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> AND "RCRAPP1"."RCRAPP1_VPDI_CODE"=:bind4</SPAN> <BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> THEN 1</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new ,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> ELSE 0</SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> END AS C3,< /SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> <SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> CASE</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new ,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new ,monospace"> WHEN "RCRAPP1"."RCRAPP1 _AIDY_CODE"=:bind1</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new ,monospace"> AND "RCRAPP1". "RCRAPP1_VPDI_CODE"=:bind4</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> THEN 1</SPAN><BR style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> ELSE 0< /SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> <SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> END AS C4</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new ,monospace"> FROM ...</SPAN><BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"> SAMPLE BLOCK (.037704 , 1) SEED (1) "RCRAPP1"</SPAN> <BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style= "COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace">) SAMPLESUB</SPAN> <BR style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace"><SPAN style="COLOR: rgb(51,51,153); FONT-FAMILY: courier new,monospace">/</SPAN><BR><BR>First question: Why the predicate variations? I am assuming that sometimes C4 is different than C3, so I am going to ignore C4 for now. <BR>2) How does the Optimizer use these sample counts to influence the costs and subsequent access/join methods?<BR><BR>I believe what is happening in our case is that sampling is "just unlucky" and it happens to come up with a inaccurate picture of how many rows match (an order of magnitude too many). I can prove this by using this query directly with different sample sizes. <BR><BR>I am hoping that by understanding dynamic sampling a little better, I can understand how to resolve this issue. I realize that one can add the dynamic_sampling query hint, and perhaps that is the "best" solution for this. Aside from adjusting dynamic sampling (whether it be at the system level or via a hint), is there any other way to address this situation? <BR><BR>TIA,<BR><BR>-- <BR>Charles Schultz </BLOCKQUOTE><BR>
|
|