To RAC or not to RAC

Today i’ve re-read mogens nogard (his name is unwritable for me, but also for him)  post on high avalability of last month. It is very interesting what mogens says.  I’ve recently read the book “Oracle Insights” from witch i desume that mogens is a step above others.

I completely agree with the quote “Complexity is the enemy of availability” , i’m conservative. So i think that technologies pushed to simplify management may became a boomerang.  On the other hand i think that RAC, with 10g standard edition has reason to be. It give us a little scalability at a competitive cost.

What mogens says in a comment is the real point:

For political reasons you might have to implement all sorts of things, and I still haven’t found an effective way of preventing that from happening.”

Marketing and politics really drive our (mine) managers so i have to implement things that for me give no advantages.

however i’ve to say that my little experience with RAC, not about HA, is at last good.  What remains are all the bugs that we can encounter, with RAC as with Standalone instance.


Bug Number 4323868 (index full scan instead of index range scan)

In attempto to solve a performance problem encountered on Oracle with CBO i’ve encountered another problem.

Let’s start from the beginning, we have an application developed and tuned on Oracle 9iR2 with RBO. We started with about four years ago and there were not great problems with performance. Application is OLTP. RBO has a great advantage: stabilitity. On the other hand tuning an application with CBO is complex. One day our manager decided to make an installation of a two node RAC standard edition, with 10g (applications will have had 400 users). It was a 10gR1 we made the great jump in to the dark migrating to CBO. Things gone well (apart from installation). Then we made another jump in the dark with installation of 10gR2 and there started the problems. With my experience it seems that optimizer in 10gR2 is worse than optimizer in 10gR1.

The problem

Well, i’m not guilty, our application has a table (i will call it X) that can join alternatively with other three tables (i repeat, i’m not guilty, i think that not a good idea and a bad design, i will call such tables A, B and C). The table X as three columns ID_A that is foreign key (?) with table A, ID_B that is foreign key with table B and ID_C that is foreign key with table C. Our application is a product, customizable, so for a client happens that this table has milions of records that join only table C, foreign keys for tables A and B have value “-1”. So application make a query like this:

ID_A = n1 or ID_B = n2 or ID_C = n3

Where n1>0, n2>0, n3>0 and in table X all values for ID_A and ID_B are “-1”. There are three indexes on columns ID_A, ID_B and ID_C. Using histograms optimizer knows that the thee values n1,n2,n3 (>0) are very selective so it uses the three indexes on 10gR1. Also on 10gR2 if we use litteral values. Our application is a java application that uses Bind Variables. So it happened that suddenly in 10gR2 optimizer decided to do a triple full scan on the table making database server hang on I/O (table has over 20 milion of records). I think that it is a problem with bind peeking, but i’m not been able to reproduce it, so we tried STORED OUTLINES to stabilize execution plan using indexes.

So i encountered the BUG 4323868, optimizer as required from hints of stored outlines used indexes but making index full scan instead of index range scan, making database server hang on I/O.

It is interesting to note what Oracle has made, i’ve not found patch for this bug, with patchset Oracle has introduced “code improvement” adding two new hints: INDEX_RS_ASC and INDEX_RS_DESC

RULE Optimizer Vs. CBO

With Oracle 10g RULE based optimizer is not supported by Oracle. All DBA’s with a bit of experience know that internal queries of Oracle in the dictionary have a lot of hints ” /*+ RULE *”. This morning (in Italy) i’ve read i question about the slowness of the query

SQL> select bytes
2 from dba_extents i
3 where i.owner=’QSDF’
4 and i.segment_name=’MLKQJSDF’
5* and i.extent_id=0
SQL> /
Elapsed: 00:01:21.03

This made me curios, effectively the query take a lot of cpu resources, executed with autotrace in SQL*Plus it reveals a huge amount of logical reads with an execution plan with over 100 of executions steps. We use optimizer_mode=FIRST_ROWS system wide, with ALL_ROWS the logical read are also more.

Adding the /*+ RULE */ hint the query is not a flash but even faster than with default CBO.

The moral i think is: with dictionary queries consider using RULE optimizer.

Execution Plans differents with different Users (?)

On google group on Oracle server ( i’ve seen this:

By accident, I found the solution for this problem on MetaLink :
Bug 4652274 – Explain Plan Differs With Different Users
It has to do with the init parameter secure_view_merging, which is new since
10gR2. Setting it to FALSE in the spfile and boucing the instance resolved
all my problems … Now queries on view from another schema have the same
exection plans when executed by the view owner compared to another user.

I’ve noticed this question because last person to write was Jonathan Lewis. The problem arised seems bad, Oracle in his note says that’s not a bug but an expected behaviour. Since my memory is not so good i’ve decided immediatly to write this here.

P.S. (20/08/2007)

See comments for correct parameter name, thanks to David Jeffrey