Re: Veritas issue with 10G RAC

Alexander Gorbachev

2006-06-08

We were on the HP-UX platform (3 large nodes) and had issues with Veritas
IO "hanging" for a while. We have since identified the root cause of the
VxVM problems.
The outage of the whole cluster, however, was caused by IO hanging on the
voting disk (there is a voting disk timeout derived from misscount - in our
case it was misscount-15). To address this there is a one-off patch (we were
on 10.1.0.4, but I think the 10.1.0.5 patchset didn't include it
either) that allows you to define the disk timeout independently of
misscount. This one-off is included in the 10.2.0.2 patchset, AFAIK. This
is not a platform-specific issue. Check note 294430.1.
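
A rough sketch of the timeout relationship described above, in Python. This
is only an illustration, not Oracle's implementation: the function names, the
60-second misscount, and the 200-second disk timeout are made-up values, and
the "misscount minus 15" rule is my reading of the derivation mentioned above.

```python
from typing import Optional


def effective_disk_timeout(misscount: int,
                           disktimeout: Optional[int] = None,
                           offset: int = 15) -> int:
    """Return the voting disk IO timeout in seconds (hypothetical model).

    Without the one-off patch the timeout is tied to misscount
    (modeled here as misscount - offset). With the patch, an explicit
    disktimeout can be set independently of misscount.
    """
    if disktimeout is not None:
        return disktimeout
    return misscount - offset


def node_survives(io_hang_seconds: int,
                  misscount: int,
                  disktimeout: Optional[int] = None) -> bool:
    """True if a voting disk IO hang is shorter than the effective timeout."""
    return io_hang_seconds < effective_disk_timeout(misscount, disktimeout)


# Assumed example values: misscount of 60s and a 50s IO hang.
print(effective_disk_timeout(60))                  # derived from misscount
print(node_survives(50, 60))                       # timeout tied to misscount
print(node_survives(50, 60, disktimeout=200))      # timeout set independently
```

The point is simply that once the disk timeout is decoupled from misscount,
a storage hang of the same length no longer has to take the node down.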

Some benefits of chopping up one huge SMP that Kevin asked for:
--- manageability - you can shut down part of your system for
maintenance. However, 6 domains might be somewhat excessive.
--- local failure tolerance. Well, I don't know whether isolated hardware
issues on a single domain happen often, but software is not ideal anyway,
so nodes do go down. On the other hand, you have a much higher chance of
hitting a problem specific to a clustered environment.
--- limitations in scalability of a single SMP machine. Perhaps this is
a bit duff, since SMP scales better up to a certain number of CPUs
(someone please correct me if I am wrong). Up to a "certain" number of
CPUs - is that the max you can have in a single SMP? :-)

2006/6/8, fairlie rego <fairlie_r@(protected)>:
>
>
> Hi all,
>
> Environment
> =========
> 6 node 10G RAC on Solaris 2.9 E25K domains with 10.1.0.5 patchset with Veritas SFRAC 4.1 and Vxfs
>
> We've had the following problem on 2 nodes where we lose
> the entire Veritas file system across the cluster after doing the following, rendering the whole cluster unavailable.....


--
Best regards,
Alex Gorbachev

http://oracloid.blogspot.com