Java Mailing List Archive

http://www.dba.5341.com/

Home » Home (12/2007) » oracle l »

WAS: Question on starting up Oracle with "startup force" IS: Mandatory readong for HA minded folks

Kevin Closson

2006-07-27

Replies:


>Folks,
>We are upgrading from "Veritas Database Edition 3.5 for Oracle 9iRAC"
to the "Veritas Foundation Suite 4.1 for Oracle >RAC". While testing in
the lab with the new version, I noticed that the Veritas agent is
starting up Oracle with >"startup force" command as shown below:



First off, if you want to read about
what real, quality HA clusters (failover,rehosting,etc)
here is a brief bit: http://www.polyserve.com/pdf/VMDB_BC.pdf


This is a really long reply to a short question--as
it should be. The topic deserves deep consideration.
Unfortunately it is true that the people who need
to read this response the most don't have time--thus
the overwhelming success of over-marketed products
(no names mentioned).

I don't think startup force should be a concern. In the
case of a failover, the shutdown abort that is
built into the startup force command is a no-op. Afterall,
the server crashed and that is why a failover occurred.

In the event of a switchover (rehosting), the shutdown Veritas
issue is important. If they do a normal or immediate
shutdown followed by a startup force on the new node,
there again I see it as a no-op. Doing startup force on a
cleanly shutdown database is a no-op.

The PolyServe Database Utility for Oracle (a failover-HA
product) does shutdown immediate (with timeout) followed
by shutdown abort when the admin is moving an instance
from one server to the other (we call this dynamic re-hosting
because it is a simple GUI drag and drop). When rehosting,
the database MUST be shutdown on the node being vacated
so the:

if ( not immediate)
then abort

thing is very important.

More important than any of that is how to make sure a non-RAC
instance is not mistakingly being brought up on 2 servers
at the same time. Non-RAC uses node-local locking on the lkSID
in $ORACLE_HOME/dbs. Tell me, if there is a non-Shared Home
and a database being failed over from node 1 to, say, node 3
who is to say that, perhaps there are processes with the
files open on the vacating node? Especially if your HA kit
is a bare-bones setup where the database resides in RAW
partitions. The locks Oracle rely on only tell it whether or not
there is an instance with that database already ON THE
CURRENT SERVER. So, if you have a non-RAC database in NAS, CFS or
raw partitions, you can accidentally boot instances from the
same database on more than one server. Of course if the HA
setup is centered around non-shared filesystems this is not
an issue. That is, if the database under HA control is in UFS/VxFS/
Ext3 then the failover action includes an unmount from the
vacated node and a mount on the new node. For that reason
I feel that HA solutions that are based on non-shared storage
are complete garbage. A simple re-hosting operation means
filesystem mounting operations? Junk.

Oracle Disk Manager (ODM) makes non-RAC databases completely
safe in all clustered scenarios since a part of the spec
is "cluster keys" associated with every oracle datafile/redo log,
etc. PolyServe and Veritas both implement ODM. So, no matter
how big the cluster, and no matter how chaotic the association
of nodes to databases, ODM will ensure that no more then 1
instance opens a non-RAC database. The lk${SID} file is a no-op
when ODM is in play.

In summary, any "HA solution" that uses shared disk, without
ODM is rediculously dangerous. And, yep, there is Open Source
junk out there that allows the tinkerers of the world to set up
failover HA (with steeleye, or clustersuite, whetever) combined
with "CFS" (e.g., OCFS which deserves no more mention than that).
Anyone that "saves money" that way will be really sorry
they didn't get an ODM implementation just as soon as they
have 2 instances mistakingly opening files from a non-RAC
database. Like I always try to impress upon people, there is
more to this clustering stuff than hooking up cables :-).

Best to talk to people that know clustering.

So, to carry the point even further. A little command line
output is helpful. PolyServe's ODM exposes ODM cluster info
to admins. The same information is used by the HA engine to
make sure our failovers work. The following is output of our mxodmstat
tool
reporting the association of databases instances to server. Veritas
has no such information available. ODM can be implemented in
different ways.

There are 5 databases TESTDB1-TESTDB5 on node tmr6s13 to start.
I only pick out DBWR and LGWR for brevity.



$mxodmstat -lv | egrep "Node|Writer"
Node     Database      Instance      Application     Proc Type
Pid  File Activity
tmr6s13   TESTDB1       TESTDB1       --           DB Writer
24308 SmallData,LargeData,Other; Read,Write; Sync,Async
tmr6s13   TESTDB1       TESTDB1       --           Log
Writer 24333 SmallData,OLG,Other; Read,Write; Sync,Async
tmr6s13   TESTDB2       TESTDB2       --           DB Writer
6052 SmallData,LargeData,Other; Read,Write; Sync,Async
tmr6s13   TESTDB2       TESTDB2       --           Log
Writer 6057 SmallData,OLG,Other; Read,Write; Sync,Async
tmr6s13   TESTDB3       TESTDB3       --           DB Writer
24567 SmallData,LargeData,Other; Read,Write; Sync,Async
tmr6s13   TESTDB3       TESTDB3       --           Log
Writer 24575 SmallData,OLG,Other; Read,Write; Sync,Async
tmr6s13   TESTDB4       TESTDB4       --           DB Writer
13828 SmallData,LargeData,Other; Read,Write; Sync,Async
tmr6s13   TESTDB4       TESTDB4       --           Log
Writer 13834 SmallData,OLG,Other; Read,Write; Sync,Async
tmr6s13   TESTDB5       TESTDB5       --           DB Writer
3908 SmallData,LargeData,Other; Read,Write; Sync,Async
tmr6s13   TESTDB5       TESTDB5       --           Log
Writer 3912 SmallData,OLG,Other; Read,Write; Sync,Async

...next I use the HA cli command to rehost TESTDB3 to its DEFAULT
backup server. I then look at mxodmstat output to see it is on
tmr6s14 (different server). Note, that the mxdb command for
re-hosting says it will take a moment. So I immediately do
mxodmstat again and see that it is still on tmr613, but the
next execution of the command shows it is on tmr6s14. Then
mxdb is used (-Q) to get more detailed info about it now that
it is on tmr6s14 (it has been rehosted)


$mxdb -d TESTDB3 -m DEFAULT
Validating Service Monitor for TESTDB3
Preparing to move the TESTDB3 Service.

Command successfully submitted to the High Availability engine.

Please note, the database will change state asynchronously.
Please allow an additional 15 seconds for the command to be
properly propagated throughout all nodes.
TESTDB3 Service will be moved.
$mxodmstat -lv | grep "TESTDB3.*Writer"
tmr6s13   TESTDB3       TESTDB3       --           DB Writer
24567 SmallData,LargeData,Other; Read,Write; Sync,Async
tmr6s13   TESTDB3       TESTDB3       --           Log
Writer 24575 SmallData,OLG,Other; Read,Write; Sync,Async
$mxodmstat -lv | grep "TESTDB3.*Writer"
tmr6s14   TESTDB3       TESTDB3       --           DB Writer
31827 SmallData,Other; Read,Write; Sync,Async
tmr6s14   TESTDB3       TESTDB3       --           Log
Writer 31829 SmallData,OLG,Other; Read,Write; Sync,Async

$mxdb -d TESTDB3 -Q

Preparing to query the TESTDB3 Service on all nodes.

Service TESTDB3:
Connect through vhost: 10.10.60.173
Primary Node is:     10.10.60.14
Currently active on   10.10.60.14
Backups:          BACKUP-1 10.10.60.13

Enabled Nodes:     10.10.60.13 10.10.60.14
Disabled Nodes:
Maintenance Mode:    OFF

..And, of course I connect using a PolyServe Virtual Oracle Service :

$sqlplus scott/tiger@(protected)

SQL*Plus: Release 10.2.0.2.0 - Production on Thu Jul 27 10:29:41 2006

Copyright (c) 1982, 2005, Oracle. All Rights Reserved.


Connected to:
Oracle Database 10g Enterprise Edition Release 10.2.0.2.0 - 64bit
Production
With the Partitioning, OLAP and Data Mining options

SQL>

...then, while still running a shell on tmr6s13, I use mxodmstat to
see what I/O TESTDB3 is doing (NOTE--without knowing what node it is
running
on):


$uname -a;mxodmstat -a op -i3 -D TESTDB3
Linux tmr6s13 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64
x86_64 x86_64 GNU/Linux
                TESTDB3
       Read                 Write
Sync Async   KB/s Ave ms  Sync Async   KB/s Ave ms
4748   14  43933     1    8   951   6891     0
  0    0     0     0    0  0.67    11     0
102    4   5050     1    9   293  15128    10
112    8   5053     1    9   511  15503     9
 60    4   2930     1    5   156   8731     6
 60    4   2578     1    5   238   7902     8
 47    4   1544     1    4   247   5647    10


And for spice, I want to see "who" is doing that I/O so
I tell mxodmstat to report I/O by process type (again,
central monitoring of different cluster activity):

$uname -a;mxodmstat  -i3 -N tmr6s14 -s proc
Linux tmr6s13 2.6.9-34.ELsmp #1 SMP Fri Feb 24 16:56:28 EST 2006 x86_64
x86_64 x86_64 GNU/Linux
                                    tmr6s14
   Background         DB Writer         Log Writer
PQO            Foreground
Sync Asyn KB/s Ave ms Sync Asyn KB/s Ave ms Sync Asyn  KB/s Ave ms
Sync Asyn  KB/s Ave ms Sync Asyn  KB/s Ave ms
6443 1654 238674   1  23 7603 85688   11 982 1643 176746    1
0   0    0    0 278   0  2224    1
77  59 8653    1   0 253 2981    9  36  49  6038    1
0   0    0    0   0   0    0    0
 4   8  165    0   0  78  941   11  10   3  1848    1
0   0    0    0   0   0    0    0
78  59 8897    1   0 231 2805   10  38  47  7664    1
0   0    0    0   0   0    0    0
20  17 2204    0   0  45  683    8  13  13  1998    1
0   0    0    0   0   0    0    0
 3   8  144    0   0 139 1427   12   8 0.67   453    1
0   0    0    0   0   0    0    0


...and finally, cluster keys. Here is mxodmstat output
showing control files, data files online logs from TESTDB1-TESTDB5
with their paths and internal (kernel mode) cluster File identifier.
When
Oracle opens an ODM file, it tells ODM a guaranteed world-wide unique
identifier for the file. ODM maintains that FID in its state. Any
other open of that file on any node in the cluster, Oracle will
throw the key at ODM to see if it is open, and if it is, well, the
files don't get opened ...

$mxodmstat -lf | more
 FID Type  Path
------------------------------------------------------------------------
------
10610723 CTL
/u01/app/oracle/oradata/TESTDB1/controlfile/o1_mf_NIDYf1vEQ2Jr50_.ctl
10610724 CTL
/u01/app/oracle/flash_recovery_area/TESTDB1/controlfile/o1_mf_NIDYf16Re3
Jr51_.ctl
10610725 DATA
/u01/app/oracle/oradata/TESTDB1/datafile/o1_mf_system_5IDYf1eek3oq50_.db
f
10610726 DATA
/u01/app/oracle/oradata/TESTDB1/datafile/o1_mf_undotbs1_5IDYf1KAK3oq52_.
dbf
10610727 DATA
/u01/app/oracle/oradata/TESTDB1/datafile/o1_mf_sysaux_5IDYf1X8t3oq51_.db
f
10610728 DATA
/u01/app/oracle/oradata/TESTDB1/datafile/o1_mf_users_5IDYf177P3oq53_.dbf
10610729 DATA
/u01/app/oracle/oradata/TESTDB1/datafile/o1_mf_example_WIDYf17v118s50_.d
bf
10610730 DATA
/u01/app/oracle/oradata/TESTDB1/datafile/o1_mf_test_G9EYf1jii25f80_.dbf
10610731 DATA
/u01/app/oracle/oradata/TESTDB1/datafile/o1_mf_temp_TIDYf1Qf14Jr58_.tmp
10610732 OLG
/u01/app/oracle/oradata/TESTDB1/onlinelog/o1_mf_1_PIDYf1cMX3Jr52_.log
10610733 OLG
/u01/app/oracle/flash_recovery_area/TESTDB1/onlinelog/o1_mf_1_QIDYf1eer2
Jr53_.log
10610734 OLG
/u01/app/oracle/oradata/TESTDB1/onlinelog/o1_mf_2_QIDYf1BmX2Jr54_.log
10610735 OLG
/u01/app/oracle/flash_recovery_area/TESTDB1/onlinelog/o1_mf_2_RIDYf1xds1
Jr55_.log
10610736 OLG
/u01/app/oracle/oradata/TESTDB1/onlinelog/o1_mf_3_RIDYf1Pb02Jr56_.log
10610737 OLG
/u01/app/oracle/flash_recovery_area/TESTDB1/onlinelog/o1_mf_3_SIDYf1sDsJ
r57_.log
10610785 CTL
/u01/app/oracle/oradata/TESTDB2/controlfile/o1_mf_cRDYf1hDI1cR60_.ctl
10610786 CTL
/u01/app/oracle/flash_recovery_area/TESTDB2/controlfile/o1_mf_cRDYf1RR62
cR61_.ctl
10610787 DATA
/u01/app/oracle/oradata/TESTDB2/datafile/o1_mf_system_sQDYf1Fo72rP60_.db
f
10610788 DATA
/u01/app/oracle/oradata/TESTDB2/datafile/o1_mf_undotbs1_sQDYf1FFx2rP62_.
dbf
10610789 DATA
/u01/app/oracle/oradata/TESTDB2/datafile/o1_mf_sysaux_sQDYf1C7h2rP61_.db
f
10610790 DATA
/u01/app/oracle/oradata/TESTDB2/datafile/o1_mf_users_sQDYf1ypC2rP63_.dbf
10610791 DATA
/u01/app/oracle/oradata/TESTDB2/datafile/o1_mf_example_kRDYf1xEm3KR60_.d
bf
10610792 DATA
/u01/app/oracle/oradata/TESTDB2/datafile/o1_mf_test_kiEYf1HAR1nn80_.dbf
10610793 DATA
/u01/app/oracle/oradata/TESTDB2/datafile/o1_mf_temp_iRDYf190L2cR68_.tmp
10610794 OLG
/u01/app/oracle/oradata/TESTDB2/onlinelog/o1_mf_1_eRDYf17tP2cR62_.log
10610795 OLG
/u01/app/oracle/flash_recovery_area/TESTDB2/onlinelog/o1_mf_1_fRDYf19na1
cR63_.log
10610796 OLG
/u01/app/oracle/oradata/TESTDB2/onlinelog/o1_mf_2_fRDYf1OWH1cR64_.log
10610797 OLG
/u01/app/oracle/flash_recovery_area/TESTDB2/onlinelog/o1_mf_2_gRDYf1kUbc
R65_.log
10610798 OLG
/u01/app/oracle/oradata/TESTDB2/onlinelog/o1_mf_3_gRDYf1OSHcR66_.log
10610799 OLG
/u01/app/oracle/flash_recovery_area/TESTDB2/onlinelog/o1_mf_3_gRDYf1pxi3
cR67_.log
10610911 CTL
/u01/app/oracle/oradata/TESTDB4/controlfile/o1_mf_CeVYf196u26o30_.ctl
10610912 CTL
/u01/app/oracle/flash_recovery_area/TESTDB4/controlfile/o1_mf_CeVYf1rd63
6o31_.ctl
10610913 DATA
/u01/app/oracle/oradata/TESTDB4/datafile/o1_mf_system_pdVYf1m373Gk30_.db
f
10610914 DATA
/u01/app/oracle/oradata/TESTDB4/datafile/o1_mf_undotbs1_qdVYf1pHyGk32_.d
bf
10610915 DATA
/u01/app/oracle/oradata/TESTDB4/datafile/o1_mf_sysaux_pdVYf1HMZ3Gk31_.db
f
10610916 DATA
/u01/app/oracle/oradata/TESTDB4/datafile/o1_mf_users_qdVYf16c71Gk33_.dbf




--
http://www.freelists.org/webpage/oracle-l


©2008 dba.5341.com - Jax Systems, LLC, U.S.A.