re: 10.2.0.3 - root.sh fails when adding 2nd node ? 2007-08-10 - By Peter Santos
Folks, I'm trying to install a 2nd node on my 2-node test cluster and I can't seem to get past running the root.sh script on the 2nd node. Whenever I execute root.sh on the newly added node (this is the very last step of the installer), the CSS daemon doesn't come up and eventually it reboots my original node.
From the ocssd.log files, I can tell that it has something to do with the 2 nodes speaking to each other ... either via the ocr/vote disks or network connectivity.
I've set up my raw partitions via fdisk, bound them in /etc/raw, and set up permissions in udev.permissions. I've even cksum'd all raw devices from both nodes, and it all looks good.
Could I be missing something else? Any ideas?
Here is what the ocssd.log complains about:
[ CSSD]2007-08-09 15:40:27.547 >USER: CSS daemon log for node sdbe3, number 2, in cluster oracm_crs
[ clsdmt]Listening to (ADDRESS=(PROTOCOL=ipc)(KEY=sdbe3DBG_CSSD))
[ CSSD]2007-08-09 15:40:27.642 [2546082016] >TRACE: clssscmain: local-only set to false
[ CSSD]2007-08-09 15:40:34.506 [2546082016] >TRACE: clssnmReadNodeInfo: added node 1 (sdbe1) to cluster
[ CSSD]2007-08-09 15:40:34.543 [2546082016] >TRACE: clssnmReadNodeInfo: added node 2 (sdbe3) to cluster
[ CSSD]2007-08-09 15:40:34.548 [1082145120] >TRACE: clssnm_skgxnmon: skgxn init failed, rc 1
[ CSSD]2007-08-09 15:40:34.548 [2546082016] >TRACE: clssnm_skgxnonline: Using vacuous skgxn monitor
[ CSSD]2007-08-09 15:40:37.912 [2546082016] >TRACE: clssnmInitNMInfo: misscount set to 60
[ CSSD]2007-08-09 15:40:37.918 [2546082016] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (0//dev/raw/raw1)
[ CSSD]2007-08-09 15:40:37.979 [2546082016] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (1//dev/raw/raw3)
[ CSSD]2007-08-09 15:40:37.981 [2546082016] >TRACE: clssnmDiskStateChange: state from 1 to 2 disk (2//dev/raw/raw5)
[ CSSD]2007-08-09 15:40:40.816 [1084246368] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (1//dev/raw/raw3)
[ CSSD]2007-08-09 15:40:40.825 [1082145120] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (0//dev/raw/raw1)
[ CSSD]2007-08-09 15:40:40.830 [1084246368] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(483) LATS(0) Disk lastSeqNo(483)
[ CSSD]2007-08-09 15:40:40.837 [1082145120] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(483) LATS(0) Disk lastSeqNo(483)
[ CSSD]2007-08-09 15:40:41.767 [1086347616] >TRACE: clssnmDiskStateChange: state from 2 to 4 disk (2//dev/raw/raw5)
[ CSSD]2007-08-09 15:40:41.779 [1086347616] >TRACE: clssnmReadDskHeartbeat: node(1) is down. rcfg(2) wrtcnt(484) LATS(0) Disk lastSeqNo(484)
[ CSSD]2007-08-09 15:40:41.797 [2546082016] >TRACE: clssscSclsFatal: read value of disable
[ CSSD]2007-08-09 15:40:41.797 [1090550112] >TRACE: clssnmFatalThread: spawned
[ CSSD]2007-08-09 15:40:41.797 [2546082016] >TRACE: clssscSclsFatal: read value of disable
[ CSSD]2007-08-09 15:40:41.798 [1092651360] >TRACE: clssnmconnect: connecting to node 2, flags 0x0001, connector 1
[ CSSD]2007-08-09 15:40:41.798 [1092651360] >TRACE: clssnmconnect: connecting to node 0, flags 0x0000, connector 1
[ CSSD]2007-08-09 15:40:41.799 [1092651360] >TRACE: clssnmconnect: connecting to node 1, flags 0x0001, connector 0
[ CSSD]2007-08-09 15:40:41.801 [1094752608] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc) (KEY=Oracle_CSS_LclLstnr_oracm_crs_2))
[ CSSD]2007-08-09 15:40:41.801 [1094752608] >TRACE: clssgmclientlsnr: listening on (ADDRESS=(PROTOCOL=ipc)(KEY=OCSSD_LL_sdbe3_oracm_crs))
[ CSSD]2007-08-09 15:40:42.832 [1092651360] >TRACE: clssnmConnComplete: connected to node 1 (con 0x2a981016c0), state 3 birth 0, unique 1186687891/1186687891 prevConuni(0)
[ CSSD]2007-08-09 15:40:43.307 [1105258848] >TRACE: clssnmSendingThread: Connection complete
[ CSSD]2007-08-09 15:40:43.307 [1103157600] >TRACE: clssnmPollingThread: Connection complete
[ CSSD]2007-08-09 15:40:43.307 [1107360096] >TRACE: clssnmRcfgMgrThread: Connection complete
[ CSSD]2007-08-09 15:40:43.307 [1107360096] >TRACE: clssnmRcfgMgrThread: Local Join
[ CSSD]2007-08-09 15:40:43.307 [1107360096] >TRACE: clssnmLocalJoinEvent: set node(1) inactive
[ CSSD]2007-08-09 15:40:43.307 [1107360096] >WARNING: clssnmLocalJoinEvent: takeover aborted due to UNKNOWN nodes
[ CSSD]2007-08-09 15:40:43.992 [1092651360] >TRACE: clssnmHandleSync: Acknowledging sync: src[1] srcName[sdbe1] seq[5] sync[2]
[ CSSD]2007-08-09 15:40:44.309 [1107360096] >TRACE: clssnmRcfgMgrThread: lastleader(1) unique(1186688418)
[ CSSD]2007-08-09 15:40:44.994 [1092651360] >TRACE: clssnmSendVoteInfo: node(1) syncSeqNo(2)
[ CSSD]2007-08-09 15:40:46.998 [1092651360] >TRACE: clssnmUpdateNodeState: node 0, state (0/0) unique (0/0) prevConuni(0) birth (0/0) (old/new)
[ CSSD]2007-08-09 15:40:46.998 [1092651360] >TRACE: clssnmDeactivateNode: node 0 () left cluster
[ CSSD]2007-08-09 15:40:46.998 [1092651360] >TRACE: clssnmUpdateNodeState: node 1, state (4/3) unique (1186687891/1186687891) prevConuni(0) birth (0/1) (old/new)
[ CSSD]2007-08-09 15:40:46.998 [1092651360] >TRACE: clssnmUpdateNodeState: node 2, state (1/2) unique (1186688418/1186688418) prevConuni(0) birth (0/2) (old/new)
[ CSSD]2007-08-09 15:40:46.998 [1092651360] >USER: clssnmHandleUpdate: SYNC(2) from node(1) completed
[ CSSD]2007-08-09 15:40:46.998 [1092651360] >USER: clssnmHandleUpdate: NODE 1 (sdbe1) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2007-08-09 15:40:46.998 [1092651360] >USER: clssnmHandleUpdate: NODE 2 (sdbe3) IS ACTIVE MEMBER OF CLUSTER
[ CSSD]2007-08-09 15:40:47.002 [2546082016] >USER: NMEVENT_SUSPEND [00][00][00][00]
[ CSSD]2007-08-09 15:40:47.003 [1109461344] >TRACE: clssgmReconfigThread: started for reconfig (2)
[ CSSD]2007-08-09 15:40:47.003 [1109461344] >USER: NMEVENT_RECONFIG [00][00][00][06]
[ CSSD]2007-08-09 15:40:47.003 [1109461344] >TRACE: clssgmEstablishConnections: 2 nodes in cluster incarn 2
[ CSSD]2007-08-09 15:40:47.075 [1101056352] >TRACE: clssgmInitialRecv: (0x774770) accepted a new connection from node 1 born at 1 active (2, 2), vers (10,3,1,2)
[ CSSD]2007-08-09 15:40:47.075 [1101056352] >TRACE: clssgmInitialRecv: conns done (2/2)
[ CSSD]2007-08-09 15:40:47.075 [1109461344] >TRACE: clssgmEstablishMasterNode: MASTER for 2 is node(1) birth(1)
[ CSSD]2007-08-09 15:40:47.075 [1109461344] >TRACE: clssgmChangeMasterNode: requeued 0 RPCs
[ CSSD]2007-08-09 15:40:47.590 [1084246368] >TRACE: clssnmvFatalCheck: extra node 1
[ CSSD]2007-08-09 15:40:47.590 [1084246368] >TRACE: clssnmvFatalCheck: fatal 1, sclsfatal 0
[ CSSD]2007-08-09 15:40:47.593 [1086347616] >TRACE: clssnmvFatalCheck: extra node 1
[ CSSD]2007-08-09 15:40:47.593 [1086347616] >TRACE: clssnmvFatalCheck: fatal 1, sclsfatal 0
[ CSSD]2007-08-09 15:40:47.600 [1082145120] >TRACE: clssnmvFatalCheck: extra node 1
[ CSSD]2007-08-09 15:40:47.600 [1082145120] >TRACE: clssnmvFatalCheck: fatal 1, sclsfatal 0
[ CSSD]2007-08-09 15:40:47.824 [1090550112] >TRACE: clssnmFatalThread: Fatal mode enabled
[ CSSD]2007-08-09 15:40:48.045 [1092651360] >TRACE: clssnmSendFatalOn: req to syncLeader(1)
[ CSSD]2007-08-09 15:40:51.322 [1103157600] >TRACE: clssnmPollingThread: node sdbe3 (2) missed(2) checkin(s)
[ CSSD]2007-08-09 15:41:17.132 [1109461344] >ERROR: clssgmSlaveCMSync: reconfig timeout on master 1
[ CSSD]2007-08-09 15:41:17.132 [1109461344] >TRACE: clssgmReconfigThread: completed for reconfig(2), with status(0)
[ CSSD]2007-08-09 15:41:17.190 [2546082016] >ERROR: clssgmStartNMMon: reconfig incarn 2 failed. Retrying.
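When scanning a trace like the one above, it can help to pull out only the WARNING and ERROR entries. The following is a minimal Python sketch, not an Oracle tool: the function names parse_cssd and problems are hypothetical, and the entry layout ([ CSSD]timestamp [thread] >LEVEL: message) is assumed purely from the lines quoted in this post. It splits on the [ CSSD] marker rather than on newlines, so it should also cope with a log that has been pasted as one long run of text.

```python
import re

# One CSSD entry: timestamp, optional thread id, level, then the message up to
# the next "[ CSSD]" / "[ clsdmt]" marker or the end of the text.
# Layout is an assumption based on the excerpt in this post.
ENTRY = re.compile(
    r"\[\s*CSSD\](\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3})"
    r"\s*(?:\[(\d+)\])?\s*>(\w+):\s*"
    r"(.*?)(?=\[\s*(?:CSSD|clsdmt)\]|\Z)",
    re.DOTALL,
)


def parse_cssd(text):
    """Yield (timestamp, thread, level, message) for each CSSD entry found."""
    for match in ENTRY.finditer(text):
        ts, thread, level, msg = match.groups()
        # Collapse wrapped whitespace so a message split across lines is rejoined.
        yield ts, thread, level, " ".join(msg.split())


def problems(text, levels=("WARNING", "ERROR")):
    """Return only the entries whose level is in `levels`."""
    return [entry for entry in parse_cssd(text) if entry[2] in levels]


# Example use (the file name is whatever your ocssd.log is saved as):
#     with open("ocssd.log") as fh:
#         for ts, thread, level, msg in problems(fh.read()):
#             print(ts, level, msg)
```

Run against the excerpt in this post, a filter like this would leave the "takeover aborted due to UNKNOWN nodes" warning and the two reconfig errors, which narrows down where to start looking.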