We don't know yet why the first node froze. The ssh sessions we had open on it
stopped responding, and the local console was blank, with NumLock not working.
We are now investigating possible causes.
But why did node2 panic? Does the log below mean that node1 was still able to
write to the shared storage at that moment, and was just not responding on the
network interface?
Feb 14 14:40:43 node2-cluster1 kernel: (0,2):o2net_idle_timer:1306 connection
to node node1-cluster1 (num 0) at 10.0.0.1:7777 has been idle for 10 seconds,
shutting it down.
Feb 14 14:40:43 node2-cluster1 kernel: (0,2):o2net_idle_timer:1317 here are
some times that might help debug the situation: (tmr 1139920833.453526 now
1139920843.451782 dr 1139920833.453505 adv 1139920833.453576:1139920833.453578
func (5bed52dc:505) 1139920833.453526:1139920833.453568)
Feb 14 14:40:43 node2-cluster1 kernel: (6591,2):o2net_set_nn_state:407 no
longer connected to node node1-cluster1 (num 0) at 10.0.0.1:7777
Feb 14 14:40:43 node2-cluster1 kernel: (7873,4):dlm_do_master_request:1172
ERROR: link to 0 went down!
Feb 14 14:40:43 node2-cluster1 kernel: (7873,4):dlm_get_lock_resource:778
ERROR: status = -112
Feb 14 14:40:43 node2-cluster1 kernel: (8344,0):dlm_do_master_request:1172
ERROR: link to 0 went down!
Feb 14 14:40:43 node2-cluster1 kernel: (8344,0):dlm_get_lock_resource:778
ERROR: status = -112
Feb 14 14:41:19 node2-cluster1 kernel: (20,2):o2quo_make_decision:144 ERROR:
fencing this node because it is connected to a half-quorum of 1 out of 2
nodes which doesn't include the lowest active node 0
Feb 14 14:41:19 node2-cluster1 kernel: (20,2):o2hb_stop_all_regions:1728
ERROR: stopping heartbeat on all active regions.
Feb 14 14:41:19 node2-cluster1 kernel: Kernel panic: ocfs2 is very sorry to be
fencing this system by panicing
Feb 14 14:41:19 node2-cluster1 kernel:
Feb 14 14:45:19 node2-cluster1 syslogd 1.4.1: restart.
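
For reference, this is how we currently read the o2quo_make_decision message
above. It is only a sketch of the rule as the log text seems to describe it,
not the actual OCFS2 code: the function name, the parameter names and the
exact handling of odd node counts are our assumptions.

```python
def should_fence(heartbeating: int, connected: int, lowest_reachable: bool) -> bool:
    """Guess at the quorum rule implied by the o2quo log message.

    heartbeating     -- nodes seen heartbeating on the shared disk, including
                        this node (assumed meaning of "out of 2 nodes")
    connected        -- of those, nodes reachable over the network, including
                        this node (assumed meaning of "half-quorum of 1")
    lowest_reachable -- True if the lowest-numbered heartbeating node is this
                        node or is reachable over the network
    """
    if heartbeating % 2 == 1:
        # Odd node count: a strict majority is always possible.
        quorum = (heartbeating + 1) // 2
        return connected < quorum

    # Even node count: exactly half is a tie, broken by the lowest node.
    quorum = heartbeating // 2
    if connected < quorum:
        return True
    return connected == quorum and not lowest_reachable


# node2 (num 1): two nodes heartbeating, only itself connected, node 0 unreachable
print(should_fence(heartbeating=2, connected=1, lowest_reachable=False))
```

If this reading is right, node2 only fenced itself because it still counted 2
heartbeating nodes, which would mean node1 was still seen on the shared disk
while its network link was down. That is exactly the part we would like
confirmed.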