While I do have the usual self-fencing issues when trying
to bring down individual nodes, I have not experienced
problems running OCFS2 under high I/O loads. In fact, I have
run a load test where the entire cluster was doing about
80MB/sec of physical reads with system load averages of
20-30. This was with 10.2.0.1 and SLES9R2.
This is a 3-node cluster using Dell 2850s and an IBM DS400.
So this is FC SAN and not iSCSI. Linux multipath is being used
for path redundancy.
I also have a 4th node that is partially configured. It is
intended to be added to the cluster as a 4th node but currently
only has OCFS2 configured. It will have CRS and RAC installed
eventually. For now, it just runs a physical standby.
The whole system is still in testing and I have no problem
with just running that 4th node as a backup target when
I need to rebuild everything. I just mount the SAN volumes
and copy happily away.
I also run SLES9R3 with 10.2.0.1 just fine.
If 10.2.0.2 is a dud, that would be very handy to know.
Alexei_Roudnev wrote:
>I ran a few stress tests on SLES9 SP3 with Oracle 10.2 (async I/O by default), ASM and OCFS.
>
>Final conclusion: it is not a working combination. OCFSv2 adds great instability to any system because of its very poor connectivity and heartbeat
>detection; it easily fails (into panic, i.e. fencing) under heavy load and kills the whole cluster.
>
>Worst of all:
>
>- OCFSv2 has no method of using several redundant links, so it cannot be configured for an HA environment (except by using low-level IP binding).
>Remember that both iSCSI and Oracle CRS/CSS have such mechanisms (Oracle can be configured to use multiple interfaces, and iSCSI has
>both multi-port and multi-path);
>- OCFSv2 reboots the server even if there was no activity on the file system. This makes it impossible to use as a near store (such as for backups etc.).
>- OCFSv2 is extremely prone to various freezes. For example, if it experiences an internal failure while rerunning the journal on an idle node (this happens sometimes), it gets stuck and you cannot stop it. Since it requires unmounting on all nodes for every simple operation (fsck, resize, changing the cluster),
>this causes numerous reboots of the whole cluster.
>
>I am not sure when it all started. I remember running the same tests on an older OCFSv2 (SP3 beta) and it worked fine (at least it looked fine).
>I will run a few more experiments to determine the point at which it all became unstable, but for now it looks extremely unstable (in a test environment, fortunately). At least it is unstable after the 10.2.0.1 -> 10.2.0.2 upgrade, but we did not run stress tests between all upgrades, so I cannot rule out an OCFSv2 version problem, which is more likely than Oracle 10.2.0.2; /I cannot imagine how Oracle can crash OCFS/.
>