AW: [suse-oracle] Novell - it is VERY SERIOUS! We should advise against using S 2007-01-09 - By Alexei_Roudnev
Ok, it's something.
Questions for everyone, both those who have this bug and those who do not:
- SAN system/driver?
- MPIO software/mode?
- Do you use partitions? (A strange question, I know, but I reported one bug related to unpartitioned devices, and SuSe fixed it.)
For those who can reproduce the bug: can you show your exact configuration? I mean:
- platform
- kernel
- lsmod output
- MPIO, if any
- FC cards, if any
If you use MPIO, what happens without it? (I mean not turning it down, but not using the md or other driver on a system that normally runs with MPIO.)
PS. It looks as if the SLES version and the file system are not important for this bug. The bug is somewhere below, in the disk access. Could it be a bug in the SCSI or HBA driver?
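The configuration details requested above could be gathered in one pass with a small script, so that reports from different sites are comparable. The following is only a minimal sketch: the command list (`uname -a`, `lsmod`, `multipath -ll`, `lspci`, `/proc/partitions`) is an assumption about which tools are installed on the host, and the function names are hypothetical, not part of any SuSE or Oracle tooling.

```python
import platform
import subprocess

# Hypothetical command set; adjust it to the tools present on the host.
COMMANDS = {
    "kernel": ["uname", "-a"],
    "modules": ["lsmod"],
    "multipath": ["multipath", "-ll"],          # only meaningful if multipath-tools is used
    "pci_devices": ["lspci"],                   # FC HBAs would be listed here
    "partitions": ["cat", "/proc/partitions"],
}


def run_command(cmd, timeout=30):
    """Return the command's combined output, or a note if it cannot be run."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            universal_newlines=True,
            timeout=timeout,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # A missing tool is recorded in the report instead of aborting it.
        return "(unavailable: {})".format(exc)
    return result.stdout.strip()


def collect_config(commands=COMMANDS):
    """Run every command and return a dict of section name -> output."""
    report = {"platform": platform.machine()}
    for name, cmd in commands.items():
        report[name] = run_command(cmd)
    return report


def format_report(report):
    """Render the report as plain text suitable for pasting into a mail."""
    sections = []
    for name, output in report.items():
        sections.append("=== {} ===\n{}".format(name, output))
    return "\n\n".join(sections)
```

Calling `print(format_report(collect_config()))` on an affected machine and on an unaffected one would give two reports that can be diffed. Some of the commands may need root privileges to return useful output.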
----- Original Message -----
From: "Eduardo D Piovesam" <eduardo@(protected)>
To: "suse-oracle" <suse-oracle@(protected)>
Sent: Tuesday, January 09, 2007 4:20 PM
Subject: Re: AW: [suse-oracle] Novell - it is VERY SERIOUS! We should advise against using SuSe and Oracle if
Certainly it's a bug with a specific config.
We use many Oracle databases with AIO + 10.1.0.4 + SLES9SP1_x86 (32-bit) + reiserFS + a specific patch for SLES9 and AIO, with no issues.
All databases run 24x7, and some with very intensive I/O almost all the time. No problem so far... running for 200+ days and counting.
We're very "happy" with this config, we use it in almost all our Oracle installations.
Regards, Eduardo
On Tue, 2007-01-09 at 13:04 -0800, Alexei_Roudnev wrote:
> You may try to get statistics on WHO had this bug and in WHICH
> CONFIGURATIONS. For example, I never saw it in the lab, and our data
> center in Canada never saw it in production, but we all use iSCSI and
> ext3/ASM (and don't use multipath software because iSCSI has embedded
> reliability).
>
> I don't think that this problem exists in all configurations - if so,
> SuSe would have been able to find it many months ago. On the other hand,
> you have a customer with an exact configuration and an exact description -
> so it's a matter of time to reproduce it.
>
> Possible factors that could trigger this bug:
>
> - multipath IO
> - specific file system
> - specific IO scheduler
> - specific kernel parameters (unlikely, but who knows)
> - specific response coming from SAN storage under special conditions (so
>   that other SAN storage will not show this bug).
>
> I am testing a few new servers now, and I will try to trigger this bug by
> running TPCC tests, BUT the chances of hitting it are low.
>
> ----- Original Message -----
> From: "Arun Singh" <Arun.Singh@(protected)>
> To: "Frank Westheider" <frank.westheider@(protected)>
> Cc: "'Anton Dischner'" <Anton.Dischner@(protected)>; <suse-oracle@(protected)>
> Sent: Tuesday, January 09, 2007 11:41 AM
> Subject: Re: AW: [suse-oracle] Novell - it is VERY SERIOUS! We should advise against using SuSe and Oracle if
>
> Hi Frank,
>
> I can understand the frustration here as I am following this in the
> bugzilla. You are right in asserting the complexity of the bug.
>
> Regards,
> Arun
>
> >>> On 1/9/2007 at 11:11 AM, "Frank Westheider" <frank.westheider@(protected)> wrote:
> > Hi Arun,
> >
> > Sorry...but cool down is not an option.
> >
> > We have had this bug on 3 machines since the beginning of September (!!!)
> > and opened Novell and Oracle support cases....without any success/effort.
> >
> > We escalated these cases at Novell and got 2 kernel PTF fixes...we tested
> > both and diffed the PTF kernel source with the original buggy kernel
> > source...and even we saw that the differences did not concern the AIO
> > part of the kernel.
> >
> > We can reproduce the problem (and wrote this to support too) with heavy
> > async-IO load over a duration of several days (batch processes). So I
> > think this should be possible for Novell/Oracle/the kernel guys as well,
> > with simple DB jobs using AIO and firing until the AIO bug shows up....
> >
> > Head-shaking about the support...beginning of September and no real
> > effort to fix this thing. I hope you know what AIO does for really busy
> > databases...
> >
> > Frank
> >
> > ----- Original Message -----
> > From: Arun Singh [mailto:Arun.Singh@(protected)]
> > Sent: Tuesday, January 9, 2007 16:34
> > To: suse-oracle@(protected)
> > Subject: RE: [suse-oracle] Novell - it is VERY SERIOUS! We should advise against using SuSe and Oracle if suc
> >
> > Cool down folks. Do any of you have an easy way to trigger this kernel
> > bug? I can promise the fix will take no time if we can find a way to
> > simulate this bug internally. The SUSE support/kernel team is aware of
> > the seriousness of this bug.
> >
> > -Arun
> >
> >>>> On 1/9/2007 at 1:16 AM, "Yoav Givon" <YGivon@(protected)> wrote:
> >> I strongly support you Alexei on this one. This is not very
> >> business-like and I believe Novell officials should make some strong
> >> commitment here, otherwise they will lose us. Loud and clear.
> >>
> >> Yoav Givon
> >>
> >> ----- Original Message -----
> >> From: Alexei_Roudnev [mailto:Alexei_Roudnev@(protected)]
> >> Sent: Monday, January 08, 2007 10:40 PM
> >> To: Anton Dischner; Didier Boiteux
> >> Cc: SuSE MailingList
> >> Subject: [suse-oracle] Novell - it is VERY SERIOUS! We should advise against using SuSe and Oracle if such bugs are not fixed PROMPTLY!! Re: AW: [suse-oracle] kernel bug
> >>
> >> This is a critical bug - why is it taking so long to fix?
> >>
> >> Novell is not the primary Linux at Oracle (it is not tested
> >> automatically by internal users, as they do with RHEL4.4), so it is
> >> very important to fix such problems ASAP when they arise!
> >>
> >> ----- Original Message -----
> >> From: "Anton Dischner" <Anton.Dischner@(protected)>
> >> To: "Didier Boiteux" <dboiteux@(protected)>
> >> Cc: "SuSE MailingList" <suse-oracle@(protected)>
> >> Sent: Monday, January 08, 2007 12:54 AM
> >> Subject: Re: AW: [suse-oracle] kernel bug
> >>
> >>> Hi,
> >>>
> >>> we had a kernel crash with a system which had the last kernel patches
> >>> installed.
> >>>
> >>> Kernel 2.6.5-7.283 did -not- fix the kernel BUG at fs/aio.c:733
> >>> with bug number 165140
> >>>
> >>> regards,
> >>>
> >>> Toni
> >>>
> >>> --
> >>> +------------------------------------------------------------------+
> >>> | Anton Dischner, Unix admin, Oracle-DBA    Phone: +49 89 70953202 |
> >>> | Institut fuer Klinische Chemie            Fax  : +49 89 70958888 |
> >>> | Klinikum Grosshadern                      Home : +49 89 69708766 |
> >>> | Ludwig Maximilians Universitaet Muenchen  Mobil: +49 172 8388880 |
> >>> | 81366 Muenchen Germany                                           |
> >>> | Marchioninistr. 15         Anton.Dischner"at"med.uni-muenchen.de |
> >>> +------------------------------------------------------------------+
> >>>
> >>> --
> >>> To unsubscribe, email: suse-oracle-unsubscribe@(protected)
> >>> For additional commands, email: suse-oracle-help@(protected)
> >>> Please see http://www.suse.com/oracle/ before posting