Hi all,
Can anybody give me some insight into what RMAN is doing about ten
minutes before it fails with an error like the one below? I ask
because an unrelated process (always the same one) frequently sends a
timeout message about ten minutes before RMAN fails. When RMAN finishes
successfully, the unrelated process never times out.
The bottom of the RMAN log file has the following error messages:
RMAN-03009: failure of backup command on t2 channel at 07/01/2007 12:27:19
ORA-27191: sbtinfo2 returned error
Additional information: 2
ORA-19511: Error received from media manager layer, error text:
do_info2: A connection to NW server 'tcclnw01' could not be established because 'Port mapper failure'.
The person in charge of our media management layer says it
just means the RMAN process is not able to find a port to connect
to. He will be looking into the problem further. However,
this does not explain why we frequently see a timeout error with this unrelated
process.
I suspect that RMAN may be utilizing all the CPU under these
conditions, but I don’t know for sure. This backup occurs on the
weekend, and I am not monitoring the system at that time.
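One way to test this suspicion without being present is to log system
statistics at a fixed interval over the weekend and compare them with
the timeout and failure times afterwards. The following is only a
minimal sketch, assuming Python 3.7 or later is available on the host;
the function name sample_to_log and its parameters are hypothetical,
not part of RMAN or any Oracle tool.

```python
import subprocess
import time
from datetime import datetime


def sample_to_log(command, log_path, samples, pause_seconds):
    """Run `command` `samples` times, appending timestamped output to `log_path`.

    Hypothetical helper: `command` is any statistics command given as an
    argument list, for example a vmstat invocation.
    """
    with open(log_path, "a") as log:
        for i in range(samples):
            # Same date layout as the RMAN log, so the two are easy to line up.
            stamp = datetime.now().strftime("%m/%d/%Y %H:%M:%S")
            result = subprocess.run(command, capture_output=True, text=True)
            for line in result.stdout.splitlines():
                log.write(f"{stamp} {line}\n")
            # Flush each sample so the log survives if the system hangs.
            log.flush()
            if i < samples - 1:
                time.sleep(pause_seconds)
```

For example, sample_to_log(["vmstat", "1", "2"], "/tmp/weekend_vmstat.log",
2880, 60) would take roughly one sample per minute for about 48 hours.
The log path and the sampling interval are assumptions and would need
adjusting for the actual backup window.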
Additional Information:
- AIX 5.3
- Oracle 10.2.0.2
- RMAN backup to tape (which you already know because a
media management layer is involved)
- We rarely (or never) see the timeout error from the unrelated
process when RMAN completes successfully.
- In the past, this unrelated process has also timed out when
a runaway process caused the entire UNIX system to hang (until the
runaway process finished).
Thanks a lot for any insight you can provide!
Sam Bootsma
Oracle Database Administrator
Phone: 416-415-5000 x4933
Fax: 416-415-4836
E-mail: sbootsma@georgebrown.ca