I’ve got OCP

It is hard for me to keep writing on this blog. Fortunately, this morning I found a comment on the post “RAC Virtual IP”, and after answering it I decided to post a little update. On 20 November 2008 I took and passed the Oracle 10g R2 Database OCP exam, so I’m an OCP!

I’m still alive

… but not OCP. Yes, lately I’ve been very busy and very lazy, so I haven’t written anything here. I’m still studying (?) to get the OCP, but work does not leave me much time to study or to write something useful or interesting here. I’m writing this useless post only to say that I’m here and I’m still working with Oracle. By the way, one customer of my company has recently upgraded from 9iR2 to 11g. It’s incredible, but it seems that 11g works fine.

OCP: Oracle Certified Professional

On 6 December 2007 I got my OCA certification. Now I dream of getting the OCP certification, so I’ve started on the “OCP Oracle 10g Administration II” book by Sybex. I like challenges.

Oracle Certification

I’m studying for the Oracle certification: OCA (Oracle Certified Associate), the first level of the Oracle 10g database certification. So I’ve created a new page on this blog (certification) where I put my personal notes, mostly on the topics where I’m least prepared. This means that the page probably does not contain much interesting information; on the other hand, all the information needed for the certification is contained in Oracle’s public documentation, the manuals.

Writing that page helps me remember things.

Another Interesting Issue

On the same database I’ve already blogged about, I ran into another interesting issue. Sometimes the database goes down without any message in alert.log. The database is used by two applications, and in the log of one of them there was:

java.sql.SQLException: ORA-01114: IO error writing block to file 201 (block # 188712)
ORA-27070: skgfdisp: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 2) The system cannot find the file specified.
ORA-01114: IO error writing block to file 201 (block # 188712)
ORA-27070: skgfdisp: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 2) The system cannot find the file specified.
ORA-01114: IO error writing block to file 201 (block # 188712)
ORA-27070: skgfdisp: async read/write failed
OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 2) The system cannot find the file specified.

I also opened an SR on Metalink, but the only suggestion I got was that it was a hardware problem. Two things were strange:

  1. There was nothing in alert.log
  2. The file indicated by the message in the application log (#201) did not exist in the database

After a while we were able to reproduce the problem: it was a query with a GROUP BY on a large data set. After a couple of tests my intuition was that the problem was the TEMPORARY tablespace, so I created a new TEMPORARY tablespace, set it as the new default temporary tablespace and re-ran the test, this time with success. It seems clear to me that there is a bug that causes Oracle (9.2.0.1) to crash when the TEMPORARY tablespace has a particular kind of corruption.
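One possible explanation for the missing file #201, offered here as an assumption and not something confirmed in this case: Oracle error messages usually report a tempfile with its number added to the value of the DB_FILES parameter. If DB_FILES were 200 (an assumed value, it was not checked on this database), file 201 would be the first tempfile of the TEMPORARY tablespace. A minimal Python sketch of that arithmetic, where `tempfile_number` is a hypothetical helper name:

```python
from typing import Optional


def tempfile_number(reported_file_no: int, db_files: int) -> Optional[int]:
    """Map a file number from an ORA-01114 message to a tempfile number.

    Assumption: tempfiles are reported as DB_FILES + tempfile number, so a
    reported number that is not above DB_FILES is treated as a normal datafile
    and None is returned.
    """
    if reported_file_no > db_files:
        return reported_file_no - db_files
    return None


# DB_FILES = 200 is an assumed value, not read from the real database.
print(tempfile_number(201, db_files=200))  # file number from the application log
print(tempfile_number(2, db_files=200))    # an ordinary datafile number
```

If the assumption holds, this would fit the intuition that the TEMPORARY tablespace was involved, but the real DB_FILES value should be checked before drawing that conclusion.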

Undo Tablespace Corruption

Some time ago I ran into a problem with a customer’s database: Oracle 9.2.0.1 on Windows 2000 with Oracle Fail Safe.

In alert.log we found:

KCF: write/open error block=0x4351 online=1
file=2 O:\ORACLE\ORADATA\GEOP\UNDOTBS01.DBF
error=27070 txt: 'OSD-04016: Error queuing an asynchronous I/O request.
O/S-Error: (OS 2) The system cannot find the file specified.'
Automatic datafile offline due to write error on
file 2: O:\ORACLE\ORADATA\GEOP\UNDOTBS01.DBF
Tue Jul 10 02:08:42 2007
Errors in file o:\oracle\admin\geop\udump\geop_ora_844.trc:
ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: 'O:\ORACLE\ORADATA\GEOP\UNDOTBS01.DBF'
ORA-00372: file 2 cannot be modified at this time
ORA-01110: data file 2: 'O:\ORACLE\ORADATA\GEOP\UNDOTBS01.DBF'

where the two lines

ORA-00376: file 2 cannot be read at this time
ORA-01110: data file 2: 'O:\ORACLE\ORADATA\GEOP\UNDOTBS01.DBF'

were repeated thousands of times.
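When the same messages are repeated thousands of times, a short script can reduce the log to one line per distinct error with a count, so the few different messages stand out. This is only a sketch in Python: `summarize_ora_lines` is a hypothetical helper, and the sample lines are copied from the excerpt above.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def summarize_ora_lines(lines: Iterable[str]) -> List[Tuple[str, int]]:
    """Count each distinct ORA- line, most frequent first."""
    stripped = (line.strip() for line in lines)
    counts = Counter(line for line in stripped if line.startswith("ORA-"))
    return counts.most_common()


# Three repetitions stand in for the thousands found in the real alert.log.
sample = [
    "ORA-00376: file 2 cannot be read at this time",
    r"ORA-01110: data file 2: 'O:\ORACLE\ORADATA\GEOP\UNDOTBS01.DBF'",
] * 3

for message, count in summarize_ora_lines(sample):
    print(count, message)
```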

At the same time, in the Windows Event Viewer:

Event Type: Warning
Event Source: Ftdisk
Event Category: None
Event ID: 50
Date: 07/07/2007
Time: 07:55:15
User: N/A
Computer: GEOCALL2
Description:
{Lost Delayed-Write Data} The system was attempting to transfer file data from buffers to \Device\HarddiskVolume5. The write operation failed, and only some of the data may have been written to the file.
Data:
0000: 00 00 04 00 02 00 56 00 ......V.
0008: 00 00 00 00 32 00 04 80 ....2..€
0010: 00 00 00 00 00 00 00 00 ........
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: 0e 00 00 c0 ...À

and repeated messages like this one:

Event Type: Warning
Event Source: Disk
Event Category: None
Event ID: 51
Date: 07/10/2007
Time: 02:21:01
User: N/A
Computer: GEOCALL2
Description:
An error was detected on device \Device\Harddisk3\DR3 during a paging operation.
Data:
0000: 04 00 22 00 01 00 72 00 .."...r.
0008: 00 00 00 00 33 00 04 80 ....3..€
0010: 2d 01 00 00 0e 00 00 c0 -......À
0018: 00 00 00 00 00 00 00 00 ........
0020: 00 00 00 00 00 00 00 00 ........
0028: 04 00 00 00 03 00 00 00 ........
0030: 00 00 00 00 2a 00 00 00 ....*...
0038: 00 08 00 00 00 00 00 00 ........
0040: 2a 00 02 4f db 2f 00 00 *..OÛ/..
0048: 08 00 ..

I have to say that this Oracle installation is not very lucky: messages about disk problems sometimes come back in the Event Viewer, but the hardware vendor tells us that there are no problems with the hardware.

Another thing I have to remember is to read the alert.log with my own eyes. In fact, I was called by a colleague and I did not see the line

Automatic datafile offline due to write error on
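A line like this is easy to miss among thousands of ORA- messages because it does not start with an error code. A minimal Python sketch of a search for it follows; `find_offline_events` is a hypothetical helper, and it assumes that the line after the message names the file, as in the excerpt above.

```python
from typing import Iterable, List, Tuple

MARKER = "Automatic datafile offline due to write error on"


def find_offline_events(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line_number, detail) for each automatic-offline message.

    The detail is the line that follows the marker, or an empty string
    when the marker is the last line of the input.
    """
    rows = [line.rstrip("\n") for line in lines]
    events = []
    for index, row in enumerate(rows):
        if row.startswith(MARKER):
            detail = rows[index + 1].strip() if index + 1 < len(rows) else ""
            events.append((index + 1, detail))  # 1-based line number
    return events


sample = [
    "Automatic datafile offline due to write error on",
    r"file 2: O:\ORACLE\ORADATA\GEOP\UNDOTBS01.DBF",
    "ORA-00376: file 2 cannot be read at this time",
]

for line_number, detail in find_offline_events(sample):
    print(line_number, detail)
```

The same function accepts an open file object, so it could be run directly on a copy of alert.log.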

I immediately suspected corruption in the file, so I created a new UNDO tablespace and changed the UNDO_TABLESPACE parameter to point to it. Then we tried to drop the old tablespace, but we got a message that a rollback segment was “active”. There were no records in V$TRANSACTION, and I could not understand why Oracle was telling us that. The database could be opened, but at one step the application still got an error message from Oracle stating that the datafile of the old undo tablespace was not available. So we decided to restore the tablespace from backup and recover it. After that I brought the tablespace online and ran a “select count(*)” on a table (the one used by the application when it got the error). After that I was able to drop the tablespace together with its datafile.

Conclusion

My description has been confused, but the conclusion, and the lesson I’ve learned, is that the UNDO tablespace may contain data needed for the integrity of the database. I think this is a case of “delayed block cleanout”. If there are active transactions, that is obvious and visible in the V$TRANSACTION system view, but in the case of “delayed block cleanout” I think that information is not easily available.

English Dictionary

I’ve been busy for a while, so I haven’t written new posts here. But two days ago I bought a fantastic new English dictionary: my battle with the English language is very hard, but I’ll never give up, so please forgive me. In the meantime I still write on my original blog.