Ilmar Kerm

Oracle, databases, Linux and maybe more

I came to work this morning and found an alert in my inbox saying that one of the large databases had failed the nightly restore test.
Looking into the RMAN logs, I saw that recovery had failed while applying archivelogs, with an error I had never seen before:

ORA-00756: recovery detected a lost write of a data block
ORA-10567: Redo is inconsistent with data block (file# 3, block# 37166793, file offset is 3350347776 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/nfs/...'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'

Version: 12.1.0.2 with 2017-08 bundle patch

Looking at MOS I see two bugs that could match:

  • Bug 22302666 ORA-753 ORA-756 or ORA-600 [3020] with KCOX_FUTURE after RMAN Restore / PITR with BCT after Open Resetlogs in 12c

but this bugfix is already included in the 2017-08 bundle patch, and:

  • Bug 23589471 ORA-600 [3020] KCOX_FUTURE or ORA-756 Lost Write detected during Recovery of a RMAN backup that used BCT

This one matches my situation quite well, and the note contains a really scary sentence: "Incremental RMAN backups using BCT may miss to backup some blocks."

Bugs exist in modern software, even in the rock-solid Oracle database backup and recovery procedures. It doesn’t matter that the backup completed successfully; the state of a backup can only be determined when a restore is attempted. So please start testing your backups. Regularly. Daily.
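As a rough illustration of how a nightly restore test can turn a failed recovery into an alert, here is a minimal Python sketch that scans the text of a restore log for ORA- errors. It is not the script I actually use. The function names and the sample log are assumptions made up for this example, and a real check would probably also look at the RMAN exit code and at RMAN- messages.

```python
import re

# Matches Oracle error lines such as "ORA-00756: recovery detected ...".
ORA_ERROR = re.compile(r"(ORA-\d{5}): (.*)")

# Sample log text (hypothetical), built from the error stack shown in this post.
SAMPLE_LOG = """\
ORA-00756: recovery detected a lost write of a data block
ORA-10567: Redo is inconsistent with data block (file# 3, block# 37166793, file offset is 3350347776 bytes)
ORA-10564: tablespace SYSAUX
ORA-01110: data file 3: '/nfs/...'
ORA-10560: block type 'FIRST LEVEL BITMAP BLOCK'
"""


def find_ora_errors(log_text):
    """Return a list of (error_code, message) tuples found in the log text."""
    return [(m.group(1), m.group(2).strip()) for m in ORA_ERROR.finditer(log_text)]


def restore_test_passed(log_text):
    """Treat the restore test as passed only if the log contains no ORA- errors."""
    return not find_ora_errors(log_text)


if __name__ == "__main__":
    for code, message in find_ora_errors(SAMPLE_LOG):
        print(code, message)
    print("restore test passed:", restore_test_passed(SAMPLE_LOG))
```

The point of the sketch is only that the check runs against the result of a real restore and recovery, not against the backup job's own success status.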
