Ilmar Kerm

Oracle, databases, Linux and maybe more

I hit this issue by accident. Developers wanted to disable inserts to a child table so they could perform a one-time maintenance operation, and this maintenance only affected one row from the parent table (and all its children). I started wondering whether there is a solution with a smaller impact than taking a shared read lock on the child table.

Database version: 12.1.0.2, but the same behaviour was also present in 11.2.0.4.

Very simple test schema set up:

create table p (
  id number(10) primary key,
  v varchar2(10) not null
) organization index;

create table c (
  id number(10) primary key,
  p_id number(10) not null references p(id),
  v varchar2(10) not null
);

insert into p values (1, '1');
insert into p values (2, '2');
insert into c values (1, 1, '1');
insert into c values (2, 1, '2');
insert into c values (3, 2, '3');

create index cpid on c (p_id);

Note, the foreign key is indexed.

I wondered what would happen if I locked the parent table row first using SELECT FOR UPDATE, in order to take the lock at as low a level as possible. What would then happen to inserts into the child table? The database needs to somehow ensure that the parent row does not change or disappear while the child row is being inserted.

SQL> SELECT * FROM p WHERE id=1 FOR UPDATE;

        ID V        
---------- ----------
         1 1         

SQL> SELECT sys_context('userenv','sid') session_id from dual;

SESSION_ID                                                                     
--------------------
268                                                                             

Now row id=1 is locked in the parent table p by session 268.
Could another session insert into table c while the parent table row is locked?

SQL> SELECT sys_context('userenv','sid') session_id from dual;

SESSION_ID                                                                     
---------------------
255                                                                             

SQL> INSERT INTO c (id, p_id, v) VALUES (12, 2, 'not locked');

1 row inserted.

SQL> INSERT INTO c (id, p_id, v) VALUES (11, 1, 'locked');

So I could insert a new row into the child table c where p_id=2 (p.id=2 was not locked), but the second insert where p_id=1 (p.id=1 was locked by session 268 earlier) just hangs. Let's look at why session 255 is hanging:

SQL> select status, event, state, blocking_session from v$session where sid=255;

STATUS   EVENT                                                        STATE                                 BLOCKING_SESSION
-------- ------------------------------------------------------------ ------------------- ----------------------------------
ACTIVE   enq: TX - row lock contention                                WAITING                                            268

The session doing the insert is blocked by the session that holds a lock on the parent table row the insert refers to.
Let's look at the locks both sessions are holding/requesting:

SQL> select session_id, lock_type, mode_held, mode_requested, blocking_others, trunc(lock_id1/power(2,16)) rbs, bitand(lock_id1, to_number('ffff','xxxx'))+0 slot, lock_id2 seq
from dba_locks where session_id in (255,268) and lock_type != 'AE'
order by rbs, slot, seq;

SESSION_ID LOCK_TYPE       MODE_HELD       MODE_REQUESTED  BLOCKING_OTHERS        RBS       SLOT SEQ    
---------- --------------- --------------- --------------- --------------- ---------- ---------- --------
       255 DML             Row-X (SX)      None            Not Blocking             1      34910 0       
       268 DML             Row-X (SX)      None            Not Blocking             1      34910 0       
       255 DML             Row-X (SX)      None            Not Blocking             1      34912 0       
       255 Transaction     Exclusive       None            Not Blocking             2         23 3016    
       255 Transaction     None            Share           Not Blocking             9          3 2903    
       268 Transaction     Exclusive       None            Blocking                 9          3 2903    

 6 rows selected 
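The rbs and slot columns in the dba_locks query above are produced by splitting lock_id1 into its upper and lower 16 bits. The same arithmetic can be sketched in Python as below. The function names are made up for illustration, and the sample value is rebuilt from the rbs=9, slot=3 row in the output rather than taken from a real lock_id1 column. Note that for the DML (TM) rows lock_id1 is, as far as I know, the object id, so the rbs/slot split has no meaning there.

```python
def decode_tx_id1(id1):
    """Split a TX enqueue ID1 value into (undo segment number, slot)."""
    usn = id1 >> 16        # equivalent to trunc(lock_id1 / power(2, 16))
    slot = id1 & 0xFFFF    # equivalent to bitand(lock_id1, to_number('ffff', 'xxxx'))
    return usn, slot


def encode_tx_id1(usn, slot):
    """Inverse of decode_tx_id1, used here only to build a sample value."""
    return (usn << 16) | slot


# The blocking transaction in the output above is rbs=9, slot=3.
sample_id1 = encode_tx_id1(9, 3)
print(sample_id1, decode_tx_id1(sample_id1))
```

The decoded pair, together with lock_id2 as the sequence number, is what the v$locked_object query matches against xidusn, xidslot and xidsqn.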

SQL> select o.object_type, o.object_name from v$locked_object lo join dba_objects o on o.object_id = lo.object_id
where lo.xidusn=9 and lo.xidslot=3 and lo.xidsqn=2903;

OBJECT_TYPE     OBJECT_NAME   
--------------- ---------------
TABLE           P              

Here we see that session 268 holds a transaction (TX, row-level) lock in Exclusive mode on table P, and that it is blocking session 255, which is requesting a lock on the same row in Share mode.
From this I have to conclude that when inserting a row into the child table, Oracle also tries to get a Share row lock on the parent table. That looked perfect for my use case, and I announced victory on our internal DBA mailing list. But a few minutes later a colleague emailed me back to say that it does not work. He had recreated the setup (his own way, not using my scripts), and after locking the parent table row he was able to insert into the child table just fine.
It took some time to work out the differences between our setups. In the end I reduced them to one simple fact: I create index-organized tables by default and he creates heap tables by default, and that makes all the difference in this case. The blocking only happens when the parent table is index-organized.

Let's try the same example, but now creating the parent table as a heap table:

create table pheap (
  id number(10) primary key,
  v varchar2(10) not null
);

create table cheap (
  id number(10) primary key,
  p_id number(10) not null references pheap(id),
  v varchar2(10) not null
);

insert into pheap values (1, '1');
insert into pheap values (2, '2');
insert into cheap values (1, 1, '1');
insert into cheap values (2, 1, '2');
insert into cheap values (3, 2, '3');

create index cheappid on cheap (p_id);

Lock the parent row in one session:

SQL> SELECT * FROM pheap WHERE id=1 FOR UPDATE;
        ID V        
---------- ----------
         1 1         

SQL> SELECT sys_context('userenv','sid') session_id from dual;

SESSION_ID                                                                                                                                                                                             
--------------------
3  

And try to insert into child from another session:

SQL> SELECT sys_context('userenv','sid') session_id from dual;

SESSION_ID    
---------------
12             

SQL> INSERT INTO cheap (id, p_id, v) VALUES (12, 2, 'not locked');

1 row inserted.

SQL> INSERT INTO cheap (id, p_id, v) VALUES (11, 1, 'locked');

1 row inserted.

No waiting whatsoever. Does anybody know why IOT and heap tables behave differently here?

UPDATE 2016-06-26: After I got Jonathan Lewis involved on Twitter, he solved this mystery very quickly 🙂 Link to twitter conversation

Oracle EE 11.2.0.4 on Linux x86-64.

I got a really surprising error message today when setting up a new data guard standby database.
I created a standby controlfile as usual and placed it on a common NFS share accessible also to the new data guard host:

SQL> alter database create standby controlfile as '/nfs/install/oemdb/cf2.f';

Database altered.

Then, on the new node, I tried to restore that controlfile, but got RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece. This shouldn't happen: the file is simply stored on a common NFS share and should not be damaged.

RMAN> restore controlfile from '/nfs/install/oemdb/cf2.f';

Starting restore at 20-MAY-16
using channel ORA_DISK_1

RMAN-00571: ===========================================================
RMAN-00569: =============== ERROR MESSAGE STACK FOLLOWS ===============
RMAN-00571: ===========================================================
RMAN-03002: failure of restore command at 05/20/2016 12:58:33
RMAN-06172: no AUTOBACKUP found or specified handle is not a valid copy or piece

The error message does not say so, but I remembered that I had mounted the NFS share using the SOFT mount option. When you try to restore datafiles from a soft-mounted NFS share you will usually get ORA-27054: NFS file system not mounted with correct options, unless you have turned on Direct NFS in the database kernel. So I wondered whether that was the real error in this case as well.
After turning on Direct NFS, restoring the control file worked as expected:

[production|oracle@vdb0005.mlt.unibet.com oemdb]$ cd $ORACLE_HOME/rdbms/lib
[production|oracle@vdb0005.mlt.unibet.com lib]$ make -f ins_rdbms.mk dnfs_on
rm -f /u01/app/oracle/product/11.2.0.4/db/lib/libodm11.so; cp /u01/app/oracle/product/11.2.0.4/db/lib/libnfsodm11.so /u01/app/oracle/product/11.2.0.4/db/lib/libodm11.so
[production|oracle@vdb0005.mlt.unibet.com lib]$ sqlplus / as sysdba

SQL*Plus: Release 11.2.0.4.0 Production on Fri May 20 13:01:56 2016

Copyright (c) 1982, 2013, Oracle.  All rights reserved.

Connected to an idle instance.

SQL> startup nomount
ORACLE instance started.

Total System Global Area 9620525056 bytes
Fixed Size                  2261368 bytes
Variable Size            2449477256 bytes
Database Buffers         7147094016 bytes
Redo Buffers               21692416 bytes
SQL> Disconnected from Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production
With the Partitioning, OLAP, Data Mining and Real Application Testing options
[production|oracle@vdb0005.mlt.unibet.com lib]$ rman target /

Recovery Manager: Release 11.2.0.4.0 - Production on Fri May 20 13:02:14 2016

Copyright (c) 1982, 2011, Oracle and/or its affiliates.  All rights reserved.

connected to target database: OEM (not mounted)

RMAN> restore controlfile from '/nfs/install/oemdb/cf2.f';

Starting restore at 20-MAY-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=474 device type=DISK

channel ORA_DISK_1: copied control file copy
output file name=+DATA/oem/controlfile/current.257.912344539
Finished restore at 20-MAY-16

The NFS share was mounted using options:
type nfs (rw,bg,soft,rsize=32768,wsize=32768,tcp,nfsvers=3,timeo=600,addr=10.10.10.10)
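To spot a soft mount before RMAN stumbles over it, the option list shown above can be checked mechanically. The following is only a minimal sketch: the helper names are invented for this example, and it assumes the options appear in parentheses in the same form as the mount output above. It only flags the soft option, since that is the one this post is about.

```python
def parse_mount_options(mount_line):
    """Return the options inside the parentheses of a mount output line as a dict."""
    start = mount_line.index("(") + 1
    end = mount_line.rindex(")")
    options = {}
    for item in mount_line[start:end].split(","):
        # Flags such as "soft" have no value; "rsize=32768" splits into key and value.
        key, _, value = item.strip().partition("=")
        options[key] = value
    return options


def is_soft_mounted(mount_line):
    """True when the mount line carries the 'soft' option."""
    return "soft" in parse_mount_options(mount_line)


line = "type nfs (rw,bg,soft,rsize=32768,wsize=32768,tcp,nfsvers=3,timeo=600,addr=10.10.10.10)"
if is_soft_mounted(line):
    print("soft mount: expect trouble restoring from this share without Direct NFS")
```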

I’ll be presenting my brand new presentation “Using image copies for Oracle database backups” at ilOUG Tech Days on 30 May in Israel.

More information about the event can be found here

Abstract of my presentation:
When databases get ever larger and larger, backing them up using traditional RMAN backupsets will quickly get unfeasible. Completing a backup requires too much time and resources, but more importantly the same also applies to restores. RMAN has always provided a solution as incrementally updated image copies, but they are much less manageable than backupsets. This presentation goes into detail on how to successfully implement incrementally updated image copy backups, automate them and implement features that together with a capable storage system can provide almost everything that Oracle ZDLRA promises and beyond.

Looking forward to the event!

Since 11.1 RMAN has had a silent new feature: RMAN Backup Undo Optimization. This feature excludes from backups the undo of committed transactions (once undo_retention time has also passed), possibly making the undo tablespace backup much smaller. The documentation just says that it works for disk backups and Oracle Secure Backup tape backups. Since I’ve lately been playing around a lot with image copy backups, I wanted to find out whether this feature only works with backupsets or whether it also works for incrementally refreshed image copies.

I first thought that it cannot possibly work with image copies, since image copies should be exact datafile copies. On the other hand, when you refresh an image copy you first have to create an incremental backupset of the changes, which you then apply to the image copy, so maybe the optimization is silently applied there as well 🙂 That would be really good. Better to test it out. Fingers crossed.

I’m using 12.1.0.2 on OEL 7.2.

Before taking the test I created an image copy from my undo tablespace (309 338 112 bytes):

RMAN> BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'image_copy_backup' TABLESPACE UNDOTBS1;
-rw-r-----+ 1 oracle oinstall 309338112 Dec 28 05:06 data_D-ORCL_I-1433672784_TS-UNDOTBS1_FNO-3_04qvtmir

Yes I know, my filesystem dates were wrong at that point 🙂 Ignore this, NTP wasn’t running on the storage box.

Also a level 0 uncompressed backupset of the same tablespace (207 110 144 bytes, so it has already been optimized, but I’m interested in the next incremental backup size):

RMAN> BACKUP INCREMENTAL LEVEL 0 TABLESPACE UNDOTBS1;
-rw-r-----+ 1 oracle oinstall 207110144 Dec 28 05:16 0kqvtpaj_1_1

Next I ran a large UPDATE statement and committed it immediately. I also had snapper running to catch the amount of undo my update caused. Snapper reported that my update generated 146MB of undo:

STAT, undo change vector size                                   ,     146 042 740

Immediately afterwards I ran an incremental backup both ways: as a regular backupset and as an incremental backup for refreshing the image copy.
The command BACKUP INCREMENTAL LEVEL 1 TABLESPACE UNDOTBS1 produced a file named 0mqvtpkf_1_1, and the command BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'image_copy_backup' TABLESPACE UNDOTBS1 produced a file named 0oqvtpm2_1_1. As you can see, both are of roughly similar size and close to the reported undo change vector size.
No surprise here: undo optimization did not kick in, since undo_retention time had not yet passed.

-rw-r-----+ 1 oracle oinstall 151470080 Dec 28 05:21 0mqvtpkf_1_1
-rw-r-----+ 1 oracle oinstall 181190656 Dec 28 05:22 0oqvtpm2_1_1

Then I deleted both these files and removed them from RMAN catalog.

After 30 minutes or so (my undo_retention time is 600 = 10 minutes) I ran the backup commands again:

RMAN> backup incremental level 1 tablespace undotbs1;

Starting backup at 07-MAR-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=45 device type=DISK
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=/u01/app/oracle/oradata/ORCL/datafile/o1_mf_undotbs1_cfvpb5hx_.dbf
channel ORA_DISK_1: starting piece 1 at 07-MAR-16
channel ORA_DISK_1: finished piece 1 at 07-MAR-16
piece handle=/nfs/backup/orcl/14qvtsgf_1_1 tag=TAG20160307T230238 comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:01
Finished backup at 07-MAR-16

RMAN> BACKUP INCREMENTAL LEVEL 1 FOR RECOVER OF COPY WITH TAG 'image_copy_backup' tablespace undotbs1;

Starting backup at 07-MAR-16
using target database control file instead of recovery catalog
allocated channel: ORA_DISK_1
channel ORA_DISK_1: SID=61 device type=DISK
channel ORA_DISK_1: starting incremental level 1 datafile backup set
channel ORA_DISK_1: specifying datafile(s) in backup set
input datafile file number=00003 name=/u01/app/oracle/oradata/ORCL/datafile/o1_mf_undotbs1_cfvpb5hx_.dbf
channel ORA_DISK_1: starting piece 1 at 07-MAR-16
channel ORA_DISK_1: finished piece 1 at 07-MAR-16
piece handle=/nfs/backup/orcl/16qvtsj0_1_1 tag=IMAGE_COPY_BACKUP comment=NONE
channel ORA_DISK_1: backup set complete, elapsed time: 00:00:07
Finished backup at 07-MAR-16

This can’t be good: the regular backupset took only 1 second to execute, while the incremental backup for the image copy refresh took 7 seconds.
Looking at the file sizes, the difference is clear: 1.7 MB for the regular incremental backup and 181 MB (no change) for the image copy refresh:

-rw-r-----+ 1 oracle oinstall   1794048 Mar  7 23:02 14qvtsgf_1_1
-rw-r-----+ 1 oracle oinstall 181567488 Mar  7 23:04 16qvtsj0_1_1

So the backup undo optimization works, but only if you use backupsets.
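To put numbers on that conclusion, the file sizes from the two test rounds can be compared directly. This is just a small arithmetic sketch using the byte counts listed in this post; the function name is made up for the example.

```python
def size_ratio(before_bytes, after_bytes):
    """Size of the later backup piece relative to the earlier one."""
    return after_bytes / before_bytes


# Byte counts from the ls listings in this post: the first pair was taken right
# after the UPDATE, the second pair after undo_retention had passed.
backupset = size_ratio(151470080, 1794048)
image_copy_incremental = size_ratio(181190656, 181567488)

print(f"regular level 1 backupset: {backupset:.1%} of its earlier size")
print(f"level 1 for recover of copy: {image_copy_incremental:.1%} of its earlier size")
```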

I published my 2014 presentation “Making MySQL highly available using Oracle Grid Infrastructure” on SlideShare.
Please also read my page on how to set up the MySQL scripts for Oracle GI