Ilmar Kerm

Oracle, databases, Linux and maybe more

If you need to make HTTPS requests (or other TLS-secured network requests) from the Oracle database, you also need a wallet containing the certificate authority (CA) certificates you trust. In today's distributed web world such a wallet is quite painful to keep updated. But operating systems and popular web browsers already come with bundled trust lists – the certificate authorities the OS vendor has considered trustworthy according to their policies. Wouldn't it be nice to make those usable for Oracle Database?

Enterprise Linux distributions have a package called ca-certificates that contains the CAs Mozilla has deemed trustworthy; the bundle is available in the file /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem.

An Oracle Wallet is basically a PKCS#12 container, so we can just use the standard openssl command to generate it.
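
A minimal sketch (the wallet directory and password are placeholders):

# Convert the OS CA trust bundle into a PKCS#12 wallet usable by Oracle
openssl pkcs12 -export -nokeys \
    -in /etc/pki/ca-trust/extracted/pem/tls-ca-bundle.pem \
    -out /u01/app/oracle/wallet/ewallet.p12 \
    -passout pass:WalletPassword123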

I tested it with Oracle Database 19.18 and it accepted the generated file without any problems.
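
A quick way to verify from inside the database (URL and wallet path are placeholders; the user also needs execute rights on UTL_HTTP and a network ACL that allows the call):

SQL> select utl_http.request('https://www.oracle.com/', null, 'file:/u01/app/oracle/wallet', 'WalletPassword123') from dual;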

In order to prepare for the APEX 23.1 upgrade in production, I upgraded several test environments from the existing 21.1.5 to 23.1.3. Nothing special about the upgrade, no errors. But then the developers requested that I downgrade one of the environments back to 21.1, since they needed to make some changes to existing apps.

APEX downgrade should be easy, since APEX always installs a new version into a new schema (23.1 goes into APEX_230100, 21.1 is installed in APEX_210100) and then just copies over the application metadata. Downgrading should then simply be a matter of pointing APEX back to the old schema. The downgrade is documented and Oracle provides the apxdwngrd.sql script for it.

After running apxdwngrd.sql and dropping the APEX_230100 schema, users started receiving a page from ORDS saying that “Application Express is currently unavailable”.

No other information, no error messages in the ORDS logs, no errors in alert.log. How does ORDS determine that APEX is unavailable? No idea.

I did the usual checks and saw ORDS sessions connected to the database, so there were no connection issues.

SQL> select * from dba_registry where comp_id='APEX';

version = 21.1.5
status = VALID
procedure = VALIDATE_APEX

SQL> exec sys.validate_apex;

PL/SQL procedure successfully completed.

SQL> select * from dba_registry where comp_id='APEX';

version = 21.1.5
status = VALID

The APEX component in the database is valid, with the correct version, and after validation it is still VALID.

SQL> select count(*) from dba_objects where owner='APEX_210100' and status != 'VALID';

0

There are no invalid objects in the APEX schema, but I did see some public synonyms still left over from APEX 23.1 – so I cleaned those up and then recreated all APEX 21.1 public synonyms.

SQL> select 'drop public synonym '||synonym_name||';' from dba_synonyms where owner='PUBLIC' and table_owner like 'APEX_230100';

... cleaned them up

SQL> alter session set current_schema = APEX_210100;
SQL> exec wwv_flow_upgrade.recreate_public_synonyms('APEX_210100');

No help… Still the same “Application Express is currently unavailable” to the users.

After that I got a useful tip on Twitter to check the view APEX_RELEASE.

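A check along these lines shows it (output reconstructed; VERSION_NO and PATCH_APPLIED are columns of the APEX_RELEASE view):

SQL> select version_no, patch_applied from apex_release;

VERSION_NO   PATCH_APPLIED
21.1.5       APPLYING
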
The version is correct, but PATCH_APPLIED=APPLYING? That cannot be right – and checking the not-yet-upgraded source production database, it said PATCH_APPLIED=APPLIED.

First I tried to reapply the 21.1.5 patch, and then the 21.1.7 patch, but neither of them reset this PATCH_APPLIED field.

Time to dig into APEX internals. I see that the PATCH_APPLIED field is populated using the function call wwv_flow_platform.get_patch_status_and_init_cgi, but the code for it is wrapped. We also have the good old SQL trace, which showed me that this function is executing

SELECT VALUE FROM APEX_210100.WWV_FLOW_PLATFORM_PREFS WHERE NAME = 'APEX_PATCH_STATUS';

And there it is – the value APPLYING stored as a plain string. What if I just update it?

update APEX_210100.WWV_FLOW_PLATFORM_PREFS set value='APPLIED' WHERE NAME = 'APEX_PATCH_STATUS';

And the “Application Express is currently unavailable” message is gone and APEX works again! My guess is that something in the 23.1 upgrade process sets the old schema to APPLYING mode and the downgrade does not reset it. Sounds like a bug.

NB! Don’t do this in production – talk to Oracle Support first. I only did it because it was a development environment and the developers wanted to get their environment back fast.

I’ve been using the good old Radius authentication protocol to authenticate database accounts (created for humans) with Active Directory credentials. It may sound like a strange use case, especially since Oracle also advertises its own Active Directory integration (Centrally Managed Users) and there is also Kerberos. I’ve had the following problems with them:

  • CMU – in order to use Active Directory passwords, the AD schema needs to be modified and a password filter installed on the AD side. I think the latter alone removes this feature from consideration.
  • Kerberos – passwordless login is very tempting, and if you get it running on the Oracle side it is definitely a feature to consider. But deploying it at scale and maintaining it is a nightmare, almost impossible to automate.

Radius, on the other hand – Windows domain controllers have a Radius server built in, and on the Oracle database side it is very easy to deploy at large scale and maintain.

Configure database server

First add the following to the database sqlnet.ora. The file /u01/app/oracle/radius.key is a simple text file containing just the Radius secret. 10.0.0.1 and 10.0.0.2 are my Radius servers running on Windows, both using port 1812.

# Radius authentication settings
SQLNET.RADIUS_AUTHENTICATION_PORT = 1812
SQLNET.RADIUS_SECRET=/u01/app/oracle/radius.key
SQLNET.RADIUS_AUTHENTICATION = 10.0.0.1
SQLNET.RADIUS_ALTERNATE = 10.0.0.2

In the database itself, set the parameter os_authent_prefix to an empty string:

alter system set os_authent_prefix='' scope=spfile;

Then create the database users IDENTIFIED EXTERNALLY; the database username must match the Windows AD username.

CREATE USER ilmker IDENTIFIED EXTERNALLY;

Configure the client

The downside of Radius-authenticated users is that the database client must also support Radius. The Oracle thick driver supports it just fine, as does the JDBC thin driver.

When using the Oracle thick client (Instant Client), turn on Radius authentication by adding it to sqlnet.ora:

$ cat ORACLE_CLIENT_HOME/network/admin/sqlnet.ora

SQLNET.AUTHENTICATION_SERVICES=(RADIUS)

After that you can use this client to log into the database using both database-authenticated users and Radius-authenticated users.
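
For example (hostname and service name are placeholders), a Radius-authenticated login then looks like any other login, except the password is validated against AD by the Radius server:

$ sqlplus ilmker@//oracle.db.example.com:1521/application_service.domain
Enter password:

SQL> show user
USER is "ILMKER"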

The JDBC thin driver is a little bit trickier (tested using the 21c JDBC driver).

To use Radius, add the following Java VM option – but the problem with it is that you cannot use database-authenticated users after turning on this option.

-Doracle.net.authentication_services='(RADIUS)'

If you want to use it with SQL Developer, add the following to the product.conf file:

AddVMOption -Doracle.net.authentication_services='(RADIUS)'

As mentioned earlier, this disables database-authenticated accounts, so in the case of SQL Developer changing product.conf is not desirable.

Since the 19c JDBC thin driver, it is also possible to set Java properties within the connection string using EasyConnect syntax:

tcp://oracle.db.example.com:1521/application_service.domain?oracle.net.authentication_services='(RADIUS)'
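
For illustration, a minimal JDBC sketch using this syntax (hostname, service name and credentials are placeholders; the Oracle JDBC driver must be on the classpath):

import java.sql.Connection;
import java.sql.DriverManager;

public class RadiusConnectTest {
    public static void main(String[] args) throws Exception {
        // Enable Radius just for this connection through the EasyConnect URL,
        // instead of the JVM-wide oracle.net.authentication_services option
        String url = "jdbc:oracle:thin:@tcp://oracle.db.example.com:1521/"
            + "application_service.domain"
            + "?oracle.net.authentication_services='(RADIUS)'";
        // Username and password are validated by the Radius server against AD
        try (Connection conn = DriverManager.getConnection(url, "ilmker", "ad-password")) {
            System.out.println("Connected as " + conn.getMetaData().getUserName());
        }
    }
}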

One bug that I discovered in the JDBC thin driver support for Radius (Oracle is still working on it): if you use Radius together with TCPS and the database server also has Oracle Native Network Encryption enabled, you will get the following error from the JDBC driver: IO Error: Checksum fail.

This is a rather strange error, since when using TCPS, Oracle Native Encryption should be turned off automatically – yet this error comes from Native Encryption checksumming. To get around it, you have to disable Native Encryption checksumming on the client side, which can also be done from inside the connection string.

tcps://oracle.db.example.com:1523/application_service.domain?oracle.net.authentication_services='(RADIUS)'&oracle.net.crypto_checksum_client=REJECTED

We have hundreds of developers who need access to hundreds of application schemas, deployed to multiple locations. The data is sensitive, and there is a requirement that all human access must go through a personal database account that can reach only the allowed application schemas. Managing all these access privileges in a nice and easy way has always been a struggle for me. Since very many different databases are involved, the database itself cannot be the source of truth for the access privileges, and so far we have just synchronised the privileges from the source-of-truth system to all the individual databases.

I think there is a better way now – Secure Application Roles.

The idea behind them is very flexible – there is no need to grant the individual roles to database users; instead, a user executes a procedure (which does all the necessary validations), and that procedure then enables the role(s) for the user session.

Let’s first set up a common SECURITY_MANAGER schema that will contain all our security-related code and logging.

create user security_manager no authentication
  default tablespace users
  quota 1g on users;

grant create table, create procedure to security_manager;

create table security_manager.allowed_grants (
    db_username varchar2(128) not null,
    db_role varchar2(128) not null,
    primary key (db_username,db_role)
) organization index;

create table security_manager.role_grant_log (
    grant_time timestamp default sys_extract_utc(systimestamp) not null,
    db_username varchar2(128) not null,
    granted_role varchar2(128) not null,
    is_allowed number(1) not null,
    comments varchar2(1000),
    client_host varchar2(200),
    client_ip varchar2(50),
    unified_audit_session_id varchar2(100)
);

Here are two helper programs, since the ACTIVATE_ROLE procedure below needs to read from and write to tables in the SECURITY_MANAGER schema, and I do not want to grant access to these tables directly to users.

-- The job of the following procedure is just logging the grant request
CREATE OR REPLACE PROCEDURE security_manager.log_role_grant
    (p_requested_role role_grant_log.granted_role%TYPE
      , p_is_allowed role_grant_log.is_allowed%TYPE
      , p_comments role_grant_log.comments%TYPE) 
ACCESSIBLE BY (activate_role) IS
    PRAGMA AUTONOMOUS_TRANSACTION;
BEGIN
    INSERT INTO role_grant_log
      (db_username, granted_role, is_allowed, comments
        , client_host, client_ip, unified_audit_session_id)
    VALUES 
      (SYS_CONTEXT('USERENV','SESSION_USER')
       , p_requested_role, p_is_allowed, p_comments
       , SYS_CONTEXT('USERENV','HOST')
       , SYS_CONTEXT('USERENV','IP_ADDRESS')
       , SYS_CONTEXT('USERENV','UNIFIED_AUDIT_SESSIONID'));
    COMMIT;
END;
/

-- The following function just checks against the master authorisation table
-- whether the user is allowed to activate the role or not
CREATE OR REPLACE FUNCTION security_manager.is_role_allowed
    (p_username allowed_grants.db_username%TYPE
      , p_requested_role role_grant_log.granted_role%TYPE) RETURN boolean
ACCESSIBLE BY (activate_role) IS
    v_is_role_allowed NUMBER;
BEGIN
    SELECT COUNT(*) INTO v_is_role_allowed
    FROM allowed_grants
    WHERE db_username = p_username AND db_role = UPPER(p_requested_role);
    RETURN v_is_role_allowed = 1;
END;
/

Now the security code itself. The procedure below just checks against a simple table whether the logged-in user is allowed to activate the requested role, and it also logs the request. In real life it can be much more complex – the code could make a REST call to an external authorisation system, and of course the logging should be much more detailed.

NB! The procedure must be declared AUTHID CURRENT_USER – using invoker’s rights.

CREATE OR REPLACE PROCEDURE security_manager.activate_role
    (p_requested_role allowed_grants.db_role%TYPE
    , p_comments role_grant_log.comments%TYPE)
AUTHID CURRENT_USER IS
    v_activated_roles VARCHAR2(4000);
BEGIN
    -- Check if the user is allowed to activate the requested role
    IF NOT is_role_allowed(SYS_CONTEXT('USERENV','SESSION_USER'), p_requested_role) THEN
        log_role_grant(upper(p_requested_role), 0, p_comments);
        raise_application_error(-20000, 'You are not allowed to activate the requested role.');
    END IF;
    -- Query all roles that are currently active for the session and append the requested role to that list
    SELECT listagg(role, ',') WITHIN GROUP (ORDER BY role) INTO v_activated_roles FROM (
        SELECT role FROM session_roles
        UNION
        SELECT upper(p_requested_role) FROM dual
    );    
    -- Activate all roles
    log_role_grant(upper(p_requested_role), 1, p_comments);
    DBMS_SESSION.SET_ROLE(v_activated_roles);
END;
/

Now I create the role itself and grant the role read access to one application table. The IDENTIFIED USING clause does the magic here – it tells Oracle that the sec_app_role_test1 role can only be enabled by the security_manager.activate_role procedure.

CREATE ROLE sec_app_role_test1
    IDENTIFIED USING security_manager.activate_role;

GRANT READ ON app1.t1 TO sec_app_role_test1;
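
For reference, the demo application schema used here can be as simple as this (a sketch run as a DBA; in my case app1.t1 already existed):

-- Minimal application schema for the demo
create user app1 no authentication
    quota 100m on users;
create table app1.t1 (id number);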

My developer personal account is called ILMKER, and this account only needs execute privileges on my security procedure. In real life you would grant this execute privilege to a common role that all developers have (in my case that custom role is called PERSONAL_ACCOUNT).

GRANT execute ON security_manager.activate_role TO ilmker;

By default ILMKER user cannot access table APP1.T1.

ILMKER SQL> SELECT * FROM session_roles;

ROLE                                                                                                                            
-----------------
PERSONAL_ACCOUNT

ILMKER SQL> SELECT * FROM app1.t1;

SQL Error: ORA-00942: table or view does not exist

Let’s test using my developer account ILMKER… First I try to request a role that I have not been granted access to. No luck – I get the exception “ORA-20000: You are not allowed to activate the requested role.”

SQL> SELECT * FROM session_roles;

ROLE                
--------------------
PERSONAL_ACCOUNT

SQL> exec security_manager.activate_role('sec_app_role_test1', 'Jira ref: ABC-490');

ORA-20000: You are not allowed to activate the requested role.

SQL> SELECT * FROM session_roles;

ROLE                
--------------------
PERSONAL_ACCOUNT

SQL> SELECT * FROM app1.t1;

SQL Error: ORA-00942: table or view does not exist

Then the security administrator grants me the role – for this example by inserting a row into the security_manager.allowed_grants table, NOT by executing an Oracle GRANT command.

insert into security_manager.allowed_grants (db_username, db_role)
    values ('ILMKER', upper('sec_app_role_test1'));
commit;

Then I ask my developer to run the same commands again.

SQL> SELECT * FROM session_roles;

ROLE                
--------------------
PERSONAL_ACCOUNT

SQL> exec security_manager.activate_role('sec_app_role_test1', 'Jira ref: ABC-490');

PL/SQL procedure successfully completed.

SQL> SELECT * FROM session_roles;

ROLE                
--------------------
PERSONAL_ACCOUNT
SEC_APP_ROLE_TEST1

SQL> SELECT * FROM app1.t1;
no rows selected

The developer is happy now! The role was activated in the developer session and the developer was able to read the application table. All requests were also logged in the SECURITY_MANAGER schema.

SQL> select * from security_manager.role_grant_log;

GRANT_TIME                     DB_USERNAME  GRANTED_ROLE        CLIENT_HOST       CLIENT_IP     UNIFIED_AUDIT_SESSION_ID  IS_ALLOWED  COMMENTS
2023-05-12 15:05:28,478732000  ILMKER       SEC_APP_ROLE_TEST1  STH-FVFFV04QQ05R  10.88.17.241  3428220339                         0  Jira ref: ABC-490
2023-05-12 15:10:41,923908000  ILMKER       SEC_APP_ROLE_TEST1  STH-FVFFV04QQ05R  10.88.17.241  3428220339                         1  Jira ref: ABC-490

I think this is a powerful feature for controlling access to data based on much more complex criteria than just a DBA executing GRANT commands. Before enabling the role, the code could make a REST query to an external authorisation system, check the client host IP, check the client authentication method, or enable detailed unified auditing policies. The possibilities are endless!

It feels like no one has really been talking about running Oracle Database on a filesystem for years now. ASM seems to be the way to go; Oracle even says in the documentation that local file systems, although supported, are not recommended.

https://docs.oracle.com/en/database/oracle/oracle-database/19/ladbi/supported-storage-options-for-oracle-database.html#GUID-120C3995-F827-4D01-B955-CB1BD0457AF7

ASM works great, and if you have RAC or a few dear pet databases, ASM is for sure a safe way to go – no argument from me here at all. I especially love the possibility to mirror the storage between different failure groups defined by the administrator, so you can mirror the storage between, let’s say, different hardware racks, availability zones or just different storage hardware. For high-end dear pet setups, ASM is no doubt the only way to go.

But I think the story changes for a large cattle of small cloudy databases, where you have a large number of databases and want to spend minimal effort managing any single one. After Oracle removed ASM from the database installation in 11g, ASM requires an Oracle Grid Infrastructure (Oracle Restart) installation – a separate Oracle software setup that is, like any Oracle software, notoriously hard to maintain and patch compared to standard Linux software. And the performance benefits promised by ASM, async IO and direct IO, are also available on a file system. True, the choice of filesystems and setups is harder to manage, but if you do your testing properly, the end result is easier to maintain. Do keep in mind that Oracle does not test running on filesystems any more, and any problem is left to OS support to resolve – which, in the case of Oracle Linux, is also Oracle 🙂

There are many historical blog posts about running Oracle on filesystems on the Internet, and the consensus seems to be that XFS is currently the best filesystem for running databases.

In addition, note Oracle Linux: Supported and Recommended File Systems (Doc ID 236826.1) also says this about XFS:

XFS is designed for high scalability and provides near native I/O performance even when the file system spans multiple storage devices. Beginning with Oracle Linux 7, XFS is the default file system and is included with Basic and Premier Support subscriptions, at no additional charge.

Based on this information I’ve been running my cattle of small custom cloud database instances on XFS, but quite soon I started getting complaints about occasional database slowness and observed high log file sync and commit times. The important bit here: the applications in question commit a lot – they are micro/mini-services that get called a lot, like many thousands of times per second. During earlier platform testing I wasn’t able to see any such behaviour, because I used tools like Swingbench and SLOB – and those are quite well-behaved apps.

Then by accident I found a post on the Red Hat support portal:

https://access.redhat.com/solutions/5906661

The post explains that rather recently the XFS maintainers introduced a change: direct IO write requests that are not aligned to the filesystem block size now get serialized more thoroughly. That change was introduced on purpose, to avoid possible corruption.

The change itself is here – it just introduces some extra waits at the XFS filesystem level.

https://lore.kernel.org/linux-xfs/e2d4678c-eeae-e9df-dde7-84601a52d903@oracle.com/

By default XFS uses a 4096-byte block size – so unless you use a 4096-byte blocksize for Oracle redo logs, you will hit this issue. Using 512-byte block size redo logs on a 4096-byte block size XFS filesystem means Oracle will issue non-aligned writes for redo.
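
To see what block size an existing XFS filesystem uses (the mount point is an example):

# bsize in the data section is the filesystem block size
xfs_info /oralocal | grep bsize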

On Linux, Oracle allows setting the redo log block size to either 512 or 4096 bytes, but there is a protection mechanism: the redo log block size must match the physical sector size of the storage. Quoting from Doc ID 2148290.1:

Using 512N disks as 4K leads to atomicity problem (e.g. ASM/DB write operation thinks that 4K IO size is atomic which is not. It leads to consistency issues when there is a crash in the middle of a 4K write causing partial write).

If you use disks with a 512-byte sector size, Oracle will only allow you to create redo logs with a 512-byte blocksize.

# cat /sys/block/vdf/queue/physical_block_size
512
# cat /sys/block/vdf/queue/logical_block_size
512

SQL> alter database add logfile '/oralocal/1.rdo' size 1g blocksize 4096;

ORA-01378: The logical block size (4096) of file /oralocal/1.rdo is not
compatible with the disk sector size (media sector size is 512 and host sector size is 512)

If you want to use XFS for redo logs, you need to create the XFS filesystem with a 512-byte block size. This brings up another issue – XFS has deprecated this small block size. It is still possible for now, but a warning is issued.

# mkfs.xfs -b size=512 -m crc=0 -f /dev/vg01/redo
V4 filesystems are deprecated and will not be supported by future versions.
meta-data=/dev/vg01/redo         isize=256    agcount=4, agsize=31457280 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=0        finobt=0, sparse=0, rmapbt=0
         =                       reflink=0    bigtime=0 inobtcount=0
data     =                       bsize=512    blocks=125829120, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=512    blocks=204800, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0

This leads me to the following conclusions:

  • XFS can only be used for Oracle redo logs if you use storage with 4096 sector size (advanced format)
  • For 512 byte sector size storage, a different filesystem is needed – for example EXT4 (see the example after this list).
  • XFS is still fine to use for Oracle datafiles. This issue only concerns redo.
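
Creating an EXT4 filesystem for redo on the same logical volume used above could look like this (a sketch; the mount point and options are just a starting point, not a tested recommendation):

# EXT4 does not serialize non-aligned direct IO writes the way current XFS does
mkfs.ext4 /dev/vg01/redo
mount -o noatime /dev/vg01/redo /oralocal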

Testing the impact

How bad is this issue exactly? When I ran comparisons using SLOB, I did not notice much difference between XFS and EXT4. Looking at the storage patterns, I saw that SLOB issues large writes and commits rarely, so the storage sees quite large writes coming from the log writer.

But my apps make small changes and commit crazy often (yes, bad database application design – but I think quite common thinking for an average enterprise application developer). So I needed a new testing suite that can generate a high number of IOPS with very small IO sizes. Introducing the Horrible Little Oracle Benchmark.

https://gist.github.com/ilmarkerm/462d14da050fb73fb4eeed5384604f1b

It just commits a lot, each time after making a small change to the database. The results below do not represent any kind of maximum throughput limits; I just wanted to see the impact of different filesystems under a similar load.
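
In spirit it boils down to something like this (an illustrative sketch, not the actual gist code; the table is hypothetical):

-- Tiny transaction repeated in a loop: one single-row update, then commit,
-- forcing LGWR to flush a very small redo write every time
begin
    for i in 1 .. 1000000 loop
        update hlob_test set val = val + 1 where id = 1;
        commit;
    end loop;
end;
/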

Baseline… XFS bsize=4096

Load Profile                    Per Second   Per Transaction
~~~~~~~~~~~~~~~            ---------------   ---------------
      Redo size (bytes):       2,247,197.7             573.7
  Logical read (blocks):          19,760.3               5.0
          Block changes:          15,744.3               4.0
 Physical read (blocks):               0.3               0.0
Physical write (blocks):              45.6               0.0
       Read IO requests:               0.3               0.0
      Write IO requests:               3.6               0.0
         Executes (SQL):           3,923.7               1.0
           Transactions:           3,917.3
 
Event                                Waits Time (sec)      Wait   time
------------------------------ ----------- ---------- --------- ------
log file sync                    2,367,110     3340.7    1.41ms   92.8
DB CPU                                          281.9              7.8
buffer busy waits                  129,582        4.6   35.49us     .1
log file switch (private stran          10         .1    9.29ms     .0
db file sequential read                122         .1  513.16us     .0
control file sequential read            98          0  327.92us     .0

                                                Total
                                       %Time     Wait              Waits   % bg
Event                            Waits -outs Time (s)  Avg wait     /txn   time
-------------------------- ----------- ----- -------- --------- -------- ------
log file parallel write      1,182,647     0    1,166     .99ms      0.5   64.6
LGWR any worker group        1,179,342     0      592  501.58us      0.5   32.8

EXT4

Load Profile                    Per Second   Per Transaction
~~~~~~~~~~~~~~~            ---------------   ---------------
      Redo size (bytes):       4,151,622.2             571.5
         Executes (SQL):           7,267.6               1.0
           Transactions:           7,264.7
 
Event                                Waits Time (sec)      Wait   time
------------------------------ ----------- ---------- --------- ------
log file sync                    4,386,595     3114.6  710.03us   86.5
DB CPU                                          517.8             14.4
buffer busy waits                  302,823       11.1   36.71us     .3

                                                Total
                                       %Time     Wait              Waits   % bg
Event                            Waits -outs Time (s)  Avg wait     /txn   time
-------------------------- ----------- ----- -------- --------- -------- ------
log file parallel write      1,954,831     0      841  430.21us      0.4   59.8
LGWR any worker group        1,226,037     0      271  221.22us      0.3   19.3
LGWR all worker groups         330,590     0      166  502.00us      0.1   11.8
LGWR worker group ordering     318,164     0       47  146.44us      0.1    3.3

XFS with blocksize 512

Load Profile                    Per Second   Per Transaction
~~~~~~~~~~~~~~~            ---------------   ---------------
      Redo size (bytes):       4,118,949.8             571.4
         Executes (SQL):           7,213.7               1.0
           Transactions:           7,208.6
 
Event                                Waits Time (sec)      Wait   time
------------------------------ ----------- ---------- --------- ------
log file sync                    4,354,303     3127.9  718.34us   86.9
DB CPU                                          503.7             14.0
buffer busy waits                  349,311       12.5   35.76us     .3

                                                Total
                                       %Time     Wait              Waits   % bg
Event                            Waits -outs Time (s)  Avg wait     /txn   time
-------------------------- ----------- ----- -------- --------- -------- ------
log file parallel write      1,950,597     0      847  434.01us      0.4   59.7
LGWR any worker group        1,245,012     0      277  222.65us      0.3   19.5
LGWR all worker groups         330,132     0      169  512.99us      0.1   11.9
LGWR worker group ordering     318,918     0       44  139.26us      0.1    3.1

EXT4 with LVM

In the previous tests I created the file system on top of the raw block device; here I want to compare what happens when LVM sits in the middle.

Load Profile                    Per Second   Per Transaction
~~~~~~~~~~~~~~~            ---------------   ---------------
      Redo size (bytes):       3,412,249.5             571.3
         Executes (SQL):           5,981.8               1.0
           Transactions:           5,972.8
 
Event                                Waits Time (sec)      Wait   time
------------------------------ ----------- ---------- --------- ------
log file sync                    3,613,132     3203.5  886.64us   88.9
DB CPU                                          423.9             11.8
buffer busy waits                  301,471         11   36.60us     .3

                                                Total
                                       %Time     Wait              Waits   % bg
Event                            Waits -outs Time (s)  Avg wait     /txn   time
-------------------------- ----------- ----- -------- --------- -------- ------
log file parallel write      1,610,090     0      843  523.49us      0.4   59.0
LGWR any worker group        1,024,262     0      280  273.01us      0.3   19.6
LGWR all worker groups         270,089     0      171  631.60us      0.1   11.9
LGWR worker group ordering     310,863     0       66  210.95us      0.1    4.6

The important bits

  • Oracle RDBMS 19.16.2
  • Oracle Linux 8.7 with UEKr6 kernel 5.4.17-2136.316.7.el8uek.x86_64
  • Dedicated filesystem for redo logs
  • filesystemio_options=setall
  • No partitioning on block devices

During all tests the datafiles remained on the same filesystem, and the load on them was absolutely minimal. Between each test I reformatted the redo log filesystem and copied the redo log files over to the new filesystem.

I also tested both XFS and EXT4 with filesystemio_options set to asynch or none, but both performed similarly to XFS bsize=4096 with filesystemio_options=setall.

Conclusion

From my tests I conclude that the XFS performance degradation (when not using storage with a 4096-byte sector size) is significant for high commit rate applications. The slowdown shows up as longer latency in the log file sync and log file parallel write wait events.

As a replacement, EXT4 and XFS with blocksize=512 perform similarly. Adding LVM into the mix reduces the commit throughput a little. Since XFS blocksize=512 is deprecated, EXT4 is the go-to major Linux filesystem for Oracle database redo logs.

Before choosing a file system for an Oracle database, do read through Oracle Linux: Supported and Recommended File Systems (Doc ID 236826.1) to understand the implications.