Ilmar Kerm

Oracle, databases, Linux and maybe more

Here you can see some small snippets how to return quite a lot of useful attributed about Oracle database directly from CRS. And the code I use to do an automated RAC rolling restart.

I published my 2014 presentation “Making MySQL highly available using Oracle Grid Infrastructure” in Slideshare.
Please also read my page how to set up the mysql scripts for Oracle GI

I’m currently involved in a project where we are replacing one company’s entire hardware platform. They also have Oracle database 10.2.0.5 (that we cannot upgrade right now) and what is really unusual for me, is that this Oracle database runs under Windows (and we cannot migrate to another platform). We also decided to use Oracle Grid Infrastructure (aka Oracle Clusterware) 11.2.0.4 to implement active-passive standby server for this database. Why? Because Windows Cluster was not an option, we didn’t have RAC licenses, Oracle Clusterware is free (if you are protecting Oracle software or running on Oracle OS) and we have really good previous experience with it under Linux.
For more information on how I’ve used Oracle Grid Infrastructure to provide high availability for MySQL (or any other application), check out my page MySQL HA with Oracle Clusterware.

When I started testing this it was quite surprising that I didn’t find any Oracle Clusterware action script examples for Windows in Oracle documentation or even on google 🙂 Oracle documentation just refers that in Windows the action script has to be a batch script.

This is the example action script I came up with to manage stand alone Oracle 10.2.0.5 database. It uses oradim to start and stop the database instance dbgp and a small sqlplus script to check if the database instance is alive. Tested using Oracle Grid Infrastructure 11.2.0.4 under Windows 2008R2.
I named the action script: d:\scripts\dbgp.cmd

Just for completeness, this script refers to d:\scripts\check.sql that is just used to run a quick database healt check, here is its contents:

whenever sqlerror exit failure
conn / as sysdba
select 1 from dual;
exit

WINDOWS SPECIFIC ONE TIME OPERATION: Before clusterware can execute the action script in Windows, you need to create OracleCRSToken_username service for the OS user who is executing the script. In my setup both Oracle Clusterware, managed database and the action script are executed by the local user WINRAC1\oracle and on the second node as WINRAC2\oracle. It is actually easier if you use domain user, please check the referred note.
Reference: Windows: How to Modify OS User Privileges for 11gR2 Grid Infrastructure and RAC Services (Needed for Backup To Network Shares) (Doc ID 1339053.1) steps 2 and 3.

set ORACLE_HOME=d:\app\11.2.0.4\grid
%ORACLE_HOME%\bin\crsuser add winrac1\oracle

.. it will show errors ..
.. but repeat the command on other node also ..

set ORACLE_HOME=d:\app\11.2.0.4\grid
%ORACLE_HOME%\bin\crsuser add winrac2\oracle

After that need to open services.msc and edit service OracleCRSToken_oracle. First set its startup type to Automatic and then on Log On As tab also set the oracle user password. After that start service OracleCRSToken_oracle and repeat these steps on all cluster nodes.

ADDING THE RESOURCE TO CLUSTERWARE: Adding the resource to cluster is the same as under Linux:

set ORACLE_HOME=d:\app\11.2.0.4\grid
%ORACLE_HOME%\bin\crsctl add resource oradb_dbgp -type cluster_resource -attr "ACTION_SCRIPT=d:\scripts\dbgp.cmd, CHECK_INTERVAL=60, RESTART_ATTEMPTS=2, PLACEMENT=favored, HOSTING_MEMBERS=winrac1"

When instance fails in Oracle RAC, the services that were using this instance as a preferred instance are automatically relocated to instances marked as available for this service. But after the failed instance recovers and starts up again, the relocated services are not moved back and need manual srvctl relocate service command from administrator to move them back.

Here is a little Bash script to automate this process. Oracle Clusterware (Grid Infrastructure) can execute user callout scripts on FAN events, like INSTANCE up/down. Place this script under $GRID_HOME/racg/usrco/ and set the execute bits on the file. Then clusterware will execute that script for all FAN events, but the script will start processing only for instance up event.

Why is it needed? We just switched over to 4-node RAC consisting of many different applications, almost each of them connecting to its own schema. We created each application its own service restricting it to 1 (or max 2) nodes (1 or 2 nodes as preferred, all other nodes listed as available). After the first rolling patching, I noticed that the connection count and load on each node was very unbalanced, vast majority of the connections were connected to node1 and the last patched node had almost none and it did not get better over a few hours. This was because most of the services ended up on node1 and I had to manually look over each service and relocate it back where it belongs. This script attempts to automate this process.

Tested on Oracle Linux 5.8 with Oracle Grid Infrastructure 11.2.0.3 and Oracle Database 11.2.0.2 and 11.2.0.3.

I’ve been doing lately quite many database clustering implementations; Oracle RAC and since we have many MySQL instances in production, had to find a good way to make MySQL instances highly available also.

One good solution for this is managing MySQL instances with clusterware and since we are planning to use Oracle RAC on Oracle Enterprise Linux anyway, then Oracle Clusterware is an excellent candidate for this task. Also… Oracle Clusterware is included with Oracle Enterprise Linux at no additional charge.

Requirements I had:

  • Multiple MySQL instances running in the same cluster, in case of node failure affected MySQL instances are moved to any other surviving node (least loaded)
  • Different MySQL instances may run different MySQL RDBMS software versions
  • Each MySQL instance is listening to its own dedicated and clusterware managed IP address (so MySQL can always respond on default 3306 port and can move independently around nodes without any dependencies to other MySQL instances)
  • Clusterware monitors each MySQL instance and in case of trouble automatically moves IP and starts instance on another cluster node

For this task I wrote some Bash scripts to help managing MySQL instances in Oracle Clusterware environment. All these scripts are available here: Google Code project page, also with documentation of the whole setup and how to use scripts: Documentation

All comments welcome!