Script to automatically move RAC 11gR2 services back to preferred instances

Written by: ilmarkerm

Category: Blog entry

Published: May 10, 2012

When an instance fails in Oracle RAC, the services that were using it as a preferred instance are automatically relocated to instances marked as available for that service. But after the failed instance recovers and starts up again, the relocated services are not moved back; an administrator has to run a manual srvctl relocate service command to move them back.
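For illustration, here is a minimal sketch of what that manual step looks like with the 11gR2 srvctl syntax. The database, service and instance names (db, app_svc, db2, db1) are hypothetical placeholders, and the helper only prints the command so it can be reviewed before it is run.

```bash
#!/bin/bash
# Build the srvctl command that moves a service from the instance it
# failed over to back to its preferred instance.
# All names used below are hypothetical placeholders.
relocate_cmd() {
  local db="$1" service="$2" from_inst="$3" to_inst="$4"
  # 11gR2 syntax: -d database, -s service, -i old instance, -t new instance
  printf 'srvctl relocate service -d %s -s %s -i %s -t %s\n' \
    "$db" "$service" "$from_inst" "$to_inst"
}

# Print the command for review; it is not executed here.
relocate_cmd db app_svc db2 db1
```

With many services spread over several nodes, this has to be repeated for every service that failed over, which is the tedious part.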

Here is a little Bash script to automate this process. Oracle Clusterware (Grid Infrastructure) can execute user callout scripts on FAN events, such as INSTANCE up/down. Place this script under $GRID_HOME/racg/usrco/ and set the execute bits on the file. Clusterware will then execute the script for all FAN events, but the script only acts on instance up events.
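To show the general shape of such a callout, here is a minimal sketch (not the actual script) that filters for instance up events. It assumes the event type arrives as the first argument followed by name=value pairs, as in "INSTANCE status=up instance=db1 database=db"; the function names are made up for this example.

```bash
#!/bin/bash
# Sketch of a FAN callout skeleton. Assumption: the event type is the
# first argument and the rest are name=value pairs in any order.
parse_fan_event() {
  EVENT_TYPE="${1:-}"
  STATUS="" INSTANCE="" DATABASE=""
  local arg
  for arg in "$@"; do
    case "$arg" in
      status=*)   STATUS="${arg#status=}" ;;
      instance=*) INSTANCE="${arg#instance=}" ;;
      database=*) DATABASE="${arg#database=}" ;;
    esac
  done
}

# True only for "INSTANCE ... status=up" events.
is_instance_up() {
  [[ "$EVENT_TYPE" == "INSTANCE" && "$STATUS" == "up" ]]
}

parse_fan_event "$@"
if is_instance_up; then
  echo "Instance $INSTANCE of database $DATABASE is up"
  # Service checks and relocation would go here.
fi
```

Every other event type or status simply falls through, so the callout stays cheap for events it does not care about.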

Why is it needed? We just switched over to a 4-node RAC hosting many different applications, almost each of them connecting to its own schema. We created a separate service for each application and restricted it to 1 (or at most 2) nodes (1 or 2 nodes as preferred, all other nodes listed as available). After the first rolling patching, I noticed that the connection count and load across the nodes were very unbalanced: the vast majority of connections were on node1, the last patched node had almost none, and it did not get better over a few hours. This was because most of the services had ended up on node1, and I had to manually go through each service and relocate it back to where it belongs. This script attempts to automate this process.

Tested on Oracle Linux 5.8 with Oracle Grid Infrastructure 11.2.0.3 and Oracle Database 11.2.0.2 and 11.2.0.3.

19 comments

NandanDas K says:

July 16, 2013 at 19:13

Thanks Ilmar, the script worked like a charm !
jayasudha says:

April 22, 2014 at 16:03

Hi Ilmar,

Could you please let me know how I need to run this script to test it? I ran it as a shell script and it doesn't work. I have placed the script in $CRS_HOME/racg/usrco

ksh -x service_callout.sh
+ LOGFILE=/tmp/grid_callout.txt
+ + dirname service_callout.sh
SCRIPTDIR=.
service_callout.sh[11]: "${SCRIPTDIR:(-11)}": bad substitution
Ilmar Kerm says:

April 22, 2014 at 16:12

This is a Bash script, not a ksh script, and you also need to set the executable bits on the file:
chmod a+x $CRS_HOME/racg/usrco/relocate_services_callout.sh

Then you can just execute $CRS_HOME/racg/usrco/relocate_services_callout.sh with the parameters you need.
Viraj Mokadam says:

May 2, 2014 at 16:39

Can you please give a detailed example of which parameters to provide to your script for service relocation?
jayasudha says:

May 8, 2014 at 08:25

Please give an example of which parameters to provide to your script for service relocation.
Ilmar Kerm says:

May 8, 2014 at 08:40

The parameters are defined by Grid Infrastructure. For example, to simulate an instance up event for instance "db1" in database "db", execute:

$CRS_HOME/racg/usrco/relocate_services_callout.sh INSTANCE status=up instance=db1 database=db
Jagatheesh Ramakrishnan says:

May 8, 2014 at 17:49

Hi,
Thanks a lot.

Do you have a similar script for 10g?
Ilmar Kerm says:

May 8, 2014 at 17:59

I have not tested this script on 10g.
Patrick Santucci says:

October 6, 2017 at 20:41

I’m having a couple of issues with this script and hoping you can help.

I had to change this line:

if [[ `echo "$SERVICECONFIG" | grep "Service is enabled" | wc -l` -eq 1 ]]; then

to this:

if [[ `echo "$SERVICECONFIG" | grep "^Service is enabled$" | wc -l` -eq 1 ]]; then

because it was returning false every time.

$ echo "$SERVICECONFIG" | grep "Service is enabled" | wc -l
2
$ echo "$SERVICECONFIG" | grep "Service is enabled"
Service is enabled
Service is enabled on instances: BDUAT1,BDUAT2
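A whole-line match avoids this kind of double count. The following is a minimal sketch, not the script's actual code: the sample text contains only the two lines shown above, and the function name is made up for the example.

```bash
#!/bin/bash
# Sample text with only the two lines shown above.
SERVICECONFIG='Service is enabled
Service is enabled on instances: BDUAT1,BDUAT2'

service_enabled() {
  # -x: match whole lines only, -F: fixed string, -q: no output, status only
  grep -qxF "Service is enabled" <<< "$1"
}

if service_enabled "$SERVICECONFIG"; then
  echo "enabled"
fi
```

Testing the exit status of grep also removes the need to count lines with wc -l.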

Next, I think there’s a problem with the logic flow. I’m going to use line numbers because it’s easier.
On line 91 you check if the current instance is in the list of preferred instances.
If so, you check to see if the service is started (line 97).
If it’s not started, you start it (line 102). Otherwise, you check to see if it’s not running on a preferred instance (line 109). The problem here is that you’re still inside the if logic from line 91. If line 91 evaluates as FALSE, everything down to line 135 will be skipped, including the check on line 109.

Here’s an example. I have two services for database bduat_dg. Service bduats is running on its preferred instance BDUAT1; service bduat_maint has preferred instance BDUAT1 but is running on BDUAT2. When I run your script, here is the output I get in the log file:

Fri Oct 6 14:04:28 EDT 2017
[bduat_dg][hq-xdb01.intl] Instance BDUAT1 up
ORACLE_HOME=/u01/app/oracle/product/11.2.0.4/dbhome_3
Service bduat_maint
enabled
Service bduats
enabled
preferred
running on preferred BDUAT1

As you can see, it does not take any action on the bduat_maint service, or even list where it is running, just that it is enabled. Not sure what the best way is to fix this.
Patrick Santucci says:

October 6, 2017 at 21:14

Sorry, my line numbers were off by 1. Corrected below:

On line 90 you check if the current instance is in the list of preferred instances.
If so, you check to see if the service is started (line 96).
If it’s not started, you start it (line 101). Otherwise, you check to see if it’s not running on a preferred instance (line 108). The problem here is that you’re still inside the if logic from line 90. If line 90 evaluates as FALSE, everything down to line 134 will be skipped, including the check on line 108.
ilmarkerm says:

November 14, 2017 at 11:09

Thanks for the comment. I just noticed that the version on this page was out of date and I have already fixed the first issue you mentioned in the script I’m using in production 🙂
But I don’t really follow the second issue. The check on line 89 (new version) verifies that the instance that was just brought up is listed as a preferred instance for that service; if it is not, then nothing should be done with that service. The callout script should not work cluster-wide; it should work on each node separately, since each clusterware instance calls the script itself.
Egeil Sanderson says:

March 30, 2018 at 05:12

Dear Ilmar,
Where can I find the script? Is it not available anymore?

Regards,
Egeil
ilmarkerm says:

March 30, 2018 at 07:47

It is right here in the post, or you can access it through GitHub: https://gist.github.com/ilmarkerm/4845e4288d1cda98b7f88425d199f979/raw/4c80adf9874358e9e1ec41911fc4215026ac6424/relocate-services-callout.sh
Egeil Sanderson says:

April 2, 2018 at 08:40

Hi,
Thanks. I was wondering where it had gone until I realized that my enterprise blocks clear-cut scripts.

Regards,
Egeil
sam says:

January 8, 2019 at 23:47

Hi ,

Does this work if the cluster has multiple databases running?
ilmarkerm says:

January 9, 2019 at 07:25

Yes, it does.
Nadeem says:

July 15, 2019 at 15:08

Hello Ilmarkerm,

Thanks for creating such a useful script. Appreciate that.

It worked on 11g, but when I tried it on 12c (CDB/PDB), it wouldn’t pick up the CRS_HOME (/u01/app/12.2.0.1/grid/racg/usrco).

This is how I called it.

sh -x relocate_srv.sh INSTANCE status=up instance=CDEV1 database=CDEV

+ LOGFILE=/u01/app/oracle/relocate_db_services_script.log
++ dirname relocate_srv.sh
+ SCRIPTDIR=.
ilmarkerm says:

July 15, 2019 at 15:14

If you look at the code, it all depends on the script being placed under $CRS_HOME/racg/usrco:

SCRIPTDIR=`dirname $0`

# Determine grid home
if [[ "${SCRIPTDIR:(-11)}" == "/racg/usrco" ]]; then
  CRS_HOME="${SCRIPTDIR:0:$(( ${#SCRIPTDIR} - 11 ))}"
  export CRS_HOME
fi
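Note that dirname returns "." when the script is started by its bare file name from inside the directory, and "." does not end in /racg/usrco. One possible way to make the check independent of how the script is invoked is to resolve the directory to an absolute path first. This is a sketch under that assumption, not the script's actual code, and the helper name grid_home_from_dir is made up for the example.

```bash
#!/bin/bash
# Resolve the script directory to an absolute path, so the check also
# works when the script is started as "./script.sh" or "sh script.sh".
SCRIPTDIR="$(cd "$(dirname -- "$0")" && pwd)"

# Print the grid home if the directory ends in /racg/usrco, else fail.
grid_home_from_dir() {
  local dir="$1"
  if [[ "$dir" == */racg/usrco ]]; then
    echo "${dir%/racg/usrco}"
  else
    return 1
  fi
}

if CRS_HOME="$(grid_home_from_dir "$SCRIPTDIR")"; then
  export CRS_HOME
  echo "CRS_HOME=$CRS_HOME"
else
  echo "Script is not under <grid home>/racg/usrco: $SCRIPTDIR" >&2
fi
```

When Clusterware runs the callout it uses the full path, so the original check works there; the difference only shows up in manual tests.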
Nadeem Hameed says:

July 15, 2019 at 19:57

Thanks ilmarkerm,

I placed the script in the right location.

[oracle@node1 usrco]$ pwd
/u01/app/12.2.0.1/grid/racg/usrco
[oracle@node1 usrco]$ ls
relocate_srv.sh
[oracle@mtldemtsedb11 usrco]$

OS version is also different from our 11g server.

11g OS version is 6.9
12c OS version is 7.4

Could there be some difference in how the dirname command behaves between the two?

Do you have the same OS in your lab to test?

Thank you!
Nadeem

