Ilmar Kerm

Oracle, databases, Linux and maybe more

SystemD is standard today in the Linux world, and it is very powerful and good, I really like it.

Wouldn’t it be nice to manage Oracle database components also with SystemD? Not just simple starting and stopping, but also making sure it keeps running and restarts it automatically if needed.

I also want to manage listener separately from the database instances, since one listener can service multiple database instances and also keeping automation and automated out-of-place patching in mind – listener does not need to run from the same home as database instance.

Realistic example: you need to wait for months until Oracle support creates you an essential merge patch for the database, but before that you can already run listener with the latest Release Update Revision, that fixes the latest security bugs.

Here is the code:

oracle-listener.service is the SystemD unit file. I have intended it to be pushed out centrally from Ansible, so the content there is generated automatically, for example the Oracle Home path, the the PID file location.

get_listener_pid.sh the purpose of this script is to return the listener process PID (tnslsnr) to SystemD, so it could keep an eye on it, and restart if it happens to die. Rather long script for such a simple task, but we are dealing with Oracle here, and instead of just returning a simple process PID, “lsnrctl show pid” has a lot on unnecessary clutter in the output.

To continue my previous post abiout ADR log mining, another monitoring agent that I created was just a very simple (initially) Linux monitoring agent. System metrics.

I know there are plenty of existing software products already doing exactly that, but I don’t really like the one that was chosen by my employer – other people maintaining it for different goals. Also I wanted to have much richer metadata (and Oracle specific – like cluster name) added to the monitoring data.

Here is the code:

https://github.com/ilmarkerm/monitoring/blob/main/linux/system-monitor.py

Cheap to run and just uses regular expressions to parse information returned by standard Linux monitoring command. Data is again sent to InfluxDB backend intended to be used in Grafana dashboards.

I push it out using Ansible, so I left in some Ansible tags in the configuration section… and so pleople would not just blindly take the code and try to run it without understanding it 🙂

For a while now Oracle is using standardised log structure under Automatic Diagnostic Repository (ADR) – this includes database alert.log, listener log, CRS log, ASM log and much more. All this is very valuable information for monitoring.

I’m a fan of using Grafana (using InfluxDB as a data source) for monitoring, since in Grafana it is very easy to create your own customised dashboards and alerts – so lately I’ve ben creating my own monitoring agents to gather send Oracle monitoring information to InfluxDB.

InfluxDB is fast time series database, perfect for storing monitoring data – each record must have timestamp associated and data retention is also natively built in.

Since our Oracle environment is exploding in size, my goal is also to have monitoring tools/agents that figure out what to do themselves, deployed using Ansible. All without any human involvment.

Here is monitoring agent with following high level features:

  • Searches for all recently modified alert/log.xml files under ADR base
  • Parses newly added log messages for any tags that Oracle itself has added (like component and message level)
  • Parses log messages against custom regular expressions to parse out custom information

Code is here:

https://github.com/ilmarkerm/monitoring/blob/main/oracle/adr-log.py

It is expected to run as SystemD service. I currently left in some Ansible tags, since this is how I distribute the agent to all servers (and to make sure people do no just take the code and run it without reading and understanding it).

Here you can see some small snippets how to return quite a lot of useful attributed about Oracle database directly from CRS. And the code I use to do an automated RAC rolling restart.

If you are doing many HTTPS calls from PL/SQL to many different sites, managing oracle wallet with all the required trusted certificates is quite painful. But Linux distros already come bundled with all the trusted CA certs from Mozilla, would it be nice just to convert it to Oracle wallet format so PL/SQL could use it?

Here is an Ansible role just for this, you could run it regularly to make sure you have the latest Mozilla trusted certs.

https://github.com/ilmarkerm/oracle-wallet-mozilla-ca