Ilmar Kerm

Oracle, databases, Linux and maybe more

In this part I’ll look at some features of Oracle Database for getting data out in XML format, using SQL code only. In part 1 I looked at some features for reading/parsing XML with pure SQL.

Populating the tables for examples

Loading the departments into table DEPARTMENTS.

CREATE TABLE departments AS
   SELECT dep.*
     FROM xml, 
     XMLTABLE ('/company/employees/item[1]/departments/item'
        PASSING x
        COLUMNS id FOR ORDINALITY,
                name VARCHAR2 (100 CHAR) PATH 'name') dep;

Employee data to table EMPLOYEES.

CREATE TABLE employees AS
   SELECT emp.id,
          emp.first_name,
          emp.last_name,
          emp.country,
          dep.id active_department_id
     FROM xml,
          departments dep,
          XMLTABLE (
             '/company/employees/item'
                PASSING xml.x
                COLUMNS id NUMBER PATH '@id',
                        first_name VARCHAR2 (10 CHAR) PATH 'first_name',
                        last_name VARCHAR2 (10 CHAR) PATH 'last_name',
                        country VARCHAR2 (10 CHAR) PATH 'country',
                        active_department VARCHAR2 (20 CHAR) PATH 'departments/item[@active="true"]/name') emp
    WHERE dep.name = emp.active_department;

Automatic XML generation + XML Stylesheet

Oracle provides the functions sys_xmlgen and sys_xmlagg to automatically generate an XML document based on table data. sys_xmlgen adds XML tags around a single value (it returns an XMLType built from a single value) and sys_xmlagg aggregates/encloses rows into one XML document. Both functions also accept an XMLFormat formatting parameter that can be used to override the default enclosing XML tag name.
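For example, a quick single-value sketch (run against DUAL) shows the tag-name override:

SELECT SYS_XMLGEN ('Estonia', XMLFormat (enclTag => 'COUNTRY')).getClobVal() single_value
  FROM DUAL;
-- returns an XML document like: <COUNTRY>Estonia</COUNTRY>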

SELECT SYS_XMLAGG (x, XMLFormat (enclTag => 'COMPANY')) company_xml
  FROM (SELECT SYS_XMLAGG (emp_xml, XMLFormat (enclTag => 'EMPLOYEES')) x
          FROM (  SELECT tmp_id,
                         SYS_XMLAGG (xml_col, XMLFormat (enclTag => 'ITEM')) emp_xml
                    FROM (SELECT *
                            FROM (SELECT e.id tmp_id,
                                         SYS_XMLGEN (e.id) id,
                                         SYS_XMLGEN (e.first_name) first_name,
                                         SYS_XMLGEN (e.last_name) last_name,
                                         SYS_XMLGEN (e.country) country,
                                         SYS_XMLGEN (d.name, XMLFormat (enclTag => 'ACTIVE_DEPARTMENT')) active_department
                                    FROM employees e, departments d
                                   WHERE e.active_department_id = d.id) UNPIVOT ((xml_col)
                                                                        FOR tmp_col
                                                                        IN (id, first_name, last_name, 
                                                                          country, active_department)))
                GROUP BY tmp_id)
        UNION ALL
        SELECT SYS_XMLGEN ('Some Company name', XMLFormat (enclTag => 'NAME')) x
          FROM DUAL)

The query is quite long, but the idea behind it is pretty simple: generate an XMLType value for every data attribute and use SYS_XMLAGG to group and aggregate the generated XMLType rows into the desired XML format.

And the result is one XML document.

<?xml version="1.0"?>
<COMPANY>
<EMPLOYEES>
  <ITEM>
    <ID>1</ID>
    <ACTIVE_DEPARTMENT>DBA</ACTIVE_DEPARTMENT>
    <COUNTRY>Estonia</COUNTRY>
    <LAST_NAME>Kerm</LAST_NAME>
    <FIRST_NAME>Ilmar</FIRST_NAME>
  </ITEM>
  <ITEM>
    <ID>2</ID>
    <ACTIVE_DEPARTMENT>Development</ACTIVE_DEPARTMENT>
    <COUNTRY>Estonia2</COUNTRY>
    <LAST_NAME>Kerm2</LAST_NAME>
    <FIRST_NAME>Ilmar2</FIRST_NAME>
  </ITEM>
</EMPLOYEES>
<NAME>Some Company name</NAME>
</COMPANY>

If the XML is still not in the correct format, an XSL transformation can be applied to convert the output to another XML format.

An XSL transformation is an XML document that describes the rules for converting one XML document into another. For this example I wrote the following XSLT document:

<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:template match="COMPANY">
  <company>
    <name><xsl:value-of select="NAME"/></name>
  </company>
  <employees>
    <xsl:for-each select="EMPLOYEES/ITEM">
      <item>
        <xsl:attribute name="id">
          <xsl:value-of select="ID" />
        </xsl:attribute>
        <first_name><xsl:value-of select="FIRST_NAME" /></first_name>
        <last_name><xsl:value-of select="LAST_NAME" /></last_name>
        <country><xsl:value-of select="COUNTRY" /></country>
        <departments>
          <item active="true">
            <name><xsl:value-of select="ACTIVE_DEPARTMENT" /></name>
          </item>
        </departments>
      </item>
    </xsl:for-each>
  </employees>
</xsl:template>

</xsl:stylesheet>

Save it into an XMLType table named comp_xml_xslt.

CREATE TABLE comp_xml_xslt of xmltype;

INSERT INTO comp_xml_xslt VALUES ('<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
...
</xsl:stylesheet>
');

COMMIT;

To apply the XSL transformation to the source XML, Oracle has the function XMLTransform(source_xml, transformation_xml). Both arguments are XML documents of XMLType, and the function returns the transformed XML document.

WITH company AS 
  (SELECT SYS_XMLAGG (x, XMLFormat (enclTag => 'COMPANY')) xml
     FROM (SELECT SYS_XMLAGG (
                     emp_xml,
                     XMLFormat (enclTag => 'EMPLOYEES'))
                     x
             FROM (  SELECT tmp_id,
                            SYS_XMLAGG (
                               xml_col,
                               XMLFormat (enclTag => 'ITEM'))
                               emp_xml
                       FROM (SELECT *
                               FROM (SELECT e.id tmp_id,
                                            SYS_XMLGEN (e.id) id,
                                            SYS_XMLGEN (e.first_name) first_name,
                                            SYS_XMLGEN (e.last_name) last_name,
                                            SYS_XMLGEN (e.country) country,
                                            SYS_XMLGEN (d.name,
                                               XMLFormat (enclTag => 'ACTIVE_DEPARTMENT')) active_department
                                       FROM employees e,
                                            departments d
                                      WHERE e.active_department_id =
                                               d.id) UNPIVOT ((xml_col)
                                                     FOR tmp_col
                                                     IN (id, first_name, last_name, country, active_department)))
                   GROUP BY tmp_id)
           UNION ALL
           SELECT SYS_XMLGEN ('Some Company name',
                              XMLFormat (enclTag => 'NAME'))
                     x
             FROM DUAL))
SELECT XMLTRANSFORM (company.xml, comp_xml_xslt.object_value)
  FROM company, comp_xml_xslt

In this query, the COMPANY inline view is the same XML-generating query as in the previous example, and the XMLTRANSFORM function in the main query is used to apply the transformation. The result is below:

<company>
 <name>Some Company name</name>
</company>
<employees>
 <item id="1">
  <first_name>Ilmar</first_name>
  <last_name>Kerm</last_name>
  <country>Estonia</country>
  <departments>
   <item active="true">
    <name>DBA</name>
   </item>
  </departments>
 </item>
 <item id="2">
  <first_name>Ilmar2</first_name>
  <last_name>Kerm2</last_name>
  <country>Estonia2</country>
  <departments>
   <item active="true">
    <name>Development</name>
   </item>
  </departments>
 </item>
</employees>

XQuery

Oracle also implements the standard XQuery language, which can be used for querying XML data, and the ora:view XQuery function gives direct access to relational database tables.
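Before the full example, a minimal sketch (assuming the DEPARTMENTS table created above): ora:view exposes a relational table as a sequence of ROW elements, which the XQuery expression can then reshape.

SELECT XMLQuery(
         'for $dep in ora:view("DEPARTMENTS")/ROW
          return $dep/NAME'
          RETURNING CONTENT) x FROM DUAL
-- returns something like: <NAME>Development</NAME><NAME>DBA</NAME>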

SELECT XMLQuery(
         'for $emp in ora:view("EMPLOYEES")/ROW
          return <item id="{$emp/ID}">
                 {$emp/FIRST_NAME}
                 {$emp/LAST_NAME}
                 {$emp/COUNTRY}
                 <departments>
                   {for $dep in ora:view("DEPARTMENTS")/ROW 
                    return <item active="{if ($dep/ID eq $emp/ACTIVE_DEPARTMENT_ID) then "true" else ("false")}">{$dep/NAME}</item>}
                </departments>
                </item>'
          RETURNING CONTENT) x FROM DUAL

And the XML result is:

<item id="1"><FIRST_NAME>Ilmar</FIRST_NAME><LAST_NAME>Kerm</LAST_NAME><COUNTRY>Estonia</COUNTRY><departments><item active="false"><NAME>Development</NAME></item><item active="true"><NAME>DBA</NAME></item></departments></item><item id="2"><FIRST_NAME>Ilmar2</FIRST_NAME><LAST_NAME>Kerm2</LAST_NAME><COUNTRY>Estonia2</COUNTRY><departments><item active="true"><NAME>Development</NAME></item><item active="false"><NAME>DBA</NAME></item></departments></item>

I have tried to read the Oracle XML DB Developer's Guide and my first reaction was that parsing XML in Oracle is very complicated… And lately I got a task to rewrite some PL/SQL code to use a different SOAP service, and the existing code had a few hundred lines just to parse XML (with the XML DOM API)! Actually, starting from 10.2, this job is much easier…

Here I will go over some methods for working with XML that are usable directly from SQL. This first part covers reading and extracting data from XML.

XML for the examples

<?xml version="1.0" encoding="utf-8"?>
<company>
  <name>Some Company name</name>
  <employees>
    <item id="1">
      <first_name>Ilmar</first_name>
      <last_name>Kerm</last_name>
      <country>Estonia</country>
      <departments>
        <item active="false">
          <name>Development</name>
        </item>
        <item active="true">
          <name>DBA</name>
        </item>
      </departments>
    </item>
    <item id="2">
      <first_name>Ilmar2</first_name>
      <last_name>Kerm2</last_name>
      <country>Estonia2</country>
      <departments>
        <item active="true">
          <name>Development</name>
        </item>
        <item active="false">
          <name>DBA</name>
        </item>
      </departments>
    </item>
  </employees>
</company>

Load the XML data into a table, using the special XMLTYPE data type.

create table xml (
  x xmltype
);

insert into xml values (
xmltype.createxml('<?xml version="1.0" encoding="utf-8"?>
<company>
  <name>Some Company name</name>
...
</company>
'));

COMMIT;

Extracting a single value

extractValue can be used to extract a single value from XML using an XPath expression.

SQL> SELECT extractValue(x, '/company/name') FROM xml;

EXTRACTVALUE(X,'/COMPANY/NAME')
-------------------------------
Some Company name

extractValue only works with single values; otherwise an exception is raised.

SQL> SELECT extractValue(x, '/company/employees/item') FROM xml;
SELECT extractValue(x, '/company/employees/item') FROM xml
*
ERROR at line 1:
ORA-19025: EXTRACTVALUE returns value of only one node
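When the whole node set is needed, the related extract function can be used instead; it returns the matching nodes as an XMLType fragment rather than a scalar value. A quick sketch against the same xml table:

SELECT extract(x, '/company/employees/item/first_name') FROM xml;
-- returns a fragment like: <first_name>Ilmar</first_name><first_name>Ilmar2</first_name>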

Query XML as relational object

Since 10g, Oracle Database has had a very easy way to map XML to a relational rowset and query it with SQL, so no PL/SQL code is needed for parsing XML: the XMLTABLE function.

SQL> SELECT emp.*
  FROM xml,
       XMLTABLE (
         '/company/employees/item'
         PASSING x
         COLUMNS id NUMBER PATH '@id',
                 first_name VARCHAR2 (10 CHAR) PATH 'first_name',
                 last_name VARCHAR2 (10 CHAR) PATH 'last_name',
                 country VARCHAR2 (10 CHAR) PATH 'country',
                 active_department VARCHAR2 (20 CHAR) PATH 'departments/item[@active="true"]/name'
       ) emp;

        ID FIRST_NAME LAST_NAME  COUNTRY    ACTIVE_DEPARTMENT
---------- ---------- ---------- ---------- --------------------
         1 Ilmar      Kerm       Estonia    DBA
         2 Ilmar2     Kerm2      Estonia2   Development

XMLTABLE takes the following arguments:

'/company/employees/item'
    The XPath expression for a "row".

PASSING x
    The source of the XML data (an XMLTYPE value).

COLUMNS
    id NUMBER PATH '@id',
    first_name VARCHAR2 (10 CHAR) PATH 'first_name',
    last_name VARCHAR2 (10 CHAR) PATH 'last_name',
    country VARCHAR2 (10 CHAR) PATH 'country',
    active_department VARCHAR2 (20 CHAR) PATH 'departments/item[@active="true"]/name'
    The column definitions, each with the corresponding Oracle data type and an XPath expression.

After the XML is readable as a relational table, all the power of Oracle SQL can be used for querying it.
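For example, ordinary aggregation works directly on top of the XMLTABLE rowset (a small sketch using the same XML):

SELECT emp.country, COUNT (*) employee_count
  FROM xml,
       XMLTABLE ('/company/employees/item'
          PASSING x
          COLUMNS country VARCHAR2 (10 CHAR) PATH 'country') emp
 GROUP BY emp.country;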

I love database performance visualization tools, and Oracle has now released small desktop widgets that give a quick overview of the targets in OEM Grid Control.

The one I really like is the High-Load Databases widget. It shows a quick overview of how the top databases are performing at the moment: the total number of average active sessions and a graph of how this load is divided between the CPU/IO/Other wait classes.

With a click of a button, the screen changes to the new rectangular style performance graph, also showing the latest ADDM findings.

One feature request to Oracle: please add a more lightweight skin and allow the user to configure how many databases the widget shows. I’d like to set that widget to “stay on top”, but the view with the TOP 5 databases takes too much screen space.

Download the widgets here: http://www.oracle.com/technology/products/oem/widgets/index.html

Blog for these widgets: http://blogs.oracle.com/emwidgets/

Actually, this is the first application I’d like to have running on my mobile device…

I needed to develop one small application that communicates with external mobile devices, acting as a central server where mobile clients send their data and receive configuration parameters. Reporting will be added later using Oracle APEX.
It’s a small application for a customer demo, so it sounded like a perfect opportunity to have my first look at the Oracle XE database 🙂

Oracle XE comes with a pre-configured XDB HTTP listener for APEX.
I don’t need the full APEX engine for my simple request server; just a PL/SQL package using the PL/SQL Web Toolkit will do fine, and there is no need to use PHP or some other scripting engine outside the database.

Here are the steps I used to configure the Oracle XE XDB HTTP listener with an additional path (DAD) for my custom procedures.

Make the HTTP listener listen on all network interfaces, not only localhost:

EXEC DBMS_XDB.SETLISTENERLOCALACCESS(FALSE);
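If the listener port also needs checking or changing (not required for the DAD setup below), DBMS_XDB covers that as well; 8080 here is just the XE default:

SELECT DBMS_XDB.GETHTTPPORT FROM DUAL;
EXEC DBMS_XDB.SETHTTPPORT(8080);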

Oracle XE has a pre-defined DAD for APEX: /apex/.
The following code will create a new DAD /mobileapp/, associate it with the schema MOBILEAPP and authorize it to run without asking the user for a password.

BEGIN
  DBMS_EPG.create_dad (
    dad_name => 'mobileapp',
    path     => '/mobileapp/*');

  DBMS_EPG.set_dad_attribute (
    dad_name   => 'mobileapp',
    attr_name  => 'default-page',
    attr_value => 'home');

  DBMS_EPG.set_dad_attribute (
    dad_name   => 'mobileapp',
    attr_name  => 'database-username',
    attr_value => 'MOBILEAPP');

  DBMS_EPG.set_dad_attribute (
    dad_name   => 'mobileapp',
    attr_name  => 'nls-language',
    attr_value => 'american_america.al32utf8');

  DBMS_EPG.authorize_dad (
    dad_name => 'mobileapp',
    user     => 'MOBILEAPP');
END;
/

I used the following DAD attributes:

default-page – the default procedure name, used when no procedure name is specified in the HTTP URL
database-username – the database schema name associated with the DAD
nls-language – the NLS client value; for a web application the important part is the character set (al32utf8 means the result page is in UTF-8)
DBMS_EPG.authorize_dad – this call disables HTTP basic authentication for the DAD

There are other interesting DAD attributes, and the current XDB HTTP configuration and APEX DAD configuration can be viewed using this SQL:

SQL> SELECT   DBMS_XDB.CFG_GET().getClobVal() FROM DUAL;
...
              <servlet>
                <servlet-name>APEX</servlet-name>
                <servlet-language>PL/SQL</servlet-language>
                <display-name>APEX</display-name>
                <plsql>
                  <database-username>ANONYMOUS</database-username>
                  <default-page>apex</default-page>
                  <document-table-name>wwv_flow_file_objects$</document-table-name>
                  <document-path>docs</document-path>
                  <document-procedure>wwv_flow_file_mgr.process_download</document-procedure>
                  <nls-language>american_america.al32utf8</nls-language>
                  <request-validation-function>wwv_flow_epg_include_modules.authorize</request-validation-function>
                </plsql>
              </servlet>
...

Now all MOBILEAPP PL/SQL procedures and packages can be accessed over HTTP and the HTTP URL has the following form:
http://oraclexeserver:8080/mobileapp/procedurename
or
http://oraclexeserver:8080/mobileapp/packagename.procedurename

So, a simple hello world procedure using PL/SQL Web Toolkit could look like this:

CREATE OR REPLACE PROCEDURE mobileapp.home IS
BEGIN
  HTP.htmlopen;
  HTP.headopen;
  HTP.title('Page title');
  HTP.headclose;
  HTP.bodyopen;
  HTP.print('Hello world');
  HTP.bodyclose;
  HTP.htmlclose;
END;
/

The HTTP URL for this procedure will be http://oraclexeserver:8080/mobileapp/home or just http://oraclexeserver:8080/mobileapp/ (note the default-page DAD attribute).
Procedures like HTP.htmlopen just generate the corresponding HTML tags.

Passing HTTP GET parameters is very easy: they are mapped to procedure parameters with the same name. The parameter data type is VARCHAR2, or, if the HTTP request contains more than one parameter with the same name, a TABLE OF VARCHAR2 type is used.

CREATE OR REPLACE PROCEDURE mobileapp.paramtest(p1 IN VARCHAR2 DEFAULT NULL,
  p2 IN VARCHAR2 DEFAULT NULL) IS
BEGIN
  IF p1 IS NULL THEN
    HTP.p('Hello world');
  ELSE
    HTP.p('Hello p1='||p1||' p2='||p2);
  END IF;
END;
/

This procedure can be called with HTTP URL:
http://oraclexeserver:8080/mobileapp/paramtest?p1=value1&p2=value2
Both parameters are optional (procedure parameters have DEFAULT value set).
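For the multi-value case, a minimal sketch assuming the commonly used OWA_UTIL.IDENT_ARR collection type (a TABLE OF VARCHAR2) could look like this:

CREATE OR REPLACE PROCEDURE mobileapp.multiparamtest(p1 IN OWA_UTIL.IDENT_ARR) IS
BEGIN
  -- Print every value supplied for the repeated p1 parameter
  FOR i IN 1 .. p1.COUNT LOOP
    HTP.p('p1(' || i || ')=' || p1(i));
  END LOOP;
END;
/

Called with a URL like http://oraclexeserver:8080/mobileapp/multiparamtest?p1=value1&p1=value2.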

If the procedure raises an exception, or a suitable procedure with a matching argument list cannot be found, the result is HTTP error code 404 (page not found).
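If a friendlier error page is preferred, one option is a small wrapper that catches the exception itself (a hypothetical sketch, not part of the demo application):

CREATE OR REPLACE PROCEDURE mobileapp.safe_home IS
BEGIN
  mobileapp.home;
EXCEPTION
  WHEN OTHERS THEN
    -- Render a simple error page instead of letting the gateway answer with 404
    HTP.p('Something went wrong: ' || SQLERRM);
END;
/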

I got a task some time ago to copy a-lot-of-millions of rows from Oracle to MySQL, as fast as possible.
In the last year I have used Oracle Heterogeneous Services (HSODBC) a lot to select data from different non-Oracle databases, but I had yet to try unloading data from Oracle with the best possible performance.

My first thought was, of course, that the process needs to run in parallel. Here I’ll first take a look at the overall HSODBC performance and then look at how to make this process run in parallel.

In this test I use Oracle Database 11.1.0.7 as the source, Oracle Heterogeneous Services 10.2.0.4 with the MyODBC 5.1 driver, and MySQL 5.1 as the destination database. In my setup, all three components run on separate servers.

SQL> CREATE TABLE hsperftest
AS
   SELECT   * FROM all_objects
   UNION ALL
   SELECT   * FROM all_objects
   UNION ALL
   SELECT   * FROM all_objects;

SQL> DESC hsperftest
 Name                                        Null?    Type
 ------------------------------------------- -------- ---------------------
 OWNER                                                VARCHAR2(30)
 OBJECT_NAME                                          VARCHAR2(30)
 SUBOBJECT_NAME                                       VARCHAR2(30)
 OBJECT_ID                                            NUMBER
 DATA_OBJECT_ID                                       NUMBER
 OBJECT_TYPE                                          VARCHAR2(19)
 CREATED                                              DATE
 LAST_DDL_TIME                                        DATE
 TIMESTAMP                                            VARCHAR2(19)
 STATUS                                               VARCHAR2(7)
 TEMPORARY                                            VARCHAR2(1)
 GENERATED                                            VARCHAR2(1)
 SECONDARY                                            VARCHAR2(1)
 NAMESPACE                                            NUMBER
 EDITION_NAME                                         VARCHAR2(30)
 ID                                          NOT NULL NUMBER(10)

And I create the same table in MySQL.

CREATE TABLE `hsperftest` (
  `OWNER` varchar(30) DEFAULT NULL,
  `OBJECT_NAME` varchar(30) DEFAULT NULL,
  `SUBOBJECT_NAME` varchar(30) DEFAULT NULL,
  `OBJECT_ID` int(11) DEFAULT NULL,
  `DATA_OBJECT_ID` int(11) DEFAULT NULL,
  `OBJECT_TYPE` varchar(19) DEFAULT NULL,
  `CREATED` datetime DEFAULT NULL,
  `LAST_DDL_TIME` datetime DEFAULT NULL,
  `TIMESTAMP` varchar(19) DEFAULT NULL,
  `STATUS` varchar(7) DEFAULT NULL,
  `TEMPORARY` varchar(1) DEFAULT NULL,
  `GENERATED` varchar(1) DEFAULT NULL,
  `SECONDARY` varchar(1) DEFAULT NULL,
  `NAMESPACE` int(11) DEFAULT NULL,
  `EDITION_NAME` varchar(30) DEFAULT NULL,
  `ID` int(11) NOT NULL
) ENGINE=InnoDB DEFAULT CHARSET=latin1;

Set up HSODBC to connect to the MySQL database and create a database link on the Oracle side. When setting up HSODBC, do not enable logging/tracing (HS_FDS_TRACE_LEVEL = off)! Tracing will serialize all database calls to the remote database.
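For reference, a sketch of the Oracle-side link (the HS init parameters are shown as comments; the link name hsdb matches the one used below, while the MySQL credentials and the HSDB TNS alias are placeholders):

-- $ORACLE_HOME/hs/admin/initHSDB.ora on the HS server:
--   HS_FDS_CONNECT_INFO = <MySQL ODBC DSN>
--   HS_FDS_TRACE_LEVEL  = off
-- After adding the corresponding entries to listener.ora and tnsnames.ora:
CREATE DATABASE LINK hsdb
  CONNECT TO "mysqluser" IDENTIFIED BY "mysqlpass"
  USING 'HSDB';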

I tested this with HSODBC running on Windows Vista and Linux (using unixODBC). The results were similar.

Test that MySQL table is accessible for Oracle:

SQL> DESC "hsperftest"@hsdb
 Name                                                  Null?    Type
 ----------------------------------------------------- -------- -----------------
 OWNER                                                          VARCHAR2(30)
 OBJECT_NAME                                                    VARCHAR2(30)
 SUBOBJECT_NAME                                                 VARCHAR2(30)
 OBJECT_ID                                                      NUMBER(10)
 DATA_OBJECT_ID                                                 NUMBER(10)
 OBJECT_TYPE                                                    VARCHAR2(19)
 CREATED                                                        DATE
 LAST_DDL_TIME                                                  DATE
 TIMESTAMP                                                      VARCHAR2(19)
 STATUS                                                         VARCHAR2(7)
 TEMPORARY                                                      VARCHAR2(1)
 GENERATED                                                      VARCHAR2(1)
 SECONDARY                                                      VARCHAR2(1)
 NAMESPACE                                                      NUMBER(10)
 EDITION_NAME                                                   VARCHAR2(30)
 ID                                                    NOT NULL NUMBER(10)

Before continuing, check that all MySQL data types are mapped correctly to Oracle, and make sure that none of the columns use the deprecated LONG data type.

First of all… the following SQL will not work:

SQL> insert into "hsperftest"@hsdb select * from hsperftest;
insert into "hsperftest"@hsdb select * from hsperftest
                                     *
ERROR at line 1:
ORA-02025: all tables in the SQL statement must be at the remote database

Some PL/SQL is needed to first select from an Oracle cursor and then bind the values into the remote database INSERT statement. The easiest way is:

BEGIN
   FOR rec IN (SELECT   * FROM HSPERFTEST)
   LOOP
      INSERT INTO "hsperftest"@hsdb
        VALUES   rec;
   END LOOP;
   COMMIT;
END;
/

PL/SQL procedure successfully completed.

Elapsed: 00:14:28.38

When running this anonymous block I noticed that the load on the Oracle database and on the MySQL database was very light, but the HSODBC CPU usage was at the maximum, so HSODBC seems to be the bottleneck. What is it doing? With this kind of CPU usage it is not going to scale much in parallel. If it is performing some mapping between Oracle SQL and MySQL SQL, can I remove this bottleneck by using DBMS_HS_PASSTHROUGH, which sends the query unmodified to the remote database?

DECLARE
   p_c   BINARY_INTEGER;
   nr    INTEGER;
BEGIN
   p_c := DBMS_HS_PASSTHROUGH.OPEN_CURSOR@hsdb;
   DBMS_HS_PASSTHROUGH.PARSE@hsdb (
      p_c,
      'insert into hsperftest (OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_ID,
      DATA_OBJECT_ID,OBJECT_TYPE,CREATED,LAST_DDL_TIME,TIMESTAMP,STATUS,
      TEMPORARY,GENERATED,SECONDARY,NAMESPACE,EDITION_NAME,ID)
      VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)'
   );

   FOR rec IN (SELECT   * FROM HSPERFTEST)
   LOOP
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 1, rec.owner);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 2, rec.object_name);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 3, rec.SUBOBJECT_NAME);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 4, rec.OBJECT_ID);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 5, rec.DATA_OBJECT_ID);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 6, rec.OBJECT_TYPE);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 7, rec.CREATED);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 8, rec.LAST_DDL_TIME);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 9, rec.TIMESTAMP);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 10, rec.STATUS);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 11, rec.TEMPORARY);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 12, rec.GENERATED);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 13, rec.SECONDARY);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 14, rec.NAMESPACE);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 15, rec.EDITION_NAME);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 16, rec.ID);
      nr := DBMS_HS_PASSTHROUGH.EXECUTE_NON_QUERY@hsdb (p_c);
   END LOOP;

   DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@hsdb (p_c);
   COMMIT;
END;
/

PL/SQL procedure successfully completed.

Elapsed: 00:07:37.77

That is a huge difference in execution time! And with DBMS_HS_PASSTHROUGH, the HSODBC process CPU usage is much lower, making it more scalable when running in parallel.

Now, how to run it in parallel?

Since the destination database is non-Oracle, the database’s built-in parallel execution cannot be used, so “DIY parallelism” is needed: split/partition the input data by some criteria and start a new process for each partition.

In the real project I used each record’s CREATED_DATE value for partitioning, because the underlying table was already partitioned by that column, but it’s also possible to use table ROWID values for partitioning. A good example of this is in the book Expert Oracle Database Architecture by Thomas Kyte, chapter 14 (Parallel Execution), Do-It-Yourself Parallelism.

Here I’m going to split the table into just 3 pieces to keep it simple. To get the maximum IO performance from Oracle, Tom Kyte’s example should be followed, but I’ll keep my example simple.
Table hsperftest_job will keep the metadata for every parallel slave.

CREATE TABLE hsperftest_job
(
   job_no       NUMBER NOT NULL,
   low_id       ROWID NOT NULL,
   hi_id        ROWID NOT NULL,
   max_in       NUMBER NOT NULL,
   start_date   DATE,
   end_date     DATE,
   rows_processed   NUMBER
);
CREATE SEQUENCE seq_hsperftest_job;

DECLARE
  p_min rowid;
  p_max rowid;
  p_row1 rowid;
  p_row2 rowid;
  p_count number;
BEGIN
  SELECT MIN(rowid), MAX(rowid), COUNT(*) INTO p_min, p_max, p_count
    FROM hsperftest;
  SELECT MAX(a) INTO p_row1 FROM (
    SELECT rowid a FROM hsperftest WHERE rownum < CEIL(p_count/3)
  );
  SELECT MAX(a) INTO p_row2 FROM (
    SELECT rowid a, rownum rn FROM hsperftest WHERE rowid > p_row1 
  ) WHERE rn < CEIL(p_count/3);
  INSERT INTO hsperftest_job (job_no, low_id, hi_id, max_in) VALUES (seq_hsperftest_job.nextval, p_min, p_row1, 0);
  INSERT INTO hsperftest_job (job_no, low_id, hi_id, max_in) VALUES (seq_hsperftest_job.nextval, p_row1, p_row2, 0);
  INSERT INTO hsperftest_job (job_no, low_id, hi_id, max_in) VALUES (seq_hsperftest_job.nextval, p_row2, p_max, 1);
  COMMIT;
END;
/

  SELECT   j.job_no, COUNT (1)
    FROM   hsperftest_job j, hsperftest h
   WHERE   h.ROWID BETWEEN j.low_id AND j.hi_id
GROUP BY   j.job_no;

    JOB_NO   COUNT(1)
---------- ----------
         1      59313
         2      59314
         3      59317

Looks like my very simple ROWID hack is good enough for this test.

Now I'll create a procedure that will be executed in parallel.

CREATE OR REPLACE PROCEDURE hsperftest_proc(p_job_no hsperftest_job.job_no%TYPE) IS
  p_min ROWID;
  p_max ROWID;
  p_max_in  NUMBER;
  p_c   BINARY_INTEGER;
  nr    INTEGER;
  p_count NUMBER:= 0;
BEGIN
  -- Read the job metadata
  SELECT low_id, hi_id, max_in INTO p_min, p_max, p_max_in
  FROM hsperftest_job
  WHERE job_no = p_job_no;
  UPDATE hsperftest_job SET start_date = sysdate WHERE job_no = p_job_no;
  COMMIT;
  -- Do the work
   p_c := DBMS_HS_PASSTHROUGH.OPEN_CURSOR@hsdb;
   DBMS_HS_PASSTHROUGH.PARSE@hsdb (
      p_c,
      'insert into hsperftest (OWNER,OBJECT_NAME,SUBOBJECT_NAME,OBJECT_ID,DATA_OBJECT_ID,OBJECT_TYPE,CREATED,LAST_DDL_TIME,TIMESTAMP,STATUS,
      TEMPORARY,GENERATED,SECONDARY,NAMESPACE,EDITION_NAME,ID)
      VALUES (?,?,?,?,?,?,?,?,?,?,?,?,?,?,?,?)'
   );
   FOR rec IN (SELECT  * FROM HSPERFTEST WHERE rowid BETWEEN p_min AND p_max AND (p_max_in = 1 OR (p_max_in = 0 AND rowid < p_max)))
   LOOP
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 1, rec.owner);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 2, rec.object_name);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 3, rec.SUBOBJECT_NAME);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 4, rec.OBJECT_ID);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 5, rec.DATA_OBJECT_ID);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 6, rec.OBJECT_TYPE);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 7, rec.CREATED);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 8, rec.LAST_DDL_TIME);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 9, rec.TIMESTAMP);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 10, rec.STATUS);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 11, rec.TEMPORARY);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 12, rec.GENERATED);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 13, rec.SECONDARY);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 14, rec.NAMESPACE);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 15, rec.EDITION_NAME);
      DBMS_HS_PASSTHROUGH.BIND_VARIABLE@hsdb (p_c, 16, rec.ID);
      nr := DBMS_HS_PASSTHROUGH.EXECUTE_NON_QUERY@hsdb (p_c);
      p_count:= p_count + 1;
   END LOOP;

   DBMS_HS_PASSTHROUGH.CLOSE_CURSOR@hsdb (p_c);
   COMMIT;  
  -- Write the job end time
  UPDATE hsperftest_job SET end_date = sysdate, rows_processed = p_count WHERE job_no = p_job_no;
  COMMIT;
END;
/

And finally I'll execute the jobs by putting them on the job queue.

DECLARE
   x   PLS_INTEGER;
BEGIN
   FOR rec IN (SELECT   job_no FROM hsperftest_job)
   LOOP
      DBMS_JOB.submit (job => x, what => 'hsperftest_proc(' || rec.job_no || ');');
   END LOOP;

   COMMIT;
END;
/

SQL> SELECT   job, total_time, broken, what FROM   user_jobs;
       JOB TOTAL_TIME B WHAT
---------- ---------- - -----------------
     10967         95 N hsperftest_proc(1);
     10968         95 N hsperftest_proc(2);
     10969         95 N hsperftest_proc(3);

While the jobs are running, monitor the HSODBC processes’ CPU and memory usage. If there are too many parallel slaves running, they may become a bottleneck.
If HSODBC memory usage becomes a problem, partition the source data into smaller chunks.

When everything is done, look at the results.

SQL> SELECT SUM(rows_processed) total_rows, (MAX(end_date)-MIN(start_date))*24*60 runtime_min FROM hsperftest_job;

TOTAL_ROWS RUNTIME_MIN
---------- -----------
    177942  3.58333333

3.58 min ≈ 3 min 35 sec. Nice 🙂