Ilmar Kerm

Oracle, databases, Linux and maybe more

I published my 2014 presentation “Making MySQL highly available using Oracle Grid Infrastructure” on Slideshare.
Please also read my page on how to set up the MySQL scripts for Oracle GI.

Free MySQL seminar on 27 August 2014 @ 13:00. Announcement by Oracle User Group Estonia:

Developing modern applications using MySQL.

In this seminar series, learn how to best use MySQL for your existing and new development requirements with leading MySQL expert and Oracle ACE Director Ronald Bradford.

These presentations provide a detailed review of the essential lifecycle components for developing a successful software application and offer a checklist for your company to review the design, development, deployment and support of your business applications with MySQL.

The presentations include:

* Effective MySQL Architecture and Design Practices
* Effective Software Development with MySQL
* Effective Web Site Operations
* Upcoming MySQL features for modern applications

A detailed description of the topics: read here.

More information about Ronald Bradford:

http://ronaldbradford.com/
https://en.wikipedia.org/wiki/Ronald_Bradford
https://apex.oracle.com/pls/apex/f?p=19297:4:::NO:4:P4_ID:1820

To attend this event, PLEASE REGISTER!

This event is organized by Oracle User Group Estonia in cooperation with Finnish, Swedish and Latvian user groups.

The event in Tallinn is sponsored by TransferWise.

If you require more information about this event, please contact ouge@ouge.eu.

Back-story: A developer came to me and asked me to explain a weird behavior in MySQL. They inserted a record (into an InnoDB table), committed, and, after receiving a message (in another application), immediately tried to read the newly inserted record, but it was not found. The problem occurred only in production, and not always, though quite frequently.

After comparing the MySQL parameter files between the production and development environments, I discovered that in production autocommit was disabled, to make MySQL behave more like Oracle. This setting had been removed from development when we rebuilt that environment (to use multiple MySQL instances under Oracle Clusterware instead of one large MySQL instance), but the rebuild had not yet been done in production.

The default transaction isolation level for MySQL InnoDB is REPEATABLE READ (unlike Oracle, where the default is READ COMMITTED). This means that a SELECT query always returns the data as of the point in time when the transaction started. If autocommit is off, the first SELECT statement opens a transaction, and any subsequent SELECT returns the data as of the time the first SELECT was issued, until the transaction is ended with COMMIT/ROLLBACK. If autocommit is enabled, each SELECT statement runs in its own self-contained transaction, ending with a COMMIT, so the end result resembles the READ COMMITTED isolation level in Oracle.

Here is an example of what you’d expect to see as a result:

mysql session 1$ create table test (id integer unsigned primary key) engine=innodb;
Query OK, 0 rows affected (0.01 sec)

mysql session 1$ set autocommit=1;
Query OK, 0 rows affected (0.00 sec)

mysql session 1$ select * from test;
Empty set (0.01 sec)

  mysql session 2$ begin;
  Query OK, 0 rows affected (0.00 sec)

  mysql session 2$ insert into test values (1);
  Query OK, 1 row affected (0.05 sec)

  mysql session 2$ commit;
  Query OK, 0 rows affected (0.00 sec)

mysql session 1$ select * from test;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

The same example, with autocommit off and the transaction isolation level left at the default:

mysql session 1$ set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

mysql session 1$ select * from test;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

  mysql session 2$ begin;
  Query OK, 0 rows affected (0.00 sec)

  mysql session 2$ insert into test values (2);
  Query OK, 1 row affected (0.05 sec)

  mysql session 2$ commit;
  Query OK, 0 rows affected (0.00 sec)

mysql session 1$ select * from test;
+----+
| id |
+----+
|  1 |
+----+
1 row in set (0.00 sec)

mysql session 1$ commit;
Query OK, 0 rows affected (0.00 sec)

mysql session 1$ select * from test;
+----+
| id |
+----+
|  1 |
|  2 |
+----+
2 rows in set (0.00 sec)

Now, with autocommit off and the transaction isolation level set to READ COMMITTED:

mysql session 1$ set autocommit=0;
Query OK, 0 rows affected (0.00 sec)

mysql session 1$ set session transaction isolation level read committed;
Query OK, 0 rows affected (0.03 sec)

mysql session 1$ select * from test;
+----+
| id |
+----+
|  1 |
|  2 |
+----+
2 rows in set (0.01 sec)

  mysql session 2$ begin;
  Query OK, 0 rows affected (0.00 sec)

  mysql session 2$ insert into test values (3);
  Query OK, 1 row affected (0.05 sec)

  mysql session 2$ commit;
  Query OK, 0 rows affected (0.00 sec)

mysql session 1$ select * from test;
+----+
| id |
+----+
|  1 |
|  2 |
|  3 |
+----+
3 rows in set (0.00 sec)

You can read more about how autocommit is handled by InnoDB in the documentation.

Another key piece of the puzzle: the application server used connection pools to manage MySQL connections, so when a Java session requested a connection from the pool, it was a matter of chance whether the connection was brand new, or had been used before, and whether its previous user had committed or left a transaction open.
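The defensive fix on the pool side is to discard any leftover transaction state when a connection is handed out. A minimal sketch (the class and parameter names here are illustrative, not from any real pooling library):

```python
# Hypothetical sketch of a connection pool that resets transaction state
# on checkout, so a borrower never inherits an open snapshot.
class SimplePool:
    def __init__(self, connect, reset_on_borrow=True):
        self._connect = connect          # factory producing new connections
        self._idle = []                  # previously released connections
        self._reset = reset_on_borrow

    def acquire(self):
        conn = self._idle.pop() if self._idle else self._connect()
        if self._reset:
            # Roll back whatever the previous borrower left behind; the
            # next SELECT then starts from a fresh snapshot.
            conn.rollback()
        return conn

    def release(self, conn):
        # Connections are returned as-is; cleanup happens at acquire time.
        self._idle.append(conn)
```

Real pools (c3p0, HikariCP, and others) offer similar reset-on-borrow behavior through their configuration; the point is that without it, correctness depends on every borrower ending its transaction properly.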

I’ve been doing quite a few database clustering implementations lately, mostly Oracle RAC, and since we have many MySQL instances in production, I had to find a good way to make the MySQL instances highly available as well.

One good solution is to manage the MySQL instances with clusterware, and since we are planning to use Oracle RAC on Oracle Enterprise Linux anyway, Oracle Clusterware is an excellent candidate for the task. Also, Oracle Clusterware is included with Oracle Enterprise Linux at no additional charge.

The requirements I had:

  • Multiple MySQL instances running in the same cluster; in case of node failure, the affected MySQL instances are moved to any surviving node (the least loaded one)
  • Different MySQL instances may run different MySQL RDBMS software versions
  • Each MySQL instance listens on its own dedicated, clusterware-managed IP address (so MySQL can always respond on the default port 3306 and can move between nodes independently, without dependencies on other MySQL instances)
  • Clusterware monitors each MySQL instance and, in case of trouble, automatically moves the IP and starts the instance on another cluster node
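Clusterware drives an application resource through a small action script that it calls with start, stop, and check. The skeleton below is only a hypothetical illustration of that contract; all paths and file names are made up for the example, not taken from the actual scripts on the project page.

```shell
#!/bin/bash
# Hypothetical action-script skeleton for one Clusterware-managed MySQL
# instance. MYSQLD_SAFE, DEFAULTS_FILE and PIDFILE are illustrative
# assumptions; a real setup would parameterize them per instance.
MYSQLD_SAFE=/usr/bin/mysqld_safe
DEFAULTS_FILE=/etc/mysql/instance1.cnf
PIDFILE=/var/run/mysql/instance1.pid

mysql_action() {
  case "$1" in
    start)
      # By this point Clusterware has already brought up the instance VIP
      "$MYSQLD_SAFE" --defaults-file="$DEFAULTS_FILE" >/dev/null 2>&1 &
      ;;
    stop)
      [ -f "$PIDFILE" ] && kill "$(cat "$PIDFILE")"
      ;;
    check)
      # Return 0 if the instance is alive; on failure Clusterware restarts
      # the resource or relocates it (with its IP) to another node
      [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
      ;;
    *)
      echo "usage: $0 {start|stop|check}" >&2
      return 1
      ;;
  esac
}

if [ $# -gt 0 ]; then mysql_action "$1"; fi
```

Because each instance has its own defaults file, pid file, and VIP, any number of such resources can coexist and fail over independently.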

For this task I wrote some Bash scripts to help manage MySQL instances in an Oracle Clusterware environment. All the scripts are available on the Google Code project page, together with documentation of the whole setup and how to use the scripts: Documentation

All comments welcome!

Here I’m following up on my previous post, Using linguistic indexes for sorting in Oracle. I don’t much like the Oracle solution, which requires creating a special index to speed up sorting, but at the same time it is very powerful: it allows indexing in many languages and requires no database changes.

In this post I’ll take a look at two popular open-source databases, MySQL and PostgreSQL. I’ll look only at features the database includes out of the box, which can be used without any special application changes.

PostgreSQL 8.4

Starting from 8.4, collation (sorting) rules can be defined per database; there is no way to set them at the session level. All sorting and all indexes are ordered according to the database collation locale. In previous versions, only one collation locale was allowed for the entire database cluster.

For my example I create two databases, one with the Estonian locale and one with the German locale.

$ createdb -E utf8 -l 'et_EE.utf8' -O ilmar ilmar_ee
$ createdb -E utf8 -l 'de_DE.utf8' -O ilmar -T template0 ilmar_de

Currently PostgreSQL relies on the underlying OS to do collation, so the OS must also support the specified locale. Check with:

$ locale -a | grep et_EE
et_EE
et_EE.iso88591
et_EE.iso885915
et_EE.utf8

To change the collation you need to dump the entire database to a text file (pg_dump), create a new database, and reload the data; a pretty painful procedure for large databases.

A small test to see if it really works. In ilmar_ee and ilmar_de I create the table test_coll and load it with 4 rows:

CREATE TABLE test_coll (
  t character varying(100) NOT NULL
);

begin;
insert into test_coll values ('a');
insert into test_coll values ('o');
insert into test_coll values ('õ');
insert into test_coll values ('ä');
commit;


ilmar_ee=> select t from test_coll order by t;
a
o
õ
ä

ilmar_de=> select t from test_coll order by t;
a
ä
o
õ
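The two orderings above can also be reproduced outside the database with an explicit sort key. Here is a minimal Python sketch; the alphabet strings are rough, illustrative approximations of the et_EE and de_DE rules, covering only the sample letters:

```python
# Sketch: locale-aware ordering via an explicit alphabet table.
# Real collations (glibc locales, ICU) handle vastly more cases;
# these two strings only encode the order of the four sample letters.
ESTONIAN = "aoõä"   # in Estonian, õ and ä sort near the end of the alphabet
GERMAN   = "aäoõ"   # in German, ä sorts right after a

def sort_key(alphabet):
    # Map each character to its position in the given alphabet.
    return lambda word: [alphabet.index(ch) for ch in word]

rows = ["a", "o", "õ", "ä"]
print(sorted(rows, key=sort_key(ESTONIAN)))  # ['a', 'o', 'õ', 'ä']
print(sorted(rows, key=sort_key(GERMAN)))    # ['a', 'ä', 'o', 'õ']
```

This is essentially what a collation is: a mapping from strings to binary sort keys, applied either per database (PostgreSQL 8.4) or per column (MySQL).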

Now, can an index be used for sorting?

CREATE TABLE t (
  t character varying(100) NOT NULL
);

CREATE INDEX idxt
  ON t
  USING btree
  (t);

CREATE OR REPLACE FUNCTION fill_t(p_num_rows bigint) RETURNS bigint AS
$BODY$
declare
  s t.t%type;
  i integer;
begin
  for i in 1..p_num_rows loop
    s:= case mod(i, 4) when 0 then 'a' when 1 then 'ä' when 2 then 'o' when 3 then 'õ' end;
    if mod(i,2) = 0 then
      s:= upper(s);
    end if;
    s:= s ||' wqe wqe wqe wqeqwdsa asd asdasd sasss we qwewq dssas';
    insert into t (t) values (s);
  end loop;
  return 0;
end;
$BODY$
  LANGUAGE 'plpgsql' VOLATILE;

select fill_t(10000);
vacuum analyze t;

After the test data is created, some tests. Oracle users will find the following very strange:

ilmar=> explain select t from t order by t;
                          QUERY PLAN
--------------------------------------------------------------
 Sort  (cost=868.39..893.39 rows=10000 width=55)
   Sort Key: t
   ->  Seq Scan on t  (cost=0.00..204.00 rows=10000 width=55)

ilmar=> explain select t from t where t between 'a' and 'b' order by t;
                               QUERY PLAN
-------------------------------------------------------------------------
 Sort  (cost=395.10..401.35 rows=2500 width=55)
   Sort Key: t
   ->  Seq Scan on t  (cost=0.00..254.00 rows=2500 width=55)
         Filter: (((t)::text >= 'a'::text) AND ((t)::text <= 'b'::text))

ilmar=> explain select t from t where t between 'a' and 'ak' order by t;
                               QUERY PLAN
------------------------------------------------------------------------
 Index Scan using idxt on t  (cost=0.00..8.27 rows=1 width=55)
   Index Cond: (((t)::text >= 'a'::text) AND ((t)::text <= 'ak'::text))

It seems that the Postgres optimizer only considers using an index for sorting when the filter selects a small fraction of the table. The documentation gives a reason for this:

The planner will consider satisfying an ORDER BY specification either by scanning an available index that matches the specification, or by scanning the table in physical order and doing an explicit sort. For a query that requires scanning a large fraction of the table, an explicit sort is likely to be faster than using an index because it requires less disk I/O due to following a sequential access pattern. Indexes are more useful when only a few rows need be fetched.

So it seems that Postgres cannot fast full scan an index. If I fill the table up even more, the optimizer finally costs the sort operation higher than the index scan. But the query is slow: 4125 ms on my system.

ilmar=> select fill_t(90000);
ilmar=> vacuum analyze t;

ilmar=> explain select t from t order by t;
                              QUERY PLAN
-----------------------------------------------------------------------
 Index Scan using idxt on t  (cost=0.00..9771.11 rows=100000 width=55)

A note from the documentation:

The drawback of using locales other than C or POSIX in PostgreSQL is its performance impact. It slows character handling and prevents ordinary indexes from being used by LIKE. For this reason use locales only if you actually need them.

Better collation support seems to be a work in progress:
Todo:Collate
Todo:IDU

It’s also possible to use the third-party Orafce package, which provides an NLSSORT function similar to Oracle’s.

MySQL 5.1

MySQL supports collations at the column level.
All available collations for a given character set can be queried like this:

mysql> show collation where charset = 'utf8';
+--------------------+---------+-----+---------+----------+---------+
| Collation          | Charset | Id  | Default | Compiled | Sortlen |
+--------------------+---------+-----+---------+----------+---------+
| utf8_general_ci    | utf8    |  33 | Yes     | Yes      |       1 |
| utf8_bin           | utf8    |  83 |         | Yes      |       1 |
| utf8_unicode_ci    | utf8    | 192 |         | Yes      |       8 |
| utf8_icelandic_ci  | utf8    | 193 |         | Yes      |       8 |
| utf8_latvian_ci    | utf8    | 194 |         | Yes      |       8 |
| utf8_romanian_ci   | utf8    | 195 |         | Yes      |       8 |
| utf8_slovenian_ci  | utf8    | 196 |         | Yes      |       8 |
| utf8_polish_ci     | utf8    | 197 |         | Yes      |       8 |
| utf8_estonian_ci   | utf8    | 198 |         | Yes      |       8 |
| utf8_spanish_ci    | utf8    | 199 |         | Yes      |       8 |
| utf8_swedish_ci    | utf8    | 200 |         | Yes      |       8 |
| utf8_turkish_ci    | utf8    | 201 |         | Yes      |       8 |
| utf8_czech_ci      | utf8    | 202 |         | Yes      |       8 |
| utf8_danish_ci     | utf8    | 203 |         | Yes      |       8 |
| utf8_lithuanian_ci | utf8    | 204 |         | Yes      |       8 |
| utf8_slovak_ci     | utf8    | 205 |         | Yes      |       8 |
| utf8_spanish2_ci   | utf8    | 206 |         | Yes      |       8 |
| utf8_roman_ci      | utf8    | 207 |         | Yes      |       8 |
| utf8_persian_ci    | utf8    | 208 |         | Yes      |       8 |
| utf8_esperanto_ci  | utf8    | 209 |         | Yes      |       8 |
| utf8_hungarian_ci  | utf8    | 210 |         | Yes      |       8 |
+--------------------+---------+-----+---------+----------+---------+
21 rows in set (0.04 sec)

I'll create a table test, where column e uses Estonian sorting and column h uses Hungarian sorting.

mysql> create table test (
  e varchar(100) not null collate 'utf8_estonian_ci',
  h varchar(100) not null collate 'utf8_hungarian_ci'
) charset=utf8;

mysql> insert into test values ('a','a');
mysql> insert into test values ('ä','ä');
mysql> insert into test values ('o','o');
mysql> insert into test values ('õ','õ');
mysql> select * from test;
+---+---+
| e | h |
+---+---+
| a | a |
| ä | ä |
| o | o |
| õ | õ |
+---+---+
4 rows in set (0.02 sec)

mysql> select * from test order by e;
+---+---+
| e | h |
+---+---+
| a | a |
| o | o |
| õ | õ |
| ä | ä |
+---+---+
4 rows in set (0.04 sec)

mysql> select * from test order by h;
+---+---+
| e | h |
+---+---+
| a | a |
| ä | ä |
| o | o |
| õ | õ |
+---+---+
4 rows in set (0.01 sec)

Perfect, now what about indexes?

mysql> create index idxe on test (e);
Query OK, 4 rows affected (0.71 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> create index idxh on test (h);
Query OK, 4 rows affected (0.18 sec)
Records: 4  Duplicates: 0  Warnings: 0

mysql> explain select e from test order by e;
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | test  | index | NULL          | idxe | 302     | NULL |    4 | Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

mysql> explain select h from test order by h;
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
|  1 | SIMPLE      | test  | index | NULL          | idxh | 302     | NULL |    4 | Using index |
+----+-------------+-------+-------+---------------+------+---------+------+------+-------------+
1 row in set (0.00 sec)

If I force a different collation on a column, the values are read from the index, but an extra filesort step is needed:

mysql> explain select h from test order by h collate utf8_estonian_ci;
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
| id | select_type | table | type  | possible_keys | key  | key_len | ref  | rows | Extra                       |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
|  1 | SIMPLE      | test  | index | NULL          | idxh | 302     | NULL |    4 | Using index; Using filesort |
+----+-------------+-------+-------+---------------+------+---------+------+------+-----------------------------+
1 row in set (0.00 sec)

I'll add more rows to see how much difference there is between reading the order from the index and doing the extra sort.

mysql> delimiter //
mysql> create procedure load_data(p_num_rows INT)
    -> BEGIN
    ->   SET @i = 0;
    ->   REPEAT
    ->     SET @i = @i + 1;
    ->     INSERT INTO test (e, h) VALUES (CONCAT('eeeaad sadsa dasd asd', @i),
    ->       CONCAT('213aad sadsa dasd asd', @i));
    ->   UNTIL @i > p_num_rows END REPEAT;
    -> END
    -> //
Query OK, 0 rows affected (0.44 sec)

mysql> delimiter ;
mysql> call load_data(10000);
mysql> call load_data(20000);
mysql> ANALYZE TABLE test;

mysql> select sql_no_cache count(*) from (select h from test order by h collate utf8_estonian_ci) a;
+----------+
| count(*) |
+----------+
|    30002 |
+----------+
1 row in set (1.22 sec)

mysql> select sql_no_cache count(*) from (select h from test order by h) a;
+----------+
| count(*) |
+----------+
|    30002 |
+----------+
1 row in set (0.09 sec)

It is also possible to set the collation at the connection level, but unlike Oracle's session-level NLS_SORT setting, this does not change the row sort order; the column collation still wins:

mysql> truncate table test;
Query OK, 0 rows affected (0.07 sec)

mysql> insert into test values ('a','a');
Query OK, 1 row affected (0.03 sec)

mysql> insert into test values ('ä','ä');
Query OK, 1 row affected (0.08 sec)

mysql> insert into test values ('o','o');
Query OK, 1 row affected (0.03 sec)

mysql> insert into test values ('õ','õ');
Query OK, 1 row affected (0.03 sec)

mysql> set collation_connection=utf8_estonian_ci;
Query OK, 0 rows affected (0.01 sec)

mysql> select h from test order by h;
+---+
| h |
+---+
| a |
| ä |
| o |
| õ |
+---+
4 rows in set (0.00 sec)

mysql> set collation_connection=utf8_hungarian_ci;
Query OK, 0 rows affected (0.00 sec)

mysql> select h from test order by h;
+---+
| h |
+---+
| a |
| ä |
| o |
| õ |
+---+
4 rows in set (0.00 sec)