Ansible, ARA and MariaDB (or MySQL)

ARA ^

A short while back I switched to Ansible for configuration management.

One of the first things I needed was a way to monitor Ansible runs for problems, you know, so I wasn’t relying on seeing abnormal logging to tell me that a particular host was unreachable, or that tasks were failing somewhere.

I know I could use Ansible Tower for this, but it seemed like such a heavyweight solution for what seems like a simple requirement. That was when I found ARA.

Honestly even ARA is more than I need, and I don’t typically find myself using the web interface even though it’s very pretty. No, I’m only really interested in poking around in the database.

It’s great that it’s simple to install and get started with, so very quickly I was recording details about my Ansible runs a SQLite database, which is the default setup. From there it was a simple SQL query to check if a particular host had suffered any task failures. It was easy to knock up a Nagios-compatible check plugin to call from my Icinga2 monitoring.

Issues ^

Excess data ^

The first problem I noted was that the SQLite database file was starting to grow in size quite rapidly. Around one week of run data used around 800MiB of storage. It’s not huge, but it was relentlessly growing and I could see very little value in keeping that data as I never looked at data from runs more than a few days previous. So, I wrote a script to delete old stuff from the database, keeping the last week’s worth.

Locking ^

Next up I started seeing SQLite locking problems.

Checks from Icinga2 were connecting to the SQLite DB, and so was the prune script, and Ansible itself too. The ARA Ansible plugin started complaining about locking:

[WARNING]: Failure using method (v2_runner_on_ok) in callback plugin                                                   
(<ansible.plugins.callback./opt/ansible/venv/ansible/lib/python2.7/site-                                                  
packages/ara/plugins/callbacks/log_ara.CallbackModule object at                                                         
0x7f2a84834b10>): (sqlite3.OperationalError) database is locked [SQL: u'SELECT                                          
tasks.id AS tasks_id, tasks.playbook_id AS tasks_playbook_id, tasks.play_id AS                                          
tasks_play_id, tasks.name AS tasks_name, tasks.sortkey AS tasks_sortkey,                                                
tasks.action AS tasks_action, tasks.tags AS tasks_tags, tasks.is_handler AS                                             
tasks_is_handler, tasks.file_id AS tasks_file_id, tasks.lineno AS tasks_lineno,                                         
tasks.time_start AS tasks_time_start, tasks.time_end AS tasks_time_end \nFROM                                           
tasks \nWHERE tasks.id = ?'] [parameters:                                                                               
('5f4506f7-95ac-4468-bea3-672d399d4507',)] (Background on this error at:                                                
http://sqlalche.me/e/e3q8)

The Ansible playbook run itself isn’t quick, taking about 13 minutes at the moment, and it seems like sometimes when that was running the check script too was running into locking issues resulting in alerts that were not actionable.

The ARA Ansible reporting plugin is the only thing that should be writing to the database so I thought it should be simple to let that have a write lock while everything else is freely able to read, but I couldn’t get to the bottom of this. No matter what I tried I was getting lock errors not just from ARA but also from my check script.

The basic problem here is that SQLite really isn’t designed for multiple concurrent access. I needed to move to a proper database engine.

At this point I think people really should be looking at PostgreSQL. My postgres knowledge is really rusty though, and although MySQL has its serious issues I felt like I had a handle on them. Rather than have this be the only postgres thing I have deployed I decided I’d do it with MariaDB.

MariaDB ^

So, MariaDB. Debian stretch. That comes with v10.1. Regular install, create the ARA database, point Ansible at it, do a playbook run and let ARA create its tables on first run. Oh.

UTF8 what? ^

Every play being run by Ansible was giving this warning:

PLAY [all] **************************************************************************
/opt/ansible/venv/ansible/local/lib/python2.7/site-packages/pymysql/cursors.py:170: Warning: (1300, u"Invalid utf8 character string: '9C9D56'")                             
  result = self._query(query)

A bit of searching around suggested that my problem here was that the database was defaulting to utf8mb3 character set when these days MariaDB (and MySQL) should really be using utf8mb4.

Easy enough, right? Just switch to utf8mb4. Oh.

[WARNING]: Skipping plugin (/opt/ansible/venv/ansible/lib/python2.7/site-             
packages/ara/plugins/callbacks/log_ara.py) as it seems to be invalid:                
(pymysql.err.InternalError) (1071, u'Specified key was too long; max key length is
767 bytes') [SQL: u'\nCREATE TABLE files (\n\tid VARCHAR(36) NOT NULL,
\n\tplaybook_id VARCHAR(36), \n\tpath VARCHAR(255), \n\tcontent_id VARCHAR(40),
\n\tis_playbook BOOL, \n\tPRIMARY KEY (id), \n\tFOREIGN KEY(content_id) REFERENCES
file_contents (id), \n\tFOREIGN KEY(playbook_id) REFERENCES playbooks (id) ON DELETE
RESTRICT, \n\tUNIQUE (playbook_id, path), \n\tCHECK (is_playbook IN (0, 1))\n)\n\n']
(Background on this error at: http://sqlalche.me/e/2j85)

The problem now is that by default InnoDB has a maximum key length of 767 bytes, and with the utf8mb4 character set it is possible for each character to use up 4 bytes. A VARCHAR(255) column might need 1020 bytes.

Bigger keys in InnoDB ^

It is possible to increase the maximum key size in InnoDB to 3,000 and something bytes. Here’s the configuration options needed:

[mysqld]
innodb_file_format=Barracuda
innodb_large_prefix=1

But sadly in MariaDB 10.1 even that is not enough, because the InnoDB row format in v10.1 is compact. In order to use large prefixes you need to use either compressed or dynamic.

In MariaDB v10.2 the default changes to dynamic and there is an option to change the default on a server-wide basis, but in v10.1 as packaged with the current Debian stable the only way to change the row format is to override it on the CREATE TABLE command.

Mangling CREATE TABLE ^

Well, this is a little awkward. ARA creates its own database schema when you run it for the first time. Or, you can tell it not to do that, but then you need to create the database schema yourself.

I could have extracted the schema out of the files in site-packages/ara/db/versions/ (there’s only two files), but for a quick hack it was easier to change them. Each op.create_table( needs to have a line added right at the end, e.g. changing from this:

def upgrade():
    op.create_table('data',
    sa.Column('id', sa.String(length=36), nullable=False),                       
    sa.Column('playbook_id', sa.String(length=36), nullable=True),
    sa.Column('key', sa.String(length=255), nullable=True),
    sa.Column('value', models.CompressedData((2 ** 32) - 1), nullable=True),
    sa.ForeignKeyConstraint(['playbook_id'], ['playbooks.id'], ondelete='RESTRICT'),
    sa.PrimaryKeyConstraint('id'),
    sa.UniqueConstraint('playbook_id', 'key')
    )
    ### end Alembic commands ###

To this:

def upgrade():
    op.create_table('data',
    sa.Column('id', sa.String(length=36), nullable=False),                       
    sa.Column('playbook_id', sa.String(length=36), nullable=True),
    sa.Column('key', sa.String(length=255), nullable=True),
    sa.Column('value', models.CompressedData((2 ** 32) - 1), nullable=True),
    sa.ForeignKeyConstraint(['playbook_id'], ['playbooks.id'], ondelete='RESTRICT'),
    sa.PrimaryKeyConstraint('id'),
    sa.UniqueConstraint('playbook_id', 'key'),
    mysql_row_format='DYNAMIC'
    )
    ### end Alembic commands ###

That’s not a good general purpose patch because this is specific to MariaDB/MySQL whereas ARA supports SQLite, MySQL and PostgreSQL. I’m not familiar enough with sqlalchemy/alembic to know how it should properly be done. Maybe the ARA project would take the view that it shouldn’t be done, since the default row format changed already in upstream InnoDB. But I need this to work now, with a packaged version of MariaDB.

Wrapping up ^

So there we go, that all works now, no concurrency issues, ARA is able to create its schema and insert its data properly and my scripts were trivial to port from SQLite syntax to MariDB syntax. The database still needs pruning, but that was always going to be the case.

It was a little more work than I really wanted, having to run a full database engine just to achieve the simple task of monitoring runs for failures, but I think it’s still less work than installing Ansible Tower.

It is a shame that ARA doesn’t work with a Debian stable package of MariaDB or MySQL just by following ARA’s install instructions. I have a feeling this should be considered a bug in ARA but I’m not sure enough to just go ahead and file it, and it seems that the only discussion forums are IRC and Slack. Anyway, once the newer versions of MariaDB and MySQL find their way into stable releases hopefully the character set issues will be a thing of the past.

This limited key length thing (and the whole of MySQL’s history with UTF8 in general) is just another annoyance in a very very long list of MySQL annoyances though, most of them stemming from basic design problems. That’s why I will reiterate: if you’re in a position to do so, I recommend picking PostgreSQL over MySQL.