Thinking about MariaDB/MySQL Performance

Performance is always a hot area, we always want more. The good news is that it's always possible to improve performance, to make it go faster. The bad news is that it requires time, money, expertise and understanding. Before investing too much time or money, we should verify that we're not wasting our time/money.

If you want to improve performance or be able to cope with more clients, there are two basic options

Add capacity; use more hardware, otherwise known as "throwing money at the problem".
Make effort to improve the configuration, which is approximately "throwing time/expertise at the problem".

Choosing which approach to take is part of the Art.

When to add capacity or tune for performance? The answer to this is rather beyond the scope of this article, but in short: "Not until you can demonstrate that you need to". Or in other words, don't bother with performance tuning those tiny internal databases that hardly get any traffic.
How do you know whether your proposed or recently implemented change (either a configuration change or additional/different hardware) is beneficial? It's not uncommon that a minor tweak made without understanding could severely damage performance. That is a consequence of the complexity of a world that is composed of our operating systems, databases systems and collections of services that, in aggregrate, are the application(s) that the customers care about. The answer is "By understanding the performance metrics".

So I'm going to suggest some simple approaches to getting some performance metrics to help demonstrate performance characteristics.

Performance metrics

Monitoring the performance of databases is a huge topic, I'm barely going to scratch the surface; partly because there's already plenty of literature that goes into depth, but partly because the surface of this topic appears to me to be underserved.

It's likely that you're really more interested in the user experience - so performance metrics about the application itself might be more informative than database metrics; this is for the case where you're just trying to get a rough idea or you're wanting to compare hardware before a purchase or the like.

Checking the database

mysqlslap(1) is able to get you some quick easy numbers to suggest whether your change is pushing you in the right direction. mysqlslap can be tweaked with custom queries so that your test better reflects the query patterns of your application.

I wrote a wrapper, mysql_test.sh, around mysqlslap to cover a number of the settings for my situation:

And then I ran the wrapper in a loop to get some numbers into graphite:

for c in $(seq 5 5 25)
do for q in 50 500 5000
    do
      mysql_test.sh -c $c -i 1 -q $q -H mariadb.example.com -f | nc -c mycarbon.example.com 2003
    done
done

Piping to |logger dbtest instead of to netcat made it available to a splunk instance.

Performance tuning

All of the following are reasonably easy to understand and therefore are not too likely to be set so badly as to cause problems, but of course, defaults are set for a reason and if you want to change the defaults you should make an effort to understand the consequences.

The database itself

innodb_buffer_pool_size: for a large memory, single-purpose database server this can perhaps be as much as 80% of system memory.
innodb_flush_log_at_trx_commit=2: if you already have resilience to failure (in the form of HA/multiple nodes) or tolerance of failure, you can relax the requirement to flush every transaction.
innodb_log_file_size: InnoDB stores the changes to tables before they are written to disk so that random IO becomes sequential. Increasing the size of the log file(s) saves disk IO costs which is a large factor in performance if the active size of the database is larger than the buffer pool.

OS/kernel/other

In roughly descending order of (cost|effort): likely benefit.

mount datadirectory filesystems with noatime
vm.swappiness = (0|1)
vm.overcommit_memory = 2, vm.overcommit_ratio: allocatable memory=(swap size + (RAM size * overcommit ratio))
elevator-deadline`|`noop
separate bin logs (and possibly query/error logs, if they are busy) to different filesystems/"spindles".
numactl(1)

https://www.redhat.com/magazine/001nov04/features/vm/ https://www.kernel.org/doc/gorman/

AllGoodBits.org