Ottendorf performance tests (under construction)

Vaclav Kubart


Table of Contents

1. Tested machine
2. Transaction stateful proxy (TODO)
3. Registrar
3.1. Result notes
4. Attributes (TODO)
5. Locking (TODO)
5.1. Registrar and locking

Abstract

Basic performance tests of the SER Ottendorf version as captured from its CVS branch on 29 March 2007, without any modifications. Warning: this build has DBG_QM_MALLOC defined and thus may not give the best performance.

1. Tested machine

Each test describes in detail which machines were used for testing. The machine running SER was the same for all tests. Here are its lmbench results.

2. Transaction stateful proxy (TODO)

The new SIPP seems much slower than the older one: it is able to generate at most around 7k requests per second from the testing client, which is insufficient to saturate SER. Either something other than SIPP or more SIPP instances have to be used here.

Here are results measured with the old SIPP, but they are near the limit of what SIPP can generate, so they should not be given much weight.

3. Registrar

Do tests for all DB modes; only simple REGISTER handling...

Here and here are registrar test results with different DB modes and different usernames (sequentially or randomly generated, with constant length); 1,000,000 registrations were done and only one client was used. In short:

db_mode              usernames    test1 / test2 throughput (registrations per second)
0 ... in memory      sequential   8057 / 8117
0 ... in memory      random       8022 / 8061
1 ... write through  sequential   2348 / 2359
1 ... write through  random       1037 / 1042
2 ... write back     sequential   2788 / 2786 (both with errors)
2 ... write back     random       2099 / 2085 (both with errors)
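
To make the differences easier to compare, the figures can be expressed relative to the in-memory, sequential case. The following is only a small sketch: the numbers are copied from the table above, while averaging test1 and test2 and taking db_mode 0 with sequential usernames as the baseline are assumptions made here for illustration.

```python
# Throughput figures (registrations per second) copied from the table above.
# Each entry holds the test1 and test2 results for one configuration.
RESULTS = {
    ("0 in memory", "sequential"): (8057, 8117),
    ("0 in memory", "random"): (8022, 8061),
    ("1 write through", "sequential"): (2348, 2359),
    ("1 write through", "random"): (1037, 1042),
    ("2 write back", "sequential"): (2788, 2786),
    ("2 write back", "random"): (2099, 2085),
}


def mean_throughput(key):
    """Average of the test1 and test2 results for one configuration."""
    test1, test2 = RESULTS[key]
    return (test1 + test2) / 2


def relative_to_baseline(key, baseline=("0 in memory", "sequential")):
    """Throughput of `key` as a fraction of the (assumed) baseline."""
    return mean_throughput(key) / mean_throughput(baseline)


if __name__ == "__main__":
    for key in RESULTS:
        mode, usernames = key
        print(f"{mode:16} {usernames:10} "
              f"{mean_throughput(key):7.1f} reg/s  "
              f"{relative_to_baseline(key):6.1%} of baseline")
```

With these figures, write-through reaches roughly 29% of the in-memory throughput with sequential usernames and roughly 13% with random ones.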

3.1. Result notes

  • The tests were done with default compilation flags and a default MySQL database running locally on the tested machine.

  • For the in-memory tests, randomized usernames sometimes give slightly better and sometimes slightly worse performance, but when storing into the database (write-through mode) randomized usernames are significantly slower than sequential ones (repeated several times with similar results).

  • DB mode 2 (write back) seems rather unusable for huge amounts of data in the current implementation. The usrloc data structure is probably locked for the whole time it is traversed and the changes are written to the database, so no registrations are processed during that time.

  • Very interesting is the context switching rate measured by sar during the test:

    db_mode              usernames    average context switches per second
    0 ... in memory      sequential   239966
    0 ... in memory      random       240894
    1 ... write through  sequential   323460
    1 ... write through  random       512373
    2 ... write back     sequential   340578
    2 ... write back     random       427975
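
The context switching rates become easier to interpret when divided by the throughput, which gives an approximate number of context switches per registration. This is a rough sketch only: the figures come from the two tables in this section, and it assumes that the sar average can be paired with the mean of the test1 and test2 throughput. The sar rate is system-wide, so it also counts switches caused by MySQL and any other process on the machine.

```python
# Figures copied from the two tables in this section.
# throughput: (test1, test2) registrations per second
# switches:   average context switches per second reported by sar
MEASUREMENTS = {
    ("0 in memory", "sequential"): {"throughput": (8057, 8117), "switches": 239966},
    ("0 in memory", "random"): {"throughput": (8022, 8061), "switches": 240894},
    ("1 write through", "sequential"): {"throughput": (2348, 2359), "switches": 323460},
    ("1 write through", "random"): {"throughput": (1037, 1042), "switches": 512373},
    ("2 write back", "sequential"): {"throughput": (2788, 2786), "switches": 340578},
    ("2 write back", "random"): {"throughput": (2099, 2085), "switches": 427975},
}


def switches_per_registration(key):
    """System-wide context switches divided by mean registration throughput."""
    entry = MEASUREMENTS[key]
    test1, test2 = entry["throughput"]
    # Assumption: the sar average is representative of both test runs.
    mean_throughput = (test1 + test2) / 2
    return entry["switches"] / mean_throughput


if __name__ == "__main__":
    for (mode, usernames) in MEASUREMENTS:
        ratio = switches_per_registration((mode, usernames))
        print(f"{mode:16} {usernames:10} {ratio:7.1f} switches per registration")
```

Under these assumptions the in-memory modes need about 30 context switches per registration, while write-through with random usernames needs nearly 500.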

4. Attributes (TODO)

Test the performance of attribute usage:

  • loading from DB (domain, uid)

  • setting value (setting many AVPs)

  • getting AVP value (many AVPs, try to get the last one)

5. Locking (TODO)

Try different mutex implementations to show their performance impact on the transaction stateful proxy and on the registrar.

5.1. Registrar and locking

Longer operations like registrations place different demands on mutexes than short ones. Here are registrar results obtained with the default locking settings and with pthread mutexes. In this case pthread mutexes seem to give significantly better performance than the default locking (measured on a machine with one 3 GHz Intel CPU with hyperthreading).

Table 1. average throughput (registrations per second) with default locking

db_mode              1 process  2 processes  4 processes  8 processes
0 ... in memory      9898       9410         8548         8191
1 ... write through  3490       2794         2438         2389
2 ... write back     4566       4319         2759         2789

Table 2. average throughput (registrations per second) with pthread mutexes

db_mode              1 process  2 processes  4 processes  8 processes
0 ... in memory      9824       9845         9435         9301
1 ... write through  3466       3254         3298         3182
2 ... write back     5105       5133         5121         5096

Note that the context switching rate reported by sar for 4/8 processes is much lower when using pthread mutexes (80k switches/s in db_mode 0) than in the case of default locking (200-250k switches/s in db_mode 0).
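
The effect of the locking implementation can be summarized by how much of the single-process throughput is kept as the number of processes grows. The sketch below uses the figures from Table 1 and Table 2; the helper names `retained` and `pthread_gain` are hypothetical and exist only for this illustration.

```python
# Average throughput (registrations per second) from Table 1 and Table 2,
# listed in the order of the process counts below.
PROCESSES = (1, 2, 4, 8)
THROUGHPUT = {
    "default": {
        "0 in memory": (9898, 9410, 8548, 8191),
        "1 write through": (3490, 2794, 2438, 2389),
        "2 write back": (4566, 4319, 2759, 2789),
    },
    "pthread": {
        "0 in memory": (9824, 9845, 9435, 9301),
        "1 write through": (3466, 3254, 3298, 3182),
        "2 write back": (5105, 5133, 5121, 5096),
    },
}


def retained(locking, db_mode, processes=8):
    """Fraction of the 1-process throughput still reached with `processes`."""
    row = THROUGHPUT[locking][db_mode]
    return row[PROCESSES.index(processes)] / row[0]


def pthread_gain(db_mode, processes=8):
    """Throughput with pthread mutexes divided by throughput with default locking."""
    column = PROCESSES.index(processes)
    return (THROUGHPUT["pthread"][db_mode][column]
            / THROUGHPUT["default"][db_mode][column])


if __name__ == "__main__":
    for db_mode in THROUGHPUT["default"]:
        print(f"{db_mode:16} "
              f"default keeps {retained('default', db_mode):5.1%}, "
              f"pthread keeps {retained('pthread', db_mode):5.1%}, "
              f"pthread/default at 8 processes: {pthread_gain(db_mode):.2f}")
```

For example, with 8 processes write back keeps about 61% of its single-process throughput under default locking, but stays nearly unchanged with pthread mutexes.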