Abstract
Basic performance tests of the SER Ottendorf version as checked out from its CVS branch on 29 March 2007, without any modifications. Warning: this build has DBG_QM_MALLOC defined and thus may not give the best performance!
Each test describes in detail which machine was used for testing. The machine running SER was the same for all tests. Here are its lmbench results.
The new SIPP seems much slower than the older one: it can generate at most around 7k requests per second from the testing client, which is insufficient to saturate SER. Either a tool other than SIPP or several SIPP instances have to be used here.
Here are results measured with the old SIPP, but they are close to the limit of what SIPP can generate, so they should not be given much weight.
Do tests for all DB modes; only simple REGISTER handling...
Here and here are registrar test results with different DB modes and different usernames (sequentially or randomly generated, with constant length); 1,000,000 registrations were done; only one client was used. In short:
| db_mode | usernames | test1/test2 throughput (registrations per second) |
|---|---|---|
| 0 ... in memory | sequential | 8057 / 8117 |
| 0 ... in memory | random | 8022 / 8061 |
| 1 ... write through | sequential | 2348 / 2359 |
| 1 ... write through | random | 1037 / 1042 |
| 2 ... write back | sequential | 2788 / 2786 (both with errors) |
| 2 ... write back | random | 2099 / 2085 (both with errors) |
The test was done with default compilation flags and a default MySQL database running locally on the tested machine.
In the in-memory tests, randomized usernames sometimes give slightly better and sometimes slightly worse performance. When storing into the database (write-through mode), however, randomized usernames are significantly slower than sequential ones (the test was repeated several times with similar results).
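The gap between sequential and random usernames can be quantified from the throughput table above with a few lines of Python. This is only a sketch over the published numbers: the variable and function names are made up for illustration, and the two runs per configuration are simply averaged. db_mode 2 is left out because both of its runs finished with errors.

```python
# Throughput (registrations per second) for test1/test2, copied from the table.
throughput = {
    ("in memory", "sequential"): (8057, 8117),
    ("in memory", "random"): (8022, 8061),
    ("write through", "sequential"): (2348, 2359),
    ("write through", "random"): (1037, 1042),
}


def mean(values):
    """Arithmetic mean of the runs for one configuration."""
    return sum(values) / len(values)


def random_vs_sequential(mode):
    """Ratio of random-username to sequential-username mean throughput."""
    random_mean = mean(throughput[(mode, "random")])
    sequential_mean = mean(throughput[(mode, "sequential")])
    return random_mean / sequential_mean


for mode in ("in memory", "write through"):
    ratio = random_vs_sequential(mode)
    print(f"{mode}: random usernames reach {ratio:.0%} of sequential throughput")
```

With these numbers the in-memory ratio stays within about 1% of parity, while write-through with random usernames reaches well under half of the sequential throughput.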
DB mode 2 (write back) seems rather unusable for huge amounts of data in the current implementation. The usrloc data structure is probably locked for the whole time it is traversed and the changes are written to the database, so no registrations are processed during that time.
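The suspected behaviour can be illustrated with a toy model. The sketch below is not SER code: `UsrlocSketch`, `register`, `can_register_now`, and both flush methods are hypothetical names, and a plain `threading.Lock` stands in for whatever locking usrloc really uses. It contrasts a flush that keeps the lock while writing with one that only holds the lock long enough to take a snapshot, and checks from inside the (fake) database write whether a registration could get the lock at that moment.

```python
import threading


class UsrlocSketch:
    """Hypothetical stand-in for an in-memory location table; not SER code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._dirty = {}  # username -> contact, not yet written to the DB

    def register(self, username, contact):
        with self._lock:
            self._dirty[username] = contact

    def can_register_now(self):
        """Return True if a registration could take the lock right now."""
        if self._lock.acquire(blocking=False):
            self._lock.release()
            return True
        return False

    def flush_holding_lock(self, db_write):
        # The lock is held for the whole traversal and every DB write.
        with self._lock:
            for username, contact in self._dirty.items():
                db_write(username, contact)
            self._dirty.clear()

    def flush_with_snapshot(self, db_write):
        # The lock is held only while swapping out the dirty set.
        with self._lock:
            snapshot, self._dirty = self._dirty, {}
        for username, contact in snapshot.items():
            db_write(username, contact)


def demo(flush_name):
    """Record, for each DB write, whether registrations were blocked."""
    table = UsrlocSketch()
    for i in range(3):
        table.register(f"user{i}", f"sip:user{i}@example.invalid")
    observed = []

    def db_write(username, contact):
        observed.append(table.can_register_now())

    getattr(table, flush_name)(db_write)
    return observed


print("holding lock:", demo("flush_holding_lock"))
print("snapshot:    ", demo("flush_with_snapshot"))
```

In the first variant every database write happens while registrations are locked out, which would match the stalls seen in the write-back results if usrloc behaves this way. Whether it actually does has to be confirmed in the SER sources.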
VERY interesting is the context switching rate measured by sar during the test:
| db_mode | usernames | average context switches per second |
|---|---|---|
| 0 ... in memory | sequential | 239966 |
| 0 ... in memory | random | 240894 |
| 1 ... write through | sequential | 323460 |
| 1 ... write through | random | 512373 |
| 2 ... write back | sequential | 340578 |
| 2 ... write back | random | 427975 |
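To put the context switching rates in relation to the work done, they can be divided by the registration throughput measured earlier. The sketch below does this for db_mode 0 and 1 only (the db_mode 2 runs had errors). The names are illustrative, the throughput is the mean of test1 and test2, and the sar figure is system-wide, so it presumably also includes switches caused by the local MySQL server.

```python
# (mean throughput of test1/test2 in registrations/s, average context switches/s)
results = {
    ("in memory", "sequential"): ((8057 + 8117) / 2, 239966),
    ("in memory", "random"): ((8022 + 8061) / 2, 240894),
    ("write through", "sequential"): ((2348 + 2359) / 2, 323460),
    ("write through", "random"): ((1037 + 1042) / 2, 512373),
}


def switches_per_registration(key):
    """System-wide context switches per processed registration."""
    registrations_per_s, switches_per_s = results[key]
    return switches_per_s / registrations_per_s


for key in results:
    mode, usernames = key
    print(f"{mode:14s} {usernames:10s} {switches_per_registration(key):7.1f}")
```

This gives roughly 30 context switches per registration for the in-memory runs and several hundred for write-through with random usernames.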
Test performance of attribute usage:
loading from DB (domain, uid)
setting value (setting many AVPs)
getting AVP value (many AVPs, try to get the last one)
Try different mutex implementations to show the performance impact on a transaction-stateful proxy and on the registrar.
Longer operations such as registrations place different demands on mutexes than short ones. Here are registrar results obtained with default locking settings and with pthread mutexes. In this case pthread mutexes seem to give significantly better performance than default locking (measured on a machine with one 3 GHz Intel CPU with hyperthreading).
Table 1. average throughput (registrations per second) with default locking
| db_mode | 1 process | 2 processes | 4 processes | 8 processes |
|---|---|---|---|---|
| 0 ... in memory | 9898 | 9410 | 8548 | 8191 |
| 1 ... write through | 3490 | 2794 | 2438 | 2389 |
| 2 ... write back | 4566 | 4319 | 2759 | 2789 |
Table 2. average throughput (registrations per second) with pthread mutexes
| db_mode | 1 process | 2 processes | 4 processes | 8 processes |
|---|---|---|---|---|
| 0 ... in memory | 9824 | 9845 | 9435 | 9301 |
| 1 ... write through | 3466 | 3254 | 3298 | 3182 |
| 2 ... write back | 5105 | 5133 | 5121 | 5096 |
Note that the context switching rate reported by sar for 4/8 processes is much lower when using pthread mutexes (80k switches/s in db_mode 0) than in the case of default locking (200-250k switches/s in db_mode 0).
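The difference between Table 1 and Table 2 is easiest to see as ratios. The following sketch, with illustrative names and the table values copied as-is, computes the pthread-to-default throughput ratio per process count and how much of the single-process throughput each locking variant retains at 8 processes.

```python
processes = (1, 2, 4, 8)

# Average throughput (registrations per second), keyed by db_mode.
default_locking = {
    0: (9898, 9410, 8548, 8191),
    1: (3490, 2794, 2438, 2389),
    2: (4566, 4319, 2759, 2789),
}
pthread_mutexes = {
    0: (9824, 9845, 9435, 9301),
    1: (3466, 3254, 3298, 3182),
    2: (5105, 5133, 5121, 5096),
}


def pthread_gain(db_mode, n_processes):
    """Throughput with pthread mutexes divided by throughput with default locking."""
    idx = processes.index(n_processes)
    return pthread_mutexes[db_mode][idx] / default_locking[db_mode][idx]


def retained(table, db_mode):
    """Throughput at 8 processes as a fraction of the 1-process throughput."""
    row = table[db_mode]
    return row[-1] / row[0]


for db_mode in (0, 1, 2):
    gains = ", ".join(f"{pthread_gain(db_mode, n):.2f}" for n in processes)
    print(f"db_mode {db_mode}: pthread/default for 1, 2, 4, 8 processes = {gains}")
    print(f"  retained at 8 processes: default {retained(default_locking, db_mode):.0%}, "
          f"pthread {retained(pthread_mutexes, db_mode):.0%}")
```

At 8 processes the ratios come out at roughly 1.14, 1.33, and 1.83 for db_mode 0, 1, and 2, while with a single process the two locking variants are nearly identical in db_mode 0 and 1.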