Vaclav's Performance Tests

Keywords: 2.0.x | core

Ottendorf tests

Here are some Ottendorf performance tests. The page will be updated according to progress in testing.

Registrar with database tuning. (updated on 25th May 2007)

Memory allocation methods

Many people wonder why non-standard memory allocation methods are used within SER. I ran some tests which show a significant difference between standard malloc/free and SER's allocation methods. (Memory management is one of the most critical parts for SIP performance.)

Note: be aware that SHM memory operations do locking and thus are always slower than operations on PKG memory.

The test can show different results on different machines. The results published here were obtained on a machine with one hyperthreaded Intel CPU at 3 GHz and 8 GB of RAM.

The test measured the run time of 50 repetitions of 1 million allocations of memory blocks of random size (between 3 and 1666 bytes; the sizes were the same for all tests). At most 599 blocks were allocated at the same time (for example, the 600th block was allocated after the first one was freed, and so on). All remaining blocks were freed before the end of the test.

Configuration                                                          Run time
standalone program with malloc/free                                   11 475 ms
PKG mem + QM_MALLOC                                                  7 119.1 ms
PKG mem + QM_MALLOC + QM_JOIN_FREE                                  11 236.8 ms
PKG mem + F_MALLOC                                                   4 851.3 ms
SHM mem + QM_MALLOC                                                 12 610.5 ms
SHM mem + QM_MALLOC + QM_JOIN_FREE                                  17 023.3 ms
SHM mem + F_MALLOC                                                  10 404.2 ms
malloc/free called within SER (same conditions as the tests above)  11 535.4 ms

The results come from a single run (not an average or anything like that); the test was run several times and the numbers were similar.

The code used for testing is here. You can unpack it and look at the run_test.sh script to see what it does. Please, if you find a bug there, send me an email (vaclav.kubart@iptel.org).
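As a rough illustration of the measurement loop described above (this is not the code from the archive), here is a minimal sketch for the standalone malloc/free case only. It keeps a ring of 599 slots, so the 600th allocation reuses the slot of the first block after freeing it. The names run_pass, time_passes and next_rand are made up for this sketch, and the small fixed-seed generator is just an assumption about how the same sizes could be replayed for every allocator.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

#define MAX_LIVE 599   /* at most this many blocks allocated at once */
#define MIN_SIZE 3     /* smallest block size in bytes */
#define MAX_SIZE 1666  /* largest block size in bytes */

/* Small deterministic generator (hypothetical helper): the same seed
 * replays the same sequence of sizes for every run. */
static unsigned long next_rand(unsigned long *state)
{
    *state = *state * 1103515245UL + 12345UL;
    return (*state >> 16) & 0x7fffUL;
}

/* One pass: n_allocs allocations through a ring of MAX_LIVE slots.
 * Returns the peak number of live blocks, or -1 if malloc failed. */
static long run_pass(long n_allocs, unsigned long seed)
{
    void *ring[MAX_LIVE] = {0};
    long live = 0, peak = 0, i;

    for (i = 0; i < n_allocs; i++) {
        long slot = i % MAX_LIVE;
        size_t size = MIN_SIZE + next_rand(&seed) % (MAX_SIZE - MIN_SIZE + 1);

        if (ring[slot] != NULL) {   /* free the oldest block first */
            free(ring[slot]);
            live--;
        }
        ring[slot] = malloc(size);
        if (ring[slot] == NULL) {   /* out of memory: clean up and give up */
            for (slot = 0; slot < MAX_LIVE; slot++)
                free(ring[slot]);
            return -1;
        }
        /* touch the block so the allocation cannot be optimized away */
        ((unsigned char *)ring[slot])[0] = (unsigned char)size;
        live++;
        if (live > peak)
            peak = live;
    }

    /* free all remaining blocks before the end of the pass */
    for (i = 0; i < MAX_LIVE; i++)
        free(ring[i]);

    assert(peak <= MAX_LIVE);
    return peak;
}

/* Runs n_passes passes and returns the CPU time spent in milliseconds,
 * or -1.0 if any pass failed. */
static double time_passes(int n_passes, long n_allocs, unsigned long seed)
{
    clock_t start = clock();
    int p;

    for (p = 0; p < n_passes; p++) {
        if (run_pass(n_allocs, seed) < 0)
            return -1.0;
    }
    return 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Calling time_passes(50, 1000000L, 12345UL) from main and printing the result would correspond to the standalone row of the test. The timings will differ from those published here, and the PKG and SHM variants need SER's own allocators, which this sketch does not touch.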

The results show that QM_MALLOC + QM_JOIN_FREE (which should be used because of fragmentation) takes about the same time as malloc/free. F_MALLOC is clearly faster, but I don't know its "fragmentation behaviour". It would be nice to have a way to say "defragment memory now" (via XML-RPC or triggered by a timer) to eliminate the need for QM_JOIN_FREE.

Note: to compare which method is better we will need to do more fragmentation-related tests; I don't know much about how the standard functions and the F_MALLOC ones behave in this respect. With QM_MALLOC it seems to me that, after running for some time on similar data, the fragmentation reaches some value and then stays there...

Andrei's note: One note on fragmentation: without QM_JOIN_FREE fragmentation will happen, but as you pointed out it should stop at some constant value. There is some code which makes sure that very small fragments are not created (this was a problem in the past, especially with realloc(smaller_size)), and so the fragments will be reused later, making future allocations even faster. You can check that by measuring performance immediately after SER is started, then leaving SER under heavy traffic for some time and re-measuring later. The performance after the "warm-up" will be a little better.

Important note: After some investigation I have found that malloc/free in the libc library are most likely synchronized (because they are thread-safe), and thus malloc/free might be faster than QM malloc/free! This needs to be investigated more deeply!

Transaction proxy performance, SER vs. OpenSER

These tests are now out of date: they were done before the Ottendorf branch was created. They need to be done again with new versions of SER and OpenSER.

Temporary link directly to tests. Add info about the various tests here.
