Ongoing discussion
No dependencies.
Split userbase, SIP proxies as balancers, standby nodes
Contributions by Andreas Granig, Andrei Pelinescu-Onciul, Greger V. Teigre
Description
Use SIP proxies as balancers that route requests to the appropriate proxies/registrars. Each proxy/registrar hosts only a subset of the overall user-base, and the balancers know which proxy/registrar is responsible for which users (e.g. by hashing the r-uri).
Variations
- Replicate across servers using SIP (t_replicate or tcp_forward)
- Replicate across servers using a low-cost, reliable replication protocol (proprietary)
- Use Linux HA to create advanced failover between active and passive nodes
- Run two SER daemons on each node, one active and one passive, thus creating two active-passive pairs on two physical servers
- Let each node only have usrloc in memory and replicate between active and passive using t_replicate
Pros
- No need to share the location table across active registrars, thus no replication is needed between active nodes
Cons
- Redundancy requires a standby node, so the location table must be synchronized between the active and passive nodes, e.g. through database replication
- Expensive, since every active node needs a duplicate standby node
- Needs changes in the provisioning work-flow if a new proxy/registrar pair is added
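As a rough illustration of how a balancer might map users to registrars by hashing the r-uri, the following Python sketch normalizes the request URI to a user@host key and picks a registrar from a fixed list. The registrar addresses, the function names, and the choice of SHA-1 with a modulo are assumptions made for this example, not something SER provides. IPv6 host literals are not handled.

```python
import hashlib

# Hypothetical registrar addresses, used only for illustration.
REGISTRARS = [
    "sip:reg1.example.com",
    "sip:reg2.example.com",
    "sip:reg3.example.com",
]


def user_key(r_uri):
    """Reduce a request URI to a normalized user@host key."""
    uri = r_uri.strip()
    # Strip the scheme if one is present.
    if ":" in uri and uri.split(":", 1)[0].lower() in ("sip", "sips"):
        uri = uri.split(":", 1)[1]
    # Drop URI parameters and headers so they do not affect the hash.
    uri = uri.split(";", 1)[0].split("?", 1)[0]
    user, _, host = uri.rpartition("@")
    # Host names are case-insensitive; the port is irrelevant for the user.
    host = host.split(":", 1)[0].lower()
    return f"{user}@{host}" if user else host


def pick_registrar(r_uri, registrars=REGISTRARS):
    """Map a request URI to one registrar, always the same one per user."""
    digest = hashlib.sha1(user_key(r_uri).encode("utf-8")).digest()
    index = int.from_bytes(digest[:4], "big") % len(registrars)
    return registrars[index]
```

With plain modulo hashing like this, adding a registrar changes the mapping for most users, which is one way to see why the provisioning work-flow is affected when a new proxy/registrar pair is added.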
Shared userbase, network or DNS load balancing, no standby nodes
Contributions by Andreas Granig, Andrei Pelinescu-Onciul, Greger V. Teigre
Description
Share a location table across all proxies/registrars, e.g. by using a database cluster. This way, a load balancer could route requests to any of the proxies, because each node knows the location of each user.
Variations
- Each SER is connected to a single MySQL DB cluster, but usrloc is also kept in memory (cacheless usrloc is not used). Replication is done between the SER servers, and save_memory() is used to store the location only in memory (the first registrar updates the cluster with save())
- Use the Path header to know which registrar has the direct network connection with the user agent, and thus be able to traverse NAT (and keep TCP connections open). Since the path is stored in the location table, every proxy can route the message via the right path. NATed UACs should communicate with an external SIP load balancer. Every proxy has to route calls towards the UAC via this load balancer (using the Path header). The load balancer also has to be set up in a redundant way (active/backup), but since it only relays SIP messages back and forth, it can handle way more calls than the SIP proxy "work horses"
- Simulate Path by using multiple location tables in the database cluster, then create logic in the ser.cfg files to store the location in the correct table and to look up the registrar in each of the location tables
- As above, simulate Path with multiple location tables, but do not use a database cluster for all the location tables; instead, let each registrar keep one location table for each registrar
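The "simulated Path" variations can be pictured with a small in-memory model: each registrar writes registrations into its own location table, and a lookup walks all tables to find which registrar holds the connection to the user agent. This is only a sketch in Python. The class, the registrar names, and the dictionary-based tables are hypothetical stand-ins for the database tables and ser.cfg logic mentioned above.

```python
class LocationStore:
    """One location table per registrar, keyed by address-of-record."""

    def __init__(self, registrar_ids):
        # Hypothetical registrar identifiers; one table is created for each.
        self.tables = {rid: {} for rid in registrar_ids}

    def save(self, registrar_id, aor, contact):
        """Store a binding in the table of the registrar that received it."""
        self.tables[registrar_id][aor] = contact

    def lookup(self, aor):
        """Search every table; return (registrar, contact) for each hit."""
        hits = []
        for registrar_id, table in self.tables.items():
            if aor in table:
                hits.append((registrar_id, table[aor]))
        return hits


store = LocationStore(["reg1", "reg2"])
store.save("reg2", "alice@example.com", "sip:alice@10.0.0.5:5060")
# The table that matched tells the proxy which registrar to route through.
routes = store.lookup("alice@example.com")
```

The cost of this approach is visible in the lookup: one query per location table, so it grows with the number of registrars.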
Pros
- No need to make sure requests are directed to the right node, as all nodes can serve requests to any users
- No need for standby nodes, because the remaining nodes take over the load of the failed node
- All hardware is in use, reducing costs
- Maintenance is easier, as all nodes have equal configurations
- Complexity is lower because of equal configurations and no routing needed between nodes, thus troubleshooting is easier
Cons
- Needs a cacheless usrloc, i.e. no memory caching of locations, thus all requests result in a database query
- Requires hardware and competence for database cluster operations
- Requires load balancing, either with a SIP Call-Id aware load balancer or by relying on user agents' DNS SRV implementations
- Can have performance bottlenecks in setups with many database writes; tuning the cluster settings is necessary for reasonable performance
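A Call-Id aware load balancer, as mentioned in the list above, needs to send all messages carrying the same Call-Id to the same node while still coping with node failures. The Python sketch below uses rendezvous (highest-random-weight) hashing for this. That technique is chosen here for illustration and is not something the contributors prescribe, and the node names are hypothetical.

```python
import hashlib


def score(node, call_id):
    """Deterministic pseudo-random weight for a (node, Call-Id) pair."""
    return hashlib.sha1(f"{node}|{call_id}".encode("utf-8")).digest()


def pick_node(call_id, alive_nodes):
    """Pick the alive node with the highest score for this Call-Id."""
    if not alive_nodes:
        raise ValueError("no nodes available")
    return max(alive_nodes, key=lambda node: score(node, call_id))


# Hypothetical node names, for illustration only.
nodes = ["proxy-a", "proxy-b", "proxy-c"]
chosen = pick_node("a84b4c76e66710@pc33.example.com", nodes)
```

If a node other than the chosen one fails, the call stays where it is; only calls that were assigned to the failed node move to the remaining nodes, which matches the "remaining nodes take over" behaviour listed under Pros.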
Jiri's Design Suggestions
Contribution by Jiri Kuthan -- solely esthetical opinion which may or may not be shared by others
- Never trust the database. If you can, get finished on the application/SER side and don't let the traffic go to the database. If there is some annoyance, better to stop it at the gate than in the backyard. Specifically, I like to keep usrloc isolated.
- "Horizontal partitioning" is good. Then you have all the data you need at the place where you are -- it is fast and fail-safe.
- Multi-tier clustering with three levels (IP load balancing, SIP load balancing, and the eventual 'work horses') seems a good (simple and reliable) division of work to me.