A study of modern cluster-based high availability database solutions
Abstract
High availability, a top concern in cloud computing, refers to keeping a service
operational for the maximum amount of time. In any application or service, the
availability of the database plays a crucial role in overall application availability. For
database availability, master-slave replication has long been used as a high
availability solution. In traditional master-slave clustering technology, write queries are
routed to the master server, and the data change on the master database is propagated to
one or multiple slaves asynchronously or semi-synchronously. One problem with this
approach is data inconsistency when a transaction is still in progress. Data loss can occur
if the master server goes down before it has sent its changes to the slaves. Because the
slaves are not synchronized with their master, replication delays can leave them holding
stale data. In
recent years, modern multi-master clustering technologies have been made available that
feature synchronous replication. This report investigates the high availability clustering
solutions available today, describes their features, and compares their similarities and
differences through qualitative analysis.
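The stale-read and data-loss risks of asynchronous master-slave replication described above can be illustrated with a small simulation. This is only a toy sketch under simplifying assumptions: the class and method names (`Node`, `AsyncMasterSlave`, `fail_master`) are hypothetical, nodes are plain in-memory dictionaries, and nothing here reflects how MySQL replication is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy database node that keeps its committed rows in a dict."""
    name: str
    rows: dict = field(default_factory=dict)


class AsyncMasterSlave:
    """Toy asynchronous replication: the master commits first, slaves catch up later."""

    def __init__(self, slave_count: int = 2):
        self.master = Node("master")
        self.slaves = [Node(f"slave{i}") for i in range(slave_count)]
        self.backlog = []  # changes committed on the master but not yet shipped

    def write(self, key, value):
        # The write is acknowledged as soon as the master commits it.
        self.master.rows[key] = value
        self.backlog.append((key, value))

    def replicate(self, max_changes=None):
        # Ship up to max_changes pending changes to every slave.
        count = len(self.backlog) if max_changes is None else max_changes
        shipped, self.backlog = self.backlog[:count], self.backlog[count:]
        for slave in self.slaves:
            for key, value in shipped:
                slave.rows[key] = value

    def fail_master(self):
        # Promote the first slave; changes still in the backlog are lost.
        lost = dict(self.backlog)
        self.backlog = []
        self.master = self.slaves.pop(0)
        return lost


def demo():
    cluster = AsyncMasterSlave()
    cluster.write("a", 1)
    cluster.write("b", 2)
    cluster.replicate(max_changes=1)  # only "a" reaches the slaves
    stale_read = cluster.slaves[0].rows.get("b")  # the slave has not seen "b" yet
    lost = cluster.fail_master()
    return stale_read, lost, cluster.master.rows
```

In `demo()`, the read of `"b"` from a slave returns nothing because the change has not been shipped yet, and the same change is lost when the master fails before replicating it. A synchronous scheme would avoid this by acknowledging a write only after the other nodes have received it.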
This thesis aims to evaluate high availability clustering technologies based on their
performance and availability. The comparison is made through practical implementation
and benchmark tests run in the OpenStack Alto cloud. The high availability database
solutions studied are Percona XtraDB Cluster and MySQL NDB Cluster.
Percona XtraDB Cluster is chosen for its key characteristics: high performance and
scalability, mitigation of downtime and data loss, and the ability to perform writes on any
node in the cluster. It uses the XtraDB storage engine, an enhancement of the InnoDB
storage engine that offers performance improvements and compatibility with MySQL.
Moreover, it is a free and open-source high availability clustering solution for MySQL
(Percona, 2022). MySQL NDB Cluster is
another high-availability synchronous multi-master technology. It is a generally available,
free, and open-source solution that features horizontal scaling and automatic table
partitioning, and it uses the NDB storage engine, which is designed to deliver high
performance and minimize downtime (MySQL, 2022c). Both solutions are studied
qualitatively and quantitatively. Their performance is evaluated using the latest version
of MySQL server available at the time of writing. Experiments are carried out on the
OpenStack cloud platform. An experimental setup is built for both high availability
solutions, and their performance is analyzed and compared using the metrics throughput,
response time, CPU usage, and memory usage.
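As an illustration of two of the metrics named above, throughput and response time can be derived from raw per-request timings roughly as follows. This is a minimal sketch, not the benchmark tooling used in this work: the function name `summarize`, its parameters, and the choice of the nearest-rank 95th percentile are assumptions made for the example.

```python
import math
import statistics


def summarize(latencies_ms, wall_clock_s):
    """Summarize one benchmark run (hypothetical helper).

    latencies_ms: response time of each completed request, in milliseconds.
    wall_clock_s: total duration of the run, in seconds.
    """
    if not latencies_ms or wall_clock_s <= 0:
        raise ValueError("need at least one sample and a positive duration")
    ordered = sorted(latencies_ms)
    # Nearest-rank percentile: the smallest sample with at least 95% of
    # all samples at or below it.
    rank = math.ceil(0.95 * len(ordered))
    return {
        "throughput_per_s": len(ordered) / wall_clock_s,
        "mean_ms": statistics.fmean(ordered),
        "p95_ms": ordered[rank - 1],
    }
```

Reporting a high percentile alongside the mean is a common choice because a few slow requests can dominate perceived latency while barely moving the average.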
The results show that, in failover scenarios, both HA solutions handled node failures by
keeping the service operational and resynchronizing data across nodes after recovery,
thereby maintaining data consistency. The benchmarking results show that Percona
XtraDB Cluster performs better in the read-only tests, whereas MySQL NDB Cluster
performs better in the update and delete tests. For the read-write
test, both setups show broadly similar performance. Where low latency is required,
MySQL NDB Cluster is an effective solution for write-intensive workloads. Where latency
requirements are less strict and high read throughput is needed, Percona XtraDB Cluster
is the more effective solution. The comparison framework and methods used in this
report provide insight into which high availability solution is effective for a particular
application and can help practitioners pick the right solution for their use cases.