Predicting performance and scaling behaviour in a data center with multiple application servers
Abstract
As web pages become more user-friendly and interactive, objects such as
pictures, media files, CGI scripts and databases are used more frequently.
This development places increased stress on the servers, both through heavier
CPU usage and through a growing need for bandwidth to serve the content. At
the same time, users expect low latency and high availability. This tension
can be resolved by load balancing between the servers serving content to the
clients: load balancing provides high availability through redundant server
configurations, and reduces latency by dividing the load.
This paper describes a comparative study of load balancing algorithms used
to distribute packets among a set of identical web servers serving HTTP
content. A Nortel Application Switch 2208 performs the packet redirection,
and the servers are hosted on six IBM blade servers. We compare three
algorithms: Round Robin, Least Connected and Response Time, and examine
properties such as response time, traffic intensity and traffic type.
type. How will these algorithms perform when these variables change with
time. If we can find correlations between traffic intensity and efficiency of the
algorithms, we might be able to deduce a theoretical suggestion on how to
create an adaptive load balancing scheme that uses current traffic intensity to
select the appropriate algorithm. We will also see how classical queueing algorithms
can be used to calculate expected response times, and whether these
numbers conform to the experimental results. Our results indicate that there
are measurable differences between load balancing algorithms. We also found
the performance of our servers to outperform the queueing models in most of
the scenarios.
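As a hedged illustration of the kind of queueing prediction referred to above: if each of the c servers is modelled as an independent M/M/1 queue receiving an equal share λ/c of the total arrival rate, the expected response time is W = 1/(μ − λ/c). The numbers below are made up for illustration, not measurements from this study:

```python
def mm1_response_time(lam, mu, c=1):
    """Expected response time (waiting + service) when a total arrival rate
    'lam' (requests/s) is split evenly across c independent M/M/1 queues,
    each with service rate 'mu': W = 1 / (mu - lam/c)."""
    per_server = lam / c
    if per_server >= mu:
        raise ValueError("unstable queue: per-server arrivals >= service rate")
    return 1.0 / (mu - per_server)

# Illustrative numbers only: 600 req/s spread over 6 servers that each
# serve 120 req/s gives W = 1/(120 - 100) = 0.05 s, i.e. 50 ms.
print(mm1_response_time(lam=600, mu=120, c=6))
```

Comparing such predicted values of W against measured response times is one way to check, scenario by scenario, whether the servers beat or fall short of the model.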
Description
Master's programme in Network and System Administration
Publisher
Høgskolen i Oslo, Avdeling for ingeniørutdanning; Universitetet i Oslo