Predicting performance and scaling behaviour in a data center with multiple application servers
Abstract
As web pages become more user-friendly and interactive, objects such as
pictures, media files, CGI scripts and databases are used more frequently.
This development places increased stress on the servers, both through heavier
CPU usage and through a growing need for bandwidth to serve the content. At
the same time, users expect low latency and high availability. This tension
can be resolved by load balancing between the servers serving content to the
clients: load balancing provides high availability through redundant server
configurations, and reduces latency by dividing the load.
This paper describes a comparative study of load balancing algorithms used
to distribute packets among a set of identical web servers serving HTTP
content. A Nortel Application Switch 2208 performs the packet redirection,
and the servers are hosted on six IBM blade servers. We compare three
algorithms: Round Robin, Least Connected and Response Time, and examine
properties such as response time, traffic intensity and traffic type.
type. How will these algorithms perform when these variables change with
time. If we can find correlations between traffic intensity and efficiency of the
algorithms, we might be able to deduce a theoretical suggestion on how to
create an adaptive load balancing scheme that uses current traffic intensity to
select the appropriate algorithm. We will also see how classical queueing algorithms
can be used to calculate expected response times, and whether these
numbers conform to the experimental results. Our results indicate that there
are measurable differences between load balancing algorithms. We also found
the performance of our servers to outperform the queueing models in most of
the scenarios.
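As a hedged illustration of the kind of queueing prediction referred to above: if each of the c servers is modelled as an independent M/M/1 queue receiving an equal share λ/c of the total arrival rate, the expected response time is W = 1/(μ − λ/c). The numbers below are made up for illustration, not measurements from this study:

```python
def mm1_response_time(lam, mu, c=1):
    """Expected response time (waiting + service) when a total arrival rate
    'lam' (requests/s) is split evenly across c independent M/M/1 queues,
    each with service rate 'mu': W = 1 / (mu - lam/c)."""
    per_server = lam / c
    if per_server >= mu:
        raise ValueError("unstable queue: per-server arrivals >= service rate")
    return 1.0 / (mu - per_server)

# Illustrative numbers only: 600 req/s spread over 6 servers that each
# serve 120 req/s gives W = 1/(120 - 100) = 0.05 s, i.e. 50 ms.
print(mm1_response_time(lam=600, mu=120, c=6))
```

Comparing such predicted values of W against measured response times is one way to check, scenario by scenario, whether the servers beat or fall short of the model.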
Description
Master's programme in Network and System Administration
Publisher
Høgskolen i Oslo, Avdeling for ingeniørutdanning; Universitetet i Oslo