Minimum-Impact First: Scheduling Virtual Machines Under Maintenance Scenarios
Original version
Yazidi A, Haugerud H, Ung F, Begnum KM: Minimum-Impact First: Scheduling Virtual Machines Under Maintenance Scenarios. In: NN N. 11th ACM International Conference on Management of Digital EcoSystems, 2019. ACM Publications https://dx.doi.org/10.1145/3297662.3365831
Abstract
Virtual Machine (VM) migration is an important feature for ensuring smooth operations during maintenance and disaster recovery scenarios. The migration might be inter-site, in which case the inter-site link, typically a Wide Area Network (WAN), might become a bottleneck. The available bandwidth is then affected by the amount of inter-VM traffic that becomes separated during the migration process. The separated traffic might not only degrade the Quality of Service (QoS) of inter-communicating VMs but can also delay the migration process by congesting the migration link. The state-of-the-art algorithm due to Yazidi et al. is an affinity-aware algorithm that does not consider the completion time of the migration. The first stage of our algorithm is identical to that of Yazidi et al.: we resort to graph partitioning theory in order to partition the VMs into groups with high intra-group communication. In the second stage, we devise a greedy algorithm, which we denominate Minimum-Impact First (MIF), that controls the order of the migration groups by considering their inter-group traffic and greedily selecting the groups with the lowest impact in terms of volume of separated traffic. We also design a latency-aware algorithm that simply schedules the quickest migration first. Interestingly, this simple heuristic outperforms legacy works in the case of migration over a non-dedicated link. Using real traffic traces, we find that our MIF algorithm consistently outperforms the state-of-the-art algorithms by a margin larger than 40%. We show that the MIF algorithm ensures the lowest amount of separated traffic in both dedicated-link and non-dedicated-link scenarios.
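The greedy second stage can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not define the impact metric precisely, so the sketch assumes that the impact of a migration step is the total traffic crossing between groups already migrated and groups still at the source site. The function names `separated_traffic` and `minimum_impact_first`, and the dictionary-based traffic representation, are hypothetical.

```python
def separated_traffic(migrated, traffic):
    """Sum the traffic between groups on opposite sides of the migration link.

    `traffic` maps an unordered pair of group names (a, b) to a traffic volume.
    A pair counts as separated when exactly one of its groups has been migrated.
    """
    total = 0.0
    for (a, b), volume in traffic.items():
        if (a in migrated) != (b in migrated):
            total += volume
    return total


def minimum_impact_first(groups, traffic):
    """Order groups greedily by lowest separated traffic (assumed impact metric).

    At each step, pick the remaining group whose migration leaves the smallest
    volume of separated traffic, given the groups migrated so far.
    """
    migrated = set()
    order = []
    remaining = list(groups)
    while remaining:
        best = min(
            remaining,
            key=lambda g: separated_traffic(migrated | {g}, traffic),
        )
        order.append(best)
        migrated.add(best)
        remaining.remove(best)
    return order


if __name__ == "__main__":
    # Hypothetical inter-group traffic volumes, for illustration only.
    demo_traffic = {("A", "B"): 10, ("B", "C"): 1, ("A", "C"): 2}
    print(minimum_impact_first(["A", "B", "C"], demo_traffic))
```

In this toy input, group C exchanges little traffic with the others, so the sketch migrates it first; A and B, which communicate heavily, stay together at the source for as long as possible. The first stage (graph partitioning into groups) is assumed to have been done already and is not shown.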