
dc.contributor.author: Tamrakar, Kabin
dc.contributor.author: Yazidi, Anis
dc.contributor.author: Haugerud, Hårek
dc.date.accessioned: 2018-02-04T16:02:21Z
dc.date.accessioned: 2018-08-06T08:13:44Z
dc.date.available: 2018-02-04T16:02:21Z
dc.date.available: 2018-08-06T08:13:44Z
dc.date.issued: 2017
dc.identifier.citation: Tamrakar K, Yazidi A, Haugerud H. Cost efficient batch processing in amazon cloud with deadline awareness. Advanced Information Networking and Applications. 2017:963-971 [en]
dc.identifier.issn: 1550-445X
dc.identifier.uri: https://hdl.handle.net/10642/6023
dc.description.abstract: Amazon spot instances have become a very popular alternative for cost saving in the cloud. Spot instances are prone to abrupt termination whenever the spot market price exceeds the bid price. In this paper, spot instances are used in the task instance group of an Amazon Elastic Map Reduce (EMR) cluster to process batch jobs with a deadline. Amazon EMR makes it convenient to process Big Data with the aid of the Hadoop framework. However, the intermediate results processed in the task nodes of the cluster are lost if the spot instances are terminated, which can delay processing. Cost efficiency can be realized by exploiting the non-real-time nature of batch computing for Big Data. Two algorithms are devised for achieving cost-efficient processing in Hadoop MapReduce. Both algorithms process data in divisions such that abrupt termination of spot instances only affects the last division. Based on monitoring the progress at given checkpoints, the task group's capacity is resized to complete the processing within the deadline. Progress is measured in terms of the number of completed work divisions. The first algorithm begins with an initially estimated number of spot instances. To complete processing of all data in time, on-demand instances are deployed after a certain threshold time. The second algorithm starts with a higher number of spot instances than required to complete the work within the given deadline. It therefore has a higher chance of relying solely on spot instances during the whole execution of the batch job. On-demand instances are deployed only in case of slow progress caused by termination of the spot instances combined with subsequent unsuccessful bids. The experiments show that both algorithms are able to minimize the processing cost. The second algorithm further reduces the cost in most cases. [en]
dc.language.iso: en [en]
dc.publisher: Institute of Electrical and Electronics Engineers [en]
dc.relation.ispartofseries: Advanced Information Networking and Applications;2017 IEEE 31st International Conference on Advanced Information Networking and Applications (AINA)
dc.rights: © 2017 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works. [en]
dc.subject: Batch processing [en]
dc.subject: Spot instances [en]
dc.subject: Elastic Map Reduce [en]
dc.subject: Deadline awareness [en]
dc.title: Cost efficient batch processing in amazon cloud with deadline awareness [en]
dc.type: Journal article [en]
dc.type: Peer reviewed [en]
dc.date.updated: 2018-02-04T16:02:21Z
dc.description.version: acceptedVersion [en]
dc.identifier.doi: https://dx.doi.org//10.1109/AINA.2017.170
dc.identifier.cristin: 1509342
dc.source.journal: Advanced Information Networking and Applications
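
The checkpoint-based resizing described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes that each instance completes work divisions at a constant rate, and every name here (`Checkpoint`, `required_instances`, `on_demand_to_add`, `initial_spot_count`, `threshold_hours`, `overprovision`) is hypothetical. A positive `threshold_hours` loosely mirrors the first algorithm, which holds back on-demand instances until a threshold time; an `overprovision` factor above 1 loosely mirrors the second, which starts with more spot instances than strictly required.

```python
import math
from dataclasses import dataclass


@dataclass
class Checkpoint:
    elapsed_hours: float       # time since the batch job started
    completed_divisions: int   # progress, counted in finished work divisions
    running_spot: int          # spot instances currently alive in the task group


def required_instances(remaining_divisions, hours_left, rate):
    """Smallest instance count that finishes the remaining divisions in time.

    `rate` is an assumed constant: divisions completed per instance per hour.
    """
    if remaining_divisions <= 0:
        return 0
    if hours_left <= 0:
        raise ValueError("deadline has already passed")
    return math.ceil(remaining_divisions / (rate * hours_left))


def initial_spot_count(total_divisions, deadline_hours, rate, overprovision=1.0):
    """Spot instances to request at start; overprovision > 1 adds headroom."""
    base = required_instances(total_divisions, deadline_hours, rate)
    return math.ceil(base * overprovision)


def on_demand_to_add(cp, total_divisions, deadline_hours, rate, threshold_hours=0.0):
    """On-demand instances to add at a checkpoint to stay within the deadline.

    Before `threshold_hours` nothing is added; afterwards the gap between the
    required capacity and the surviving spot capacity is filled.
    """
    if cp.elapsed_hours < threshold_hours:
        return 0
    remaining = total_divisions - cp.completed_divisions
    needed = required_instances(remaining, deadline_hours - cp.elapsed_hours, rate)
    return max(0, needed - cp.running_spot)
```

For example, with 100 divisions, a 10-hour deadline, and an assumed rate of 2 divisions per instance-hour, a checkpoint at hour 5 with 30 divisions done and 3 spot instances alive leaves 70 divisions for 5 hours, so 7 instances are needed and 4 on-demand instances would be added.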

