
Reducing Checkpoint/Restart Overhead using Near Data Processing for Exascale System

89 Pages·2017·1.82 MB·English


ABSTRACT

AGRAWAL, ABHINAV RAJIV. Reducing Checkpoint/Restart Overhead using Near Data Processing for Exascale System. (Under the direction of James Tuck.)

With increasing size and complexity of high-performance computing (HPC) systems to achieve exascale performance, the system mean time to interrupt (system MTTI) is projected to decrease. To maintain the performance efficiency of the system, checkpoints need to be stored at a faster rate when using checkpoint/restart for mitigation. In addition it requires a lower checkpoint commit and restore time. The lower checkpoint commit and restore time requirement is aggravated by the increasing checkpoint-size to IO-bandwidth ratio. To overcome this, prior works have proposed multilevel (hierarchical) checkpoint schemes that involve frequent checkpoint writes to faster node-local storage with occasional writes to slower global I/O-based storage (e.g., disk). However, due to increasing cost of writing/reading checkpoints to/from global I/O based storage, this technique may not scale well with systems approaching exaflops performance. While I/O or storage hierarchy alleviates the performance cost by reducing I/O access times (including for checkpoint/restart), moving large data between storage in different levels of hierarchy adds overhead. Near data processing (NDP) has been shown to be effective in reducing the amount of data movement in many applications by performing computations closer to data, thus reducing the overhead. In addition, offloading computations of some applications from the host processors to NDP has shown to improve performance. In this work we show how NDP can be leveraged to improve C/R performance. We propose offloading the process of writing checkpoints to global I/O from the main compute cores to NDP. We also explore opportunities for additional optimizations using NDP to further reduce checkpoint overheads. Overall, our approach eliminates the performance cost of writing checkpoints to I/O as these operations are performed by NDP.

We evaluate the performance of our novel application of NDP to reduce checkpoint/restart cost and compare it to existing checkpoint/restart optimizations. For two-level checkpoint schemes (i.e., checkpoints saved to local storage and remote I/O nodes), our evaluation for a projected exascale system shows that a baseline system (without NDP) spends nearly half its time writing checkpoints to I/O or restoring from a checkpoint or re-executing lost work. With NDP for offloading checkpoint management and compression, the host processor is able to increase its progress rate from 51% to 78% (i.e., a >50% speedup in the application performance).

We further explore how checkpoint compression can be combined with multilevel checkpointing. We perform a compression study and discuss the compression performance requirement for making it beneficial to add compression to all levels of multilevel checkpointing. We analyze the C/R performance and other benefits of this technique. Our data shows that multilevel checkpointing combined with compression at all levels improves the efficiency of a system with C/R to 73% compared to 35% for multilevel checkpointing without compression. The efficiency of multilevel checkpointing with compression is further improved to 89% when using NDP to offload certain C/R tasks.

Finally, we explore how the two approaches of compression at all levels of multilevel checkpointing and the use of NDP can be combined. Adding compression to all levels of multilevel checkpointing will result in compressed checkpoint data being available in local storage. Therefore the role and benefit of NDP for further checkpoint data compression before writing it to global storage is evaluated. In addition to evaluating the performance overhead, we also estimate the energy and hardware cost of the various C/R configurations we discussed. Our cost efficiency analysis shows that adding checkpoint compression to improve progress rate is a more efficient solution than increasing bandwidth of node local storage. We also show that a configuration that leverages NDP to offload the task of writing data to global I/O has higher cost efficiency than a configuration that performs checkpoint compression at each level of multilevel checkpointing.
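The speedup quoted in the abstract can be checked from the progress rates alone. The following is a minimal sketch, not taken from the dissertation: it assumes that progress rate is the fraction of wall-clock time spent on useful work, so that time to solution scales as the inverse of the progress rate. The function name `relative_speedup` is illustrative.

```python
def relative_speedup(rate_before: float, rate_after: float) -> float:
    """Fractional speedup implied by moving from one progress rate to another.

    Assumption: progress rate is the fraction of wall-clock time spent on
    useful work, so time to solution is proportional to 1 / progress_rate.
    """
    if not (0.0 < rate_before <= 1.0 and 0.0 < rate_after <= 1.0):
        raise ValueError("progress rates must lie in (0, 1]")
    return rate_after / rate_before - 1.0


# Progress rates and efficiencies quoted in the abstract.
ndp_offload = relative_speedup(0.51, 0.78)           # baseline vs. NDP offload
compression = relative_speedup(0.35, 0.73)           # multilevel vs. multilevel + compression
compression_plus_ndp = relative_speedup(0.73, 0.89)  # adding NDP on top of compression

for label, gain in [
    ("NDP offload (51% to 78%)", ndp_offload),
    ("Compression at all levels (35% to 73%)", compression),
    ("Compression plus NDP (73% to 89%)", compression_plus_ndp),
]:
    print(f"{label}: {gain:.1%} speedup")
```

Under this assumption, the 51% to 78% improvement works out to a speedup of roughly 53%, which agrees with the ">50%" stated in the abstract.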
© Copyright 2017 by Abhinav Rajiv Agrawal
All Rights Reserved

Reducing Checkpoint/Restart Overhead using Near Data Processing for Exascale System

by
Abhinav Rajiv Agrawal

A dissertation submitted to the Graduate Faculty of North Carolina State University in partial fulfillment of the requirements for the Degree of Doctor of Philosophy

Computer Engineering

Raleigh, North Carolina
2017

APPROVED BY:
Gregory Byrd
Eric Rotenberg
Frank Mueller
James Tuck, Chair of Advisory Committee

DEDICATION

To my parents - Rajni and Rajiv Agrawal.

ACKNOWLEDGEMENTS

This research was made possible due to support and guidance of many people - my advisor, research group members, collaborators, family and friends.

Foremost, I would like to express my sincere gratitude to my advisor Dr. James Tuck for his constant support during my Ph.D studies. I would like to thank him for his guidance and patience while mentoring me in my research work. I am grateful to Dr. Tuck for allowing me to work on my research with enough independence and flexibility.

I would like to thank my dissertation committee members: Dr. Gregory Byrd, Dr. Eric Rotenberg and Dr. Frank Mueller for their service on my committee as well as for their insightful comments, feedback and advice.

I would also like to thank Gabriel Loh for collaborating with me on this work and for his advice during my internship.

My sincere thanks also goes to Bagus Wibowo for helping with my research as well as for the many stimulating discussions and late nights before deadlines. Many thanks to my fellow labmates - Joonmoo Huh, Amro Awad, Hussein Elnawawy, Vinesh Srinivasan and Seunghee Shin. Thanks to Gayatri Powar for proofreading many paper and report drafts.

Lastly I would like to thank my parents for instilling in me the importance of education from a young age and supporting me throughout my academic journey. This accomplishment is as much theirs as it is mine.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES

Chapter 1 INTRODUCTION
  1.1 Overview
  1.2 Existing C/R Optimization Techniques
  1.3 Adding Checkpoint Compression to Multilevel Checkpointing
  1.4 Leveraging NDP to Improve C/R Efficiency
  1.5 Contributions
  1.6 Organization of This Thesis

Chapter 2 BACKGROUND AND RELATED WORK
  2.1 Checkpoint/Restart
    2.1.1 Coordinated Checkpoint/Restart
  2.2 Checkpoint/Restart Overhead
    2.2.1 Failure Rate
    2.2.2 Checkpoint Size
    2.2.3 Progress Rate or C/R Efficiency
  2.3 Checkpoint/Restart Optimization Techniques
    2.3.1 Increase Checkpoint Commit Bandwidth
    2.3.2 Reduce Checkpoint Data Size
  2.4 Near Data Processing

Chapter 3 SCALING STUDY
  3.1 Overview
  3.2 Exascale System Projection
  3.3 MTTI Projection
  3.4 Checkpoint/Restart Overhead with no Optimization

Chapter 4 MULTILEVEL CHECKPOINTING WITH COMPRESSION
  4.1 Introduction
    4.1.1 Overview
    4.1.2 Multilevel Checkpointing
    4.1.3 Adding Checkpoint Compression to Multilevel C/R
  4.2 Compression Study
    4.2.1 Tools and Methodology
    4.2.2 Checkpoint Compression Speed And Factor
    4.2.3 Selecting Utility for Checkpoint Compression
  4.3 Evaluation
    4.3.1 Methodology
    4.3.2 Checkpoint/Restart Overhead Components
    4.3.3 Progress Rate Comparison
    4.3.4 C/R Overhead Breakdown (by Local and I/O Level)
  4.4 Summary

Chapter 5 LEVERAGING NDP FOR CHECKPOINT/RESTART
  5.1 Compute Node with NDP
    5.1.1 Operation of Multilevel Checkpointing with NDP
    5.1.2 NDP for Checkpoint Data Compression
  5.2 NDP Performance Requirements
    5.2.1 Configuring NDP for Compression
  5.3 Evaluation
    5.3.1 Methodology
    5.3.2 Checkpoint/Restart Overhead Components
    5.3.3 Progress Rate Comparison
    5.3.4 C/R Overhead - Breakdown (4% I/O Recovery)
    5.3.5 C/R Overhead - Sensitivity Study
  5.4 Summary

Chapter 6 PERFORMANCE, POWER AND COST ANALYSIS FOR COMBINATION OF CHECKPOINT/RESTART OPTIMIZATIONS
  6.1 Introduction
  6.2 Compression Study
    6.2.1 Tools and Methodology
    6.2.2 Data: Compression Speed and Factor
    6.2.3 Selecting Utility for Checkpoint Compression using NDP
  6.3 Performance Evaluation
    6.3.1 Methodology
    6.3.2 Progress Rate Comparison
    6.3.3 C/R Overhead - Breakdown (15% I/O Recovery)
  6.4 Methodology - Cost Analysis
    6.4.1 Energy Cost
    6.4.2 Hardware Cost
  6.5 Results - Cost Analysis
    6.5.1 Absolute Cost Breakdown
    6.5.2 Cost Performance Ratio

Chapter 7 CONCLUSION

BIBLIOGRAPHY

LIST OF TABLES

Table 3.1 Exascale system projection scaled from the Titan Cray XK7 supercomputer.

Table 4.1 Checkpoint Data Details. Second column shows the size of total checkpoint data collected for each mini-app in gigabytes. Further columns show compression speed for checkpoint data using different utilities and compression levels on the HDD and SSD system. Compression speed is for a single thread of each utility. Value inside () is the compression level.

Table 4.2 Checkpoint commit and restore time in seconds for all compression utilities. Checkpoint size for all mini-apps is set to 112 GB per compute node. 'I/O' column contains checkpoint times when checkpoints are compressed and saved to global I/O storage. 'L/S' and 'L/F' contains checkpoint times when checkpoints are compressed and saved to slow compute node local storage (5 GB/s) and fast compute node local storage (15 GB/s) respectively. Note that the checkpoint time values in the "Average" row are not the average values of the seven mini-apps, but the checkpoint time if the performance model is simulated using average compression factor and compression speed from Figure 4.1. Note that checkpoint commit/restore time in the absence of compression would be I/O: 1120 s, L/S: 22.4 s and L/F: 7.47 s.

Table 4.3 C/R parameters for evaluation using performance model.

Table 5.1 The required compression speed, required number of processor cores in NDP and the smallest possible checkpoint interval to I/O based on average compression factor and speed.

Table 5.2 C/R parameters for evaluation using performance model.

Table 6.1 Checkpoint compression data. lz4-compressed data of 7 mini-apps is compressed again using various compression utilities. The first column shows the size of lz4-compressed checkpoint data used to collect compression parameters. Columns with header 'F' contain compression factor and columns with header 'S' contain compression speed in MB/s. Compression speed is the speed at which lz4-compressed data is compressed using various utilities.

Table 6.2 Cumulative or equivalent checkpoint compression data for compression after lz4 compression. lz4-compressed data of 7 mini-apps is compressed again using various compression utilities. Compression factor in this table is a measure of the cumulative reduction in checkpoint size after compression using lz4 and the utility in the first row of the corresponding column. Compression speed is an equivalent compression speed, if the uncompressed checkpoint data were being compressed in the same amount of time as the lz4-compression checkpoint data is being compressed.

Table 6.3 Checkpoint commit time in seconds for all compression utilities for 2 scenarios. UnC: Uncompressed checkpoint data compressed by NDP (Scenario-1); Uncompressed checkpoint size for all mini-apps is set to 112 GB per compute node. Comp: lz4-compressed checkpoint data compressed by NDP (Scenario-2). Checkpoint size is the size if 112 GB of checkpoint data of the corresponding mini-app is compressed using lz4. Note that the checkpoint time values in the "Average" row are not the average values of the seven mini-apps, but the checkpoint time if the performance model is simulated using average compression factor and compression speed from Figure 4.1.

Table 6.4 C/R parameters for performance, power and cost evaluation of multilevel checkpointing combined with compression and NDP

Table 6.5 Power and cost parameters
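The uncompressed commit times quoted in the Table 4.2 caption are consistent with a simple size-over-bandwidth estimate. The sketch below is illustrative only: the function name `commit_time_s` is hypothetical, and the per-node global I/O bandwidth is not stated in the caption but inferred here from the quoted 1120 s figure.

```python
CHECKPOINT_GB = 112.0  # checkpoint size per compute node, from the Table 4.2 caption


def commit_time_s(size_gb: float, bandwidth_gb_per_s: float) -> float:
    """Time to write a checkpoint of size_gb at a sustained bandwidth (no compression)."""
    if size_gb < 0 or bandwidth_gb_per_s <= 0:
        raise ValueError("size must be non-negative and bandwidth positive")
    return size_gb / bandwidth_gb_per_s


# Node-local storage bandwidths given in the caption.
slow_local_s = commit_time_s(CHECKPOINT_GB, 5.0)   # 'L/S' column, 5 GB/s
fast_local_s = commit_time_s(CHECKPOINT_GB, 15.0)  # 'L/F' column, 15 GB/s

# Assumption: the 1120 s uncompressed I/O commit time follows the same model,
# which implies this per-node share of global I/O bandwidth.
implied_io_gb_per_s = CHECKPOINT_GB / 1120.0

print(f"L/S: {slow_local_s:.1f} s, L/F: {fast_local_s:.2f} s")
print(f"Implied per-node I/O bandwidth: {implied_io_gb_per_s:.2f} GB/s")
```

The two local-storage results reproduce the 22.4 s and 7.47 s values in the caption; the implied I/O bandwidth of about 0.1 GB/s per node is a back-calculated figure, not one reported in this excerpt.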

Description:
... increasing cost of writing/reading checkpoints to/from global I/O based storage, this technique may not scale well ... checkpointing will result in compressed checkpoint data being available in local storage. ... which implements a couple of kernels representative of implicit finite-element applications.
