ebook img

Go Optimizations 101 (2022/08/29) PDF

161 Pages·0.598 MB·English
by  Tapir
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Go Optimizations 101 (2022/08/29)

Go Optimizations 101 TapirLiu Contents 0.1 Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1 AboutGoOptimizations101 7 1.1 Abouttheauthor . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1.2 Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 2 ValuePartsandValueSizes 9 2.1 Valuesandvalueparts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 2.2 Value/typesizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10 2.3 Detailedtypesizes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.4 Memoryalignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 2.5 Structpadding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 2.6 Valuecopycostsandsmall-sizetypes/values . . . . . . . . . . . . . . . . . . . . 13 2.7 Valuecopyscenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16 3 MemoryAllocations 22 3.1 Memoryblocks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 3.2 Memoryallocationplaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22 3.3 Memoryallocationscenarios . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23 3.4 Memorywastingcausedbyallocatedmemoryblockslargerthanneeded . . . . . . 23 3.5 Reducememoryallocationsandsavememory . . . . . . . . . . . . . . . . . . . 25 3.6 Avoidunnecessaryallocationsbyallocatingenoughinadvance . . . . . . . . . . 25 3.7 Avoidallocationsifpossible . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28 3.8 Savememoryandreduceallocationsbycombiningmemoryblocks . . . . . . . . 29 3.9 Usevaluecachepooltoavoidsomeallocations . . . . . . . . . . . . . . . . . . 31 4 StackandEscapeAnalysis 33 4.1 Goroutinestacks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.2 Escapeanalysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33 4.2.1 Escapeanalysisexamples . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.3 Stackframesoffunctioncalls . . . . . . . . . . . . . . . . . . . . . . . . . . . 35 4.4 Stacksgrowthandshrinkage . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37 4.5 Memorywastingcausedbyunusedstackspaces . . . . . . . . . . . . . . . . . . 39 4.6 Forallkindsofreasons,avalue(part)willescapetoheapevenifitisonlyusedin onegoroutine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.6.1 Alocalvariablesdeclaredinaloopwillescapetoheapifitisreferenced byavalueoutoftheloop . . . . . . . . . . . . . . . . . . . . . . . . . 39 4.6.2 Thevaluepartsreferencedbyanargumentwillescapetoheapiftheargu- mentispassedtointerfacemethodcalls . . . . . . . . . . . . . . . . . . 40 1 4.6.3 Areflect.ValueOffunctioncallmakesthevaluesreferencedbyitsar- gumentescapetoheap . . . . . . . . . . . . . . . . . . . . . . . . . . . 41 4.6.4 Thevaluesreferencedbyfunctionreturnresultswillescape . . . . . . . . 41 4.7 Functioninlinemightaffectescapeanalysisresults. . . . . . . . . . . . . . . . . 42 4.7.1 Functioninliningisnotalwayshelpfulforescapeanalysis . . . . . . . . . 43 4.8 Controlmemoryblockallocationplaces . . . . . . . . . . . . . . . . . . . . . . 43 4.8.1 Ensureavalueisallocatedonheap . . . . . . . . . . . . . . . . . . . . . 44 4.8.2 Useexplicitvaluecopiestohelpcompilersdetectsomevaluesdon’tescape 45 4.8.3 Memorysizethresholdsusedbythecompilertomakeallocationplacement decisions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46 4.8.4 Usesmallerthresholds . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 4.8.5 Allocatethebackingarrayofasliceonstackevenifitssizeislargerthan orequalto64K(butnotlargerthan10M) . . . . . . . . . . . . . . . . . 51 4.8.6 Allocatethebackingarrayofaslicewithanarbitrarylengthonstack . . . 52 4.8.7 Moretrickstoallocatearbitrary-sizevaluesonstack . . . . . . . . . . . . 53 4.9 Growstackinlesstimes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 5 GarbageCollection 56 5.1 GCpacer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56 5.2 AutomaticGCmightaffectGoprogramexecutionperformance . . . . . . . . . . 57 5.3 HowtoreduceGCpressure? . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 5.4 Memoryfragments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57 5.5 Memorywastingcausedbysharingmemoryblocks . . . . . . . . . . . . . . . . 58 5.6 Trytogeneratelessshort-livedmemoryblockstolowerautomaticGCfrequency . 59 5.7 UsenewheapmemorypercentagestrategytocontrolautomaticGCfrequency. . . 59 5.8 SinceGotoolchain1.18,thelargerGCroots,thelargerGCcycleintervals. . . . . 62 5.9 UsememoryballaststoavoidfrequentGCcycles . . . . . . . . . . . . . . . . . 65 5.10 UseGotoolchain1.19introducedmemorylimitstrategytoavoidfrequentGCcycles 66 6 Pointers 68 6.1 Avoidunnecessarynilarraypointerchecksinaloop . . . . . . . . . . . . . . . . 68 6.1.1 Thecaseinwhichanarraypointerisastructfield . . . . . . . . . . . . . 69 6.2 Avoidunnecessarypointerdereferencesinaloop . . . . . . . . . . . . . . . . . 71 7 Structs 74 7.1 Avoidaccessingfieldsofastructinaloopthoughpointerstothestruct . . . . . . 74 7.2 Small-sizestructsareoptimizedspecially . . . . . . . . . . . . . . . . . . . . . 75 7.3 Makestructsizesmallerbyadjustingfieldorders . . . . . . . . . . . . . . . . . 75 8 ArraysandSlices 76 8.1 Avoidusingliteralsoflarge-sizearraytypesascomparisonoperands . . . . . . . 76 8.2 Usingslice-to-array-pointerconversionsintroducedinGo1.17tocopyslices . . . 77 8.3 Themakeandappendbuiltinfunctionimplementations . . . . . . . . . . . . . . 78 8.4 Trytoclipthefirstargumentofanappendcallifweknowthecallwillallocate . . 81 8.5 Growslices(enlargeslicecapacities) . . . . . . . . . . . . . . . . . . . . . . . . 81 8.6 Trytogrowasliceinonestep . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 8.7 Cloneslices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 8.8 Mergetwoslices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 8.9 Mergemorethantwoslices(intoanewslice) . . . . . . . . . . . . . . . . . . . 83 8.10 Insertasliceintoanotherone . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 8.11 Don’tusetheseconditerationvariableinafor-rangeloopifhighperformance isdemanded . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85 2 8.12 Resetallelementsofanarrayorslice . . . . . . . . . . . . . . . . . . . . . . . 87 8.13 Specifycapacityexplicitlyinsubsliceexpression . . . . . . . . . . . . . . . . . 88 8.14 Useindextablestosavesomecomparisons . . . . . . . . . . . . . . . . . . . . 89 9 StringandByteSlices 90 9.1 Conversionsbetweenstringsandbyteslices . . . . . . . . . . . . . . . . . . . . 90 9.1.1 Astring-to-byte-sliceconversionfollowingtherangekeyworddoesn’tal- locate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 9.1.2 A byte-slice-to-string conversion appearing as a comparison operand doesn’tallocate. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91 9.1.3 Abyte-slice-to-stringconversionappearingastheindexkeyofamapele- mentretrievalexpressiondoesn’tallocate . . . . . . . . . . . . . . . . . 92 9.1.4 A byte-slice-to-string conversion appearing as an operand in a string concatenation expression doesn’t allocate if at least one of concatenated operandsisanon-blankstringconstant . . . . . . . . . . . . . . . . . . . 94 9.1.5 Iftheresultofanoperationisastirngorbyteslice, andthelengthofthe resultislargerthan32,thenthebyteelementsoftheresultwillbealways allocatedonheap . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95 9.2 Efficientwaystoconcatenatestrings . . . . . . . . . . . . . . . . . . . . . . . . 96 9.2.1 Thestrings.Builderwaymightwastesomememorysometimes . . . . 98 9.2.2 Usebyteslicetoconcatenatestrings . . . . . . . . . . . . . . . . . . . . 98 9.3 Mergeastringandabytesliceintoanewbyteslice . . . . . . . . . . . . . . . . 99 9.4 Thestrings.Comparefunctionisnotveryperformantnow . . . . . . . . . . . 100 9.5 Avoidallocationsifpossible . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101 10 BCE(BoundCheckEliminate) 106 10.1 Example1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106 10.2 Example2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107 10.3 Example3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 10.4 Example4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109 10.5 Sometimes,thecompilerneedssomehintstoremovesomeboundchecks . . . . . 110 10.6 WritecodeinBCE-friendlyways. . . . . . . . . . . . . . . . . . . . . . . . . . 113 10.7 The current official standard Go compiler fails to eliminate some unnecessary boundchecks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116 11 Maps 118 11.1 Clearmapentries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 11.2 aMap[key]++ismoreefficientthanaMap[key] = aMap[key] + 1 . . . . . . . 119 11.3 Pointersinmaps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 119 11.4 Usingbytearraysinsteadofshortstringsaskeys . . . . . . . . . . . . . . . . . . 119 11.5 Lowermapelementmodificationfrequency . . . . . . . . . . . . . . . . . . . . 120 11.6 Trytogrowamapinonestep . . . . . . . . . . . . . . . . . . . . . . . . . . . 121 11.7 Useindextablesinsteadofmapswhichkeytypeshaveonlyasmallsetofpossible values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122 12 Channels 125 12.1 Programmingwithchannelsisfunbutchannelsarenotthemostperformantway forsomeusecases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125 12.2 Useonechannelinsteadofseveralonestoavoidusingselectblocks . . . . . . . 126 12.3 Try-sendandtry-receiveselectcodeblocksarespeciallyoptimized . . . . . . . 128 13 Functions 130 13.1 Functioninlining . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130 3 13.1.1 Whichfunctionsareinline-able? . . . . . . . . . . . . . . . . . . . . . . 131 13.1.2 Acalltoafunctionvalueisnotinline-ableifthevalueishardtobedeter- minedatcompiletime . . . . . . . . . . . . . . . . . . . . . . . . . . . 133 13.1.3 Thego:noinlinecommentdirective . . . . . . . . . . . . . . . . . . . 134 13.1.4 Writecodeinthewayswhicharelessinlinecostly . . . . . . . . . . . . . 134 13.1.5 Makehotpathsinline-able . . . . . . . . . . . . . . . . . . . . . . . . . 138 13.1.6 Manualinliningisoftenmoreperformancethanauto-inlining . . . . . . . 139 13.1.7 Inliningmightdonegativeimpactonperformance . . . . . . . . . . . . . 140 13.2 Pointerparameters/resultsvs. non-pointerparameters/results . . . . . . . . . . . . 141 13.3 Namedresultsvs. anonymousresults . . . . . . . . . . . . . . . . . . . . . . . . 143 13.4 Trytostoreintermediatecalculationresultsinlocalvariables . . . . . . . . . . . 145 13.5 Avoiddeferredcallsinloops . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145 13.6 Avoidusingdeferredcallsifextremehighperformanceisdemanded . . . . . . . . 147 13.7 Theargumentsofafunctioncallwillbealwaysevaluatedwhenthecallisinvoked 147 13.8 Trytomakelessvaluesescapetoheapinthehotpaths . . . . . . . . . . . . . . . 148 14 Interfaces 150 14.1 Boxvaluesintoandunboxvaluesfrominterfaces . . . . . . . . . . . . . . . . . 150 14.2 Trytovoidmemoryallocationsbyassigninginterfacetointerface . . . . . . . . . 157 14.3 Callinginterfacemethodsneedsalittleextracost . . . . . . . . . . . . . . . . . 158 14.4 Avoid using interface parameters and results in small functions which are called frequently . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 159 4 5 0.1 Acknowledgments Firstly,thankstothewholeGocommunity. Anactiveandresponsivecommunityguaranteesthis bookisfinishedintime. Specially, I want to give thanks to the following people who helped me understand some imple- mentationdetailsintheofficialstandardcompilerandruntime: KeithRandall,IanLanceTaylor, AxelWagner,CuongManhLe,MichaelPratt,JanMercl,MatthewDempsky,MartinMöhrmann, etc. I’msorryifIforgotmentioningsomebodyinabovelists. Therearesomanykindandcreative gophersintheGocommunitythatImusthavemissedoutonsomeone. IalsowouldliketothankallgopherswhoevermadeinfluencesontheGo101book,beitdirectly orindirectly,intentionallyorunintentionally. ThankstoOlexandrShalakhinforthepermissiontouseoneofthewonderfulgophericondesigns asthecoverimage. AndthankstoReneeFrenchfordesigningthelovelygophercartooncharacter. Thankstotheauthorsofthefollowingopensourcesoftwareandlibrariesusedinbuildingthisbook: • golang,https://go.dev/ • gomarkdown,https://github.com/gomarkdown/markdown • goini,https://github.com/zieckey/goini • go-epub,https://github.com/bmaupin/go-epub • pandoc,https://pandoc.org • calibre,https://calibre-ebook.com/ • GIMP,https://www.gimp.org Thanks the gophers who ever reported mistakes in this book or made corrections for this book: yingzewen,ivanburak,cortes-,skeeto@reddit,etc. 6 Chapter 1 About Go Optimizations 101 Thisbookprovidessomecodeperformanceoptimizationtricks,tips,andsuggestions. Mostofthe contentsinthisbookaremadebasedontheofficialstandardGocompilerandruntimeimplemen- tation. Lifeisfulloftrade-offs,theprogrammingworldis,too. Inprogramming,weoftenneedtomake trade-offs between code readability, maintainability, development efficiency, and program effi- ciency, etc. Even for one of the aspects, there are also trade-offs needing to be made. Taking program efficiency for an example, we might need to make trade-offs between memory saving, codeexecutionspeed,andimplementationdifficulty,etc. Inpractice, mostpartsofthecodebaseofaprojectdon’tneedtobeimplementedwithhighper- formances. Keepingthemmaintainableandreadableisoftenmoreimportant(thanmakingthem memorysavingandrunveryfast). Thesuggestionsmadeinthisbookarejustforthecodeparts whichimplementationsreallyneedtobehighperformant. Someofthesuggestionsoftenleadto more verbose code. And please note that some of the suggested implementations might be only performantatsomecertainscenarios. Thecontentsinthisbookinclude: • howtoconsumelessCPUresources. • howtoconsumelessmemory. • howtomakelessmemoryallocations. • howtocontrolmemoryallocationplaces. • howtoreducegarbagecollectionpressure. Thisbookneitherexplainshowtouseperformanceanalysistools,suchaspprof,nortriestostudy deeply on compiler and runtime implementation details. None of the contents provided in this bookmakeuseofunsafepointersandcgo. Andthebookdoesn’ttalkaboutalgorithms. Inother words,thisbooktriestoprovidesomeoptimizationsuggestionsinawaywhichisclearandeasy tounderstand,fordailygeneralGoprogramming. Without particularly indicated, the code examples provided in this book are tested and run on a notebookwiththefollowingenvironmentsetup: go version go1.19 linux/amd64 goos: linux goarch: amd64 cpu: Intel(R) Core(TM) i5-4210U CPU @ 1.70GHz 7 Somebenchmarktimesinformationisremovedfrombenchmarkresultstokeepbenchmarklines short. Pleasenotethat: • someofthetalkedsuggestionsinthisbookworkonanyplatformandforanyCPUmodels, butsomeothersmightonlyworkonspecifiedplatformsandforspecifiedCPUmodels. So pleasebenchmarkthemonthesameenvironmentsasyourproductionenvironmentsbefore adoptinganyofthem. • someimplementationdetailsoftheofficialstandardGocompilerandruntimemightchange from version to version, which means some of the talked suggestions might not work for futureGotoolchainversions. • the book will be open sourced eventually since a later undetermined time point (about 2023/Jun),inachapterbychapterway. 1.1 About the author Tapiristheauthorofthisbook. HealsowrotetheGo101book. Heisplanningtowritesomeother Go101seriesbooks. Pleaselookforwardto. Tapir was ever (maybe will be again) an indie game developer. You can find his games here: tapirgames.com. 1.2 Feedback WelcometoimprovethisbookbysubmittingcorrectionstoGo101issuelist(https://github.com/g o101/go101)forallkindsofmistakes,suchastypos,grammarerrors,wordinginaccuracies,wrong explanations,descriptionflaws,codebugs,etc. ItisalsowelcometosendyourfeedbacktotheGo101twitteraccount: @go100and1(https://twit ter.com/go100and1). 8 Chapter 2 Value Parts and Value Sizes 2.1 Values and value parts InGo,avalueofsomekindsoftypesalwayscontainsonlyonepart(inmemory),whereasavalue ofotherkindsoftypesmightcontainmorethanonepart. Ifavaluecontainsmorethanonepart, thenoneofthepartsiscalledthedirectpartandtheothersarecalledindirectparts. Thedirectpart referencestheindirectparts. Ifavaluealwayscontainsonlyonepart,thenthepartmaybealsocalledasthedirectpartofthe value,andwesaythevaluehasnoindirectparts. In the official standard Go compiler implementation, each value of the following kinds of types alwayscontainsonlyonepart: • booleantypes • numerictypes(int8,uint8,int16,uint16,int32,uint32,int64,uint64,int,uint,uintptr,float32, float64,complex64,complex128) • pointertypes • unsafepointertypes • structtypes • arraytypes Andavalueofthefollowingkindsoftypesalwaysmaycontainoneormoreindirectparts: • slicetypes • maptypes • channeltypes • functiontypes • interfacetypes • stringtypes When assigning/copying a value, only the direct part of the value is copied. After copying, the directpartsofthedestinationandsourcevaluesbotharereferencingtheindirectpartsofthesource value(iftheindirectpartsexist). Atruntime,eachvaluepartiscarriedononememoryblock(memoryblockswillbeexplainedina followingchapter). So,ifavaluecontainstwoparts,thevalueisverypossiblydistributedontwo memoryblocks. (Note: The terminology value part was invented by the Go 101 series books. It is not widely 9

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.