ebook img

Web Proxy Cache Replacement Strategies: Simulation, Implementation, and Performance Evaluation PDF

109 Pages·2013·5.829 MB·English
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Web Proxy Cache Replacement Strategies: Simulation, Implementation, and Performance Evaluation

SpringerBriefs in Computer Science Series Editors Stan Zdonik Peng Ning Shashi Shekhar Jonathan Katz Xindong Wu Lakhmi C. Jain David Padua Xuemin Shen Borko Furht V. S. Subrahmanian Martial Hebert Katsushi Ikeuchi Bruno Siciliano For furthervolumes: http://www.springer.com/series/10028 Hala ElAarag Web Proxy Cache Replacement Strategies Simulation, Implementation, and Performance Evaluation With Contributions by Sam Romano and Jake Cobb 123 HalaElAarag Department of Mathematics and Computer Science Stetson University DeLand, FL USA ISSN 2191-5768 ISSN 2191-5776 (electronic) ISBN 978-1-4471-4892-0 ISBN 978-1-4471-4893-7 (eBook) DOI 10.1007/978-1-4471-4893-7 SpringerLondonHeidelbergNewYorkDordrecht LibraryofCongressControlNumber:2012954855 (cid:2)HalaElAarag2013 Thisworkissubjecttocopyright.AllrightsarereservedbythePublisher,whetherthewholeorpartof the material is concerned, specifically the rights of translation, reprinting, reuse of illustrations, recitation,broadcasting,reproductiononmicrofilmsorinanyotherphysicalway,andtransmissionor informationstorageandretrieval,electronicadaptation,computersoftware,orbysimilarordissimilar methodology now known or hereafter developed. Exempted from this legal reservation are brief excerpts in connection with reviews or scholarly analysis or material supplied specifically for the purposeofbeingenteredandexecutedonacomputersystem,forexclusiveusebythepurchaserofthe work. Duplication of this publication or parts thereof is permitted only under the provisions of theCopyrightLawofthePublisher’slocation,initscurrentversion,andpermissionforusemustalways beobtainedfromSpringer.PermissionsforusemaybeobtainedthroughRightsLinkattheCopyright ClearanceCenter.ViolationsareliabletoprosecutionundertherespectiveCopyrightLaw. The use of general descriptive names, registered names, trademarks, service marks, etc. in this publicationdoesnotimply,evenintheabsenceofaspecificstatement,thatsuchnamesareexempt fromtherelevantprotectivelawsandregulationsandthereforefreeforgeneraluse. While the advice and information in this book are believed to be true and accurate at the date of publication,neithertheauthorsnortheeditorsnorthepublishercanacceptanylegalresponsibilityfor anyerrorsoromissionsthatmaybemade.Thepublishermakesnowarranty,expressorimplied,with respecttothematerialcontainedherein. Printedonacid-freepaper SpringerispartofSpringerScience?BusinessMedia(www.springer.com) To my loving parents, husband, and children Preface Theneedforthisbookstemsfromthesheeramountofmodernwebtrafficcoupled with increases in cacheable, bandwidth-consuming multimedia objects. In this book, we provide the most comprehensive study for proxy cache replacement strategies. We categorize these strategies into four categories; recency-based, frequency-based, recency/frequency-based, and function-based. We provide a quantitativecomparisonofcachereplacementstrategiesonthecategoryleveland then compare the best strategies of each category based on very important per- formance metrics. We then diverge from these replacement policies by con- structing a model, represented in the weights and structure of a neural network, from actual web traffic. This approach has the advantage of incorporating subtle traffic pattern information which may be difficult for a human to discern when designing a top-down algorithm. Furthermore, we provide a single method which is capable of creating multiple models; this allows models to be created which target localized traffic patterns as well as general ones. We first provide the simulationarchitecture,setup,parameters,andresultsofthisnoveltechniquethen we explain the implementation details in the Squid proxy server. This book is based on a number of my publications co-authored with my students Sam Romano, now a software engineer at Google and Jake Cobb, now a Ph.D.studentatGeorgiaInstituteofTechnology. Iwouldliketothankthemboth for their contributions to this research. I would like to thank the anonymous reviewersofSimulation:TransactionsoftheSocietyforModelingandSimulation International, Sage Publication, The Journal of Systems and Software, Elsevier, andNeuralComputingandApplications,Springer-Verlag,fortheircommentsand critique that helped to improve the quality of this research. I would also like to thank the staff members of Springer, Wayne Wheeler for introducing me to the SpringerBriefseriesandSimonRees,forthefollowupthroughthepreparationof this book. Last, but not least, I am deeply grateful to my family for their continuous support and encouragement. vii Contents 1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 2 Background Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.1 Web Request. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 2.2 Characteristics of Web Objects. . . . . . . . . . . . . . . . . . . . . . . . 6 2.3 Web Proxy Servers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 2.4 Squid. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 3 Neural Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11 3.1 Firing Rule/Squashing Function. . . . . . . . . . . . . . . . . . . . . . . . 12 3.2 Multi-Layer Perceptrons. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12 3.3 Supervised Learning . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.4 Objective Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13 3.5 Backpropagation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15 4 A Quantitative Study of Web Cache Replacement Strategies Using Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17 4.2 Replacement Strategies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18 4.3 Simulation Details. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19 4.3.1 Web Cache Components . . . . . . . . . . . . . . . . . . . . . . . 19 4.3.2 Simulation Architecture. . . . . . . . . . . . . . . . . . . . . . . . 20 4.3.3 Details of Implemented Strategies. . . . . . . . . . . . . . . . . 20 4.3.4 Other Implementation Details. . . . . . . . . . . . . . . . . . . . 28 4.4 Performance Metrics and Cost Models. . . . . . . . . . . . . . . . . . . 30 4.4.1 Performance Metrics. . . . . . . . . . . . . . . . . . . . . . . . . . 30 4.4.2 Cost Models. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31 ix x Contents 4.5 Experiment Setup and Results. . . . . . . . . . . . . . . . . . . . . . . . . 32 4.5.1 Trace Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32 4.5.2 Simulation Setup and Parameters . . . . . . . . . . . . . . . . . 33 4.5.3 Simulation Results . . . . . . . . . . . . . . . . . . . . . . . . . . . 34 4.6 Conclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59 5 Web Proxy Cache Replacement Scheme Based on Backpropagation Neural Network . . . . . . . . . . . . . . . . . . . . . . 61 5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61 5.2 Neural Network Proxy Cache Replacement . . . . . . . . . . . . . . . 62 5.3 Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 5.3.1 Data Preprocessing . . . . . . . . . . . . . . . . . . . . . . . . . . . 65 5.3.2 Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66 5.3.3 Performance Evaluation. . . . . . . . . . . . . . . . . . . . . . . . 67 5.4 Results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67 5.4.1 Training. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68 5.4.2 Proxy Cache Simulation. . . . . . . . . . . . . . . . . . . . . . . . 70 5.4.3 Neural Network Input/Output Relationships. . . . . . . . . . 74 5.4.4 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79 5.5 Summary and Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . 80 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82 6 Implementation of a Neural Network Proxy Cache Replacement Strategy in the Squid Proxy Server. . . . . . . . . . . . . . . . . . . . . . . . 83 6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83 6.2 Squid’s Implemented Cache Replacement Strategies . . . . . . . . . 84 6.3 NNPCR-2. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84 6.3.1 NNPCR-2 Implementation Details. . . . . . . . . . . . . . . . . 85 6.3.2 NNPCR-2 Training Setup . . . . . . . . . . . . . . . . . . . . . . 88 6.3.3 Implementation of NNPCR-2 in Squid 3.1. . . . . . . . . . . 89 6.4 NNPCR Training Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90 6.4.1 Effect of Hidden Nodes and Input Nodes. . . . . . . . . . . . 91 6.4.2 Effect of the Sliding Window Length . . . . . . . . . . . . . . 93 6.4.3 Effect of the Learning Rate . . . . . . . . . . . . . . . . . . . . . 95 6.5 Emulation Setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97 6.6 Emulation Results and Observations . . . . . . . . . . . . . . . . . . . . 98 6.7 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102 References . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103 Chapter 1 Introduction Keywords Web proxy servers (cid:2) Proxy cache (cid:2) Cache replacement strategies (cid:2) Neural networks (cid:2) Squid proxy server With the debut of Web 2.0, many researchers are interested in studying their workload characteristics in order to design more efficient servers. YouTube is an exampleofaverypopular Web2.0site, andhence,itwasthe choice formanyto conduct such research [1]. Gill et al. [1] conducted an extensive analysis of YouTube workload and observed 25-million YouTube transactions over a three- month period that included the downloads of 600,000 videos. They compared traditional Web workload characteristics to that of YouTube and found many similarities.TheyconcludedthattrafficcharacterizationofYouTubesuggeststhat caching popular video files on Web proxy servers reduces network traffic and increases scalability of YouTube servers [1]. User-perceived delay results from both overload of individual servers and net- work congestion. Proxy caches are used to address both issues by attempting to serverequestslocally.Thereareseveraldecisionsthatmustbemadesuchascache placementandreplacement.Ourmainfocuswillbethecachereplacementproblem, theprocessofevictingobjectsinthecachetomakeroomfornewobjects,appliedto Webproxyservers.Proxyserverswanttoserveasmanyobjectsfromthecacheas possible,serveasmuchdatafromthecacheaspossible,orboth.Optimizingbothis ideal, butmany practical algorithms optimizefor oneover the other. There are many replacement strategies to consider when designing a proxy server. The most commonly known cache replacement strategies are least frequentlyused(LFU)andleastrecentlyused(LRU).Podlipnigetal.[2]provided asurveyofWebcachereplacementstrategies.Theyhavedonewelltonotonlylist well-known strategies, but also categorize the strategies into five groups. Althoughbothasurveyandclassificationofthesestrategiesareavailable,thereis not a known record of results comparing the majority of the cache replacement H.ElAarag,WebProxyCacheReplacementStrategies, 1 SpringerBriefsinComputerScience,DOI:10.1007/978-1-4471-4893-7_1, (cid:2)HalaElAarag2013 2 1 Introduction algorithms.BalamashandKunz[3]comparedcachereplacementpoliciesqualita- tively rather than quantitatively. Many of the works consulted for this research presentedresultsfordifferentstrategiesagainst,atmost,threetofiveotherstrategies. However, their results cannot be compared effectively because most literature devisedtheirexperimentswithdifferencesintheirimplementations,suchastheuse ofauxiliarycaching,representationofWebobjectcharacteristics.Thereisalsothe differenceintracefilesbetweenexperimentsandalargerangeofdifferentsizesused for the cache space. Lastly, different proposals used different criteria on when to cache an object such as ignoring PHP files, cgi-bin scripts, POST requests, and simplyjustusingallrequestspresentedinaparticulartracefile. Inspiteofthemanycachereplacementpoliciesproposedinthepastyears,the most popular Web proxy software, Squid, uses least recently used as its default strategy.Webelievethatthisisbecausetherehasnotbeenacomprehensivestudy presentedthatcomparesthesestrategiesquantitatively.AsWong[4]mentions‘‘all policies were purported to perform better than others, creating confusion as to which policy should be used’’. In this book, we present a study of cache replacement strategies designed for static Web content, as opposed to dynamic Web content seen in some Web 2.0 applications. Most caching that occurs with dynamic content occurs on the browsersideanddoesnotoccurfromthestandpointofproxyservers.Asaresult, these strategieshave notbeen considered.Wehave researched how proxyservers can still improve performance by caching static Web content such as cascading style sheets, java script source files, and most importantly larger files such as images (jpeg, gif, etc.). This topic is particularly important in wireless ad hoc networks. In such net- works, mobile devices act as proxy servers for a group of other mobile devices. However, they have limited storage space. The extensive research on cache replacementpoliciesweprovideinthisbookiscrucialforsuchenvironmentswith small cache sizes and limited battery life. Neural networks have been employed in a number of applications, particularly intheareaofpatternrecognition.Neuralnetworksareabletolearnbyexperience andholdaninternalrepresentationofmeaningindata.Anappropriatelystructured neuralnetworkwillbeabletogeneralizetheknowledgeacquiredfromtrainingto data that lies outside the training set. This property makes neural networks useful forsolvingproblemsthatcontainuncertaintyorhaveaproblemspacewhichistoo large for an exhaustive search. We use neural networks to solve the Web proxy cache replacement problem. A bottom-up approach is justified by the heavy var- iabilityinWebtrafficwhichmakesgeneralcharacterizationofthattrafficdifficult. Neuralnetworksareselectedfortheirstrengthinpatternrecognitioninnoisydata sets. Additionally, they can learn from example by training against a data set, yet are able to generalize beyond the training set. Thus, they are well suited for developing a general replacement strategy from a set of data samples. This approach has the advantage of incorporating subtle traffic pattern information which may be difficult for a human to discern when designing a top-down algorithm.

See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.