An Adaptive Partitioning Scheme for Ad-hoc and Time-varying Database Analytics

by Anil Shanbhag

B.Tech. in Computer Science, Indian Institute of Technology Bombay, 2014

Submitted to the Department of Electrical Engineering and Computer Science in partial fulfillment of the requirements for the degree of Master of Science in Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, June 2016.

© Massachusetts Institute of Technology 2016. All rights reserved.

Author: Anil Shanbhag, Department of Electrical Engineering and Computer Science, May 19, 2016
Certified by: Samuel Madden, Professor of Electrical Engineering and Computer Science, Thesis Supervisor
Accepted by: Leslie A. Kolodziejski, Chairman, Department Committee on Graduate Students

Abstract

Data partitioning significantly improves query performance in distributed database systems. A large number of techniques have been proposed to partition a dataset efficiently, often focusing on finding the best partitioning for a particular query workload. However, many modern analytic applications involve ad-hoc or exploratory analysis where users do not have a representative query workload. Furthermore, workloads change over time as businesses evolve or as analysts gain a better understanding of their data. Static workload-based data partitioning techniques are therefore not suitable for such settings.

In this thesis, we present Amoeba, an adaptive distributed storage system for data skipping. It does not require an upfront query workload and adapts the data partitioning according to the queries posed by users over time. We present the data structures, the partitioning algorithms, and an efficient implementation on top of Apache Spark and HDFS. Our experimental results show that the Amoeba storage system provides improved query performance for ad-hoc workloads, adapts to changes in the query workload, and converges to a steady state in the case of recurring workloads. On a real-world workload, Amoeba reduces the total workload runtime by 1.8x compared to Spark with partitioned data and by 3.4x compared to unmodified Spark.

Thesis Supervisor: Samuel Madden
Title: Professor of Electrical Engineering and Computer Science

Acknowledgments

I would like to thank Alekh Jindal, Qui Nguyen, Aaron Elmore, Jorge Quiane, and Divyakanth Agarwal, who contributed many ideas to this work and helped build the system. I would also like to thank Prof. Samuel Madden, my thesis supervisor, for being a constant source of guidance and feedback on this project and beyond. Finally, I am always grateful to my family and friends, who encouraged me and supported me along the way.

Contents

1 Introduction
2 Related Work
3 System Overview
4 Upfront Data Partitioning
  4.1 Key Ideas
  4.2 Upfront Partitioning Algorithm
5 Adaptive Repartitioning
  5.1 Workload Monitor and Cost Model
  5.2 Partitioning Tree Transformations
  5.3 Divide-And-Conquer Repartitioning
  5.4 Handling Multiple Predicates
6 Implementation
  6.1 Initial Robust Partitioning
  6.2 Query Execution
7 Discussion
  7.1 Leveraging Replication
  7.2 Handling Joins
8 Evaluation
  8.1 Upfront Partitioning Performance
  8.2 Micro-benchmarks
  8.3 Amoeba on Real Workload
9 Conclusion
Appendices
A Fast Remote Reads

List of Figures

1-1 Example partitioning tree with 8 blocks
3-1 Amoeba Architecture
4-1 Partitioning Techniques
4-2 Upfront Partitioning Algorithm Example
5-1 Node swap in the partitioning tree
5-2 Illustrating adaptive partitioning when predicate A2 appears repeatedly
5-3 Node pushdown in the partitioning tree
5-4 Node rotation in the partitioning tree
7-1 Heterogeneous Replication
8-1 Ad-hoc query runtimes for different attributes of TPC-H lineitem
8-2 Comparing the upload time in Amoeba with HDFS
8-3 Comparing performance of upfront partition tree vs. kd-tree
8-4 Query runtimes for changing query attributes on TPC-H lineitem
8-5 Query runtimes for changing predicates on the same attribute of TPC-H lineitem
8-6 Cumulative Optimizer Runtime Across 100 queries
8-7 Cumulative Repartitioning Cost
8-8 Total runtimes of the different approaches
A-1 Response time with varying data locality (%)