Algorithms Illuminated Part 3: Greedy Algorithms and Dynamic Programming Tim Roughgarden c 2019 by Tim Roughgarden � All rights reserved. No portion of this book may be reproduced in any form without permission from the publisher, except as permitted by U. S. copyright law. First Edition Cover image: Untitled, by Johanna Dickson ISBN: 978-0-9992829-4-6 (Paperback) ISBN: 978-0-9992829-5-3 (ebook) Library of Congress Control Number: 2017914282 Soundlikeyourself Publishing, LLC New York, NY [email protected] www.algorithmsilluminated.org In memory of Stephen H. Schneider Contents Preface vii 13 Introduction to Greedy Algorithms 1 13.1 The Greedy Algorithm Design Paradigm 1 13.2 A Scheduling Problem 4 13.3 Developing a Greedy Algorithm 6 13.4 Proof of Correctness 12 Problems 21 14 Huffman Codes 23 14.1 Codes 23 14.2 Codes as Trees 28 14.3 Huffman’s Greedy Algorithm 32 *14.4 Proof of Correctness 41 Problems 49 15 Minimum Spanning Trees 52 15.1 Problem Definition 52 15.2 Prim’s Algorithm 57 *15.3 Speeding Up Prim’s Algorithm via Heaps 62 *15.4 Prim’s Algorithm: Proof of Correctness 69 15.5 Kruskal’s Algorithm 76 *15.6 Speeding Up Kruskal’s Algorithm via Union-Find 81 *15.7 Kruskal’s Algorithm: Proof of Correctness 91 15.8 Application: Single-Link Clustering 94 Problems 99 16 Introduction to Dynamic Programming 103 16.1 The Weighted Independent Set Problem 104 16.2 A Linear-Time Algorithm for WIS in Paths 108 v vi Contents 16.3 A Reconstruction Algorithm 116 16.4 The Principles of Dynamic Programming 118 16.5 The Knapsack Problem 123 Problems 133 17 Advanced Dynamic Programming 137 17.1 Sequence Alignment 137 *17.2 Optimal Binary Search Trees 148 Problems 163 18 Shortest Paths Revisited 167 18.1 Shortest Paths with Negative Edge Lengths 167 18.2 The Bellman-Ford Algorithm 172 18.3 The All-Pairs Shortest Path Problem 185 18.4 The Floyd-Warshall Algorithm 187 Problems 198 Epilogue: A Field Guide to Algorithm Design 201 Hints and Solutions to Selected Problems 203 Index 211 Preface This book is the third of a four-part series based on my online algo- rithms courses that have been running regularly since 2012, which in turn are based on an undergraduate course that I taught many times at Stanford University. The first two parts of the series are not strict prerequisites for this one, though portions of this book do assume at least a vague recollection of big-O notation (covered in Chapter 2 of Part 1 or Appendix C of Part 2), divide-and-conquer algorithms (Chapter 3 of Part 1), and graphs (Chapter 7 of Part 2). What We’ll Cover Algorithms Illuminated, Part 3 provides an introduction to and nu- merous case studies of two fundamental algorithm design paradigms. Greedy algorithms and applications. Greedy algorithms solve problems by making a sequence of myopic and irrevocable decisions. For many problems, they are easy to devise and often blazingly fast. Most greedy algorithms are not guaranteed to be correct, but we’ll cover several killer applications that are exceptions to this rule. Examples include scheduling problems, optimal compression, and minimum spanning trees of graphs. Dynamic programming and applications. Few benefits of a se- rious study of algorithms rival the empowerment that comes from mastering dynamic programming. This design paradigm takes a lot of practice to perfect, but it has countless applications to problems that appear unsolvable using any simpler method. Our dynamic program- ming boot camp will double as a tour of some of the paradigm’s killer applications, including the knapsack problem, the Needleman-Wunsch genome sequence alignment algorithm, Knuth’s algorithm for opti- vii viii Preface mal binary search trees, and the Bellman-Ford and Floyd-Warshall shortest-path algorithms. For a more detailed look into the book’s contents, check out the “Upshot” sections that conclude each chapter and highlight the most important points. The “Field Guide to Algorithm Design” on page 201 provides a bird’s-eye view of how greedy algorithms and dynamic programming fit into the bigger algorithmic picture. The starred sections of the book are the most advanced ones. The time-constrained reader can skip these sections on a first reading without any loss of continuity. Topics covered in the other three parts. Algorithms Illumi- nated, Part 1 covers asymptotic notation (big-O notation and its close cousins), divide-and-conquer algorithms and the master method, randomized QuickSort and its analysis, and linear-time selection algo- rithms. Part 2 covers data structures (heaps, balanced search trees, hash tables, bloom filters), graph primitives (breadth- and depth-first search, connectivity, shortest paths), and their applications (ranging fromdeduplicationtosocialnetworkanalysis). Part 4isallaboutNP- completeness, whatit means forthe algorithm designer, andstrategies for coping with computationally intractable problems, including the analysis of heuristics and local search. Skills You’ll Learn Mastering algorithms takes time and effort. Why bother? Become a better programmer. You’ll learn several blazingly fast subroutines for processing data as well as several useful data structuresfororganizingdatathatyoucandeploydirectlyinyourown programs. Implementing and using these algorithms will stretch and improve your programming skills. You’ll also learn general algorithm design paradigms that are relevant to many different problems across different domains, as well as tools for predicting the performance of such algorithms. These “algorithmic design patterns” can help you come up with new algorithms for problems that arise in your own work. Sharpen your analytical skills. You’llgetlotsofpracticedescrib- ing and reasoning about algorithms. Through mathematical analysis, Preface ix you’ll gain a deep understanding of the specific algorithms and data structures that these books cover. You’ll acquire facility with sev- eral mathematical techniques that are broadly useful for analyzing algorithms. Think algorithmically. After you learn about algorithms, you’ll start seeing them everywhere, whether you’re riding an elevator, watching a flock of birds, managing your investment portfolio, or even watching an infant learn. Algorithmic thinking is increasingly useful and prevalent in disciplines outside of computer science, including biology, statistics, and economics. Literacy with computer science’s greatest hits. Studying al- gorithms can feel like watching a highlight reel of many of the greatest hits from the last sixty years of computer science. No longer will you feel excluded at that computer science cocktail party when someone cracks a joke about Dijkstra’s algorithm. After reading these books, you’ll know exactly what they mean. Ace your technical interviews. Over the years, countless stu- dents have regaled me with stories about how mastering the concepts in these books enabled them to ace every technical interview question they were ever asked. How These Books Are Different Thisseriesofbookshasonlyonegoal: to teach the basics of algorithms in the most accessible way possible. Think of them as a transcript of what an expert algorithms tutor would say to you over a series of one-on-one lessons. There are a number of excellent more traditional and encyclopedic textbooks about algorithms, any of which usefully complement this book series with additional details, problems, and topics. I encourage you to explore and find your own favorites. There are also several books that, unlike these books, cater to programmers looking for ready-made algorithm implementations in a specific programming language. Many such implementations are freely available on the Web as well.