ebook img

Low Power and Low Complexity Shift-and-Add Based Computations PDF

280 Pages·2008·1.86 MB·English
by  
Save to my drive
Quick download
Download
Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.

Preview Low Power and Low Complexity Shift-and-Add Based Computations

Linköping Studies in Science and Technology Dissertations, No. 1201 Low Power and Low Complexity Shift-and-Add Based Computations Kenny Johansson Department of Electrical Engineering Linköping University, SE-581 83 Linköping, Sweden Linköping 2008 Low Power and Low Complexity Shift-and-Add Based Computations © 2008 Kenny Johansson Division of Electronics Systems Department of Electrical Engineering Linköping University SE-581 83 Linköping Sweden http://www.es.isy.liu.se/ I SBN 978-91-7393-836-5 ISSN 0345-7524 Printed by LiU-Tryck, Linköping 2008 ABSTRACT The main issue in this thesis is to minimize the energy consumption per operation for the arithmetic parts of DSP circuits, such as digital filters. More specific, the focus is on single- and multiple-constant multiplica- tions, which are realized using shift-and-add based computations. The possibilities to reduce the complexity, i.e., the chip area, and the energy consumption are investigated. Both serial and parallel arithmetic are con- sidered. The main difference, which is of interest here, is that shift opera- tions in serial arithmetic require flip-flops, while shifts can be hardwired in parallel arithmetic. The possible ways to connect a given number of adders is limited. Thus, for single-constant multiplication, the number of shift-and-add structures is finite. We show that it is possible to save both adders and shifts compared to traditional multipliers. Two algorithms for multiple- constant multiplication using serial arithmetic are proposed. For both algorithms, the total complexity is decreased compared to one of the best-known algorithms designed for parallel arithmetic. Furthermore, the impact of the digit-size, i.e., the number of bits to be processed in parallel, is studied for FIR filters implemented using serial arithmetic. Case studies indicate that the minimum energy consumption per sample is often obtained for a digit-size of around four bits. The energy consumption is proportional to the switching activity, i.e., the average number of transitions between the two logic levels per clock cycle. To achieve low power designs, it is necessary to develop accurate high-level models that can be used to estimate the switching activity. A method for computing the switching activity in bit-serial constant multi- pliers is proposed. For parallel arithmetic, a detailed complexity model for constant mul- tiplication is introduced. The model counts the required number of full and half adder cells. It is shown that the complexity can be significantly reduced by considering the interconnection between the adders. A main factor for energy consumption in constant multipliers is the adder depth, i.e., the number of cascaded adders. The reason for this is that the switch- ii Abstract ing activity will increase when glitches are propagated to subsequent adders. We propose an algorithm, where all multiplier coefficients are guaranteed to be realized at the theoretically lowest depth possible. Implementation examples show that the energy consumption is signifi- cantly reduced using this algorithm compared to solutions with fewer word level adders. For most applications, the input data are correlated since real world signals are processed. A data dependent switching activity model is derived for ripple-carry adders. Furthermore, a switching activity model for the single adder multiplier is proposed. This is a good starting point for accurate modeling of shift-and-add based computations using more adders. Finally, a method to rewrite an arbitrary function as a sum of weighted bit-products is presented. It is shown that for many elementary functions, a majority of the bit-products can be neglected while still maintaining rea- sonable high accuracy, since the weights are significantly smaller than the allowed error. The function approximation algorithms can be imple- mented using a low complexity architecture, which can easily be pipeli- ned to an arbitrary degree for increased throughput. ACKNOWLEDGEMENTS I thank my supervisors, Dr. Oscar Gustafsson, who always takes an active interest in discussing new research ideas, and Professor Lars Wanham- mar, for giving me the opportunity to do my Ph.D. studies at Electronics Systems. Furthermore, they both did a great job proofreading the thesis. I also thank former and current colleagues at Electronics Systems. Dr. Henrik Ohlsson, for considerable support during my first working years, and for the many interesting, not only work related, discussions. Dr. Andrew Dempster for introducing me to the fundamentals in the field of multiple-constant multiplication. I want to thank all the students that I have taught in various courses within the area of digital circuits. It has been enjoyable teaching you, and I hope that you have learned as much from me as I have learned from you. All the friends during the years of undergraduate studies, in particular Magnus Karlsson, Joseph Jacobsson, and Ingvar Carlson, thanks to you this was the best time of my life. Teachers through the years, especially Jan Alvarsson and Arne Karls- son in the upper secondary school. Also, the classmates Martin Källström and Peter Eriksson for all the conversations, about everything but school related subjects, both during and in between lessons. Above all, I thank my parents, Mona and Nils-Gunnar Johansson, for all the support during my many years of studies. My sisters, Linda Johansson and Tanja Henze, who gave me an early start in mathematics. I remember getting extra homework when playing school because my 3:s looked angry. Considering this method of learning by discipline, it is not surprising that they became a teacher and a police officer. Finally, I hope that selecting the colors of the club emblem for the front cover will help bringing home many gold medals to Färjestads BK! This work was financially supported by the Swedish Research C ouncil(Vetenskapsrådet). The Coffee Room, August 24, 2008 Kenny Johansson iv Acknowledgements TABLE OF CONTENTS 1 Introduction ..............................................................................1 1.1 Digital Filters ..............................................................................2 1.1.1 IIR Filters........................................................................ 2 1.1.2 FIR Filters....................................................................... 2 1.2 Number Representations ...........................................................4 1.2.1 Negative Numbers.......................................................... 4 1.2.2 Signed-Digit Numbers.................................................... 5 1.3 Logarithmic Number Systems ....................................................6 1.3.1 Conversions.................................................................... 6 1.3.2 Addition........................................................................... 8 1.4 Constant Multiplication .............................................................10 1.4.1 Single-Constant Multiplication...................................... 11 1.4.2 Multiple-Constant Multiplication.................................... 13 1.4.3 Graph Representation.................................................. 15 1.4.4 Terms used for Graph Based MCM Algorithms............ 16 1.5 Computer Arithmetic ................................................................18 1.5.1 Parallel Arithmetic......................................................... 18 1.5.2 Serial Arithmetic........................................................... 19 1.5.3 Carry-Save Arithmetic.................................................. 22 1.6 Power and Energy Consumption .............................................23 1.7 Outline and Main Contributions ................................................25 2 Complexity of Serial Constant Multipliers ...........................31 2.1 Graph Multipliers ......................................................................32 2.1.1 Multiplier Types............................................................ 32 2.1.2 Graph Elimination......................................................... 34 vi Table of Contents 2.2 Complexity Comparison – Single Multiplier ..............................35 2.2.1 Comparison of Flip-Flop Cost....................................... 36 2.2.2 Comparison of Building Block Cost.............................. 39 2.3 Complexity Comparison – RSAG-n ..........................................42 2.3.1 The Reduced Shift and Adder Graph Algorithm........... 42 2.3.2 Comparison by Varying the Wordlength....................... 46 2.3.3 Comparison by Varying the Setsize............................. 47 2.4 Digit-Size Trade-Offs ................................................................49 2.4.1 Implementation Aspects............................................... 51 2.4.2 Specification of the Example Filter............................... 52 2.4.3 Chip Area...................................................................... 53 2.4.4 Sample Rate................................................................. 54 2.4.5 Energy Consumption.................................................... 56 2.5 Complexity Comparison – RASG-n ..........................................58 2.5.1 The Reduced Adder and Shift Graph Algorithm........... 59 2.5.2 Comparison by Varying the Wordlength....................... 59 2.5.3 Comparison by Varying the Setsize............................. 61 2.5.4 Adder Depth................................................................. 63 2.6 Implementation Examples ........................................................65 2.6.1 Example 1..................................................................... 66 2.6.2 Example 2..................................................................... 73 2.7 Conclusions ..............................................................................77 3 Switching Activity in Bit-Serial Multipliers ..........................79 3.1 Multiplier Stage ........................................................................80 3.1.1 Preliminaries................................................................. 80 3.1.2 Sum Output Switching Activity...................................... 81 3.1.3 Switching Activity Using STGs..................................... 83 3.1.4 Carry Output Switching Activity.................................... 87 3.1.5 Input-Output Correlation Probability............................. 88 3.1.6 Glitching Activity........................................................... 91 3.1.7 Example........................................................................ 91 3.2 Graph Multipliers ......................................................................93 3.2.1 Correlation Probability Look-Up Tables........................ 93 3.2.2 The Applicability of the Equations................................ 93 3.2.3 Example........................................................................ 95 Table of Contents vii 3.3 Serial/Parallel Multipliers ..........................................................96 3.3.1 Simplification of the Switching Activity Equation.......... 97 3.3.2 Example........................................................................ 99 3.4 Conclusions ............................................................................100 4 Complexity of Parallel Constant Multipliers ......................101 4.1 Bit-Level Optimization ............................................................102 4.1.1 Scaling........................................................................ 102 4.1.2 Complexity Model....................................................... 103 4.1.3 Removing Half Adders................................................ 108 4.1.4 Single-Constant Multiplication Example..................... 109 4.2 Low Complexity Algorithm ......................................................112 4.2.1 Multiple-Constant Multiplication Example................... 113 4.2.2 Results for Random Coefficient Sets.......................... 113 4.3 Interconnection Algorithms .....................................................115 4.3.1 Algorithm Formulations............................................... 117 4.3.2 Implementation Examples.......................................... 119 4.4 Minimum Adder Depth Algorithm ...........................................126 4.4.1 Fundamental Pairs..................................................... 126 4.4.2 MCM Defined as a Covering Problem........................ 130 4.4.3 Optimal Approach....................................................... 130 4.4.4 Heuristic Approaches................................................. 134 4.4.5 Optimal vs. Heuristic................................................... 143 4.4.6 Results........................................................................ 148 4.4.7 Implementation Examples.......................................... 154 4.5 Conclusions ............................................................................160 5 Energy Estimation for Ripple-Carry Adders ......................163 5.1 Background ............................................................................164 5.1.1 Exact Method for Transitions in RCA......................... 164 5.2 Energy Model .........................................................................165 5.2.1 Timing Issues............................................................. 169 5.3 Switching Activity ...................................................................170 5.3.1 Switching due to Change of Input............................... 171 5.3.2 Switching due to Carry Propagation........................... 173 5.3.3 Total Switching Activity............................................... 174 viii Table of Contents 5.3.4 Uncorrelated Input Data............................................. 177 5.3.5 Summary.................................................................... 179 5.4 Experimental Results .............................................................180 5.4.1 Uncorrelated Data...................................................... 181 5.4.2 Correlated Data.......................................................... 182 5.5 Adopting the Dual Bit Type Method .......................................183 5.5.1 Statistical Definition of Signals................................... 186 5.5.2 The DBT Method........................................................ 187 5.5.3 DBT Model for Switching Activity in RCA................... 189 5.5.4 Example...................................................................... 191 5.6 Switching Activity in Constant Multipliers ...............................194 5.6.1 Addition with High Correlation.................................... 195 5.6.2 Results........................................................................ 199 5.6.3 Discussion on Switching Activity in MCM................... 202 5.6.4 Time Instant Model..................................................... 204 5.7 Conclusions ............................................................................205 6 Function Approximation by a Sum of Bit-Products ..........207 6.1 Background ............................................................................208 6.1.1 PPA Methods.............................................................. 209 6.2 Function Approximation Approach .........................................211 6.2.1 General Formulation................................................... 211 6.2.2 Optimization................................................................ 215 6.2.3 Results for Some Elementary Functions.................... 217 6.3 Architecture ............................................................................222 6.3.1 Implementation for a Sum of Bit-Products.................. 222 6.3.2 Conditional Blocks...................................................... 225 6.3.3 Results Using Conditional Blocks............................... 226 6.4 Functions for LNS ..................................................................232 6.4.1 Sign Transformation................................................... 233 6.4.2 Results for the LNS Functions.................................... 237 6.4.3 Comparison with ROM............................................... 241 6.5 Sine and Cosine Functions ....................................................244 6.5.1 Angle Rotation Based Approach................................ 244 6.5.2 Octant Mapping.......................................................... 246 6.5.3 Comparison with CORDIC.......................................... 248

Description:
tions, which are realized using shift-and-add based computations. The To achieve low power designs, it is necessary to develop accurate Digital Filters. 3. It is not recommended to use recursive algorithms to realize FIR fil- ters, because of stability problems. Hence, here all coefficients bk in
See more

The list of books you might like

Most books are stored in the elastic cloud where traffic is expensive. For this reason, we have a limit on daily download.