Preface

Aims

This book introduces the concepts and methodologies employed in designing a system-on-chip (SoC) based around a microprocessor core and in designing the microprocessor core itself. The principles of microprocessor design are made concrete by extensive illustrations based upon the ARM.

The aim of the book is to assist the reader in understanding how SoCs and microprocessors are designed and used, and why a modern processor is designed the way that it is. The reader who wishes to know only the general principles should find that the ARM illustrations add substance to issues which can otherwise appear somewhat ethereal; the reader who wishes to understand the design of the ARM should find that the general principles illuminate the rationale for the ARM being as it is.

Other microprocessor architectures are not described in this book. The reader who wishes to make a comparative study of architectures will find the required information on the ARM here but must look elsewhere for information on other designs.

Audience

The book is intended to be of use to two distinct groups of readers:

• Professional hardware and software engineers who are tasked with designing an SoC product which incorporates an ARM processor, or who are evaluating the ARM for a product, should find the book helpful in their duties. Although there is considerable overlap with ARM technical publications, this book provides a broader context with more background. It is not a substitute for the manufacturer's data, since much detail has had to be omitted, but it should be useful as an introductory overview and adjunct to that data.

• Students of computer science, computer engineering and electrical engineering should find the material of value at several stages in their courses. Some chapters are closely based on course material previously used in undergraduate teaching; some other material is drawn from a postgraduate course.

Prerequisite knowledge

This book is not intended to be an introductory text on computer architecture or computer logic design. Readers are assumed to have a level of familiarity with these subjects equivalent to that of a second year undergraduate student in computer science or computer engineering. Some first year material is presented, but this is more by way of a refresher than as a first introduction to this material. No prior familiarity with the ARM processor is assumed.

The ARM

On 26 April 1985, the first ARM prototypes arrived at Acorn Computers Limited in Cambridge, England, having been fabricated by VLSI Technology, Inc., in San Jose, California. A few hours later they were running code, and a bottle of Moët & Chandon was opened in celebration. For the remainder of the 1980s the ARM was quietly developed to underpin Acorn's desktop products which form the basis of educational computing in the UK; over the 1990s, in the care of ARM Limited, the ARM has sprung onto the world stage and has established a market-leading position in high-performance low-power and low-cost embedded applications.

This prominent market position has increased ARM's resources and accelerated the rate at which new ARM-based developments appear.

The highlights of the last decade of ARM development include:

• the introduction of the novel compressed instruction format called 'Thumb' which reduces cost and power dissipation in small systems;

• significant steps upwards in performance with the ARM9, ARM10 and 'StrongARM' processor families;

• a state-of-the-art software development and debugging environment;

• a very wide range of embedded applications based around ARM processor cores.

Most of the principles of modern SoC and processor design are illustrated somewhere in the ARM family, and ARM has led the way in the introduction of some concepts (such as dynamically decompressing the instruction stream).
The inherent simplicity of the basic 3-stage pipeline ARM core makes it a good pedagogical introductory example to real processor design, whereas the debugging of a system based around an ARM core deeply embedded into a complex system chip represents the cutting edge of technological development today.

Book Structure

Chapter 1 starts with a refresher on first year undergraduate processor design material. It illustrates the principle of abstraction in hardware design by reviewing the roles of logic and gate-level representations. It then introduces the important concept of the Reduced Instruction Set Computer (RISC) as background for what follows, and closes with some comments on design for low power.

Chapter 2 describes the ARM processor architecture in terms of the concepts introduced in the previous chapter, and Chapter 3 is a gentle introduction to user-level assembly language programming and could be used in first year undergraduate teaching for this purpose.

Chapter 4 describes the organization and implementation of the 3- and 5-stage pipeline ARM processor cores at a level suitable for second year undergraduate teaching, and covers some implementation issues.

Chapters 5 and 6 go into the ARM instruction set architecture in increasing depth. Chapter 5 goes back over the instruction set in more detail than was presented in Chapter 3, including the binary representation of each instruction, and it penetrates more deeply into the corners of the instruction set. It is probably best read once and then used for reference. Chapter 6 backs off a bit to consider what a high-level language (in this case, C) really needs and how those needs are met by the ARM instruction set. This chapter is based on second year undergraduate material.

Chapter 7 introduces the 'Thumb' instruction set which is an ARM innovation to address the code density and power requirements of small embedded systems.
It is of peripheral interest to a generic study of computer science, but adds an interesting lateral perspective to a postgraduate course.

Chapter 8 raises the issues involved in debugging systems which use embedded processor cores and in the production testing of board-level systems. These issues are background to Chapter 9 which introduces a number of different ARM integer cores, broadening the theme introduced in Chapter 4 to include cores with 'Thumb', debug hardware, and more sophisticated pipeline operation.

Chapter 10 introduces the concept of memory hierarchy, discussing the principles of memory management and caches. Chapter 11 reviews the requirements of a modern operating system at a second year undergraduate level and describes the approach adopted by the ARM to address these requirements. Chapter 12 introduces the integrated ARM CPU cores (including StrongARM) that incorporate full support for memory management.

Chapter 13 covers the issues of designing SoCs with embedded processor cores. Here, the ARM is at the leading edge of technology. Several examples are presented of production embedded system chips to show the solutions that have been developed to the many problems inherent in committing a complex application-specific system to silicon.

Chapter 14 moves away from mainstream ARM developments to describe the asynchronous ARM-compatible processors and systems developed at the University of Manchester, England, during the 1990s. After a decade of research the AMULET technology is, at the time of writing, about to take its first step into the commercial domain. Chapter 14 concludes with a description of the DRACO SoC design, the first commercial application of a 32-bit asynchronous microprocessor.

A short appendix presents the fundamentals of computer logic design and the terminology which is used in Chapter 1.
A glossary of the terms used in the book and a bibliography for further reading are appended at the end of the book, followed by a detailed index.

Course relevance

The chapters are at an appropriate level for use on undergraduate courses as follows:

Year 1: Chapter 1 (basic processor design); Chapter 3 (assembly language programming); Chapter 5 (instruction binaries and reference for assembly language programming).

Year 2: Chapter 4 (simple pipeline processor design); Chapter 6 (architectural support for high-level languages); Chapters 10 and 11 (memory hierarchy and architectural support for operating systems).

Year 3: Chapter 8 (embedded system debug and test); Chapter 9 (advanced pipelined processor design); Chapter 12 (advanced CPUs); Chapter 13 (example embedded systems).

A postgraduate course could follow a theme across several chapters, such as processor design (Chapters 1, 2, 4, 9, 10 and 12), instruction set design (Chapters 2, 3, 5, 6, 7 and 11) or embedded systems (Chapters 2, 4, 5, 8, 9 and 13).

Chapter 14 contains material relevant to a third year undergraduate or advanced postgraduate course on asynchronous design, but a great deal of additional background material (not presented in this book) is also necessary.

Support material

Many of the figures and tables will be made freely available over the Internet for non-commercial use. The only constraint on such use is that this book should be a recommended text for any course which makes use of such material. Information about this and other support material may be found on the World Wide Web at:

http://www.cs.man.ac.uk/amulet/publications/books/ARMsysArch

Any enquiries relating to commercial use must be referred to the publishers. The assertion of the copyright for this book outlined on page iv remains unaffected.

Feedback

The author welcomes feedback on the style and content of this book, and details of any errors that are found.
Please email any such information to: [email protected]

Acknowledgements

Many people have contributed to the success of the ARM over the past decade. As a policy decision I have not named in the text the individuals with principal responsibilities for the developments described therein since the lists would be long and attempts to abridge them invidious. History has a habit of focusing credit on one or two high-profile individuals, often at the expense of those who keep their heads down to get the job done on time. However, it is not possible to write a book on the ARM without mentioning Sophie Wilson whose original instruction set architecture survives, extended but otherwise largely unscathed, to this day.

I would also like to acknowledge the support received from ARM Limited in giving access to their staff and design documentation, and I am grateful for the help I have received from ARM's semiconductor partners, particularly VLSI Technology, Inc., which is now wholly owned by Philips Semiconductors.

The book has been considerably enhanced by helpful comments from reviewers of draft versions. I am grateful for the sympathetic reception the drafts received and the direct suggestions for improvement that were returned. The publishers, Addison Wesley Longman Limited, have been very helpful in guiding my responses to these suggestions and in other aspects of authorship.

Lastly I would like to thank my wife, Valerie, and my daughters, Alison and Catherine, who allowed me time off from family duties to write this book.

Steve Furber
March 2000

Contents

Preface

An Introduction to Processor Design
    1.1 Processor architecture and organization
    1.2 Abstraction in hardware design
    1.3 MU0 - a simple processor
    1.4 Instruction set design
    1.5 Processor design trade-offs
    1.6 The Reduced Instruction Set Computer
    1.7 Design for low power consumption
    1.8 Examples and exercises

The ARM Architecture
    2.1 The Acorn RISC Machine
    2.2 Architectural inheritance
    2.3 The ARM programmer's model
    2.4 ARM development tools
    2.5 Example and exercises

ARM Assembly Language Programming
    3.1 Data processing instructions
    3.2 Data transfer instructions
    3.3 Control flow instructions
    3.4 Writing simple assembly language programs
    3.5 Examples and exercises

ARM Organization and Implementation
    4.1 3-stage pipeline ARM organization
    4.2 5-stage pipeline ARM organization
    4.3 ARM instruction execution
    4.4 ARM implementation
    4.5 The ARM coprocessor interface
    4.6 Examples and exercises

The ARM Instruction Set
    5.1 Introduction
    5.2 Exceptions
    5.3 Conditional execution
    5.4 Branch and Branch with Link (B, BL)
    5.5 Branch, Branch with Link and eXchange (BX, BLX)
    5.6 Software Interrupt (SWI)
    5.7 Data processing instructions
    5.8 Multiply instructions
    5.9 Count leading zeros (CLZ - architecture v5T only)
    5.10 Single word and unsigned byte data transfer instructions
    5.11 Half-word and signed byte data transfer instructions
    5.12 Multiple register transfer instructions
    5.13 Swap memory and register instructions (SWP)
    5.14 Status register to general register transfer instructions
    5.15 General register to status register transfer instructions
    5.16 Coprocessor instructions
    5.17 Coprocessor data operations
    5.18 Coprocessor data transfers
    5.19 Coprocessor register transfers
    5.20 Breakpoint instruction (BRK - architecture v5T only)
    5.21 Unused instruction space
    5.22 Memory faults
    5.23 ARM architecture variants
    5.24 Example and exercises

Architectural Support for High-Level Languages
    6.1 Abstraction in software design
    6.2 Data types
    6.3 Floating-point data types
    6.4 The ARM floating-point architecture
    6.5 Expressions
    6.6 Conditional statements
    6.7 Loops
    6.8 Functions and procedures
    6.9 Use of memory
    6.10 Run-time environment
    6.11 Examples and exercises

The Thumb Instruction Set
    7.1 The Thumb bit in the CPSR
    7.2 The Thumb programmer's model
    7.3 Thumb branch instructions
    7.4 Thumb software interrupt instruction
    7.5 Thumb data processing instructions
    7.6 Thumb single register data transfer instructions
    7.7 Thumb multiple register data transfer instructions
    7.8 Thumb breakpoint instruction
    7.9 Thumb implementation
    7.10 Thumb applications
    7.11 Example and exercises

Architectural Support for System Development
    8.1 The ARM memory interface
    8.2 The Advanced Microcontroller Bus Architecture (AMBA)
    8.3 The ARM reference peripheral specification
    8.4 Hardware system prototyping tools
    8.5 The ARMulator
    8.6 The JTAG boundary scan test architecture
    8.7 The ARM debug architecture
    8.8 Embedded Trace
    8.9 Signal processing support
    8.10 Example and exercises

ARM Processor Cores
    9.1 ARM7TDMI
    9.2 ARM8
    9.3 ARM9TDMI
    9.4 ARM10TDMI
    9.5 Discussion
    9.6 Example and exercises

Memory Hierarchy
    10.1 Memory size and speed
    10.2 On-chip memory
    10.3 Caches
    10.4 Cache design - an example
    10.5 Memory management
    10.6 Examples and exercises

Architectural Support for Operating Systems
    11.1 An introduction to operating systems
    11.2 The ARM system control coprocessor
    11.3 CP15 protection unit registers
    11.4 ARM protection unit
    11.5 CP15 MMU registers
    11.6 ARM MMU architecture
    11.7 Synchronization
    11.8 Context switching
    11.9 Input/Output
    11.10 Example and exercises

ARM CPU Cores
    12.1 The ARM710T, ARM720T and ARM740T
    12.2 The ARM810
    12.3 The StrongARM SA-110
    12.4 The ARM920T and ARM940T
    12.5 The ARM946E-S and ARM966E-S
    12.6 The ARM1020E
    12.7 Discussion
    12.8 Example and exercises

Embedded ARM Applications
    13.1 The VLSI Ruby II Advanced Communication Processor
    13.2 The VLSI ISDN Subscriber Processor
    13.3 The OneC™ VWS22100 GSM chip
    13.4 The Ericsson-VLSI Bluetooth Baseband Controller
    13.5 The ARM7500 and ARM7500FE
    13.6 The ARM7100
    13.7 The SA-1100
    13.8 Examples and exercises

The AMULET Asynchronous ARM Processors
    14.1 Self-timed design
    14.2 AMULET1
    14.3 AMULET2
    14.4 AMULET2e
    14.5 AMULET3
    14.6 The DRACO telecommunications controller
    14.7 A self-timed future?
    14.8 Example and exercises

Appendix: Computer Logic
Glossary
Bibliography
Index

An Introduction to Processor Design

Summary of chapter contents

The design of a general-purpose processor, in common with most engineering endeavours, requires the careful consideration of many trade-offs and compromises. In this chapter we will look at the basic principles of processor instruction set and logic design and the techniques available to the designer to help achieve the design objectives.

Abstraction is fundamental to understanding complex computers. This chapter introduces the abstractions which are employed by computer hardware designers, of which the most important is the logic gate. The design of a simple processor is presented, from the instruction set, through a register transfer level description, down to logic gates.

The ideas behind the Reduced Instruction Set Computer (RISC) originated in processor research programmes at Stanford and Berkeley universities around 1980, though some of the central ideas can be traced back to earlier machines.
In this chapter we look at the thinking that led to the RISC movement and consequently influenced the design of the ARM processor which is the subject of the following chapters.

With the rapid development of markets for portable computer-based products, the power consumption of digital circuits is of increasing importance. At the end of the chapter we will look at the principles of low-power high-performance design.