Computer Organization and Design RISC-V PDF

Title: Computer Organization and Design RISC-V
Author: Vignesh Ramanathan
Pages: 1,050
File Size: 83.9 MB
File Type: PDF
Total Downloads: 840
Total Views: 962

Summary

In Praise of Computer Organization and Design: The Hardware/Software Interface “Textbook selection is often a frustrating act of compromise—pedagogy, content coverage, quality of exposition, level of rigor, cost. Computer Organization and Design is the rare book that hits all the right notes across the board, without compromise...”


Description


Computer Organization and Design RISC-V
Vignesh Ramanathan

Related papers


In Praise of Computer Organization and Design: The Hardware/Software Interface, ARM® Ed… (Somith Das)
Fundamentals of Computer Organization and Architecture by Mostafa (Rehab Abdelwahab)
Computer architecture (Farhi Kiani)

In Praise of Computer Organization and Design: The Hardware/Software Interface

“Textbook selection is often a frustrating act of compromise—pedagogy, content coverage, quality of exposition, level of rigor, cost. Computer Organization and Design is the rare book that hits all the right notes across the board, without compromise. It is not only the premier computer organization textbook, it is a shining example of what all computer science textbooks could and should be.”
—Michael Goldweber, Xavier University

“I have been using Computer Organization and Design for years, from the very first edition. This new edition is yet another outstanding improvement on an already classic text. The evolution from desktop computing to mobile computing to Big Data brings new coverage of embedded processors such as the ARM, new material on how software and hardware interact to increase performance, and cloud computing. All this without sacrificing the fundamentals.”
—Ed Harcourt, St. Lawrence University

“To Millennials: Computer Organization and Design is the computer architecture book you should keep on your (virtual) bookshelf. The book is both old and new, because it develops venerable principles—Moore’s Law, abstraction, common case fast, redundancy, memory hierarchies, parallelism, and pipelining—but illustrates them with contemporary designs.”
—Mark D. Hill, University of Wisconsin-Madison

“The new edition of Computer Organization and Design keeps pace with advances in emerging embedded and many-core (GPU) systems, where tablets and smartphones will/are quickly becoming our new desktops. This text acknowledges these changes, but continues to provide a rich foundation of the fundamentals in computer organization and design which will be needed for the designers of hardware and software that power this new class of devices and systems.”
—Dave Kaeli, Northeastern University

“Computer Organization and Design provides more than an introduction to computer architecture. It prepares the reader for the changes necessary to meet the ever-increasing performance needs of mobile systems and big data processing at a time that difficulties in semiconductor scaling are making all systems power constrained. In this new era for computing, hardware and software must be co-designed and system-level architecture is as critical as component-level optimizations.”
—Christos Kozyrakis, Stanford University

“Patterson and Hennessy brilliantly address the issues in ever-changing computer hardware architectures, emphasizing on interactions among hardware and software components at various abstraction levels. By interspersing I/O and parallelism concepts with a variety of mechanisms in hardware and software throughout the book, the new edition achieves an excellent holistic presentation of computer architecture for the post-PC era. This book is an essential guide to hardware and software professionals facing energy efficiency and parallelization challenges in Tablet PC to Cloud computing.”
—Jae C. Oh, Syracuse University


RISC-V EDITION

Computer Organization and Design
THE HARDWARE/SOFTWARE INTERFACE

David A. Patterson is the Pardee Professor of Computer Science, Emeritus at the University of California at Berkeley, which he joined after graduating from UCLA in 1977. His teaching has been honored by the Distinguished Teaching Award from the University of California, the Karlstrom Award from ACM, and the Mulligan Education Medal and Undergraduate Teaching Award from IEEE. Patterson received the IEEE Technical Achievement Award and the ACM Eckert-Mauchly Award for contributions to RISC, and he shared the IEEE Johnson Information Storage Award for contributions to RAID. He also shared the IEEE John von Neumann Medal and the C & C Prize with John Hennessy. Like his coauthor, Patterson is a Fellow of the American Academy of Arts and Sciences, the Computer History Museum, ACM, and IEEE, and he was elected to the National Academy of Engineering, the National Academy of Sciences, and the Silicon Valley Engineering Hall of Fame. He served on the Information Technology Advisory Committee to the US President, as chair of the CS division in the Berkeley EECS department, as chair of the Computing Research Association, and as President of ACM. This record led to Distinguished Service Awards from ACM, CRA, and SIGARCH. At Berkeley, Patterson led the design and implementation of RISC I, likely the first VLSI reduced instruction set computer, and the foundation of the commercial SPARC architecture. He was a leader of the Redundant Arrays of Inexpensive Disks (RAID) project, which led to dependable storage systems from many companies. He was also involved in the Network of Workstations (NOW) project, which led to cluster technology used by Internet companies and later to cloud computing. These projects earned four dissertation awards from ACM. His current research projects are Algorithm-Machine-People and Algorithms and Specializers for Provably Optimal Implementations with Resilience and Efficiency. The AMP Lab is developing scalable machine learning algorithms, warehouse-scale-computer-friendly programming models, and crowd-sourcing tools to gain valuable insights quickly from big data in the cloud. The ASPIRE Lab uses deep hardware and software co-tuning to achieve the highest possible performance and energy efficiency for mobile and rack computing systems.

John L. Hennessy is a Professor of Electrical Engineering and Computer Science at Stanford University, where he has been a member of the faculty since 1977 and was, from 2000 to 2016, its tenth President. Hennessy is a Fellow of the IEEE and ACM; a member of the National Academy of Engineering, the National Academy of Sciences, and the American Philosophical Society; and a Fellow of the American Academy of Arts and Sciences. Among his many awards are the 2001 Eckert-Mauchly Award for his contributions to RISC technology, the 2001 Seymour Cray Computer Engineering Award, and the 2000 John von Neumann Award, which he shared with David Patterson. He has also received seven honorary doctorates. In 1981, he started the MIPS project at Stanford with a handful of graduate students. After completing the project in 1984, he took a leave from the university to cofound MIPS Computer Systems (now MIPS Technologies), which developed one of the first commercial RISC microprocessors. As of 2006, over 2 billion MIPS microprocessors have been shipped in devices ranging from video games and palmtop computers to laser printers and network switches. Hennessy subsequently led the DASH (Directory Architecture for Shared Memory) project, which prototyped the first scalable cache coherent multiprocessor; many of the key ideas have been adopted in modern multiprocessors. In addition to his technical activities and university responsibilities, he has continued to work with numerous start-ups, both as an early-stage advisor and an investor.

RISC-V EDITION

Computer Organization and Design
THE HARDWARE/SOFTWARE INTERFACE

David A. Patterson, University of California, Berkeley
John L. Hennessy, Stanford University

RISC-V updates and contributions by
Andrew S. Waterman, SiFive, Inc.
Yunsup Lee, SiFive, Inc.

Additional contributions by
Perry Alexander, The University of Kansas
Peter J. Ashenden, Ashenden Designs Pty Ltd
Jason D. Bakos, University of South Carolina
Jichuan Chang, Google
Javier Diaz Bruguera, Universidade de Santiago de Compostela
Matthew Farrens, University of California, Davis
David Kaeli, Northeastern University
Nicole Kaiyan, University of Adelaide
David Kirk, NVIDIA
Zachary Kurmas, Grand Valley State University
James R. Larus, School of Computer and Communications Science at EPFL
Jacob Leverich, Stanford University
Kevin Lim, Hewlett-Packard
Eric Love, University of California, Berkeley
John Nickolls, NVIDIA
John Y. Oliver, Cal Poly, San Luis Obispo
Milos Prvulovic, Georgia Tech
Partha Ranganathan, Google
Mark Smotherman, Clemson University

Morgan Kaufmann is an imprint of Elsevier
50 Hampshire Street, 5th Floor, Cambridge, MA 02139, United States

Copyright © 2018 Elsevier Inc. All rights reserved.

No part of this publication may be reproduced or transmitted in any form or by any means, electronic or mechanical, including photocopying, recording, or any information storage and retrieval system, without permission in writing from the publisher. Details on how to seek permission, further information about the Publisher’s permissions policies and our arrangements with organizations such as the Copyright Clearance Center and the Copyright Licensing Agency, can be found at our website: www.elsevier.com/permissions.

This book and the individual contributions contained in it are protected under copyright by the Publisher (other than as may be noted herein).

Notices
Knowledge and best practice in this field are constantly changing. As new research and experience broaden our understanding, changes in research methods, professional practices, or medical treatment may become necessary. Practitioners and researchers must always rely on their own experience and knowledge in evaluating and using any information, methods, compounds, or experiments described herein. In using such information or methods they should be mindful of their own safety and the safety of others, including parties for whom they have a professional responsibility. To the fullest extent of the law, neither the Publisher nor the authors, contributors, or editors, assume any liability for any injury and/or damage to persons or property as a matter of products liability, negligence or otherwise, or from any use or operation of any methods, products, instructions, or ideas contained in the material herein.

RISC-V and the RISC-V logo are registered trademarks managed by the RISC-V Foundation, used under permission of the RISC-V Foundation. All rights reserved. This publication is independent of the RISC-V Foundation, which is not affiliated with the publisher and the RISC-V Foundation does not authorize, sponsor, endorse or otherwise approve this publication.

All material relating to ARM® technology has been reproduced with permission from ARM Limited, and should only be used for education purposes. All ARM-based models shown or referred to in the text must not be used, reproduced or distributed for commercial purposes, and in no event shall purchasing this textbook be construed as granting you or any third party, expressly or by implication, estoppel or otherwise, a license to use any other ARM technology or know how. Materials provided by ARM are copyright © ARM Limited (or its affiliates).

British Library Cataloguing-in-Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloging-in-Publication Data
A catalog record for this book is available from the Library of Congress

ISBN: 978-0-12-812275-4

For information on all Morgan Kaufmann publications visit our website at https://www.elsevier.com/books-and-journals

Publisher: Katey Birtcher
Acquisition Editor: Steve Merken
Development Editor: Nate McFadden
Production Project Manager: Lisa Jones
Designer: Victoria Pearson Esser
Typeset by MPS Limited, Chennai, India

To Linda, who has been, is, and always will be the love of my life

ACKNOWLEDGMENTS

Figures 1.7, 1.8 Courtesy of iFixit (www.ifixit.com).

Figure 1.9 Courtesy of Chipworks (www.chipworks.com).

Figures 1.10.1, 1.10.2, 4.15.2 Courtesy of the Charles Babbage Institute, University of Minnesota Libraries, Minneapolis.

Figures 1.10.3, 4.15.1, 4.15.3, 5.12.3, 6.14.2 Courtesy of IBM.

Figure 1.10.4 Courtesy of Cray Inc.

Figure 1.10.5 Courtesy of Apple Computer, Inc.

Figure 1.10.6 Courtesy of the Computer History Museum.

Figure 1.13 Courtesy of Intel.

Figures 5.17.1, 5.17.2 Courtesy of Museum of Science, Boston.

Figure 5.17.4 Courtesy of MIPS Technologies, Inc.

Figure 6.15.1 Courtesy of NASA Ames Research Center.

Contents

Preface  xv

CHAPTERS

1  Computer Abstractions and Technology  2
1.1 Introduction  3
1.2 Eight Great Ideas in Computer Architecture  11
1.3 Below Your Program  13
1.4 Under the Covers  16
1.5 Technologies for Building Processors and Memory  24
1.6 Performance  28
1.7 The Power Wall  40
1.8 The Sea Change: The Switch from Uniprocessors to Multiprocessors  43
1.9 Real Stuff: Benchmarking the Intel Core i7  46
1.10 Fallacies and Pitfalls  49
1.11 Concluding Remarks  52
1.12 Historical Perspective and Further Reading  54
1.13 Exercises  54

2  Instructions: Language of the Computer  60
2.1 Introduction  62
2.2 Operations of the Computer Hardware  63
2.3 Operands of the Computer Hardware  67
2.4 Signed and Unsigned Numbers  74
2.5 Representing Instructions in the Computer  81
2.6 Logical Operations  89
2.7 Instructions for Making Decisions  92
2.8 Supporting Procedures in Computer Hardware  98
2.9 Communicating with People  108
2.10 RISC-V Addressing for Wide Immediates and Addresses  113
2.11 Parallelism and Instructions: Synchronization  121
2.12 Translating and Starting a Program  124
2.13 A C Sort Example to Put it All Together  133
2.14 Arrays versus Pointers  141
2.15 Advanced Material: Compiling C and Interpreting Java  144
2.16 Real Stuff: MIPS Instructions  145
2.17 Real Stuff: x86 Instructions  146
2.18 Real Stuff: The Rest of the RISC-V Instruction Set  155
2.19 Fallacies and Pitfalls  157
2.20 Concluding Remarks  159
2.21 Historical Perspective and Further Reading  162
2.22 Exercises  162

3  Arithmetic for Computers  172
3.1 Introduction  174
3.2 Addition and Subtraction  174
3.3 Multiplication  177
3.4 Division  183
3.5 Floating Point  191
3.6 Parallelism and Computer Arithmetic: Subword Parallelism  216
3.7 Real Stuff: Streaming SIMD Extensions and Advanced Vector Extensions in x86  217
3.8 Going Faster: Subword Parallelism and Matrix Multiply  218
3.9 Fallacies and Pitfalls  222
3.10 Concluding Remarks  225
3.11 Historical Perspective and Further Reading  227
3.12 Exercises  227

4  The Processor  234
4.1 Introduction  236
4.2 Logic Design Conventions  240
4.3 Building a Datapath  243
4.4 A Simple Implementation Scheme  251
4.5 An Overview of Pipelining  262
4.6 Pipelined Datapath and Control  276
4.7 Data Hazards: Forwarding versus Stalling  294
4.8 Control Hazards  307
4.9 Exceptions  315
4.10 Parallelism via Instructions  321
4.11 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Pipelines  334
4.12 Going Faster: Instruction-Level Parallelism and Matrix Multiply  342
4.13 Advanced Topic: An Introduction to Digital Design Using a Hardware Design Language to Describe and Model a Pipeline and More Pipelining Illustrations  345
4.14 Fallacies and Pitfalls  345
4.15 Concluding Remarks  346
4.16 Historical Perspective and Further Reading  347
4.17 Exercises  347

5  Large and Fast: Exploiting Memory Hierarchy  364
5.1 Introduction  366
5.2 Memory Technologies  370
5.3 The Basics of Caches  375
5.4 Measuring and Improving Cache Performance  390
5.5 Dependable Memory Hierarchy  410
5.6 Virtual Machines  416
5.7 Virtual Memory  419
5.8 A Common Framework for Memory Hierarchy  443
5.9 Using a Finite-State Machine to Control a Simple Cache  449
5.10 Parallelism and Memory Hierarchy: Cache Coherence  454
5.11 Parallelism and Memory Hierarchy: Redundant Arrays of Inexpensive Disks  458
5.12 Advanced Material: Implementing Cache Controllers  459
5.13 Real Stuff: The ARM Cortex-A53 and Intel Core i7 Memory Hierarchies  459
5.14 Real Stuff: The Rest of the RISC-V System and Special Instructions  464
5.15 Going Faster: Cache Blocking and Matrix Multiply  465
5.16 Fallacies and Pitfalls  468
5.17 Concluding Remarks  472
5.18 Historical Perspective and Further Reading  473
5.19 Exercises  473

6  Parallel Processors from Client to Cloud  490
6.1 Introduction  492
6.2 The Difficulty of Creating Parallel Processing Programs  494
6.3 SISD, MIMD, SIMD, SPMD, and Vector  499
6.4 Hardware Multithreading  506
6.5 Multicore and Other Shared Memory Multiprocessors  509
6.6 Introduction to Graphics Processing Units  514
6.7 Clusters, Warehouse Scale Computers, and Other Message-Passing Multiprocessors  521
6.8 Introduction to Multiprocessor Network Topologies  526
6.9 Communicating to the Outside World: Cluster Networking  529
6.10 Multiprocessor Benchmarks and Performance Models  530
6.11 Real Stuff: Benchmarking and Rooflines of the Intel Core i7 960 and the NVIDIA Tesla GPU  540
6.12 Going Faster: Multiple Processors and Matrix Multiply  545
6.13 Fallacies and Pitfalls  548
6.14 Concluding Remarks  550
6.15 Historical Perspective and Further Reading  553
6.16 Exercises  553

APPENDIX

A  The Basics of Logic Design  A-2
A.1 Introduction  A-3
A.2 Gates, Truth Tables, and Logic Equations  A-4
A.3 Combinational Logic  A-9
A.4 Using a Hardware Description Language  A-20
A.5 Constructing a Basic Arithmetic Logic Unit  A-26
A.6 Faster Addition: Carry Lookahead  A-37
A.7 Clocks  A-47
A.8 Memory Elements: Flip-Flops, Latches, and Registers  A-49
A.9 Memory Elements: SRAMs and DRAMs  A-57
A.10 Finite-State Machines  A-66
A.11 Timing Methodologies  A-71
A.12 Field Programmable Devices  A-77
A.13 Concluding Remarks  A-78
A.14 Exercises  A-79

Index  I-1

ONLINE CONTENT

B  Graphics and Computing GPUs  B-2
B.1 Introduction  B-3
B.2 GPU System Architectures  B-7
B.3 Programming GPUs  B-12
B.4 Multithreaded Multiprocessor Architecture  B-25
B.5 Parallel Memory System  B-36
B.6 Floating Point Arithmetic  B-41
B.7 Real Stuff: The NVIDIA GeForce 8800  B-46
B.8 Real Stuff: Mapping Applications to GPUs  B-55
B.9 Fallacies and Pitfalls  B-72
B.10 Concluding Remarks  B-76
B.11 Historical Perspective and Further Reading  B-77

C  Mapping Control to Hardware  C-2
C.1 Introduction  C-3
C.2 Implementing Combinational Control Units  C-4
C.3 Implementing Finite-State Machine Control  C-8
C.4 Implementing the Next-State Function with a Sequencer  C-22
C.5 Translating a Microprogram to Hardware  C-28
C.6 Concluding Remarks  C-32
C.7 Exercises  C-33

D  A Survey of RISC Architectures for Desktop, Server, and Embedded Computers  D-2
D.1 Introduction  D-3
D.2 Addressing Modes and Instruction Formats  D-5
D.3 Instructions: the MIPS Core Subset  D-9
D.4 Instructions: Multimedia Extensions of the Desktop/Server RISCs  D-16
D.5 Instructions: Digital Signal-Processing Extensions of the Embedded RISCs  D-19
D.6 Instructions: Common Extensions to MIPS Core  D-20
D.7 Instructions Unique to MIPS-64  D-25
D.8 Instructions Unique to Alpha  D-27
D.9 Instructions Unique to SPARC v9  D-29
D.10 Instructions Unique to PowerPC  D-32
D.11 Instructions Unique to PA-RISC 2.0  D-34
D.12 Instructions Un...

