CS146

Computer Architecture

Spring 2004

 

David Brooks

Assistant Professor

Maxwell Dworkin 141
33 Oxford Street
Cambridge MA 02138

Phone: 617-495-3989
Fax: 617-495-2809

E-mail: dbrooks@eecs.harvard.edu

Syllabus

Meeting time

Monday/Wednesday 1:00-2:30PM, MD G135

Related Course

 

CS 246: Advanced Computer Architecture [Fall Semester]




Introduction

The class will review fundamental structures in modern microprocessor and computer system architecture design. Tentative topics include computer organization, instruction set design, memory system design, pipelining, and other techniques for exploiting parallelism. We will also cover system-level topics such as storage subsystems and the basics of multiprocessor systems. The class will focus on quantitative evaluation of design alternatives, weighing design metrics such as performance and power dissipation.

Prerequisites

CS 141 (Computing Hardware) or equivalent; C programming

Textbook

“Computer Architecture: A Quantitative Approach,” Third Edition,
John L. Hennessy and David A. Patterson, ISBN 1-55860-596-7

Course Readings

Lecture 1: Introduction to Computer Architecture

Class Notes

Lecture 2: CPU Performance and Metrics

Class Notes

Lecture 3: Instruction Set Architecture

Readings: Ruby B. Lee, "Subword Parallelism with MAX-2," IEEE Micro, 16(4), August 1996, pp. 51-59.

Class Notes

Homework 1

Lecture 4: Implementation and Pipelining

Class Notes

Lecture 5: Exceptions, Multi-cycle Ops, Dynamic Scheduling

Class Notes

Lecture 6: Scoreboarding Example, Tomasulo's Algorithm

Class Notes

Lecture 7: Dynamic Branch Prediction

Readings: Tse-Yu Yeh, Yale N. Patt, "A Comparison of Dynamic Branch Predictors that use Two Levels of Branch History," The 20th International Symposium on Computer Architecture, May 1993.

Class Notes

Lecture 8: Multiple Issue and Speculation

Class Notes

Readings: G. S. Sohi and S. Vajapeyam, "Instruction Issue Logic for High-performance, Interruptable Pipelined Processors," International Symposium on Computer Architecture, 1987.

Readings: J. E. Smith and A. Pleszkun, "Implementing Precise Interrupts in Pipelined Processors," IEEE Transactions on Computers, Volume 37, Issue 5 (May 1988).

Homework 2

Lecture 9: Limits of ILP, Case Studies

Class Notes

Readings: David W. Wall, "Limits of instruction-level parallelism," Architectural Support for Programming Languages and Operating Systems (ASPLOS) 1991. (Also, see updated Tech Report at Western Research Lab: http://research.compaq.com/wrl/techreports/abstracts/93.6.html)

Readings: Subbarao Palacharla, Norman P. Jouppi, James E. Smith, "Complexity-Effective Superscalar Processors," 24th International Symposium on Computer Architecture (ISCA-24), June 1997.

Readings: Eric Rotenberg, Steve Bennett, J. E. Smith, "Trace Cache: A Low Latency Approach to High Bandwidth Instruction Fetching," 29th International Symposium on Microarchitecture (MICRO-29), Dec 1996.

Lecture 10: Static Scheduling, Loop Unrolling, and Software Pipelining

Class Notes

Homework 3

Lecture 11: Software Pipelining and Global Scheduling

Class Notes

Sample Midterm from Fall 2002

Lecture 12: Hardware Assisted Software ILP and IA64/Itanium Case Study

Class Notes

Lecture 14: Introduction to Caches

Class Notes

Lecture 15: More on Caches

Class Notes

Homework 4

Lecture 16: More on Caches

Class Notes

Lecture 17: Main Memory

Class Notes

Ars Technica RAM Guide (Read Parts I, II, and III).

Lecture 18: Virtual Memory

Class Notes

Lecture 19: Multiprocessors

Class Notes

Homework 5

Lecture 20: More Multiprocessors

Class Notes

Readings: D. M. Tullsen, S. J. Eggers, and H. M. Levy, "Simultaneous Multithreading: Maximizing On-Chip Parallelism," 22nd Annual International Symposium on Computer Architecture, June 1995.

Readings: L. Barroso, K. Gharachorloo, R. McNamara, A. Nowatzyk, S. Qadeer, B. Sano, S. Smith, R. Stets, and B. Verghese, "Piranha: A Scalable Architecture Based on Single-Chip Multiprocessing," Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA'00), June 2000.

Lecture 21: Multithreading and I/O

Class Notes

Lecture 22: More I/O

Class Notes

Lecture 23: Clusters and Wrapup

Class Notes

Readings: L. Barroso, J. Dean, and U. Holzle, "Web search for a planet: The Google Cluster Architecture," IEEE Micro, 23(2), March-April 2003, pp. 22-28.