Fundamentals of Quantitative Design and Analysis
- Chapter 1 in Computer Architecture: A Quantitative Approach (6th ed.) by Hennessy and Patterson (2017)
Introduction
- Two significant changes in the computer marketplace
- Virtual elimination of assembly language programming
- The creation of standardized, vendor-independent operating systems
- UNIX, Linux, …
- New set of architectures with simpler instructions
- RISC (Reduced instruction set computer) architecture
- Two critical performance techniques:
- Instruction level parallelism
- Use of caches
- Challenges to 80x86 instructions
- ARM, becoming dominant
- Fourfold effect of the dramatic growth of the computer marketplace
- Significantly enhanced capability of personal users
- New classes of computers
- Personal computers
- Workstations
- Smart cell phones
- Tablet computers
- Warehouse computers
- Supercomputers
- Moore’s law-driven hardware renaissance
- Software development
- Performance-oriented languages: C, C++
- Managed programming languages: Java, Scala
- Scripting languages: JavaScript, Python
- Programming frameworks: AngularJS, Django
- Interpreters with just-in-time compilers
- Trace-based compiling
- Software as a Service: SaaS
- Internet
- Application: Speech, sound, images, videos
- Google translate running on warehouse-scale computer (WSC)
- End of hardware renaissance
- Dennard scaling: power density stays constant as transistor dimensions shrink
- Moore’s Law: the number of transistors on a microchip doubles every two years.
- Started to use multiple cores
- Instruction level parallelism:
- Compiler and hardware conspire to exploit ILP.
- No engagement of programmers
- Data level parallelism
- Thread level parallelism
- Request level parallelism for WSC
- Amdahl’s Law
- Prescribes practical limits on the number of cores per chip
- Thus, “The only path left to improve energy-performance-cost is specialization”
- Growth in processor performance since the late 1970s
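The Moore's Law bullet above (transistor count doubling every two years) can be turned into simple arithmetic. This is a minimal sketch; the function name and the default two-year doubling period are illustrative assumptions taken from the wording of the note, not measured data.

```python
def transistor_growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor implied by a fixed doubling period (idealized Moore's Law)."""
    return 2.0 ** (years / doubling_period_years)


# Ten years of doubling every two years is 5 doublings.
print(transistor_growth_factor(10))  # 32.0
# Twenty years is 10 doublings.
print(transistor_growth_factor(20))  # 1024.0
```

The exponential form is why even a modest slowdown in the doubling period has a large cumulative effect over a decade.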
Classes of Computers
Internet of Things (IoT), Embedded Computers
- 8- to 32-bit processors for low-cost devices (microwaves, washing machines)
- 64-bit processors for high-end products (cars, network switches)
Personal Mobile Devices (PMD)
- Responsiveness and predictability
- Real-time performance
- Minimize memory and energy consumption
Desktop Computing
- Benchmarking + Web-centric, interactive apps
Servers
- Availability: must operate 24 hours a day, 7 days a week
- Scalability: Scale up computing capacity, the memory, the storage, the I/O bandwidth
- Efficient throughput (Overall Perf.) > Responsiveness (Individual Perf.)
Clusters/Warehouse-Scale Computers
- The growth of Software-as-a-Service (SaaS)
- Clusters:
- Collection of desktop computers or servers connected by local area networks to act as a single larger computer
- Each node runs its own operating system, and nodes communicate using a networking protocol
- WSCs: The largest of the clusters
- Use inexpensive, redundant components compared to the servers
- Price-performance and power
- Availability: Peak hours for Christmas!
- Supercomputer
- Expensive; emphasis on floating-point performance
- Running large, communication-intensive batch programs
Classes of Parallelism and Parallel Architectures
- Parallelism in applications:
- Data-level parallelism
- Task-level parallelism
- Parallelism in computer hardware
- Instruction-level parallelism
- Pipelining, Speculative execution
- Vector architectures, graphic processing units (GPUs), multimedia instruction sets
- Data-level parallelism by applying a single instruction to a collection of data in parallel
- Thread-level parallelism
- DLP & TLP in a tightly coupled hardware model that allows interaction between parallel threads
- Request-level parallelism
- Parallelism between largely decoupled tasks
- Flynn’s (1966) taxonomy
- SISD: ILPs such as superscalar and speculative execution
- SIMD: Vector architectures, graphic processing units (GPUs), multimedia instruction sets
- MISD: e.g., systolic array
- MIMD: DLP & TLP & RLP, Tightly/loosely coupled multicore
Defining Computer Architecture
Instruction Set Architecture: The Myopic View of Computer Architecture
- ISA
- The actual programmer-visible instruction set
- A boundary between the software and hardware
- 80x86, ARMv8, RISC-V
- RISC-V
- A large set of registers
- Easy-to-pipeline instructions
- A lean set of operations
- Class of ISA
- General-purpose register architectures: Nearly all ISAs
- Register-memory ISAs (80x86)
- Load-store ISAs (ARMv8 & RISC-V)
- Memory addressing
- Mostly, byte addressing
- Some ISAs require objects to be aligned
- Addressing Mode
- Addressing modes specify the address of a memory object
- e.g., in RISC-V: Register / Immediate / Displacement
- Types and Sizes of operands
- ASCII, Unicode, INT, word, FP32, FP64 ….
- Operations
- Data transfer, arithmetic logical, control, floating point
- Control flow instructions
- Conditional branches, unconditional branches, jumps, procedure calls, and returns
- Encoding an ISA
- Fixed length vs Variable length
- RISC-V registers, names, usage, and calling conventions
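The memory addressing and addressing mode bullets above can be illustrated with a small sketch. It assumes displacement addressing computes the address as register contents plus a constant offset, and that an object of size s bytes at byte address A is aligned when A mod s = 0. The function names and example values are hypothetical.

```python
def effective_address(base_register_value: int, displacement: int) -> int:
    """Displacement addressing: address = register contents + constant offset."""
    return base_register_value + displacement


def is_aligned(address: int, size_bytes: int) -> bool:
    """An object of size_bytes at a byte address is aligned if address mod size == 0."""
    return address % size_bytes == 0


addr = effective_address(0x1000, 8)
print(hex(addr))             # 0x1008
print(is_aligned(addr, 8))   # True: 0x1008 is a multiple of 8
print(is_aligned(addr, 16))  # False: 0x1008 is not a multiple of 16
```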
Genuine Computer Architecture: Designing the Organization and Hardware to Meet Goals and Functional Requirements
- Implementation = Organization (Microarchitecture) + Hardware
- Architecture = ISA + Organization (Microarchitecture) + Hardware
- More information in Appendix A (Instruction Set Principles)
Trends in …
- Technology: IC logic technology, DRAM, Flash, Disk, Network
- Performance
- Scaling of transistors and wires
- Power and energy
- The shift in computer architecture because of limits of energy:
- Dark silicon
- Domain-specific processors
- Cost
- Time, volume, commoditization
- Cost of manufacturing vs operation
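For the power and energy trend listed above, a rough sketch helps show why voltage scaling matters. It assumes the usual first-order CMOS model, which these notes do not state: dynamic energy is proportional to capacitive load × voltage², and dynamic power is proportional to 1/2 × capacitive load × voltage² × switching frequency. Only ratios are computed, so the constants cancel; the function names are illustrative.

```python
def relative_dynamic_energy(voltage_scale: float) -> float:
    """Ratio of new to old dynamic energy when voltage is scaled (load unchanged)."""
    return voltage_scale ** 2


def relative_dynamic_power(voltage_scale: float, frequency_scale: float) -> float:
    """Ratio of new to old dynamic power when voltage and frequency are scaled."""
    return voltage_scale ** 2 * frequency_scale


# Example: reduce both voltage and frequency by 15%.
print(round(relative_dynamic_energy(0.85), 2))        # 0.72
print(round(relative_dynamic_power(0.85, 0.85), 2))   # 0.61
```

Under this model, a 15% reduction in voltage and frequency cuts dynamic power by roughly 39%, which is why the end of voltage scaling pushed designs toward dark silicon and domain-specific processors.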
Dependability
- Is a system operating properly?
- How does a failure at one level of a computer affect other levels, especially the application level?
- Service level agreements (SLAs) or Service level objectives (SLOs)
- Module reliability
- Mean time to failure (MTTF)
- Mean time to repair (MTTR)
- Module availability = MTTF/(MTTF+MTTR)
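The availability formula above can be checked with a quick calculation. This is a minimal sketch; the MTTF and MTTR values are made-up illustration numbers, not figures from the book.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Module availability = MTTF / (MTTF + MTTR)."""
    if mttf_hours < 0 or mttr_hours < 0 or mttf_hours + mttr_hours == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical module: fails once per 1,000,000 hours, takes 24 hours to restore.
a = availability(1_000_000, 24)
print(f"{a:.6f}")

# Halving the repair time raises availability without changing reliability.
print(availability(1_000_000, 12) > a)  # True
```

Availability can be improved either by raising MTTF (more reliable modules) or by lowering MTTR (faster recovery).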
Measuring, Reporting, and Summarizing
- Benchmarks
- Kernels: small, key pieces of real applications
- Toy programs
- Synthetic benchmarks
- Benchmark suites: Collections of benchmarks
- SPEC (Standard Performance Evaluation Corporation)
- Processor performance
- CPI: Clock cycles per instruction
- IPC: Instructions per clock
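CPI and IPC connect to execution time through the processor performance equation, CPU time = instruction count × CPI × clock cycle time. The notes above do not write this equation out, so treat the following as a sketch under that assumption; the workload numbers are hypothetical.

```python
def cpu_time_seconds(instruction_count: float, cpi: float, clock_rate_hz: float) -> float:
    """CPU time = instruction count * CPI / clock rate (clock cycle time = 1 / clock rate)."""
    return instruction_count * cpi / clock_rate_hz


def ipc_from_cpi(cpi: float) -> float:
    """IPC is the reciprocal of CPI."""
    return 1.0 / cpi


# Hypothetical program: 1e9 instructions, CPI of 2.0, on a 2 GHz clock.
print(cpu_time_seconds(1e9, 2.0, 2e9))  # 1.0
print(ipc_from_cpi(2.0))                # 0.5
```

The three factors are interdependent: a change that lowers instruction count may raise CPI, so only the product is a reliable measure.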
Quantitative Principles of Computer Design
- Take Advantage of Parallelism
- Principle of Locality
- Focus on Common Case
- Amdahl’s Law
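Amdahl's Law is easiest to see numerically. The sketch below assumes its standard form, overall speedup = 1 / ((1 - f) + f / s), where f is the fraction of execution time that benefits from the enhancement and s is the speedup of that fraction. The function and parameter names are illustrative.

```python
def amdahl_speedup(fraction_enhanced: float, speedup_enhanced: float) -> float:
    """Overall speedup when only part of the execution time is sped up."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if speedup_enhanced <= 0:
        raise ValueError("speedup_enhanced must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / speedup_enhanced)


# 40% of the time sped up 10x gives well under 2x overall.
print(round(amdahl_speedup(0.4, 10), 4))  # 1.5625

# With 90% of the work parallelizable, adding cores approaches but never reaches 10x.
for cores in (2, 8, 64, 1024):
    print(cores, round(amdahl_speedup(0.9, cores), 2))
```

The serial fraction sets a hard ceiling of 1 / (1 - f), which is the sense in which Amdahl's Law limits the useful number of cores per chip.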
Reference
- Computer Architecture: A Quantitative Approach (6th ed.) by Hennessy and Patterson (2017)
- Notebook: Computer Architecture Quantitive Approach