Lecture from: 24.02.2023 | Video: YT

Data-Centric Computing

The Data Bottleneck

Modern computing is increasingly bottlenecked by data movement rather than raw computational power. We can generate data far faster than we can process it efficiently. Data-intensive workloads are prevalent and require rapid processing of massive datasets. Estimates suggest that 60-90% of total system energy is consumed by data movement (fetching, storing, and transferring data between memory and processing units). This highlights the need for architectures that minimize data transfer overhead.

Emerging Computing Paradigms

The limitations of traditional architectures have spurred research into novel computing paradigms, often involving rethinking the entire computing stack:

  • Processing-in-Memory (PIM) and Processing-Near-Data (PND): Moving computation closer to where the data resides to reduce data transfer distances and energy consumption.
  • Neuromorphic Computing: Inspired by the brain’s architecture, using analog or mixed-signal circuits to emulate neural networks for efficient pattern recognition and learning.
  • Quantum Computing: Utilizing quantum mechanics to perform computations that are intractable for classical computers, potentially revolutionizing fields like cryptography and drug discovery.
  • Fundamentally Secure and Dependable Computers: Designing architectures with inherent security mechanisms to prevent vulnerabilities and ensure reliable operation.

New Accelerators and Algorithm-Hardware Co-Design

  • AI and ML Accelerators: Specialized hardware for accelerating deep learning and other machine learning algorithms.
  • Graph and Data Analytics, Vision, and Video Accelerators: Optimizing hardware for tasks involving large graphs, image processing, and video encoding/decoding.
  • Genome Analysis Accelerators: Accelerating the computationally intensive tasks involved in DNA sequencing and analysis.

Advances in Memory, Storage, Interconnects, and Devices

  • Non-Volatile Main Memory (NVMM): Replacing traditional DRAM with persistent memory technologies for faster boot times and data persistence.
  • Intelligent Memory Systems: Incorporating processing capabilities within memory modules to enable data filtering and aggregation.
  • Quantum Memory: Developing memory technologies that can store and manipulate quantum information.
  • High-Speed Interconnects: Designing faster and more efficient interconnects for communication between different components in a system.
  • Disaggregated Systems: Decoupling compute, memory, and storage resources and connecting them via high-bandwidth interconnects, allowing for flexible resource allocation and scaling.

A Heterogeneous Landscape

Currently, there is no clear consensus on the optimal computing architecture for all tasks. The field can be characterized as a heterogeneous soup of competing ideas. These diverse paradigms will be explored and evaluated, and over time, dominant approaches are likely to emerge. This is an exciting and dynamic period for computer architecture research and development.

Principled Design in Computer Architecture

Principled design begins with fundamental building blocks and design principles, along with a deep understanding of how to utilize, apply, and enhance them. While underlying technologies evolve, methods for leveraging those technologies often share similarities, and design methodologies are rooted in established principles.

Basic Building Blocks of a Computer System

  • Electrons: The fundamental charge carriers.
  • Transistors: The basic switching elements.
  • Logic Gates: Building blocks of digital circuits (AND, OR, NOT, etc.).
  • Combinational Logic Circuits: Circuits whose outputs are solely determined by their current inputs (e.g., adders, multiplexers).
  • Sequential Logic Circuits: Circuits whose outputs depend on both current and past inputs, incorporating memory elements.
    • Storage Elements and Memory: Latches, flip-flops, SRAM, DRAM.
  • (Increasing Complexity)
  • Cores: Processing units consisting of arithmetic logic units (ALUs), control units, and registers.
  • Caches: Small, fast memory used to store frequently accessed data.
  • Interconnect: On-chip and off-chip communication networks.
  • Memories: DRAM, SRAM, Flash memory, and storage devices.
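
The difference between combinational and sequential circuits in the list above can be illustrated with a small behavioral sketch. The names `mux2` and `DLatch` are hypothetical and chosen only for this illustration. The code models logic values as the integers 0 and 1, not real hardware.

```python
def mux2(sel: int, a: int, b: int) -> int:
    """Combinational: the output depends only on the current inputs."""
    return b if sel == 1 else a


class DLatch:
    """Sequential: the output also depends on a stored bit (past inputs)."""

    def __init__(self) -> None:
        self.q = 0  # stored state, assumed to start at 0

    def step(self, enable: int, d: int) -> int:
        # While enable is 1 the latch follows d; otherwise it holds its value.
        if enable == 1:
            self.q = d
        return self.q


latch = DLatch()
print(mux2(0, 1, 0), mux2(1, 1, 0))         # same inputs always give same outputs
print(latch.step(1, 1), latch.step(0, 0))   # second call still returns the stored 1
```

Calling `mux2` twice with the same arguments always returns the same value, while the result of `latch.step(0, 0)` depends on what was stored earlier.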

Defining a Computer

At the highest level of abstraction, in line with the von Neumann view of a machine that stores and processes data, a computer can be conceptually divided into three essential components:

  • Computation: Performing arithmetic and logical operations.
  • Communication: Transferring data between different components.
  • Storage/Memory: Storing data and instructions.

General-Purpose vs. Special-Purpose Systems

The spectrum of computer systems ranges from general-purpose CPUs to specialized ASICs (Application-Specific Integrated Circuits), with GPUs (Graphics Processing Units) and FPGAs (Field-Programmable Gate Arrays) occupying intermediate positions.

Transistors: The Foundation

Transistors are the fundamental building blocks of modern computers. These tiny, relatively simple structures are used in vast quantities to implement complex logic and memory functions. The number of transistors integrated into a single chip continues to increase rapidly, driving improvements in performance and energy efficiency.

This lecture will focus on understanding how the MOS (Metal-Oxide-Semiconductor) transistor operates as a logic element.

MOS Transistor Composition

MOS transistors are constructed by combining:

  • Conductors (Metal): Used for electrical connections.
  • Insulators (Oxide): Used to isolate different parts of the transistor.
  • Semiconductors: Materials such as silicon whose conductivity lies between that of conductors and insulators and can be controlled by an electric field.

Instead of delving into the detailed physics of transistor operation, we will abstract the transistor as a simple switch that can be turned on and off. This simplified model allows us to understand how transistors can be used to implement logic gates and build more complex digital circuits.

MOS Transistor Types

There are two main types of MOS transistors: n-type MOS (nMOS) and p-type MOS (pMOS).

Both nMOS and pMOS transistors act as electronically controlled switches. Think of a lamp connected to a power source through a switch. The state of the switch (on or off) controls whether the lamp is illuminated.

The fundamental difference between nMOS and pMOS transistors lies in the voltage polarity required to activate (turn on) or deactivate (turn off) the switch. This difference dictates how they are used in digital circuits.

nMOS Transistors

An nMOS transistor conducts (is “on”) when a sufficiently high voltage is applied to its gate terminal. When the gate voltage is low (or zero), the transistor is non-conducting (“off”).

  • High Gate Voltage (Logic 1): Switch closed, circuit connected.
  • Low Gate Voltage (Logic 0): Switch open, circuit disconnected.

pMOS Transistors

A pMOS transistor operates in the opposite manner. It conducts when a sufficiently low voltage is applied to its gate terminal. When the gate voltage is high, the transistor is non-conducting.

  • Low Gate Voltage (Logic 0): Switch closed, circuit connected.
  • High Gate Voltage (Logic 1): Switch open, circuit disconnected.
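
The switch abstraction for both transistor types can be written down as a minimal sketch. The function names `nmos_conducts` and `pmos_conducts` are hypothetical, and gate voltages are reduced to the logic values 0 and 1, ignoring thresholds and all analog behavior.

```python
def nmos_conducts(gate: int) -> bool:
    """nMOS switch: closed (conducting) when the gate is at logic 1."""
    return gate == 1


def pmos_conducts(gate: int) -> bool:
    """pMOS switch: closed (conducting) when the gate is at logic 0."""
    return gate == 0


for gate in (0, 1):
    n_state = "on" if nmos_conducts(gate) else "off"
    p_state = "on" if pmos_conducts(gate) else "off"
    print(f"gate={gate}: nMOS {n_state}, pMOS {p_state}")
```

For any valid gate value, exactly one of the two transistor types conducts, which is the property that complementary circuits rely on.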

CMOS Logic

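Modern digital circuits predominantly employ Complementary MOS (CMOS) technology, which combines nMOS and pMOS transistors so that, in steady state, only one of the two groups conducts. As a result, a CMOS gate draws very little power while its output is not switching, and it also offers good noise immunity and switching speed.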

CMOS Inverter (NOT Gate)

The inverter is the simplest logic gate in CMOS technology.

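The circuit consists of a pMOS transistor and an nMOS transistor connected in series between the supply voltage and ground: the pMOS transistor sits between the supply and the output, and the nMOS transistor sits between the output and ground. The input signal drives the gates of both transistors, and the output is taken from the node where the two transistors meet.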

  • Input = High (Logic 1, e.g., 3V): The nMOS transistor turns ON, pulling the output down to ground (0V). The pMOS transistor turns OFF.
  • Input = Low (Logic 0, e.g., 0V): The pMOS transistor turns ON, pulling the output up to the supply voltage (3V). The nMOS transistor turns OFF.

Therefore, the output is always the logical inverse of the input.
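
The two cases above can be checked with a small switch-level sketch. The function name `cmos_inverter` and the variables `pull_up` and `pull_down` are hypothetical labels for this illustration, and logic levels stand in for the actual voltages.

```python
def cmos_inverter(a: int) -> int:
    """Switch-level model of a CMOS inverter for a logic input of 0 or 1."""
    if a not in (0, 1):
        raise ValueError("input must be logic 0 or 1")
    pull_up = (a == 0)    # pMOS between supply and output conducts on a low gate
    pull_down = (a == 1)  # nMOS between output and ground conducts on a high gate
    # Exactly one path conducts, so the output is never floating or shorted.
    assert pull_up != pull_down
    return 1 if pull_up else 0


for a in (0, 1):
    print(f"input={a} -> output={cmos_inverter(a)}")
```

The internal assertion documents why the circuit works: the supply and ground are never connected to the output at the same time.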

CMOS NAND Gate

A NAND (NOT AND) gate is another fundamental logic gate.

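The CMOS NAND gate consists of two pMOS transistors connected in parallel between the supply voltage and the output, and two nMOS transistors connected in series between the output and ground. Each input drives the gate of one pMOS transistor and one nMOS transistor.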

Input A | Input B | Output
0       | 0       | 1
0       | 1       | 1
1       | 0       | 1
1       | 1       | 0

The output is LOW (0) only when both inputs A and B are HIGH (1). Otherwise, the output is HIGH (1).
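
The parallel and series arrangement maps directly onto OR and AND conditions on the switches, as the following sketch suggests. The name `cmos_nand` and its internal variables are hypothetical, and the model works on logic values only.

```python
def cmos_nand(a: int, b: int) -> int:
    """Switch-level model of a two-input CMOS NAND gate."""
    if a not in (0, 1) or b not in (0, 1):
        raise ValueError("inputs must be logic 0 or 1")
    # Two pMOS in parallel: the output is pulled up if either gate is low.
    pull_up = (a == 0) or (b == 0)
    # Two nMOS in series: the output is pulled down only if both gates are high.
    pull_down = (a == 1) and (b == 1)
    assert pull_up != pull_down  # exactly one network conducts
    return 1 if pull_up else 0


for a in (0, 1):
    for b in (0, 1):
        print(a, b, cmos_nand(a, b))
```

The loop prints one row per input combination, in the same order as the truth table.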

CMOS AND Gate

An AND gate is the logical complement of a NAND gate. It can be constructed by inverting the output of a NAND gate using an inverter.
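
This composition can be sketched at the logic level as follows. The function names `nand`, `inverter`, and `and_gate` are hypothetical, and each gate is modeled only by its truth table, not by its transistors.

```python
def nand(a: int, b: int) -> int:
    """Two-input NAND on logic values 0 and 1."""
    return 0 if (a == 1 and b == 1) else 1


def inverter(a: int) -> int:
    """NOT on a logic value of 0 or 1."""
    return 1 - a


def and_gate(a: int, b: int) -> int:
    """AND built by feeding the NAND output through an inverter."""
    return inverter(nand(a, b))


for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b))
```

Built this way, the AND gate uses the four transistors of the NAND gate plus the two of the inverter, six in total.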

Continue here: 03 Combinational Logic 2