An Instruction Set Architecture (ISA) is a formal specification that defines the interface between software and hardware. It provides:

  • a set of instructions
  • registers
  • data types
  • memory addressing modes
  • other hardware features visible to software, which tools such as compilers rely on when generating code for the CPU.

Important

The ISA acts as the contract between software developers and hardware manufacturers, ensuring compatibility and portability of software across different CPU implementations that adhere to the same ISA.

How Does an ISA Work in Practice?

The ISA defines which operations a CPU can perform, but not how they are carried out. Each CPU has a microarchitecture that implements the ISA. Many modern CPUs, x86-64 processors in particular, use a hardware decoder to translate ISA instructions into micro-operations (µops), which are lower-level operations that the CPU's execution units can process directly.

Example

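For a register-to-register instruction like ADD RAX, RBX, the decoder typically produces a single micro-op that reads both source registers, performs the addition, and writes the result (and the flags) back. No memory loads are involved, since both operands are already in registers. An instruction with a memory operand, such as ADD RAX, [RBX], might instead be split into a load micro-op followed by an add micro-op.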
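
As a rough illustration of the decoding step, the sketch below maps a two-operand instruction to a list of micro-ops. The micro-op names (LOAD, STORE), the temporaries (tmp0, tmp1), and the way instructions are split are all hypothetical; real decoders are largely undocumented and differ between microarchitectures.

```python
# Toy model of an instruction decoder. The micro-op format and the splitting
# rules are illustrative assumptions, not the behavior of any real CPU.
from typing import List, Tuple

Uop = Tuple[str, ...]


def is_memory(operand: str) -> bool:
    """A bracketed operand such as [RBX] denotes a memory reference."""
    return operand.startswith("[") and operand.endswith("]")


def decode(instruction: str) -> List[Uop]:
    """Split 'OP dst, src' into a list of hypothetical micro-ops."""
    mnemonic, operands = instruction.split(maxsplit=1)
    dst, src = [part.strip() for part in operands.split(",")]
    uops: List[Uop] = []
    if is_memory(src):
        # A memory source is first loaded into a temporary.
        uops.append(("LOAD", "tmp0", src))
        src = "tmp0"
    if is_memory(dst):
        # Read-modify-write on memory: load, operate, store back.
        uops.append(("LOAD", "tmp1", dst))
        uops.append((mnemonic, "tmp1", "tmp1", src))
        uops.append(("STORE", dst, "tmp1"))
    else:
        # Register destination: a single micro-op is enough.
        uops.append((mnemonic, dst, dst, src))
    return uops


if __name__ == "__main__":
    for text in ("ADD RAX, RBX", "ADD RAX, [RBX]", "ADD [RBX], RAX"):
        print(text, "->", decode(text))
```

In this model a register-to-register ADD stays a single micro-op, while a memory operand adds separate load or store micro-ops.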

Key Components of an ISA

Instruction Set

The instruction set is a collection of operations that the CPU can execute, including:

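  • Arithmetic and Logical Instructions: ADD, SUB, MUL, DIV, AND, OR, XOR
  • Data Movement Instructions: MOV, PUSH, POP on x86-64; load/store ISAs such as ARM and RISC-V use dedicated load and store instructions (LDR/STR, LW/SW)
  • Comparison and Control Flow Instructions: CMP (sets the condition flags), JMP, JNE, CALL, RET
  • SIMD Instructions: vectorized operations provided by extensions such as SSE, AVX, and FMA (Fused Multiply-Add), e.g., ADDPS, VADDPS, VFMADD231PS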
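
To show how these categories combine in a program, here is a minimal sketch of an interpreter for a made-up subset of instructions (MOV, ADD, SUB, CMP, JNE). The tuple-based program format and the absolute jump targets are assumptions for illustration. Unlike real x86-64, only CMP updates the flag here, and register values are unbounded Python integers.

```python
# Toy interpreter for a tiny, hypothetical instruction subset.
from typing import Dict, List, Tuple

Instr = Tuple[str, ...]


def run(program: List[Instr]) -> Dict[str, int]:
    """Execute the program and return the final register values."""
    regs: Dict[str, int] = {}
    zero_flag = False
    pc = 0  # index of the next instruction to execute

    def value(operand: str) -> int:
        # An operand is either a known register name or an integer literal.
        return regs[operand] if operand in regs else int(operand)

    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "MOV":        # data movement
            regs[args[0]] = value(args[1])
        elif op == "ADD":      # arithmetic
            regs[args[0]] = regs[args[0]] + value(args[1])
        elif op == "SUB":
            regs[args[0]] = regs[args[0]] - value(args[1])
        elif op == "CMP":      # comparison: only sets the flag
            zero_flag = regs[args[0]] == value(args[1])
        elif op == "JNE":      # control flow: jump if the last CMP was unequal
            if not zero_flag:
                pc = int(args[0])
        else:
            raise ValueError(f"unknown instruction: {op}")
    return regs


# Sum the integers 5 + 4 + 3 + 2 + 1 with a loop.
SUM_LOOP: List[Instr] = [
    ("MOV", "RAX", "0"),    # 0: accumulator
    ("MOV", "RCX", "5"),    # 1: loop counter
    ("ADD", "RAX", "RCX"),  # 2: loop body
    ("SUB", "RCX", "1"),    # 3
    ("CMP", "RCX", "0"),    # 4
    ("JNE", "2"),           # 5: repeat while RCX != 0
]

if __name__ == "__main__":
    print(run(SUM_LOOP))  # {'RAX': 15, 'RCX': 0}
```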

Registers

Registers are fast, small storage locations within the CPU. The ISA defines which registers are available to software and how they can be used:

  • General-Purpose Registers (GPRs): For integer and pointer operations.
  • Floating-Point Registers: For floating-point arithmetic.
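  • SIMD/Vector Registers: For parallel data processing (on x86-64: XMM with SSE, YMM with AVX, ZMM with AVX-512).
  • Special-Purpose Registers: For dedicated functions, such as the instruction pointer (RIP) and the flags register (RFLAGS). On x86-64 the stack pointer (RSP) is formally a general-purpose register with a dedicated role, while the registers actually named control registers (such as CR0 and CR3) configure the processor and are reserved for privileged code.

MOV RAX, 10  ; RAX is an ISA-defined register
ADD RBX, RAX ; Uses architectural registers specified by x86-64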

Physical Registers

An out-of-order CPU typically contains far more physical registers than the architectural registers the ISA exposes to software. The hardware maps architectural registers onto physical ones through register renaming, which removes false dependencies (write-after-read and write-after-write), enables out-of-order execution, and reduces pipeline stalls.
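
The idea behind register renaming can be sketched with a simple mapping table. The physical register names (P0, P1, and so on), the pool size, and the instruction format below are hypothetical. Real hardware also returns physical registers to the free pool when instructions retire, which this sketch leaves out.

```python
# Toy register renamer: every write to an architectural register gets a
# fresh physical register, so later writes never clobber earlier values.
from typing import Dict, List, Tuple

ArchInstr = Tuple[str, Tuple[str, ...]]  # (destination, sources)
PhysInstr = Tuple[str, Tuple[str, ...]]


def rename(instructions: List[ArchInstr], num_physical: int = 8) -> List[PhysInstr]:
    """Rewrite architectural register names into physical register names."""
    free = [f"P{i}" for i in range(num_physical)]  # pool of unused registers
    mapping: Dict[str, str] = {}                   # architectural -> physical
    renamed: List[PhysInstr] = []

    def lookup(reg: str) -> str:
        # A source that was never written still needs a physical home.
        if reg not in mapping:
            mapping[reg] = free.pop(0)
        return mapping[reg]

    for dst, srcs in instructions:
        phys_srcs = tuple(lookup(s) for s in srcs)  # read the current mapping
        mapping[dst] = free.pop(0)                  # fresh register per write
        renamed.append((mapping[dst], phys_srcs))
    return renamed


PROGRAM: List[ArchInstr] = [
    ("RAX", ("RBX", "RCX")),  # RAX = RBX op RCX
    ("RDX", ("RAX", "RAX")),  # RDX = RAX op RAX (reads the first RAX)
    ("RAX", ("RBX", "RBX")),  # RAX written again: write-after-read on RAX
]

if __name__ == "__main__":
    for line in rename(PROGRAM):
        print(line)
```

After renaming, the third instruction writes P4 while the second still reads P2, so the two no longer conflict over RAX and could execute in either order.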

Data Types

The ISA specifies supported data types, including:

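  • Scalar types: Integers of fixed widths (8, 16, 32, 64 bits) and floating-point formats (typically IEEE 754 single and double precision). Language-level types such as characters are mapped onto these by the compiler.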
  • Vector types: For SIMD operations (e.g., 256-bit vectors in AVX2, 512-bit vectors in AVX-512).

Why the ISA Must Define Data Types

Although defining something like a 32-bit integer may seem straightforward, the ISA must unambiguously specify its size, the order of its bytes in memory (endianness, which matters for any multi-byte value), and how instructions interpret and manipulate those bits (for example, as signed two's-complement or unsigned values). Without this, different hardware implementations could vary in how they handle data, leading to incompatibility and unpredictable behavior. By defining these details, the ISA guarantees that software running on any CPU adhering to that ISA will interpret and operate on data types consistently.
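
To make this concrete, the short sketch below uses Python's built-in int.to_bytes and int.from_bytes to show that the same bytes yield different values depending on the byte order and signedness assumed. The helper names are made up for this example.

```python
# The same bit pattern means different things under different conventions.

def encode_u32(value: int, byteorder: str) -> bytes:
    """Store an unsigned 32-bit integer using the given byte order."""
    return value.to_bytes(4, byteorder=byteorder)


def decode_int(raw: bytes, byteorder: str, signed: bool) -> int:
    """Interpret raw bytes as an integer under the given conventions."""
    return int.from_bytes(raw, byteorder=byteorder, signed=signed)


if __name__ == "__main__":
    little = encode_u32(0x12345678, "little")
    big = encode_u32(0x12345678, "big")
    print(little.hex())  # 78563412: least significant byte first
    print(big.hex())     # 12345678: most significant byte first

    # Reading little-endian bytes as big-endian gives a different number.
    print(hex(decode_int(little, "big", signed=False)))  # 0x78563412

    # One byte with all bits set: 255 if unsigned, -1 if two's-complement.
    print(decode_int(b"\xff", "little", signed=False))  # 255
    print(decode_int(b"\xff", "little", signed=True))   # -1
```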

Memory Addressing Modes

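Addressing modes define how an instruction locates its operands, including:

  • Immediate Addressing: The value is encoded directly in the instruction (the 10 in MOV RAX, 10).
  • Register Addressing: The operand is the contents of a register (MOV RAX, RBX).
  • Direct/Absolute Addressing: The instruction encodes a specific memory address (MOV RAX, [0x1000]).
  • Register Indirect Addressing: A register holds the memory address (MOV RAX, [RBX]).
  • Base-plus-Displacement Addressing: A constant offset is added to a base register (MOV RAX, [RBX+4]).
  • Indexed Addressing: A (possibly scaled) index register is added to a base address (MOV RAX, [RBX+RCX*8]).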
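
On x86-64, memory operands follow the general form base + index*scale + displacement, with a scale of 1, 2, 4, or 8. The sketch below computes the effective address for a few operands under that form. The function name and the register-file dictionary are assumptions made for this example.

```python
# Effective address calculation: base + index * scale + displacement.
from typing import Dict, Optional

MASK_64 = 0xFFFFFFFFFFFFFFFF  # addresses wrap around at 64 bits


def effective_address(
    regs: Dict[str, int],
    base: Optional[str] = None,
    index: Optional[str] = None,
    scale: int = 1,
    disp: int = 0,
) -> int:
    """Return the address a memory operand refers to."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    address = disp
    if base is not None:
        address += regs[base]
    if index is not None:
        address += regs[index] * scale
    return address & MASK_64


if __name__ == "__main__":
    regs = {"RBX": 0x1000, "RCX": 3}
    print(hex(effective_address(regs, disp=0x2000)))                      # [0x2000]      -> 0x2000
    print(hex(effective_address(regs, base="RBX")))                       # [RBX]         -> 0x1000
    print(hex(effective_address(regs, base="RBX", disp=4)))               # [RBX+4]       -> 0x1004
    print(hex(effective_address(regs, base="RBX", index="RCX", scale=8))) # [RBX+RCX*8]   -> 0x1018
```

The scaled-index form is what makes array accesses cheap: with RBX holding the array base and RCX the element index, [RBX+RCX*8] addresses the element of an array of 8-byte values directly.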

How Do Compilers Use the ISA?

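  • Targeting Specific ISAs: Compilers such as GCC and Clang select the target architecture through flags and configuration (-march=x86-64, -march=armv8-a). MSVC uses its own option syntax (/arch) and a separate toolset per target architecture.
  • ISA Extensions: The compiler emits instructions from an extension only when that extension is enabled (-mavx2 allows AVX2 SIMD instructions). The resulting binary then requires a CPU that supports the extension.
  • Backends: A compiler toolchain contains a backend (code generator) for each ISA it supports. Clang typically ships several backends in one binary and selects one with --target, whereas GCC is normally built for a single target, so cross-compiling uses a separate GCC build.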
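
As a small sketch of how these flags fit together, the helper below assembles a GCC/Clang-style command line that emits assembly for a chosen ISA and set of extensions. The function name, the default driver name cc, and the file names are assumptions. The command is only constructed, not executed.

```python
# Build (but do not run) a GCC/Clang-style command line for a target ISA.
from typing import List, Sequence


def build_compile_command(
    source: str,
    output: str,
    march: str,
    extensions: Sequence[str] = (),
    driver: str = "cc",  # assumed driver name; could be gcc or clang
) -> List[str]:
    """Return the argument list for compiling `source` to assembly."""
    command = [driver, "-O2", f"-march={march}"]
    command += [f"-m{ext}" for ext in extensions]  # e.g. avx2 -> -mavx2
    command += ["-S", source, "-o", output]        # -S: stop after emitting assembly
    return command


if __name__ == "__main__":
    args = build_compile_command("kernel.c", "kernel.s", "x86-64", ["avx2"])
    print(" ".join(args))
```

Inspecting the generated assembly for the same source under different -march values is a practical way to see which instructions the compiler chooses for each ISA.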

Major ISAs in Use Today

CISC (Complex Instruction Set Computing)

  • x86-64: Used by Intel and AMD. Known for rich instructions and backward compatibility.

RISC (Reduced Instruction Set Computing)

  • ARM: Dominant in mobile devices. Known for low power consumption and simplicity.
  • RISC-V: An open, royalty-free ISA gaining traction for its modularity and flexibility.