LLVM


What LLVM Is

LLVM is a compiler infrastructure—a collection of reusable components for building compilers. It’s not a compiler itself, though it includes one (Clang). The name originally stood for “Low Level Virtual Machine” but that’s now considered a historical artifact; LLVM is just a name.

The core insight behind LLVM is separation of concerns in compiler design:

┌─────────────────────────────────────────────────────────────────┐
│                        Traditional Compilers                    │
│                                                                 │
│  C ──────────► x86                                              │
│  C++ ────────► x86     Each compiler is monolithic.             │
│  Fortran ────► x86     N languages × M targets = N×M compilers. │
│  C ──────────► ARM                                              │
│  C++ ────────► ARM                                              │
│  ...                                                            │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│                          LLVM Approach                          │
│                                                                 │
│  C ─────┐                  ┌────► x86                           │
│  C++ ───┼──► LLVM IR ──────┼────► ARM                           │
│  Rust ──┤   (common        ├────► RISC-V                        │
│  Swift ─┤   representation)├────► WebAssembly                   │
│  Julia ─┘                  └────► ...                           │
│                                                                 │
│  N languages + M targets = N+M components                       │
└─────────────────────────────────────────────────────────────────┘

Language frontends lower source code to LLVM IR (intermediate representation). The LLVM optimizer transforms IR into more efficient IR. Backends convert IR to machine code for specific targets. Each piece is independent—add a new frontend and you get all backends for free; add a new backend and all frontends can target it.
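The N+M argument can be sketched with a toy model. Everything here is hypothetical (the two mini "frontends", the tuple-based IR, the two "backends"); it only illustrates that any frontend can be paired with any backend once both agree on a shared representation.

```python
# Toy model: every frontend lowers to one shared IR; every backend consumes it.
# The "IR" is just a list of (opcode, dest, lhs, rhs) tuples (an assumption).

def c_frontend(src):
    # Hypothetical frontend: only understands "x = a + b".
    dest, expr = [s.strip() for s in src.split("=")]
    lhs, rhs = [s.strip() for s in expr.split("+")]
    return [("add", dest, lhs, rhs)]

def lisp_frontend(src):
    # Hypothetical frontend: only understands "(set x (+ a b))".
    tokens = src.replace("(", " ").replace(")", " ").split()
    _, dest, _, lhs, rhs = tokens
    return [("add", dest, lhs, rhs)]

def x86_backend(ir):
    return [f"lea {d}, [{a} + {b}]" for op, d, a, b in ir if op == "add"]

def arm_backend(ir):
    return [f"add {d}, {a}, {b}" for op, d, a, b in ir if op == "add"]

FRONTENDS = {"c": c_frontend, "lisp": lisp_frontend}
BACKENDS = {"x86": x86_backend, "arm": arm_backend}

def compile_source(lang, target, src):
    # Any frontend composes with any backend through the shared IR.
    return BACKENDS[target](FRONTENDS[lang](src))

# 2 frontends + 2 backends = 4 components, covering 2 x 2 = 4 pairings.
pairings = [(lang, target) for lang in FRONTENDS for target in BACKENDS]
```

Adding a third backend to BACKENDS would make it available to both frontends without touching either of them.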

LLVM IR

LLVM IR is a typed, low-level language that sits between source code and machine code. It’s architecture-independent but low-level enough to expose optimization opportunities.

# Generate LLVM IR from C
echo 'int add(int a, int b) { return a + b; }' | clang -S -emit-llvm -O0 -x c - -o -

define i32 @add(i32 %a, i32 %b) {
entry:
  %a.addr = alloca i32
  %b.addr = alloca i32
  store i32 %a, ptr %a.addr
  store i32 %b, ptr %b.addr
  %0 = load i32, ptr %a.addr
  %1 = load i32, ptr %b.addr
  %add = add nsw i32 %0, %1
  ret i32 %add
}

This is unoptimized IR—it allocates stack slots and loads/stores redundantly, mirroring the source structure. Key observations:

Types are explicit: i32 means 32-bit integer. LLVM IR is strongly typed with integers of arbitrary width (i1, i8, i32, i64, i128), floating point (float, double), pointers (ptr), vectors, and aggregates.

SSA form: Names starting with % are local values (globals such as @add start with @), and they are in Static Single Assignment form: each is assigned exactly once. This simplifies optimization by making data flow explicit. Unnamed temporaries get numbers such as %0 and %1.

Instructions are simple: Each instruction does one thing. add adds two integers. load reads from memory. store writes to memory. Complex source operations decompose into sequences of simple IR instructions.
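The single-assignment property is easy to check mechanically. Below is a minimal sketch that assumes a made-up tuple encoding of instructions (dest, opcode, operands); it is not how LLVM represents IR, only a way to see what "assigned exactly once" means.

```python
def ssa_violations(instructions):
    """Return names that are defined more than once.

    Each instruction is (dest, opcode, operands); dest is None for
    instructions such as store or ret that define no value.
    """
    seen, repeated = set(), []
    for dest, _opcode, _operands in instructions:
        if dest is None:
            continue
        if dest in seen:
            repeated.append(dest)
        seen.add(dest)
    return repeated

# The body of @add from the unoptimized listing above, in the toy encoding.
add_body = [
    ("%a.addr", "alloca", []),
    ("%b.addr", "alloca", []),
    (None, "store", ["%a", "%a.addr"]),
    (None, "store", ["%b", "%b.addr"]),
    ("%0", "load", ["%a.addr"]),
    ("%1", "load", ["%b.addr"]),
    ("%add", "add", ["%0", "%1"]),
    (None, "ret", ["%add"]),
]

# A sequence that reuses %x is not in SSA form.
not_ssa = [("%x", "add", ["%a", "%b"]), ("%x", "mul", ["%x", "%x"])]
```

Note that stores write to memory rather than defining a name, which is why unoptimized IR can mutate a stack slot repeatedly and still be valid SSA.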

With optimization:

echo 'int add(int a, int b) { return a + b; }' | clang -S -emit-llvm -O2 -x c - -o -

define i32 @add(i32 %a, i32 %b) {
  %1 = add nsw i32 %b, %a
  ret i32 %1
}

The optimizer eliminated all memory operations. Arguments arrive in %a and %b, the add instruction computes the sum, and ret returns it. This directly mirrors what the machine code will do.

Three Representations of IR

LLVM IR exists in three equivalent forms:

Form	Extension	Use Case
Human-readable text	`.ll`	Debugging, learning, manual inspection
Bitcode (binary)	`.bc`	Efficient storage, LTO, distribution
In-memory C++ objects	—	Used during compilation

# Generate text IR
clang -S -emit-llvm program.c -o program.ll
 
# Generate bitcode
clang -c -emit-llvm program.c -o program.bc
 
# Convert between them
llvm-as program.ll -o program.bc    # text → bitcode
llvm-dis program.bc -o program.ll   # bitcode → text

Bitcode is the format used for Link-Time Optimization (LTO). Instead of compiling to object files, the compiler emits bitcode. The linker then runs LLVM optimization passes across the entire program before generating machine code. This enables inlining and optimization across compilation unit boundaries.

The Compilation Pipeline

Optimization Passes

LLVM’s optimizer runs passes over the IR. Each pass performs a specific transformation. Passes compose—running them in sequence produces the overall optimization effect.
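To make "passes compose" concrete, here is a small sketch with two toy passes over a made-up tuple IR. The pass names mirror real LLVM concepts, but the code is an illustration under that assumed encoding, not LLVM's implementation.

```python
def constant_fold(ir):
    """Replace add/mul of known constants with the computed constant."""
    known, out = {}, []
    for dest, op, args in ir:
        args = [known.get(a, a) for a in args]   # substitute known constants
        if op in ("add", "mul") and all(isinstance(a, int) for a in args):
            known[dest] = args[0] + args[1] if op == "add" else args[0] * args[1]
            # Keep a "const" definition; a later pass may delete it.
            out.append((dest, "const", [known[dest]]))
        else:
            out.append((dest, op, args))
    return out

def dead_code_elim(ir):
    """Drop definitions whose result is never used (walks backwards)."""
    live, out = set(), []
    for dest, op, args in reversed(ir):
        if dest is None or dest in live:   # ret defines nothing: always kept
            out.append((dest, op, args))
            live.update(a for a in args if isinstance(a, str))
    return list(reversed(out))

def run_pipeline(ir, passes):
    """Run each pass on the output of the previous one."""
    for p in passes:
        ir = p(ir)
    return ir

program = [
    ("%t0", "add", [2, 3]),
    ("%t1", "mul", ["%t0", 4]),
    ("%unused", "add", ["%x", 1]),
    (None, "ret", ["%t1"]),
]
optimized = run_pipeline(program, [constant_fold, dead_code_elim])
# optimized == [(None, "ret", [20])]
```

Neither pass alone reaches that result: folding leaves dead definitions behind, and dead code elimination on its own cannot remove %t0 and %t1 because they are still used.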

# See which passes run at -O2 (the pass manager logs to stderr)
clang -O2 -fdebug-pass-manager -c program.c -o /dev/null 2>&1 | head -50

Some important optimization passes:

Pass	What It Does
`mem2reg`	Promotes stack allocations to SSA registers
`instcombine`	Combines/simplifies instruction sequences
`inline`	Inlines function calls
`gvn`	Global Value Numbering (eliminates redundant computations)
`licm`	Loop Invariant Code Motion (hoists loop-independent code)
`sccp`	Sparse Conditional Constant Propagation
`dce`	Dead Code Elimination
`loop-vectorize`	Converts scalar loops to SIMD operations

You can run passes manually with opt:

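# Generate unoptimized IR. At -O0 clang marks functions optnone, which makes
# opt skip them, so that attribute is disabled here.
echo 'int add(int a, int b) { return a + b; }' | clang -S -emit-llvm -O0 -Xclang -disable-O0-optnone -x c - -o add.ll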
 
# Run just mem2reg
opt -S -passes=mem2reg add.ll -o add-opt.ll
cat add-opt.ll

define i32 @add(i32 %a, i32 %b) {
entry:
  %add = add nsw i32 %a, %b
  ret i32 %add
}

The mem2reg pass alone eliminated all the alloca/load/store instructions by recognizing that the stack variables could be kept in SSA registers.
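The idea behind this promotion can be sketched for the simplest case, a single basic block. The tuple IR and the function name below are assumptions for illustration; the real pass also handles branches by inserting phi nodes, which this sketch does not attempt.

```python
def mem2reg_straight_line(ir):
    """Toy promotion for one basic block: remember the last value stored to
    each alloca slot and forward it to later loads from that slot."""
    slots = set()     # names defined by alloca
    current = {}      # slot -> value most recently stored
    replaced = {}     # load result -> forwarded value
    out = []
    for dest, op, args in ir:
        args = [replaced.get(a, a) for a in args]
        if op == "alloca":
            slots.add(dest)                       # slot disappears
        elif op == "store" and args[1] in slots:
            current[args[1]] = args[0]            # store disappears
        elif op == "load" and args[0] in slots:
            replaced[dest] = current[args[0]]     # load disappears
        else:
            out.append((dest, op, args))
    return out

# The -O0 body of @add in the toy encoding.
add_body = [
    ("%a.addr", "alloca", []),
    ("%b.addr", "alloca", []),
    (None, "store", ["%a", "%a.addr"]),
    (None, "store", ["%b", "%b.addr"]),
    ("%0", "load", ["%a.addr"]),
    ("%1", "load", ["%b.addr"]),
    ("%add", "add", ["%0", "%1"]),
    (None, "ret", ["%add"]),
]
promoted = mem2reg_straight_line(add_body)
```

The result has the same shape as the opt output above: one add on the arguments, followed by ret.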

The Optimizer Doesn’t Know Your Language

LLVM’s optimizer sees only IR. It doesn’t know whether the input was C, Rust, Swift, or anything else. Optimizations are purely IR transformations based on the semantics encoded in the IR itself.

Rust’s safety guarantees don’t directly help LLVM optimize. However, rustc encodes some of them in the IR as attributes and metadata (like noalias on mutable references), which tell LLVM about invariants the optimizer can exploit:

; Rust's &mut T guarantees no aliasing, expressed as noalias
define void @modify(ptr noalias %x, ptr noalias %y) {
  ...
}

The noalias attribute tells LLVM that %x and %y don’t overlap, enabling optimizations that wouldn’t be safe for C pointers (which might alias).
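A small model shows why the aliasing question matters. Here a Python list stands in for memory and indices stand in for pointers; both function names are hypothetical.

```python
def update_may_alias(mem, x, y):
    """x and y are indices into mem and may be equal, so mem[x] has to be
    read again after the write through y."""
    first = mem[x]
    mem[y] = 0
    return first + mem[x]

def update_assume_noalias(mem, x, y):
    """What an optimizer may produce when promised that x and y never
    overlap: the second load is replaced by the value already read."""
    first = mem[x]
    mem[y] = 0
    return first + first

# Distinct locations: both versions agree.
same_result = update_may_alias([5, 7], 0, 1) == update_assume_noalias([5, 7], 0, 1)

# Overlapping locations: the "optimized" version gives a different answer,
# which is why the optimizer needs the no-overlap guarantee first.
aliased = (update_may_alias([5, 7], 0, 0), update_assume_noalias([5, 7], 0, 0))
```

The second case is exactly what the noalias promise rules out, so the compiler that emits the attribute is responsible for it being true.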

Backend: IR to Machine Code

The backend converts IR to machine code for a specific target. This involves:

Instruction selection: Mapping IR operations to target instructions. The IR add i32 might become x86 add, ARM ADD, or RISC-V addw.
Register allocation: The IR has unlimited virtual registers; real CPUs have fixed register files. The allocator assigns virtual registers to physical ones, spilling to memory when necessary.
Instruction scheduling: Reordering instructions to avoid pipeline stalls and maximize throughput.
Target-specific optimization: Peephole optimizations for the specific CPU.
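Register allocation is the least intuitive of these steps, so here is a minimal sketch of one classic approach, linear scan over live intervals. This is an assumption-laden illustration (intervals are given as input, spilling picks the interval that ends last) and not LLVM's actual allocator.

```python
def linear_scan(intervals, num_regs):
    """intervals maps a virtual register to its (start, end) live range.

    Returns (assignment, spilled): assignment maps names to physical
    register numbers, spilled lists names that must live in memory.
    """
    free = list(range(num_regs))
    active = []                       # (end, name) pairs holding a register
    assignment, spilled = {}, []
    for name, (start, end) in sorted(intervals.items(), key=lambda kv: kv[1][0]):
        # Release registers whose live range ended before this one starts.
        for old_end, old_name in list(active):
            if old_end < start:
                active.remove((old_end, old_name))
                free.append(assignment[old_name])
        if free:
            assignment[name] = free.pop(0)
            active.append((end, name))
        else:
            # No register left: spill whichever live range ends last.
            last_end, last_name = max(active)
            if last_end > end:
                assignment[name] = assignment.pop(last_name)
                active.remove((last_end, last_name))
                active.append((end, name))
                spilled.append(last_name)
            else:
                spilled.append(name)
    return assignment, spilled

live_ranges = {"%a": (0, 3), "%b": (1, 2), "%c": (4, 6), "%d": (5, 7)}
two_regs = linear_scan(live_ranges, 2)   # everything fits
one_reg = linear_scan(live_ranges, 1)    # some values must be spilled
```

With two registers, %c and %d reuse the registers that %a and %b have finished with; with one register, overlapping ranges force spills.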

# See the journey from IR to assembly
echo 'int add(int a, int b) { return a + b; }' | clang -O2 -S -x c - -o -

add:
        leal    (%rdi,%rsi), %eax
        retq

The backend chose leal (LEA—load effective address) as an efficient way to add two registers into a third on x86-64. A different target produces different assembly:

# Same IR, targeting ARM64
echo 'int add(int a, int b) { return a + b; }' | clang -O2 -S --target=aarch64-linux-gnu -x c - -o -

add:
        add     w0, w0, w1
        ret

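The function body in the IR is the same in both cases (only module-level details such as the target triple and data layout differ); the backend accounts for the different assembly.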

Supported Targets

LLVM supports many targets out of the box:

llc --version

  Registered Targets:
    aarch64    - AArch64 (little endian)
    amdgcn     - AMD GCN GPUs
    arm        - ARM
    bpf        - BPF (host endian)
    hexagon    - Hexagon
    mips       - MIPS (32-bit big endian)
    nvptx64    - NVIDIA PTX 64-bit
    riscv64    - 64-bit RISC-V
    wasm32     - WebAssembly 32-bit
    x86        - 32-bit X86
    x86-64     - 64-bit X86
    ...

Cross-compilation means running the frontend and optimizer on your host, then selecting a different backend.

Language Frontends

Clang

Clang is the C, C++, and Objective-C frontend for LLVM. It parses source code, performs semantic analysis, and emits LLVM IR. Clang handles everything language-specific; LLVM handles everything target-specific.

┌─────────────────────────────────────────────────────────────────┐
│  Clang (Frontend)                                               │
│   ├── Lexer: Source text → tokens                               │
│   ├── Parser: Tokens → AST (Abstract Syntax Tree)               │
│   ├── Sema: Type checking, overload resolution, template        │
│   │         instantiation, semantic analysis                    │
│   ├── CodeGen: AST → LLVM IR                                    │
│   └── Driver: Orchestrates compilation, invokes linker          │
├─────────────────────────────────────────────────────────────────┤
│  LLVM (Middle-end + Backend)                                    │
│   ├── Optimizer: IR → optimized IR                              │
│   └── Backend: IR → machine code                                │
└─────────────────────────────────────────────────────────────────┘

Clang is a separate project from LLVM core, though they’re developed together and released in sync. You can use LLVM without Clang (as Rust does), and in principle Clang could target a different IR, though LLVM IR is what it emits in practice.
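The lexer, parser, and code generation stages in the diagram can be sketched for a tiny expression language. This is a toy under stated assumptions (only identifiers, integers, +, * and parentheses; everything is i32; temporaries are named %t0, %t1, ...) and has nothing to do with Clang's real implementation.

```python
import re

def lex(src):
    """Source text -> tokens (identifiers, integers, + * ( ))."""
    return re.findall(r"[A-Za-z_]\w*|\d+|[+*()]", src)

def parse(tokens):
    """Tokens -> AST, with * binding tighter than +."""
    pos = 0

    def atom():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            node = expr()
            pos += 1                  # skip ")"
            return node
        return tok

    def term():
        nonlocal pos
        node = atom()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("mul", node, atom())
        return node

    def expr():
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("add", node, term())
        return node

    return expr()

def codegen(ast):
    """AST -> IR-like lines, one fresh temporary per operation."""
    lines = []

    def emit(node):
        if isinstance(node, str):
            return node if node.isdigit() else "%" + node
        op, lhs, rhs = node
        a, b = emit(lhs), emit(rhs)
        tmp = f"%t{len(lines)}"
        lines.append(f"{tmp} = {op} i32 {a}, {b}")
        return tmp

    result = emit(ast)
    lines.append(f"ret i32 {result}")
    return lines

ir = codegen(parse(lex("a + b * 2")))
# ir == ["%t0 = mul i32 %b, 2", "%t1 = add i32 %a, %t0", "ret i32 %t1"]
```

Semantic analysis (the Sema stage) is skipped entirely here; in a real frontend it is where types are checked before any IR is emitted.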

# Clang's view of compilation stages
clang -ccc-print-phases hello.c

0: input, "hello.c", c
1: preprocessor, {0}, cpp-output
2: compiler, {1}, ir
3: backend, {2}, assembler
4: assembler, {3}, object
5: linker, {4}, image

Stages 0-2 are Clang. Stages 3-4 are LLVM backend. Stage 5 invokes the system linker.

Rustc

The Rust compiler (rustc) uses LLVM as its backend. Rust has its own frontend and intermediate representations:

┌─────────────────────────────────────────────────────────────────┐
│  rustc (Rust Frontend)                                          │
│   ├── Lexer/Parser: Source → AST                                │
│   ├── Name resolution, macro expansion                          │
│   ├── HIR: High-level IR (type checking happens here)           │
│   ├── MIR: Mid-level IR (borrow checking, Rust-specific opts)   │
│   ├── Codegen: MIR → LLVM IR                                    │
│   └── Driver: Orchestrates compilation, invokes linker          │
├─────────────────────────────────────────────────────────────────┤
│  LLVM                                                           │
│   ├── Optimizer: IR → optimized IR                              │
│   └── Backend: IR → machine code                                │
└─────────────────────────────────────────────────────────────────┘

HIR (High-level IR) is a desugared AST where type inference and trait resolution happen. MIR (Mid-level IR) is a control-flow graph used for borrow checking and Rust-specific optimizations. After MIR passes complete, rustc lowers to LLVM IR and hands off to LLVM.

# See Rust's MIR
rustc --emit=mir hello.rs -o hello.mir
 
# See the LLVM IR Rust generates
rustc --emit=llvm-ir hello.rs -o hello.ll

Rust ships with LLVM built-in—you don’t need a separate LLVM installation. Each Rust release bundles a specific LLVM version, sometimes with Rust-specific patches.

# Check which LLVM version rustc uses
rustc --version --verbose | grep LLVM

LLVM version: 17.0.6

Building a competitive optimizing compiler is enormously complex. By using LLVM, Rust gets world-class optimization passes, support for dozens of target architectures, and continuous improvements from the broader LLVM community. The tradeoff is compile time—LLVM is thorough but slow. Rust’s debug builds bypass most LLVM optimization (-C opt-level=0), and projects like Cranelift offer faster alternative backends for development builds.

Other Frontends

Many languages use LLVM as their backend:

Language	Frontend	Notes
C/C++/ObjC	Clang	Reference frontend
Rust	rustc	MIR → LLVM IR
Swift	swiftc	Apple’s language
Julia	Julia compiler	JIT compiled via LLVM
Kotlin Native	Kotlin/Native	LLVM backend for native compilation
Zig	Zig compiler	Also can emit LLVM IR
Haskell	GHC	Optional LLVM backend
Fortran	Flang	Modern Fortran frontend

Each frontend handles language semantics and lowers to LLVM IR. From there, compilation is identical.

The LLVM Ecosystem

Subprojects

LLVM is an umbrella for many related projects:

Project	Purpose
LLVM Core	Optimizer, code generators, IR
Clang	C/C++/ObjC frontend
libc++	C++ standard library
libc++abi	C++ ABI support library
compiler-rt	Compiler runtime (builtins, sanitizers)
LLD	LLVM’s linker
LLDB	Debugger
libunwind	Stack unwinding library
OpenMP	OpenMP runtime
Polly	Polyhedral loop optimizer
MLIR	Multi-Level IR (for ML compilers, DSLs)
Flang	Fortran frontend
llvm-libc	C standard library (in development)

These are developed together but can often be used independently. You might use Clang with GNU’s libstdc++ instead of libc++, or use LLD as a drop-in replacement for GNU ld with GCC.

compiler-rt

The compiler runtime provides functions the compiler assumes exist but that aren’t part of the C standard library:

┌─────────────────────────────────────────────────────────────────┐
│  compiler-rt (LLVM) / libgcc (GCC)                              │
│   ├── Builtins                                                  │
│   │    ├── Integer ops CPU lacks (__divti3, __multi3)           │
│   │    ├── Floating-point soft emulation                        │
│   │    └── Bit manipulation (__clzdi2, __popcountdi2)           │
│   ├── Sanitizers                                                │
│   │    ├── AddressSanitizer (ASan)                              │
│   │    ├── UndefinedBehaviorSanitizer (UBSan)                   │
│   │    ├── ThreadSanitizer (TSan)                               │
│   │    └── MemorySanitizer (MSan)                               │
│   ├── Profiling runtime                                         │
│   └── Platform-specific support                                 │
└─────────────────────────────────────────────────────────────────┘

When you divide two 128-bit integers:

__int128 divide(__int128 a, __int128 b) {
    return a / b;
}

x86-64 has no single instruction for 128-bit division. The compiler emits:

clang -S -O2 divide.c -o -

divide:
        jmp     __divti3        # tail call into the compiler runtime function

The __divti3 function lives in compiler-rt (or libgcc). It’s not part of libc—it’s compiler support code that the compiler assumes exists.
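To give a feel for what such a runtime helper has to do, here is a sketch of unsigned 128-bit division built only from 64-bit words, using shift-and-subtract long division. It is an illustration of the general technique, not the algorithm compiler-rt or libgcc actually uses, and it ignores the sign handling that __divti3 adds on top.

```python
MASK64 = (1 << 64) - 1

def udiv128(n_hi, n_lo, d_hi, d_lo):
    """Unsigned 128-bit division using only 64-bit words.

    Numerator, divisor and the returned quotient are (hi, lo) pairs.
    """
    if d_hi == 0 and d_lo == 0:
        raise ZeroDivisionError("division by zero")
    q_hi = q_lo = r_hi = r_lo = 0
    for i in range(127, -1, -1):
        # remainder = (remainder << 1) | bit i of the numerator
        bit = (n_hi >> (i - 64)) & 1 if i >= 64 else (n_lo >> i) & 1
        r_hi = ((r_hi << 1) | (r_lo >> 63)) & MASK64
        r_lo = ((r_lo << 1) | bit) & MASK64
        # if remainder >= divisor: subtract it and set quotient bit i
        if (r_hi, r_lo) >= (d_hi, d_lo):
            borrow = 1 if r_lo < d_lo else 0
            r_lo = (r_lo - d_lo) & MASK64
            r_hi = (r_hi - d_hi - borrow) & MASK64
            if i >= 64:
                q_hi |= 1 << (i - 64)
            else:
                q_lo |= 1 << i
    return q_hi, q_lo

def split(value):
    """128-bit integer -> (hi, lo) pair of 64-bit words."""
    return (value >> 64) & MASK64, value & MASK64

def join(hi, lo):
    """(hi, lo) pair -> 128-bit integer."""
    return (hi << 64) | lo

quotient = join(*udiv128(*split((1 << 100) + 5), *split(3)))
```

A loop of 128 iterations is far too much to inline at every division site, which is why the compiler emits a call into the runtime library instead.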

Tooling

LLVM includes standalone tools for working with IR and binaries:

Tool	Purpose
`clang`	C/C++/ObjC compiler
`opt`	IR optimizer (run passes manually)
`llc`	IR to assembly/object code
`llvm-as`	Text IR to bitcode
`llvm-dis`	Bitcode to text IR
`llvm-link`	Link multiple bitcode files
`lli`	LLVM IR interpreter/JIT
`llvm-nm`	Symbol table viewer
`llvm-objdump`	Disassembler
`llvm-readelf`	ELF reader
`llvm-ar`	Archive tool

These tools work on any platform LLVM supports and understand all LLVM target formats.

# Compile to bitcode
clang -c -emit-llvm program.c -o program.bc
 
# Inspect the IR
llvm-dis program.bc -o -
 
# Optimize it
opt -passes='default<O2>' program.bc -o program-opt.bc
 
# Compile to assembly for a specific target
llc -march=x86-64 program-opt.bc -o program.s
 
# Or compile to object file
llc -filetype=obj program-opt.bc -o program.o

Advanced Topics

Link-Time Optimization (LTO)

LTO defers optimization until link time, enabling whole-program analysis. Instead of compiling to object files, you compile to bitcode:

# Compile each file to bitcode
clang -c -flto file1.c -o file1.o
clang -c -flto file2.c -o file2.o
 
# "Link" - actually runs LLVM optimization then code generation
clang -flto file1.o file2.o -o program

The .o files contain LLVM bitcode, not machine code. At link time, LLVM loads all bitcode, runs optimization passes across the entire program, then generates machine code. This enables:

Cross-module inlining: Functions from file1.c can be inlined into file2.c
Whole-program devirtualization: Virtual calls with known targets become direct calls
Global dead code elimination: Unused functions can be removed even if they have external linkage, once the linker reports that nothing outside the program needs them
Interprocedural optimization: Constant propagation across function boundaries
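Whole-program dead code elimination, for instance, reduces to a reachability question once every module is visible at once. The sketch below uses a hypothetical call graph (the file and function names are made up) to show the idea.

```python
def reachable_functions(modules, roots):
    """modules maps a module name to {function: [callees]}.

    Returns the set of functions reachable from roots once all modules
    are merged into one call graph.
    """
    call_graph = {}
    for functions in modules.values():
        call_graph.update(functions)   # merge, as LTO merges bitcode
    live, worklist = set(), list(roots)
    while worklist:
        fn = worklist.pop()
        if fn in live:
            continue
        live.add(fn)
        worklist.extend(call_graph.get(fn, []))
    return live

# Hypothetical two-file program.
modules = {
    "file1.c": {"main": ["helper"], "debug_dump": ["helper"]},
    "file2.c": {"helper": [], "legacy_api": ["helper"]},
}
live = reachable_functions(modules, roots=["main"])
dead = {fn for fns in modules.values() for fn in fns} - live
```

Compiling file2.c on its own, the compiler has to keep legacy_api because some other file might call it; only with the merged view does it become provably dead.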

Rust enables LTO in Cargo.toml:

[profile.release]
lto = true          # "fat" LTO - full cross-crate optimization
# lto = "thin"      # Faster but less thorough

ThinLTO is a scalable variant that achieves most of the benefits with better parallelism and caching. In Cargo it is selected with lto = "thin".

JIT Compilation

LLVM supports Just-In-Time compilation—generating machine code at runtime:

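// Simplified example using LLVM's C API (MCJIT); error handling omitted
// Headers: llvm-c/Core.h, llvm-c/ExecutionEngine.h, llvm-c/Target.h
LLVMLinkInMCJIT();                  // make sure the JIT is linked in
LLVMInitializeNativeTarget();       // register the host target
LLVMInitializeNativeAsmPrinter();   // needed to emit machine code
LLVMModuleRef module = LLVMModuleCreateWithName("my_jit");
// ... build IR programmatically ...
char *error = NULL;                 // receives a message on failure
LLVMExecutionEngineRef engine;
LLVMCreateJITCompilerForModule(&engine, module, 2, &error);  // 2 = opt level
int (*fn)(int, int) = (int (*)(int, int))LLVMGetFunctionAddress(engine, "add");
int result = fn(3, 4);  // Execute JIT-compiled code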

Julia uses this extensively—code is compiled to LLVM IR at runtime, optimized, and executed. This enables Julia’s “compile what you use” model where functions are specialized for the types they’re called with.
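The "specialize per argument types" idea can be modeled without any real code generation. In this sketch the class name and the stand-in _compile step are assumptions; a real JIT would emit and optimize IR for the given types at that point.

```python
class SpecializingFunction:
    """Caches one 'compiled' version of a function per tuple of argument types."""

    def __init__(self, fn):
        self.fn = fn
        self.cache = {}           # (type, ...) -> specialized callable
        self.compile_count = 0

    def _compile(self, arg_types):
        # Stand-in for "emit IR for these types, optimize, generate code".
        self.compile_count += 1
        return self.fn

    def __call__(self, *args):
        key = tuple(type(a) for a in args)
        if key not in self.cache:
            self.cache[key] = self._compile(key)   # first call pays the cost
        return self.cache[key](*args)

@SpecializingFunction
def add(a, b):
    return a + b

add(1, 2)
add(3, 4)        # reuses the (int, int) specialization
add(1.5, 2.5)    # triggers a second specialization for (float, float)
```

After these three calls compile_count is 2, which mirrors the pattern where the first call with new types is slow and later calls are fast.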

Languages like Python could theoretically JIT-compile hot paths via LLVM, but CPython doesn’t. Projects like Numba add LLVM JIT compilation to Python for numerical code:

from numba import jit
 
@jit(nopython=True)
def sum_array(arr):
    total = 0
    for x in arr:
        total += x
    return total

Numba compiles this to LLVM IR, optimizes it, and generates native machine code—orders of magnitude faster than interpreted Python.

Practical Usage

Inspecting the Full Pipeline

Trace a C function through every stage:

# Source
echo 'int square(int x) { return x * x; }' > square.c
 
# Preprocessed (macros expanded, includes resolved)
clang -E square.c -o square.i
 
# LLVM IR (unoptimized)
clang -S -emit-llvm -O0 square.c -o square-O0.ll
 
# LLVM IR (optimized)
clang -S -emit-llvm -O2 square.c -o square-O2.ll
 
# Assembly (target-specific)
clang -S -O2 square.c -o square.s
 
# Object file (machine code + metadata)
clang -c -O2 square.c -o square.o
 
# Compare sizes
wc -l square-O0.ll square-O2.ll square.s

  12 square-O0.ll
   6 square-O2.ll
   8 square.s

For Rust:

echo 'pub fn square(x: i32) -> i32 { x * x }' > square.rs
 
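# square.rs has no main, so build it as a library crate.

# HIR (Rust high-level IR); -Z flags require a nightly toolchain
rustc -Z unpretty=hir --crate-type=lib square.rs 2>/dev/null

# MIR (Rust mid-level IR)
rustc --crate-type=lib --emit=mir square.rs -o square.mir

# LLVM IR
rustc --crate-type=lib --emit=llvm-ir -O square.rs -o square.ll

# Assembly
rustc --crate-type=lib --emit=asm -O square.rs -o square.s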

LLVM vs GCC

GCC (GNU Compiler Collection) is the other major open-source compiler infrastructure. Both produce excellent optimized code, but they differ architecturally:

Aspect	LLVM	GCC
IR	LLVM IR (well-documented, stable)	GIMPLE/RTL (internal, less accessible)
License	Apache 2.0 with LLVM exception	GPL 3.0 or later, with the GCC Runtime Library Exception
Modularity	Library-first design, reusable components	Monolithic, harder to embed
Frontends	Clang, Rust, Swift, Julia, etc.	C, C++, Fortran, Ada, Go, D
Embedding	Easy to embed (JIT, tooling)	Designed as standalone compiler

GCC has been around longer (1987 vs 2003) and supports some targets and language features LLVM doesn’t. LLVM’s cleaner architecture makes it easier to build tools on top of—this is why so many new languages chose LLVM.

The license matters for commercial users. GCC’s GPL requires derivative works of the compiler itself to be GPL-licensed (programs merely compiled with GCC are not affected). LLVM’s permissive license allows proprietary use, which is why Apple invested heavily in LLVM and why it’s popular in commercial toolchains.
