You’re right. The note is 20+ sections all at the same level, which makes it hard to see the conceptual groupings. Here’s a restructured version with hierarchy:
What LLVM Is
LLVM is a compiler infrastructure—a collection of reusable components for building compilers. It’s not a compiler itself, though it includes one (Clang). The name originally stood for “Low Level Virtual Machine” but that’s now considered a historical artifact; LLVM is just a name.
The core insight behind LLVM is separation of concerns in compiler design:
┌─────────────────────────────────────────────────────────────────┐
│ Traditional Compilers │
│ │
│ C ──────────► x86 │
│ C++ ────────► x86 Each compiler is monolithic. │
│ Fortran ────► x86 N languages × M targets = N×M compilers. │
│ C ──────────► ARM │
│ C++ ────────► ARM │
│ ... │
└─────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────┐
│ LLVM Approach │
│ │
│ C ─────┐ ┌────► x86 │
│ C++ ───┼──► LLVM IR ──────┼────► ARM │
│ Rust ──┤ (common ├────► RISC-V │
│ Swift ─┤ representation)├────► WebAssembly │
│ Julia ─┘ └────► ... │
│ │
│ N languages + M targets = N+M components │
└─────────────────────────────────────────────────────────────────┘
Language frontends lower source code to LLVM IR (intermediate representation). The LLVM optimizer transforms IR into more efficient IR. Backends convert IR to machine code for specific targets. Each piece is independent—add a new frontend and you get all backends for free; add a new backend and all frontends can target it.
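The N + M argument can be made concrete with a toy model. The sketch below is not LLVM: the two "frontends", the tuple-shaped IR, and the two "backends" are all hypothetical stand-ins, chosen only to show that every frontend composes with every backend once they agree on a shared representation.

```python
# Toy model of the N + M design. Every name here is hypothetical;
# the "IR" is just a tuple (op, lhs, rhs).

def lower_infix(src):
    """'Frontend' for infix source such as '2 + 3'."""
    lhs, op, rhs = src.split()
    return (op, int(lhs), int(rhs))

def lower_prefix(src):
    """'Frontend' for prefix source such as '(+ 2 3)'."""
    op, lhs, rhs = src.strip("()").split()
    return (op, int(lhs), int(rhs))

def emit_x86_like(ir):
    """'Backend' producing x86-flavoured text from the shared IR."""
    op, lhs, rhs = ir
    mnemonic = {"+": "addl", "*": "imull"}[op]
    return [f"movl ${lhs}, %eax", f"{mnemonic} ${rhs}, %eax"]

def emit_arm_like(ir):
    """'Backend' producing AArch64-flavoured text from the shared IR."""
    op, lhs, rhs = ir
    mnemonic = {"+": "add", "*": "mul"}[op]
    return [f"mov w0, #{lhs}", f"mov w1, #{rhs}", f"{mnemonic} w0, w0, w1"]

FRONTENDS = {"infix": lower_infix, "prefix": lower_prefix}
BACKENDS = {"x86-like": emit_x86_like, "arm-like": emit_arm_like}

def compile_source(language, target, src):
    # Any frontend can feed any backend because both agree on the IR.
    return BACKENDS[target](FRONTENDS[language](src))
```

Two frontends plus two backends are four components, yet `compile_source` covers all four language/target pairs. Adding a third entry to `FRONTENDS` would immediately work with both backends.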
LLVM IR
LLVM IR is a typed, low-level language that sits between source code and machine code. It’s architecture-independent but low-level enough to expose optimization opportunities.
# Generate LLVM IR from C
echo 'int add(int a, int b) { return a + b; }' | clang -S -emit-llvm -O0 -x c - -o -

define i32 @add(i32 %a, i32 %b) {
entry:
  %a.addr = alloca i32
  %b.addr = alloca i32
  store i32 %a, ptr %a.addr
  store i32 %b, ptr %b.addr
  %0 = load i32, ptr %a.addr
  %1 = load i32, ptr %b.addr
  %add = add nsw i32 %0, %1
  ret i32 %add
}
This is unoptimized IR: it allocates stack slots and loads/stores redundantly, mirroring the source structure. Key observations:
Types are explicit: i32 means 32-bit integer. LLVM IR is strongly typed with integers of arbitrary width (i1, i8, i32, i64, i128), floating point (float, double), pointers (ptr), vectors, and aggregates.
SSA form: Variables starting with % are in Static Single Assignment form—each is assigned exactly once. This simplifies optimization by making data flow explicit. The %0, %1 naming shows temporaries.
Instructions are simple: Each instruction does one thing. add adds two integers. load reads from memory. store writes to memory. Complex source operations decompose into sequences of simple IR instructions.
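The single-assignment property is easy to check mechanically. The following is a minimal sketch, assuming a hypothetical encoding of instructions as `(dest, opcode, operands)` tuples and a single basic block; it is not how LLVM's verifier is implemented.

```python
def ssa_violations(params, instructions):
    """Report names assigned twice or used before definition.

    Assumes straight-line code (one basic block); real SSA checking
    across branches needs dominance information.
    """
    defined = set(params)
    problems = []
    for dest, opcode, operands in instructions:
        for operand in operands:
            if operand.startswith("%") and operand not in defined:
                problems.append(f"{operand} used before definition")
        if dest is not None:
            if dest in defined:
                problems.append(f"{dest} assigned more than once")
            defined.add(dest)
    return problems

# The unoptimized @add body from above, in the toy encoding.
ADD_O0 = [
    ("%a.addr", "alloca", []),
    ("%b.addr", "alloca", []),
    (None, "store", ["%a", "%a.addr"]),
    (None, "store", ["%b", "%b.addr"]),
    ("%0", "load", ["%a.addr"]),
    ("%1", "load", ["%b.addr"]),
    ("%add", "add", ["%0", "%1"]),
    (None, "ret", ["%add"]),
]

# A sequence that reassigns %x, which SSA forbids.
NOT_SSA = [
    ("%x", "add", ["%a", "%b"]),
    ("%x", "add", ["%x", "%a"]),
]
```

Note that the stack slots are what let the unoptimized IR stay in SSA form: the memory behind `%a.addr` can be written many times even though the name `%a.addr` is defined once.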
With optimization:
echo 'int add(int a, int b) { return a + b; }' | clang -S -emit-llvm -O2 -x c - -o -

define i32 @add(i32 %a, i32 %b) {
  %1 = add nsw i32 %b, %a
  ret i32 %1
}
The optimizer eliminated all memory operations. Arguments arrive in %a and %b, the add instruction computes the sum, and ret returns it. This directly mirrors what the machine code will do.
Three Representations of IR
LLVM IR exists in three equivalent forms:
| Form | Extension | Use Case |
|---|---|---|
| Human-readable text | .ll | Debugging, learning, manual inspection |
| Bitcode (binary) | .bc | Efficient storage, LTO, distribution |
| In-memory C++ objects | — | Used during compilation |
# Generate text IR
clang -S -emit-llvm program.c -o program.ll
# Generate bitcode
clang -c -emit-llvm program.c -o program.bc
# Convert between them
llvm-as program.ll -o program.bc # text → bitcode
llvm-dis program.bc -o program.ll # bitcode → text
Bitcode is the format used for Link-Time Optimization (LTO). Instead of compiling to object files, the compiler emits bitcode. The linker then runs LLVM optimization passes across the entire program before generating machine code. This enables inlining and optimization across compilation unit boundaries.
The Compilation Pipeline
Optimization Passes
LLVM’s optimizer runs passes over the IR. Each pass performs a specific transformation. Passes compose—running them in sequence produces the overall optimization effect.
# See which passes run at -O2 (new pass manager; older releases used -mllvm -debug-pass=Arguments)
clang -O2 -fdebug-pass-manager -c program.c -o /dev/null 2>&1 | head -50
Some important optimization passes:
| Pass | What It Does |
|---|---|
| mem2reg | Promotes stack allocations to SSA registers |
| instcombine | Combines/simplifies instruction sequences |
| inline | Inlines function calls |
| gvn | Global Value Numbering (eliminates redundant computations) |
| licm | Loop Invariant Code Motion (hoists loop-independent code) |
| sccp | Sparse Conditional Constant Propagation |
| dce | Dead Code Elimination |
| loop-vectorize | Converts scalar loops to SIMD operations |
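How passes compose can be illustrated with two toy passes over a made-up instruction encoding. This is a sketch under stated assumptions: the `(dest, opcode, operands)` tuples, `fold_constants`, `eliminate_dead_code`, and `run_pipeline` are hypothetical names, and the algorithms are far simpler than LLVM's sccp or dce.

```python
def fold_constants(instructions):
    """Replace add/mul of two integer constants with the computed value."""
    known = {}  # name -> constant it was folded to
    out = []
    for dest, opcode, operands in instructions:
        operands = [known.get(o, o) for o in operands]
        if opcode in ("add", "mul") and all(isinstance(o, int) for o in operands):
            lhs, rhs = operands
            known[dest] = lhs + rhs if opcode == "add" else lhs * rhs
        else:
            out.append((dest, opcode, operands))
    return out

def eliminate_dead_code(instructions):
    """Drop instructions whose result is never used; 'ret' is always kept."""
    live = set()
    kept = []
    for dest, opcode, operands in reversed(instructions):
        if opcode == "ret" or dest in live:
            kept.append((dest, opcode, operands))
            live.update(o for o in operands if isinstance(o, str))
    kept.reverse()
    return kept

def run_pipeline(instructions, passes):
    """Run each pass on the output of the previous one."""
    for single_pass in passes:
        instructions = single_pass(instructions)
    return instructions

PROGRAM = [
    ("%t0", "add", [2, 3]),
    ("%t1", "mul", ["%t0", 4]),
    ("%unused", "add", ["%x", 1]),
    ("%r", "add", ["%t1", "%x"]),
    (None, "ret", ["%r"]),
]
```

Running dead code elimination alone removes only `%unused`. Running constant folding first turns `%t0` and `%t1` into the constant 20, so the combined pipeline leaves a single add followed by the return.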
You can run passes manually with opt:
# Generate unoptimized IR (clang -O0 marks functions optnone, which makes opt skip them,
# so disable that marking)
echo 'int add(int a, int b) { return a + b; }' | clang -S -emit-llvm -O0 -Xclang -disable-O0-optnone -x c - -o add.ll
# Run just mem2reg
opt -S -passes=mem2reg add.ll -o add-opt.ll
cat add-opt.ll

define i32 @add(i32 %a, i32 %b) {
entry:
  %add = add nsw i32 %a, %b
  ret i32 %add
}
The mem2reg pass alone eliminated all the alloca/load/store instructions by recognizing that the stack variables could be kept in SSA registers.
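The idea behind this promotion can be sketched for the simplest case. The code below is a toy, not LLVM's mem2reg: it assumes a hypothetical `(dest, opcode, operands)` encoding, a single basic block, slots whose address never escapes, and a store before every load. The real pass also handles branches by inserting phi nodes.

```python
def promote_slots(instructions):
    """Forward stored values to later loads, dropping alloca/store/load."""
    slots = set()      # names produced by alloca
    slot_value = {}    # slot -> value most recently stored into it
    replace = {}       # load result -> value it is replaced with
    out = []
    for dest, opcode, operands in instructions:
        operands = [replace.get(o, o) for o in operands]
        if opcode == "alloca":
            slots.add(dest)
        elif opcode == "store" and operands[1] in slots:
            slot_value[operands[1]] = operands[0]
        elif opcode == "load" and operands[0] in slots:
            replace[dest] = slot_value[operands[0]]
        else:
            out.append((dest, opcode, operands))
    return out

# The unoptimized @add body in the toy encoding.
ADD_O0 = [
    ("%a.addr", "alloca", []),
    ("%b.addr", "alloca", []),
    (None, "store", ["%a", "%a.addr"]),
    (None, "store", ["%b", "%b.addr"]),
    ("%0", "load", ["%a.addr"]),
    ("%1", "load", ["%b.addr"]),
    ("%add", "add", ["%0", "%1"]),
    (None, "ret", ["%add"]),
]
```

Applied to `ADD_O0`, the function leaves only the add and the return, with the add reading `%a` and `%b` directly.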
The Optimizer Doesn’t Know Your Language
LLVM’s optimizer sees only IR. It doesn’t know whether the input was C, Rust, Swift, or anything else. Optimizations are purely IR transformations based on the semantics encoded in the IR itself.
Rust’s safety guarantees don’t directly help LLVM optimize. However, Rust emits IR with more attributes (like noalias on mutable references) that tell LLVM about invariants the optimizer can exploit:
; Rust's &mut T guarantees no aliasing, expressed as noalias
define void @modify(ptr noalias %x, ptr noalias %y) {
  ...
}
The noalias attribute tells LLVM that %x and %y don’t overlap, enabling optimizations that wouldn’t be safe for plain C pointers (which might alias unless declared restrict).
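Why aliasing information matters can be shown without any compiler. In this sketch, memory is modelled as a Python list and pointers as indices; both function names are hypothetical. The second function caches a load, which is the kind of rewrite an optimizer may only make when it knows the two locations are distinct.

```python
def add_twice(mem, x, y):
    """Reference semantics: re-reads mem[y] after each write to mem[x]."""
    mem[x] += mem[y]
    mem[x] += mem[y]

def add_twice_assuming_noalias(mem, x, y):
    """'Optimized' version: loads mem[y] once. Valid only if x != y."""
    cached = mem[y]
    mem[x] += cached
    mem[x] += cached

def run(function, mem, x, y):
    """Apply function to a copy of mem and return the resulting memory."""
    mem = list(mem)
    function(mem, x, y)
    return mem
```

With distinct locations both versions agree. When `x` and `y` refer to the same cell, the first write changes the value the second read should see, so the cached version computes a different result.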
Backend: IR to Machine Code
The backend converts IR to machine code for a specific target. This involves:
- Instruction selection: Mapping IR operations to target instructions. The IR add i32 might become x86 add, ARM ADD, or RISC-V addw.
- Register allocation: The IR has unlimited virtual registers; real CPUs have fixed register files. The allocator assigns virtual registers to physical ones, spilling to memory when necessary.
- Instruction scheduling: Reordering instructions to avoid pipeline stalls and maximize throughput.
- Target-specific optimization: Peephole optimizations for the specific CPU.
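Of the steps above, register allocation is the easiest to sketch in isolation. The following is a simplified linear scan under stated assumptions: live ranges are given up front as inclusive `(start, end)` pairs, and when no register is free the new value is spilled. LLVM's actual allocators are considerably more sophisticated, and `allocate_registers` is a hypothetical name.

```python
def allocate_registers(intervals, registers):
    """Map each virtual register to a physical register or 'spill'.

    intervals: {vreg: (start, end)} with inclusive live ranges.
    registers: list of physical register names.
    """
    assignment = {}
    active = []            # (end, vreg) pairs currently holding a register
    free = list(registers)
    by_start = sorted(intervals.items(), key=lambda item: item[1][0])
    for vreg, (start, end) in by_start:
        # Release registers whose value is dead before this one starts.
        for old_end, old_vreg in list(active):
            if old_end < start:
                active.remove((old_end, old_vreg))
                free.append(assignment[old_vreg])
        if free:
            assignment[vreg] = free.pop(0)
            active.append((end, vreg))
        else:
            assignment[vreg] = "spill"   # no register left: keep it in memory
    return assignment

INTERVALS = {"%a": (0, 3), "%b": (1, 2), "%c": (2, 5), "%d": (4, 6)}
```

With two physical registers, `%c` overlaps both `%a` and `%b` and has to be spilled, while `%d` reuses a register once `%a` is dead. With three registers nothing spills.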
# See the journey from IR to assembly
echo 'int add(int a, int b) { return a + b; }' | clang -O2 -S -x c - -o -

add:
    leal (%rdi,%rsi), %eax
    retq
The backend chose leal (LEA, load effective address) as an efficient way to add two registers into a third on x86-64. A different target produces different assembly:
# Same IR, targeting ARM64
echo 'int add(int a, int b) { return a + b; }' | clang -O2 -S --target=aarch64-linux-gnu -x c - -o -

add:
    add w0, w0, w1
    ret
The IR was identical; only the backend changed.
Supported Targets
LLVM supports many targets out of the box:
llc --version

Registered Targets:
aarch64 - AArch64 (little endian)
amdgcn - AMD GCN GPUs
arm - ARM
bpf - BPF (host endian)
hexagon - Hexagon
mips - MIPS (32-bit big endian)
nvptx64 - NVIDIA PTX 64-bit
riscv64 - 64-bit RISC-V
wasm32 - WebAssembly 32-bit
x86 - 32-bit X86
x86-64 - 64-bit X86
...
Cross-compilation means running the frontend and optimizer on your host, then selecting a different backend.
Language Frontends
Clang
Clang is the C, C++, and Objective-C frontend for LLVM. It parses source code, performs semantic analysis, and emits LLVM IR. Clang handles everything language-specific; LLVM handles everything target-specific.
┌─────────────────────────────────────────────────────────────────┐
│ Clang (Frontend) │
│ ├── Lexer: Source text → tokens │
│ ├── Parser: Tokens → AST (Abstract Syntax Tree) │
│ ├── Sema: Type checking, overload resolution, template │
│ │ instantiation, semantic analysis │
│ ├── CodeGen: AST → LLVM IR │
│ └── Driver: Orchestrates compilation, invokes linker │
├─────────────────────────────────────────────────────────────────┤
│ LLVM (Middle-end + Backend) │
│ ├── Optimizer: IR → optimized IR │
│ └── Backend: IR → machine code │
└─────────────────────────────────────────────────────────────────┘
Clang is a separate project from LLVM core, though they’re developed together and released in sync. You can use LLVM without Clang (as Rust does), and theoretically Clang could target a different IR (though nobody does this).
# Clang's view of compilation stages
clang -ccc-print-phases hello.c

0: input, "hello.c", c
1: preprocessor, {0}, cpp-output
2: compiler, {1}, ir
3: backend, {2}, assembler
4: assembler, {3}, object
5: linker, {4}, image
Stages 0-2 are Clang. Stages 3-4 are LLVM backend. Stage 5 invokes the system linker.
Rustc
The Rust compiler (rustc) uses LLVM as its backend. Rust has its own frontend and intermediate representations:
┌─────────────────────────────────────────────────────────────────┐
│ rustc (Rust Frontend) │
│ ├── Lexer/Parser: Source → AST │
│ ├── Name resolution, macro expansion │
│ ├── HIR: High-level IR (type checking happens here) │
│ ├── MIR: Mid-level IR (borrow checking, Rust-specific opts) │
│ ├── Codegen: MIR → LLVM IR │
│ └── Driver: Orchestrates compilation, invokes linker │
├─────────────────────────────────────────────────────────────────┤
│ LLVM │
│ ├── Optimizer: IR → optimized IR │
│ └── Backend: IR → machine code │
└─────────────────────────────────────────────────────────────────┘
HIR (High-level IR) is a desugared AST where type inference and trait resolution happen. MIR (Mid-level IR) is a control-flow graph used for borrow checking and Rust-specific optimizations. After MIR passes complete, rustc lowers to LLVM IR and hands off to LLVM.
# See Rust's MIR
rustc --emit=mir hello.rs -o hello.mir
# See the LLVM IR Rust generates
rustc --emit=llvm-ir hello.rs -o hello.ll
Rust ships with LLVM built in, so you don’t need a separate LLVM installation. Each Rust release bundles a specific LLVM version, sometimes with Rust-specific patches.
# Check which LLVM version rustc uses
rustc --version --verbose | grep LLVM

LLVM version: 17.0.6
Building a competitive optimizing compiler is enormously complex. By using LLVM, Rust gets world-class optimization passes, support for dozens of target architectures, and continuous improvements from the broader LLVM community. The tradeoff is compile time—LLVM is thorough but slow. Rust’s debug builds bypass most LLVM optimization (-C opt-level=0), and projects like Cranelift offer faster alternative backends for development builds.
Other Frontends
Many languages use LLVM as their backend:
| Language | Frontend | Notes |
|---|---|---|
| C/C++/ObjC | Clang | Reference frontend |
| Rust | rustc | MIR → LLVM IR |
| Swift | swiftc | Apple’s language |
| Julia | Julia compiler | JIT compiled via LLVM |
| Kotlin Native | Kotlin/Native | LLVM backend for native compilation |
| Zig | Zig compiler | Also can emit LLVM IR |
| Haskell | GHC | Optional LLVM backend |
| Fortran | Flang | Modern Fortran frontend |
Each frontend handles language semantics and lowers to LLVM IR. From there, compilation is identical.
The LLVM Ecosystem
Subprojects
LLVM is an umbrella for many related projects:
| Project | Purpose |
|---|---|
| LLVM Core | Optimizer, code generators, IR |
| Clang | C/C++/ObjC frontend |
| libc++ | C++ standard library |
| libc++abi | C++ ABI support library |
| compiler-rt | Compiler runtime (builtins, sanitizers) |
| LLD | LLVM’s linker |
| LLDB | Debugger |
| libunwind | Stack unwinding library |
| OpenMP | OpenMP runtime |
| Polly | Polyhedral loop optimizer |
| MLIR | Multi-Level IR (for ML compilers, DSLs) |
| Flang | Fortran frontend |
| llvm-libc | C standard library (in development) |
These are developed together but can often be used independently. You might use Clang with GNU’s libstdc++ instead of libc++, or use LLD as a drop-in replacement for GNU ld with GCC.
compiler-rt
The compiler runtime provides functions the compiler assumes exist but that aren’t part of the C standard library:
┌─────────────────────────────────────────────────────────────────┐
│ compiler-rt (LLVM) / libgcc (GCC) │
│ ├── Builtins │
│ │ ├── Integer ops CPU lacks (__divti3, __multi3) │
│ │ ├── Floating-point soft emulation │
│ │ └── Bit manipulation (__clzdi2, __popcountdi2) │
│ ├── Sanitizers │
│ │ ├── AddressSanitizer (ASan) │
│ │ ├── UndefinedBehaviorSanitizer (UBSan) │
│ │ ├── ThreadSanitizer (TSan) │
│ │ └── MemorySanitizer (MSan) │
│ ├── Profiling runtime │
│ └── Platform-specific support │
└─────────────────────────────────────────────────────────────────┘
When you divide two 128-bit integers:
__int128 divide(__int128 a, __int128 b) {
    return a / b;
}
x86-64 has no single instruction that divides one 128-bit integer by another. The compiler emits:
clang -S -O2 divide.c -o -

divide:
    jmp __divti3    # tail call into the compiler runtime
The __divti3 function lives in compiler-rt (or libgcc). It is not part of libc; it is compiler support code that the compiler assumes exists.
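To give a sense of what such a runtime helper has to do, here is a sketch of unsigned 128-bit division by the textbook shift-subtract method. It is an illustration only: `udiv128` is a hypothetical name, Python's arbitrary-precision integers stand in for pairs of 64-bit registers, and this is not the algorithm compiler-rt actually uses.

```python
def udiv128(numerator, denominator):
    """Shift-subtract (restoring) division on unsigned 128-bit values.

    Returns (quotient, remainder). Processes one bit per iteration,
    from the most significant bit down.
    """
    if not (0 <= numerator < 1 << 128 and 0 <= denominator < 1 << 128):
        raise ValueError("operands must fit in 128 unsigned bits")
    if denominator == 0:
        raise ZeroDivisionError("division by zero")
    quotient = 0
    remainder = 0
    for bit in range(127, -1, -1):
        # Bring down the next bit of the numerator.
        remainder = (remainder << 1) | ((numerator >> bit) & 1)
        if remainder >= denominator:
            remainder -= denominator
            quotient |= 1 << bit
    return quotient, remainder
```

The loop uses only shifts, comparisons, and subtractions, which is why a function like this can be provided in software for any operand width the hardware does not divide natively.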
Tooling
LLVM includes standalone tools for working with IR and binaries:
| Tool | Purpose |
|---|---|
| clang | C/C++/ObjC compiler |
| opt | IR optimizer (run passes manually) |
| llc | IR to assembly/object code |
| llvm-as | Text IR to bitcode |
| llvm-dis | Bitcode to text IR |
| llvm-link | Link multiple bitcode files |
| lli | LLVM IR interpreter/JIT |
| llvm-nm | Symbol table viewer |
| llvm-objdump | Disassembler |
| llvm-readelf | ELF reader |
| llvm-ar | Archive tool |
These tools work on any platform LLVM supports and understand all LLVM target formats.
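# Compile to bitcode (without optnone, so that opt can still optimize the functions later)
clang -c -emit-llvm -O0 -Xclang -disable-O0-optnone program.c -o program.bc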
# Inspect the IR
llvm-dis program.bc -o -
# Optimize it
opt -O2 program.bc -o program-opt.bc
# Compile to assembly for a specific target
llc -march=x86-64 program-opt.bc -o program.s
# Or compile to object file
llc -filetype=obj program-opt.bc -o program.o
Advanced Topics
Link-Time Optimization (LTO)
LTO defers optimization until link time, enabling whole-program analysis. Instead of compiling to object files, you compile to bitcode:
# Compile each file to bitcode
clang -c -flto file1.c -o file1.o
clang -c -flto file2.c -o file2.o
# "Link" - actually runs LLVM optimization then code generation
clang -flto file1.o file2.o -o program
The .o files contain LLVM bitcode, not machine code. At link time, LLVM loads all bitcode, runs optimization passes across the entire program, then generates machine code. This enables:
- Cross-module inlining: Functions from file1.c can be inlined into file2.c
- Whole-program devirtualization: Virtual calls with known targets become direct calls
- Global dead code elimination: Unused functions can be removed even when they have external linkage, as long as nothing outside the linked program references them
- Interprocedural optimization: Constant propagation across function boundaries
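The whole-program view is what makes global dead code elimination possible, and it can be sketched as a reachability walk over a merged call graph. This is a toy under stated assumptions: modules are hypothetical dictionaries mapping function names to the functions they call, and `main` is the only entry point. Real LTO also has to respect symbols the linker says are referenced from outside.

```python
def reachable_functions(modules, entry="main"):
    """Return the set of defined functions reachable from entry.

    modules: {module_name: {function_name: [callee names]}}
    """
    call_graph = {}
    for functions in modules.values():
        call_graph.update(functions)   # merge all modules, as LTO does
    seen = set()
    worklist = [entry]
    while worklist:
        name = worklist.pop()
        if name in seen or name not in call_graph:
            continue                   # already visited, or an external symbol
        seen.add(name)
        worklist.extend(call_graph[name])
    return seen

def strip_dead(modules, entry="main"):
    """Drop every function that is not reachable from entry."""
    live = reachable_functions(modules, entry)
    return {
        module: {name: callees for name, callees in functions.items() if name in live}
        for module, functions in modules.items()
    }

MODULES = {
    "file1": {"main": ["helper", "printf"], "unused_export": ["helper2"]},
    "file2": {"helper": [], "helper2": []},
}
```

Compiled separately, neither file could be trimmed: `file1` cannot know whether another unit calls `unused_export`, and `file2` cannot know whether `helper2` is needed. With both in view, `unused_export` and `helper2` are provably dead.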
Rust enables LTO in Cargo.toml:
[profile.release]
lto = true        # "fat" LTO: full cross-crate optimization
# lto = "thin"    # Faster but less thorough
ThinLTO is a scalable variant that achieves most of the benefits with better parallelism and caching. Cargo selects it with lto = "thin"; when lto is left at its default of false, rustc still runs a thin-local LTO across the codegen units within each crate.
JIT Compilation
LLVM supports Just-In-Time compilation—generating machine code at runtime:
// Simplified example using LLVM's C API (MCJIT-based engine; IR construction and cleanup omitted)
LLVMLinkInMCJIT();
LLVMInitializeNativeTarget();
LLVMInitializeNativeAsmPrinter();
LLVMModuleRef module = LLVMModuleCreateWithName("my_jit");
// ... build IR programmatically ...
char *error = NULL;
LLVMExecutionEngineRef engine;
if (LLVMCreateJITCompilerForModule(&engine, module, 2, &error)) { /* failed: error holds the message */ }
int (*fn)(int, int) = (int (*)(int, int))(uintptr_t)LLVMGetFunctionAddress(engine, "add");
int result = fn(3, 4); // Execute JIT-compiled code
Julia uses this extensively: code is compiled to LLVM IR at runtime, optimized, and executed. This enables Julia’s “compile what you use” model where functions are specialized for the types they’re called with.
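The "compile what you use" model amounts to a cache keyed by argument types. The sketch below shows only that caching shape; it generates no machine code. `SpecializingFunction`, `compile_add`, and `compile_log` are hypothetical names, and the lambdas stand in for the native code a real JIT would emit.

```python
class SpecializingFunction:
    """Caches one 'compiled' variant per tuple of argument types."""

    def __init__(self, compile_for_types):
        self._compile = compile_for_types
        self.cache = {}

    def __call__(self, *args):
        key = tuple(type(arg) for arg in args)
        if key not in self.cache:
            # First call with these types: pay the compilation cost once.
            self.cache[key] = self._compile(key)
        return self.cache[key](*args)

compile_log = []  # records every specialization that was "compiled"

def compile_add(arg_types):
    """Stand-in for a JIT: returns a callable specialized for arg_types."""
    compile_log.append(arg_types)
    if arg_types == (int, int):
        return lambda a, b: a + b               # plays the role of an integer add
    return lambda a, b: float(a) + float(b)     # plays the role of a float add

add = SpecializingFunction(compile_add)
```

Calling `add` twice with integers compiles once and reuses the cached variant; the first call with a float argument triggers a second specialization.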
Languages like Python could theoretically JIT-compile hot paths via LLVM, but CPython doesn’t. Projects like Numba add LLVM JIT compilation to Python for numerical code:
from numba import jit

@jit(nopython=True)
def sum_array(arr):
    total = 0
    for x in arr:
        total += x
    return total
Numba compiles this to LLVM IR, optimizes it, and generates native machine code, which can be orders of magnitude faster than interpreted Python.
Practical Usage
Inspecting the Full Pipeline
Trace a C function through every stage:
# Source
echo 'int square(int x) { return x * x; }' > square.c
# Preprocessed (macros expanded, includes resolved)
clang -E square.c -o square.i
# LLVM IR (unoptimized)
clang -S -emit-llvm -O0 square.c -o square-O0.ll
# LLVM IR (optimized)
clang -S -emit-llvm -O2 square.c -o square-O2.ll
# Assembly (target-specific)
clang -S -O2 square.c -o square.s
# Object file (machine code + metadata)
clang -c -O2 square.c -o square.o
# Compare sizes
wc -l square-O0.ll square-O2.ll square.s

12 square-O0.ll
6 square-O2.ll
8 square.s
For Rust:
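echo 'pub fn square(x: i32) -> i32 { x * x }' > square.rs
# square.rs has no main function, so build it as a library crate
# HIR (Rust high-level IR); -Z flags require a nightly toolchain
rustc --crate-type=lib -Z unpretty=hir square.rs 2>/dev/null
# MIR (Rust mid-level IR)
rustc --crate-type=lib --emit=mir square.rs -o square.mir
# LLVM IR
rustc --crate-type=lib --emit=llvm-ir -O square.rs -o square.ll
# Assembly
rustc --crate-type=lib --emit=asm -O square.rs -o square.s
LLVM vs GCC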
GCC (GNU Compiler Collection) is the other major open-source compiler infrastructure. Both produce excellent optimized code, but they differ architecturally:
| Aspect | LLVM | GCC |
|---|---|---|
| IR | LLVM IR (well-documented; bitcode is backward compatible, textual IR is not guaranteed stable across releases) | GIMPLE/RTL (internal, less accessible) |
| License | Apache 2.0 with LLVM exception | GPL 3.0 |
| Modularity | Library-first design, reusable components | Monolithic, harder to embed |
| Frontends | Clang, Rust, Swift, Julia, etc. | C, C++, Fortran, Ada, Go, D |
| Build system integration | Easy to embed (JIT, tooling) | Designed as standalone compiler |
GCC has been around longer (1987 vs 2003) and supports some targets and language features LLVM doesn’t. LLVM’s cleaner architecture makes it easier to build tools on top of—this is why so many new languages chose LLVM.
The license matters for commercial users. GCC’s GPL requires derivative works of the compiler itself to be GPL-licensed (programs merely compiled with GCC are not affected). LLVM’s permissive license allows proprietary use, which is why Apple invested heavily in LLVM and why it’s popular in commercial toolchains.