Executable Binary Formats

What Binary Formats Encode

Executable formats solve the problem of representing compiled code in a way that loaders can map into memory and execute. They must describe where code and data live, what addresses need runtime patching, which external symbols are required, and how to start execution. The three dominant formats—ELF, Mach-O, and PE—all solve these problems with different trade-offs in metadata encoding, linking flexibility, and toolchain integration.

Aspect	ELF (Linux/BSD)	Mach-O (macOS/iOS)	PE (Windows)
Magic bytes	`7f 45 4c 46`	`ce fa ed fe` (32), `cf fa ed fe` (64), `ca fe ba be` (fat)	`4d 5a` (`MZ`)
Dynamic linker	`ld.so` / `ld-linux.so`	`dyld`	`ntdll.dll` loader
Shared library extension	`.so`	`.dylib`	`.dll`
Import indirection	PLT/GOT	Stubs + lazy pointers	IAT
Symbol visibility default	Export all	Export all	Export nothing
Position independence	Optional (PIE)	Default (since 10.7)	Optional (ASLR)
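As a rough illustration of how a tool might tell the formats apart, the sketch below matches the leading bytes of a file against the magic values from the table. The helper name `detect_format` is hypothetical, and the big-endian Mach-O variants are included as an assumption about what an analysis tool may also encounter.

```python
# Leading-byte signatures. Mach-O values are the on-disk byte order.
MAGICS = [
    (b"\x7fELF", "ELF"),
    (b"\xce\xfa\xed\xfe", "Mach-O 32-bit (little-endian)"),
    (b"\xcf\xfa\xed\xfe", "Mach-O 64-bit (little-endian)"),
    (b"\xfe\xed\xfa\xce", "Mach-O 32-bit (big-endian)"),
    (b"\xfe\xed\xfa\xcf", "Mach-O 64-bit (big-endian)"),
    (b"\xca\xfe\xba\xbe", "Mach-O fat binary"),
    (b"MZ", "PE (DOS header)"),
]


def detect_format(data: bytes) -> str:
    """Return a format label based only on the first few bytes (hypothetical helper)."""
    for magic, name in MAGICS:
        if data.startswith(magic):
            return name
    return "unknown"
```

A magic match is only a first guess: `ca fe ba be` is also used by Java class files, and an `MZ` file is a PE only if the PE signature is present at the offset stored in the DOS header.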

ELF

ELF (Executable and Linkable Format) is the standard format for executables, object files, and shared libraries on Linux and BSD systems. It organizes binaries around two parallel views: sections (used by linkers and analysis tools) and segments (used by the OS loader). The section view provides semantic meaning; the segment view provides memory mapping instructions.

┌─────────────────────┐
│     ELF Header      │  ← Magic, class (32/64), endianness, entry point
├─────────────────────┤
│   Program Headers   │  ← Segment table: what to map into memory
├─────────────────────┤
│       .text         │  ← Executable code
│       .rodata       │  ← Read-only data (string literals, constants)
│       .plt          │  ← Procedure linkage table (call stubs)
│       .got          │  ← Global offset table (data pointers)
│       .got.plt      │  ← GOT entries for PLT (function pointers)
│       .data         │  ← Initialized writable globals
│       .bss          │  ← Zero-initialized data (no file bytes)
│       .dynsym       │  ← Dynamic symbol table (for loader)
│       .symtab       │  ← Full symbol table (for debuggers, strippable)
│       .strtab       │  ← String table for .symtab
│       .dynstr       │  ← String table for .dynsym
│       .rela.dyn     │  ← Dynamic relocations for data
│       .rela.plt     │  ← Dynamic relocations for PLT
│       .eh_frame     │  ← Exception/unwind tables
│       .debug_*      │  ← DWARF debug sections
├─────────────────────┤
│   Section Headers   │  ← Section table: metadata for tools
└─────────────────────┘

Important

When the operating system loads the executable into memory, it uses memory page permissions to map specific segments as read-only, read-write, or executable. This prevents accidental or malicious modification of code.

ELF Header

The ELF header contains metadata describing the file type and how to parse the rest of the file:

Field	Purpose
`e_ident`	16-byte identifier: magic number (`0x7F ELF`), class (32/64-bit), endianness, ABI
`e_type`	File type: `ET_REL` (relocatable), `ET_EXEC` (executable), `ET_DYN` (shared/PIE), `ET_CORE`
`e_machine`	Target architecture (e.g., `EM_X86_64`, `EM_AARCH64`)
`e_entry`	Virtual address of the entry point (where execution begins)
`e_phoff`	File offset to program headers
`e_shoff`	File offset to section headers

The e_type field, not e_entry, distinguishes Position-Independent Executables (PIE): a PIE is marked ET_DYN, which lets the loader place it at a randomized base under Address Space Layout Randomization (ASLR). In a PIE, e_entry holds an offset relative to the load base rather than a fixed absolute address, so the entry point works at any load address.

readelf -h ./binary
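To make the header fields concrete, here is a minimal sketch that decodes the same fields with Python's `struct` module. It assumes a 64-bit little-endian file and rejects anything else. The function name `parse_elf64_header` and the returned dictionary keys are illustrative choices, not part of any standard tool.

```python
import struct

# Elf64_Ehdr layout: e_ident[16], e_type, e_machine, e_version, e_entry,
# e_phoff, e_shoff, e_flags, e_ehsize, e_phentsize, e_phnum,
# e_shentsize, e_shnum, e_shstrndx (64 bytes in total).
ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")
E_TYPES = {1: "ET_REL", 2: "ET_EXEC", 3: "ET_DYN", 4: "ET_CORE"}
E_MACHINES = {62: "EM_X86_64", 183: "EM_AARCH64"}


def parse_elf64_header(data: bytes) -> dict:
    """Decode the main ELF header fields (hypothetical helper, ELF64 LE only)."""
    (ident, e_type, e_machine, _version, e_entry, e_phoff, e_shoff,
     _flags, _ehsize, _phentsize, e_phnum, _shentsize, e_shnum,
     _shstrndx) = ELF64_HEADER.unpack_from(data)
    if ident[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    if ident[4] != 2 or ident[5] != 1:
        raise ValueError("this sketch only handles 64-bit little-endian ELF")
    return {
        "type": E_TYPES.get(e_type, hex(e_type)),
        "machine": E_MACHINES.get(e_machine, hex(e_machine)),
        "entry": e_entry,
        "phoff": e_phoff,
        "shoff": e_shoff,
        "phnum": e_phnum,
        "shnum": e_shnum,
    }
```

Reading the first 64 bytes of a file and passing them to this function should give the same type, machine, and entry values that `readelf -h` prints.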

Program Headers and Segments

Program headers describe how segments should be loaded into memory. The loader reads these to set up the process address space.

Field	Purpose
`p_type`	Segment type: `PT_LOAD` (loadable), `PT_DYNAMIC` (dynamic linking info), `PT_INTERP` (interpreter path)
`p_flags`	Permissions: `PF_R` (read), `PF_W` (write), `PF_X` (execute)
`p_offset`	File offset where segment data begins
`p_vaddr`	Virtual address where segment should be mapped
`p_filesz`	Size of segment in the file
`p_memsz`	Size of segment in memory (may exceed `p_filesz` for `.bss`)

Segments are contiguous regions that group sections with similar memory attributes. The compiler and linker arrange sections so that read-only sections (.text, .rodata) are adjacent, and writable sections (.data, .bss) are adjacent. This allows each segment to be described by a single pointer and length, minimizing the number of memory mappings.

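A traditional executable has two PT_LOAD segments: one read-execute (containing .text, .rodata, .plt) and one read-write (containing .data, .bss, .got). Recent linkers often split these further, for example by placing .rodata in its own read-only, non-executable segment.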

readelf -lW ./binary
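The following sketch decodes one 64-bit program header entry and shows how the permission bits and the `p_memsz` minus `p_filesz` gap can be read off. It assumes the little-endian Elf64_Phdr layout. `describe_phdr` and its output keys are hypothetical names used only for illustration.

```python
import struct

# Elf64_Phdr layout: p_type, p_flags, p_offset, p_vaddr, p_paddr,
# p_filesz, p_memsz, p_align (56 bytes in total).
ELF64_PHDR = struct.Struct("<IIQQQQQQ")
P_TYPES = {1: "PT_LOAD", 2: "PT_DYNAMIC", 3: "PT_INTERP"}
PF_X, PF_W, PF_R = 0x1, 0x2, 0x4


def describe_phdr(raw: bytes) -> dict:
    """Summarize one program header entry (hypothetical helper)."""
    (p_type, p_flags, p_offset, p_vaddr, _paddr,
     p_filesz, p_memsz, _align) = ELF64_PHDR.unpack_from(raw)
    perms = (
        ("r" if p_flags & PF_R else "-")
        + ("w" if p_flags & PF_W else "-")
        + ("x" if p_flags & PF_X else "-")
    )
    return {
        "type": P_TYPES.get(p_type, hex(p_type)),
        "perms": perms,
        "offset": p_offset,
        "vaddr": p_vaddr,
        # Bytes that exist only in memory (for example .bss) and are zero-filled.
        "zero_fill": p_memsz - p_filesz,
    }
```

For a writable PT_LOAD segment that ends with `.bss`, `zero_fill` is the number of bytes the loader must provide as zeroed memory beyond what the file supplies.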

Sections

Each section serves a distinct role in execution or linking:

Section	Purpose	Flags
`.text`	Machine code instructions	Read, Execute
`.rodata`	Read-only data (string literals, constants)	Read
`.data`	Initialized global and static variables	Read, Write
`.bss`	Uninitialized globals (zeroed at load, no file bytes)	Read, Write
`.got`	Global Offset Table (resolved addresses for data)	Read, Write
`.got.plt`	GOT entries for PLT (function addresses)	Read, Write
`.plt`	Procedure Linkage Table (call stubs for lazy binding)	Read, Execute
`.dynamic`	Dynamic linker info (dependencies, relocations)	Read, Write
`.dynsym`	Dynamic symbol table (for loader)	Read
`.symtab`	Full symbol table (for debuggers, strippable)	—
`.strtab`	String table for `.symtab`	—
`.dynstr`	String table for `.dynsym`	Read
`.rela.dyn`	Relocations for data references	Read
`.rela.plt`	Relocations for PLT entries	Read
`.init`	Code executed before `main()`	Read, Execute
`.fini`	Code executed after program ends	Read, Execute
`.init_array`	Array of constructor function pointers	Read, Write
`.fini_array`	Array of destructor function pointers	Read, Write
`.interp`	Path to dynamic linker (e.g., `/lib64/ld-linux-x86-64.so.2`)	Read
`.eh_frame`	Exception handling and stack unwinding info	Read
`.debug_*`	DWARF debug information	—

The .bss section occupies zero bytes in the file—only its size is recorded. At load time, the kernel allocates this region and zeros it. The p_memsz of the containing segment exceeds p_filesz by the .bss size.

Section Headers

Each section is described by a header in the section header table:

Field	Purpose
`sh_name`	Offset into `.shstrtab` for section name
`sh_type`	Section type: `SHT_PROGBITS` (code/data), `SHT_SYMTAB`, `SHT_NOBITS` (.bss), etc.
`sh_flags`	Attributes: `SHF_ALLOC` (loaded), `SHF_WRITE`, `SHF_EXECINSTR`
`sh_addr`	Virtual address when loaded (0 for non-allocated sections)
`sh_offset`	File offset to section data
`sh_size`	Size in bytes

Sections without SHF_ALLOC don’t appear in any segment and aren’t loaded into memory. Debug sections (.debug_*) lack this flag—they exist only for offline analysis and can be stripped without affecting execution.

readelf -SW ./binary
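Two details of the section header table are easy to show in code: `sh_name` is an offset into a NUL-separated string table, and `sh_flags` is a bit mask. The sketch below is a small illustration of both. The helper names are hypothetical, and the flag letters mimic the W/A/X letters that `readelf -S` prints.

```python
SHF_WRITE, SHF_ALLOC, SHF_EXECINSTR = 0x1, 0x2, 0x4


def section_name(shstrtab: bytes, sh_name: int) -> str:
    """Read the NUL-terminated name starting at offset sh_name."""
    end = shstrtab.index(b"\x00", sh_name)
    return shstrtab[sh_name:end].decode("ascii")


def flag_letters(sh_flags: int) -> str:
    """Render the three flags discussed above as readelf-style letters."""
    letters = ""
    if sh_flags & SHF_WRITE:
        letters += "W"
    if sh_flags & SHF_ALLOC:
        letters += "A"
    if sh_flags & SHF_EXECINSTR:
        letters += "X"
    return letters


def is_loaded(sh_flags: int) -> bool:
    """Sections without SHF_ALLOC are not part of the memory image."""
    return bool(sh_flags & SHF_ALLOC)
```

With these helpers, a `.text` header with flags `0x6` renders as `AX` and counts as loaded, while a `.debug_info` header with flags `0` does not.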

Dynamic Linking

Dynamic linking defers symbol resolution until load time or first use. This requires cooperation between the compiler, linker, and runtime loader.

Symbol Tables

ELF maintains two symbol tables with different purposes:

.symtab contains all symbols from compilation—local functions, static variables, debug labels. It’s used by debuggers and analysis tools but can be stripped without affecting execution.

.dynsym contains only symbols needed for dynamic linking—imported functions, exported functions, required globals. The loader uses this exclusively. Stripping .symtab leaves .dynsym intact.

# Full symbol table (may be stripped)
readelf -s ./binary
 
# Dynamic symbol table (required for execution)
readelf --dyn-syms ./binary
 
# Or with objdump (shows defined vs undefined)
objdump -T ./binary

A symbol “existing in the binary” doesn’t guarantee runtime availability. Only .dynsym entries participate in dynamic resolution.
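A short sketch of how one `.dynsym` entry can be decoded, assuming the 64-bit little-endian Elf64_Sym layout. The function `parse_sym` and the `imported` key are illustrative names. An entry whose section index is 0 (SHN_UNDEF) is one the binary expects another object to provide.

```python
import struct

# Elf64_Sym layout: st_name, st_info, st_other, st_shndx, st_value, st_size.
ELF64_SYM = struct.Struct("<IBBHQQ")
BINDINGS = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
TYPES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC"}
SHN_UNDEF = 0


def parse_sym(raw: bytes, strtab: bytes) -> dict:
    """Decode one symbol table entry against its string table (hypothetical helper)."""
    st_name, st_info, _other, st_shndx, st_value, _size = ELF64_SYM.unpack_from(raw)
    end = strtab.index(b"\x00", st_name)
    return {
        "name": strtab[st_name:end].decode("ascii"),
        "binding": BINDINGS.get(st_info >> 4, str(st_info >> 4)),  # high nibble
        "type": TYPES.get(st_info & 0xF, str(st_info & 0xF)),      # low nibble
        "value": st_value,
        "imported": st_shndx == SHN_UNDEF,
    }
```

Applied to `.dynsym` with `.dynstr` as the string table, the entries with `imported` set correspond to the UND rows in `readelf --dyn-syms` output.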

PLT and GOT

Position-independent code cannot hardcode absolute addresses for external symbols—the target address isn’t known until runtime. ELF solves this with two cooperating structures:

The GOT (Global Offset Table) is a writable data section containing addresses. Code references external symbols indirectly through GOT slots. The loader patches these slots with resolved addresses.

The PLT (Procedure Linkage Table) is executable stub code that implements lazy binding for function calls. Each imported function gets a PLT entry that loads and jumps through a GOT slot.

Call site:        call puts@PLT
                       │
                       ▼
PLT stub:         jmp *GOT[puts]      ──► (first call) resolver
                  push reloc_index        (later calls) actual puts()
                  jmp resolver
                       │
                       ▼
Resolver:         lookup "puts" in loaded libraries
                  write address to GOT[puts]
                  jump to puts()

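On the first call, the GOT slot points back into the PLT stub, at the instruction after the indirect jump, so execution falls through to the push and the jump into the resolver. The resolver (part of ld.so) finds the symbol, patches the GOT, and transfers control. Subsequent calls jump directly to the resolved address.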
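The control flow above can be modeled in a few lines. This is a toy simulation, not how ld.so is implemented: the class `LazyBinder`, the dictionary standing in for the GOT, and the dictionaries standing in for loaded libraries are all assumptions made for illustration.

```python
class LazyBinder:
    """Toy model of PLT/GOT lazy binding (hypothetical, for illustration only)."""

    def __init__(self, libraries):
        # Each "library" is a dict mapping symbol name to a callable.
        self.libraries = libraries
        self.got = {}            # symbol name -> resolved callable
        self.resolver_calls = 0  # how often the slow path ran

    def _resolve(self, name):
        """Slow path: search libraries in load order and patch the GOT slot."""
        self.resolver_calls += 1
        for lib in self.libraries:
            if name in lib:
                self.got[name] = lib[name]
                return lib[name]
        raise LookupError(f"undefined symbol: {name}")

    def call(self, name, *args):
        """Models 'call name@PLT': jump through the GOT slot, resolving on first use."""
        target = self.got.get(name)
        if target is None:
            target = self._resolve(name)
        return target(*args)
```

Calling the same symbol twice runs the resolver only once. A missing symbol raises only when it is first called, which mirrors how lazy binding defers resolution errors to mid-execution.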

# Disassemble PLT stubs
objdump -d -j .plt ./binary
 
# Show GOT contents at runtime
gdb -batch -ex 'start' -ex 'x/20xg &_GLOBAL_OFFSET_TABLE_' ./binary

Relocations

Relocations are instructions to the loader: “patch this location with this symbol’s address.” They’re stored in .rela.dyn (for data) and .rela.plt (for function calls).

readelf -rW ./binary

Relocation section '.rela.plt' at offset 0x628 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000003fd8  0000000300000007 R_X86_64_JUMP_SLOT     0000000000000000 puts@GLIBC_2.2.5 + 0
0000000000003fe0  0000000400000007 R_X86_64_JUMP_SLOT     0000000000000000 __libc_start_main@GLIBC_2.34 + 0

Relocation Type	Purpose
R_X86_64_JUMP_SLOT	Patch GOT slot for PLT (function call)
R_X86_64_GLOB_DAT	Patch GOT slot for data reference
R_X86_64_RELATIVE	Add load address to embedded offset (no symbol lookup)
R_X86_64_64	Absolute 64-bit address
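The Info column in the listing packs two values: the symbol index in the upper 32 bits and the relocation type in the lower 32 bits. The sketch below decodes an Elf64_Rela entry on that basis and shows the arithmetic for `R_X86_64_RELATIVE`. The helper names are hypothetical. The type numbers are the x86-64 values for the four types in the table.

```python
import struct

# Elf64_Rela layout: r_offset, r_info, r_addend (signed).
ELF64_RELA = struct.Struct("<QQq")
R_TYPES = {
    1: "R_X86_64_64",
    6: "R_X86_64_GLOB_DAT",
    7: "R_X86_64_JUMP_SLOT",
    8: "R_X86_64_RELATIVE",
}


def decode_rela(raw: bytes) -> dict:
    """Split r_info into symbol index and relocation type (hypothetical helper)."""
    r_offset, r_info, r_addend = ELF64_RELA.unpack_from(raw)
    rtype = r_info & 0xFFFFFFFF
    return {
        "offset": r_offset,
        "sym_index": r_info >> 32,
        "type": R_TYPES.get(rtype, hex(rtype)),
        "addend": r_addend,
    }


def apply_relative(load_base: int, addend: int) -> int:
    """Value stored for R_X86_64_RELATIVE: load base plus addend, no symbol lookup."""
    return load_base + addend
```

For the first row of the listing, Info `0000000300000007` decodes to symbol index 3 with type `R_X86_64_JUMP_SLOT`.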

Binding Modes

Lazy binding (the traditional default) resolves symbols on first use. This speeds up startup but means resolution errors occur mid-execution. Some toolchains and distributions link with -z now by default, which disables it.

Immediate binding (LD_BIND_NOW=1 or linked with -z now) resolves all symbols at load time. Slower startup, but failures occur immediately and GOT can be marked read-only afterward (RELRO).

Full RELRO (-z relro -z now) makes the GOT read-only after resolution, preventing GOT overwrite attacks.

Mach-O

Mach-O (Mach Object) uses load commands as its central organizing principle. The header declares how many load commands follow, and each command describes a segment, library dependency, symbol table location, or other metadata.

┌──────────────────────┐
│    Mach-O Header     │  ← Magic, cputype, filetype, ncmds
├──────────────────────┤
│    Load Commands     │  ← LC_SEGMENT_64, LC_LOAD_DYLIB, LC_SYMTAB, etc.
├──────────────────────┤
│   __TEXT Segment     │  ← Read-execute, immutable after load
│     __text           │     Machine code
│     __stubs          │     Import stubs (like PLT)
│     __stub_helper    │     Lazy binding trampoline
│     __cstring        │     C string literals
│     __const          │     Other constants
├──────────────────────┤
│   __DATA Segment     │  ← Read-write
│     __got            │     Non-lazy symbol pointers
│     __la_symbol_ptr  │     Lazy symbol pointers
│     __data           │     Initialized globals
│     __bss            │     Zero-fill
├──────────────────────┤
│  __LINKEDIT Segment  │  ← Linking metadata, mapped read-only for dyld
│     Symbol table     │
│     String table     │
│     Code signature   │
│     Export trie      │
│     Binding opcodes  │
└──────────────────────┘

Unlike ELF’s dual section/segment model, Mach-O segments directly contain named sections. The __TEXT segment contains sections like __text (code) and __cstring (strings). This single hierarchy simplifies the format but reduces flexibility.

# View load commands (the core of Mach-O structure)
otool -l ./binary
 
# Library dependencies
otool -L ./binary
 
# Segment and section sizes
size -m ./binary
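Because every load command starts with a command ID and its own size, walking them is a simple loop. The sketch below assumes a thin 64-bit little-endian Mach-O image. `list_load_commands` is a hypothetical helper, and only three command IDs are named, with the rest shown in hex.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
# mach_header_64: magic, cputype, cpusubtype, filetype, ncmds,
# sizeofcmds, flags, reserved (32 bytes in total).
MACH_HEADER_64 = struct.Struct("<IIIIIIII")
LC_SEGMENT_64 = 0x19
LC_NAMES = {0x2: "LC_SYMTAB", 0xC: "LC_LOAD_DYLIB", LC_SEGMENT_64: "LC_SEGMENT_64"}


def list_load_commands(data: bytes) -> list:
    """Return a label per load command, in file order (hypothetical helper)."""
    magic, _cpu, _sub, _ftype, ncmds, _size, _flags, _res = MACH_HEADER_64.unpack_from(data)
    if magic != MH_MAGIC_64:
        raise ValueError("this sketch only handles 64-bit little-endian Mach-O")
    labels = []
    offset = MACH_HEADER_64.size
    for _ in range(ncmds):
        cmd, cmdsize = struct.unpack_from("<II", data, offset)
        if cmdsize < 8:
            raise ValueError("malformed load command")
        label = LC_NAMES.get(cmd, hex(cmd))
        if cmd == LC_SEGMENT_64:
            # segname is a 16-byte NUL-padded field right after cmd and cmdsize.
            segname = data[offset + 8:offset + 24].rstrip(b"\x00").decode("ascii")
            label += " " + segname
        labels.append(label)
        offset += cmdsize  # each command states its own length
    return labels
```

The `cmdsize` field is what lets a parser skip commands it does not understand, which is how the format gains new command types without breaking older tools.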

Dynamic Linking

Mach-O’s import mechanism parallels ELF’s PLT/GOT but with different encoding:

__stubs contains small code stubs (like PLT entries). Each stub loads an address from __la_symbol_ptr and jumps to it.

__la_symbol_ptr holds lazy symbol pointers (like .got.plt). Initially these point to __stub_helper code that triggers the resolver.

__got holds non-lazy pointers for data symbols, resolved at load time.

The binding information itself isn’t stored as relocation entries. Instead, LC_DYLD_INFO_ONLY points to a bytecode stream that encodes binding operations compactly. Binaries targeting macOS 12 or later typically use chained fixups (LC_DYLD_CHAINED_FIXUPS) instead, which embed fixup metadata directly in the pointer slots.

# Binding info (older format)
dyldinfo -bind -lazy_bind ./binary
 
# Modern chained fixups (macOS 12+)
xcrun dyld_info -fixups ./binary
 
# Export trie (symbols this library provides)
xcrun dyld_info -exports ./binary

Exports use a trie (prefix tree) rather than a flat table. Symbol lookup is O(n) in the length n of the symbol name, regardless of how many symbols the library exports.
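To show why lookup cost depends on name length rather than export count, here is a simplified trie built from nested dictionaries. It is only a model: the on-disk export trie uses multi-character edges and ULEB128-encoded offsets, which this sketch does not attempt to reproduce. All names here are hypothetical.

```python
TERMINAL = "\0addr"  # marker key; cannot clash with single-character edge keys


def build_trie(exports: dict) -> dict:
    """Build a character-per-edge trie from a name -> address mapping."""
    root = {}
    for name, addr in exports.items():
        node = root
        for ch in name:
            node = node.setdefault(ch, {})
        node[TERMINAL] = addr
    return root


def lookup(root: dict, name: str):
    """Walk one edge per character; cost is proportional to len(name)."""
    node = root
    for ch in name:
        node = node.get(ch)
        if node is None:
            return None
    return node.get(TERMINAL)
```

Symbols that share a prefix, such as `_puts` and `_putchar`, share the nodes for `_put`, so common prefixes are stored once.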

Fat Binaries

Mach-O supports universal binaries containing multiple architectures. A fat header (magic ca fe ba be) indexes embedded Mach-O images. The kernel selects the appropriate slice at exec time.

# Inspect fat binary
lipo -info ./binary
file ./binary
 
# Extract single architecture
lipo ./binary -thin arm64 -output ./binary_arm64
 
# Combine architectures
lipo -create ./binary_x86_64 ./binary_arm64 -output ./binary_universal
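The fat header is small enough to decode by hand. The sketch below assumes the classic 32-bit fat layout, in which all fields are big-endian regardless of the architectures inside. `list_slices` is a hypothetical helper, and only two CPU types are named.

```python
import struct

FAT_MAGIC = 0xCAFEBABE
CPU_NAMES = {0x01000007: "x86_64", 0x0100000C: "arm64"}


def list_slices(data: bytes) -> list:
    """Return (arch, offset, size) for each embedded image (hypothetical helper)."""
    magic, count = struct.unpack_from(">II", data, 0)
    if magic != FAT_MAGIC:
        raise ValueError("not a fat binary")
    slices = []
    for i in range(count):
        # fat_arch: cputype, cpusubtype, offset, size, align (20 bytes each).
        cputype, _sub, offset, size, _align = struct.unpack_from(">IIIII", data, 8 + 20 * i)
        slices.append((CPU_NAMES.get(cputype, hex(cputype)), offset, size))
    return slices
```

Each returned offset and size delimits a complete thin Mach-O image, which is the byte range that `lipo -thin` extracts.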

Warning

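A fat binary holds one complete image per architecture, so a two-architecture universal binary is roughly double the size of a single slice. When analyzing binary size on macOS, extract the relevant architecture first to avoid confusion.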

Code Signing

On Apple silicon, macOS requires a code signature for execution. Even ad-hoc signing adds an LC_CODE_SIGNATURE load command pointing to signature data in __LINKEDIT. This contributes ~16KB minimum to file size.

# Check signature status
codesign -dv ./binary
 
# Ad-hoc sign (no identity)
codesign -s - ./binary

PE

PE (Portable Executable) wraps COFF with Windows-specific headers. The format retains DOS compatibility through a stub header, then transitions to PE structures.

┌──────────────────────┐
│     DOS Header       │  ← MZ magic, e_lfanew points to PE signature
│     DOS Stub         │  ← "This program cannot be run in DOS mode"
├──────────────────────┤
│    PE Signature      │  ← "PE\0\0"
│    COFF Header       │  ← Machine type, section count, timestamp
│   Optional Header    │  ← Entry point, image base, data directories
├──────────────────────┤
│   Section Headers    │  ← .text, .rdata, .data, .rsrc, .reloc
├──────────────────────┤
│       .text          │  ← Executable code
│       .rdata         │  ← Read-only data, import/export tables
│       .data          │  ← Writable initialized data
│       .bss           │  ← Uninitialized data
│       .rsrc          │  ← Windows resources (icons, dialogs, etc.)
│       .reloc         │  ← Base relocations for ASLR
└──────────────────────┘

The Optional Header (not actually optional for executables) contains Data Directories: an array of (RVA, size) entries, normally 16 of them, that locate the import tables, export tables, exception data, debug info, and other structures. This indirection lets the loader find metadata regardless of section layout.
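The chain from DOS header to data directories can be followed with a handful of fixed offsets. The sketch below is a minimal walk under these assumptions: `e_lfanew` sits at offset 0x3C, the COFF header follows the 4-byte signature, and the data directories start at offset 96 (PE32) or 112 (PE32+) within the optional header. `parse_pe` and its result keys are hypothetical names.

```python
import struct

MACHINES = {0x14C: "I386", 0x8664: "AMD64", 0xAA64: "ARM64"}
# Offset of the first data directory inside the optional header, keyed by its magic.
DIR_OFFSETS = {0x10B: 96, 0x20B: 112}  # PE32, PE32+


def parse_pe(data: bytes) -> dict:
    """Follow DOS header -> PE signature -> COFF -> data directories (sketch)."""
    if data[:2] != b"MZ":
        raise ValueError("missing MZ magic")
    (e_lfanew,) = struct.unpack_from("<I", data, 0x3C)
    if data[e_lfanew:e_lfanew + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    machine, nsections, _ts, _symptr, _nsyms, _optsize, _chars = struct.unpack_from(
        "<HHIIIHH", data, e_lfanew + 4)
    opt = e_lfanew + 24  # 4-byte signature + 20-byte COFF header
    (opt_magic,) = struct.unpack_from("<H", data, opt)
    dir_offset = DIR_OFFSETS[opt_magic]
    # NumberOfRvaAndSizes is the 4-byte field just before the directories.
    (ndirs,) = struct.unpack_from("<I", data, opt + dir_offset - 4)
    ndirs = min(ndirs, 16)  # guard against malformed counts
    dirs = [struct.unpack_from("<II", data, opt + dir_offset + 8 * i) for i in range(ndirs)]
    return {
        "machine": MACHINES.get(machine, hex(machine)),
        "sections": nsections,
        "pe32_plus": opt_magic == 0x20B,
        "export_directory": dirs[0] if ndirs > 0 else None,  # (RVA, size)
        "import_directory": dirs[1] if ndirs > 1 else None,  # (RVA, size)
    }
```

The directory entries hold RVAs, not file offsets. A real parser would translate them through the section table before reading the import descriptors.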

Import Mechanism (IAT)

PE resolves imports through the Import Address Table (IAT). Unlike ELF’s lazy binding, PE traditionally resolves everything at load time.

The import directory contains Import Descriptors for each DLL dependency. Each descriptor references:

Import Lookup Table (ILT): Names or ordinals to resolve
Import Address Table (IAT): Slots the loader patches with resolved addresses

; PE import call (no stub indirection)
call qword ptr [__imp_puts]   ; __imp_puts is an IAT slot

Call sites reference IAT entries directly—no PLT-style stub. This saves one indirection compared to ELF but requires the IAT to be writable during loading.
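As a contrast with lazy binding, the load-time behavior can be modeled as a single pass that fills every IAT slot before any program code runs. This is a toy sketch rather than the Windows loader's algorithm. `bind_imports`, the shape of `import_table`, and the dictionaries standing in for loaded DLLs are all assumptions.

```python
def bind_imports(import_table: dict, loaded_dlls: dict) -> dict:
    """Resolve every import up front (toy model of load-time IAT binding).

    import_table: DLL name -> list of imported function names (the lookup side).
    loaded_dlls:  DLL name -> dict of exported name -> address.
    Returns the filled IAT as (dll, name) -> address.
    """
    iat = {}
    for dll, names in import_table.items():
        exports = loaded_dlls.get(dll)
        if exports is None:
            raise LookupError(f"{dll} could not be loaded")
        for name in names:
            if name not in exports:
                raise LookupError(f"{name} not exported by {dll}")
            iat[(dll, name)] = exports[name]  # patch the slot
    return iat
```

Because the whole table is filled in one pass, a missing DLL or export fails the load immediately rather than at the first call.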

Important

PE symbols are not exported by default. DLL exports require explicit __declspec(dllexport) or a .def file. This is the opposite of ELF and Mach-O where symbols default to visible.

Delay-Load DLLs

PE supports optional lazy binding through delay-load imports. A separate Delay Import Descriptor and helper function load the DLL on first use. This is opt-in at link time (/DELAYLOAD:foo.dll), unlike ELF where lazy binding is the default.

# PE inspection (Windows)
dumpbin /headers myapp.exe
dumpbin /imports myapp.exe
dumpbin /exports mylib.dll
dumpbin /dependents myapp.exe
 
# Cross-platform
llvm-readobj --coff-imports myapp.exe

Debug Information

Debug info enables source-level debugging and symbolicated crash reports. Each platform handles this differently:

Format	Debug Container	Typical External Debug
ELF	DWARF in `.debug_*` sections	Split DWARF (`.dwo`), separate `.debug` files
Mach-O	DWARF in `__DWARF` segment	`.dSYM` bundles
PE	CodeView reference	`.pdb` files (almost always separate)

ELF embeds DWARF directly. Stripping removes it. Split DWARF (-gsplit-dwarf) moves debug info to .dwo files, keeping the main binary small while preserving debuggability.

Mach-O can embed DWARF, but Apple tooling typically extracts it to dSYM bundles—directories containing a Mach-O with only debug sections. The main binary stores a UUID for correlation.

# Create dSYM
dsymutil ./binary -o ./binary.dSYM
 
# Verify dSYM integrity
dwarfdump --verify ./binary.dSYM

PE almost never embeds debug info. The compiler generates .pdb (Program Database) files containing types, symbols, and line mappings. The PE file contains only a CodeView reference (GUID + age) to locate the matching PDB.
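For reference, the CodeView record that carries the GUID and age is commonly the "RSDS" (PDB 7.0) form: a 4-byte signature, a 16-byte GUID, a 4-byte age, and a NUL-terminated PDB path. The sketch below decodes that layout. The record name and layout go beyond what this note states, and `parse_codeview_rsds` is a hypothetical helper.

```python
import struct
import uuid


def parse_codeview_rsds(record: bytes):
    """Decode an RSDS CodeView record into (guid, age, pdb_path)."""
    if record[:4] != b"RSDS":
        raise ValueError("not an RSDS CodeView record")
    # The GUID is stored in Windows mixed-endian form, which bytes_le handles.
    guid = uuid.UUID(bytes_le=record[4:20])
    (age,) = struct.unpack_from("<I", record, 20)
    end = record.index(b"\x00", 24)
    pdb_path = record[24:end].decode("utf-8")
    return guid, age, pdb_path
```

A debugger or symbol server compares the GUID and age against those stored in a candidate PDB and rejects the file if they differ.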

Practical Inspection

Cross-Platform Tools

LLVM tools provide consistent interfaces across all formats:

llvm-readobj --headers ./binary      # Universal header dump
llvm-objdump -d ./binary             # Disassembly
llvm-nm ./binary                     # Symbol listing
llvm-size ./binary                   # Section sizes

Task	ELF	Mach-O	PE
Headers	`readelf -h`	`otool -h`	`dumpbin /headers`
Sections	`readelf -S`	`otool -l`, `size -m`	`dumpbin /headers`
Symbols	`nm`, `readelf -s`	`nm`	`dumpbin /symbols`
Dynamic symbols	`readelf --dyn-syms`	`nm -m`, `xcrun dyld_info -exports`	`dumpbin /exports`
Imports	`readelf -r`, `objdump -R`	`xcrun dyld_info -fixups`	`dumpbin /imports`
Dependencies	`ldd`, `readelf -d`	`otool -L`	`dumpbin /dependents`
Disassembly	`objdump -d`	`otool -tV`	`dumpbin /disasm`

Tracing Dynamic Linking

ELF

# What libraries are needed?
readelf -d ./binary | grep NEEDED
 
# What symbols are imported?
readelf --dyn-syms ./binary | grep UND
 
# What relocations will the loader process?
readelf -rW ./binary
 
# PLT disassembly (see the stubs)
objdump -d -j .plt -j .plt.sec ./binary
 
# Runtime: where did symbols resolve?
LD_DEBUG=bindings ./binary 2>&1 | grep puts

Mach-O

# What libraries are needed?
otool -L ./binary
 
# What symbols are imported?
nm -mu ./binary
 
# Binding information
xcrun dyld_info -fixups ./binary
 
# Stub disassembly
otool -v -s __TEXT __stubs ./binary
 
# Runtime: trace dyld
DYLD_PRINT_BINDINGS=1 ./binary

PE

# What DLLs are needed?
dumpbin /dependents myapp.exe
 
# What functions are imported?
dumpbin /imports myapp.exe
 
# What functions are exported?
dumpbin /exports mylib.dll
