Having Fun with Goblin

Goblin is a binary format parser. It reads the structure and metadata of executable files — headers, sections, symbols, dependencies, relocations.

It is NOT a disassembler. It doesn’t decode machine instructions (0x55 0x48 0x89 0xe5 → push rbp; mov rbp, rsp). For that, see Disassemblying.

Think of it this way:

Layer	Tool	What it reads
Format/Metadata	Goblin	Headers, sections, symbols, imports
Instructions	capstone, iced-x86	Raw bytes → assembly mnemonics
Semantics	Binary Ninja, Ghidra	Control flow, decompilation

Goblin is Layer 0 — you need to understand the container before you can find the code inside it. It answers: where is the code, what symbols exist, how is the binary structured. A disassembler answers: what do the bytes do.

To build a disassembler in Rust, you’d combine:

// Goblin finds the .text section (where code lives)
// capstone (the crate published by the capstone-rs project) decodes the bytes inside it
goblin + capstone     // = DIY disassembler (wraps the C Capstone engine)
goblin + iced-x86     // = alternative, pure Rust, x86/x64 only

ELF Crash Course: The Mental Model

ELF (Executable and Linkable Format) is how Linux packages a compiled program. Think of it as a zip file with a very rigid table of contents.

┌──────────────────────────────────────────┐
│              ELF Header                  │  ← "What am I?" (arch, type, entry point)
├──────────────────────────────────────────┤
│         Program Headers (Segments)       │  ← "How to LOAD me into memory"
│         (used at runtime by the kernel)  │     (the kernel reads these)
├──────────────────────────────────────────┤
│                                          │
│         .text    (executable code)       │
│         .rodata  (read-only constants)   │
│         .data    (initialized globals)   │
│         .bss     (uninitialized globals) │
│         .symtab  (symbol table)          │
│         .strtab  (string table)          │
│         .dynsym  (dynamic symbols)       │
│         .dynstr  (dynamic strings)       │
│         .plt     (procedure linkage)     │
│         .got     (global offset table)   │
│         .dynamic (dynamic linking info)  │
│         ...                              │
│                                          │
├──────────────────────────────────────────┤
│         Section Headers                  │  ← "How to ANALYZE me"
│         (used by tools like readelf/nm)  │     (debuggers/linkers read these)
└──────────────────────────────────────────┘

Key distinction:

Segments (Program Headers) = runtime view. The kernel loads these into memory.
Sections (Section Headers) = development view. Tools care about these.

A segment can contain multiple sections. The .text section lives inside a PT_LOAD segment marked executable.
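
To make the "rigid table of contents" concrete, here is a minimal sketch that reads a few ELF header fields by hand, using only the standard library. The names `ElfSummary`, `read_uint`, `read_elf_summary`, and `e_type_name` are hypothetical helpers invented for this illustration. They are not part of Goblin, which does all of this (and far more) for you.

```rust
/// A handful of ELF header fields, read without any parsing library.
#[derive(Debug, PartialEq)]
struct ElfSummary {
    is_64: bool,
    little_endian: bool,
    e_type: u16,
    e_machine: u16,
    entry: u64,
}

/// Read `width` bytes (at most 8) starting at `off` as an unsigned integer.
fn read_uint(bytes: &[u8], off: usize, width: usize, little_endian: bool) -> Option<u64> {
    let field = bytes.get(off..off.checked_add(width)?)?;
    let step = |acc: u64, b: &u8| (acc << 8) | u64::from(*b);
    Some(if little_endian {
        field.iter().rev().fold(0, step)
    } else {
        field.iter().fold(0, step)
    })
}

/// Hypothetical helper: pull the "What am I?" fields out of the ELF header.
fn read_elf_summary(bytes: &[u8]) -> Option<ElfSummary> {
    // e_ident: 4 magic bytes, then EI_CLASS at offset 4 and EI_DATA at offset 5.
    if !bytes.starts_with(b"\x7fELF") {
        return None;
    }
    let is_64 = match *bytes.get(4)? {
        1 => false, // ELFCLASS32
        2 => true,  // ELFCLASS64
        _ => return None,
    };
    let little_endian = match *bytes.get(5)? {
        1 => true,  // ELFDATA2LSB
        2 => false, // ELFDATA2MSB
        _ => return None,
    };
    // e_type sits at offset 16, e_machine at 18, e_entry at 24 (4 or 8 bytes wide).
    let entry_width = if is_64 { 8 } else { 4 };
    Some(ElfSummary {
        is_64,
        little_endian,
        e_type: read_uint(bytes, 16, 2, little_endian)? as u16,
        e_machine: read_uint(bytes, 18, 2, little_endian)? as u16,
        entry: read_uint(bytes, 24, entry_width, little_endian)?,
    })
}

/// Map the numeric e_type to the constant names used in the tables below.
fn e_type_name(e_type: u16) -> &'static str {
    match e_type {
        1 => "ET_REL",
        2 => "ET_EXEC",
        3 => "ET_DYN",
        4 => "ET_CORE",
        _ => "other",
    }
}
```

Everything past the header (program headers, section headers, string tables) follows the same pattern of fixed offsets and widths, which is exactly the bookkeeping a parser library takes off your hands.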

The Jargon Decoder

ELF Header Constants

Constant	Full Name	Meaning
`ET_EXEC`	Executable Type	Traditional fixed-address executable
`ET_DYN`	Dynamic/Shared	Position-independent (PIE executable OR .so library)
`ET_REL`	Relocatable	Object file (.o), not yet linked
`ET_CORE`	Core dump	Process memory dump from a crash

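Why ET_DYN matters: Modern executables are compiled as PIE (Position-Independent Executables) so the kernel can load them at a random address (ASLR). PIE binaries show up as ET_DYN, not ET_EXEC, and so do shared libraries, so ET_DYN alone does not tell the two apart. If you see ET_EXEC, the executable itself is mapped at a fixed address and ASLR cannot randomize its code and data (the stack, heap, and shared libraries can still move). Treat that as a security finding.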

Program Header Types (Segments)

Constant	What it is
`PT_LOAD`	“Load this chunk into memory”: the actual code & data
`PT_DYNAMIC`	Dynamic linking info (what .so files are needed)
`PT_INTERP`	Path to the dynamic linker (`/lib64/ld-linux-x86-64.so.2`)
`PT_GNU_STACK`	Declares whether the stack should be executable
`PT_GNU_RELRO`	Marks memory to be made read-only after relocation
`PT_NOTE`	Metadata (build ID, ABI tag, etc.)

Why PT_GNU_STACK matters: If this segment has the execute flag (PF_X), the process gets an executable stack — meaning buffer overflow exploits can inject shellcode directly onto the stack and run it. Modern binaries should NEVER have this set. If you find it, something is very wrong (probably hand-written assembly or ancient build flags).
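
The flag check itself is a single bit test. As a minimal sketch (standard library only, with the `PF_*` values taken from the ELF specification and the helper names `segment_perms` and `stack_is_non_executable` invented here), the permission bits of any segment can be decoded like this:

```rust
// Segment permission bits as defined by the ELF specification.
const PF_X: u32 = 0x1;
const PF_W: u32 = 0x2;
const PF_R: u32 = 0x4;

/// Render p_flags the way `readelf -l` style tools do, e.g. "r-x".
fn segment_perms(p_flags: u32) -> String {
    let mut s = String::with_capacity(3);
    s.push(if p_flags & PF_R != 0 { 'r' } else { '-' });
    s.push(if p_flags & PF_W != 0 { 'w' } else { '-' });
    s.push(if p_flags & PF_X != 0 { 'x' } else { '-' });
    s
}

/// `stack_flags` is the p_flags of the PT_GNU_STACK header, or None when the
/// header is absent. Assumption of this sketch: absence counts as a failure.
fn stack_is_non_executable(stack_flags: Option<u32>) -> bool {
    matches!(stack_flags, Some(flags) if flags & PF_X == 0)
}
```

Treating a missing header as a failure is a conservative policy choice for an auditor, not something the format mandates.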

Why PT_GNU_RELRO matters: It marks a region, including parts of the GOT (Global Offset Table), that the dynamic linker makes read-only once startup relocations are done. This prevents attackers from overwriting the function pointers stored there. On its own this is only partial RELRO, because lazily bound GOT entries stay writable. “Full RELRO” = PT_GNU_RELRO + BIND_NOW (resolve everything upfront, then lock it down).

Dynamic Section Tags

Constant	What it controls
`DT_NEEDED`	“I need this shared library” (e.g., `libc.so.6`)
`DT_SONAME`	“My canonical name is…” (for library versioning)
`DT_RPATH`	Hardcoded library search path, searched before `LD_LIBRARY_PATH` (deprecated, generally bad practice)
`DT_RUNPATH`	Library search path, searched after `LD_LIBRARY_PATH` (preferred over RPATH)
`DT_FLAGS`	Bitfield of flags, including…
`DF_BIND_NOW`	Flag bit inside `DT_FLAGS`: “Resolve ALL symbols at load time, not lazily”
`DT_FLAGS_1`	Extended flags (`DF_1_NOW`, the `DF_1_PIE` PIE indicator, etc.)

Why DT_FLAGS / DF_BIND_NOW matters: Lazy binding means function addresses are resolved the first time they’re called. To allow that, the GOT entries used by the PLT must stay writable for the life of the process, which gives an attacker a target. BIND_NOW closes it: resolve everything at startup, then (with RELRO) make the GOT read-only. The cost is slightly slower startup. The benefit is a much smaller attack surface.
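
Linkers can record "bind now" in more than one place: as a standalone `DT_BIND_NOW` tag, as the `DF_BIND_NOW` bit in `DT_FLAGS`, or as the `DF_1_NOW` bit in `DT_FLAGS_1`. The following sketch (standard library only) checks all three and derives a RELRO level. The constant values come from the ELF specification and glibc's `elf.h`. The `Relro` enum and both function names are hypothetical, and the dynamic section is modeled as plain `(tag, value)` pairs.

```rust
// Dynamic tag and flag values from the ELF specification / elf.h.
const DT_BIND_NOW: u64 = 24;
const DT_FLAGS: u64 = 30;
const DT_FLAGS_1: u64 = 0x6fff_fffb;
const DF_BIND_NOW: u64 = 0x8;
const DF_1_NOW: u64 = 0x1;

#[derive(Debug, PartialEq)]
enum Relro {
    Disabled,
    Partial,
    Full,
}

/// True if any of the three "bind now" encodings is present.
fn binds_now(dynamic: &[(u64, u64)]) -> bool {
    dynamic.iter().any(|&(tag, val)| match tag {
        DT_BIND_NOW => true,
        DT_FLAGS => val & DF_BIND_NOW != 0,
        DT_FLAGS_1 => val & DF_1_NOW != 0,
        _ => false,
    })
}

/// Full RELRO = PT_GNU_RELRO present + bind now; RELRO alone is partial.
fn relro_level(has_gnu_relro: bool, dynamic: &[(u64, u64)]) -> Relro {
    match (has_gnu_relro, binds_now(dynamic)) {
        (false, _) => Relro::Disabled,
        (true, false) => Relro::Partial,
        (true, true) => Relro::Full,
    }
}
```

With Goblin, the `(tag, value)` pairs would correspond to the `d_tag` and `d_val` fields of the parsed dynamic entries.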

Symbol Binding & Visibility

Constant	Meaning
`STB_GLOBAL`	Visible to all object files during linking
`STB_LOCAL`	Only visible within this object file
`STB_WEAK`	Like global but can be overridden without error
`STV_HIDDEN`	Not exported from shared library (internal use only)
`STV_DEFAULT`	Normal visibility, exported
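
Binding and type are packed into the single `st_info` byte of a symbol table entry, and visibility lives in the low two bits of `st_other`. A minimal sketch of the decoding, using the numeric values from the ELF specification (the `is_exported` helper and its policy are an illustration, not a Goblin API):

```rust
// Values from the ELF specification.
const STB_GLOBAL: u8 = 1;
const STB_WEAK: u8 = 2;
const STV_DEFAULT: u8 = 0;
const STV_PROTECTED: u8 = 3;
const SHN_UNDEF: u16 = 0;

/// Upper four bits of st_info: the binding (STB_*).
fn st_bind(st_info: u8) -> u8 {
    st_info >> 4
}

/// Lower four bits of st_info: the symbol type (STT_*).
fn st_type(st_info: u8) -> u8 {
    st_info & 0x0f
}

/// Lower two bits of st_other: the visibility (STV_*).
fn st_visibility(st_other: u8) -> u8 {
    st_other & 0x03
}

/// A symbol is usable by other modules when it is defined here, bound
/// globally or weakly, and not hidden or internal.
fn is_exported(st_info: u8, st_other: u8, st_shndx: u16) -> bool {
    let bind = st_bind(st_info);
    let vis = st_visibility(st_other);
    st_shndx != SHN_UNDEF
        && (bind == STB_GLOBAL || bind == STB_WEAK)
        && (vis == STV_DEFAULT || vis == STV_PROTECTED)
}
```

For example, `st_info = 0x12` decodes to a global binding (1) and a function type (2).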

Why a Library Instead of Binutils

Approach	Good for
`readelf -d mybin`	Quick look by a human in a terminal
Goblin in a Rust program	Automated systems that make decisions

The moment you need to:

Analyze at scale (100s-1000s of binaries)
Make programmatic decisions (pass/fail CI, allow/deny plugin load)
Produce structured output (JSON reports, database entries, dashboards)
Compose with other logic (network calls, DB lookups, policy engines)
Do it cross-platform (parse a Windows PE on Linux, inspect iOS Mach-O on CI)

…shelling out falls apart. Parsing unstable text. Handling locale differences. Spawning processes. Goblin gives typed, structured, zero-copy access.

It’s the difference between grep-ing HTML and using a DOM parser.

Use Cases: Practical and Non-Trivial

1. Container Image Security Auditor

Scan every binary in a Docker image. Check hardening flags. Fail the CI pipeline if anything isn’t locked down.

use goblin::elf::Elf;
use goblin::elf::header::ET_DYN;
use goblin::elf::program_header::{PF_X, PT_GNU_RELRO, PT_GNU_STACK};
use goblin::elf::dynamic::{DF_1_NOW, DF_BIND_NOW, DT_BIND_NOW, DT_FLAGS, DT_FLAGS_1};

struct SecurityReport {
    path: String,
    is_pie: bool,
    has_nx_stack: bool,
    has_full_relro: bool,
}

fn audit_elf(path: &str, buffer: &[u8]) -> Option<SecurityReport> {
    let elf = Elf::parse(buffer).ok()?;

    // ET_DYN covers PIE executables and shared libraries alike
    let is_pie = elf.header.e_type == ET_DYN;

    // A missing PT_GNU_STACK header is reported as a failure (conservative)
    let has_nx_stack = elf.program_headers.iter()
        .find(|ph| ph.p_type == PT_GNU_STACK)
        .map(|ph| ph.p_flags & PF_X == 0) // PF_X not set
        .unwrap_or(false);

    let has_relro = elf.program_headers.iter()
        .any(|ph| ph.p_type == PT_GNU_RELRO);

    // "Bind now" may appear as DT_BIND_NOW, DF_BIND_NOW, or DF_1_NOW
    let has_bind_now = elf.dynamic.as_ref()
        .map(|d| d.dyns.iter().any(|dyn_entry| match dyn_entry.d_tag {
            DT_BIND_NOW => true,
            DT_FLAGS => (dyn_entry.d_val & DF_BIND_NOW as u64) != 0,
            DT_FLAGS_1 => (dyn_entry.d_val & DF_1_NOW as u64) != 0,
            _ => false,
        }))
        .unwrap_or(false);
 
    Some(SecurityReport {
        path: path.to_string(),
        is_pie,
        has_nx_stack,
        has_full_relro: has_relro && has_bind_now,
    })
}

Real-world: This is essentially what checksec does, but now it’s part of your Rust pipeline, producing structured JSON, integrated with your alerting.
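
Turning those reports into a pass/fail CI decision could look like the sketch below. It uses only the standard library. `Report` mirrors the fields of `SecurityReport` so the block compiles on its own, and `failed_checks` and `gate` are hypothetical names.

```rust
/// Mirrors the fields of the SecurityReport struct from the auditor.
struct Report {
    path: String,
    is_pie: bool,
    has_nx_stack: bool,
    has_full_relro: bool,
}

/// List the hardening checks a single binary fails.
fn failed_checks(report: &Report) -> Vec<&'static str> {
    let mut failed = Vec::new();
    if !report.is_pie {
        failed.push("PIE");
    }
    if !report.has_nx_stack {
        failed.push("NX stack");
    }
    if !report.has_full_relro {
        failed.push("full RELRO");
    }
    failed
}

/// Returns a process exit code: 0 if every binary passes, 1 otherwise.
fn gate(reports: &[Report]) -> i32 {
    let mut exit_code = 0;
    for report in reports {
        let failed = failed_checks(report);
        if !failed.is_empty() {
            eprintln!("FAIL {}: missing {}", report.path, failed.join(", "));
            exit_code = 1;
        }
    }
    exit_code
}
```

A pipeline step would pass the result to `std::process::exit` so that a single unhardened binary fails the build.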

2. Binary Diffing Across Releases (ABI Compatibility)

What changed at the symbol level between two versions of a shared library? Catch accidental ABI breaks before they hit production.

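use std::collections::HashMap;
use goblin::elf::Elf;

struct SymbolInfo {
    size: u64,
}

// Helper assumed by diff_binaries; one possible implementation.
// Collects the symbols a shared library defines and exports (name → info).
fn extract_symbols(bytes: &[u8]) -> HashMap<String, SymbolInfo> {
    let Ok(elf) = Elf::parse(bytes) else {
        return HashMap::new();
    };
    elf.dynsyms.iter()
        .filter(|s| s.st_shndx != 0) // skip undefined symbols (imports)
        .filter_map(|s| {
            let name = elf.dynstrtab.get_at(s.st_name)?;
            Some((name.to_string(), SymbolInfo { size: s.st_size }))
        })
        .collect()
}

fn diff_binaries(old: &[u8], new: &[u8]) {
    let old_syms = extract_symbols(old);
    let new_syms = extract_symbols(new);

    for name in old_syms.keys() {
        if !new_syms.contains_key(name) {
            println!("🔴 REMOVED: {name}");  // ABI break!
        }
    }

    for name in new_syms.keys() {
        if !old_syms.contains_key(name) {
            println!("🟢 ADDED: {name}");
        }
    }

    for (name, old_info) in &old_syms {
        if let Some(new_info) = new_syms.get(name) {
            if old_info.size != new_info.size {
                let delta = new_info.size as i64 - old_info.size as i64;
                println!("🟡 RESIZED: {name} ({delta:+} bytes)");
            }
        }
    }
}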
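
For CI use, printing is less convenient than returning data. Here is a variant of the same diff as a sketch that returns a struct instead. It uses only the standard library and takes plain name → size maps, so it makes no assumption about how the symbols were extracted. `SymbolDiff`, `diff_symbols`, and `is_abi_break` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// Structured result of comparing two exported-symbol tables.
#[derive(Debug, Default, PartialEq)]
struct SymbolDiff {
    removed: Vec<String>,
    added: Vec<String>,
    resized: Vec<(String, i64)>, // (name, size delta in bytes)
}

/// Compare two name → size maps. BTreeMap keeps the output sorted by name.
fn diff_symbols(old: &BTreeMap<String, u64>, new: &BTreeMap<String, u64>) -> SymbolDiff {
    let mut diff = SymbolDiff::default();
    for (name, &old_size) in old {
        match new.get(name) {
            None => diff.removed.push(name.clone()),
            Some(&new_size) if new_size != old_size => {
                diff.resized.push((name.clone(), new_size as i64 - old_size as i64));
            }
            Some(_) => {}
        }
    }
    for name in new.keys() {
        if !old.contains_key(name) {
            diff.added.push(name.clone());
        }
    }
    diff
}

/// Policy assumed for this sketch: only a removed symbol counts as a break.
fn is_abi_break(diff: &SymbolDiff) -> bool {
    !diff.removed.is_empty()
}
```

Whether a resized symbol should also fail the build depends on the library: a resized function is usually harmless, a resized exported data object may not be.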

3. Plugin Loader with Pre-Execution Validation

Before dlopen-ing an untrusted .so, inspect it statically:

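use std::collections::HashSet;
use goblin::elf::Elf;
use goblin::elf::dynamic::{DT_INIT, DT_INIT_ARRAY};
use goblin::elf::sym::STB_GLOBAL;

const REQUIRED_EXPORTS: &[&str] = &[
    "plugin_init", "plugin_handle", "plugin_version"
];
// libc has no function named plain "exec", so list the real entry points
const BANNED_IMPORTS: &[&str] = &[
    "system", "popen", "dlopen", "execve", "execvp", "execl", "posix_spawn",
];

struct PluginValidation<'a> {
    exports_required_api: bool,
    suspicious_imports: Vec<&'a str>,
    has_constructor: bool,
}

fn validate_plugin(buffer: &[u8]) -> Result<PluginValidation<'_>, goblin::error::Error> {
    // Untrusted input: propagate parse errors instead of panicking
    let elf = Elf::parse(buffer)?;

    // Exported = a global function that is defined in this file (st_shndx != 0)
    let exported: HashSet<&str> = elf.dynsyms.iter()
        .filter(|s| s.is_function() && s.st_bind() == STB_GLOBAL && s.st_shndx != 0)
        .filter_map(|s| elf.dynstrtab.get_at(s.st_name))
        .collect();

    let suspicious: Vec<&str> = elf.dynsyms.iter()
        .filter(|s| s.is_import())
        .filter_map(|s| elf.dynstrtab.get_at(s.st_name))
        .filter(|name| BANNED_IMPORTS.contains(name))
        .collect();

    // Does it have constructors? (they run code automatically on dlopen!)
    // The loader finds them through DT_INIT / DT_INIT_ARRAY, so check those
    // tags as well as section names: section headers can be stripped.
    // See: C Runtime — .init_array constructors
    let has_init_tag = elf.dynamic.as_ref()
        .map(|d| d.dyns.iter().any(|e| matches!(e.d_tag, DT_INIT | DT_INIT_ARRAY)))
        .unwrap_or(false);
    let has_init_section = elf.section_headers.iter()
        .any(|sh| matches!(
            elf.shdr_strtab.get_at(sh.sh_name),
            Some(".init_array" | ".ctors")
        ));

    Ok(PluginValidation {
        exports_required_api: REQUIRED_EXPORTS.iter()
            .all(|r| exported.contains(r)),
        suspicious_imports: suspicious,
        has_constructor: has_init_tag || has_init_section,
    })
}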

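Why this matters: .init_array functions run the instant you dlopen. If a malicious plugin has one, it executes before you can do anything. Detecting this BEFORE loading is a real security boundary. Toolchain startup code commonly registers constructors too, so treat the flag as a reason to look closer rather than as proof of malice.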

4. Shared Library Dependency Graph Auditor

Map every binary in a deployment to the .so files it declares as dependencies (following those entries in turn gives the transitive set). Answer: “What still links OpenSSL?” or “Why is libpython in our container?”

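use std::collections::HashMap;
use std::fs;
use std::path::Path;
use goblin::Object;
use walkdir::WalkDir; // from the walkdir crate

fn build_dep_graph(root: &Path) -> HashMap<String, Vec<String>> {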
    let mut graph = HashMap::new();
 
    for entry in WalkDir::new(root).into_iter().filter_map(Result::ok) {
        let buffer = fs::read(entry.path()).unwrap_or_default();
        if let Ok(Object::Elf(elf)) = Object::parse(&buffer) {
            let libs: Vec<String> = elf.libraries.iter()
                .map(|s| s.to_string())
                .collect();
            if !libs.is_empty() {
                graph.insert(entry.path().display().to_string(), libs);
            }
        }
    }
    graph
}
// Output this as JSON → feed it into a graph visualizer
// Detect conflicts: two binaries needing different versions of the same .so
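
Once the graph exists, a question like “What still links OpenSSL?” is a reverse lookup. A minimal sketch with the standard library only, assuming the same `HashMap<String, Vec<String>>` shape that `build_dep_graph` returns (`dependents_of` is a hypothetical helper, and substring matching on the library name is a simplification):

```rust
use std::collections::HashMap;

/// Return the binaries whose dependency list mentions `needle`, sorted by path.
fn dependents_of<'a>(graph: &'a HashMap<String, Vec<String>>, needle: &str) -> Vec<&'a str> {
    let mut hits: Vec<&str> = graph
        .iter()
        .filter(|(_, libs)| libs.iter().any(|lib| lib.contains(needle)))
        .map(|(path, _)| path.as_str())
        .collect();
    // HashMap iteration order is arbitrary, so sort for stable reports.
    hits.sort_unstable();
    hits
}
```

Calling `dependents_of(&graph, "libssl")` would then list every binary that declares a dependency on some version of libssl.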

5. Build a Minimal Disassembler (Goblin + Capstone)

NOW we’re actually disassembling. Goblin finds the code. Capstone decodes it.

use goblin::elf::Elf;
use goblin::elf::section_header::SHT_PROGBITS;
use capstone::prelude::*;
 
fn disassemble_text_section(buffer: &[u8]) {
    let elf = Elf::parse(buffer).unwrap();
 
    // Find .text section
    let text_section = elf.section_headers.iter()
        .find(|sh| {
            elf.shdr_strtab.get_at(sh.sh_name) == Some(".text")
            && sh.sh_type == SHT_PROGBITS
        })
        .expect("no .text section found");
 
    let offset = text_section.sh_offset as usize;
    let size = text_section.sh_size as usize;
    let code = &buffer[offset..offset + size];
    let base_addr = text_section.sh_addr;
 
    // Disassemble with capstone (this sketch assumes x86-64;
    // check elf.header.e_machine before decoding other targets)
    let cs = Capstone::new()
        .x86()
        .mode(arch::x86::ArchMode::Mode64)
        .syntax(arch::x86::ArchSyntax::Intel)
        .build()
        .unwrap();
 
    let instructions = cs.disasm_all(code, base_addr).unwrap();
    for insn in instructions.iter() {
        println!("{:#010x}:  {} {}",
            insn.address(),
            insn.mnemonic().unwrap_or("???"),
            insn.op_str().unwrap_or(""),
        );
    }
}
 
// Example output (addresses depend on the binary):
// 0x00401000:  push rbp
// 0x00401001:  mov rbp, rsp
// 0x00401004:  sub rsp, 0x10
// ...
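
The slice `&buffer[offset..offset + size]` panics if a malformed or truncated file declares a section that runs past the end of the buffer. A bounds-checked alternative, as a sketch using only the standard library (`section_bytes` is a hypothetical helper that takes the raw `sh_offset` and `sh_size` values):

```rust
/// Return the bytes of a section, or None if the declared range
/// overflows or does not fit inside the file.
fn section_bytes(buffer: &[u8], sh_offset: u64, sh_size: u64) -> Option<&[u8]> {
    let end = sh_offset.checked_add(sh_size)?;
    if end > buffer.len() as u64 {
        return None;
    }
    // Both bounds are <= buffer.len(), so the casts cannot truncate.
    Some(&buffer[sh_offset as usize..end as usize])
}
```

This matters most when the input is untrusted, as in the plugin-loader use case.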

6. Symbol-Aware Disassembly (The Cool Part)

Plain disassembly is just addresses. By combining Goblin’s symbol table with Capstone, you can label functions by name:

use std::collections::HashMap;

fn disassemble_function(buffer: &[u8], func_name: &str) {
    let elf = Elf::parse(buffer).unwrap();

    // Find the symbol by name (elf.syms is empty in stripped binaries)
    let symbol = elf.syms.iter()
        .find(|s| s.is_function() && elf.strtab.get_at(s.st_name) == Some(func_name))
        .expect("symbol not found");

    // Build address → name lookup for cross-references (functions only)
    let sym_map: HashMap<u64, &str> = elf.syms.iter()
        .filter(|s| s.is_function() && s.st_value != 0)
        .filter_map(|s| {
            let name = elf.strtab.get_at(s.st_name)?;
            Some((s.st_value, name))
        })
        .collect();

    let size = symbol.st_size as usize;

    // Find which section contains this address to calculate file offset
    // (in a linked binary, sh_addr == 0 means the section is not loaded)
    let section = elf.section_headers.iter()
        .find(|sh| {
            sh.sh_addr != 0
            && symbol.st_value >= sh.sh_addr
            && symbol.st_value < sh.sh_addr + sh.sh_size
        })
        .unwrap();

    let file_offset = (symbol.st_value - section.sh_addr + section.sh_offset) as usize;
    let code = &buffer[file_offset..file_offset + size];
 
    let cs = Capstone::new()
        .x86().mode(arch::x86::ArchMode::Mode64)
        .syntax(arch::x86::ArchSyntax::Intel)
        .detail(true)
        .build().unwrap();
 
    println!("─── {} ({} bytes) ───", func_name, size);
 
    let instructions = cs.disasm_all(code, symbol.st_value).unwrap();
    for insn in instructions.iter() {
        // If a call/jump target matches a known symbol, annotate it
        let annotation = insn.op_str()
            .and_then(|ops| {
                // crude: op_str is e.g. "0x401234" for a direct call or jump;
                // register and memory operands fail to parse and are skipped
                let addr = u64::from_str_radix(
                    ops.trim_start_matches("0x"), 16
                ).ok()?;
                sym_map.get(&addr).map(|name| format!("  ; <{name}>"))
            })
            .unwrap_or_default();
 
        println!("  {:#010x}:  {:6} {}{}",
            insn.address(),
            insn.mnemonic().unwrap_or(""),
            insn.op_str().unwrap_or(""),
            annotation,
        );
    }
}
 
// Output:
// ─── main (47 bytes) ───
//   0x00401120:  push   rbp
//   0x00401121:  mov    rbp, rsp
//   0x00401124:  call   0x401200  ; <initialize_server>
//   0x00401129:  call   0x401340  ; <run_event_loop>
//   ...
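
The exact-match lookup only labels targets that land on the first byte of a function. To also label addresses inside a function (for example a jump to `main+0x4`), an ordered map helps. A minimal sketch with the standard library only, where `Func` and `symbolize` are hypothetical names and the map is keyed by each function's start address:

```rust
use std::collections::BTreeMap;

/// Name and size of a function, keyed in the map by its start address.
struct Func {
    name: String,
    size: u64,
}

/// Find the function containing `addr` and format it as "name" or "name+0xoff".
fn symbolize(funcs: &BTreeMap<u64, Func>, addr: u64) -> Option<String> {
    // The candidate is the last function that starts at or before addr.
    let (&start, func) = funcs.range(..=addr).next_back()?;
    let offset = addr - start;
    if offset == 0 {
        Some(func.name.clone())
    } else if offset < func.size {
        Some(format!("{}+{:#x}", func.name, offset))
    } else {
        None // addr falls in a gap between functions
    }
}
```

The map would be filled from the same `st_value` and `st_size` fields that the function above already reads.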

7. Cross-Platform Binary Inspector (PE + ELF + Mach-O)

Goblin’s killer feature: one crate and one entry point (Object::parse) for all formats, with a typed struct per format. Useful if you ship clients for multiple platforms:

use goblin::Object;
 
fn inspect(buffer: &[u8]) -> String {
    match Object::parse(buffer).unwrap() {
        Object::Elf(elf) => format!(
            "Linux ELF | {} | {} sections | {} symbols | deps: {:?}",
            if elf.is_64 { "64-bit" } else { "32-bit" },
            elf.section_headers.len(),
            elf.syms.len(),
            elf.libraries,
        ),
        Object::PE(pe) => format!(
            "Windows PE | {} | {} sections | imports: {:?}",
            if pe.is_64 { "64-bit" } else { "32-bit" },
            pe.sections.len(),
            pe.libraries,
        ),
        Object::Mach(mach) => format!("macOS Mach-O | {:?}", mach),
        Object::Archive(archive) => format!(
            "Static archive (.a) | {} members", archive.members().len()
        ),
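        Object::Unknown(magic) => format!("Unknown format (magic: {:#x})", magic),
        // Catch-all in case the goblin version in use has further variants
        _ => "Unsupported format".to_string(),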
    }
}

Ideas to Explore Further

WASM binary inspector: Goblin doesn’t parse WASM, but the wasmparser crate follows the same philosophy, so you could pair them for a universal binary analysis toolkit
Fuzzing harness validator: Before deploying a fuzz target, verify it’s compiled with sanitizers (check for __asan_* symbols)
License compliance: Scan binaries for statically linked libraries by looking for telltale symbols (e.g., OPENSSL_* symbols = OpenSSL statically linked = license implications)
Binary size profiler: Map every symbol’s size, sort by largest, find bloat (Binary Size Analysis). “Why is this binary 50MB?” → “Because serde monomorphized 200 versions of deserialize”
Core dump analyzer: Parse ET_CORE files to extract register state, memory maps, and stack traces programmatically
Reproducible build verifier: Compare two builds of the same source — are the symbols and sections identical? If not, where’s the divergence?
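
The binary size profiler idea, for instance, reduces to a sort once the symbol sizes are in hand. A minimal sketch with the standard library only, assuming the `(name, size)` pairs were already collected from the symbol table (`largest_symbols` is a hypothetical helper):

```rust
/// Return the `n` largest symbols, biggest first; ties are ordered by name.
fn largest_symbols(symbols: &[(String, u64)], n: usize) -> Vec<(&str, u64)> {
    let mut sorted: Vec<(&str, u64)> = symbols
        .iter()
        .map(|(name, size)| (name.as_str(), *size))
        .collect();
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    sorted.truncate(n);
    sorted
}
```

Grouping the names by crate or namespace prefix before sorting would be a natural next step for tracking down monomorphization bloat.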

Companion Crates

Crate	Purpose
`goblin`	Parse ELF/PE/Mach-O structure
`capstone`	Disassemble x86/ARM/MIPS/etc. (Rust bindings from the capstone-rs project, wrapping the C Capstone engine)
`iced-x86`	Pure Rust x86 disassembler (no C deps)
`object`	Alternative to goblin (used by rustc)
`gimli`	Parse DWARF debug info
`addr2line`	Map addresses to source file:line
`memmap2`	Memory-map files for zero-copy parsing
`wasmparser`	Parse WASM binaries

References

ELF Specification (PDF)
Goblin docs.rs
Linux Foundation - ELF spec
man 5 elf — seriously, this man page is excellent
Capstone Engine

Related

Executable Binary Formats — the formats (ELF, Mach-O, PE) Goblin parses
The Abstraction Stack Below Compilation — the full pipeline from source to executable
Symbol tables — what Goblin reads: names, types, relocations
Loading a program or a library into memory — runtime counterpart of what Goblin inspects statically
C Runtime — _start, .init_array constructors, the execution environment
C Application Binary Interface — calling conventions and ABI contracts in binaries
Foreign Function Interface — symbol export/import, PLT/GOT, dynamic linking
Disassemblying — the layer above Goblin (decoding instructions)
Cargo-show-asm — inspecting compiler output at each pipeline stage
Low Level Rust — #[no_mangle], extern "C", #[link_section] that control binary layout
Shellcodes — the offensive side: ASLR, DEP, what Goblin’s security auditor defends against
Memory page permissions — PROT_* flags that correspond to segment permissions
Binary Size Analysis — tools and techniques for the binary size profiler idea
LLVM Sanitizers — __asan_* symbols for the fuzzing harness validator idea
LLVM — the compiler backend that produces the binaries Goblin parses
