Goblin is a binary format parser. It reads the structure and metadata of executable files — headers, sections, symbols, dependencies, relocations.

It is NOT a disassembler. It doesn’t decode machine instructions (0x55 0x48 0x89 0xe5 → push rbp; mov rbp, rsp). For that, see Disassembling.

Think of it this way:

| Layer | Tool | What it reads |
| --- | --- | --- |
| Format/Metadata | Goblin | Headers, sections, symbols, imports |
| Instructions | capstone, iced-x86 | Raw bytes → assembly mnemonics |
| Semantics | Binary Ninja, Ghidra | Control flow, decompilation |

Goblin is Layer 0 — you need to understand the container before you can find the code inside it. It answers: where is the code, what symbols exist, how is the binary structured. A disassembler answers: what do the bytes do.

To build a disassembler in Rust, you’d combine:

// Goblin finds the .text section (where code lives)
// capstone-rs decodes the bytes inside it
goblin + capstone-rs  // = DIY disassembler
goblin + iced-x86     // = alternative, pure Rust

ELF Crash Course: The Mental Model

ELF (Executable and Linkable Format) is how Linux packages a compiled program. Think of it as a zip file with a very rigid table of contents.

┌──────────────────────────────────────────┐
│              ELF Header                  │  ← "What am I?" (arch, type, entry point)
├──────────────────────────────────────────┤
│         Program Headers (Segments)       │  ← "How to LOAD me into memory"
│         (used at runtime by the kernel)  │     (the kernel reads these)
├──────────────────────────────────────────┤
│                                          │
│         .text    (executable code)       │
│         .rodata  (read-only constants)   │
│         .data    (initialized globals)   │
│         .bss     (uninitialized globals) │
│         .symtab  (symbol table)          │
│         .strtab  (string table)          │
│         .dynsym  (dynamic symbols)       │
│         .dynstr  (dynamic strings)       │
│         .plt     (procedure linkage)     │
│         .got     (global offset table)   │
│         .dynamic (dynamic linking info)  │
│         ...                              │
│                                          │
├──────────────────────────────────────────┤
│         Section Headers                  │  ← "How to ANALYZE me"
│         (used by tools like readelf/nm)  │     (debuggers/linkers read these)
└──────────────────────────────────────────┘

Key distinction:

  • Segments (Program Headers) = runtime view. The kernel loads these into memory.
  • Sections (Section Headers) = development view. Tools care about these.

A segment can contain multiple sections. The .text section lives inside a PT_LOAD segment marked executable.
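
To make the "table of contents" idea concrete, here is a minimal sketch that reads a few ELF header fields by hand, using only the standard library. It assumes a 64-bit little-endian file. `ElfPeek` and `peek_elf64_header` are names made up for this illustration and are not part of Goblin, which does this work for you (and also handles 32-bit and big-endian files).

```rust
/// A few fields from the ELF64 header (illustrative subset, not Goblin's type).
#[derive(Debug, PartialEq)]
struct ElfPeek {
    e_type: u16,    // ET_REL = 1, ET_EXEC = 2, ET_DYN = 3, ET_CORE = 4
    e_machine: u16, // target architecture, e.g. 62 = x86-64
    e_entry: u64,   // virtual address where execution starts
    e_phoff: u64,   // file offset of the program header table (segments)
    e_shoff: u64,   // file offset of the section header table (sections)
}

/// Read a little-endian u16 at `off`, or None if the buffer is too short.
fn u16_at(buf: &[u8], off: usize) -> Option<u16> {
    let bytes = buf.get(off..off + 2)?;
    let mut raw = [0u8; 2];
    raw.copy_from_slice(bytes);
    Some(u16::from_le_bytes(raw))
}

/// Read a little-endian u64 at `off`, or None if the buffer is too short.
fn u64_at(buf: &[u8], off: usize) -> Option<u64> {
    let bytes = buf.get(off..off + 8)?;
    let mut raw = [0u8; 8];
    raw.copy_from_slice(bytes);
    Some(u64::from_le_bytes(raw))
}

fn peek_elf64_header(buf: &[u8]) -> Option<ElfPeek> {
    // Magic bytes: 0x7f 'E' 'L' 'F'
    if !buf.starts_with(b"\x7fELF") {
        return None;
    }
    // e_ident[4] (EI_CLASS) == 2 means 64-bit; e_ident[5] (EI_DATA) == 1 means little-endian
    if *buf.get(4)? != 2 || *buf.get(5)? != 1 {
        return None;
    }
    Some(ElfPeek {
        e_type: u16_at(buf, 16)?,
        e_machine: u16_at(buf, 18)?,
        e_entry: u64_at(buf, 24)?,
        e_phoff: u64_at(buf, 32)?,
        e_shoff: u64_at(buf, 40)?,
    })
}
```

The two offsets at the end are the point: `e_phoff` leads to the runtime view (segments) and `e_shoff` to the development view (sections).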


The Jargon Decoder

ELF Header Constants

| Constant | Full Name | Meaning |
| --- | --- | --- |
| ET_EXEC | Executable Type | Traditional fixed-address executable |
| ET_DYN | Dynamic/Shared | Position-independent (PIE executable OR .so library) |
| ET_REL | Relocatable | Object file (.o), not yet linked |
| ET_CORE | Core dump | Process memory dump from a crash |

Why ET_DYN matters: Modern executables are compiled as PIE (Position-Independent Executables) so the kernel can load them at a random address (ASLR). PIE binaries show up as ET_DYN, not ET_EXEC. An ET_EXEC binary is linked at a fixed address, so its own code and data cannot be randomized (the stack, heap, and shared libraries still can be). Treat that as a security finding.
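
Because ET_DYN covers both PIE executables and shared libraries, a tool usually needs one more signal to tell them apart. The sketch below is a heuristic under stated assumptions, not a definitive rule: it treats an ET_DYN file as an executable if the DF_1_PIE bit is set in DT_FLAGS_1 or if a PT_INTERP segment is present. `ImageKind` and `classify` are hypothetical names, and a few real libraries (libc itself, for example) carry a PT_INTERP and would be misclassified.

```rust
const ET_EXEC: u16 = 2;
const ET_DYN: u16 = 3;
const DF_1_PIE: u64 = 0x0800_0000; // bit in the DT_FLAGS_1 dynamic entry

#[derive(Debug, PartialEq)]
enum ImageKind {
    FixedExecutable, // ET_EXEC: main image cannot be randomized
    PieExecutable,   // ET_DYN that looks like a program
    SharedLibrary,   // ET_DYN that looks like a .so
    Other,           // ET_REL, ET_CORE, ...
}

/// Heuristic classification from three facts a parser can extract:
/// the header's e_type, whether a PT_INTERP segment exists, and the
/// value of DT_FLAGS_1 (pass 0 if the entry is absent).
fn classify(e_type: u16, has_pt_interp: bool, dt_flags_1: u64) -> ImageKind {
    match e_type {
        ET_EXEC => ImageKind::FixedExecutable,
        ET_DYN if dt_flags_1 & DF_1_PIE != 0 || has_pt_interp => ImageKind::PieExecutable,
        ET_DYN => ImageKind::SharedLibrary,
        _ => ImageKind::Other,
    }
}
```

With Goblin, the three inputs would come from `elf.header.e_type`, the program headers, and the dynamic section respectively.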

Program Header Types (Segments)

| Constant | What it is |
| --- | --- |
| PT_LOAD | “Load this chunk into memory”: the actual code and data |
| PT_DYNAMIC | Dynamic linking info (what .so files are needed) |
| PT_INTERP | Path to the dynamic linker (/lib64/ld-linux-x86-64.so.2) |
| PT_GNU_STACK | Declares whether the stack should be executable |
| PT_GNU_RELRO | Marks memory to be made read-only after relocation |
| PT_NOTE | Metadata (build ID, ABI tag, etc.) |

Why PT_GNU_STACK matters: If this segment has the execute flag (PF_X), the process gets an executable stack — meaning buffer overflow exploits can inject shellcode directly onto the stack and run it. Modern binaries should NEVER have this set. If you find it, something is very wrong (probably hand-written assembly or ancient build flags).
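
The check itself is a single bit test on the segment's `p_flags`. A minimal sketch follows; `perms` and `stack_is_executable` are illustrative names, and treating a missing PT_GNU_STACK header as "executable" is a conservative assumption made here (the real default depends on the architecture and kernel).

```rust
// Segment permission bits stored in p_flags.
const PF_X: u32 = 0x1;
const PF_W: u32 = 0x2;
const PF_R: u32 = 0x4;

/// Render p_flags the way `readelf -l` style tools do, e.g. "rw-".
fn perms(p_flags: u32) -> String {
    let mut s = String::with_capacity(3);
    s.push(if p_flags & PF_R != 0 { 'r' } else { '-' });
    s.push(if p_flags & PF_W != 0 { 'w' } else { '-' });
    s.push(if p_flags & PF_X != 0 { 'x' } else { '-' });
    s
}

/// `gnu_stack_flags` is the p_flags of the PT_GNU_STACK header, if one exists.
/// Assumption: no header at all is reported as executable, to stay on the safe side.
fn stack_is_executable(gnu_stack_flags: Option<u32>) -> bool {
    gnu_stack_flags.map_or(true, |flags| flags & PF_X != 0)
}
```

A hardened binary has a PT_GNU_STACK with flags that render as "rw-".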

Why PT_GNU_RELRO matters: After the dynamic linker resolves all symbols, the GOT (Global Offset Table) can be made read-only. This prevents attackers from overwriting function pointers in the GOT. “Full RELRO” = PT_GNU_RELRO + BIND_NOW (resolve everything upfront, then lock it down).

Dynamic Section Tags

| Constant | What it controls |
| --- | --- |
| DT_NEEDED | “I need this shared library” (e.g., libc.so.6) |
| DT_SONAME | “My canonical name is…” (for library versioning) |
| DT_RPATH | Hardcoded library search path (generally bad practice) |
| DT_RUNPATH | Library search path (slightly better than RPATH) |
| DT_FLAGS | Bitfield of flags, including… |
| DF_BIND_NOW (bit in DT_FLAGS) | “Resolve ALL symbols at load time, not lazily” |
| DT_FLAGS_1 | Extended flags (PIE indicator, etc.) |

Why DT_FLAGS / DF_BIND_NOW matters: Lazy binding means function addresses are resolved the first time they’re called. For that to work, the GOT entries used by the PLT must stay writable while the program runs, which gives an attacker a window to overwrite them. BIND_NOW closes it: resolve everything at startup, then (with RELRO) make the GOT read-only. The cost is slightly slower startup. The benefit is a much smaller attack surface.
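
Putting RELRO and BIND_NOW together, here is a small sketch of how a tool might grade a binary. It works on plain `(d_tag, d_val)` pairs so it stays independent of any parser. `bind_now_requested`, `relro_level`, and `RelroLevel` are hypothetical names. The sketch also accepts the standalone DT_BIND_NOW tag and the DF_1_NOW bit in DT_FLAGS_1, since linkers can record the same request in more than one place.

```rust
// Dynamic tags and flag bits from the ELF specification / GNU extensions.
const DT_BIND_NOW: u64 = 24;
const DT_FLAGS: u64 = 30;
const DT_FLAGS_1: u64 = 0x6fff_fffb;
const DF_BIND_NOW: u64 = 0x8; // bit inside DT_FLAGS
const DF_1_NOW: u64 = 0x1;    // bit inside DT_FLAGS_1

#[derive(Debug, PartialEq)]
enum RelroLevel {
    Off,     // no PT_GNU_RELRO segment
    Partial, // PT_GNU_RELRO, but lazy binding keeps PLT GOT slots writable
    Full,    // PT_GNU_RELRO + BIND_NOW
}

/// True if any dynamic entry asks the loader to resolve everything at startup.
fn bind_now_requested(dynamic: &[(u64, u64)]) -> bool {
    dynamic.iter().any(|&(tag, val)| match tag {
        DT_BIND_NOW => true,
        DT_FLAGS => val & DF_BIND_NOW != 0,
        DT_FLAGS_1 => val & DF_1_NOW != 0,
        _ => false,
    })
}

fn relro_level(has_pt_gnu_relro: bool, dynamic: &[(u64, u64)]) -> RelroLevel {
    match (has_pt_gnu_relro, bind_now_requested(dynamic)) {
        (false, _) => RelroLevel::Off,
        (true, false) => RelroLevel::Partial,
        (true, true) => RelroLevel::Full,
    }
}
```

BIND_NOW without a PT_GNU_RELRO segment is graded `Off` here because nothing gets remapped read-only in that case.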

Symbol Binding & Visibility

| Constant | Meaning |
| --- | --- |
| STB_GLOBAL | Visible to all object files during linking |
| STB_LOCAL | Only visible within this object file |
| STB_WEAK | Like global but can be overridden without error |
| STV_HIDDEN | Not exported from shared library (internal use only) |
| STV_DEFAULT | Normal visibility, exported |
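
Binding and visibility are packed into two bytes of each symbol table entry: the high nibble of `st_info` is the binding, the low nibble is the type, and the low two bits of `st_other` are the visibility. The sketch below decodes them by hand. `is_exported` is an illustrative helper and encodes one reasonable definition of "exported", assumed here rather than taken from Goblin.

```rust
const STB_GLOBAL: u8 = 1;
const STB_WEAK: u8 = 2;
const STV_DEFAULT: u8 = 0;
const STV_PROTECTED: u8 = 3;
const SHN_UNDEF: u16 = 0; // section index meaning "defined elsewhere"

/// Binding lives in the high 4 bits of st_info.
fn st_bind(st_info: u8) -> u8 {
    st_info >> 4
}

/// Symbol type (function, object, ...) lives in the low 4 bits of st_info.
fn st_type(st_info: u8) -> u8 {
    st_info & 0xf
}

/// Visibility lives in the low 2 bits of st_other.
fn st_visibility(st_other: u8) -> u8 {
    st_other & 0x3
}

/// Assumed definition: defined in this file, global or weak, and not hidden/internal.
fn is_exported(st_info: u8, st_other: u8, st_shndx: u16) -> bool {
    let bind = st_bind(st_info);
    let vis = st_visibility(st_other);
    let defined = st_shndx != SHN_UNDEF;
    let linkable = bind == STB_GLOBAL || bind == STB_WEAK;
    let visible = vis == STV_DEFAULT || vis == STV_PROTECTED;
    defined && linkable && visible
}
```

For example, `st_info = 0x12` decodes to a global (1) function (2).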

Why a Library Instead of Coreutils

| Approach | Good for |
| --- | --- |
| readelf -d mybin | Quick look by a human in a terminal |
| Goblin in a Rust program | Automated systems that make decisions |

The moment you need to:

  • Analyze at scale (100s-1000s of binaries)
  • Make programmatic decisions (pass/fail CI, allow/deny plugin load)
  • Produce structured output (JSON reports, database entries, dashboards)
  • Compose with other logic (network calls, DB lookups, policy engines)
  • Do it cross-platform (parse a Windows PE on Linux, inspect iOS Mach-O on CI)

…shelling out falls apart. Parsing unstable text. Handling locale differences. Spawning processes. Goblin gives typed, structured, zero-copy access.

It’s the difference between grep-ing HTML and using a DOM parser.


Use Cases: Practical and Non-Trivial

1. Container Image Security Auditor

Scan every binary in a Docker image. Check hardening flags. Fail the CI pipeline if anything isn’t locked down.

use goblin::elf::Elf;
use goblin::elf::header::ET_DYN;
use goblin::elf::program_header::{PF_X, PT_GNU_RELRO, PT_GNU_STACK};
use goblin::elf::dynamic::{DF_1_NOW, DF_BIND_NOW, DT_BIND_NOW, DT_FLAGS, DT_FLAGS_1};

struct SecurityReport {
    path: String,
    is_pie: bool,
    has_nx_stack: bool,
    has_full_relro: bool,
}

fn audit_elf(path: &str, buffer: &[u8]) -> Option<SecurityReport> {
    let elf = Elf::parse(buffer).ok()?;

    // ET_DYN covers PIE executables and shared libraries alike,
    // so this is also true for every .so in the image.
    let is_pie = elf.header.e_type == ET_DYN;

    // A missing PT_GNU_STACK header is reported as "not NX" (conservative).
    let has_nx_stack = elf.program_headers.iter()
        .find(|ph| ph.p_type == PT_GNU_STACK)
        .map(|ph| ph.p_flags & PF_X == 0) // PF_X not set
        .unwrap_or(false);

    let has_relro = elf.program_headers.iter()
        .any(|ph| ph.p_type == PT_GNU_RELRO);

    // BIND_NOW can be recorded as a DT_BIND_NOW entry, as DF_BIND_NOW
    // in DT_FLAGS, or as DF_1_NOW in DT_FLAGS_1; accept any of them.
    let has_bind_now = elf.dynamic.as_ref()
        .map(|d| d.dyns.iter().any(|entry| match entry.d_tag {
            DT_BIND_NOW => true,
            DT_FLAGS => entry.d_val & (DF_BIND_NOW as u64) != 0,
            DT_FLAGS_1 => entry.d_val & (DF_1_NOW as u64) != 0,
            _ => false,
        }))
        .unwrap_or(false);

    Some(SecurityReport {
        path: path.to_string(),
        is_pie,
        has_nx_stack,
        has_full_relro: has_relro && has_bind_now,
    })
}

Real-world: This is essentially what checksec does, but now it’s part of your Rust pipeline, producing structured JSON, integrated with your alerting.
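
To "fail the CI pipeline", the reports still have to be turned into a decision. One way that could look is sketched below, using only the standard library. The struct mirrors the report type from the auditor; `violations` and `gate` are hypothetical names, and requiring all three properties is an assumed policy rather than a universal rule.

```rust
/// Mirrors the report struct produced by the auditor.
struct SecurityReport {
    path: String,
    is_pie: bool,
    has_nx_stack: bool,
    has_full_relro: bool,
}

/// Names of the hardening checks this report fails (empty means compliant).
fn violations(report: &SecurityReport) -> Vec<&'static str> {
    let mut failed = Vec::new();
    if !report.is_pie {
        failed.push("pie");
    }
    if !report.has_nx_stack {
        failed.push("nx-stack");
    }
    if !report.has_full_relro {
        failed.push("full-relro");
    }
    failed
}

/// Logs one line per failing binary to stderr and returns a process exit code:
/// 0 if every binary passes, 1 otherwise.
fn gate(reports: &[SecurityReport]) -> i32 {
    let mut exit_code = 0;
    for report in reports {
        let failed = violations(report);
        if !failed.is_empty() {
            eprintln!("FAIL {}: missing {}", report.path, failed.join(", "));
            exit_code = 1;
        }
    }
    exit_code
}
```

A CI job would collect one report per binary in the image and pass the result of `gate` to `std::process::exit`.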


2. Binary Diffing Across Releases (ABI Compatibility)

What changed at the symbol level between two versions of a shared library? Catch accidental ABI breaks before they hit production.

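use std::collections::HashMap;
use goblin::elf::Elf;

struct SymbolInfo { size: u64 }

// One possible implementation (an assumption; the original helper is not shown):
// defined dynamic symbols, keyed by name. Empty map if the buffer is not an ELF.
fn extract_symbols(buffer: &[u8]) -> HashMap<String, SymbolInfo> {
    let elf = match Elf::parse(buffer) {
        Ok(elf) => elf,
        Err(_) => return HashMap::new(),
    };
    elf.dynsyms.iter()
        .filter(|s| s.st_shndx != 0) // skip undefined (imported) symbols
        .filter_map(|s| {
            let name = elf.dynstrtab.get_at(s.st_name)?;
            Some((name.to_string(), SymbolInfo { size: s.st_size }))
        })
        .collect()
}

fn diff_binaries(old: &[u8], new: &[u8]) {
    let old_syms = extract_symbols(old);  // HashMap<String, SymbolInfo>
    let new_syms = extract_symbols(new);

    for name in old_syms.keys() {
        if !new_syms.contains_key(name) {
            println!("🔴 REMOVED: {name}");  // ABI break!
        }
    }

    for name in new_syms.keys() {
        if !old_syms.contains_key(name) {
            println!("🟢 ADDED: {name}");
        }
    }

    for (name, old_info) in &old_syms {
        if let Some(new_info) = new_syms.get(name) {
            if old_info.size != new_info.size {
                let delta = new_info.size as i64 - old_info.size as i64;
                println!("🟡 RESIZED: {name} ({delta:+} bytes)");
            }
        }
    }
}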
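
For CI use, the same comparison is easier to act on when it returns data instead of printing. The following is a sketch under assumptions: it takes plain name → size maps (whatever produced them), uses a `BTreeMap` so the output order is deterministic, and the names `Change`, `diff_symbol_sizes`, and `is_abi_break` are invented for this example.

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum Change {
    Removed(String),
    Added(String),
    Resized { name: String, delta: i64 },
}

/// Compare two name -> size tables and list what changed.
/// Removals and resizes come first (in name order), then additions.
fn diff_symbol_sizes(old: &BTreeMap<String, u64>, new: &BTreeMap<String, u64>) -> Vec<Change> {
    let mut changes = Vec::new();
    for (name, old_size) in old {
        match new.get(name) {
            None => changes.push(Change::Removed(name.clone())),
            Some(new_size) if new_size != old_size => changes.push(Change::Resized {
                name: name.clone(),
                delta: *new_size as i64 - *old_size as i64,
            }),
            Some(_) => {}
        }
    }
    for name in new.keys() {
        if !old.contains_key(name) {
            changes.push(Change::Added(name.clone()));
        }
    }
    changes
}

/// Assumed policy: only a removed symbol counts as a hard ABI break.
fn is_abi_break(changes: &[Change]) -> bool {
    changes.iter().any(|c| matches!(c, Change::Removed(_)))
}
```

A resized data symbol can also break consumers, so a stricter policy might treat `Resized` as a break for objects while ignoring it for functions.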

3. Plugin Loader with Pre-Execution Validation

Before dlopen-ing an untrusted .so, inspect it statically:

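use std::collections::HashSet;
use goblin::elf::Elf;
use goblin::elf::dynamic::DT_INIT_ARRAY;
use goblin::elf::sym::STB_GLOBAL;

const REQUIRED_EXPORTS: &[&str] = &[
    "plugin_init", "plugin_handle", "plugin_version"
];
// Exact libc symbol names: there is no plain `exec`, only the exec* family.
const BANNED_IMPORTS: &[&str] = &[
    "system", "popen", "dlopen", // no shell, no loading more libs
    "execve", "execv", "execvp", "execl", "execlp", "execle",
];

struct PluginValidation<'a> {
    exports_required_api: bool,
    suspicious_imports: Vec<&'a str>,
    has_constructor: bool,
}

fn validate_plugin(buffer: &[u8]) -> Option<PluginValidation<'_>> {
    // Untrusted input: a parse failure is a rejection, not a panic.
    let elf = Elf::parse(buffer).ok()?;

    // Exported = global functions that are defined here (imports are excluded).
    let exported: HashSet<&str> = elf.dynsyms.iter()
        .filter(|s| s.is_function() && s.st_bind() == STB_GLOBAL && !s.is_import())
        .filter_map(|s| elf.dynstrtab.get_at(s.st_name))
        .collect();

    let suspicious: Vec<&str> = elf.dynsyms.iter()
        .filter(|s| s.is_import())
        .filter_map(|s| elf.dynstrtab.get_at(s.st_name))
        .filter(|name| BANNED_IMPORTS.contains(name))
        .collect();

    // Does it have .init_array? (runs code automatically on dlopen!)
    // See: C Runtime — .init_array constructors
    let has_constructor_section = elf.section_headers.iter()
        .any(|sh| matches!(
            elf.shdr_strtab.get_at(sh.sh_name),
            Some(".init_array" | ".ctors")
        ));
    // Section headers can be stripped; the loader relies on the dynamic tag instead.
    let has_constructor_tag = elf.dynamic.as_ref()
        .map(|d| d.dyns.iter().any(|e| e.d_tag == DT_INIT_ARRAY))
        .unwrap_or(false);

    Some(PluginValidation {
        exports_required_api: REQUIRED_EXPORTS.iter()
            .all(|r| exported.contains(r)),
        suspicious_imports: suspicious,
        has_constructor: has_constructor_section || has_constructor_tag,
    })
}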

Why this matters: .init_array functions run the instant you dlopen. If a malicious plugin has one, it executes before you can do anything. Detecting this BEFORE loading is a real security boundary.


4. Shared Library Dependency Graph Auditor

Map every binary in a deployment to the .so files it links against directly (walk the resulting graph to get transitive dependencies). Answer: “What still links OpenSSL?” or “Why is libpython in our container?”

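use std::collections::HashMap;
use std::fs;
use std::path::Path;

use goblin::Object;
use walkdir::WalkDir;

fn build_dep_graph(root: &Path) -> HashMap<String, Vec<String>> {
    let mut graph = HashMap::new();

    for entry in WalkDir::new(root).into_iter().filter_map(Result::ok) {
        // Skip directories and other non-regular files.
        if !entry.file_type().is_file() {
            continue;
        }
        // Unreadable files are skipped rather than treated as errors.
        let buffer = match fs::read(entry.path()) {
            Ok(bytes) => bytes,
            Err(_) => continue,
        };
        if let Ok(Object::Elf(elf)) = Object::parse(&buffer) {
            // elf.libraries holds the DT_NEEDED entries (direct dependencies).
            let libs: Vec<String> = elf.libraries.iter()
                .map(|s| s.to_string())
                .collect();
            if !libs.is_empty() {
                graph.insert(entry.path().display().to_string(), libs);
            }
        }
    }
    graph
}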
// Output this as JSON → feed it into a graph visualizer
// Detect conflicts: two binaries needing different versions of the same .so
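
Once the graph exists, the questions above become ordinary map queries. The sketch below shows two of them over the same `HashMap<String, Vec<String>>` shape. `dependents_of` and `soname_conflicts` are hypothetical helpers, and grouping libraries by the text before `.so` is a simplifying assumption about how sonames are spelled.

```rust
use std::collections::{BTreeMap, BTreeSet, HashMap};

/// "What still links OpenSSL?": binaries with a dependency whose name contains `needle`.
/// The result is sorted so output is stable across runs.
fn dependents_of<'a>(graph: &'a HashMap<String, Vec<String>>, needle: &str) -> Vec<&'a str> {
    let mut hits: Vec<&str> = graph
        .iter()
        .filter(|(_, libs)| libs.iter().any(|lib| lib.contains(needle)))
        .map(|(binary, _)| binary.as_str())
        .collect();
    hits.sort();
    hits
}

/// Libraries requested under more than one soname, e.g. libssl.so.1.1 and libssl.so.3.
/// Key: the name before ".so"; value: the distinct sonames seen in the deployment.
fn soname_conflicts(graph: &HashMap<String, Vec<String>>) -> BTreeMap<String, BTreeSet<String>> {
    let mut by_base: BTreeMap<String, BTreeSet<String>> = BTreeMap::new();
    for lib in graph.values().flatten() {
        let base = lib.split(".so").next().unwrap_or(lib.as_str());
        by_base.entry(base.to_string()).or_default().insert(lib.clone());
    }
    by_base.retain(|_, sonames| sonames.len() > 1);
    by_base
}
```

Both functions only see direct dependencies; a transitive answer would first expand each library through the graph.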

5. Build a Minimal Disassembler (Goblin + Capstone)

NOW we’re actually disassembling. Goblin finds the code. Capstone decodes it.

use goblin::elf::Elf;
use goblin::elf::section_header::SHT_PROGBITS;
use capstone::prelude::*;
 
fn disassemble_text_section(buffer: &[u8]) {
    let elf = Elf::parse(buffer).unwrap();
 
    // Find .text section
    let text_section = elf.section_headers.iter()
        .find(|sh| {
            elf.shdr_strtab.get_at(sh.sh_name) == Some(".text")
            && sh.sh_type == SHT_PROGBITS
        })
        .expect("no .text section found");
 
    let offset = text_section.sh_offset as usize;
    let size = text_section.sh_size as usize;
    let code = &buffer[offset..offset + size];
    let base_addr = text_section.sh_addr;
 
    // Disassemble with capstone
    let cs = Capstone::new()
        .x86()
        .mode(arch::x86::ArchMode::Mode64)
        .syntax(arch::x86::ArchSyntax::Intel)
        .build()
        .unwrap();
 
    let instructions = cs.disasm_all(code, base_addr).unwrap();
    for insn in instructions.iter() {
        println!("{:#010x}:  {} {}",
            insn.address(),
            insn.mnemonic().unwrap_or("???"),
            insn.op_str().unwrap_or(""),
        );
    }
}
 
// Example output:
// 0x00401000:  push rbp
// 0x00401001:  mov rbp, rsp
// 0x00401004:  sub rsp, 0x10
// ...

6. Symbol-Aware Disassembly (The Cool Part)

Plain disassembly is just addresses. By combining Goblin’s symbol table with Capstone, you can label functions by name:

use std::collections::HashMap;
use capstone::prelude::*;
use goblin::elf::Elf;

fn disassemble_function(buffer: &[u8], func_name: &str) {
    let elf = Elf::parse(buffer).unwrap();

    // Find the function symbol by name (needs .symtab, so the binary must not be stripped)
    let symbol = elf.syms.iter()
        .find(|s| s.is_function() && elf.strtab.get_at(s.st_name) == Some(func_name))
        .expect("symbol not found");

    // Build address → name lookup for cross-references (defined functions only)
    let sym_map: HashMap<u64, &str> = elf.syms.iter()
        .filter(|s| s.is_function() && s.st_value != 0)
        .filter_map(|s| {
            let name = elf.strtab.get_at(s.st_name)?;
            Some((s.st_value, name))
        })
        .collect();

    let size = symbol.st_size as usize;

    // Find which loaded section contains this address to calculate the file offset
    let section = elf.section_headers.iter()
        .find(|sh| {
            sh.sh_addr != 0
            && symbol.st_value >= sh.sh_addr
            && symbol.st_value < sh.sh_addr + sh.sh_size
        })
        .expect("no section contains the symbol address");

    let file_offset = (symbol.st_value - section.sh_addr + section.sh_offset) as usize;
    let code = &buffer[file_offset..file_offset + size];
 
    let cs = Capstone::new()
        .x86().mode(arch::x86::ArchMode::Mode64)
        .syntax(arch::x86::ArchSyntax::Intel)
        .detail(true)
        .build().unwrap();
 
    println!("─── {} ({} bytes) ───", func_name, size);
 
    let instructions = cs.disasm_all(code, symbol.st_value).unwrap();
    for insn in instructions.iter() {
        // If a call/jump target matches a known symbol, annotate it
        let annotation = insn.op_str()
            .and_then(|ops| {
                // crude: parse immediate from "call 0x401234"
                let addr = u64::from_str_radix(
                    ops.trim_start_matches("0x"), 16
                ).ok()?;
                sym_map.get(&addr).map(|name| format!("  ; <{name}>"))
            })
            .unwrap_or_default();
 
        println!("  {:#010x}:  {:6} {}{}",
            insn.address(),
            insn.mnemonic().unwrap_or(""),
            insn.op_str().unwrap_or(""),
            annotation,
        );
    }
}
 
// Output:
// ─── main (47 bytes) ───
//   0x00401120:  push   rbp
//   0x00401121:  mov    rbp, rsp
//   0x00401124:  call   0x401200  ; <initialize_server>
//   0x00401129:  call   0x401340  ; <run_event_loop>
//   ...
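
Two pieces of the function above are pure arithmetic and can be pulled out and tested without any ELF file: the virtual-address-to-file-offset translation, and the "crude" operand parsing. The sketch below is one way to do that. `Section`, `vaddr_to_file_offset`, and `branch_target` are illustrative names, and the operand parser assumes a direct target printed as a bare hex literal such as `0x401200`; indirect calls like `call rax` yield `None`.

```rust
/// The three section header fields needed for address translation.
struct Section {
    addr: u64,   // sh_addr: virtual address when loaded (0 = not loaded)
    offset: u64, // sh_offset: position in the file
    size: u64,   // sh_size
}

/// Translate a virtual address to a file offset via the section that contains it.
fn vaddr_to_file_offset(sections: &[Section], vaddr: u64) -> Option<u64> {
    sections
        .iter()
        .find(|s| s.addr != 0 && vaddr >= s.addr && vaddr - s.addr < s.size)
        .and_then(|s| (vaddr - s.addr).checked_add(s.offset))
}

/// Extract a direct call/jump target from a disassembler's text output.
/// Only `call` and mnemonics starting with `j` (jmp, je, jne, ...) are considered.
fn branch_target(mnemonic: &str, op_str: &str) -> Option<u64> {
    let is_branch = mnemonic == "call" || mnemonic.starts_with('j');
    if !is_branch {
        return None;
    }
    let hex = op_str.trim().strip_prefix("0x")?;
    u64::from_str_radix(hex, 16).ok()
}
```

Restricting the parse to branch mnemonics avoids annotating an unrelated immediate (for example `push 0x401200`) that merely happens to equal a symbol address.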

7. Cross-Platform Binary Inspector (PE + ELF + Mach-O)

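Goblin’s killer feature: one entry point for all formats. Object::parse identifies the format and hands back a format-specific type (ELF, PE, Mach-O, archive), which is useful if you ship clients for multiple platforms: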

use goblin::Object;
 
fn inspect(buffer: &[u8]) -> String {
    match Object::parse(buffer).unwrap() {
        Object::Elf(elf) => format!(
            "Linux ELF | {} | {} sections | {} symbols | deps: {:?}",
            if elf.is_64 { "64-bit" } else { "32-bit" },
            elf.section_headers.len(),
            elf.syms.len(),
            elf.libraries,
        ),
        Object::PE(pe) => format!(
            "Windows PE | {} | {} sections | imports: {:?}",
            if pe.is_64 { "64-bit" } else { "32-bit" },
            pe.sections.len(),
            pe.libraries,
        ),
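        Object::Mach(mach) => format!("macOS Mach-O | {:?}", mach),
        Object::Archive(archive) => format!(
            "Static archive (.a) | {} members", archive.members().len()
        ),
        Object::Unknown(magic) => format!("Unknown format (magic: {:#x})", magic),
        // Newer goblin releases mark Object as non-exhaustive and add variants,
        // so keep a catch-all arm.
        _ => "Recognized by goblin, not handled here".to_string(),
    }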
}
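
Conceptually, the dispatch starts with a look at the first few bytes of the file. The sketch below shows that idea with only the standard library. It is not Goblin's actual implementation, `Format` and `sniff` are names invented here, and the checks are deliberately shallow (a real PE check also follows the DOS header to the `PE\0\0` signature, and `0xcafebabe` is shared with Java class files).

```rust
#[derive(Debug, PartialEq)]
enum Format {
    Elf,
    Pe,
    MachO,
    MachOFat,
    Archive,
    Unknown,
}

// Mach-O magics as they appear on disk: 32/64-bit, big- and little-endian.
const MACHO_MAGICS: [[u8; 4]; 4] = [
    [0xfe, 0xed, 0xfa, 0xce],
    [0xfe, 0xed, 0xfa, 0xcf],
    [0xce, 0xfa, 0xed, 0xfe],
    [0xcf, 0xfa, 0xed, 0xfe],
];

/// Guess the container format from the leading magic bytes.
fn sniff(buf: &[u8]) -> Format {
    if buf.starts_with(b"\x7fELF") {
        Format::Elf
    } else if buf.starts_with(b"!<arch>\n") {
        Format::Archive
    } else if buf.starts_with(b"MZ") {
        // Only the DOS stub signature is checked in this sketch.
        Format::Pe
    } else if MACHO_MAGICS.iter().any(|magic| buf.starts_with(&magic[..])) {
        Format::MachO
    } else if buf.starts_with(&[0xca, 0xfe, 0xba, 0xbe]) {
        // Universal (fat) binary; the same magic is used by Java class files.
        Format::MachOFat
    } else {
        Format::Unknown
    }
}
```

After this step, each format needs its own parser, which is why the match arms in `inspect` expose different fields per variant.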

Ideas to Explore Further

  • WASM binary inspector: Goblin doesn’t parse WASM, but the wasmparser crate follows the same philosophy, so you could pair them for a universal binary analysis toolkit
  • Fuzzing harness validator: Before deploying a fuzz target, verify it’s compiled with sanitizers (check for __asan_* symbols)
  • License compliance: Scan binaries for statically linked libraries by looking for telltale symbols (e.g., OPENSSL_* symbols = OpenSSL statically linked = license implications)
  • Binary size profiler: Map every symbol’s size, sort by largest, find bloat (Binary Size Analysis). “Why is this binary 50MB?” → “Because serde monomorphized 200 versions of deserialize.”
  • Core dump analyzer: Parse ET_CORE files to extract register state, memory maps, and stack traces programmatically
  • Reproducible build verifier: Compare two builds of the same source — are the symbols and sections identical? If not, where’s the divergence?
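
Two of these ideas reduce to very small functions once the symbol names and sizes have been extracted. A rough sketch, assuming the caller already has `(name, size)` pairs from a parser; `largest_symbols` and `looks_asan_instrumented` are hypothetical names, and the `__asan_` prefix check is the heuristic described in the list above, not a guarantee.

```rust
/// Binary size profiler: the `n` largest symbols, biggest first.
/// Ties are broken by name so the result is deterministic.
fn largest_symbols<'a>(symbols: &[(&'a str, u64)], n: usize) -> Vec<(&'a str, u64)> {
    let mut sorted = symbols.to_vec();
    sorted.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(b.0)));
    sorted.truncate(n);
    sorted
}

/// Fuzzing harness validator: does any symbol look like AddressSanitizer runtime glue?
fn looks_asan_instrumented(symbol_names: &[&str]) -> bool {
    symbol_names.iter().any(|name| name.starts_with("__asan_"))
}
```

Fed with every symbol's `st_size`, the first function gives a first-pass answer to "why is this binary so big?".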

Companion Crates

| Crate | Purpose |
| --- | --- |
| goblin | Parse ELF/PE/Mach-O structure |
| capstone-rs | Disassemble x86/ARM/MIPS/etc. |
| iced-x86 | Pure Rust x86 disassembler (no C deps) |
| object | Alternative to goblin (used by rustc) |
| gimli | Parse DWARF debug info |
| addr2line | Map addresses to source file:line |
| memmap2 | Memory-map files for zero-copy parsing |
| wasmparser | Parse WASM binaries |

References