Symbols and relocation data play a crucial role in how executables and shared libraries are structured, linked, and executed. Tools like strip -s remove the parts of this metadata that are not needed at run time, which shrinks the file and hides internal names.

Symbols

Symbols represent names of variables, functions, and other entities in the program. During compilation and linking, symbols are used to resolve references between different parts of the code, such as function calls or global variable access. They are commonly grouped into four categories (global, local, and weak are ELF bindings, while "undefined" describes where the definition lives):

  1. Global Symbols: Exposed across modules, allowing functions or variables to be used in other translation units.
  2. Local Symbols: Limited to the file in which they are defined.
  3. Undefined Symbols: Referenced in the code but defined elsewhere (e.g., in a shared library).
  4. Weak Symbols: Have a lower precedence and can be overridden by strong symbols during linking.

The symbol table is a metadata structure that stores information about each symbol, including:

  • Name: Identifier of the symbol (e.g., function name).
  • Type: Function, variable, or section.
  • Value: Memory address or offset of the symbol.
  • Size: Size of the symbol in memory.
  • Section Index: Index of the section where the symbol resides.
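
To make the fields above concrete, here is a minimal sketch that decodes one 24-byte symbol record, assuming the little-endian ELF64 layout (Elf64_Sym). The entry is hand-built for illustration rather than read from a real file, and decode_sym is a hypothetical helper name. Note that the name is stored as an offset into a string table, and that binding and type share a single byte.

```python
import struct

# Elf64_Sym: st_name, st_info, st_other, st_shndx, st_value, st_size
ELF64_SYM = struct.Struct("<IBBHQQ")

BINDINGS = {0: "LOCAL", 1: "GLOBAL", 2: "WEAK"}
TYPES = {0: "NOTYPE", 1: "OBJECT", 2: "FUNC", 3: "SECTION", 4: "FILE"}


def decode_sym(raw: bytes) -> dict:
    """Decode one little-endian Elf64_Sym record (hypothetical helper)."""
    name_off, info, _other, shndx, value, size = ELF64_SYM.unpack(raw)
    return {
        "name_offset": name_off,                   # offset into the string table
        "bind": BINDINGS.get(info >> 4, "OTHER"),  # high nibble of st_info
        "type": TYPES.get(info & 0xF, "OTHER"),    # low nibble of st_info
        "shndx": shndx,
        "defined": shndx != 0,                     # 0 is SHN_UNDEF
        "value": value,
        "size": size,
    }


# Hand-built record: a GLOBAL FUNC of 34 bytes at 0x1130 in section 12.
raw = ELF64_SYM.pack(1, (1 << 4) | 2, 0, 12, 0x1130, 34)
sym = decode_sym(raw)

# Hand-built record: an undefined GLOBAL reference (section index 0).
undef = decode_sym(ELF64_SYM.pack(5, 1 << 4, 0, 0, 0, 0))

print(sym)
print(undef)
```

An undefined symbol is simply a record whose section index is 0, which is why "undefined" is a property of the entry rather than a separate binding.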

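The full symbol table is usually found in the .symtab section of an ELF binary, with the names stored in .strtab. Stripping the binary (strip -s) removes this table to reduce file size and obscure symbol information. Dynamically linked binaries also carry a smaller table, .dynsym, which holds only the symbols needed at load time and is not removed by stripping.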

Relocation Information

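Relocation is the process of patching addresses in code and data once the final memory layout is known. The linker resolves most relocations when it combines object files; the ones that depend on where a shared library or position-independent executable is loaded are left for the dynamic loader. A fully linked, non-position-independent static executable normally needs no relocation at run time.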

Relocation Entries

Relocation entries specify:

  • Offset: The location in the binary where an address needs to be modified.
  • Symbol: The symbol that the address refers to.
  • Type: The kind of relocation (e.g., relative, absolute).
  • Addend: A constant adjustment to the symbol’s address.

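Relocation tables, typically found in sections named .rel.* or .rela.* (for example .rela.text in object files, or .rela.dyn and .rela.plt in dynamically linked binaries), are consulted during linking or dynamic loading to apply these adjustments. Only the .rela form stores an explicit addend; with .rel, the addend is read from the location being patched.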
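
As a rough illustration of how these fields are stored, the sketch below decodes one 24-byte relocation record, assuming the little-endian ELF64 "rela" layout (Elf64_Rela). The record is hand-built, and decode_rela is a hypothetical helper name. The symbol index and the relocation type are packed together into one 64-bit field.

```python
import struct

# Elf64_Rela: r_offset (u64), r_info (u64), r_addend (signed 64-bit)
ELF64_RELA = struct.Struct("<QQq")

R_X86_64_32 = 10  # relocation type number from the x86-64 psABI


def decode_rela(raw: bytes) -> dict:
    """Decode one little-endian Elf64_Rela record (hypothetical helper)."""
    offset, info, addend = ELF64_RELA.unpack(raw)
    return {
        "offset": offset,
        "sym_index": info >> 32,     # upper 32 bits: index into the symbol table
        "type": info & 0xFFFFFFFF,   # lower 32 bits: relocation type
        "addend": addend,
    }


# Hand-built record: patch offset 0x20 using symbol #2, type R_X86_64_32, addend 8.
raw = ELF64_RELA.pack(0x20, (2 << 32) | R_X86_64_32, 8)
rela = decode_rela(raw)
print(rela)
```

Because the entry stores a symbol index rather than a name, a relocation table is only meaningful alongside the symbol table that its section header links to.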

Relocation Example

If a binary references a global variable int x, but the variable’s final address isn’t known at compile time, a relocation entry might look like this:

  • Offset: Address in the .text section where x is accessed.
  • Symbol: The identifier for x.
  • Type: R_X86_64_32 (32-bit absolute relocation).
  • Addend: Offset adjustment, if any.
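
The following sketch shows what applying such an entry could look like. For R_X86_64_32 the stored value is the symbol address plus the addend (S + A), written as an unsigned 32-bit little-endian integer, and the result must fit in 32 bits. The instruction bytes, the offset, and the address 0x404028 are made-up illustration values, and apply_r_x86_64_32 is a hypothetical helper, not a real linker API.

```python
import struct


def apply_r_x86_64_32(section: bytearray, offset: int, sym_addr: int, addend: int) -> None:
    """Patch a 32-bit absolute address (S + A) into the section at the given offset."""
    value = sym_addr + addend
    if not 0 <= value <= 0xFFFFFFFF:
        # A real linker reports a "relocation truncated to fit" style error here.
        raise OverflowError(f"value {value:#x} does not fit in 32 bits")
    struct.pack_into("<I", section, offset, value)


# mov eax, dword ptr [abs32] with a zero placeholder where the address of x goes.
text = bytearray(b"\x8b\x04\x25\x00\x00\x00\x00")

# Assumed for illustration: x ends up at 0x404028 and the addend is 0.
apply_r_x86_64_32(text, offset=3, sym_addr=0x404028, addend=0)
print(text.hex())
```

The range check is the reason this relocation type is unsuitable for position-independent code that may be loaded above the 4 GiB boundary.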

Impact of strip -s

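The strip -s (--strip-all) command, as implemented by GNU strip, removes:

  • Symbol Table: Deletes .symtab and its string table .strtab, along with debug sections, leaving only the information required for execution.
  • Static Relocation Information: Drops relocation sections that are not loaded at run time. The dynamic symbol table (.dynsym) and the dynamic relocations (.rela.dyn, .rela.plt) are kept, because the dynamic linker needs them.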

Stripping affects how a binary can be debugged or reverse-engineered:

  1. Obfuscated Symbols: Without symbol names, understanding the binary’s behavior is harder.
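  2. No Static Relocation Info: If the binary carried link-time relocation records (for example, from ld --emit-relocs), they are gone, so re-linking or rewriting the binary requires recovering its structure by analysis.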
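
One way to check whether a binary has been stripped is to look at its section header types: a stripped file typically has no SHT_SYMTAB section but may still have SHT_DYNSYM. The sketch below assumes a little-endian ELF64 image whose section header table is present and uses ordinary (non-extended) section numbering. Both symbol_tables and fake_image are hypothetical helpers; fake_image builds a header-only stand-in so the example runs without a real binary.

```python
import struct

SHT_SYMTAB, SHT_DYNSYM = 2, 11  # section type numbers from the ELF specification


def symbol_tables(elf: bytes) -> dict:
    """Report which symbol-table section types appear in an ELF64 image."""
    if elf[:4] != b"\x7fELF" or elf[4] != 2 or elf[5] != 1:
        raise ValueError("expected a little-endian ELF64 image")
    (e_shoff,) = struct.unpack_from("<Q", elf, 0x28)              # section header table offset
    e_shentsize, e_shnum = struct.unpack_from("<HH", elf, 0x3A)   # entry size and count
    types = [
        struct.unpack_from("<I", elf, e_shoff + i * e_shentsize + 4)[0]  # sh_type
        for i in range(e_shnum)
    ]
    return {"symtab": SHT_SYMTAB in types, "dynsym": SHT_DYNSYM in types}


def fake_image(section_types: list) -> bytes:
    """Build a synthetic ELF64 header plus empty section headers (illustration only)."""
    header = bytearray(64)
    header[:6] = b"\x7fELF\x02\x01"
    struct.pack_into("<Q", header, 0x28, 64)                      # headers follow the ELF header
    struct.pack_into("<HH", header, 0x3A, 64, len(section_types))
    body = b"".join(struct.pack("<II", 0, t) + bytes(56) for t in section_types)
    return bytes(header) + body


unstripped = symbol_tables(fake_image([0, 1, SHT_SYMTAB, SHT_DYNSYM]))
stripped = symbol_tables(fake_image([0, 1, SHT_DYNSYM]))
print(unstripped, stripped)
```

For a real file, the same function can be given the result of open(path, "rb").read(); the file utility reports similar information as "stripped" or "not stripped".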

How Symbols and Relocations Look

Symbol Table Entry (ELF Example)

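Entries in the .symtab section, as printed by readelf -s, may look like this (values are illustrative):

Num:    Value          Size Type    Bind   Vis      Ndx Name
  1: 0000000000001040     0 SECTION LOCAL  DEFAULT   12 .text
  2: 0000000000001130    34 FUNC    GLOBAL DEFAULT   12 main

Local symbols are listed before global ones, and a section symbol such as .text has type SECTION.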

Relocation Entry (ELF Example)

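A relocation entry in the .rela.plt section, as printed by readelf -r (R_X86_64_JUMP_SLOT entries belong to the PLT relocations, not to .rela.text):

Offset          Info           Type           Symbol
00000000000010c8  000100000007 R_X86_64_JUMP_SLOT  printf

Here:

  • Offset: Location of the GOT slot that the dynamic linker fills with the address of printf; calls reach printf through this slot rather than being patched at the call site.
  • Info: Encodes the symbol index (upper 32 bits, here 1) and the relocation type (lower 32 bits, here 7).
  • Type: Specifies the type of relocation, decoded from Info.
  • Symbol: The name of the function or variable (e.g., printf).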

Use Cases and Implications

Stripping binaries is useful in:

  • Size Optimization: Smaller binaries for constrained environments (e.g., embedded systems).
  • Security: Removes symbol names and metadata, which raises the effort needed for reverse engineering but does not prevent it.
  • Proprietary Protection: Removes debug symbols to protect intellectual property.

However, stripped binaries are harder to debug, so an unstripped copy (or separate debug files) is usually kept for development. The dynamic symbol table and dynamic relocations that a dynamically linked executable needs at load time sit in loadable segments and are left in place by strip.
