The C Application Binary Interface (ABI) defines the rules and conventions for how compiled C programs interact at the binary level, both with the operating system and with other components, such as shared libraries. There is no single C ABI: each platform (a combination of processor architecture and operating system) defines its own. Within a platform, the ABI lets code built by different compilers interoperate by standardizing function calls, memory layout, and symbol resolution.

Calling Conventions

The ABI specifies how functions are invoked and how arguments are passed between the caller and callee. For example, in the x86-64 System V ABI, the first six integer or pointer arguments are passed in the registers RDI, RSI, RDX, RCX, R8, and R9, and the first eight floating-point arguments in XMM0 through XMM7; any remaining arguments are placed on the stack. Integer and pointer return values come back to the caller in RAX, and floating-point return values in XMM0.
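The register assignment can be made concrete by annotating an ordinary C function. The sketch below assumes the x86-64 System V ABI; the function names sum_args and call_sum_args are hypothetical and exist only for illustration. The register placement happens in the generated machine code, so the comments describe what a conforming compiler is expected to emit rather than anything visible from C itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical function, used only to illustrate argument placement.
 * Assuming the x86-64 System V ABI:
 *   a -> RDI, b -> RSI, c -> RDX, d -> RCX, e -> R8, f -> R9
 *   g -> stack (seventh integer argument, no register left)
 *   x -> XMM0 (first floating-point argument)
 * The int64_t result is returned in RAX. */
int64_t sum_args(int64_t a, int64_t b, int64_t c, int64_t d,
                 int64_t e, int64_t f, int64_t g, double x)
{
    return a + b + c + d + e + f + g + (int64_t)x;
}

/* Caller side: the compiler loads the registers and stores g on the
 * stack before the call, then reads the result from RAX. */
int64_t call_sum_args(void)
{
    int64_t result = sum_args(1, 2, 3, 4, 5, 6, 7, 8.0);
    assert(result == 36);
    return result;
}
```

Compiling this with a flag such as -S on a System V target and reading the assembly is one way to check the placement on a given toolchain.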

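The stack layout is also standardized. On x86-64 System V, the call instruction pushes the return address, the stack pointer must be 16-byte aligned at the point of the call, and the callee must restore RBX, RBP, and R12 through R15 before returning if it modifies them, while registers such as RAX, RCX, RDX, and R8 through R11 may be freely overwritten. Because both sides follow the same rules, components compiled by different tools can call each other safely.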

Data Layout

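The C ABI defines the size, alignment, and byte order of fundamental types, such as integers and floating-point numbers. The C standard leaves much of this open, so the ABI fills the gap: long is 64 bits on x86-64 System V but 32 bits on 64-bit Windows. Alignment rules require certain types to be stored at addresses that are multiples of a given boundary; on some architectures a misaligned access is merely slower, while on others it faults. Endianness specifies the order of bytes in multi-byte data types: little-endian stores the least significant byte at the lowest address, and big-endian stores the most significant byte there.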
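A short sketch of how a program can observe these properties instead of assuming them. The helper names (is_little_endian, store_u32_le, load_u32_le) are hypothetical; only standard C11 headers are used. The store and load helpers show the usual way to write a fixed byte order that does not depend on the host.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>

/* Whatever alignment the ABI picks, a type's size is a multiple of it. */
static_assert(sizeof(double) % alignof(double) == 0,
              "size must be a multiple of alignment");

/* Returns 1 if the lowest-addressed byte holds the least significant
 * bits (little-endian), 0 otherwise. */
int is_little_endian(void)
{
    uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* Writes v in little-endian order regardless of host byte order. */
void store_u32_le(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)(v & 0xFF);
    out[1] = (unsigned char)((v >> 8) & 0xFF);
    out[2] = (unsigned char)((v >> 16) & 0xFF);
    out[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Reads a little-endian 32-bit value regardless of host byte order. */
uint32_t load_u32_le(const unsigned char in[4])
{
    return (uint32_t)in[0]
         | ((uint32_t)in[1] << 8)
         | ((uint32_t)in[2] << 16)
         | ((uint32_t)in[3] << 24);
}
```

Shifting and masking operate on values rather than on memory, which is why the two helpers behave the same on big-endian and little-endian hosts.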

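Structs and arrays also follow specific layout rules. Array elements are contiguous, and struct members are laid out in declaration order, with padding inserted so that each member meets its alignment requirement and the total size of the struct is a multiple of its strictest member alignment. Because these rules are fixed by the platform ABI, compilers targeting the same ABI produce the same layout for the same declaration.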
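The effect of padding can be inspected with offsetof and sizeof. The two structs below are hypothetical and hold the same members in a different order; the offsets quoted in the comments assume the x86-64 System V ABI and may differ elsewhere (for example, on 32-bit x86, where double has 4-byte alignment inside structs).

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct, declared in an order that wastes space.
 * Assuming x86-64 System V: offsets 0, 4, 8, 16 and sizeof == 24. */
struct record {
    char    tag;    /* followed by 3 bytes of padding */
    int32_t count;
    char    flag;   /* followed by 7 bytes of padding */
    double  value;
};

/* Same members, largest alignment first.
 * Assuming x86-64 System V: offsets 0, 8, 12, 13 and sizeof == 16. */
struct record_sorted {
    double  value;
    int32_t count;
    char    tag;
    char    flag;   /* followed by 2 bytes of tail padding */
};

/* These hold on any conforming implementation. */
static_assert(offsetof(struct record, count) % alignof(int32_t) == 0,
              "member offset must satisfy member alignment");
static_assert(sizeof(struct record) % alignof(struct record) == 0,
              "struct size must be a multiple of its alignment");

/* Bytes in each struct that belong to no member. */
size_t record_padding(void)
{
    return sizeof(struct record)
         - (2 * sizeof(char) + sizeof(int32_t) + sizeof(double));
}

size_t record_sorted_padding(void)
{
    return sizeof(struct record_sorted)
         - (2 * sizeof(char) + sizeof(int32_t) + sizeof(double));
}
```

Reordering members changes the binary layout, so it is only safe for structs that are not already part of a published interface.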

Dynamic Libraries

When an executable relies on dynamic libraries, the ABI defines how they are loaded and resolved at runtime. Shared libraries are mapped into the program’s virtual address space, and because their code is position-independent, the same physical pages can be shared by multiple programs without duplication. On ELF-based systems, references to external symbols go through the Global Offset Table (GOT) and Procedure Linkage Table (PLT), which handle dynamic symbol resolution.

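The dynamic loader resolves the required functions and variables by writing their addresses into GOT entries. With lazy binding, a call through the PLT resolves the symbol only on its first use and stores the address in the GOT, so subsequent calls jump straight to the target. Lazy binding can be disabled (for example, by setting LD_BIND_NOW or linking with -z now on GNU/Linux), in which case all symbols are resolved at load time.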
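The resolve-on-first-use behavior can be modeled in plain C with a function pointer that starts out pointing at a resolver. This is a toy model only: the real GOT and PLT are generated by the linker and patched by the dynamic loader, not written by hand, and every name below (got_add, add_resolver, add_via_plt, real_add, resolutions_so_far) is hypothetical.

```c
#include <assert.h>

typedef int (*add_fn)(int, int);

/* Counts how often the toy resolver runs. */
static int resolve_count = 0;

/* Stands in for a function that lives in a shared library. */
static int real_add(int a, int b)
{
    return a + b;
}

static int add_resolver(int a, int b);

/* Toy GOT slot: initially points at the resolver, not the target. */
static add_fn got_add = add_resolver;

/* Toy resolver: stands in for the dynamic loader's symbol lookup.
 * It patches the slot, then forwards the original call. */
static int add_resolver(int a, int b)
{
    assert(got_add == add_resolver); /* only runs while unpatched */
    resolve_count++;
    got_add = real_add;
    return got_add(a, b);
}

/* Toy PLT entry: every call is an indirect jump through the slot. */
int add_via_plt(int a, int b)
{
    return got_add(a, b);
}

int resolutions_so_far(void)
{
    return resolve_count;
}
```

The first call to add_via_plt pays for the lookup; later calls go through the patched slot and reach real_add directly, which mirrors how a lazily bound call avoids repeating the resolution.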

Symbol Resolution

In C, symbols such as function and variable names appear in the binary essentially as written in the source, with no type information encoded in the name (some platforms, such as macOS, prepend an underscore). This simplicity makes C symbols easy to reference from other compilers and other programming languages. C++ uses extern "C" declarations when interfacing with C libraries, and Rust uses extern "C" together with the no_mangle attribute when exporting functions. These select the C calling convention and suppress the name mangling that would otherwise occur in these languages.
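A common way to make one header usable from both C and C++ is to wrap its declarations in a conditional extern "C" block. The sketch below combines what would normally be a header and a source file into one listing; the function clamp_int is hypothetical.

```c
#include <assert.h>

/* Header part: __cplusplus is defined only by C++ compilers, so a C
 * compiler never sees the extern "C" lines. */
#ifdef __cplusplus
extern "C" {
#endif

/* Hypothetical library function. With C linkage its symbol is the
 * plain name clamp_int (plus any platform prefix), not a mangled one. */
int clamp_int(int value, int low, int high);

#ifdef __cplusplus
}
#endif

/* Source part: limits value to the range [low, high]. */
int clamp_int(int value, int low, int high)
{
    assert(low <= high);
    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}
```

On ELF systems, running a tool such as nm on the resulting object file is one way to confirm that the symbol appears under its plain name.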

Predictable symbol names are what make this interoperability work: a linker or loader can match a reference in one component to a definition in another by name alone, regardless of which compiler or language produced each side.