Cargo-show-asm

cargo-show-asm is a Cargo subcommand that extracts and displays the low-level compiled representation of individual Rust functions. It compiles your crate, parses the compiler output artifacts, and isolates the output for a single symbol you specify. The current version is 0.2.56 (February 2026), and it installs as cargo asm. It also responds to cargo show-asm.

cargo install cargo-show-asm

It replaced the now-unmaintained cargo-asm crate (RUSTSEC-2025-0122, November 2025). The key improvements over its predecessor are that it avoids recompiling the entire project on every invocation, respects your existing profile settings instead of forcing codegen-units=1, produces cleaner demangled output, and auto-detects whether color should be enabled based on terminal attachment.

Supported Platforms and Architectures

OS: Linux, macOS, Windows (limited support on Windows)
Rust: Stable and Nightly
Architectures: x86, x86_64, ARM, AArch64, PowerPC, MIPS, SPARC, WASM

Output Formats

cargo-show-asm can emit seven distinct outputs. Most correspond to a different stage in the Rust compilation pipeline, while --mca is an analysis layered on top of the assembly. For context, the full pipeline is:

Rust source → MIR → LLVM IR (unoptimized) → LLVM IR (optimized) → Assembly text (.s) → Assembler → Object file (.o) → Linker → Binary

The flag you pass selects which artifact is parsed:

Flag	Output	Stage
`--asm`	Assembly source text (default)	`.s` file as LLVM’s backend wrote it
`--disasm`	Disassembled object code	`.o` file disassembled back into readable assembly
`--llvm`	LLVM IR (post-optimization)	After LLVM optimization passes
`--llvm-input`	LLVM IR (pre-optimization)	Raw translation from Rust MIR into LLVM
`--mir`	Rust MIR	Before LLVM, where the borrow checker operates
`--wasm`	WebAssembly text format	Requires a `wasm32` target
`--mca`	llvm-mca performance analysis	Simulates microarchitectural execution

For x86 targets, assembly and disassembly support two syntaxes: --intel (default) and --att.

Assembly vs Disassembly

--asm (the default) shows the assembly source text from the .s file that LLVM’s backend emits. This is assembly as LLVM wrote it, before the assembler encodes it into machine code bytes. --disasm takes the assembled object file (.o, containing actual encoded machine code) and disassembles it back into readable assembly. The round-trip through the assembler can surface differences: instruction encoding choices, relaxations (e.g., shortening a 32-bit jump offset to 8-bit when the target is close enough), or reorderings. --disasm is closer to what the CPU actually executes.

cargo asm --release --lib my_crate::compute
cargo asm --release --disasm --lib my_crate::compute

LLVM IR: Pre- and Post-Optimization

Two distinct views are available. --llvm shows the LLVM IR after optimization passes have run — this is what LLVM’s backend actually lowers to assembly. --llvm-input shows the IR before optimization, which is the raw translation from Rust’s MIR into LLVM. Comparing the two reveals exactly what LLVM’s optimizer contributed.

# Pre-optimization: what rustc hands to LLVM
cargo asm --release --llvm-input --lib my_crate::compute
 
# Post-optimization: what LLVM's backend receives
cargo asm --release --llvm --lib my_crate::compute

MIR

--mir emits Rust’s Mid-level Intermediate Representation, which sits above LLVM entirely. MIR is the representation on which the borrow checker operates and where Rust-level optimizations (like copy propagation and const evaluation) happen. This is useful when you want to understand what Rust itself did before handing off to LLVM.

Warning

MIR’s human-readable format is not stabilized. Its structure can change between Rust compiler versions without notice.

cargo asm --mir --lib my_crate::compute

WebAssembly

--wasm parses the .wasm binary and presents it in WebAssembly text format. This requires targeting a WASM triple.

rustup target add wasm32-unknown-unknown
cargo asm --release --wasm --target wasm32-unknown-unknown --lib my_crate::compute

llvm-mca Analysis

--mca pipes the assembly output through LLVM’s Machine Code Analyzer, which simulates the microarchitectural execution of the instruction sequence. It reports estimated cycle counts, bottleneck resources (e.g., port pressure on x86), and throughput. You almost always want to pair this with --target-cpu or --native so that the analysis models the correct microarchitecture. Extra arguments can be forwarded to llvm-mca via -M.

cargo asm --release --mca --native --lib my_crate::hot_loop
cargo asm --release --mca --target-cpu=skylake -M "--timeline" --lib my_crate::hot_loop

CLI Reference

Artifact and Package Selection

In a workspace, -p <SPEC> selects the package (defaults to the current one). Then exactly one artifact flag tells it which compilation unit to analyze:

Flag	Target
`--lib`	Library target
`--bin <NAME>`	A binary target
`--test <NAME>`	A test target
`--bench <NAME>`	A benchmark target
`--example <NAME>`	An example target

cargo asm -p my_subcrate --lib my_subcrate::process
cargo asm --bin my_server my_server::handle_request
cargo asm --example my_example my_example::demo

Build Configuration

Flag	Effect
`--release`	Compile with release profile
`--dev`	Compile with dev profile
`--profile <PROFILE>`	Use a custom profile
`--features <FEATURES>`	Activate specific features (repeatable)
`--all-features`	Activate all features
`--no-default-features`	Disable the default feature
`--target <TRIPLE>`	Cross-compile for a target triple
`--native`	Optimize for the host CPU (`-C target-cpu=native`)
`--target-cpu <CPU>`	Optimize for a specific CPU model (e.g., `skylake`, `znver3`)
`-C <FLAG>`	Pass codegen flags to `rustc`
`-Z <FLAG>`	Pass unstable (nightly-only) flags to Cargo
`--manifest-path <PATH>`	Path to `Cargo.toml`
`--target-dir <DIR>`	Custom target directory
`--frozen`	Require `Cargo.lock` and cache are up to date
`--locked`	Require `Cargo.lock` is up to date
`--offline`	Run without network access
`--dry`	Show the build plan without executing

--native and --target-cpu affect which SIMD instruction sets, branch hinting, and scheduling strategies the backend can use. --native causes LLVM to detect the host CPU’s exact model and feature set at compile time (via the cpuid instruction on x86, system registers like ID_AA64ISAR0_EL1 on AArch64, etc.) and emit code using all available extensions. --target-cpu names a specific microarchitecture explicitly.

# Release mode with specific features
cargo asm --release --features "simd,parallel" --lib my_crate::compute
 
# Cross-compile for ARM
cargo asm --release --target armv7-unknown-linux-gnueabihf --lib my_crate::compute
 
# Optimize for a specific CPU model
cargo asm --release --target-cpu=skylake --lib my_crate::compute
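
To get a feel for which extensions --native might unlock on a given machine, one option is to ask the CPU at run time from Rust. The sketch below is illustrative only: host_simd_features is a hypothetical helper, the feature list is a small hand-picked subset, and run-time detection is not the same mechanism LLVM uses when it resolves target-cpu=native.

```rust
/// Hypothetical helper: lists a few SIMD extensions the running CPU reports.
/// The feature names checked here are an illustrative subset, not a full list.
pub fn host_simd_features() -> Vec<&'static str> {
    // `mut` is unused on architectures that match neither cfg block below.
    #[allow(unused_mut)]
    let mut found: Vec<&'static str> = Vec::new();

    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::arch::is_x86_feature_detected!("sse2") {
            found.push("sse2");
        }
        if std::arch::is_x86_feature_detected!("avx") {
            found.push("avx");
        }
        if std::arch::is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
        if std::arch::is_x86_feature_detected!("fma") {
            found.push("fma");
        }
    }

    #[cfg(target_arch = "aarch64")]
    {
        if std::arch::is_aarch64_feature_detected!("neon") {
            found.push("neon");
        }
    }

    found
}
```

If the returned list contains "avx2", it is reasonable to expect --native builds to be allowed to use 256-bit ymm registers where a default build would stay with xmm.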

Display and Formatting

Flag	Effect
`--rust`	Interleave Rust source lines with output
`-c, --context <N>`	Include called functions recursively up to depth N
`--simplify`	Strip assembler directives (`.cfi_*`, `.section`, etc.)
`--include-constants`	Include string literals, lookup tables, constant data sections
`--color`	Force color output
`--no-color`	Disable color output
`--full-name`	Show full demangled symbol names including hash
`--short-name`	Show abbreviated demangled names
`--keep-mangled`	Preserve raw mangled symbol names
`-K, --keep-labels`	Keep all original labels
`-B, --keep-blanks`	Strip redundant labels but keep blank lines/whitespace
`-R, --reduce-labels`	Strip redundant labels entirely
`-b, --keep-blank`	Keep blank lines
`--everything`	Dump output for ALL symbols, bypass function selection
`-v, --verbose`	More verbose output
`-s, --silent`	Less user-facing information
`-q, --quiet`	Suppress Cargo log messages
`--this-workspace`	Scope interleaved Rust sources to the current workspace
`--all-crates`	Include Rust sources from all crates
`--all-sources`	Include all available source files

# Clean, simplified output with Rust source
cargo asm --release --rust --simplify --lib my_crate::compute
 
# See a function and everything it calls (2 levels deep)
cargo asm --release --context 2 --lib my_crate::compute
 
# Include constant data referenced by the function
cargo asm --release --include-constants --lib my_crate::compute
 
# Dump ALL functions' assembly to a file
cargo asm --release --everything --simplify --lib > all_asm.s

Function Selection and Disambiguation

When invoked with an artifact flag and no function name, cargo-show-asm lists every symbol in the artifact along with its line count:

$ cargo asm --lib

When you provide a substring, it filters to matching symbols. If multiple symbols match, they are listed with numeric indices:

0: my_crate::process::process_data       (45 lines)
1: my_crate::process::process_header     (23 lines)
2: <my_crate::Proc as my_crate::Run>::go (67 lines)

You can then select by appending the index as a positional argument:

cargo asm --lib "process" 2

For unambiguous selection across sessions (e.g., in scripts or CI), use the full demangled name with hash suffix:

cargo asm --lib "once_cell::imp::OnceCell<T>::initialize::h9c5c7d5bd745000b"

Working with Generics and Inlining

Rust monomorphizes generic functions, producing a separate compiled copy for each concrete type instantiation. This means a function like fn sort<T: Ord>(slice: &mut [T]) may appear multiple times in the symbol list — once as sort::<i32>, once as sort::<String>, and so on. Use --full-name or the index to pick the instantiation you care about.

The more common problem is that a function does not appear at all. This happens when LLVM inlines it into its callers, eliminating the standalone symbol. An attribute and a visibility modifier together prevent this:

#[inline(never)]
pub fn my_function<T: std::fmt::Display>(val: T) -> String {
    format!("{}", val)
}

#[inline(never)] prevents inlining, and pub visibility prevents the compiler from deciding the symbol is internal-only and eliminating it entirely. For generic functions where you want to inspect a specific monomorphization, a monomorphic wrapper gives you a stable, findable symbol:

#[inline(never)]
pub fn compute_f64(data: &[f64]) -> f64 {
    compute_generic(data)
}

#[no_mangle] on a non-generic function also makes it trivially easy to find in the output, though it changes linkage semantics: the symbol becomes externally visible with its exact Rust name. In the Rust 2024 edition the attribute is written #[unsafe(no_mangle)]. Use it for inspection, not in production code.
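
The wrapper snippet above calls compute_generic without defining it. A minimal, complete version might look like the following, assuming a hypothetical crate named my_crate and an illustrative summing body (the real generic function could do anything). Note that a generic function only produces machine code for the instantiations that are actually used, so the wrapper is also what forces the f64 copy to exist in a library.

```rust
// src/lib.rs of a hypothetical crate named `my_crate`.
use std::ops::Add;

/// Illustrative generic function (assumed for this sketch): sums a slice.
pub fn compute_generic<T>(data: &[T]) -> T
where
    T: Copy + Default + Add<Output = T>,
{
    // Start from T::default() and add every element in order.
    data.iter().copied().fold(T::default(), |acc, x| acc + x)
}

/// Monomorphic wrapper: forces the f64 instantiation to be generated and
/// gives it a stable symbol name that is easy to find in the listing.
#[inline(never)]
pub fn compute_f64(data: &[f64]) -> f64 {
    compute_generic(data)
}
```

With this in place, cargo asm --release --lib my_crate::compute_f64 should have a symbol to show, although compute_generic itself may still be inlined into the wrapper.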

Cross-Compilation Setup

To generate assembly for a non-host target, you need the target installed via rustup and potentially a cross-linker configured.

# Add the target
rustup target add aarch64-unknown-linux-gnu

Configure the linker in .cargo/config.toml:

[target.aarch64-unknown-linux-gnu]
linker = "aarch64-linux-gnu-gcc"

Then inspect assembly for that target:

cargo asm --release --target aarch64-unknown-linux-gnu --lib my_crate::compute

This works for any supported architecture. For WASM specifically, no linker configuration is needed — just rustup target add wasm32-unknown-unknown and pass --wasm --target wasm32-unknown-unknown.

Practical Workflows

Checking for Vectorization

Compile with --release --native (so the backend knows the full SIMD width available) and --rust (so you can correlate loops to instructions):

cargo asm --release --native --rust --lib my_crate::sum_squares

On x86_64, look for packed instructions like vaddps, vmulpd (AVX/AVX2), vfmadd213ps (FMA), or addps, mulpd (SSE). The register name tells you the vector width: xmm is 128-bit, ymm is 256-bit, zmm is 512-bit. If you see only scalar operations on xmm registers with ss/sd suffixes (e.g., addss, mulsd), vectorization did not occur.
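
As a starting point for this workflow, here are two small functions that are plausible auto-vectorization candidates. Both are assumptions made for illustration: the source names a sum_squares function but does not show it, and axpy is a hypothetical addition. The integer version is used on purpose, because a floating-point sum is order-sensitive and LLVM will generally not reorder it into packed lanes without explicit permission.

```rust
/// Element-wise update y[i] += a * x[i]. Each iteration is independent,
/// which makes the loop a natural candidate for packed instructions.
#[inline(never)]
pub fn axpy(a: f32, x: &[f32], y: &mut [f32]) {
    // zip stops at the shorter slice, so no indexing or bounds checks appear.
    for (yi, xi) in y.iter_mut().zip(x) {
        *yi += a * *xi;
    }
}

/// Integer sum of squares with wrapping arithmetic. Integer addition is
/// associative, so the reduction can be split across vector lanes.
#[inline(never)]
pub fn sum_squares(values: &[u32]) -> u32 {
    values
        .iter()
        .fold(0u32, |acc, &v| acc.wrapping_add(v.wrapping_mul(v)))
}
```

Whether packed instructions actually appear depends on the compiler version and target CPU, which is exactly what the cargo asm output lets you confirm.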

Comparing Optimizer Impact

Viewing pre- and post-optimization LLVM IR for the same function reveals exactly what LLVM contributes:

cargo asm --release --llvm-input --lib my_crate::my_fn
cargo asm --release --llvm --lib my_crate::my_fn

A pattern worth watching for is --llvm-input showing explicit bounds checks (calls to core::panicking::panic_bounds_check) that disappear in --llvm output, which means LLVM proved them redundant.
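
A pair of functions like the following can make the bounds-check comparison concrete. The names are hypothetical. In the first, the index range comes from the slice's own length, so the check is provably redundant; in the second, the bound is an unrelated parameter, so one would expect the check to survive optimization.

```rust
/// Index range derived from data.len(): every access is provably in bounds,
/// so the optimizer has what it needs to drop the check.
#[inline(never)]
pub fn sum_in_range(data: &[u64]) -> u64 {
    let mut total = 0u64;
    for i in 0..data.len() {
        total = total.wrapping_add(data[i]);
    }
    total
}

/// Index range derived from an unrelated `n`: the access may be out of
/// bounds, so a panic path is expected to remain after optimization.
#[inline(never)]
pub fn sum_first_n(data: &[u64], n: usize) -> u64 {
    let mut total = 0u64;
    for i in 0..n {
        total = total.wrapping_add(data[i]);
    }
    total
}
```

Running both --llvm-input and --llvm on each function and searching for the panic call is a quick way to see which checks the optimizer removed; the exact IR differs between compiler versions.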

Profiling with llvm-mca

When you have an already-tight inner loop and want to understand its throughput ceiling, --mca simulates execution on a modeled CPU pipeline. The output includes a “Block RThroughput” (reciprocal throughput) estimate and a resource pressure table showing which execution port is saturated. This is most reliable for straight-line code without branches.

cargo asm --release --mca --target-cpu=skylake --lib my_crate::bottleneck
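
A function with a loop-carried dependency is a useful first subject for --mca, because the analysis tends to show latency rather than port pressure as the limit. The sketch below uses Horner's rule for polynomial evaluation; the name horner and the coefficient order (lowest degree first) are choices made for this example, standing in for the bottleneck function named above.

```rust
/// Evaluates c0 + c1*x + c2*x^2 + ... using Horner's rule.
/// `coeffs` is ordered from the lowest-degree term to the highest.
#[inline(never)]
pub fn horner(coeffs: &[f64], x: f64) -> f64 {
    // Each step needs the previous accumulator, forming a serial
    // multiply-then-add chain that cannot overlap across iterations.
    coeffs.iter().rev().fold(0.0, |acc, &c| acc * x + c)
}
```

Comparing the llvm-mca report for this function with one for a loop whose iterations are independent is one way to see how the dependency chain shows up in the throughput estimate.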

Inspecting MIR for Rust-Level Decisions

When you suspect the issue is in Rust’s own optimization passes rather than LLVM’s — for example, understanding how match arms are lowered, or whether a Copy type is being memcpied — MIR is the right level:

cargo asm --mir --lib my_crate::my_fn
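
For the match-lowering case, a small enum and a function that matches on it is enough to see the structure. The Shape type and area_units function are hypothetical names invented for this sketch. In MIR, a match like this typically appears as a discriminant read followed by a switchInt terminator that branches to one basic block per arm.

```rust
/// Hypothetical enum used only to give the match something to dispatch on.
#[derive(Clone, Copy)]
pub enum Shape {
    Circle { radius: u32 },
    Rect { width: u32, height: u32 },
    Empty,
}

/// Returns a rough integer area; each arm becomes its own basic block in MIR.
#[inline(never)]
pub fn area_units(shape: Shape) -> u32 {
    match shape {
        // Uses 3 as a crude integer stand-in for pi.
        Shape::Circle { radius } => 3 * radius * radius,
        Shape::Rect { width, height } => width * height,
        Shape::Empty => 0,
    }
}
```

Running cargo asm --mir --lib my_crate::area_units on a crate containing this code should list those blocks, though the exact textual layout varies with the compiler version.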

Inspecting WASM Output

rustup target add wasm32-unknown-unknown
cargo asm --release --wasm --target wasm32-unknown-unknown --lib my_crate::my_fn

Tips and Best Practices

Always use --release. Dev mode produces unoptimized code with stack spills for every variable, redundant loads and stores, and no inlining. It bears no resemblance to what ships in production.

Use --rust liberally. Interleaved source makes output dramatically easier to read. Pair it with --simplify to strip assembler directives you rarely care about.

Use --native for performance work. Without it, the backend targets a conservative baseline (e.g., x86-64-v1, whose widest SIMD extension is SSE2) and will not emit AVX/AVX2/AVX-512 instructions even if your CPU supports them.

Pipe large outputs to a file. cargo asm --release --everything --lib > output.s is useful for searching across all functions with grep or an editor.

Mark functions pub and #[inline(never)] when you need them to survive as standalone symbols in the output. Private, small functions are aggressively inlined and eliminated.

Use --context <N> to see a function alongside its callees. This is invaluable when a function delegates to a small helper — you see the full picture without hunting for the helper separately.

Common Pitfalls

Forgetting --release is the most frequent source of confusion. The second most common is a missing function: small functions called from only one site are almost certain to be inlined away in release mode. If a function you expect is missing, look for its body inside the caller's output before assuming a tooling problem; adding --rust helps correlate the inlined instructions with their source lines.
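
The missing-function case is easy to reproduce. In the sketch below (all names hypothetical), the private helper square is small and only used inside one public function, so it is likely to be inlined and absent from the symbol list, while square_kept is written to survive as a standalone symbol.

```rust
/// Private and tiny: a release build is likely to inline this into its
/// caller, leaving no standalone `square` symbol to select.
fn square(x: u32) -> u32 {
    x.wrapping_mul(x)
}

/// Public caller; the multiplications from `square` are expected to show up
/// directly in this function's assembly.
pub fn sum_of_squares_pair(a: u32, b: u32) -> u32 {
    square(a).wrapping_add(square(b))
}

/// Public and marked #[inline(never)]: intended to remain visible as its
/// own entry in the cargo asm listing.
#[inline(never)]
pub fn square_kept(x: u32) -> u32 {
    x.wrapping_mul(x)
}
```

Listing the library's symbols with cargo asm --lib on such a crate is a quick way to check which of the three functions remain.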

The old cargo-asm forced codegen-units=1, which meant the assembly you saw came from a single-CGU compilation — different from what a normal cargo build --release produces (which defaults to 16 CGUs). cargo-show-asm does not override this, so the output matches your actual build. If you want single-CGU output for easier analysis or to enable cross-function optimizations, set it explicitly:

[profile.release]
codegen-units = 1

Monomorphized generic functions can have long, unfamiliar names. Use --full-name to see the full monomorphized signature including the concrete type parameters, or use the numeric index to select among matches.

Quick Reference

# Install
cargo install cargo-show-asm
 
# List all functions in your lib
cargo asm --lib
 
# View assembly with Rust source (release mode)
cargo asm --release --rust --lib my_crate::my_fn
 
# View disassembly (from the .o, closer to what the CPU executes)
cargo asm --release --disasm --lib my_crate::my_fn
 
# View LLVM IR (post-optimization)
cargo asm --release --llvm --lib my_crate::my_fn
 
# View LLVM IR (pre-optimization)
cargo asm --release --llvm-input --lib my_crate::my_fn
 
# View MIR
cargo asm --mir --lib my_crate::my_fn
 
# View WASM
cargo asm --release --wasm --target wasm32-unknown-unknown --lib my_crate::my_fn
 
# Performance analysis with llvm-mca
cargo asm --release --mca --native --lib my_crate::my_fn
 
# Cross-compile for AArch64
cargo asm --release --target aarch64-unknown-linux-gnu --lib my_crate::my_fn
 
# Everything, simplified, to a file
cargo asm --release --everything --simplify --lib > all_asm.s
 
# Disambiguate by index
cargo asm --lib "my_fn" 2
 
# Disambiguate by full name with hash
cargo asm --lib "my_crate::my_fn::h9c5c7d5bd745000b"

Edmondo's Vault
