The Problem

During Analysis, Bazel walks the dependency graph bottom-up. When it analyzes target A (which depends on target B), A’s rule implementation needs to know things about B. Not just “what files did B produce” — but structured metadata. A Rust target depending on another Rust target needs to know: what’s the crate name? Where’s the compiled .rlib? What Rust edition was used? What are all the transitive .rlib files I also need to pass to rustc?

Bazel’s answer is providers. A provider is essentially a typed struct — a named bundle of fields — that a rule’s implementation function returns. Downstream targets read these structs from their dependencies.

How Providers Work

Defining a provider

You create a new provider type by calling provider(). This is analogous to defining a dataclass or a named tuple — it creates a constructor:

# This creates a NEW TYPE called MyInfo, and MyInfo is also its constructor
MyInfo = provider(
    fields = {
        "name": "A human-readable name",
        "output_file": "The compiled artifact",
    },
)

After this, MyInfo is a callable. You call it to create instances:

info = MyInfo(name = "foo", output_file = some_file)

And you read fields from instances:

print(info.name)         # "foo"
print(info.output_file)  # File object
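
For readers who think in Python, the dataclass analogy can be made concrete. The following is a plain-Python sketch, not Bazel code: namedtuple merely stands in for provider(), try_rename is a hypothetical helper, and the path string is a made-up placeholder for a File object. It also illustrates one property the two share: instances cannot be modified after creation.

```python
from collections import namedtuple

# Analogy only: namedtuple stands in for Bazel's provider() builtin.
MyInfo = namedtuple("MyInfo", ["name", "output_file"])

# "bazel-out/result.txt" is a hypothetical placeholder for a File object.
info = MyInfo(name="foo", output_file="bazel-out/result.txt")

print(info.name)
print(info.output_file)


def try_rename(instance, new_name):
    """Return True if the field could be reassigned, False otherwise."""
    try:
        instance.name = new_name
        return True
    except AttributeError:
        return False


# namedtuple fields are read-only, much like the fields of a provider instance.
print(try_rename(info, "bar"))
```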

Returning providers from a rule

A rule’s implementation function returns a list of provider instances. This is how a target publishes metadata about itself:

def _my_rule_impl(ctx):
    output = ctx.actions.declare_file("result.txt")
    # ... declare some action that produces output ...
 
    return [
        MyInfo(name = "foo", output_file = output),
    ]

Reading providers from dependencies

When target A depends on target B, A’s implementation function receives B as an already-analyzed Target object (via ctx.attr.deps). You access B’s providers using index notation — like accessing a dict by type:

def _consumer_impl(ctx):
    for dep in ctx.attr.deps:
        if MyInfo in dep:                  # check if this dep returned MyInfo
            info = dep[MyInfo]             # get the MyInfo instance
            print(info.name)               # read its fields
            print(info.output_file.path)

That’s the whole mechanism. Everything else is just instances of this pattern.
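
The lookup semantics can be modelled in a few lines of plain Python. This is a toy model, not how Bazel implements Target: ToyTarget and collect_names are hypothetical names, and the sketch only assumes that a target stores at most one instance per provider type.

```python
from collections import namedtuple

MyInfo = namedtuple("MyInfo", ["name", "output_file"])
OtherInfo = namedtuple("OtherInfo", ["value"])


class ToyTarget:
    """Toy stand-in for an analyzed target: providers keyed by their type."""

    def __init__(self, label, providers):
        self.label = label
        self._providers = {type(p): p for p in providers}

    def __contains__(self, provider_type):   # supports `MyInfo in dep`
        return provider_type in self._providers

    def __getitem__(self, provider_type):    # supports `dep[MyInfo]`
        return self._providers[provider_type]


def collect_names(deps):
    """Mirror of _consumer_impl: read MyInfo from every dep that has it."""
    return [dep[MyInfo].name for dep in deps if MyInfo in dep]


deps = [
    ToyTarget("//pkg:a", [MyInfo(name="a", output_file="a.txt")]),
    ToyTarget("//pkg:b", [OtherInfo(value=1)]),  # returns no MyInfo
]
print(collect_names(deps))
```

Indexing a target that lacks the provider fails (a KeyError in this model, an analysis error in Bazel), which is why the loop checks with in first.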

The Provider Ecosystem

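From a rule author’s point of view there is no fundamental difference between “built-in providers”, “rule-set providers”, and “custom providers.” You construct, return, and read them all the same way. Rule-set and custom providers are created with provider() in a .bzl file; Bazel’s own built-ins such as DefaultInfo are implemented natively inside Bazel but behave like any other provider. The main distinction is who defined them and where: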

Providers defined by Bazel itself

Bazel ships a few providers that every rule should know about. The most important is DefaultInfo:

DefaultInfo(
    files = depset([...]),       # What `bazel build` will produce
    runfiles = ctx.runfiles(...),  # What's available at runtime
)

Every rule should return a DefaultInfo. If you don’t, Bazel creates a default one. It answers the universal questions: “what files does this target produce?” and “what files does this target need at runtime?”

Important

files vs runfiles: files is what gets built. runfiles is what’s available when the program actually runs. For a PyO3 .so, it must be in both — built by bazel build, and present at runtime so import my_module works.

Here’s what returning DefaultInfo looks like in a real rule:

def _my_rule_impl(ctx):
    compiled_so = ctx.actions.declare_file("my_module.abi3.so")
    # ... declare action that produces compiled_so ...
 
    return [
        DefaultInfo(
            # files: what `bazel build //my:target` produces on disk
            files = depset([compiled_so]),
 
            # runfiles: what's available at runtime when a py_binary runs
            # Without this, `bazel build` works but `import my_module` fails
            runfiles = ctx.runfiles(files = [compiled_so]),
        ),
    ]

Providers defined by rule sets

Rule sets like rules_rust and rules_python define their own providers — they just call provider() in their .bzl files, same as you would. For example, rules_rust defines something like:

# Somewhere inside rules_rust source code:
CrateInfo = provider(
    fields = {
        "name": "Crate name",
        "type": "rlib, cdylib, proc-macro, etc.",
        "output": "The compiled artifact",
        "edition": "Rust edition",
        "transitive_outputs": "depset of all .rlibs in the transitive closure",
    },
)

When you see rust_common.crate_info(name = "my_module", ...) in documentation, that’s just calling this constructor. rust_common.crate_info is an alias for the CrateInfo provider constructor, namespaced under rust_common for organizational reasons. There’s nothing magical about it — it’s the same provider() mechanism.

Similarly, rules_python provides PyInfo, and rules_cc provides CcInfo. Both started out as providers built into Bazel itself and have been moving into their rule sets, but either way they’re just provider types that carry domain-specific metadata.

Providers you define yourself

If you write a custom rule, you define your own providers the exact same way:

PyO3ModuleInfo = provider(
    fields = {
        "module_name": "The Python module name",
        "shared_library": "The compiled .so file",
        "python_version": "Python version this was built against",
    },
)

No different from what rules_rust did with CrateInfo.

Complete Example: Custom Provider End-to-End

Here’s the full lifecycle — defining a provider, writing a rule that returns it, and writing a consumer rule that reads it:

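# tools/pyo3_module/defs.bzl
 
load("@rules_rust//rust:defs.bzl", "rust_common")
 
# rules_rust exposes its CrateInfo provider as rust_common.crate_info
CrateInfo = rust_common.crate_info
 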
# ── Step 1: Define the provider (like defining a dataclass) ──
 
PyO3ModuleInfo = provider(
    fields = {
        "module_name": "Python module name for `import`",
        "shared_library": "The compiled .so File",
        "python_version": "Python version string",
    },
)
 
# ── Step 2: Write a rule that returns it ──
 
def _pyo3_extension_impl(ctx):
    rust_tc = ctx.toolchains["@rules_rust//rust:toolchain_type"]
    python_tc = ctx.toolchains["@rules_python//python:toolchain_type"]
 
    # Declare the output file
    output = ctx.actions.declare_file(ctx.attr.module_name + ".abi3.so")
 
    # Collect .rlib files from Rust deps (reading THEIR providers)
    dep_rlibs = []
    for dep in ctx.attr.deps:
        dep_rlibs.append(dep[CrateInfo].output)
 
    # Declare the build action
    args = ctx.actions.args()
    args.add("--crate-type=cdylib")
    args.add("--edition=2021")
    args.add("-o", output)
    for dep in ctx.attr.deps:
        ci = dep[CrateInfo]
        args.add("--extern", ci.name + "=" + ci.output.path)
    args.add(ctx.files.srcs[0])
 
    ctx.actions.run(
        executable = rust_tc.rustc,
        arguments = [args],
        inputs = ctx.files.srcs + dep_rlibs,
        outputs = [output],
        mnemonic = "PyO3Rustc",
    )
 
    # Return providers — both the universal DefaultInfo and our custom one
    return [
        DefaultInfo(
            files = depset([output]),
            runfiles = ctx.runfiles(files = [output]),
        ),
        PyO3ModuleInfo(
            module_name = ctx.attr.module_name,
            shared_library = output,
            # Hard-coded for illustration; a real rule would derive this from python_tc
            python_version = "3.11",
        ),
    ]
 
pyo3_extension = rule(
    implementation = _pyo3_extension_impl,
    attrs = {
        "srcs": attr.label_list(allow_files = [".rs"]),
        "module_name": attr.string(mandatory = True),
        "deps": attr.label_list(providers = [CrateInfo]),
    },
    toolchains = [
        "@rules_rust//rust:toolchain_type",
        "@rules_python//python:toolchain_type",
    ],
)
 
# ── Step 3: Write a consumer that reads the provider ──
 
def _pyo3_test_suite_impl(ctx):
    # Read our custom provider from the dependency
    mod = ctx.attr.module[PyO3ModuleInfo]
 
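    # Use the structured metadata. The file name includes the target name so
    # two test suites in one package do not declare the same output.
    test_script = ctx.actions.declare_file(ctx.label.name + "_run_tests.sh")
    ctx.actions.write(
        output = test_script,
        # The shebang lets Bazel execute the script directly
        content = "#!/bin/bash\necho 'Testing {name} (Python {ver})'\npython3 -m pytest\n".format(
            name = mod.module_name,
            ver = mod.python_version,
        ),
        is_executable = True,
    )
 
    # The .so comes from our custom provider; add it to runfiles so it is
    # present when the test runs
    so_file = mod.shared_library
    runfiles = ctx.runfiles(files = [test_script, so_file])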
 
    return [DefaultInfo(executable = test_script, runfiles = runfiles)]
 
pyo3_test_suite = rule(
    implementation = _pyo3_test_suite_impl,
    attrs = {
        "module": attr.label(providers = [PyO3ModuleInfo]),  # MUST provide PyO3ModuleInfo
    },
    test = True,
)

Usage in a BUILD file:

load("//tools/pyo3_module:defs.bzl", "pyo3_extension", "pyo3_test_suite")
 
pyo3_extension(
    name = "my_module",
    srcs = ["src/lib.rs"],
    module_name = "my_module",
    deps = ["@crate_index//:pyo3"],
)
 
pyo3_test_suite(
    name = "my_module_test",
    module = ":my_module",  # passes because pyo3_extension returns PyO3ModuleInfo
)

Requiring providers on attributes

You can tell Bazel “this attribute only accepts targets that return a specific provider.” Bazel enforces this during Analysis, before any action runs, which gives you the equivalent of a compile-time type check on your build graph:

my_rule = rule(
    implementation = _my_rule_impl,
    attrs = {
        "deps": attr.label_list(providers = [CrateInfo]),
    },
)

If someone writes deps = ["//some:py_library"], Bazel errors during Analysis:

ERROR: '//some:py_library' does not have mandatory providers: 'CrateInfo'
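
Conceptually, the check is a set-membership test that runs before the rule's implementation function is called. The sketch below models it in plain Python; missing_providers is a hypothetical helper, providers are represented by their names as strings, and the message format only imitates the error shown above.

```python
def missing_providers(deps, required):
    """Return one error string per dependency that lacks a required provider.

    deps maps a label to the set of provider names that target returns.
    required is the list of provider names the attribute demands.
    """
    errors = []
    for label, provided in deps.items():
        missing = [name for name in required if name not in provided]
        if missing:
            quoted = ", ".join("'%s'" % name for name in missing)
            errors.append(
                "'%s' does not have mandatory providers: %s" % (label, quoted)
            )
    return errors


deps = {
    "//some:rust_lib": {"DefaultInfo", "CrateInfo"},
    "//some:py_library": {"DefaultInfo", "PyInfo"},
}
for error in missing_providers(deps, ["CrateInfo"]):
    print("ERROR: " + error)
```

In Bazel itself, providers may also be a list of lists, in which case a dependency satisfies the attribute if it returns every provider in at least one inner list. The sketch covers only the single-list form.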

How Cross-Language Builds Work

An important consequence of the provider design: different language rule sets don’t know about each other. rules_python has no idea what CrateInfo is, and rules_rust has no idea what PyInfo is. They communicate through the universal provider — DefaultInfo — and file artifacts:

  1. rust_shared_library returns DefaultInfo(files = {.so}, runfiles = {.so}) plus CrateInfo(...)
  2. py_library puts the Rust target in its data attribute. It reads DefaultInfo to get the .so file and adds it to its own runfiles
  3. py_binary depends on the py_library, merges all transitive runfiles
  4. At runtime, the .so is in the runfiles tree, where Python can import it as long as its directory is on sys.path

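No rule needs to understand another rule set’s internals. Files flow through DefaultInfo. This is the baseline pattern for cross-language builds in Bazel; rule sets that need tighter integration, such as linking against C++ libraries through CcInfo, agree on a shared provider instead.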
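
The four steps can be traced with a small plain-Python model. Everything here is a simplification for illustration: ToyDefaultInfo is a hypothetical stand-in that holds file names in sets, and the three functions only imitate how the real rules propagate runfiles.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyDefaultInfo:
    """Toy stand-in for DefaultInfo: files and runfiles as sets of names."""
    files: frozenset
    runfiles: frozenset


def rust_shared_library(so_name):
    so = frozenset([so_name])
    return ToyDefaultInfo(files=so, runfiles=so)


def py_library(srcs, data=(), deps=()):
    runfiles = set(srcs)
    for target in data:
        # Only the generic provider is read; nothing Rust-specific is needed.
        runfiles |= target.files | target.runfiles
    for target in deps:
        runfiles |= target.runfiles
    return ToyDefaultInfo(files=frozenset(srcs), runfiles=frozenset(runfiles))


def py_binary(main, deps=()):
    runfiles = {main}
    for target in deps:
        runfiles |= target.runfiles
    return ToyDefaultInfo(files=frozenset([main]), runfiles=frozenset(runfiles))


ext = rust_shared_library("my_module.abi3.so")
lib = py_library(["wrapper.py"], data=[ext])
app = py_binary("main.py", deps=[lib])
print(sorted(app.runfiles))
```

The Python side never inspects anything Rust-specific: the .so reaches the binary's runfiles purely because each layer forwards what it received.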


depset: The Efficiency Problem

Consider a target deep in a dependency tree. It transitively depends on 10,000 libraries. Each library’s CrateInfo has a transitive_outputs field listing all .rlib files in the transitive closure. If every rule implementation built this as a flat Starlark list, it would copy and extend its dependencies’ lists at every level: O(n) per target, O(n²) total across the graph.
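
To get a feel for the quadratic growth, the sketch below counts element copies for the simplest possible graph, a linear chain of n targets. It is a plain-Python illustration under that assumption; flat_list_copies is a hypothetical helper, and real graphs are rarely this regular.

```python
def flat_list_copies(n):
    """Count list elements written when each of n targets in a linear chain
    builds its transitive list by copying its dependency's list."""
    copies = 0
    transitive = []
    for i in range(n):
        # List concatenation allocates a new list and copies every element.
        transitive = transitive + ["lib%d.rlib" % i]
        copies += len(transitive)
    return copies


for n in (10, 100, 1000):
    print(n, flat_list_copies(n))
```

The total is n(n+1)/2, so growing the chain tenfold makes the copying roughly a hundred times more expensive.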

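depset (dependency set) solves this. It’s an immutable directed acyclic graph whose nodes are shared rather than copied, so creating one costs time proportional to the arguments you pass in, not to the size of the transitive closure: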

# Leaf target (libc) — just its own output:
CrateInfo(transitive_outputs = depset([libc_rlib]))
 
# Middle target (pyo3, depends on libc) — adds its output, points to libc's depset:
libc_info = ctx.attr.deps[0][CrateInfo]
CrateInfo(
    transitive_outputs = depset(
        direct = [pyo3_rlib],                          # my own output
        transitive = [libc_info.transitive_outputs],   # pointer to libc's depset — O(1)
    ),
)

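Each depset() call just creates a node that holds its direct items and references to its children. The cost depends on how many direct items and transitive depsets you pass in, not on how many items those depsets already contain, so it is effectively O(1) per dependency edge. The full flattening into an actual list only happens when Bazel finally needs it (e.g., when constructing the inputs to a rustc action), and duplicates are removed automatically.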

Think of it like a linked list of lists — you prepend your items and link to the previous structure, without ever copying.
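
A toy version of the structure fits in a few lines of plain Python. This is a sketch of the idea, not Bazel's implementation: ToyDepset is a hypothetical class, it flattens children before the node's own items, and it uses recursion, which would not scale to very deep graphs.

```python
class ToyDepset:
    """Toy model of a depset: direct items plus references to child nodes."""

    def __init__(self, direct=(), transitive=()):
        self.direct = tuple(direct)          # only this target's own items
        self.transitive = tuple(transitive)  # references to nodes, never copies

    def to_list(self):
        """Flatten on demand, visiting each node once and dropping duplicates."""
        seen_nodes, seen_items, out = set(), set(), []

        def visit(node):
            if id(node) in seen_nodes:
                return
            seen_nodes.add(id(node))
            for child in node.transitive:
                visit(child)
            for item in node.direct:
                if item not in seen_items:
                    seen_items.add(item)
                    out.append(item)

        visit(self)
        return out


# Diamond: app depends on pyo3 and serde, which both depend on libc.
libc = ToyDepset(direct=["libc.rlib"])
pyo3 = ToyDepset(direct=["pyo3.rlib"], transitive=[libc])
serde = ToyDepset(direct=["serde.rlib"], transitive=[libc])
app = ToyDepset(direct=["app.rlib"], transitive=[pyo3, serde])

print(app.to_list())
```

Note that libc is reachable along two paths but appears once in the flattened result, and that building app stored only one new item and two references.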

Traversal order (specified at depset creation):

  • "default" — unspecified, Bazel picks efficiently
  • "preorder" — parent items before children
  • "postorder" — children before parent
  • "topological" — guaranteed topological order

Order doesn’t matter for most Rust compilation (you pass --extern flags). It can matter for linker inputs or Python’s sys.path.
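
The difference between the two simplest orders can be shown on a three-target chain. The following plain-Python sketch is a toy model with a hypothetical flatten function and nodes written as (direct_items, child_nodes) pairs; it covers only "preorder" and "postorder" and makes no claim about the exact sequence Bazel produces on more complex graphs.

```python
def flatten(node, order):
    """Flatten a (direct_items, child_nodes) pair in 'preorder' or 'postorder'."""
    out, seen = [], set()

    def emit(items):
        for item in items:
            if item not in seen:
                seen.add(item)
                out.append(item)

    def visit(current):
        direct, children = current
        if order == "preorder":
            emit(direct)      # parent items before children
        for child in children:
            visit(child)
        if order == "postorder":
            emit(direct)      # children before parent items

    visit(node)
    return out


libc = (["libc"], [])
pyo3 = (["pyo3"], [libc])
app = (["app"], [pyo3])

print(flatten(app, "preorder"))
print(flatten(app, "postorder"))
```

A linker that resolves symbols left to right would want the first ordering, while a tool that needs dependencies initialised first would want the second.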