Interacting with GC from Application Code

Garbage-collected languages abstract away memory management, but sometimes application code needs to cooperate with the collector — to cache objects without preventing their collection, to run cleanup logic when resources are released, or to build data structures (like WeakHashMap) that automatically evict stale entries.

This note covers the theory behind GC-application interaction, how different GC models shape what’s possible, and the concrete standard library mechanisms in Java, Python, and Go.

Prerequisites

  • Basic understanding of garbage collection (mark-and-sweep, reference counting, reachability graphs)

Why interact with the GC?

Three common use cases:

  1. Weak caching. Hold a reference to an object without keeping it alive; if nothing else needs it, let the GC reclaim it. Useful for metadata maps keyed by transient objects (query plan nodes, DOM elements, session objects).

  2. Resource cleanup. Run cleanup (close file handles, release native memory, deregister listeners) when an object is collected. This is the “destructor” problem in GC’d languages; whether the cleanup runs at a predictable time depends on the GC model.

  3. Canonicalization / interning. Maintain a single canonical instance of equal objects (string interning, flyweight pattern) without pinning them in memory forever.
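The third use case can be sketched in a few lines of Python. This is a minimal illustration, not a production intern table: Symbol and InternPool are hypothetical names, the pool is not thread-safe, and the immediate eviction shown at the end relies on CPython's reference counting.

```python
import weakref


class Symbol:
    """Hypothetical value type; instances must be weak-referenceable."""

    def __init__(self, name: str) -> None:
        self.name = name


class InternPool:
    """Maps a name to its one canonical Symbol without keeping it alive."""

    def __init__(self) -> None:
        # Values are held weakly: an entry disappears once no caller
        # holds the canonical Symbol any more.
        self._pool = weakref.WeakValueDictionary()

    def intern(self, name: str) -> Symbol:
        sym = self._pool.get(name)
        if sym is None:
            sym = Symbol(name)
            self._pool[name] = sym
        return sym

    def __len__(self) -> int:
        return len(self._pool)


pool = InternPool()
a = pool.intern("users.id")
b = pool.intern("users.id")
same_instance = a is b          # both lookups resolve to one object
size_while_alive = len(pool)    # one entry while a caller holds the Symbol

del a, b                        # drop the last strong references
size_after_release = len(pool)  # entry evicted (immediately on CPython)
```

A plain dict would give the same deduplication but would keep every Symbol alive for the lifetime of the pool.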

GC models and what they imply

The GC’s design determines what interaction mechanisms are feasible:

| Model | Languages | Collection trigger | Notification timing |
| --- | --- | --- | --- |
| Tracing (mark-and-sweep) | Java, Go, C#, OCaml | Periodic or memory pressure | Non-deterministic: you don’t know when an object will be collected |
| Reference counting | CPython, Swift, Objective-C, Rust (Arc) | Immediate when refcount → 0 | Deterministic: collection happens at the exact point the last reference drops |
| Hybrid (tracing + refcount) | CPython (refcount + cycle collector), PHP | Immediate for acyclic, periodic for cycles | Mostly deterministic, except cycles |

What the model determines

Tracing GCs (Java, Go) cannot tell you when an object will be collected, only that it becomes eligible for collection once it is unreachable. This means:

  • Cleanup callbacks (finalizers) fire at unpredictable times
  • You cannot rely on cleanup for time-sensitive resource release (use explicit close() / defer / try-with-resources instead)
  • Notification mechanisms (ReferenceQueue, finalizers) are inherently asynchronous

Reference counting (CPython) gives deterministic destruction — __del__ fires the instant the last reference drops. This makes weak reference callbacks predictable: they fire synchronously during the operation that dropped the last strong reference.

However, reference counting alone cannot handle reference cycles — when two or more objects reference each other (A → B → A), their refcounts never reach zero even after all external references are dropped (A’s refcount stays at 1 because B still points to it, and vice versa). CPython supplements its refcount GC with a periodic cycle collector (gc module) — a tracing collector that detects and breaks these unreachable cycles. Objects caught by the cycle collector are destroyed non-deterministically (whenever the collector runs), losing the deterministic-destruction guarantee that makes refcounting attractive.
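The difference between the two paths is easy to observe on CPython. The sketch below uses a hypothetical Node class and disables the automatic cycle collector for the duration of the demo so that the timing is reproducible; other Python implementations may behave differently.

```python
import gc
import weakref


class Node:
    """Hypothetical class used only for this demonstration."""

    def __init__(self) -> None:
        self.other = None


gc.disable()  # keep the cycle collector from running on its own mid-demo

# Acyclic case: dropping the last reference frees the object at once.
single = Node()
single_ref = weakref.ref(single)
del single
acyclic_freed_immediately = single_ref() is None

# Cyclic case: a -> b -> a keeps both refcounts above zero.
a, b = Node(), Node()
a.other, b.other = b, a
cycle_ref = weakref.ref(a)
del a, b
cycle_alive_after_del = cycle_ref() is not None

unreachable_found = gc.collect()  # run the tracing cycle collector now
cycle_freed_after_collect = cycle_ref() is None

gc.enable()
```

The weak references are used here only as probes: they report whether the object still exists without keeping it alive.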

Generational hypothesis (Java, .NET): most objects die young. Generational collectors scan the young generation frequently and the old generation rarely. This affects weak reference semantics: a weak reference to a long-lived object may take a long time to be noticed as dead if the object lives in the old generation and full GCs are infrequent. Go’s collector is non-generational, so this effect does not apply there.

Java: ReferenceQueue and reference types

Of the three languages covered here, Java provides the most structured GC interaction API, via the java.lang.ref package (since JDK 1.2).

The four reference strengths

See Java’s reference hierarchy for the full table. In summary:

  • Strong — normal reference, prevents collection
  • Soft — collected only under memory pressure (caches)
  • Weak — collected at next GC cycle if no strong refs exist (metadata maps)
  • Phantom — enqueued after finalization, before memory reclaim (cleanup without resurrection risk)

ReferenceQueue: the notification channel

The problem ReferenceQueue solves: your application holds associated resources tied to an object’s lifetime — a cache entry, a native handle, a map value — and needs to clean them up when the object dies. The application cannot predict when a tracing GC will collect the object, so it needs a way to be notified after the fact.

The flow is:

  1. Application registers interest. It creates a WeakReference to the target object and passes a ReferenceQueue at construction time. This says: “when this object is collected, tell me via this queue.”
  2. GC collects the object. Once the target is no longer strongly or softly reachable, some later GC cycle detects this. The GC clears the WeakReference (its get() now returns null) and, at the same time or shortly afterwards, enqueues it onto the registered queue.
  3. Application polls for notifications. On its own schedule, the application drains the queue and runs cleanup for each dead reference.
ReferenceQueue<MyKey> queue = new ReferenceQueue<>();
 
// Step 1: register interest — "notify me when key dies"
WeakReference<MyKey> ref = new WeakReference<>(key, queue);
 
// Step 3: poll for GC notifications
Reference<? extends MyKey> dead;
while ((dead = queue.poll()) != null) {
    // The GC collected this key — clean up the associated map entry
    cleanupEntry(dead);
}

The pattern is polling-based: the application decides when to check. This gives the application full control over when cleanup runs (no surprise callbacks during unrelated operations). In OpenJDK, WeakHashMap does this polling in its private expungeStaleEntries() method, which it calls from most operations (lookups, insertions, size(), resizing).
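Python has no ReferenceQueue, but the same register / collect / poll flow can be imitated there, which makes the control-flow difference easy to try out. The following is a rough sketch under stated assumptions: PolledRegistry and Session are hypothetical names, the weak-reference callback does nothing except enqueue a token, and the "stale entry still present" observation relies on CPython's immediate refcount-based destruction.

```python
import queue
import weakref


class Session:
    """Hypothetical key type whose lifetime drives the cleanup."""


class PolledRegistry:
    """Side table keyed by object identity, cleaned up only when polled."""

    def __init__(self) -> None:
        self._dead = queue.SimpleQueue()  # plays the role of ReferenceQueue
        self._entries = {}                # id(key) -> (weakref, value)

    def put(self, key, value) -> None:
        self.expunge()
        token = id(key)
        dead = self._dead
        # Step 1: register interest. The callback only records the token;
        # the real cleanup is deferred to expunge().
        ref = weakref.ref(key, lambda _ref: dead.put(token))
        self._entries[token] = (ref, value)

    def get(self, key, default=None):
        self.expunge()
        entry = self._entries.get(id(key))
        if entry is not None and entry[0]() is key:
            return entry[1]
        return default

    def expunge(self) -> int:
        """Step 3: drain the queue at a point the application chooses."""
        removed = 0
        while True:
            try:
                token = self._dead.get_nowait()
            except queue.Empty:
                return removed
            entry = self._entries.get(token)
            if entry is not None and entry[0]() is None:
                del self._entries[token]
                removed += 1

    def __len__(self) -> int:
        return len(self._entries)


registry = PolledRegistry()
s = Session()
registry.put(s, "plan-metadata")
value_while_alive = registry.get(s)

del s                             # step 2: the key dies; only a token is queued
size_before_poll = len(registry)  # stale entry is still in the table
removed = registry.expunge()      # the application polls on its own schedule
size_after_poll = len(registry)
```

Calling expunge() at the start of put() and get() mirrors the WeakHashMap approach: stale entries are removed lazily, at points the data structure controls.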

Phantom references for leak-proof cleanup

The problem with finalize(). Java originally offered Object.finalize() — a method the GC calls before reclaiming an object’s memory. The idea was to put cleanup logic there (close file handles, release native memory). But finalize() has a critical flaw called resurrection: inside finalize(), the object is still alive (it has a this reference), so the finalizer can store this somewhere reachable — making the object survive collection. The GC cannot know in advance whether a finalizer will resurrect the object, so it must keep the object alive through finalization, then re-check reachability. This makes finalizers slow, unpredictable, and prone to memory leaks. finalize() was deprecated in JDK 9.

The solution: PhantomReference. Unlike weak references, a phantom reference’s get() always returns null — the application code can never obtain a reference to the object, so resurrection is impossible by construction. The GC enqueues the phantom reference after the object is finalized but before its memory is reclaimed, giving the application a safe window to run cleanup (release native handles, close connections) without any risk of accidentally keeping the object alive.

// Cleaner (JDK 9+) wraps PhantomReference + ReferenceQueue + daemon thread
Cleaner cleaner = Cleaner.create();
cleaner.register(myObject, () -> {
    // runs after myObject is phantom-reachable
    nativeHandle.release();
});

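Cleaner is the modern API: it processes the queue on its own daemon thread, so the application neither polls nor manages PhantomReference objects itself. The cleanup action must not capture the registered object (here, the lambda must not reference myObject); if it does, the object stays strongly reachable and the action never runs. Keep the state the action needs, such as nativeHandle, in a separate object.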

Python: callbacks and weak dictionaries

CPython’s reference-counting GC makes weak reference interaction simpler — cleanup is (mostly) deterministic.

weakref module

import weakref
 
# Weak reference with callback
def on_death(ref):
    print(f"Object collected, ref={ref}")
 
obj = MyPlanNode()
ref = weakref.ref(obj, on_death)  # callback fires when obj's refcount → 0
 
# Access the referent (returns None if collected)
alive = ref()

The callback fires synchronously during the operation that drops the last strong reference (e.g., during del obj or when a local variable goes out of scope). This is deterministic for acyclic objects. For objects in reference cycles, the callback fires when CPython’s cycle collector (a tracing collector that runs periodically) breaks the cycle.
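The synchronous timing can be checked by recording events around the del statement. This sketch assumes CPython; PlanNode is a hypothetical stand-in for the MyPlanNode class used above. The second half shows a related pitfall: the weakref object itself has to stay alive, otherwise its callback is discarded along with it.

```python
import weakref


class PlanNode:
    """Hypothetical stand-in for the MyPlanNode class used above."""


events = []


def on_death(ref):
    # By the time the callback runs, the referent is gone: ref() is None.
    events.append(("callback", ref() is None))


node = PlanNode()
node_ref = weakref.ref(node, on_death)

events.append(("before del", node_ref() is not None))
del node  # last strong reference dropped here; callback runs inside this statement
events.append(("after del", node_ref() is None))

# The weakref object must itself be kept alive for its callback to fire.
other = PlanNode()
weakref.ref(other, on_death)  # result discarded, so the weakref dies at once
del other
callback_count = sum(1 for name, _ in events if name == "callback")
```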

Weak dictionaries

The weakref module in the standard library provides two weak-reference mappings (plus WeakSet, not shown here):

import weakref
 
# Keys are weak — entry removed when key is collected
weak_key_dict = weakref.WeakKeyDictionary()
weak_key_dict[plan_node] = filter_data  # like Java's WeakHashMap
 
# Values are weak — entry removed when value is collected
weak_val_dict = weakref.WeakValueDictionary()
weak_val_dict["cache_key"] = expensive_object

Internally, these register per-entry callbacks that remove the stale entry immediately upon collection — no polling needed.
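A runnable version of the two containers, showing entries disappearing without any polling. PlanNode is a hypothetical class, and the immediate removal shown here assumes CPython and acyclic objects.

```python
import weakref


class PlanNode:
    """Hypothetical key/value type; plain instances support weak references."""

    def __init__(self, name: str) -> None:
        self.name = name


# Weak keys: side data attached to a node without extending its lifetime.
filters = weakref.WeakKeyDictionary()
node = PlanNode("scan")
filters[node] = {"predicate": "id > 10"}
key_entries_alive = len(filters)
del node
key_entries_after = len(filters)   # entry removed together with its key

# Weak values: a cache that forgets an object once nobody else uses it.
cache = weakref.WeakValueDictionary()
result = PlanNode("expensive")
cache["cache_key"] = result
hit = cache.get("cache_key") is result
del result
miss = cache.get("cache_key") is None
```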

Caveats

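  • Not all objects support weak references. Built-in types like int, str, tuple, and list cannot be weakly referenced directly (their instances lack the __weakref__ slot). Instances of ordinary user-defined classes support it by default, unless the class defines __slots__ without including '__weakref__'.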
  • Cycle collector timing. For cyclic garbage, destruction is non-deterministic — the cycle collector runs at threshold-based intervals (gc.get_threshold()).
  • __del__ is not a destructor. It runs at GC time (non-deterministic for cycles), can resurrect objects, and causes ordering problems. Use context managers (with) for resource cleanup.
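The first and last caveats can be illustrated together. The helper supports_weakref and the classes below are hypothetical names for this sketch. The second half goes slightly beyond the text above: it pairs a context manager with weakref.finalize (a standard-library helper not otherwise covered in this note) as a backstop, which is one possible pattern rather than the only one.

```python
import weakref


def supports_weakref(obj) -> bool:
    """Return True if a weak reference to obj can be created."""
    try:
        weakref.ref(obj)
        return True
    except TypeError:
        return False


class TrackedList(list):
    """Subclassing list gives instances a __weakref__ slot."""


class Compact:
    __slots__ = ("value",)  # "__weakref__" not listed


class CompactWeak:
    __slots__ = ("value", "__weakref__")


support = {
    "list": supports_weakref([1, 2]),
    "list subclass": supports_weakref(TrackedList([1, 2])),
    "int": supports_weakref(12345),
    "slots without __weakref__": supports_weakref(Compact()),
    "slots with __weakref__": supports_weakref(CompactWeak()),
}


class NativeHandle:
    """Hypothetical resource wrapper: explicit close first, finalizer as backstop."""

    def __init__(self, log: list) -> None:
        # The callback must not reference self, or the object never dies.
        self._finalizer = weakref.finalize(self, log.append, "released")

    def close(self) -> None:
        self._finalizer()  # runs the callback at most once

    def __enter__(self):
        return self

    def __exit__(self, *exc) -> None:
        self.close()


log = []
with NativeHandle(log) as handle:
    pass        # released deterministically when the block exits
del handle      # finalizer already ran, so nothing more is logged
```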

Go: finalizers and weak pointers

Go’s GC is a concurrent, non-generational, tri-color mark-and-sweep collector. Its design philosophy is minimal API surface — GC interaction was intentionally limited until recently.

runtime.SetFinalizer (Go 1.0+)

import "runtime"
 
type Handle struct {
    fd int
}
 
h := &Handle{fd: openNativeResource()}
runtime.SetFinalizer(h, func(h *Handle) {
    closeNativeResource(h.fd)
})

The finalizer runs in a dedicated goroutine when the GC determines the object is unreachable. Significant constraints:

  • Two-cycle collection. An object with a finalizer survives the first GC cycle (finalizer runs), then is collected in the second cycle (if not resurrected). This doubles the memory lifetime.
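  • Ordering only by dependency. If A points at B and both have finalizers, A’s finalizer runs first, and B’s can run only after A is freed. Between unrelated objects the order is arbitrary, and a cycle that contains an object with a finalizer is not guaranteed to be collected or finalized at all.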
  • Single registration. Calling SetFinalizer again replaces the previous finalizer. Calling with nil removes it.
  • Resurrection. If the finalizer stores a reference to the object somewhere reachable, the object survives. The finalizer is not re-registered — you must call SetFinalizer again explicitly.
  • Not guaranteed to run. If the program exits before GC runs, pending finalizers are not executed.

weak.Pointer (Go 1.24+)

Go 1.24 (February 2025) added weak.Pointer[T] — the first official weak reference type:

import "weak"
 
p := weak.Make(&myObject)
 
// Later — check if object is still alive
if strong := p.Value(); strong != nil {
    // object still alive, use it
} else {
    // object was collected
}

weak.Pointer returns nil from Value() once the object is collected. There is no ReferenceQueue equivalent: if you need notification, attach a cleanup to the target with runtime.AddCleanup (also added in Go 1.24) or the older runtime.SetFinalizer, or poll periodically.

unique.Handle (Go 1.23+)

For the canonicalization use case, Go 1.23 added unique.Handle[T] — an interning mechanism that deduplicates equal values and lets the GC reclaim entries when no handles remain:

import "unique"
 
h1 := unique.Make("hello")
h2 := unique.Make("hello")
// h1 == h2 (same handle, deduplicated)

This is Go’s equivalent of a weak-valued intern map, but with a purpose-built API rather than exposing raw GC interaction.

Design space: polling vs. callbacks vs. finalizers

| Mechanism | Control flow | Determinism | Resurrection risk | Thread safety |
| --- | --- | --- | --- | --- |
| ReferenceQueue (Java) | Application polls when ready | Non-deterministic (tracing GC) | Possible with Weak/Soft; impossible with Phantom | Queue is thread-safe |
| Callback (Python) | Fires during last decref | Deterministic (acyclic), non-deterministic (cyclic) | Not through the callback (it receives the dead ref, not the object); still possible through __del__ | Fires on the thread that drops the last ref |
| Finalizer (Go, Java legacy) | Runs on a runtime-owned finalizer goroutine/thread | Non-deterministic | Possible (footgun) | Runs concurrently with application code, on a thread the application does not control |
| Cleaner (Java 9+) | Daemon thread processes PhantomRefs | Non-deterministic | Impossible (PhantomRef.get() → null) | Dedicated thread |

Polling (Java’s ReferenceQueue) is safest for concurrent code — cleanup runs at predictable points in the application’s control flow. Callbacks (Python) are most ergonomic but can fire at surprising times in multi-threaded code. Finalizers are universally considered a last resort — unpredictable, fragile, and easy to misuse.

See also