WeakHashMap and Reference Types
Java’s WeakHashMap is a hash map whose entries are automatically removed when their keys become unreachable through normal (strong) references. It relies on the java.lang.ref package — a set of reference wrapper types that let application code cooperate with the garbage collector.
Prerequisites
- Basic understanding of Java’s garbage collector (mark-and-sweep, reachability)
Java’s reference hierarchy
The JVM recognizes four strengths of reference, from strongest to weakest:
| Type | Class | GC behaviour | Use case |
|---|---|---|---|
| Strong | (normal variable) | Never collected while reachable | Default — all regular code |
| Soft | SoftReference<T> | Collected only under memory pressure | Caches (keep if memory allows) |
| Weak | WeakReference<T> | Eligible for collection at the next GC cycle once no strong or soft refs remain | Metadata maps, canonicalization |
| Phantom | PhantomReference<T> | Enqueued after finalization, before memory reclaim | Cleanup actions (replaces finalize()) |
The critical distinction: a strong reference prevents collection. A weak reference does not — it merely observes whether the object is still alive.
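The distinction can be observed directly with a bare WeakReference. The following is a minimal sketch, not a guaranteed behaviour: the class and method names are made up for illustration, and System.gc() is only a hint, so the "cleared" result depends on the JVM actually running a collection.

```java
import java.lang.ref.WeakReference;

class WeakReferenceDemo {

    /** Asks for a GC up to 50 times; System.gc() is only a hint, so this is best effort. */
    static void requestGcUntilCleared(WeakReference<?> ref) {
        for (int i = 0; i < 50 && ref.get() != null; i++) {
            System.gc();
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    /** While a strong reference exists, the weak reference still sees the object. */
    static boolean aliveWhileStronglyHeld() {
        Object payload = new Object();
        WeakReference<Object> observer = new WeakReference<>(payload);
        System.gc();
        // payload is used in the comparison, so it is strongly reachable across the GC request.
        return observer.get() == payload;
    }

    /** Once the only strong reference is dropped, the GC is free to clear the weak reference. */
    static boolean clearedOnceReleased() {
        Object payload = new Object();
        WeakReference<Object> observer = new WeakReference<>(payload);
        payload = null; // the weak reference alone does not keep the object alive
        requestGcUntilCleared(observer);
        return observer.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("alive while strongly held: " + aliveWhileStronglyHeld());
        System.out.println("cleared once released:     " + clearedOnceReleased());
    }
}
```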
How WeakHashMap works
WeakHashMap<K, V> stores each key wrapped in a WeakReference, not as a direct strong reference. This is the central design choice: in a normal HashMap, the map itself holds a strong reference to the key, so the GC cannot collect the key while its entry remains in a reachable map. WeakHashMap breaks this by wrapping the key in a WeakReference, so the map can “observe” the key without “owning” it. Values, however, are held as normal strong references.
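A side-by-side run makes the difference concrete. This is an illustrative sketch with a hypothetical helper name (sizeAfterKeyDropped). The HashMap result follows from the strong key reference; the WeakHashMap result depends on the JVM honouring the System.gc() hint, so treat it as typical rather than guaranteed.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.WeakHashMap;

class WeakVsStrongMapDemo {

    /** Inserts one entry under a fresh key, drops the key, and reports the size after GC requests. */
    static int sizeAfterKeyDropped(Map<Object, String> map) {
        Object key = new Object();
        map.put(key, "metadata");
        key = null; // from here on, only the map can reach the key

        for (int i = 0; i < 25 && !map.isEmpty(); i++) {
            System.gc(); // a hint only
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return map.size();
    }

    public static void main(String[] args) {
        // HashMap owns the key, so the entry stays.
        System.out.println("HashMap size:     " + sizeAfterKeyDropped(new HashMap<>()));
        // WeakHashMap only observes the key, so the entry can disappear.
        System.out.println("WeakHashMap size: " + sizeAfterKeyDropped(new WeakHashMap<>()));
    }
}
```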
Why both WeakReference and ReferenceQueue are needed
WeakReference alone solves half the problem — it lets the GC collect the key. But after collection, the stale entry (the dead WeakReference wrapper + the strongly-held value) remains in the map’s hash table, leaking memory. The map needs a way to find and remove these dead entries.
Without ReferenceQueue, the only option would be scanning the entire map on every operation checking ref.get() == null — O(n) per access. ReferenceQueue solves this efficiently: the GC deposits dead references onto the queue, and the map only processes entries that actually died — O(dead entries) per access.
So the two mechanisms compose: WeakReference = “don’t keep the key alive” (allows collection). ReferenceQueue = “tell me which keys died” (enables targeted entry removal).
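The notification half can be tried in isolation, without any map. The sketch below registers a WeakReference with a ReferenceQueue and waits for the same reference object to come back out of the queue. The class and method names are illustrative, and the outcome again assumes the JVM acts on System.gc().

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

class ReferenceQueueDemo {

    /** Returns true if the reference we registered is delivered on the queue after its referent dies. */
    static boolean deadReferenceIsEnqueued() {
        ReferenceQueue<Object> queue = new ReferenceQueue<>();
        Object key = new Object();
        WeakReference<Object> ref = new WeakReference<>(key, queue); // register with the queue

        if (queue.poll() != null) {
            return false; // nothing should be enqueued while the key was never released
        }
        key = null; // drop the only strong reference

        Reference<?> dequeued = null;
        try {
            for (int i = 0; i < 50 && dequeued == null; i++) {
                System.gc();                 // a hint only
                dequeued = queue.remove(20); // block for up to 20 ms waiting for a dead reference
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // The queue hands back the wrapper object itself, already cleared.
        return dequeued == ref && ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("dead reference enqueued: " + deadReferenceIsEnqueued());
    }
}
```

Note that the queue delivers the reference wrapper, not the key: by the time it arrives, get() already returns null, which is why a map built on this mechanism has to remember enough about the entry (for example its hash) to locate it without the key.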
The mechanism in detail
Figure: the lifecycle of a WeakHashMap entry. The key (red) is wrapped in a WeakReference, so the map does not keep it alive. The value (green) is held by a strong reference, so the map keeps it alive until the entry is expunged. When the GC collects the key, it enqueues the dead reference; the map polls and removes the entry on its next operation.
Step by step:
- Insertion. map.put(key, value) wraps key in a WeakReference registered with the map’s internal ReferenceQueue. The value is stored as a normal strong reference, so the map keeps the value alive until the entry is removed.
- Key becomes weakly reachable. When no strong references to key exist outside the map (the caller dropped it, the plan node was GC’d, etc.), the key is only weakly reachable and becomes eligible for collection.
- GC enqueues the reference. At the next collection cycle, the GC clears the WeakReference (its get() now returns null) and places it on the ReferenceQueue. This is the GC → application communication channel.
- Lazy expungement. On the next WeakHashMap operation (get, put, size, iteration), the map calls expungeStaleEntries(): it polls the queue, finds dead references, and removes the corresponding entries (key reference + value + hash bucket link). The value is released at this point.
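The four steps can be sketched as a deliberately tiny weak-keyed map. This is a teaching sketch under stated assumptions, not the JDK implementation: TinyWeakMap and its bucket layout (a HashMap of lists) are invented here for brevity, and only put, get, and size are provided. The parts that mirror the description are the entry that extends WeakReference, the shared ReferenceQueue, and the expunge step that runs at the start of every operation.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical teaching sketch of a weak-keyed map; not thread-safe, not the JDK code. */
class TinyWeakMap<K, V> {

    /** The entry is itself the weak reference to the key; the value is a plain strong field. */
    private static final class Entry<K, V> extends WeakReference<K> {
        final int hash; // cached because the key is already gone when we expunge
        V value;

        Entry(K key, V value, ReferenceQueue<K> queue) {
            super(key, queue); // step 1: weak key, registered with the queue
            this.hash = key.hashCode();
            this.value = value;
        }
    }

    private final ReferenceQueue<K> queue = new ReferenceQueue<>();
    private final Map<Integer, List<Entry<K, V>>> buckets = new HashMap<>();
    private int size;

    void put(K key, V value) {
        expungeStaleEntries();
        List<Entry<K, V>> bucket = buckets.computeIfAbsent(key.hashCode(), h -> new ArrayList<>());
        for (Entry<K, V> e : bucket) {
            if (key.equals(e.get())) { // e.get() is null for a dead key, so this is false
                e.value = value;
                return;
            }
        }
        bucket.add(new Entry<>(key, value, queue));
        size++;
    }

    V get(K key) {
        expungeStaleEntries();
        List<Entry<K, V>> bucket = buckets.get(key.hashCode());
        if (bucket == null) {
            return null;
        }
        for (Entry<K, V> e : bucket) {
            if (key.equals(e.get())) {
                return e.value;
            }
        }
        return null;
    }

    int size() {
        expungeStaleEntries();
        return size;
    }

    /** Step 4: drain the queue and unlink exactly the entries whose keys were collected. */
    @SuppressWarnings("unchecked")
    private void expungeStaleEntries() {
        Reference<? extends K> ref;
        while ((ref = queue.poll()) != null) {
            Entry<K, V> dead = (Entry<K, V>) ref;
            List<Entry<K, V>> bucket = buckets.get(dead.hash);
            // Reference does not override equals, so remove() matches by identity.
            if (bucket != null && bucket.remove(dead)) {
                dead.value = null; // release the strongly held value
                size--;
                if (bucket.isEmpty()) {
                    buckets.remove(dead.hash);
                }
            }
        }
    }
}
```

Because expungement only happens inside put, get, and size, a dead entry in this sketch lingers until one of them is called, which is the same lazy behaviour described for WeakHashMap.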
Consequences
- Cleanup is lazy, not immediate. A dead key’s entry persists until the next map operation. If the map is never accessed again, entries leak until the map itself is collected.
- Values are held strongly. If the value holds a strong reference back to the key (directly or transitively), the key never becomes weakly reachable and the entry is never removed. This is the most common WeakHashMap bug.
- Not thread-safe. WeakHashMap has no synchronization. In concurrent contexts, use Collections.synchronizedMap(new WeakHashMap<>()) or keep one map per thread via a ThreadLocal to avoid contention.
- Lookup uses equals(), but GC uses reachability. WeakHashMap finds entries via equals()/hashCode() (like a normal HashMap). But the GC decides whether to collect a key based on whether that specific object instance is strongly reachable, not on whether an equal object exists somewhere. Once the GC collects a key and expungement removes the entry, creating a new object where newKey.equals(oldKey) is true won’t find anything: the entry is gone. The map tracks object lifetimes, not logical identity.
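The value-to-key leak is easy to reproduce. In the sketch below (illustrative names, not from any library), the first map stores a value that strongly references its own key, so the entry can never be expunged. The second map wraps the value in a WeakReference, which is the workaround suggested in the WeakHashMap Javadoc. The leaking case follows from reachability alone; the fixed case assumes the JVM acts on System.gc().

```java
import java.lang.ref.WeakReference;
import java.util.Map;
import java.util.WeakHashMap;

class ValueBackReferenceDemo {

    /** Value strongly references the key: map -> entry -> value -> key keeps the key alive. */
    static int leakySize() {
        Map<Object, Object[]> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, new Object[] { key }); // strong back-reference from value to key
        key = null;
        requestGc(map);
        return map.size();
    }

    /** Value refers to the key only weakly, so the key can still become weakly reachable. */
    static int fixedSize() {
        Map<Object, WeakReference<Object>> map = new WeakHashMap<>();
        Object key = new Object();
        map.put(key, new WeakReference<>(key)); // no strong path from the map to the key
        key = null;
        requestGc(map);
        return map.size();
    }

    /** Requests GC until the map is empty or the attempts run out; System.gc() is only a hint. */
    private static void requestGc(Map<?, ?> map) {
        for (int i = 0; i < 25 && !map.isEmpty(); i++) {
            System.gc();
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println("value holds key strongly: size = " + leakySize());
        System.out.println("value holds key weakly:   size = " + fixedSize());
    }
}
```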
See also
- Interacting with GC from Application Code — broader view: how Java, Python, and Go let application code cooperate with the GC (ReferenceQueue, callbacks, finalizers)
- Java SPI — another Java mechanism used by Spark extensions