Memory mapping is a facility exposed by the Portable Operating System Interface (POSIX)-defined mmap function on many operating systems. When you memory-map a file, a region of the process's virtual address space equal in size to the file is reserved, allowing the process to access the file's data with ordinary memory loads and stores.
In most cases, the kernel’s page (file) cache is mapped into virtual memory, which means that no copies are created when accessing the data. All accesses to a file go through the kernel’s page cache so that updates to the mapped file are immediately visible to any processes that attempt to read the file, even though they won’t exist physically on the device until the buffers are flushed and synced back to the physical drive.
This means that multiple processes can memory-map the same file and treat it as a region of shared memory, which is how dedicated shared-process memory is often implemented under the hood.
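As a rough illustration of the shared-memory behaviour described above, the sketch below has a child process and its parent each map the same temporary file with MAP_SHARED. The helper names (`map_shared`, `share_message_between_processes`), the `/tmp` path template, and the 4096-byte region size are assumptions made for this example, not part of any standard API.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

#define REGION_SIZE 4096 /* illustrative size, not tied to the real page size */

/* Hypothetical helper: map `path` as a shared read-write region.
 * Returns MAP_FAILED on error. */
static char *map_shared(const char *path) {
    int fd = open(path, O_RDWR);
    if (fd < 0) return MAP_FAILED;
    char *view = mmap(NULL, REGION_SIZE, PROT_READ | PROT_WRITE,
                      MAP_SHARED, fd, 0);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    return view;
}

/* Returns 0 if the parent sees the bytes the child stored, -1 otherwise. */
int share_message_between_processes(void) {
    char path[] = "/tmp/mmap_demo_XXXXXX"; /* assumed writable location */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    if (ftruncate(fd, REGION_SIZE) != 0) { close(fd); unlink(path); return -1; }
    close(fd);

    pid_t pid = fork();
    if (pid < 0) { unlink(path); return -1; }
    if (pid == 0) {
        /* Child: create its own mapping of the file and store a message. */
        char *child_view = map_shared(path);
        if (child_view == MAP_FAILED) _exit(1);
        strcpy(child_view, "hello from child");
        _exit(0);
    }

    /* Parent: wait for the child, then map the same file and read. */
    int status = 0;
    waitpid(pid, &status, 0);
    int ok = -1;
    char *parent_view = map_shared(path);
    if (parent_view != MAP_FAILED) {
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0 &&
            strcmp(parent_view, "hello from child") == 0) {
            ok = 0;
        }
        munmap(parent_view, REGION_SIZE);
    }
    unlink(path);
    return ok;
}
```

The child never calls write or msync: the parent reads the message because both mappings are backed by the same page cache pages.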
Using mmap
mmap is the POSIX system call that makes this sharing across processes possible. When the backing file lives in /dev/shm (typically a tmpfs mount on Linux), the processes are using an in-memory filesystem, which makes access much faster because no disk I/O is actually performed.
Memory Mapping as an Allocator
In addition to mapping files, mmap can be used to allocate memory directly from the kernel by using anonymous mapping. This bypasses the standard heap and allows for page-aligned memory allocation directly from the system. The allocated memory does not correspond to any file and is initialized to zero. This happens by calling mmap with the MAP_ANONYMOUS flag:
- The memory is allocated directly by the kernel.
- It is backed by the swap space (if needed), not by a file.
- Pages are aligned to system page boundaries (typically 4 KB on most systems).
- The memory is independent of the heap, which is managed by the C runtime (via malloc).
```c
#include <sys/mman.h>
#include <stdio.h>
#include <unistd.h>

int main() {
    size_t page_size = sysconf(_SC_PAGESIZE); // Get system page size (e.g., 4 KB)
    size_t num_pages = 10;                    // Allocate 10 pages
    size_t total_size = page_size * num_pages;

    void *addr = mmap(NULL, total_size, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    printf("Allocated %zu bytes at %p\n", total_size, addr);

    // Use the memory
    char *data = (char *)addr;
    data[0] = 'A';
    data[1] = 'B';

    // Clean up
    if (munmap(addr, total_size) != 0) {
        perror("munmap");
        return 1;
    }
    return 0;
}
```
Tip
mmap guarantees page alignment by default, whereas malloc requires additional work to obtain it (for example, posix_memalign). mmap also provides fine-grained access control (via mprotect) and can be more efficient for large allocations.
The Heap
The heap, by contrast, is managed by the C runtime library; see C memory allocators.
Memory mapping and page caches
The Kernel Page Cache
The kernel page cache plays a central role in optimizing file I/O operations:
- When a file is accessed, its contents are loaded into the page cache in RAM, allowing subsequent reads or writes to bypass the disk and operate directly on cached data.
- In the case of memory mapping, the page cache is used to back the mapped region, enabling processes to access file contents directly through their virtual memory without creating additional copies. Any changes made to the mapped region are reflected in the page cache, ensuring that other processes accessing the same file see consistent updates, though these changes may not be immediately written to disk until the cache is flushed.
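The following sketch illustrates the second point under the assumption of a unified page cache, as on Linux: a store through a MAP_SHARED mapping is observed by a pread on the same file before any explicit flush, and msync then forces the change to storage. The function name and the `/tmp` path template are assumptions made for this example.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/* Returns 0 if a store through the mapping is visible to pread() on the
 * same file and the later msync succeeds, -1 otherwise. */
int mapped_write_visible_to_read(void) {
    char path[] = "/tmp/pagecache_demo_XXXXXX"; /* assumed writable location */
    int fd = mkstemp(path);
    if (fd < 0) return -1;

    int result = -1;
    if (write(fd, "aaaa", 4) == 4) {
        char *view = mmap(NULL, 4, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (view != MAP_FAILED) {
            view[0] = 'Z'; /* modifies the page cache page, not the disk */

            /* Read through the ordinary file API; no flush has happened. */
            char byte = 0;
            if (pread(fd, &byte, 1, 0) == 1 && byte == 'Z') result = 0;

            /* Explicitly write the dirty page back to the device. */
            if (msync(view, 4, MS_SYNC) != 0) result = -1;
            munmap(view, 4);
        }
    }
    close(fd);
    unlink(path);
    return result;
}
```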
Minor page faults
A page fault happens when a process attempts to access a page of its virtual address space that is not currently mapped to physical memory.
In the case of memory-mapped I/O, a minor page fault is when a page exists in memory, but the memory management unit of the system hasn’t yet marked it as being loaded when it is accessed by the process. This would happen when a block of data has been loaded into the kernel page cache, but has not yet been mapped and connected to the appropriate location in the process’s virtual memory space.
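One way to observe this from user space is the `ru_minflt` counter reported by getrusage. The sketch below writes a file (so its contents are likely already in the page cache), maps it, touches each page once, and reports how much the counter grew. The function names, the `/tmp` path, and the 64-page size are assumptions for illustration; the exact count depends on the kernel (Linux may map several neighbouring pages per fault), so treat the number as indicative only.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <unistd.h>

#define DEMO_PAGES 64 /* illustrative number of pages to touch */

/* Hypothetical helper: minor faults taken by this process so far. */
static long minor_faults_so_far(void) {
    struct rusage usage;
    if (getrusage(RUSAGE_SELF, &usage) != 0) return -1;
    return usage.ru_minflt;
}

/* Returns the number of minor faults observed while touching each page of
 * a freshly mapped file for the first time, or -1 on error. */
long count_minor_faults_on_first_touch(void) {
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0) return -1;
    size_t len = (size_t)page * DEMO_PAGES;

    char path[] = "/tmp/minflt_demo_XXXXXX"; /* assumed writable location */
    int fd = mkstemp(path);
    if (fd < 0) return -1;
    unlink(path); /* the file persists until fd and the mapping are gone */

    long faults = -1;
    char *zeros = calloc(1, len);
    /* Writing through write() leaves the data in the page cache. */
    if (zeros != NULL && write(fd, zeros, len) == (ssize_t)len) {
        unsigned char *view = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
        if (view != MAP_FAILED) {
            long before = minor_faults_so_far();
            volatile unsigned char sink = 0;
            for (size_t i = 0; i < len; i += (size_t)page) {
                sink ^= view[i]; /* first access to each page */
            }
            long after = minor_faults_so_far();
            if (before >= 0 && after >= before) faults = after - before;
            munmap(view, len);
        }
    }
    free(zeros);
    close(fd);
    return faults;
}
```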
The role of the Translation Lookaside Buffer
When you use mmap, the operating system sets up virtual memory mappings so that a file or device appears as part of the process's address space, which means it works directly with the page table infrastructure. Creating a mapping (e.g., mmapping a file) leads to entries being populated in the page tables, typically lazily as pages are first touched. Altering or removing mappings (e.g., munmap, mprotect) often requires invalidating stale entries in the Translation Lookaside Buffer (TLB), the hardware cache of recent virtual-to-physical translations, so that future memory accesses don't use incorrect translations.
Limitations of memory mapping
Memory-mapped I/O can end up being less scalable than standard I/O, for example when there are many small files or too many processes accessing the same mappings. For this reason it is commonly avoided in DBMSs and other latency-critical applications. In particular, the developer has little control over when data is synchronized between memory and disk, so database developers prefer to implement their own buffer pool.
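To show what that extra control looks like, here is a deliberately tiny buffer pool sketch built on pread and pwrite. It is not taken from any particular database: the types (`frame_t`, `buffer_pool_t`), the function names, the four-frame capacity, and the round-robin eviction are all assumptions chosen to keep the example short. The point is that dirty pages reach the file only when the application decides to flush them.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define PAGE_BYTES 4096 /* illustrative page size for the pool */
#define NUM_FRAMES 4    /* illustrative pool capacity */

typedef struct {
    long page_no;                   /* file page held here, -1 if empty */
    int dirty;                      /* set by the caller after modifying data */
    unsigned char data[PAGE_BYTES];
} frame_t;

typedef struct {
    int fd;
    size_t next_victim;             /* round-robin eviction cursor */
    frame_t frames[NUM_FRAMES];
} buffer_pool_t;

void pool_init(buffer_pool_t *pool, int fd) {
    pool->fd = fd;
    pool->next_victim = 0;
    for (size_t i = 0; i < NUM_FRAMES; i++) {
        pool->frames[i].page_no = -1;
        pool->frames[i].dirty = 0;
    }
}

/* Write one frame back to the file if it is dirty. Returns 0 on success. */
static int flush_frame(buffer_pool_t *pool, frame_t *frame) {
    if (frame->page_no < 0 || !frame->dirty) return 0;
    off_t offset = (off_t)frame->page_no * PAGE_BYTES;
    if (pwrite(pool->fd, frame->data, PAGE_BYTES, offset) != PAGE_BYTES) return -1;
    frame->dirty = 0;
    return 0;
}

/* Return the frame holding page_no, loading it from the file if needed.
 * Evicts round-robin, writing the victim back first. NULL on error. */
frame_t *pool_get(buffer_pool_t *pool, long page_no) {
    if (page_no < 0) return NULL;
    for (size_t i = 0; i < NUM_FRAMES; i++) {
        if (pool->frames[i].page_no == page_no) return &pool->frames[i];
    }
    frame_t *victim = &pool->frames[pool->next_victim];
    pool->next_victim = (pool->next_victim + 1) % NUM_FRAMES;
    if (flush_frame(pool, victim) != 0) return NULL;

    memset(victim->data, 0, PAGE_BYTES); /* pages past EOF read as zeros */
    off_t offset = (off_t)page_no * PAGE_BYTES;
    if (pread(pool->fd, victim->data, PAGE_BYTES, offset) < 0) {
        victim->page_no = -1;
        return NULL;
    }
    victim->page_no = page_no;
    victim->dirty = 0;
    return victim;
}

/* The application, not the kernel, chooses when changes become durable. */
int pool_flush_all(buffer_pool_t *pool) {
    for (size_t i = 0; i < NUM_FRAMES; i++) {
        if (flush_frame(pool, &pool->frames[i]) != 0) return -1;
    }
    return fsync(pool->fd);
}
```

A caller would fetch a frame with `pool_get`, modify `data`, set `dirty`, and later call `pool_flush_all` at a point of its own choosing, such as a transaction commit.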
OS Page Cache
The OS page cache is a region of main memory that stores recently accessed file system blocks. Key points:
- It adjusts dynamically based on system RAM usage, shrinking when applications need memory, and growing when more file I/O occurs.
- If a file-backed page is read from disk, the OS places it in the page cache. Subsequent reads can come from memory instead of hitting disk again.
- Pages used by mmapped files can also reside in the page cache, but are accessed via the process's virtual memory mappings rather than a direct read syscall.
This cache is distinct from the CPU caches (L1, L2, L3), which are hardware-managed and sit directly on the processor die. The OS page cache is simply a portion of RAM the kernel uses to speed up I/O operations on files or block devices. If you have a system with a large amount of free memory, the kernel will use as much of that as feasible for the page cache to improve performance.
Important
RAM Pages: Any data a process is actively using might also be in CPU caches (a different layer of caching managed by hardware). The OS page cache is a software-managed cache for file-backed data in system memory. It does not directly store application-allocated anonymous pages (like those returned by malloc), though processes can map those pages privately.