Vulnerability Report: net/sched/sch_pie.c
Summary
Integer overflow in pie_calculate_probability() due to unvalidated alpha/beta parameters — arbitrary control of drop probability, leading to AQM bypass or total packet loss.
- File: net/sched/sch_pie.c (shared logic from include/net/pie.h)
- Function: pie_calculate_probability() (line 304)
- Root cause: No bounds checking on TCA_PIE_ALPHA/TCA_PIE_BETA netlink attributes in pie_change() (lines 178–182), combined with integer overflow in the probability update arithmetic.
- Privilege required: CAP_NET_ADMIN in any network namespace (reachable by unprivileged users through a user namespace, e.g. unshare -Urn, where unprivileged user namespaces are enabled).
Vulnerability Detail
Missing Input Validation (pie_change, lines 178–182)
if (tb[TCA_PIE_ALPHA])
    WRITE_ONCE(q->params.alpha, nla_get_u32(tb[TCA_PIE_ALPHA]));
if (tb[TCA_PIE_BETA])
    WRITE_ONCE(q->params.beta, nla_get_u32(tb[TCA_PIE_BETA]));
params->alpha and params->beta are u32. Any value in [0, U32_MAX] is accepted without validation. The code comment says “alpha and beta should be between 0 and 32” but this is never enforced.
Integer Overflow in pie_calculate_probability (lines 341–362)
alpha = ((u64)params->alpha * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4;
beta = ((u64)params->beta * (MAX_PROB / PSCHED_TICKS_PER_SEC)) >> 4;
/* ... conditional right-shifts reduce alpha/beta further ... */
delta += alpha * (qdelay - params->target); // u64 * u64 → u64, no overflow check
delta += beta * (qdelay - qdelay_old); // same
Step-by-step overflow with params->alpha = U32_MAX:
| Step | Expression | Value |
|---|---|---|
| MAX_PROB / PSCHED_TICKS_PER_SEC | (U64_MAX >> 8) / 1e9 | ≈ 72,057,594 |
| (u64)U32_MAX * 72057594 | pre-shift alpha | ≈ 3.1 × 10¹⁷ |
| >> 4 | post-shift alpha | ≈ 1.93 × 10¹⁶ |
| Additional >> 1 (prob < MAX_PROB/10 branch) | alpha after halving | ≈ 9.7 × 10¹⁵ |
| Five >> 2 iterations (inner while loop, power ≤ 10⁶) | alpha after the loop | ≈ 9.4 × 10¹² |
| alpha * (qdelay - target) where delay ≈ 250 ms (2.5×10⁸ ticks) | overflows u64 | wraps mod 2⁶⁴ |
U64_MAX ≈ 1.8 × 10¹⁹; the product 9.4×10¹² × 2.5×10⁸ ≈ 2.4×10²¹ exceeds it, so only the low 64 bits of the result are kept. That wrapped u64 is then implicitly converted to s64 by the delta += operation, so delta can come out with either sign and with a magnitude unrelated to the true product.
Consequences
The overflowed delta is then applied at line 379:
vars->prob += delta;
The overflow/underflow guards (lines 382–395) clamp vars->prob to MAX_PROB (if prob wrapped around upward) or 0 (if it wrapped downward). Whether prob lands at 0 or MAX_PROB depends on which bits survive the u64 wraparound — effectively making the drop probability attacker-controlled.
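The effect of the guards can be illustrated with a small userspace model. This is a sketch of the behaviour described above rather than a copy of the kernel source; `sketch_apply_delta` and `SKETCH_MAX_PROB` are hypothetical names, and the model covers only the add-then-clamp step.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PROB (UINT64_MAX >> 8)

/* Add a signed delta to prob, then apply wraparound guards: a positive
 * delta that makes prob smaller is treated as overflow, and a
 * non-positive delta that makes prob larger is treated as underflow. */
static uint64_t sketch_apply_delta(uint64_t prob, int64_t delta)
{
    uint64_t oldprob = prob;

    prob += (uint64_t)delta;  /* modular add, same bits as a signed add */

    if (delta > 0) {
        if (prob < oldprob)
            prob = SKETCH_MAX_PROB;  /* wrapped past 2^64 */
    } else {
        if (prob > oldprob)
            prob = 0;                /* wrapped below zero */
    }
    return prob;
}
```

In this model `sketch_apply_delta(1000, -5000)` returns 0, and a positive delta that carries past 2⁶⁴ returns `SKETCH_MAX_PROB`. The guards react only to the wraparound itself, so the sign of the corrupted delta decides which clamp is reached.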
Attack scenario — AQM bypass (prob → 0):
- Run unshare -Urn to obtain a new user and network namespace (no privileges needed where unprivileged user namespaces are enabled).
- Attach a PIE qdisc with TCA_PIE_ALPHA = U32_MAX, TCA_PIE_DQ_RATE_ESTIMATOR = 1.
- Under high-latency conditions, each timer tick overflows delta downward → prob underflows → clamped to 0.
- PIE drops no packets regardless of congestion. The queue fills to sch->limit, then every subsequent packet is hard-dropped at qdisc_qlen >= sch->limit (overlimit path). This is correct tail-drop, but AQM is completely defeated.
Attack scenario — total packet black-hole (prob → MAX_PROB):
- Same setup; arrange overflow in the upward direction.
- Every subsequent call to pie_drop_early() returns true (prob ≥ MAX_PROB, and accu_prob >= (MAX_PROB/2)*17 immediately after a few packets).
- All new packets are dropped, an effective DoS of any service bound to that interface.
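How quickly "a few packets" arrives can be estimated from the threshold quoted above. The sketch below assumes that accu_prob grows by prob for every packet and is never reset in between; it models only the `(MAX_PROB/2)*17` comparison, and `sketch_packets_to_threshold` is a hypothetical helper rather than a kernel function.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_MAX_PROB (UINT64_MAX >> 8)

/* Number of packets needed before an accumulator that grows by `prob`
 * per packet reaches (MAX_PROB / 2) * 17. Returns 0 when prob is 0,
 * because the threshold is then never reached. */
static unsigned sketch_packets_to_threshold(uint64_t prob)
{
    const uint64_t threshold = (SKETCH_MAX_PROB / 2) * 17;

    assert(prob <= SKETCH_MAX_PROB);
    if (prob == 0)
        return 0;
    /* Ceiling division; the sum cannot overflow for prob <= MAX_PROB. */
    return (unsigned)((threshold + prob - 1) / prob);
}
```

With prob pinned at `SKETCH_MAX_PROB` the threshold is crossed on the ninth packet under these assumptions, which is consistent with the black-hole behaviour described in the scenario.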
Secondary Finding: Data Race in pie_dump_stats
pie_dump_stats() (line 500) reads q->vars.prob, q->vars.qdelay, and q->vars.avg_dq_rate without holding the qdisc root lock:
struct tc_pie_xstats st = {
    .prob = q->vars.prob << BITS_PER_BYTE, // no lock, no READ_ONCE
    .delay = ((u32)PSCHED_TICKS2NS(q->vars.qdelay)) / NSEC_PER_USEC,
    ...
};
pie_timer() (line 427) modifies these same fields while holding root_lock. On 32-bit kernels, 64-bit reads of prob/qdelay are not atomic and can be torn, so the netlink response sent to userspace may carry a value that mixes halves of the old and the new state.
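A torn read can be modelled deterministically by treating the 64-bit field as two 32-bit halves. The sketch below is single-threaded and only illustrative: the order in which the halves are stored (low first) is an assumption, since the real order depends on the compiler and architecture, and `sketch_torn_read` is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* A 64-bit value as a 32-bit CPU would store it: two separate words. */
struct sketch_u64_halves {
    uint32_t lo;
    uint32_t hi;
};

/* Model a reader that runs between the writer's two 32-bit stores and
 * return the value that reader would observe. */
static uint64_t sketch_torn_read(uint64_t old_val, uint64_t new_val)
{
    struct sketch_u64_halves v = {
        .lo = (uint32_t)old_val,
        .hi = (uint32_t)(old_val >> 32),
    };
    uint64_t seen;

    v.lo = (uint32_t)new_val;              /* writer: first store */
    seen = ((uint64_t)v.hi << 32) | v.lo;  /* reader runs here */
    v.hi = (uint32_t)(new_val >> 32);      /* writer: second store */
    (void)v;
    return seen;
}
```

For an update from 0x00000000FFFFFFFF to 0x0000000100000000 the modelled reader observes 0, which is neither the old nor the new value.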
Root Cause
pie_change() – accepts any u32 for alpha/beta (no validation)
↓
pie_calculate_probability() – u64 × u64 multiplication overflows
↓
vars->prob – clamped to MAX_PROB or 0 (arbitrary from attacker's perspective)
↓
pie_drop_early() – drop probability is 0 (bypass) or MAX_PROB (blackhole)
Fix
1. Clamp alpha and beta in pie_change() to the documented range [0, 32]:
if (tb[TCA_PIE_ALPHA])
    WRITE_ONCE(q->params.alpha,
               min(nla_get_u32(tb[TCA_PIE_ALPHA]), 32U));
if (tb[TCA_PIE_BETA])
    WRITE_ONCE(q->params.beta,
               min(nla_get_u32(tb[TCA_PIE_BETA]), 32U));
2. Add READ_ONCE() guards in pie_dump_stats() to match the pattern already used in pie_dump(), and consider taking the root lock (or using a seqlock).