FERRAMENTAS LINUX: Linux 6.15 Kernel Power Regression: Critical Fixes and Performance Implications

Sunday, June 1, 2025


 


A CPU power regression in the Linux 6.15 kernel has been fixed in 6.16 Git. This post covers the connection to Intel's Sierra Forest work, the Rust-based power management additions, and what the regression means for enterprise server performance.

Critical CPU Power Regression in Linux 6.15 Kernel

A significant CPU power consumption regression was discovered in the stable Linux 6.15 kernel release, affecting systems with specific configurations. The fix has landed in Linux 6.16 Git and is expected to be backported to 6.15.y point releases.

This regression caused elevated idle power draw, particularly problematic for:

  • Data centers prioritizing energy efficiency

  • Enterprise servers booted with the "nosmt" kernel parameter

  • Systems that rely on suspend-to-idle (s2idle)

Intel engineer Rafael Wysocki, the power management subsystem maintainer, identified the flaw and reverted the problematic commit.


Technical Breakdown: Root Cause and Fix

What Went Wrong?

The regression stemmed from commit 96040f7273e2 ("x86/smp: Eliminate mwait_play_dead_cpuid_hint()"), which prevented the processor from reaching deep package C-states (e.g., PC10) on systems where:

  • The "nosmt" kernel parameter is set

  • SMT siblings are therefore taken offline early in boot, before cpuidle is initialized

Result:

  • Offlined siblings parked in C1 (via HLT) instead of a deeper idle state, which keeps the whole package out of deep package C-states

  • 40-60% higher idle power consumption (estimated)

  • Reduced suspend-to-idle efficiency
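As a rough way to tell whether a host matches the affected configuration, the sketch below checks the kernel command line for "nosmt" and reads the SMT control file in sysfs. It assumes a Linux host that exposes /proc/cmdline and /sys/devices/system/cpu/smt/control; the helper names are illustrative and are not taken from the kernel or from the fix itself.

```python
from pathlib import Path
from typing import Optional


def cmdline_has_nosmt(cmdline: str) -> bool:
    """Return True if the command line contains 'nosmt' or 'nosmt=<value>'."""
    return any(
        token == "nosmt" or token.startswith("nosmt=")
        for token in cmdline.split()
    )


def smt_disabled(control: str) -> bool:
    """Interpret the contents of /sys/devices/system/cpu/smt/control."""
    return control.strip() in {"off", "forceoff"}


def read_text(path: str) -> Optional[str]:
    """Read a small text file, returning None if it is missing or unreadable."""
    try:
        return Path(path).read_text()
    except OSError:
        return None


if __name__ == "__main__":
    cmdline = read_text("/proc/cmdline") or ""
    control = read_text("/sys/devices/system/cpu/smt/control")
    print("nosmt on kernel command line:", cmdline_has_nosmt(cmdline))
    print("SMT control:", control.strip() if control else "unavailable")
```

The command-line check is the relevant one here: the regression depends on siblings going offline early in boot, which is what "nosmt" causes. The sysfs value is printed only as supporting information, since SMT can also be switched off later at runtime.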

The Solution

Wysocki's revert restored mwait_play_dead_cpuid_hint(), so offlined CPUs can again enter deeper C-states. The fix prioritizes:

  1. Stability for 6.15.y backports

  2. Power efficiency for Intel Xeon (Sierra Forest) and others

The Rust-based power management code merged for 6.16 is separate work and is not part of the revert.
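Because the fix is headed for 6.15.y point releases, a first triage step is knowing whether a host runs a 6.15-series kernel at all. The sketch below parses the release string reported by the running kernel. It deliberately does not encode which point release carries the fix, since the post does not say, and distribution kernels may backport fixes without changing the version number. The function names are illustrative.

```python
import platform
import re
from typing import Optional, Tuple


def parse_release(release: str) -> Optional[Tuple[int, int, int]]:
    """Parse strings like '6.15.4-arch1-1' or '6.16-rc1' into (major, minor, patch)."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", release)
    if not match:
        return None
    major, minor, patch = match.groups()
    # A missing patch component (e.g. '6.16-rc1') is treated as 0.
    return int(major), int(minor), int(patch or 0)


def in_6_15_series(release: str) -> bool:
    """Return True if the release string belongs to the 6.15 series."""
    parsed = parse_release(release)
    return parsed is not None and parsed[:2] == (6, 15)


if __name__ == "__main__":
    running = platform.release()
    print(running, "-> 6.15 series:", in_6_15_series(running))
```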


Broader Implications for Linux Performance

1. Impact on Intel Xeon "Sierra Forest"

The original patch targeted Xeon 6 C-state bugs, highlighting:

  • Trade-offs between optimization and stability

  • Hardware-specific kernel tuning challenges

2. Rust in Power Management

The same pull request introduced Rust abstractions for:

  • CPUFreq (dynamic clock scaling)

  • OPP (Operating Performance Points)

  • CLK (clock control)

  • Cpumasks (CPU core management)

Why does it matter? Rust's memory safety guarantees could reduce some classes of bugs in future power management code.


Key Takeaways for SysAdmins & Developers

✅ Monitor 6.15.y updates for the power fix

✅ Benchmark suspend-to-idle post-patch

✅ Evaluate Rust-based drivers for future deployments
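One way to put numbers on a before/after comparison is to sample the package energy counter that the powercap (RAPL) interface exposes in sysfs and convert it to average power. The sketch below assumes an Intel system where /sys/class/powercap/intel-rapl:0 is the package 0 domain and is readable (recent kernels typically require root). It measures power while the machine sits idle and is not a full suspend-to-idle benchmark. The function names are illustrative.

```python
import time
from pathlib import Path

# Assumption: package 0 RAPL domain; the path may differ or be absent on a given system.
RAPL_DIR = Path("/sys/class/powercap/intel-rapl:0")


def average_power_watts(start_uj: int, end_uj: int, seconds: float,
                        max_range_uj: int) -> float:
    """Average power from two energy readings in microjoules.

    Handles a single counter wraparound between the two readings.
    """
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    delta_uj = end_uj - start_uj
    if delta_uj < 0:  # the counter wrapped between the two samples
        delta_uj += max_range_uj
    return delta_uj / 1_000_000 / seconds


def percent_change(before: float, after: float) -> float:
    """Relative change from 'before' to 'after', in percent."""
    return (after - before) / before * 100.0


def sample_package_power(seconds: float = 10.0) -> float:
    """Sample the package energy counter over an interval and return average watts."""
    max_range = int((RAPL_DIR / "max_energy_range_uj").read_text())
    start = int((RAPL_DIR / "energy_uj").read_text())
    t0 = time.monotonic()
    time.sleep(seconds)
    end = int((RAPL_DIR / "energy_uj").read_text())
    elapsed = time.monotonic() - t0
    return average_power_watts(start, end, elapsed, max_range)
```

Calling sample_package_power() on an otherwise idle machine under the affected kernel and again under a patched kernel, then passing both results to percent_change(), gives a figure that can be compared against the estimate quoted earlier.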


FAQ: Linux 6.15 Power Regression

Q: How severe was the regression?

A: "Nasty," per Wysocki. The elevated idle power draw makes it a serious issue for data centers.

Q: Will 6.16 Git prevent similar issues?

A: Not by itself. The revert in 6.16 Git fixes this specific regression, while the Rust integration aims to improve long-term reliability.

Q: Which systems were most affected?

A: Servers booted with the "nosmt" kernel parameter, including Intel Xeon systems.
