FERRAMENTAS LINUX: Fix Laggy Cursor and Scrolling in CPU-Heavy Apps on Wayland

Friday, May 8, 2026


 



Stop cursor lag on Wayland. Learn how zero‑copy DMA‑BUF sharing fixes stutter in CPU-rendered apps and, in one test, cut CPU usage from 80–90% of a core to about 20%.


Have you ever moved your mouse over a code editor or a settings window on Wayland and felt the cursor skip frames? Or scrolled through a document only to see choppy redraws? 

This often happens when an app relies on CPU-based rendering while the compositor has to copy memory buffers multiple times before they ever reach your GPU. 

In this post, you’ll learn why those extra copies hurt performance and how a smarter approach using DMA-BUF eliminates the bottleneck, giving you a fluid desktop experience even on power-saving profiles.


The Hidden Cost of Shared Memory


Under Wayland, applications that draw with the CPU (such as those built on Qt Widgets or older toolkits) hand their frames to the compositor through a shared-memory interface called wl_shm. The app writes pixels into a memory region, and the compositor then reads that region and uploads it to the GPU as a texture.

This upload step is the culprit: the compositor must copy the buffer contents, often blocking its main thread while it does so. For example, on a modern laptop set to a “power save” profile, moving the cursor quickly over project files in a heavy IDE caused noticeable stutter.

Every texture upload consumed valuable CPU cycles, and the cursor movement had to wait. Normally this isn’t obvious, but when the CPU governor keeps clock speeds low, the desktop feels sluggish and unresponsive.
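A rough back-of-the-envelope calculation shows how much data those uploads can involve. The sketch below assumes 4 bytes per pixel, 60 frames per second, and a full-window redraw on every frame. That is a worst case, since compositors usually upload only the damaged region, and the resolutions are just examples.

```python
def upload_bytes_per_second(width, height, bytes_per_pixel=4, fps=60):
    """Upper bound on bytes copied per second if the whole buffer is re-uploaded each frame."""
    return width * height * bytes_per_pixel * fps


for name, (w, h) in {"1920x1080": (1920, 1080), "3840x2160": (3840, 2160)}.items():
    rate = upload_bytes_per_second(w, h)
    print(f"{name}: {rate / 1e6:.0f} MB/s")
# prints:
# 1920x1080: 498 MB/s
# 3840x2160: 1991 MB/s
```

Even a fraction of that traffic is a noticeable load for a CPU that is being held at a low clock speed.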


One Buffer, Zero Copies: The UDMABUF Solution



The key insight was that the Linux kernel already provides a way to share memory directly with the GPU: DMA-BUF. By wrapping a memfd-allocated buffer into a DMA-BUF using UDMABUF, the compositor can import that buffer as a GPU resource without any extra copying.

KDE developer Xaver Hugl implemented exactly this for KWin. Instead of uploading texture data line by line, the compositor now asks the GPU driver to reference the same physical memory pages used by the application. 

The result is dramatic: scrolling the same IDE previously consumed 80–90% of a CPU core, but after the change it dropped to just 20%. Cursor movement became perfectly smooth, even under power-saving settings.


What This Means for Your Daily Work


You don’t need to be a developer to appreciate the improvement. Any application that paints with the CPU will feel snappier—think tooltips, dialog boxes, old GTK apps, or custom in-house tools. On a laptop, lower CPU usage also means less fan noise and longer battery life. 

On older hardware where the CPU is the main bottleneck, the difference can make a previously frustrating desktop feel responsive again.

Other compositors and toolkits can adopt the same technique. A matching change has already landed for a future Qt release, and any Wayland compositor that implements UDMABUF imports for wl_shm buffers can offer the same zero-copy efficiency.


Actionable Tips You Can Try Today



Check your compositor’s capabilities – If you use KDE Plasma, look for release notes mentioning DMA-BUF imports for shared memory. For other compositors, search for “wl_shm udmabuf” support.

Monitor CPU usage during stutter – Open a system monitor and move your cursor over a CPU-rendered app. If one core jumps to 100%, you’re seeing the copy bottleneck.

For developers using wl_shm – Back your buffers with memfd_create instead of plain shared memory so they can be exported through UDMABUF. The kernel then provides a DMA-BUF file descriptor that the compositor can import directly.

Test with a power-saving profile – Set your CPU governor to “powersave” or “conservative”. If cursor movement becomes choppy, your compositor likely lacks this optimization.
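If you would rather measure than eyeball a system monitor, the two monitoring tips above can be combined in a small script. This is a sketch that assumes a Linux system exposing /proc/stat and the usual cpufreq sysfs path. The function names are made up for this example, and the one-second sampling window is arbitrary.

```python
import time
from pathlib import Path


def parse_proc_stat(text):
    """Map 'cpuN' to (busy_ticks, total_ticks) from the contents of /proc/stat."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        # Keep per-core lines (cpu0, cpu1, ...) and skip the aggregate "cpu" line.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # user, nice, system, idle, iowait, irq, softirq, steal
        ticks = [int(x) for x in fields[1:9]]
        idle = ticks[3] + ticks[4]
        total = sum(ticks)
        stats[fields[0]] = (total - idle, total)
    return stats


def busy_percent(before, after):
    """Per-core busy percentage between two parse_proc_stat() samples."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        busy0, total0 = before.get(cpu, (0, 0))
        delta = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / delta if delta > 0 else 0.0
    return result


def current_governor(cpu=0):
    """Return the active cpufreq governor, or None if the sysfs file is missing."""
    path = Path(f"/sys/devices/system/cpu/cpu{cpu}/cpufreq/scaling_governor")
    try:
        return path.read_text().strip()
    except OSError:
        return None


if __name__ == "__main__":
    first = parse_proc_stat(Path("/proc/stat").read_text())
    time.sleep(1.0)   # move the cursor or scroll in the app during this window
    second = parse_proc_stat(Path("/proc/stat").read_text())
    print("governor:", current_governor())
    for cpu, pct in sorted(busy_percent(first, second).items(), key=lambda kv: -kv[1]):
        print(f"{cpu}: {pct:5.1f}% busy")
```

Run it once while idle and once while scrolling in a CPU-rendered app. A single core near 100% only during the scroll points toward the copy bottleneck described earlier, though other causes are possible.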


Smooth Scrolling and Snappy Cursors Are Within Reach


The move from brute‑force texture uploads to zero‑copy DMA‑BUF sharing transforms how CPU‑rendered applications feel on Wayland. You get a snappier desktop without waiting for app rewrites or hardware upgrades. 

If you’ve struggled with cursor lag or choppy scrolling, check whether your compositor supports this optimization. And if you’re a toolkit maintainer, consider adopting the UDMABUF pattern: it’s one of the highest‑impact low‑effort performance wins you can deliver.

Try monitoring your system’s CPU behavior during regular use this week. Share your experience with CPU‑rendered apps in the comments below. We’d love to hear how this change has improved your own workflow.

For more details, visit Xaver Hugl's blog.
