Unlocking Arm Ethos NPU Power: A Deep Dive into the New Open-Source Linux & Mesa Drivers

Friday, October 17, 2025

Explore the new open-source "ethosu" Gallium3D driver for Mesa, enabling Arm Ethos-U NPU acceleration on Linux. Learn about kernel integration, TensorFlow Lite via Teflon, performance on i.MX93, and the roadmap for U65/U85 NPUs. A complete guide for developers and embedded systems engineers. 

The race for efficient on-device AI inference is intensifying, and a significant breakthrough is emerging from the open-source community. Arm Holdings, a leader in semiconductor intellectual property, is making strides to bring native neural processing unit (NPU) support to the mainline Linux kernel and the Mesa 3D Graphics Library.

This development promises to democratize hardware-accelerated machine learning for embedded devices and edge computing platforms.

How will these new drivers transform the landscape for developers working with Arm's Ethos-U series NPUs?

This comprehensive analysis breaks down the recent progress, from the kernel-level "accel" driver to the newly merged "ethosu" Gallium3D driver in Mesa, explaining the technology stack, its current capabilities, and future potential for high-performance, low-power AI applications.

The Core Components: Kernel Acceleration and Userspace Graphics

The open-source support for Arm Ethos NPUs is a two-part endeavor, mirroring the standard structure for hardware enablement in the Linux ecosystem.

  • The Kernel Foundation: The "accel" Driver: At the heart of this effort is an open-source "accel" driver developed for the Linux kernel. This driver is essential for the operating system to communicate with and manage the Ethos NPU hardware. Currently in its fourth revision (v4), this driver is under active review and is a prime candidate for inclusion in the upcoming Linux v6.19 kernel cycle. Its role is to provide the fundamental infrastructure for memory management, scheduling, and power management for the NPU.

  • The Userspace Interface: Mesa's "ethosu" Gallium3D Driver: Complementing the kernel work is a newly merged driver for Mesa, the open-source OpenGL/Vulkan implementation. This "ethosu" driver resides in userspace and acts as the bridge between AI frameworks and the kernel driver. By leveraging Mesa's Gallium3D architecture, it provides a standardized interface for graphics and compute operations, which in this case, are tailored for neural network inference.
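To make the kernel/userspace split concrete: the "accel" driver, like other drivers in the Linux accel subsystem, exposes a character device under /dev/accel/ that userspace components such as the Mesa driver open and issue commands against. The sketch below (an assumption based on the accel subsystem's standard device-node layout, not on ethosu-specific documentation) simply probes for such a node:

```python
import os

def find_accel_nodes(dev_dir="/dev/accel"):
    """Return the accelerator device nodes exposed by the kernel's
    accel subsystem (e.g. /dev/accel/accel0), or an empty list if
    no accel driver is loaded."""
    if not os.path.isdir(dev_dir):
        return []
    return sorted(
        os.path.join(dev_dir, name)
        for name in os.listdir(dev_dir)
        if name.startswith("accel")
    )

print(find_accel_nodes())
```

On a board where the Ethos "accel" driver has bound to the NPU, this would list at least one node; on systems without any accelerator driver, the directory typically does not exist.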

As the lead developer, Tomeu Vizoso, wrote in the merge request: "This driver is the userspace counterpart to Rob Herring's kernel accel driver." This attribution underscores the collaborative nature of the project and the direct link between the two components.

The Software Stack: Teflon, TensorFlow Lite, and Tensor Operations

To understand the practical application, one must examine the software stack that enables AI workloads to run on this hardware.

The Mesa "ethosu" driver utilizes the Teflon framework, a critical piece of technology originally pioneered for the Etnaviv driver. Teflon's primary function is to facilitate the execution of TensorFlow Lite models. It translates the model's operations into commands that the NPU can understand and process efficiently. 

This same Teflon framework is concurrently being used for open-source Rockchip NPU support, indicating a strategic move towards a unified, vendor-agnostic driver model for embedded NPUs in the open-source world.
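From an application's point of view, Teflon is delivered as a TensorFlow Lite "external delegate" shared library installed by Mesa. A minimal probe for its presence might look like this (the library name "libteflon.so" is an assumption carried over from the Etnaviv Teflon work, and your Mesa build must have Teflon enabled):

```python
import ctypes.util

# Look for the Teflon delegate library on the standard library paths.
# "teflon" -> "libteflon.so" is an assumed name, not ethosu-specific fact.
teflon = ctypes.util.find_library("teflon")
if teflon:
    print(f"Teflon delegate available: {teflon}")
else:
    print("Teflon delegate not found (Mesa built without Teflon support?)")
```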

A Practical Workflow Example:

  1. A developer loads a pre-trained TensorFlow Lite model for image classification.

  2. The Mesa "ethosu" driver, via the Teflon framework, parses the model's graph and tensor operations.

  3. The driver communicates with the Arm Ethos NPU kernel "accel" driver.

  4. The kernel driver schedules the operations on the dedicated NPU hardware, such as a U65 or U85.

  5. The inference result is returned, achieving lower latency and power consumption than a CPU-based implementation.
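The five steps above can be sketched as a single Python function. This is a hedged illustration, not the project's documented API: the delegate library name is assumed, and the delegate is loaded through TensorFlow Lite's standard external-delegate mechanism, with a CPU fallback if it is unavailable:

```python
def classify(model_path, input_tensor, delegate_lib="libteflon.so"):
    """Run one TFLite inference, offloading supported ops to the Ethos
    NPU via the Teflon delegate; fall back to CPU if it cannot load."""
    # Import locally so the sketch can be read without tflite_runtime installed.
    from tflite_runtime.interpreter import Interpreter, load_delegate

    try:
        # Steps 2-3: Teflon parses the graph and talks to the kernel driver.
        delegates = [load_delegate(delegate_lib)]
    except (ValueError, OSError):
        delegates = []  # CPU fallback when the delegate is unavailable

    interp = Interpreter(model_path=model_path,
                         experimental_delegates=delegates)
    interp.allocate_tensors()  # step 4: buffers set up for the hardware
    interp.set_tensor(interp.get_input_details()[0]["index"], input_tensor)
    interp.invoke()            # run the inference
    # Step 5: the result comes back with NPU-accelerated latency/power.
    return interp.get_tensor(interp.get_output_details()[0]["index"])
```

Ops that the "ethosu" driver does not yet implement would be executed on the CPU by TensorFlow Lite, which is why completeness of the supported operation set matters as much as raw speed.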

Current Capabilities, Performance, and Hardware Roadmap

While the functionality is a significant achievement, it's crucial to set realistic expectations regarding its current state.

According to developer insights, the driver's functionality is "quite complete for the operations that are currently implemented." This means core tensor operations necessary for many common AI models are supported. However, performance is currently acknowledged to be "very low" due to a lack of advanced optimizations. 

This is a common phase in new driver development, where functional correctness is prioritized before deep performance tuning.

A key differentiator for the Arm Ethos NPU IP is its freely available documentation and the existence of an out-of-tree open-source stack for reference, which significantly lowers the barrier to entry for developers and researchers.

The hardware targeting is clear and strategic:

  • Initial Target: Ethos-U65 NPU.

  • Future Targets: Plans are in place to support the more powerful U85 and potentially the U55 variant, contingent on available hardware that allows direct CPU access.

Initial testing has been conducted on i.MX93 system-on-chip (SoC) boards from NXP Semiconductors, a common platform for embedded and IoT applications.

Market Implications and The Future of Open-Source AI Acceleration

The integration of robust, open-source NPU drivers into mainline Linux and Mesa is a pivotal step for the embedded industry. It moves proprietary AI acceleration from a vendor-locked feature to a standardized, community-driven capability. 

This aligns with the growing trend of on-device AI and edge computing, where data privacy, low latency, and power efficiency are paramount.

For developers, this means the ability to deploy optimized machine learning models on a wider range of hardware without relying on closed-source binary blobs. For the ecosystem, it fosters innovation and competition, potentially driving down the cost and complexity of building intelligent edge devices.

Frequently Asked Questions (FAQ)

Q1: What is the difference between the kernel "accel" driver and the Mesa "ethosu" driver?

A: The kernel "accel" driver is the low-level software that allows the Linux OS to control the NPU hardware. The Mesa "ethosu" driver is a userspace component that allows AI frameworks like TensorFlow Lite to issue commands to the NPU via the kernel driver.

Q2: When will I be able to use these drivers in a stable release?

A: The Mesa "ethosu" driver is slated for the Mesa 25.3 release later this year. The kernel "accel" driver is targeting the Linux v6.19 kernel cycle, which would place its stable release in early 2026.

Q3: Which Arm Ethos NPU variants are supported?

A: The initial focus is on the Ethos-U65. Support for the U85 and U55 is planned for the future, expanding the range of applicable devices from power-constrained IoT modules to more performance-oriented edge appliances.

Q4: Can I run PyTorch models with this driver?

A: Currently, the stack is designed for TensorFlow Lite models via the Teflon framework. While PyTorch models can often be converted to TensorFlow Lite, native PyTorch support is not part of the current implementation.

Q5: What are the main advantages of using an NPU over a CPU for AI tasks?

A: NPUs are specialized processors designed specifically for the matrix and tensor calculations fundamental to neural networks. They offer vastly superior performance per watt for these tasks, enabling complex AI inference on battery-powered devices and reducing system latency.


