FRIDAY, JUNE 26, 2026 · 48° E / GLOBAL TECH · SUMMARISED
AI, business, devices, policy — global tech, summarised every 30 minutes.
Dev Tools · 2h ago

NVIDIA's cuTile Rust brings safe GPU kernels at 96% of cuBLAS speed

By Meridian48 News Desk · Summarised from DEV Community

NVIDIA researchers introduced cuTile Rust, a DSL that extends Rust's ownership model to GPU kernels, removing the need for `unsafe` code. On a B200, it reaches 7 TB/s on memory-bound ops and 2 PFLOP/s on GEMM, roughly 96% of cuBLAS performance. The approach partitions tensors into disjoint tiles, so no two parallel workers can write to the same data; data races are ruled out without runtime overhead.
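
To make the disjoint-tile idea concrete, here is a minimal CPU-side sketch in standard Rust. It does not use cuTile Rust or its API, which the summary does not show; the function name `scale_in_tiles` and the tile size are assumptions chosen for illustration. It only demonstrates the underlying ownership principle: once a buffer is split into non-overlapping mutable tiles, each worker owns its tile exclusively, and the compiler rejects any attempt to share one.

```rust
use std::thread;

/// Hypothetical helper (not part of cuTile Rust): scale `data` in place,
/// handing each disjoint tile to its own scoped thread.
fn scale_in_tiles(data: &mut [f32], tile_len: usize, factor: f32) {
    assert!(tile_len > 0, "tile_len must be non-zero");
    thread::scope(|s| {
        // `chunks_mut` yields non-overlapping `&mut [f32]` slices, so every
        // thread gets exclusive access to its tile. No locks are needed.
        for tile in data.chunks_mut(tile_len) {
            s.spawn(move || {
                for x in tile.iter_mut() {
                    *x *= factor;
                }
            });
        }
        // All spawned threads are joined when the scope ends.
    });
}

fn main() {
    let mut data: Vec<f32> = (0..8).map(|i| i as f32).collect();
    // Tiles of length 3 give three tiles: [0,1,2], [3,4,5], [6,7].
    scale_in_tiles(&mut data, 3, 2.0);
    println!("{:?}", data); // prints [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]
}
```

The safety check happens entirely at compile time, which is the property the brief attributes to cuTile Rust on the GPU; how the DSL maps tiles to GPU threads and memory is not covered by this sketch.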

Meridian48 take
Reaching 96% of cuBLAS throughput is impressive, but narrow platform support (sm_80+ GPUs, Linux only) limits the immediate practical impact.
Read the full reporting
96% of cuBLAS, no `unsafe`: what cuTile Rust proves →
DEV Community