TL;DR

CUDA Agent applies large-scale, multi-turn agentic reinforcement learning to teach models to write and iteratively optimize CUDA kernels, aiming for performance that competes with strong compiler baselines.

What this is about

The paper proposes an “agentic RL” training approach in which a model works in a sandbox: profile, write kernels, compile, run, observe performance, and refine…
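To make the sandbox loop more concrete, here is a minimal Python sketch of one multi-turn episode. It is not the paper's implementation: the names `StepResult`, `speedup_reward`, `refine_loop`, `toy_propose`, and `toy_evaluate` are hypothetical, the reward (speedup over a baseline runtime, zero for kernels that fail to compile or give wrong results) is an assumption for illustration, and the "sandbox" is a lookup table of made-up timings rather than a real compiler and GPU.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    """Outcome of one compile-and-run attempt in the (hypothetical) sandbox."""
    compiled: bool
    correct: bool
    runtime_ms: Optional[float]


def speedup_reward(result: StepResult, baseline_ms: float) -> float:
    """Assumed reward: 0 for failed or incorrect kernels, else baseline / runtime."""
    if not (result.compiled and result.correct) or result.runtime_ms is None:
        return 0.0
    return baseline_ms / result.runtime_ms


History = List[Tuple[str, StepResult, float]]


def refine_loop(
    propose: Callable[[History], str],
    evaluate: Callable[[str], StepResult],
    baseline_ms: float,
    max_turns: int = 4,
) -> Tuple[Optional[str], float, History]:
    """Run a multi-turn episode: propose a kernel, evaluate it, record feedback."""
    history: History = []
    best_source, best_reward = None, 0.0
    for _ in range(max_turns):
        source = propose(history)      # the model writes or refines a kernel
        result = evaluate(source)      # the sandbox compiles, runs, and times it
        reward = speedup_reward(result, baseline_ms)
        history.append((source, result, reward))  # feedback for the next turn
        if reward > best_reward:
            best_source, best_reward = source, reward
    return best_source, best_reward, history


# Toy stand-ins (assumptions): a "model" that emits version tags and a
# "sandbox" that looks up made-up timings instead of compiling anything.
_TOY_TIMINGS = {
    "v0": StepResult(compiled=False, correct=False, runtime_ms=None),
    "v1": StepResult(compiled=True, correct=True, runtime_ms=4.0),
    "v2": StepResult(compiled=True, correct=True, runtime_ms=2.0),
    "v3": StepResult(compiled=True, correct=True, runtime_ms=2.5),
}


def toy_propose(history: History) -> str:
    return f"v{len(history)}"


def toy_evaluate(source: str) -> StepResult:
    return _TOY_TIMINGS[source]


if __name__ == "__main__":
    best, reward, _ = refine_loop(toy_propose, toy_evaluate, baseline_ms=4.0)
    print(best, reward)
```

In an actual RL setup, the per-turn rewards in `history` would feed a policy update; the sketch only shows how profiling feedback could flow from one turn to the next.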