Last updated: 2026-06-12
Inference Runtime
Residual Rms Norm
T-CPU residual add + RMS norm implementation.
Computes: output = weight * rms_norm(x + residual)
2 exports shown
struct
Params
pub const Params = struct Inputs and outputs for one fused residual-add + RMS-norm call.
function
run
pub fn run(params: Params) !void Compute `output = weight * (x + residual) / sqrt(mean((x + residual)^2) + eps)`.
Fuses the post-attention/post-MLP residual add with the RMS norm so the sum lives only in registers; matches the GPU kernel that the Vulkan path uses on the decode hot loop. companion slice is shorter than `x`, otherwise void.