Last updated: 2026-06-12
Inference Runtime
Embed
T-CPU EMBED implementation.
Reads one token row from a GGUF tensor into f32 hidden state.
struct
Params
pub const Params = struct
Inputs and outputs for one EMBED call.
function
run
pub fn run(params: Params) !void
Dequantize the row at `params.token_id` of the embedding matrix into `params.output`.
Thin wrapper over `dequant.row` that validates the token index and output shape. Returns an error if the output slice does not match `hidden_dim`, otherwise void.
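The validate-then-dequantize flow described for `run` can be illustrated with a minimal sketch. This is not the module's actual code: the fields of `Params` and the signature of `dequant.row` are not shown on this page, so `SketchParams`, `EmbedError`, and `runSketch` below are hypothetical names, and the per-row `i8` scale scheme is a simplified stand-in for real GGUF quantization formats. It assumes Zig 0.11 or later for the `@floatFromInt` builtin.

```zig
const std = @import("std");

// Hypothetical error set; the real module's errors are not documented here.
const EmbedError = error{ TokenOutOfRange, ShapeMismatch };

// Hypothetical stand-in for `Params`; the actual field layout is an assumption.
const SketchParams = struct {
    weights: []const i8, // row-major quantized values, rows * hidden_dim long
    scales: []const f32, // one scale per row (simplified, not a GGUF block format)
    hidden_dim: usize,
    token_id: usize,
    output: []f32,
};

fn runSketch(p: SketchParams) EmbedError!void {
    // The matrix must hold exactly one row of hidden_dim values per scale.
    if (p.weights.len != p.scales.len * p.hidden_dim) return error.ShapeMismatch;
    // Validate the token index against the number of rows.
    if (p.token_id >= p.scales.len) return error.TokenOutOfRange;
    // Validate the output shape.
    if (p.output.len != p.hidden_dim) return error.ShapeMismatch;

    const start = p.token_id * p.hidden_dim;
    const row = p.weights[start .. start + p.hidden_dim];
    const scale = p.scales[p.token_id];

    // Dequantize the selected row into the f32 output slice.
    var i: usize = 0;
    while (i < row.len) : (i += 1) {
        p.output[i] = @as(f32, @floatFromInt(row[i])) * scale;
    }
}
```

In this sketch both shape problems map to a single `ShapeMismatch` error, and all checks run before any element of `output` is written, so a failed call leaves the caller's buffer untouched.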