Last updated: 2026-06-12

Inference Runtime

Matvec

T-CPU matrix-vector projection implementation.

Dequantizes one GGUF tensor row at a time and computes a scalar matvec.

struct

Params

pub const Params = struct

Inputs and outputs for one matrix-vector projection.

Parameters

raw_data
Raw GGUF tensor bytes for the weight matrix `[rows, cols]`.
tensor_type
GGML quantization format of `raw_data` (forwarded to `dequant.row`).
input
Input vector of length `cols`.
row_scratch
Caller-owned scratch of length `>= cols` used to materialize one dequantized row.
output
Destination vector of length `rows`; the dot product of matrix row `i` with `input` lands in `output[i]`.
accumulate
When true, add into `output` instead of overwriting (e.g. for MoE expert mixing).

src/zinc_rt/isa/cpu_zig/matvec.zig:15

function

run

pub fn run(params: Params) !void

Compute `output = W * input` (or `output += W * input` when `accumulate` is set) one row at a time.

Each row of the GGUF matrix is dequantized into `row_scratch` and dotted against `input`.

Parameters

params
Tensor data, input vector, scratch row, and output slice; see `Params`.

Returns

`error.EmptyInput` when input or output is empty, `error.ShapeMismatch` when `row_scratch` is shorter than `input`, otherwise void.

src/zinc_rt/isa/cpu_zig/matvec.zig:29
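
The row-at-a-time loop described for `run` can be illustrated with a minimal sketch. This is not the module's code: `SketchParams`, `runSketch`, and `dequantRowF32` are hypothetical stand-ins, the field types are assumptions, and the sketch handles only unquantized little-endian f32 weights, so `tensor_type` is omitted (the real `run` forwards it to `dequant.row`). The bounds check on `raw_data` is the sketch's own addition.

```zig
const std = @import("std");

// Hypothetical mirror of the documented `Params`; field types are assumed.
const SketchParams = struct {
    raw_data: []const u8,
    input: []const f32,
    row_scratch: []f32,
    output: []f32,
    accumulate: bool = false,
};

const SketchError = error{ EmptyInput, ShapeMismatch };

// Hypothetical stand-in for `dequant.row`: decodes one row of
// little-endian f32 values from the raw bytes into `dst[0..cols]`.
fn dequantRowF32(raw: []const u8, row: usize, cols: usize, dst: []f32) void {
    const base = row * cols * @sizeOf(f32);
    for (dst[0..cols], 0..) |*d, j| {
        const off = base + j * @sizeOf(f32);
        const bits = @as(u32, raw[off]) |
            (@as(u32, raw[off + 1]) << 8) |
            (@as(u32, raw[off + 2]) << 16) |
            (@as(u32, raw[off + 3]) << 24);
        d.* = @bitCast(bits);
    }
}

fn runSketch(p: SketchParams) SketchError!void {
    if (p.input.len == 0 or p.output.len == 0) return error.EmptyInput;
    if (p.row_scratch.len < p.input.len) return error.ShapeMismatch;

    const cols = p.input.len;
    // Sketch-only guard: the raw bytes must cover rows * cols f32 values.
    if (p.raw_data.len < p.output.len * cols * @sizeOf(f32)) return error.ShapeMismatch;

    for (p.output, 0..) |*out, row| {
        // Materialize one row, then take its dot product with the input.
        dequantRowF32(p.raw_data, row, cols, p.row_scratch);
        var dot: f32 = 0;
        for (p.row_scratch[0..cols], p.input) |w, x| dot += w * x;
        if (p.accumulate) {
            out.* += dot;
        } else {
            out.* = dot;
        }
    }
}

test "overwrite, then accumulate" {
    // Little-endian f32 bytes for the 2x2 matrix [[1, 2], [2, 1]].
    const raw = [_]u8{
        0x00, 0x00, 0x80, 0x3F, 0x00, 0x00, 0x00, 0x40,
        0x00, 0x00, 0x00, 0x40, 0x00, 0x00, 0x80, 0x3F,
    };
    const input = [_]f32{ 3, 4 };
    var scratch: [2]f32 = undefined;
    var output = [_]f32{ 0, 0 };

    try runSketch(.{ .raw_data = &raw, .input = &input, .row_scratch = &scratch, .output = &output });
    try std.testing.expectEqualSlices(f32, &[_]f32{ 11, 10 }, &output);

    // A second call with `accumulate` adds into the existing output.
    try runSketch(.{ .raw_data = &raw, .input = &input, .row_scratch = &scratch, .output = &output, .accumulate = true });
    try std.testing.expectEqualSlices(f32, &[_]f32{ 22, 20 }, &output);
}
```

Only one row is ever materialized, so the scratch buffer needs `cols` floats regardless of how many rows the matrix has. The second call in the test shows the `accumulate` path, the same pattern a caller would use to sum several expert projections into one output vector.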