Last updated: 2026-06-12

Inference Runtime

Moe Gate Topk

T-CPU MOE_GATE_TOPK implementation.

Computes router logits from a GGUF gate matrix, then selects and normalizes the active expert weights using the same routing rules as the Vulkan path.

enum

RoutingRule

pub const RoutingRule = enum

Selection rule applied after the router projection to convert logits into expert weights.

src/zinc_rt/isa/cpu_zig/moe_gate_topk.zig:12

struct

Params

pub const Params = struct

Inputs and outputs for one MoE gate + top-k call.

Parameters

raw_data
Raw GGUF tensor bytes for the router matrix `[num_experts, hidden_dim]`.
tensor_type
GGML quantization format of `raw_data` (forwarded to `dequant.row`).
hidden
Hidden state of length `hidden_dim`.
row_scratch
Caller-owned scratch of length `>= hidden_dim` for one dequantized router row.
logits
Destination router logits of length `num_experts`; capped at 256 experts.
k
Number of experts to select; must satisfy `1 <= k <= logits.len`.
output_ids
Destination expert indices of length `>= k`.
output_weights
Destination per-expert weights of length `>= k`, summing to 1 after the call.
rule
Routing rule that decides how the top-k weights are computed (see `RoutingRule`).

src/zinc_rt/isa/cpu_zig/moe_gate_topk.zig:30
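
The buffer shapes above imply a row-by-row projection: each router row is dequantized into `row_scratch` and dotted with `hidden` to produce one logit. The following is a minimal sketch of that loop, not the actual implementation. `dequantRowF32` and `fillLogits` are hypothetical names, and the stand-in handles only unquantized, native-endian `f32` rows, whereas the real `dequant.row` takes a `tensor_type` and supports GGML quantized formats. The error checks and their order are also assumptions.

```zig
const std = @import("std");

// Hypothetical stand-in for `dequant.row`: copies one unquantized,
// native-endian f32 row out of the raw bytes. `out.len` is the row width.
fn dequantRowF32(raw: []const u8, row: usize, out: []f32) void {
    const stride = out.len * @sizeOf(f32);
    const src = std.mem.bytesAsSlice(f32, raw[row * stride ..][0..stride]);
    for (out, src) |*dst, value| {
        dst.* = value;
    }
}

// Sketch of the projection step: logits[e] = dot(router_row[e], hidden).
fn fillLogits(raw: []const u8, hidden: []const f32, row_scratch: []f32, logits: []f32) !void {
    if (hidden.len == 0 or logits.len == 0) return error.EmptyInput;
    if (row_scratch.len < hidden.len) return error.ShapeMismatch;
    // Assumed check: the raw buffer must hold num_experts * hidden_dim floats.
    if (raw.len < logits.len * hidden.len * @sizeOf(f32)) return error.ShapeMismatch;

    const row = row_scratch[0..hidden.len];
    for (logits, 0..) |*logit, expert| {
        dequantRowF32(raw, expert, row);
        var acc: f32 = 0;
        for (row, hidden) |w, h| {
            acc += w * h;
        }
        logit.* = acc;
    }
}

pub fn main() !void {
    // Router matrix [2 experts, hidden_dim 3], stored row-major.
    const matrix = [_]f32{ 1, 0, 0, 0, 1, 1 };
    const matrix_slice: []const f32 = &matrix;
    const raw = std.mem.sliceAsBytes(matrix_slice);

    const hidden = [_]f32{ 2, 3, 4 };
    var scratch: [3]f32 = undefined;
    var logits: [2]f32 = undefined;
    try fillLogits(raw, &hidden, &scratch, &logits);
    std.debug.print("logits = {any}\n", .{logits});
}
```

With this input the two logits are 2 and 7. Reusing one caller-owned scratch row keeps the call allocation-free, which matches the `row_scratch` contract described above.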

function

run

pub fn run(params: Params) !void

Project the hidden state through the router matrix, then select and normalize the top-k experts.

First fills `logits` row by row (a matrix-vector product computed via `dequant.row`), then dispatches on `rule`: `softmax_all` applies softmax across all experts, picks the top-k, and renormalizes; `softmax_selected` picks the top-k by raw logit and applies softmax across that subset.

Parameters

params
Router weights, hidden state, scratch buffers, selection size, and outputs; see `Params`.

Returns

`error.EmptyInput` for empty inputs, `error.TooManyExperts` when `logits.len > 256`, `error.InvalidTopK` when `k` is zero or larger than the expert count, `error.ShapeMismatch` when scratch or output slices are too small, otherwise void.

src/zinc_rt/isa/cpu_zig/moe_gate_topk.zig:50
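
The selection and normalization step described for `run` can be sketched as follows. This is an illustration under stated assumptions, not the source of `moe_gate_topk.zig`: `Rule` and `routeSketch` are hypothetical names, the `u32` index type, the tie-break toward the lower index, and the order of the error checks are all assumed, and the logits are taken as already computed.

```zig
const std = @import("std");

// Hypothetical mirror of `RoutingRule`; variant names come from the `run` description.
const Rule = enum { softmax_all, softmax_selected };

// Sketch: pick the k largest logits, then write weights that sum to 1.
fn routeSketch(logits: []const f32, k: usize, rule: Rule, ids: []u32, weights: []f32) !void {
    if (logits.len == 0) return error.EmptyInput;
    if (logits.len > 256) return error.TooManyExperts;
    if (k == 0 or k > logits.len) return error.InvalidTopK;
    if (ids.len < k or weights.len < k) return error.ShapeMismatch;

    // Selection by repeated argmax over experts not yet taken.
    // Ties go to the lower index (an assumption of this sketch).
    var taken = [_]bool{false} ** 256;
    for (0..k) |slot| {
        var best: usize = 0;
        var found = false;
        for (logits, 0..) |z, i| {
            if (taken[i]) continue;
            if (!found or z > logits[best]) {
                best = i;
                found = true;
            }
        }
        taken[best] = true;
        ids[slot] = @intCast(best);
    }

    // ids[0] holds the largest logit; subtracting it keeps exp() from overflowing.
    const top = logits[ids[0]];
    var sum_selected: f32 = 0;
    switch (rule) {
        .softmax_all => {
            // Softmax over every expert, then keep the selected probabilities.
            var sum_all: f32 = 0;
            for (logits) |z| {
                sum_all += @exp(z - top);
            }
            for (ids[0..k], weights[0..k]) |id, *w| {
                w.* = @exp(logits[id] - top) / sum_all;
                sum_selected += w.*;
            }
        },
        .softmax_selected => {
            // Softmax over the selected subset only.
            for (ids[0..k], weights[0..k]) |id, *w| {
                w.* = @exp(logits[id] - top);
                sum_selected += w.*;
            }
        },
    }
    // Final normalization so the k weights sum to 1 under either rule.
    for (weights[0..k]) |*w| {
        w.* /= sum_selected;
    }
}

pub fn main() !void {
    const logits = [_]f32{ 1.0, 3.0, 2.0, 0.5 };
    var ids: [2]u32 = undefined;
    var weights: [2]f32 = undefined;
    try routeSketch(&logits, 2, .softmax_all, &ids, &weights);
    std.debug.print("ids = {any}, weights = {any}\n", .{ ids, weights });
}
```

For these logits the sketch selects experts 1 and 2 with weights of roughly 0.731 and 0.269. As the two rules are described on this page, they agree in exact arithmetic, because renormalizing a full softmax over a subset equals a softmax over that subset; in this sketch they differ only in floating-point rounding.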