Last updated: 2026-06-12
Decode Planning
Architecture
Build static decode graphs for the supported model families.
These graphs describe the logical order of decode-time operations so runtime code can bind buffers and record compute work against a stable structure.
function
buildDecodeGraph
pub fn buildDecodeGraph(config: *const ModelConfig, allocator: std.mem.Allocator) !Graph
Build a compute graph for a single transformer decode step.
This creates the graph structure; actual buffer bindings are set at runtime.
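The split between building the structure and binding buffers later can be sketched as follows. This is a minimal illustration, not the library's implementation: `OpKind`, `Op`, the simplified `ModelConfig` and `Graph`, and `buildSketchGraph` are all hypothetical stand-ins, since the real types are not shown in this section.

```zig
const std = @import("std");

// Hypothetical stand-ins; the real ModelConfig and Graph are not shown here.
const OpKind = enum { attn_norm, attention, ffn_norm, ffn };

const Op = struct {
    kind: OpKind,
    layer: u32,
    // Left unbound at build time; runtime code would fill this in later.
    buffer: ?[]u8 = null,
};

const ModelConfig = struct { n_layers: u32 };

const Graph = struct {
    ops: []Op,
    allocator: std.mem.Allocator,

    fn deinit(self: Graph) void {
        self.allocator.free(self.ops);
    }
};

// Assumed per-layer op order, chosen only for illustration.
const kinds_per_layer = [_]OpKind{ .attn_norm, .attention, .ffn_norm, .ffn };

// Emits ops in a fixed logical order: every layer contributes the same
// sequence, so the graph shape depends only on the config.
fn buildSketchGraph(config: *const ModelConfig, allocator: std.mem.Allocator) !Graph {
    const n = @as(usize, config.n_layers) * kinds_per_layer.len;
    const ops = try allocator.alloc(Op, n);
    var i: usize = 0;
    var layer: u32 = 0;
    while (layer < config.n_layers) : (layer += 1) {
        for (kinds_per_layer) |kind| {
            ops[i] = .{ .kind = kind, .layer = layer };
            i += 1;
        }
    }
    return .{ .ops = ops, .allocator = allocator };
}
```

Because no op carries a buffer when the graph is returned, the same graph can in principle be reused across decode steps while the runtime rebinds buffers each time.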
function
buildDecodeGraphDetailed
pub fn buildDecodeGraphDetailed(config: *const ModelConfig, allocator: std.mem.Allocator, gf: ?*const gguf.GGUFFile) !Graph
Build a compute graph with per-op weight-size annotations derived from a GGUF file.
Dispatches to the appropriate architecture-specific builder based on `config.architecture`. When `gf` is provided, weight sizes are read from it instead of using float-size approximations.
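To illustrate why file-derived sizes can differ from a float-size approximation, here is a rough sketch of a per-tensor byte count. `TensorType`, `BlockLayout`, `layoutOf`, and `weightBytes` are hypothetical names, and the fallback of 4 bytes per element (f32) is an assumption; the section does not say which float width the approximation uses. The block layouts for `q8_0` and `q4_0` follow ggml's formats (32 elements per block, at 34 and 18 bytes respectively).

```zig
const std = @import("std");

// Hypothetical subset of tensor types, for illustration only.
const TensorType = enum { f32, f16, q8_0, q4_0 };

const BlockLayout = struct { elems: u64, bytes: u64 };

fn layoutOf(t: TensorType) BlockLayout {
    return switch (t) {
        .f32 => .{ .elems = 1, .bytes = 4 },
        .f16 => .{ .elems = 1, .bytes = 2 },
        // ggml block formats: 32 elements plus a 2-byte fp16 scale.
        .q8_0 => .{ .elems = 32, .bytes = 34 },
        .q4_0 => .{ .elems = 32, .bytes = 18 },
    };
}

// With a known tensor type, count whole blocks; without one, fall back to
// an assumed 4 bytes per element.
fn weightBytes(n_elems: u64, maybe_type: ?TensorType) u64 {
    if (maybe_type) |t| {
        const l = layoutOf(t);
        const blocks = (n_elems + l.elems - 1) / l.elems; // round up
        return blocks * l.bytes;
    }
    return n_elems * 4;
}
```

For a 4096 x 4096 weight matrix, the fallback reports 67,108,864 bytes, while the `q4_0` layout gives 9,437,184 bytes, roughly a sevenfold difference in the annotation.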