Last updated: 2026-06-12

Decode Planning

Architecture

Build static decode graphs for the supported model families.

These graphs describe the logical order of decode-time operations so runtime code can bind buffers and record compute work against a stable structure.

Source: src/model/architecture.zig (2 exports, 0 methods)

function

buildDecodeGraph

pub fn buildDecodeGraph(config: *const ModelConfig, allocator: std.mem.Allocator) !Graph

Build a compute graph for a single transformer decode step.

This creates the graph structure; actual buffer bindings are set at runtime.

Parameters

config
Normalized model dimensions and architecture metadata.
allocator
Allocator used for graph storage.

Returns

A Graph describing the decode-time op order for the selected architecture.

src/model/architecture.zig:174
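
The call pattern described above can be sketched as follows. This is a minimal, self-contained illustration rather than the project's implementation: the fields of `ModelConfig` and `Graph`, the `Architecture` enum values, the `deinit` method, and the op names are all hypothetical stand-ins, since this reference shows only the function signature. The stub keeps the documented signature so the calling code has the same shape as it would against the real module.

```zig
const std = @import("std");

// Hypothetical stand-ins. The real ModelConfig and Graph are defined in the
// project and their fields are not shown in this reference.
const Architecture = enum { llama, unknown };

const ModelConfig = struct {
    architecture: Architecture,
    n_layers: usize,
};

const Graph = struct {
    ops: []const []const u8,
    allocator: std.mem.Allocator,

    // Assumed cleanup method; the real Graph may release storage differently.
    fn deinit(self: *Graph) void {
        self.allocator.free(self.ops);
    }
};

// Stub with the documented signature. The op names and the two-ops-per-layer
// layout are invented for illustration.
fn buildDecodeGraph(config: *const ModelConfig, allocator: std.mem.Allocator) !Graph {
    const ops = try allocator.alloc([]const u8, config.n_layers * 2);
    for (0..config.n_layers) |i| {
        ops[i * 2] = "attention";
        ops[i * 2 + 1] = "ffn";
    }
    return Graph{ .ops = ops, .allocator = allocator };
}

test "build the graph once, then walk its ops in order" {
    const config = ModelConfig{ .architecture = .llama, .n_layers = 2 };

    var graph = try buildDecodeGraph(&config, std.testing.allocator);
    defer graph.deinit();

    // The structure is fixed after construction; runtime code would bind
    // buffers to each op at this point.
    try std.testing.expectEqual(@as(usize, 4), graph.ops.len);
    try std.testing.expectEqualStrings("attention", graph.ops[0]);
    try std.testing.expectEqualStrings("ffn", graph.ops[3]);
}
```

The point of the sketch is the ownership shape: the caller supplies the allocator, so the caller is also responsible for releasing the graph's storage once decoding is finished.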

function

buildDecodeGraphDetailed

pub fn buildDecodeGraphDetailed(config: *const ModelConfig, allocator: std.mem.Allocator, gf: ?*const gguf.GGUFFile) !Graph

Build a compute graph with per-op weight-size annotations derived from a GGUF file.

Dispatches to the appropriate architecture-specific builder based on `config.architecture`.

Parameters

config
Normalized model dimensions and architecture metadata.
allocator
Allocator used for graph storage.
gf
Optional parsed GGUF file; when non-null, actual tensor byte sizes are read from it instead of using float-size approximations.

Returns

A Graph describing the decode-time op order for the selected architecture.

Notes

Returns `error.UnsupportedArchitecture` if `config.architecture` is `.unknown`.

src/model/architecture.zig:186
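
The optional `gf` parameter and the `error.UnsupportedArchitecture` case can be illustrated with a small sketch. Everything beyond the documented signature and error name is an assumption made for the example: the `GGUFFile` stand-in models a single tensor size, the `Graph` stand-in carries one annotation, and the fallback formula (element count times `@sizeOf(f32)`) is only one plausible reading of "float-size approximations".

```zig
const std = @import("std");

// Hypothetical stand-ins for types this reference does not describe.
const Architecture = enum { llama, unknown };

const ModelConfig = struct {
    architecture: Architecture,
    n_embd: usize,
};

// Stand-in for gguf.GGUFFile: only one tensor's byte size is modeled.
const GGUFFile = struct {
    attn_q_bytes: usize,
};

// Stand-in for Graph: a single weight-size annotation.
const Graph = struct {
    weight_bytes: usize,
};

// Stub with the documented parameter order. The body is illustrative only.
fn buildDecodeGraphDetailed(
    config: *const ModelConfig,
    allocator: std.mem.Allocator,
    gf: ?*const GGUFFile,
) !Graph {
    _ = allocator; // this sketch does not allocate

    if (config.architecture == .unknown) return error.UnsupportedArchitecture;

    // With a GGUF file, use its recorded size; otherwise assume f32 weights.
    const bytes = if (gf) |file|
        file.attn_q_bytes
    else
        config.n_embd * config.n_embd * @sizeOf(f32);

    return Graph{ .weight_bytes = bytes };
}

test "null gf falls back to a float-size estimate" {
    const config = ModelConfig{ .architecture = .llama, .n_embd = 8 };
    const graph = try buildDecodeGraphDetailed(&config, std.testing.allocator, null);
    try std.testing.expectEqual(@as(usize, 256), graph.weight_bytes);
}

test "non-null gf supplies the recorded byte size" {
    const config = ModelConfig{ .architecture = .llama, .n_embd = 8 };
    const file = GGUFFile{ .attn_q_bytes = 72 };
    const graph = try buildDecodeGraphDetailed(&config, std.testing.allocator, &file);
    try std.testing.expectEqual(@as(usize, 72), graph.weight_bytes);
}

test "unknown architecture is rejected" {
    const config = ModelConfig{ .architecture = .unknown, .n_embd = 8 };
    try std.testing.expectError(
        error.UnsupportedArchitecture,
        buildDecodeGraphDetailed(&config, std.testing.allocator, null),
    );
}
```

Callers that may see unrecognized models would typically handle `error.UnsupportedArchitecture` explicitly with `catch` rather than propagating it with `try`, so the failure can be reported before any buffers are bound.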