Last updated: 2026-06-12


Model Manager


Managed active-model runtime state for the HTTP server and CLI startup.

ZINC still loads one model into memory at a time. This manager keeps the current engine/tokenizer/model bundle together and handles serialized swaps.

5 exports · 12 methods · src/server/model_manager.zig


struct

LoadSpec

pub const LoadSpec = struct

Describes which model to load: a filesystem path, an optional managed-catalog ID, and an optional context-length override.
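As a rough illustration of the three pieces of information described above, the sketch below uses a hypothetical `LoadSpecSketch` type. The field names, the default values, and the `effectiveContextLength` helper are assumptions made for this example and are not taken from `model_manager.zig`; only the `?u32` type for the override follows the `requested_context_length` parameter documented for `initEmpty`.

```zig
const std = @import("std");

// Hypothetical mirror of LoadSpec; the real field names and defaults may differ.
const LoadSpecSketch = struct {
    /// Filesystem path to the model weights.
    path: []const u8,
    /// Managed-catalog ID, or null when the model is loaded from a bare path.
    managed_id: ?[]const u8 = null,
    /// Context-length override in tokens, or null to keep the model default.
    context_length: ?u32 = null,
};

/// Picks the override when one is present, otherwise the model default.
/// (Assumed helper, shown only to illustrate how an optional override resolves.)
fn effectiveContextLength(spec: LoadSpecSketch, model_default: u32) u32 {
    return spec.context_length orelse model_default;
}
```

Modelling both optional parts as Zig optionals lets a caller pass only a path and leave everything else at its default.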

src/server/model_manager.zig:20

struct

ModelSummary

pub const ModelSummary = struct

Flat representation of a catalog model for JSON serialization to API clients.

src/server/model_manager.zig:27

struct

ModelCatalogView

pub const ModelCatalogView = struct

Snapshot of the full model catalog annotated with the current GPU profile.

src/server/model_manager.zig:56

Methods

1

method

ModelCatalogView.deinit

pub fn deinit(self: *ModelCatalogView, allocator: std.mem.Allocator) void

Frees the owned summary slice.
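The ownership pattern can be sketched as follows. `SummarySketch`, `CatalogViewSketch`, and `buildView` are hypothetical stand-ins written for this example; the real `ModelSummary` and `ModelCatalogView` carry more fields, and the way the real view is built is not shown in this reference.

```zig
const std = @import("std");

// Hypothetical stand-in for ModelSummary.
const SummarySketch = struct { id: []const u8, installed: bool };

// Hypothetical stand-in for ModelCatalogView: a profile string plus an owned slice.
const CatalogViewSketch = struct {
    profile: []const u8,
    models: []SummarySketch, // owned by the view, released in deinit

    pub fn deinit(self: *CatalogViewSketch, allocator: std.mem.Allocator) void {
        allocator.free(self.models);
        self.* = undefined; // poison the view so accidental reuse is easier to spot
    }
};

// Assumed builder: allocates the slice that deinit later frees.
fn buildView(allocator: std.mem.Allocator) !CatalogViewSketch {
    const models = try allocator.alloc(SummarySketch, 2);
    models[0] = SummarySketch{ .id = "model-a", .installed = true };
    models[1] = SummarySketch{ .id = "model-b", .installed = false };
    return CatalogViewSketch{ .profile = "amd-rdna4-32gb", .models = models };
}
```

The allocator passed to `deinit` must be the one that produced the slice, which is why the method takes it as a parameter instead of storing it.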

src/server/model_manager.zig:61

struct

LoadedResources

pub const LoadedResources = struct

Bundle of model, tokenizer, and inference engine that represents a fully loaded model.

src/server/model_manager.zig:68

struct

ModelManager

pub const ModelManager = struct

Thread-safe owner of the currently active model, providing load, swap, and catalog queries.

src/server/model_manager.zig:95

Methods

11

method

ModelManager.init

pub fn init(
    spec: LoadSpec,
    instance: *const Instance,
    gpu_config_value: gpu_detect.GpuConfig,
    shader_dir: []const u8,
    allocator: std.mem.Allocator,
) !ModelManager

Creates a manager and immediately loads the model described by `spec`.

Acquires the per-device GPU process lock before loading.

Parameters
spec
Path, optional catalog ID, and optional context-length override for the model to load.
instance
Vulkan instance that owns the selected GPU device.
gpu_config_value
Detected GPU capabilities used for shader selection and catalog filtering.
shader_dir
Filesystem path to the directory containing compiled SPIR-V shaders.
allocator
Used for all heap allocations owned by this manager.
Returns

An initialised `ModelManager` with `current` pointing to the loaded resources, or an error if loading fails.

src/server/model_manager.zig:122

method

ModelManager.initEmpty

pub fn initEmpty(
    instance: *const Instance,
    gpu_config_value: gpu_detect.GpuConfig,
    shader_dir: []const u8,
    requested_context_length: ?u32,
    allocator: std.mem.Allocator,
) ModelManager

Creates a manager with no model loaded; the server starts idle and the GPU lock is not held.

Parameters
instance
Vulkan instance that owns the selected GPU device.
gpu_config_value
Detected GPU capabilities used for catalog filtering.
shader_dir
Filesystem path to the directory containing compiled SPIR-V shaders.
requested_context_length
Optional token-count override applied when a model is later activated.
allocator
Used for all heap allocations owned by this manager.

src/server/model_manager.zig:152

method

ModelManager.currentResources

pub fn currentResources(self: *ModelManager) ?*LoadedResources

Returns a pointer to the active model resources, or null if none is loaded.

src/server/model_manager.zig:182

method

ModelManager.activeDisplayName

pub fn activeDisplayName(self: *ModelManager) []const u8

Returns the human-readable name of the active model, or `"none"`.
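The idle case matters for both `currentResources` and `activeDisplayName`, so a caller has to handle a null result. The sketch below shows one way the two accessors could relate. `ManagerSketch` and `ResourcesSketch` are hypothetical types, and deriving the display name from the loaded resources is an assumption, not a description of the real implementation.

```zig
const std = @import("std");

// Hypothetical stand-in for LoadedResources.
const ResourcesSketch = struct { display_name: []const u8 };

// Hypothetical stand-in for ModelManager, reduced to the active-model slot.
const ManagerSketch = struct {
    current: ?ResourcesSketch = null,

    // Returns a pointer into the manager, or null while idle.
    fn currentResources(self: *ManagerSketch) ?*ResourcesSketch {
        if (self.current) |*res| return res;
        return null;
    }

    // Falls back to "none" when no model is loaded.
    fn activeDisplayName(self: *ManagerSketch) []const u8 {
        const res = self.currentResources() orelse return "none";
        return res.display_name;
    }
};
```

Returning an optional pointer makes the compiler force the null check at every call site, which suits a server that can start idle.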

src/server/model_manager.zig:187

method

ModelManager.catalogProfile

pub fn catalogProfile(self: *const ModelManager) []const u8

Returns the catalog profile string for the detected GPU (e.g. `"amd-rdna4-32gb"`).

src/server/model_manager.zig:194

method

ModelManager.currentMemoryUsage

pub fn currentMemoryUsage(self: *ModelManager) MemoryUsage

Snapshots the VRAM usage of the active model, or returns zeroes with the full VRAM budget in `device_local_budget_bytes` if idle.
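A minimal sketch of the idle fallback described above. Only `device_local_budget_bytes` is named in this reference; the other fields of `MemoryUsageSketch` and the `snapshotUsage` helper are assumptions made for illustration.

```zig
const std = @import("std");

// Hypothetical shape of MemoryUsage; weights_bytes and runtime_bytes are assumed names.
const MemoryUsageSketch = struct {
    weights_bytes: u64 = 0,
    runtime_bytes: u64 = 0,
    device_local_budget_bytes: u64 = 0,
};

// Returns the loaded model's usage, or zeroes plus the full budget when idle.
fn snapshotUsage(loaded: ?MemoryUsageSketch, vram_budget_bytes: u64) MemoryUsageSketch {
    if (loaded) |usage| return usage;
    return MemoryUsageSketch{ .device_local_budget_bytes = vram_budget_bytes };
}
```

Reporting the budget even when idle lets a client show how much VRAM is available before any model has been activated.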

src/server/model_manager.zig:224

method

ModelManager.collectCatalogView

pub fn collectCatalogView(self: *ModelManager, allocator: std.mem.Allocator, include_all: bool) !ModelCatalogView

Builds a catalog snapshot with install/active/fit status for every entry.

When `include_all` is false, entries unsupported on the current GPU are excluded.
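The effect of `include_all` can be sketched as a simple filter. `EntrySketch` and `collectIds` are hypothetical; the real function returns a `ModelCatalogView` with richer per-entry status, and how it decides that an entry is supported is not shown here.

```zig
const std = @import("std");

// Hypothetical catalog entry with a precomputed "supported on this GPU" flag.
const EntrySketch = struct { id: []const u8, supported: bool };

// Returns the IDs that would appear in the view; the caller frees the result.
fn collectIds(
    allocator: std.mem.Allocator,
    entries: []const EntrySketch,
    include_all: bool,
) ![][]const u8 {
    // First pass: count matches so the output can be allocated exactly once.
    var count: usize = 0;
    for (entries) |e| {
        if (include_all or e.supported) count += 1;
    }
    const out = try allocator.alloc([]const u8, count);
    // Second pass: copy the matching IDs in catalog order.
    var i: usize = 0;
    for (entries) |e| {
        if (include_all or e.supported) {
            out[i] = e.id;
            i += 1;
        }
    }
    return out;
}
```

With `include_all` set, unsupported entries stay in the result, so a client can still list them and explain why they cannot be activated.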

src/server/model_manager.zig:251

method

ModelManager.supportsManagedEntry

pub fn supportsManagedEntry(self: *ModelManager, entry: catalog_mod.CatalogEntry, allocator: std.mem.Allocator) bool

Reports whether a catalog entry is both GPU-architecture-compatible and fits within the current VRAM budget.

Parameters
entry
The catalog entry to evaluate.
allocator
Used for temporary allocations during fit computation; no long-lived allocation is made.
Returns

`true` when the entry matches the detected GPU profile and `describeFit` reports it fits.

src/server/model_manager.zig:348

method

ModelManager.activateManagedModel

pub fn activateManagedModel(self: *ModelManager, model_id: []const u8, persist_active: bool) !void

Loads and activates a managed catalog model, replacing any currently loaded model.

If the requested model is already active, the function returns immediately (optionally persisting the selection). The GPU process lock is acquired if not already held.

Parameters
model_id
Catalog entry ID to activate; must be installed on disk.
persist_active
When true, writes the selection to the active-model file so it survives process restarts.
Returns

`error.UnknownManagedModel` if the ID is not in the catalog, `error.ModelUnsupportedOnThisGpu` if the entry does not match the GPU profile, `error.ModelNotInstalled` if the weights file is absent, or `error.ModelDoesNotFit` if the model exceeds the VRAM budget.

Notes

Caller must hold the shared generation lock before calling this function.
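The four documented error conditions can be read as a sequence of pre-load checks. The sketch below models them with a hypothetical `checkActivation` helper over a hypothetical `EntrySketch`. The order of the checks, the profile comparison by string equality, and the byte-count fit test are all assumptions; the real function also performs the load itself, which is omitted here.

```zig
const std = @import("std");

// Error names are taken from the documentation for this function.
const ActivateError = error{
    UnknownManagedModel,
    ModelUnsupportedOnThisGpu,
    ModelNotInstalled,
    ModelDoesNotFit,
};

// Hypothetical catalog entry reduced to what the checks need.
const EntrySketch = struct {
    id: []const u8,
    profile: []const u8,
    installed: bool,
    required_bytes: u64,
};

// Runs the checks in the order the errors are documented (assumed order).
fn checkActivation(
    catalog: []const EntrySketch,
    model_id: []const u8,
    gpu_profile: []const u8,
    vram_budget_bytes: u64,
) ActivateError!void {
    // Look the ID up; an unknown ID fails before any other check.
    const entry = for (catalog) |e| {
        if (std.mem.eql(u8, e.id, model_id)) break e;
    } else return error.UnknownManagedModel;

    if (!std.mem.eql(u8, entry.profile, gpu_profile)) return error.ModelUnsupportedOnThisGpu;
    if (!entry.installed) return error.ModelNotInstalled;
    if (entry.required_bytes > vram_budget_bytes) return error.ModelDoesNotFit;
}
```

A caller such as an HTTP handler could `switch` on the returned error to choose a status code, with the generation lock held around the call as the note above requires.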

src/server/model_manager.zig:370

method

ModelManager.removeManagedModel

pub fn removeManagedModel(self: *ModelManager, model_id: []const u8, force: bool) !RemoveResult

Uninstalls a managed model from disk and, if it is currently loaded, optionally evicts it from the GPU.

Parameters
model_id
Catalog entry ID of the model to remove.
force
When `false`, returns `error.ModelLoadedInGpu` if the model is currently active. When `true`, the model is unloaded from the GPU before deletion.
Returns

A `RemoveResult` describing whether the GPU was cleared and whether the active-selection file was updated.

Notes

Caller must hold the shared generation lock before calling this function.
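The interaction between `force` and a loaded model can be sketched as a small decision function. `RemoveResultSketch`, its field names, and `planRemoval` are hypothetical. In particular, tying the active-selection update to whether the removed model was the persisted selection is an assumption, and the actual file deletion is left out.

```zig
const std = @import("std");

// Hypothetical shape of RemoveResult; the real field names may differ.
const RemoveResultSketch = struct {
    cleared_gpu: bool,
    updated_active_selection: bool,
};

// Decides the outcome of a removal without touching disk or GPU.
fn planRemoval(
    is_loaded: bool,
    is_persisted_active: bool,
    force: bool,
) error{ModelLoadedInGpu}!RemoveResultSketch {
    // A loaded model is only removed when the caller explicitly forces it.
    if (is_loaded and !force) return error.ModelLoadedInGpu;
    return RemoveResultSketch{
        .cleared_gpu = is_loaded,
        .updated_active_selection = is_persisted_active,
    };
}
```

Defaulting to a refusal keeps an accidental delete request from interrupting a model that is serving traffic.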

src/server/model_manager.zig:426