Last updated: 2026-06-12
Model Manager
Managed active-model runtime state for the HTTP server and CLI startup.
ZINC still loads one model into memory at a time. This manager keeps the current engine/tokenizer/model bundle together and handles serialized swaps.
struct
LoadSpec
pub const LoadSpec = struct Describes which model to load: a filesystem path, an optional managed-catalog ID, and an optional context-length override.
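As a rough illustration of the three pieces of information described above, a load spec could be shaped like the sketch below. The type name `LoadSpecSketch`, the field names, and the example path are all hypothetical; the real `LoadSpec` fields are not shown on this page and may differ.

```zig
const std = @import("std");

// Hypothetical stand-in for LoadSpec. Only the three concepts (path,
// optional catalog ID, optional context-length override) come from the
// documentation; the field names are assumptions.
const LoadSpecSketch = struct {
    model_path: []const u8,
    managed_id: ?[]const u8 = null,
    context_length: ?u32 = null,
};

test "optional fields default to null when only a path is given" {
    // Placeholder path, not a real ZINC location.
    const spec = LoadSpecSketch{ .model_path = "/path/to/model" };
    try std.testing.expect(spec.managed_id == null);
    try std.testing.expect(spec.context_length == null);
}
```

Modeling the catalog ID and context length as optionals lets a caller load an arbitrary file by path alone, without a catalog entry or an override.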
struct
ModelSummary
pub const ModelSummary = struct Flat representation of a catalog model for JSON serialization to API clients.
struct
ModelCatalogView
pub const ModelCatalogView = struct Snapshot of the full model catalog annotated with the current GPU profile.
Methods
method
ModelCatalogView.deinit
pub fn deinit(self: *ModelCatalogView, allocator: std.mem.Allocator) void Frees the owned summary slice.
struct
LoadedResources
pub const LoadedResources = struct Bundle of model, tokenizer, and inference engine that represents a fully loaded model.
struct
ModelManager
pub const ModelManager = struct Thread-safe owner of the currently active model, providing load, swap, and catalog queries.
Methods
method
ModelManager.init
pub fn init( spec: LoadSpec, instance: *const Instance, gpu_config_value: gpu_detect.GpuConfig, shader_dir: []const u8, allocator: std.mem.Allocator, ) !ModelManager Creates a manager and immediately loads the model described by `spec`.
Acquires the per-device GPU process lock before loading. Returns the initialized manager, or an error if loading fails.
method
ModelManager.initEmpty
pub fn initEmpty( instance: *const Instance, gpu_config_value: gpu_detect.GpuConfig, shader_dir: []const u8, requested_context_length: ?u32, allocator: std.mem.Allocator, ) ModelManager Creates a manager with no model loaded; the server starts idle and the GPU lock is not held.
method
ModelManager.deinit
pub fn deinit(self: *ModelManager) void Tears down the loaded model (if any) and releases all owned resources.
method
ModelManager.currentResources
pub fn currentResources(self: *ModelManager) ?*LoadedResources Returns a pointer to the active model resources, or null if none is loaded.
method
ModelManager.activeDisplayName
pub fn activeDisplayName(self: *ModelManager) []const u8 Returns the human-readable name of the active model, or `"none"`.
method
ModelManager.catalogProfile
pub fn catalogProfile(self: *const ModelManager) []const u8 Returns the catalog profile string for the detected GPU (e.g. `"amd-rdna4-32gb"`).
method
ModelManager.currentMemoryUsage
pub fn currentMemoryUsage(self: *ModelManager) MemoryUsage Snapshots the VRAM usage of the active model, or returns zeroes with the full VRAM budget in `device_local_budget_bytes` if idle.
method
ModelManager.collectCatalogView
pub fn collectCatalogView(self: *ModelManager, allocator: std.mem.Allocator, include_all: bool) !ModelCatalogView Builds a catalog snapshot with install/active/fit status for every entry.
When `include_all` is false, entries unsupported on the current GPU are excluded.
method
ModelManager.supportsManagedEntry
pub fn supportsManagedEntry(self: *ModelManager, entry: catalog_mod.CatalogEntry, allocator: std.mem.Allocator) bool Reports whether a catalog entry is both GPU-architecture-compatible and fits within the current VRAM budget.
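The two-part check described above (architecture match and VRAM fit) can be sketched as a pure function. This is a minimal illustration under assumptions, not the manager's actual logic: `EntrySketch`, its fields, and `supportsEntry` are hypothetical names, and the real check works on `catalog_mod.CatalogEntry` and the manager's own state.

```zig
const std = @import("std");

// Hypothetical, simplified catalog entry; the real CatalogEntry layout is
// not shown on this page.
const EntrySketch = struct {
    required_vram_bytes: u64,
    supported_profiles: []const []const u8,
};

// Returns true only when the GPU profile is listed for the entry AND the
// entry's VRAM requirement fits within the budget.
fn supportsEntry(entry: EntrySketch, profile: []const u8, vram_budget_bytes: u64) bool {
    var profile_ok = false;
    for (entry.supported_profiles) |candidate| {
        if (std.mem.eql(u8, candidate, profile)) {
            profile_ok = true;
            break;
        }
    }
    return profile_ok and entry.required_vram_bytes <= vram_budget_bytes;
}

test "entry must match the profile and fit the budget" {
    const gib: u64 = 1024 * 1024 * 1024;
    const profiles = [_][]const u8{"amd-rdna4-32gb"};
    const entry = EntrySketch{ .required_vram_bytes = 20 * gib, .supported_profiles = &profiles };
    try std.testing.expect(supportsEntry(entry, "amd-rdna4-32gb", 32 * gib));
    try std.testing.expect(!supportsEntry(entry, "amd-rdna4-32gb", 16 * gib));
}
```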
method
ModelManager.activateManagedModel
pub fn activateManagedModel(self: *ModelManager, model_id: []const u8, persist_active: bool) !void Loads and activates a managed catalog model, replacing any currently loaded model.
If the requested model is already active, the function returns immediately (optionally persisting the selection). The GPU process lock is acquired if not already held. When `persist_active` is true, the selection is persisted so that it survives process restarts. Returns `error.ModelUnsupportedOnThisGpu` if the entry does not match the GPU profile, `error.ModelNotInstalled` if the weights file is absent, or `error.ModelDoesNotFit` if the model exceeds the VRAM budget.
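A caller will typically want to turn the three documented activation errors into user-facing messages. The sketch below shows one way to do that with an exhaustive `switch`. The error set `ActivateErrorSketch`, the helper `describeActivateError`, and the message wording are hypothetical; only the three error names come from the documentation, and the real function may return other errors as well.

```zig
const std = @import("std");

// Hypothetical error set containing only the three documented errors.
const ActivateErrorSketch = error{
    ModelUnsupportedOnThisGpu,
    ModelNotInstalled,
    ModelDoesNotFit,
};

// Maps each documented error to an illustrative message. The switch is
// exhaustive, so adding an error to the set forces this helper to be updated.
fn describeActivateError(err: ActivateErrorSketch) []const u8 {
    return switch (err) {
        error.ModelUnsupportedOnThisGpu => "model does not match this GPU profile",
        error.ModelNotInstalled => "model weights are not installed",
        error.ModelDoesNotFit => "model exceeds the VRAM budget",
    };
}

test "missing weights map to the not-installed message" {
    try std.testing.expectEqualStrings(
        "model weights are not installed",
        describeActivateError(error.ModelNotInstalled),
    );
}
```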
method
ModelManager.removeManagedModel
pub fn removeManagedModel(self: *ModelManager, model_id: []const u8, force: bool) !RemoveResult Uninstalls a managed model from disk and, if it is currently loaded, optionally evicts it from the GPU.
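`force` controls what happens when the model is currently active. When `true`, the model is unloaded from the GPU before deletion. The returned `RemoveResult` reports whether the active-selection file was updated.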