Last updated: 2026-06-12
API Server
Routes
Route dispatcher and endpoint handlers for the OpenAI-compatible API.
Handles /v1/chat/completions, /v1/completions, /v1/models, /health, and a built-in chat UI. Supports both streaming (SSE) and non-streaming responses.
3 exports shown
function
toolCallingEnabled
pub fn toolCallingEnabled() bool
Return true if OpenAI-compatible tool calling is enabled.
Default on. Set `ZINC_TOOL_CALLING=0` (or `false`) to opt out. This is useful as a kill switch for clients that misbehave when they see the new `tool_calls` response shape, or for debugging.
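The opt-out rule above can be sketched as a small pure function. This is only an illustration: `toolCallingFlagFromValue` is a hypothetical helper, not part of the documented API, and it assumes the caller has already read `ZINC_TOOL_CALLING` from the environment and that only the exact strings `0` and `false` disable the feature.

```zig
const std = @import("std");

/// Hypothetical helper: interpret the raw value of ZINC_TOOL_CALLING.
/// A missing variable (null) keeps the default, which is enabled.
fn toolCallingFlagFromValue(value: ?[]const u8) bool {
    const raw = value orelse return true;
    // Only the two documented opt-out spellings disable tool calling.
    if (std.mem.eql(u8, raw, "0")) return false;
    if (std.mem.eql(u8, raw, "false")) return false;
    return true;
}
```

Keeping the parsing separate from the environment lookup makes the default-on behavior easy to test without touching the process environment.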
struct
ServerState
pub const ServerState = struct
Shared server state tracking active requests, context usage, and generation serialization.
Methods
8 methods
ServerState.init
pub fn init(started_at: i64) ServerState
Create a new server state anchored to the given UNIX timestamp.
method
ServerState.deinit
pub fn deinit(self: *ServerState) void
Release owned resources (chat reuse cache).
method
ServerState.uptimeSeconds
pub fn uptimeSeconds(self: *const ServerState, now: i64) u64
Return elapsed seconds since the server started.
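Because the timestamps are signed (`i64`) and the result is unsigned (`u64`), the subtraction needs some care. A minimal sketch, assuming the result is clamped to zero when `now` is earlier than the start time (the source does not say how that case is handled); `elapsedSeconds` is a hypothetical free function used only for illustration:

```zig
const std = @import("std");

/// Sketch: elapsed seconds between two UNIX timestamps, clamped at zero.
/// The clamping is an assumption, not documented behavior.
fn elapsedSeconds(started_at: i64, now: i64) u64 {
    if (now <= started_at) return 0;
    // The difference is positive here, so the cast to u64 cannot fail.
    return @intCast(now - started_at);
}
```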
method
ServerState.snapshot
pub fn snapshot(self: *const ServerState, now: i64) HealthSnapshot
Atomically capture current request and context counters for the health endpoint.
method
ServerState.setActiveContextTokens
pub fn setActiveContextTokens(self: *ServerState, tokens: u32) void
Update the active KV-cache token count reported by the health endpoint.
method
ServerState.clearActiveContext
pub fn clearActiveContext(self: *ServerState) void
Reset the active context token count to zero.
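One way the set, clear, and snapshot operations on the context counter could fit together is an atomic integer that the generation thread writes and the health endpoint reads. The sketch below assumes Zig 0.12 or later (`std.atomic.Value` and lowercase memory orderings); `ContextCounter` and its field are hypothetical stand-ins, not the actual layout of `ServerState`.

```zig
const std = @import("std");

/// Hypothetical stand-in for the context counter inside ServerState.
const ContextCounter = struct {
    active_tokens: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),

    /// Publish a new token count for readers on other threads.
    fn set(self: *ContextCounter, tokens: u32) void {
        self.active_tokens.store(tokens, .release);
    }

    /// Reset the count to zero, mirroring clearActiveContext.
    fn clear(self: *ContextCounter) void {
        self.set(0);
    }

    /// Read the current count without taking a lock.
    fn read(self: *const ContextCounter) u32 {
        return self.active_tokens.load(.acquire);
    }
};
```

An atomic counter lets the health endpoint report a value without blocking an in-flight generation, at the cost of the reading being only a point-in-time view.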
method
ServerState.clearChatReuseCache
pub fn clearChatReuseCache(self: *ServerState) void
Evict all entries from the chat prompt-reuse cache.
method
ServerState.clearChatReuseSession
pub fn clearChatReuseSession(self: *ServerState, session_id: []const u8) void
Remove a single session from the chat prompt-reuse cache.
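The difference between evicting everything and evicting one session can be illustrated with a map keyed by session id. This is a sketch under assumptions: the source does not describe how the reuse cache is stored, so `ReuseCache`, its `u32` value type, and the choice of `std.StringHashMap` are all hypothetical.

```zig
const std = @import("std");

/// Hypothetical cache keyed by session id. The u32 value stands in for
/// whatever prompt state the real cache keeps per session.
const ReuseCache = struct {
    entries: std.StringHashMap(u32),

    fn init(allocator: std.mem.Allocator) ReuseCache {
        return .{ .entries = std.StringHashMap(u32).init(allocator) };
    }

    fn deinit(self: *ReuseCache) void {
        self.entries.deinit();
    }

    /// Drop one session; a missing session id is not an error.
    fn clearSession(self: *ReuseCache, session_id: []const u8) void {
        _ = self.entries.remove(session_id);
    }

    /// Drop every session but keep the allocated table for reuse.
    fn clearAll(self: *ReuseCache) void {
        self.entries.clearRetainingCapacity();
    }
};
```

The sketch does not own its key strings; a real cache that copies session ids would also need to free them on removal.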
function
handleConnection
pub fn handleConnection(
    conn: *http.Connection,
    manager: *model_manager_mod.ModelManager,
    server_state: *ServerState,
    allocator: std.mem.Allocator,
) !void
Handle one HTTP connection: parse request, dispatch to endpoint, send response.
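The dispatch step can be sketched as a lookup from request path to endpoint. The four API paths come from the module description; the `Route` enum, the `matchRoute` function, and serving the chat UI at `/` are assumptions made for illustration, and the sketch ignores the HTTP method for brevity.

```zig
const std = @import("std");

/// Hypothetical endpoint identifiers for the dispatcher.
const Route = enum {
    chat_completions,
    completions,
    models,
    health,
    chat_ui,
    not_found,
};

/// Sketch: map a request path to an endpoint.
/// The chat UI path ("/") is an assumption; the source only says a
/// built-in chat UI exists.
fn matchRoute(path: []const u8) Route {
    if (std.mem.eql(u8, path, "/v1/chat/completions")) return .chat_completions;
    if (std.mem.eql(u8, path, "/v1/completions")) return .completions;
    if (std.mem.eql(u8, path, "/v1/models")) return .models;
    if (std.mem.eql(u8, path, "/health")) return .health;
    if (std.mem.eql(u8, path, "/")) return .chat_ui;
    return .not_found;
}
```

A handler would then switch on the returned `Route` and call the matching endpoint function, returning a 404 response for `.not_found`.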