Last updated: 2026-06-12

API Server

Routes

Route dispatcher and endpoint handlers for the OpenAI-compatible API.

Handles `/v1/chat/completions`, `/v1/completions`, `/v1/models`, `/health`, and a built-in chat UI. Supports both streaming (SSE) and non-streaming responses.

3 exports · 8 methods · src/server/routes.zig

function

toolCallingEnabled

pub fn toolCallingEnabled() bool

Return true if OpenAI-compatible tool calling is enabled.

Enabled by default. Set `ZINC_TOOL_CALLING=0` (or `false`) to opt out. This acts as a kill switch for clients that misbehave when they see the new `tool_calls` response shape, and is also useful for debugging.
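The opt-out rule can be sketched as a pure function over the raw environment value. This is a minimal sketch, not the code in routes.zig: `parseToolCallingFlag` is a hypothetical helper, and treating every value other than `0` or `false` as "enabled" is an assumption.

```zig
const std = @import("std");

/// Hypothetical helper: interpret the raw value of ZINC_TOOL_CALLING.
/// `null` means the variable is unset, which keeps the default (enabled).
fn parseToolCallingFlag(raw: ?[]const u8) bool {
    const value = raw orelse return true;
    if (std.mem.eql(u8, value, "0")) return false;
    if (std.mem.eql(u8, value, "false")) return false;
    // Assumption: any other value leaves tool calling enabled.
    return true;
}

test "unset keeps the default, 0 and false opt out" {
    try std.testing.expect(parseToolCallingFlag(null));
    try std.testing.expect(!parseToolCallingFlag("0"));
    try std.testing.expect(!parseToolCallingFlag("false"));
}
```

Keeping the parsing separate from the environment lookup makes the rule testable without touching process state.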

src/server/routes.zig:27

struct

ServerState

pub const ServerState = struct

Shared server state tracking active requests, context usage, and generation serialization.

src/server/routes.zig:164

Methods (8)

method

ServerState.init

pub fn init(started_at: i64) ServerState

Create a new server state anchored to the given UNIX timestamp.

Parameters
started_at
UNIX timestamp (seconds) of server startup, stored for uptime calculations.
Returns

Initialized `ServerState` with zeroed counters and an empty chat-reuse cache.

src/server/routes.zig:176

method

ServerState.uptimeSeconds

pub fn uptimeSeconds(self: *const ServerState, now: i64) u64

Return elapsed seconds since the server started.

Parameters
now
Current UNIX timestamp (seconds) to compare against `started_at`.
Returns

Non-negative elapsed seconds; clamped to 0 if `now` is before `started_at`.
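The clamping behavior described above can be illustrated with a free function. This is a sketch under assumptions: it takes `started_at` as a plain argument instead of reading it from `ServerState`, and the saturating subtraction is a defensive choice of this example, not something the source states. It assumes Zig 0.11 or newer for the single-argument `@intCast`.

```zig
const std = @import("std");

/// Sketch of the documented clamping; not the code in routes.zig.
fn uptimeSeconds(started_at: i64, now: i64) u64 {
    // Saturating subtraction avoids signed overflow on extreme inputs.
    const delta = now -| started_at;
    if (delta <= 0) return 0; // clock moved backwards: clamp to zero
    return @intCast(delta);
}

test "uptime is elapsed seconds, clamped at zero" {
    try std.testing.expect(uptimeSeconds(100, 160) == 60);
    try std.testing.expect(uptimeSeconds(100, 40) == 0);
}
```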

src/server/routes.zig:191

method

ServerState.snapshot

pub fn snapshot(self: *const ServerState, now: i64) HealthSnapshot

Atomically capture current request and context counters for the health endpoint.

Parameters
now
Current UNIX timestamp (seconds) used to compute uptime in the snapshot.
Returns

`HealthSnapshot` with monotonic reads of all live counters and computed uptime.
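A snapshot built from monotonic atomic loads might look like the sketch below. The field names, the `State` type, and the reduced `HealthSnapshot` layout are all hypothetical; the real structs in routes.zig may track more counters. The sketch assumes Zig 0.12 or newer, where `std.atomic.Value` and the lowercase `.monotonic` ordering are available.

```zig
const std = @import("std");

/// Hypothetical subset of the snapshot fields.
const HealthSnapshot = struct {
    active_requests: u32,
    active_context_tokens: u32,
    uptime_seconds: u64,
};

/// Hypothetical stand-in for ServerState.
const State = struct {
    started_at: i64,
    active_requests: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),
    active_context_tokens: std.atomic.Value(u32) = std.atomic.Value(u32).init(0),

    fn snapshot(self: *const State, now: i64) HealthSnapshot {
        const elapsed = now -| self.started_at;
        var uptime: u64 = 0;
        if (elapsed > 0) uptime = @intCast(elapsed);
        return .{
            // Each load is atomic on its own; no lock is taken.
            .active_requests = self.active_requests.load(.monotonic),
            .active_context_tokens = self.active_context_tokens.load(.monotonic),
            .uptime_seconds = uptime,
        };
    }
};

test "snapshot reflects stored counters" {
    var state = State{ .started_at = 1000 };
    state.active_context_tokens.store(512, .monotonic);
    const snap = state.snapshot(1030);
    try std.testing.expect(snap.active_context_tokens == 512);
    try std.testing.expect(snap.uptime_seconds == 30);
}
```

In a design like this, each counter is read atomically, but two counters read in sequence are not guaranteed to come from the same instant. That is usually acceptable for a health endpoint.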

src/server/routes.zig:198

method

ServerState.setActiveContextTokens

pub fn setActiveContextTokens(self: *ServerState, tokens: u32) void

Update the active KV-cache token count reported by the health endpoint.

Parameters
tokens
Number of tokens currently occupying the KV cache.

src/server/routes.zig:209

method

ServerState.clearActiveContext

pub fn clearActiveContext(self: *ServerState) void

Reset the active context token count to zero.

src/server/routes.zig:214

method

ServerState.clearChatReuseCache

pub fn clearChatReuseCache(self: *ServerState) void

Evict all entries from the chat prompt-reuse cache.

src/server/routes.zig:219

method

ServerState.clearChatReuseSession

pub fn clearChatReuseSession(self: *ServerState, session_id: []const u8) void

Remove a single session from the chat prompt-reuse cache.

Parameters
session_id
Opaque session identifier matching the entry to evict.
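The difference between evicting one session and clearing the whole cache can be shown with a toy cache keyed by session ID. Everything here is an assumption made for illustration: the `ReuseCache` type, the `u32` value (a cached token count), and the use of `std.StringHashMap`. The source does not describe how the real chat-reuse cache is stored.

```zig
const std = @import("std");

/// Hypothetical cache: session id -> number of reusable prompt tokens.
const ReuseCache = struct {
    entries: std.StringHashMap(u32),

    fn init(allocator: std.mem.Allocator) ReuseCache {
        return .{ .entries = std.StringHashMap(u32).init(allocator) };
    }

    fn deinit(self: *ReuseCache) void {
        self.entries.deinit();
    }

    /// Remove one session; a missing id is silently ignored.
    fn clearSession(self: *ReuseCache, session_id: []const u8) void {
        _ = self.entries.remove(session_id);
    }

    /// Drop every entry but keep the allocated table for reuse.
    fn clearAll(self: *ReuseCache) void {
        self.entries.clearRetainingCapacity();
    }
};

test "evict one session, then everything" {
    var cache = ReuseCache.init(std.testing.allocator);
    defer cache.deinit();
    try cache.entries.put("session-a", 128);
    try cache.entries.put("session-b", 256);

    cache.clearSession("session-a");
    try std.testing.expect(cache.entries.count() == 1);

    cache.clearAll();
    try std.testing.expect(cache.entries.count() == 0);
}
```

Note that `std.StringHashMap` does not copy its keys, so a real cache would need to own or duplicate the session ID strings it stores.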

src/server/routes.zig:225

function

handleConnection

pub fn handleConnection(
    conn: *http.Connection,
    manager: *model_manager_mod.ModelManager,
    server_state: *ServerState,
    allocator: std.mem.Allocator,
) !void

Handle one HTTP connection: parse request, dispatch to endpoint, send response.

Parameters
conn
Active client connection to read from and write to.
manager
Model manager used to resolve the active model for inference.
server_state
Shared server metrics and generation serialization lock.
allocator
Allocator for per-request temporaries.
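The dispatch step can be sketched as a pure mapping from method and path to an endpoint tag. This is an illustration only: the `Endpoint` enum and `route` function are hypothetical, the HTTP methods follow the usual OpenAI API conventions (POST for completions, GET for models), and serving the chat UI at `/` is an assumption, since the source does not give its path.

```zig
const std = @import("std");

/// Hypothetical endpoint tags for the routes listed on this page.
const Endpoint = enum {
    chat_completions,
    completions,
    models,
    health,
    chat_ui,
    not_found,
};

/// Map an HTTP method and path to an endpoint tag.
fn route(method: []const u8, path: []const u8) Endpoint {
    if (std.mem.eql(u8, method, "POST")) {
        if (std.mem.eql(u8, path, "/v1/chat/completions")) return .chat_completions;
        if (std.mem.eql(u8, path, "/v1/completions")) return .completions;
    } else if (std.mem.eql(u8, method, "GET")) {
        if (std.mem.eql(u8, path, "/v1/models")) return .models;
        if (std.mem.eql(u8, path, "/health")) return .health;
        // Assumption: the built-in chat UI is served at the root path.
        if (std.mem.eql(u8, path, "/")) return .chat_ui;
    }
    return .not_found;
}

test "known routes resolve, everything else is not_found" {
    try std.testing.expect(route("POST", "/v1/chat/completions") == .chat_completions);
    try std.testing.expect(route("GET", "/health") == .health);
    try std.testing.expect(route("GET", "/v1/chat/completions") == .not_found);
}
```

A handler in this style would switch on the returned tag, so parsing, routing, and response writing stay in separate steps.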

src/server/routes.zig:368