Last updated: 2026-06-12
Inference Runtime
Bench Support
Shared helpers for benchmark and standalone runner entrypoints.
This module re-exports the Metal runtime pieces that the benchmark tools need and centralizes the GPU-process-lock error path so the small bench binaries do not duplicate server/runtime boilerplate.
12 exports shown
constant
metal_device
pub const metal_device = @import("metal/device.zig")
Metal device initialization and capability queries (MTLDevice wrapper).
constant
metal_loader
pub const metal_loader = @import("model/loader_metal.zig")
Model loader that maps GGUF weights onto Metal buffers.
constant
metal_buffer
pub const metal_buffer = @import("metal/buffer.zig")
Metal buffer allocation and management utilities.
constant
metal_command
pub const metal_command = @import("metal/command.zig")
Metal command queue and command buffer submission helpers.
constant
kernel_timing
pub const kernel_timing = @import("metal/kernel_timing.zig")
Per-kernel Metal dispatch timing probe for profiling individual GPU kernels.
constant
metal_pipeline
pub const metal_pipeline = @import("metal/pipeline.zig")
Metal compute pipeline state cache and compilation helpers.
constant
metal_c
pub const metal_c = @import("metal/c.zig")
Raw Objective-C/Metal C shim types and bindings.
constant
gguf
pub const gguf = @import("model/gguf.zig")
GGUF file parser for reading quantized model weights and metadata.
constant
tokenizer_mod
pub const tokenizer_mod = @import("model/tokenizer.zig")
BPE tokenizer encode and decode for text pre/post-processing.
constant
forward_metal
pub const forward_metal = @import("compute/forward_metal.zig")
Metal forward-pass runtime that runs the full model inference graph.
constant
process_lock
pub const process_lock = @import("gpu/process_lock.zig")
Cross-process GPU ownership lock preventing two zinc processes from sharing a GPU.
function
reportGpuProcessLockError
pub fn reportGpuProcessLockError(err: anyerror, backend: process_lock.Backend, device_index: u32) noreturn
Log a user-facing GPU-process-lock error and terminate the benchmark binary.
Prints a human-readable message to stderr explaining why the lock could not be acquired, then calls `std.process.exit(1)`. The error raised when another instance already holds the GPU gets a dedicated "stop the other instance" message; all other errors fall back to a generic failure message.
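The dispatch described for `reportGpuProcessLockError` can be sketched as below. This is a minimal illustration under stated assumptions, not the module's actual implementation: the `Backend` enum variants, the error name `error.GpuBusy`, the helper `lockErrorMessage`, and the message wording are all hypothetical stand-ins, since this page does not show the contents of `gpu/process_lock.zig`.

```zig
const std = @import("std");

// Hypothetical stand-in for process_lock.Backend; the real variants are not
// shown on this page.
const Backend = enum { metal, other };

/// Choose the user-facing text for a lock failure.
/// `error.GpuBusy` is an assumed name for "another instance holds the lock".
fn lockErrorMessage(err: anyerror) []const u8 {
    return switch (err) {
        error.GpuBusy => "another instance is already using this GPU; stop the other instance and retry",
        else => "failed to acquire the GPU process lock",
    };
}

/// Print the message to stderr, then terminate with exit code 1.
fn reportLockError(err: anyerror, backend: Backend, device_index: u32) noreturn {
    // std.debug.print writes to stderr.
    std.debug.print("error: {s} (backend={s}, device={d}, cause={s})\n", .{
        lockErrorMessage(err),
        @tagName(backend),
        device_index,
        @errorName(err),
    });
    std.process.exit(1);
}

test "busy lock gets the dedicated message, everything else is generic" {
    const busy = lockErrorMessage(error.GpuBusy);
    const generic = lockErrorMessage(error.OutOfMemory);
    try std.testing.expect(std.mem.indexOf(u8, busy, "stop the other instance") != null);
    try std.testing.expect(std.mem.indexOf(u8, generic, "stop the other instance") == null);
}

pub fn main() void {
    // Simulate a failed lock acquisition on device 0.
    reportLockError(error.GpuBusy, .metal, 0);
}
```

Keeping message selection in a separate pure function makes the branching testable with `zig test`, while the `noreturn` wrapper stays a thin print-and-exit step that callers can use directly in a `catch` branch.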